Tuesday, May 31, 2011

Two Kinds of Estimates

There are only two kinds of estimates in my world: the ones you're going to hold me to, and the ones you're not.

If this is an estimate you're not going to hold me to - a true estimate - then it has certain characteristics:
  • You'll get it quickly. This is a gut sense, side of the barn, back of the envelope, breadbox estimate. (Yes, I could keep going with the euphemisms.)
  • It'll be a bit vague. "Hours, not days."
  • I might not hit it. I'll be close, and I might be over or under, but I won't guarantee it.
If this is an estimate you're going to hold me to, well, that's a whole different story. We've changed the nature of the deal. With this scenario, going over or under is not okay. It's not really an estimate; it's a fixed-fee bid. This kind of estimate has different characteristics:
  • It's going to take longer to get it. I have to think about the work and understand it better to be able to identify a time frame that's reasonable.
  • It's going to be more precise. "June 15th". I know you're going to be watching the calendar, so you'll probably get a date rather than a duration.
  • I'm more likely to hit it. Of course disaster might interfere, but in general these estimates are going to be more padded and I'm going to make the date.
So if you want an estimate, that's completely understandable and legitimate. Figure out, though, whether what you're asking for is an estimate or a guarantee (or as close to a guarantee as possible in this world). If all you need is an estimate, then there's no reason to wait for a guarantee. If you need a guarantee, that's fine, but it's going to take a little longer to get and it's going to sound different than an estimate.

Asking for the one you really need will help make sure no one's surprised later on - and that's a good place to be.

Friday, May 27, 2011

What I Love About Startups

One of the things I love about working for startups is that they attract the kind of people who have no idea what they can't do. Whether it's someone fresh out of college, a self-trained engineer, or simply someone who broke the bonds of a big company, they're all interested in the act of creation itself. So they create. They don't think about what they can't do; instead they just do it.

It's a form of thoughtlessness, really. Any reasonable person would sit around and think about the problem and potential solution, and identify all the risks and possibilities for failure, and ultimately come to the conclusion that there's no way to actually do this. Unreasonable people go work for a startup and say, "Yes, I can build this and yes people will buy it", and they don't consider the risks and the failure points nearly as thoroughly. Instead, they create.

Now, failing to consider risks can result in some spectacular failures, and startups do fail. Optimism of this type cannot overcome technical impossibilities, violations of the laws of physics, or the limits of existing infrastructure. If, for example, your solution depends on 1ms latency between Japan and Paris, well, it's not going to work, no matter how amazing your idea.

However, organizational risks and social risks that stop many people can be overcome by simply not considering them. If the problem with your solution is that no one on earth could possibly express anything in 140 characters or less, well, that's a social risk. Ignoring it produced a huge success... why? Because no one stopped to consider the risk; they just did it.

I love startups because they gather people who make a habit of ignoring certain classes of risk, and who avoid paralysis due to those risks. My job is to make a culture where that kind of dreaming and that kind of creation is encouraged. I remove the organizational risks and help with the social risks, and let people accomplish wonderful things that in any other place they simply couldn't do. Don't tell us we can't, and we just might do it.

Wednesday, May 25, 2011

System Feedback

We've written software, found bugs, fixed bugs, tweaked features, gotten beta feedback, run tests, deployed. Phew! We're done!

Well, except for the next release. Oh, and also any issues we have with this release. And feedback. Now that we're in production we're going to get even more feedback.

Feedback comes in many forms: comments and emails from users, bugs from support. There's another source of feedback, though: the system itself. The system speaks to us in ways customers can't see: log files, data logged in a database, files created on a file system, and the like.

Take the feedback your system provides. Get the logs from production (go bug Ops if you have to) and take a look. Any warnings? Errors? Things happening more or less often than you expect? Check out a dump of the database, too - what's null that shouldn't be, or has weird combinations of values? That's probably something you should look at.... even if it hasn't manifested as a problem yet.
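Listening to the system can be partly automated. Here's a minimal sketch - the severity tokens are assumptions, so adjust the pattern to match whatever your logging framework actually emits:

```python
import re
from collections import Counter

def summarize_log(lines):
    """Count log lines by severity so spikes stand out."""
    # Assumed severity tokens; match them to your own log format.
    pattern = re.compile(r"\b(DEBUG|INFO|WARN|WARNING|ERROR|FATAL)\b")
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Run it over a production log and then ask the interesting question: do the WARN and ERROR counts match what you expected?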

A good system is clean from top to bottom, from UI to logs. Take a look at your system - is it clean?

Monday, May 23, 2011

One Off

I'm working with an engineer who came to me recently and said, "I want to handle the deployment for this by doing X, Y, and Z." I said to him, "Hmm, that's not how we do any other deployments. Why do we need this one-off?"

To some extent, it doesn't matter what we ended up doing. The point is the question. By default, doing things in a different one-off way is a bad idea. It just adds a "special case" that has to be considered for all sorts of future things like IT monitoring and infrastructure, future upgrades or new features, etc. Sometimes, however, it's the right thing to do.

Good reasons to do a one-off:
  • The team is unhappy with the old way and this is a good place to try the potential new way.
  • The old way is simply not technically possible (e.g., the old way only works on Debian and this is Windows).
Bad reasons to do a one-off:
  • You don't know about the existing method.
  • You don't understand the existing method.
  • You just like writing new stuff.
There is a time and a place to do something different from your existing solution. But varying your existing procedures should be the exception, not the rule. If you want to do a one-off, make sure you understand why and that it's a good reason to do so. If you can't say why you need a one-off, or if the rest of your team doesn't think it's a good reason, then it's probably not the right thing to do.

Friday, May 20, 2011

Starting Learning

Last week Shaun Hershey emailed me and said, "hey, I got my bosses to agree to start a learning program in June! I'm interested in the dojos idea. Now what?" Shaun was specifically asking about testing dojos, so I referred him to Markus Gärtner. However, there are some general things that can make a new learning program succeed or fail. What follows is an excerpt of the response I sent to Shaun, which I wanted to share.

Because dojos can be so broad (literally about anything related to testing or coding or software), it's important to figure out where to go first. For the first few in particular, they're going to need to be really awesome, so that people who have given it a chance go away excited and talking about it to other people. For the rest of this, when I say "dojo", feel free to substitute a less trendy word, like "lesson" or "session".

For the first dojo, I would attempt to do something that:
1. is fun
2. addresses an area that a lot of people complain about (translation: something the group as a whole agrees could be improved)
3. can be done in 1-2 hours, tops
4. has a very concrete lesson
5. involves everyone

For the "is fun" criterion, it's going to depend on what your learners generally like. Some people respond to things like "let's build a balloon bouquet" because, hey, loud bangs are pretty hilarious. I did one once that involved drawing faces on balloons and seeing if we could get them to blow up into a face that looked like something we had drawn on the board. It was about learning iterative processes and scaling behaviors. We put a face on the board, handed everyone balloons, and said, "draw something so that it looks like that face when we blow up the balloon." Everyone cracked up at all the different faces we made on those balloons. Some groups won't respond well to that, though. They're very concerned about being software professionals and so really only see training and learning in the context of software. For those groups, you're going to want to stick to software-based dojos and lessons (or risk the "this has nothing to do with my job, this is stupid" dismissal).

Addressing an area that a lot of people complain about is a way to help make the learning feel relevant. If your message is, "this will help you with that thing you were complaining about last week", more people will think it's worth their time than if your message is, "this can help you be better". So listen for a while and find something everyone complains about. Then structure your first dojo around that.

Time is important. Unless management is forcing people to be there, you want to keep it relatively short. An hour, maybe two, isn't a lot of time commitment. It'll also force you to keep things moving.

You're seeking relevancy here. Part of that is addressing something people are complaining about (see above). Part of it also is about giving them something they can actually do about it. Make sure the lesson is highly concrete and that it's something they can go away and do. Maybe it's a new trick, or a new utility they get as they leave the room. Don't make it depend on a tool they won't have for a month; by then they'll have lost the tie between the learning and the reduction of their complaint. The goal is to make the reward for the time they're spending immediate. This is something we can talk about more as you get into specifics.

The death knell of a lesson is a lecture. It's a bunch of people sitting there, checking email, and going, "wow, that was a boring meeting." Have everyone in the group do something, and have the group as a whole doing things as much as possible. You'll need to provide guidance and a little bit of introduction and conclusion, but make the practice and the discussion as large a portion of the session as you can. Split into small groups if possible (2-5 people is good), so that everyone gets a chance to try things, and no one can hide in a corner.

In short, when you're starting a learning program, the emphasis has to be on immediate benefit and on having a good time doing it. You're not just helping people learn. You're creating learning evangelists. Excitement and effectiveness are key.

Thursday, May 19, 2011

Things I Take For Granted

I work across a variety of spaces and with a variety of teams. Development teams, test teams, support teams, and the like. I've also hired for all of these kinds of teams, and noticed that the things I look for fall into three categories:
  • skills I absolutely need this person to have
  • skills I would like this person to have but can get by without
  • skills that I didn't even consider because, honestly, I just assume they're there
The first two categories are fairly straightforward. The "must have" skills are usually things our team needs and that no one on the team currently has or can pick up easily. I'm filling a hole with these skills. The "nice to have" skills are the ones that I have some coverage on or that I'm okay with learning on the job. I'm shoring up the overall team strength with these skills.

Then there's the third category: the things I just assume you can do. These are the "Engineering 101" skills that I don't even put in the job description because I think of them as so integral to any engineering job they're not even worth mentioning.

Nevertheless, I invariably get candidates who don't have these skills and who don't seem to find that odd. So, forthwith, here are the skills I'm going to assume you have if you call yourself an engineer. These are the baseline requirements:
  • The ability to work with source control (I don't much care which SCM)
  • The ability and willingness to work with a requirements tracking and defect tracking system (again, I don't care which one or even that it's electronic. Boards are fine. Just show me you understand the software development lifecycle and are willing to work within it.).
  • The ability to explain a piece of code (or a test) you wrote to another engineer.
  • The ability to modify a piece of code (or a test) written by someone else to change or enhance its functionality.... without rewriting it entirely.
  • The ability to explain a design to another engineer (test design or software design).
  • For developers, the willingness to write and run basic tests in an appropriate test framework (e.g., rspec or junit).
  • For testers, the ability and willingness to deploy a system for your own use, given reasonable instructions.
In other words, I take the ability to work on a software engineering team for granted. After all, when is the last time any of us worked alone?

Monday, May 16, 2011

Life Moves

When we build systems, we tend to think about the system as it is now. We think of it as a set of behaviors and database tables and APIs and code and data and users - all as they are now. It's a point-in-time thought process.

Too bad that only happens once per system.

There's only ever one fresh deployment. There's only ever one initial state. After that, we have to handle the system as it moves through time - deployments become upgrades rather than clean installs. APIs have to be backwards compatible (or not, but we have to think about it at least). Users have whatever attributes and data they had when they signed up, and that's not magically going to change just because new users have an additional attribute.

Life moves. Systems move. And we are freighted with our past.

The good news is we can handle it. There are deployment utilities that handle database migrations really well (e.g., Capistrano or Puppet). There are defaults and diff utilities.

The only movement we can't prepare for is the one we didn't think about.

So every time you add something to your system, or take something away, stop and say, "what about the past?" It's a lot easier to do this when we're building the feature than to wait until just prior to deployment. If we wait, that's when we start to see bugs like, "if I do this with user X I see the problem but if I do it with user Y then I don't", and it takes some digging to realize that user X was created before the new feature was added, but user Y was created after. In other words, it gets to be a headache.
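Asking "what about the past?" usually turns into a backfill. Here's a minimal sketch - the `timezone` field and its default are made up for illustration - of giving pre-feature users the same shape as new ones:

```python
# Hypothetical new attribute; users created before the feature lack it.
DEFAULT_TIMEZONE = "UTC"

def backfill_users(users):
    """Give pre-existing users a sane value for a newly added field,
    so user X (created before the feature) behaves like user Y
    (created after it)."""
    for user in users:
        if "timezone" not in user:  # created before the feature shipped
            user["timezone"] = DEFAULT_TIMEZONE
    return users
```

The point isn't the three lines of code; it's deciding, while you're still building the feature, what the default for the past should be.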

So save yourself the headache and ask yourself what to do about the past every time you take a step toward the future. Life moves. Let's move with it.

Friday, May 13, 2011

All the Things You Could Do

One of the fun parts of working in software is the dreaming of all the things we could do....
We could change the way the cache works.
We could make this field validation better in a certain way.
We could write some documentation to explain integration techniques better.
We could rework a feature to scale better.

We could do a lot of things, yes! And the more we sit around and talk and think about it, the more we'll think of that we could do.

This is great and wonderful. Please stop long enough to actually do something.

After all, "could" never ships. "Could" never actually helps anyone. "Did" ships. "Did" provides benefits.

So when you have some ideas about what you could do, stop dreaming and go do one of them. Heck, do two! Just don't forget that you need both parts: the "could" and the "did".

Wednesday, May 11, 2011

Baby Goals

Many companies work with goals. Frequently these are quarterly or annual goals, and they say things like: "hire new developer" or "release the 3.0 version to GA". These are great goals, and I'm not knocking them, but they're very far from covering everything we do. And goals are rarely maintenance-oriented. When's the last time we got the team all jazzed up about a goal of turning around all support calls in 20% less time? I don't think that one happens too much. In addition, goals are almost always a ways away - they're quarterly or annually, not daily or weekly.

Enter baby goals.

Baby goals are the little games a team plays with itself in the form of very small, very short-term goals. For example, the support team I work with right now has seven open cases that have been around for a while. They're not a big deal; they're done and working, and we're just waiting on client confirmation or hardware returns. But they're annoying. They're another bit of cruft.

So we set a baby goal in the team: "I'm going to get in touch with someone at each client site today and push this resolution forward."

Will it kill us if we miss it? Nope.
Will we get kudos and bonuses if we make it? Nope.
Will we even bother telling the boss about it? Nah.

It's just a baby goal. It's a game that we play to motivate ourselves and to give ourselves something concrete and countable to accomplish. And we did it, and we went home feeling pretty good - all because we'd accomplished this one baby goal. (Oh, and we managed to close out three of the issues just by getting the client to finish out the confirmation of the fix while we were on the phone with them.)

Goals don't have to be big grandiose things. Let baby goals be your mini-motivators.

Tuesday, May 10, 2011


I went to the kitchen today and grabbed a mug, and halfway through my tea I noticed that it looked like this:

It's a fine line between inspiring and silly. I think this one shot straight past inspiring and went to silly. Made my tea more enjoyable, though!

P.S. No, I don't know what's going on with the quotes, either.

Monday, May 9, 2011

Not the Most Urgent Thing, But....

A friend of mine sent me a graph showing his product's build time over the last month. It looked roughly like this:

Is this a problem? Well, yes, and no. There are a lot of very good reasons the build is now taking longer:
  • More features (hooray!)
  • More tests (hooray!)
Of course, there are some downsides to this as well:
  • Longer turnaround time before you get feedback
  • More thumb twiddling waiting for a build means more multitasking, which means less zoned-in oh-boy-is-this-productive time
There are things that the team can do to make the build shorter without giving up features or tests. They can optimize the code they have (faster faster!). They can optimize the tests, or more closely target the test to meet their needs. They can make the build multi-stage (a build/quick test cycle and a more full test cycle). Is this the most urgent thing they can be doing? Nope. There are plenty of new features to add, bugs to fix, and tests to write.

This is an example of a non-urgent task. It's something that you can do. It's something that you should do before it becomes a big problem. It's not the most important thing to do, however.

Non-urgent tasks like these are ideal for those days when you just can't or won't work on the product. These are the tasks you do the day before a major holiday when half the office is on vacation already anyway. You do them in the first morning after a release, while you're still trying to figure out exactly what's going into the next release (or sprint) and you haven't yet divided the work among teams. You do them when that two day task turned out to be easier than you thought and it only took one day. There are pockets of found time in software. Use the found time to address the non-urgent problems.... before they become urgent.

Friday, May 6, 2011

Obviously Wrong

I got an email from a vendor today, saying, "This performance test is taking far longer than it normally does. Here's a screenshot. What should we do?" In the screenshot, it showed the following speed and completion percentage for a linear write on a standard hard drive:

linear write 27.89% 1.8 MBps

No, that is not a typo.

Now here's a case where something is obviously wrong. There's no way a modern hard drive should be seeing write rates anywhere near that slow. I don't know what's wrong. Perhaps there's a bad part in there (backplane? motherboard? cpu?). Perhaps the box was built incorrectly and some cable isn't plugged in properly somewhere, or the CPU isn't seated correctly.

The point is that it doesn't matter. When something is obviously wrong, there's no need to continue to follow your plan. You can skip ahead to the "it's broken, now what" part. That holds true for tests like this one, for prototypes, or whatever you're doing. If it's meant to show a result, you can stop as soon as you see a result, even if you thought you'd have to get a little farther down the path before an answer presented itself.
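You can even build the "skip ahead" into the test itself. A sketch, with a made-up plausibility floor - the exact number doesn't matter, only that a result far below it means stop and investigate rather than keep running:

```python
# Assumed sanity floor: any modern drive should sustain far more
# than this for linear writes. Tune to your hardware's ballpark.
MIN_PLAUSIBLE_WRITE_MBPS = 30.0

def check_throughput(measured_mbps):
    """Abort early when a performance result is obviously wrong."""
    if measured_mbps < MIN_PLAUSIBLE_WRITE_MBPS:
        raise RuntimeError(
            f"linear write at {measured_mbps} MBps is obviously wrong; "
            "stop the test and look at the hardware")
    return measured_mbps
```

At 1.8 MBps this bails out immediately - no need to wait for the remaining 72% of the run.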

Thursday, May 5, 2011

Being an Agile Customer

A friend of mine has been working with a development team recently that uses a very simple strategy. It goes something like this:
  • he adds something to the backlog
  • if it's a story, they estimate it by "points" (1, 2, 3, 4)
  • if it's a bug, there's no estimate
  • he orders the backlog
  • they work on the top thing
  • when it's done, they mark it complete
He's welcome to alter the backlog at any time, including interrupting the current thing. Each week, the tool spits out the number of points completed in the previous week and a rolling average of the previous three weeks.

It all sounds great.... but it turns out release planning is a huge pain.

Let's say it's Thursday and our backlog looks something like this:
  • X (needed for client B on Monday)
  • Y (absolutely essential for client A on Tuesday, and if it's finished Friday I can use it over the weekend so that I'm not in a rush to get it ready for the client)
Well, now, that's an interesting problem. On the surface, it should be X followed by Y, since X has to go to a client before Y. But there are a lot of meetings Monday and it would be really nice to have Y for the weekend. So that sort of makes Y due on Friday, and so it should go before X.

It's Thursday, and we have two days before the weekend. Maybe we can get both X and Y? Let's check our velocity.

We did 18 points last week. Our rolling average is 21 points, but it's been trending down. We've done 19 points so far this week. Well, shoot. We're already off trend; we're going to be up this week. That means I really don't know whether they'll finish X or Y or both.
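The arithmetic here is just an average over the most recent weeks. A sketch, where the 24-point week is an assumed earlier data point chosen so the trend matches the numbers above:

```python
def rolling_average(weekly_points, window=3):
    """Average completed points over the most recent `window` weeks."""
    recent = weekly_points[-window:]
    return sum(recent) / len(recent)

# Assumed history ending in the numbers above:
# an earlier 24-point week, then 21, then last week's 18.
rolling_average([24, 21, 18])  # 21.0, and trending down
```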

(By the way, the team finished both X and Y. All the worry for naught!)

Now, release planning is always hard. And this dev team has definitely made it the client's problem - hence my friend's dilemma. But a one-at-a-time backlog makes it hard to plan sets of work in the short term. In the long term, planning works okay - the averages bear out. In the short term, though, it's hard, because it's really just a guess.

The bad news is, it's still a guess. The good news is, there are things you can do to mitigate those guesses:
  1. The most obvious is to not cut things so tight. Give yourself more of a buffer. This isn't always possible, but there's no reason to wait until the last minute when you have advance notice that a customer needs a feature by a particular date.
  2. Consider "non-point" items (bugs, meetings) in the count of the backlog. For example, it's possible to post-facto assign a point count to a bug ("this was 1 point of effort to fix"), and use that to calculate an adjusted velocity.
  3. Add "customer points" to your stories in cases where you have to do something after the story is finished (e.g., item Y in the example above). Some stories are done and there's nothing else for the customer to do before it goes live. Other stories have customer work that's done afterward. Effectively, this increases the overall size of the story. So in your customer backlog, it has to show up earlier... which means it has to be ready earlier in your development backlog.
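Suggestion 2 can be sketched in a few lines. The field names are made up; the idea is just that points assigned to bugs post facto get counted alongside story points when you compute velocity:

```python
def adjusted_velocity(weeks):
    """Per-week totals and overall average including post-facto bug
    points, so unestimated work still shows up in planning.

    weeks: list of dicts with 'story_points' (estimated up front) and
    'bug_points' (assigned after the fix: "this was 1 point of effort").
    """
    totals = [w["story_points"] + w["bug_points"] for w in weeks]
    return totals, sum(totals) / len(totals)
```

An adjusted average is still a guess, but it's a guess that accounts for all the work the team actually did.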