Thursday, April 28, 2011

Rails: Page Names With Numbers

I was working on a Rails application today and kept getting errors like this:

app/controllers/home_controller.rb:7: syntax error, unexpected tINTEGER
def 30onepager

Integer?! What integer?

Turns out method names in Ruby can't start with a digit. I renamed the method:

def onepager30

and everything came out just fine.
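For anyone wondering where the integer came from: Ruby identifiers can't begin with a digit, so the parser sees 30 as an integer literal right after def and gives up. Here is a minimal sketch in plain Ruby (no Rails needed). The helper name definable_with_def? is made up for illustration:

```ruby
# Hypothetical helper: true if `def <name>` parses, false if Ruby rejects it.
# The def sits inside a lambda body that is never called, so this only
# checks syntax and never actually defines a method.
def definable_with_def?(name)
  eval("lambda { def #{name}; end }")
  true
rescue SyntaxError
  false
end

puts definable_with_def?("30onepager")  # false: a leading digit is rejected
puts definable_with_def?("onepager30")  # true: a leading letter is fine
```

In Rails the URL and the action name don't have to match, so a route can usually still serve a path like /30onepager from the onepager30 action.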

Not a hard error to figure out, but a gotcha.

Wednesday, April 27, 2011

Backwards Compatibility

I was talking with a tester today about a new feature and specifically a changed default. The effect here seemed minor, but one of the things we do with most new features is ask ourselves whether there is a backwards compatibility ramification here.

Let's define backwards compatibility as follows:
If the user doesn't change anything, then the system behavior shouldn't change.

The user should not have to change their software to accomplish an upgrade.

In many ways this seems pretty obvious. Don't change API signatures unless you leave the old API in place. Don't change the program name or library location unless you're making a symlink to it. Basically, if it's an interface the customer uses, don't take anything away from it. (We can talk about deprecation separately.)

All changes should be additive: offer the new way and make sure the old way works.
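As a rough sketch of what "additive" can look like in code, here is a made-up Exporter class (not from any real system). The new option gets a default that reproduces the old behavior, and the old method name stays around as an alias:

```ruby
# Hypothetical report exporter, used only to illustrate an additive change.
class Exporter
  # Customers already call this with just the rows. The new keyword
  # argument defaults to the old behavior, so existing callers do not
  # have to change anything.
  def export(rows, separator: ",")
    rows.map { |row| row.join(separator) }.join("\n")
  end

  # Old name kept after a rename, much like leaving a symlink at the
  # old program location.
  alias_method :to_csv, :export
end

exporter = Exporter.new
rows = [["a", 1], ["b", 2]]
puts exporter.to_csv(rows)                    # old call, unchanged output
puts exporter.export(rows, separator: "\t")   # new, opt-in behavior
```

The thing to watch is the default: if the new separator had defaulted to a tab, every existing caller would have seen different output without changing a line.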

A big part of working through backwards compatibility concerns is identifying the interfaces your customers might use. Make sure you consider the following:
  • APIs
  • GUIs (form field order, workflow, accomplishable tasks)
  • program names
  • library names and locations
  • dependencies
  • file layout on disk (does your user use the number and names of these for monitoring?)
When you're thinking about backwards compatibility, it's up to you whether to preserve it or not. Just take some time to consider your interfaces with the customer so you can make the appropriate choice.

Monday, April 25, 2011

ROI of a Single Test?

A coworker asked me this question the other day:

"How do I calculate the ROI of the coverage of testFooStress versus the time to develop and run it?"

It's an interesting question on several levels. Let's break it down a bit:
  1. Why is he asking? Why about this test in particular?
  2. What does ROI mean here?
  3. This is a stress test, so coverage is a bit of a funny word to use. Note to self: poke at this one a bit.
  4. There's an obvious implied measurement here of "cost" as "development and run time". Note to self: poke at this one a bit, too.
Okay, first and foremost, this person has an agenda. A general curiosity about, say, the ROI of tests, would not include the name of the specific test. Therefore, there's something about this particular test that is bugging this guy. First we need to find out what it is, since this conversation is really about that agenda and is only nominally about the ROI or the coverage or the cost of this particular test.

After some conversation, it came out that he was most worried that we weren't stressing the right thing, or possibly not stressing it enough, and that it was an expensive test in terms of number of hours it took to run each night. That's a fair concern. It's not exactly ROI, but it's legitimate.

Let's say, though, that we really were talking about ROI. How do we calculate that?

There are many ways to slice and dice this. One of the simplest is the ROI of choosing to automate a test, which can be calculated roughly like this:

number of runs to recoup cost = cost of automation / (cost of manual run - cost of automatic run)

This, of course, assumes that your automated test accomplishes the same things your manual test accomplishes. I should note that this is only sometimes true.
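Here is the break-even formula as a small Ruby sketch. The method name and the hour figures are invented for illustration, and it assumes all three costs are in the same unit:

```ruby
# Number of runs needed before automating a test pays for itself.
# All costs are in the same unit (hours here); the figures are invented.
def runs_to_recoup(automation_cost, manual_run_cost, automated_run_cost)
  savings_per_run = manual_run_cost - automated_run_cost
  # If the automated run is not cheaper, the cost is never recouped.
  return Float::INFINITY unless savings_per_run.positive?

  (automation_cost.to_f / savings_per_run).ceil
end

# 40 hours to automate, 2 hours per manual run, 15 minutes per automated run
puts runs_to_recoup(40, 2.0, 0.25)  # 40 / 1.75 = 22.86, so 23 runs
```

Rounding up is a choice: after 22 runs you are still slightly behind, so the 23rd is the first run where automation has paid off.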

Beyond that, things get rather more complicated. It's rarely useful to talk about a single test in isolation, and it's rarely useful to talk about an ROI simply because so often the return is unmeasurable. The return is things like defects that never hit the field, for example, and that's not really something we can measure very well. If we're proceeding down this path, we want to talk about more of a general cost-benefit analysis.

We can take as a given that we have X time, Y people, and Z machines (or other resources) to accomplish stuff. Stuff can be development of software, or tests, or documentation, or a myriad of other things. To do any given thing will take some subset of our time, people, and machines. So we do something, then we do something else, then we do a third thing, and repeat until we're out of time, people, and/or machines.

The trick is to pick the things that are going to give us the most benefit for our cost. A benefit is anything good that happens as a result of doing X. Some benefits are tangible: "If we build the widget control feature, we'll get 1000 more customers paying $1000 each!" Some benefits are best described as the reduction of pain: "If we set up a second web server and a load balancer, we won't have as much downtime." Some benefits aren't directly measurable: "If we change the web site from blue and white to green we'll look totally hip and modern!"

Things that increase benefit:
  • features that our customers have asked for
  • features that the market has asked for
  • features that will make us attractive to customers or acquirers
  • features that improve our brand (e.g., being seen as a market leader by coming out with features before anyone else does)
  • less downtime
  • lower visible error rate
  • cleaner, friendlier UI
  • more consistent behavior
  • lower risk (regardless of whether that risk is eventually realized)
Things that increase cost:
  • longer or more difficult implementation
  • resultant increased complexity of the overall system
  • increased risk (again, regardless of whether it is realized)
  • longer or more difficult run time (e.g., a test now takes longer or uses more machines)
  • changing user metaphors (e.g., new UI)
  • more or harder problems seen in the field
You can come up with constructs to measure the cost-benefit tradeoff of doing something, even of running a single test. It would probably look something like this:

Benefit = (likelihood of finding a defect * efficiency at finding that defect) + (prevented) cost of that defect occurring in the field + (prevented) cost in reputation of finding that defect

Cost = development time + (run time * number of runs) + analysis time
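To put some numbers on the cost side, here is a small sketch. The method name and every figure in it are placeholders, not measurements from a real test. The benefit side is left out because it mixes probabilities and dollar costs that are much harder to pin down:

```ruby
# Cost = development time + (run time * number of runs) + analysis time
# Everything is in hours; the numbers are placeholders, not measurements.
def test_cost(development_hours:, run_hours:, runs:, analysis_hours:)
  development_hours + (run_hours * runs) + analysis_hours
end

# A stress test that took 16 hours to write, runs 6 hours a night for
# 90 nights, and needs 10 hours of result analysis over that period.
puts test_cost(development_hours: 16, run_hours: 6, runs: 90, analysis_hours: 10)
# 16 + 540 + 10 = 566 hours
```

Even with made-up numbers, the shape is telling: for a nightly test, run time tends to dwarf the one-time development cost.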

However, all of these numbers are pretty much just guesses. Educated guesses, maybe, but guesses. In the end I have more luck with simply considering the costs and the benefits, and making a choice from those directly rather than running them through a contrived algorithm.

So can you calculate the ROI of a single test (or feature or refactoring or whatever)? Yes. It's probably more productive to think about it in more general terms, though, and to break it down to the fundamental question of: "is this worth what it'll cost me?" In the end, that's the important decision to make.

Friday, April 22, 2011

Do Failure Right

I have a couple of machines that have been hit by this latest Amazon outage. I have one machine in particular that's still out as of this writing. Now, it's not killing me; I can get around it for a few days (good thing, too!). Still, I'm grumpy about it.

And you know what I'm grumpy about? How updates are displayed. Amazon's doing a reasonable job of providing updates and estimates, which I applaud. So why am I annoyed? Every time I want an update, I do this:
1. go to the AWS Service Health Dashboard (already open in a tab)
2. hit refresh
3. scroll back up the page to the line that I'm interested in
4. click the "more" link
5. scroll down to see the latest update

It's a really small thing, and takes maybe 10 seconds total. But the scrolling bothers me; it could be better. Granted, it would shave off maybe 5 seconds, tops, but it's still just a little sloppy, and that tips me over the edge into grumpy about it.

Now, I'm not going to argue that failures are good. I will argue that they're going to happen. My boxes at Amazon EC2 will go down occasionally. My boxes in the office will suffer a power outage (true story: someone backed into a power substation once and poof!). Services will become overloaded, response times will drag, bugs will happen. Given the amount of software and systems we encounter in a given day, failure is going to happen.

So make failure clean. Make it as enjoyable as possible.

Twitter is a great example of this. People whine and complain when Twitter goes down, but their failure message (the famous fail whale) is cute. It's so cute it's spawned a cult following. People make cakes.

People make necklaces. They make Flash animations. All for a failure message!

It turns the conversation from the failure to something more positive, like your cute error message. That doesn't excuse failure, and it doesn't mean you should accept failure. It just adds a positive note to a negative experience.

And that's doing failure right.

Thursday, April 21, 2011

Helicopter Management

There are about as many kinds of management as there are managers.

There are managers who micro-manage:
"First, you're gonna get coffee. Then you're gonna use the factory pattern to add new widgets to the frobble. Then draft an email to this customer and send it my way for review."

There are managers who manage by metrics:
"I see you did 6 story points this week and 8 last week. Talk to me.... what's going on?"

There are managers who try to manage by exception:
"Look, I got you what you needed, and I know you can do it. So I'll get out of your hair and you just let me know if there are any issues."

Different management styles work for different people, on both sides of the relationship. Some people can't manage their own time, and they simply can't work effectively for someone who manages by exception. Some managers aren't detail-oriented enough to be micro-managers, even if they wanted to be.

There are many good management styles. There is only one truly terrible management style: helicopter management.

The helicopter manager sometimes looks like a manage-by-exception type. He disappears for weeks, and employees just keep moving along with the project. Then, out of nowhere, he shows up and starts micro-managing. Every document and email needs to be personally reviewed by him, and he wants to know about all the design patterns, the rate and type of support calls, and the status of each internal bug found.

Sometimes the helicopter manager has a pattern, showing up at "big events" (releases, major client meetings, sales calls). Other times, it appears completely random. Either way, that style of management exhibits inconsistency and causes the people being managed to waste time trying to figure out how to react.

And that's the thing about management. It's an implicit contract between the manager and the employee that says, "here is what I expect of you, and here is what you expect of me". After that contract is in place, it's up to both the manager and the employee to keep to it, and that means behaving consistently. A helicopter manager violates that consistency and that contract, and that causes people to spend more time worrying about the relationship and less time getting work done.

Be a manager in whatever style works for you and for the people who work for you. Just don't be a bad manager; find a style that works and stick to it until it doesn't work any more. Don't be the one to break the contract. Don't be a helicopter manager.

Disclaimer: I feel the need to point out that I do not currently work with any helicopter managers. This was spawned by a conversation with a friend who is taking her first management position and wanted some advice.

Tuesday, April 19, 2011

Problems That Are Not Problems

A problem is anything that occurs that gets in the way of achieving your aim. Problems can be bugs, unknowns, or anything else large or small that blocks your intended path.

Much of software development and release is ultimately about solving problems. Some problems are easy to solve. It's a bug, so you fix it. There are two candidate designs on the table, so you prototype them to see which is better. Some problems are less easy to solve, or have no apparent solution. Sometimes we just don't know.

These are the kinds of problems that can stop an effort (or a release, or a product, or even a company) dead in its tracks. Things are going along just fine, then a problem arises. Days or weeks pass while we seek a solution, the release date goes by, other work piles up, and we're all stuck on a problem without a solution.

Problems without solutions are not problems. They are simply circumstances.

Until and unless you have a solution, a problem is simply a fact. It's something that you have to handle but cannot change (at least, not immediately), just like you have to handle the operating system you chose to deploy on, and the market in which you compete.

Can these things be changed? Yes. Over time, we can choose to stop deploying on Red Hat Linux and start deploying on Windows. We can choose to leave the healthcare market and enter the financial market. We can find a workaround or change the nature of the problem. But until we do, that problem is simply a circumstance with which we have to deal.

For example, I had a problem once where there was a massive bug in the production deployment tool we were using. We simply couldn't use the tool any more. Of course, we discovered this late in the release, and we couldn't immediately solve the problem. So for that release, the problem had no solution. It was simply a circumstance of the release. Instead of spending a lot of time and effort worrying about it, we simply assessed the situation, determined there was no immediate solution, and accepted that in this circumstance we would be going with a new deployment methodology.

Once we accepted that, we were able to keep going on the release. It wasn't the prettiest deployment (we ended up doing it by hand with some quick tooling for validation), but we got through it and we got through it on time. All because we didn't waste time worrying about a problem we couldn't solve.

Oh, and for the following release, we solved the problem: we switched deployment tools.

Friday, April 15, 2011


When I was first starting out in the workforce, my dad gave me one piece of advice that I thought was really weird. Years later, I finally figured out that it was actually really smart:

Never be indispensable.

Now, when I first heard this, I thought it was silly. This was Silicon Valley in the year 2001, after all. The only people not getting laid off were the indispensable ones, right?! Wrong. A lot of people were losing their jobs, some not so good, some very good, and sometimes it was everybody (yes, I did ride a startup all the way down).

The trouble with being indispensable is that you can't leave. You can't be fired, sure. But you can't take vacations. And you can't get hit by a bus (or chuck it all and move to Antarctica). You also can't be promoted.

Be very good at what you do. Be important. Be useful. Be a hard worker.

And then help others do what you do. Help those around you be good at what they do, and know how to do the things you know how to do. Teach others the tricks you know, and the skills you have. Help them replace you, so you can move on to new and bigger things.

Don't be indispensable. Be a leader.

Wednesday, April 13, 2011

Interim Answers May Prove to Be Wrong

Once a project gets to be longer than a day or two, the questions start to come up from team members, from management, from product management, from customers:
"How's it going?"
"What have you learned so far?"
"How much better/faster/stronger/larger is it going to be?"

It's an attempt to start setting expectations early, and that's fine. It's even all right to answer the question, to describe what you know so far, and to provide interim results.


A warning for all of the askers out there. You managers, you customers, you project managers, you teammates.

Interim answers may prove to be wrong.

Halfway through a project, the team may have an inkling of the benefits, but a lot can still go wrong (or right!), and the end results may be very different. For example, we might have some great performance improvements.... but not yet have discovered the fatal flaw that negates them. Or we might have something that now lets us handle 100 concurrent users.... but not yet have done the test that shows we can handle 1000. We say, "we think X", but we might be wrong. We'll know with more information.

So please, feel free to ask the question. Just be prepared to later hear a different answer than the first one you got. Don't make the interim results public, and don't make decisions that require the results to be precise; they may change a bit and you need to be aware of that. Instead, take them for what they are: current status and partial understanding.

Ask questions to understand risk. Ask questions to get a sense of progress. Ask questions to let people share things they're excited about. All these are great reasons to ask for interim results. Just be prepared for the time when the word interim is important, and the final results don't match - for better or for worse.

Tuesday, April 12, 2011

Sessions for Long Tests

I've been working with a program that is primarily focused on data ingest. We send a lot of data through the system and perform tests and analysis based on what happens during and after that data ingest. One of the testers came to me recently and said, "I've been wanting to try session-based test management, but I can't figure out how to do it when our tests take so long!"

Now, I've used session-based test management techniques in the past, but it was for a very different application. So I googled it. The first four examples I found while googling "Session-based testing" and "session-based testing example" were:
  • Testing the new file and open file features of an unspecified program
  • Creating a test coverage list and feature checklist for a win32 application based on its user manual
  • Analyzing a view menu in a map application
  • Testing a bookstore web application
Shoot. Every single one of these is a GUI application that offers very quick turnaround time. Now, the GUI part isn't a big deal; that's easy to accommodate. We have other ways of getting information into and out of our system. The turnaround time is rather harder, though. You see, with all of the examples above, the time lapse between deciding what an interesting test would be, performing the test, and seeing the results is measured in seconds. It's fast.

For us, deciding to perform a test frequently results in several hours before we get the results to analyze. For example, we might want to test what happens when we load the system from the full on-disk file and run data through it when that data has certain characteristics. At a 170MB/s average read rate from disk, the step "load the system from the full on-disk file" takes about 1000 seconds (a bit over 16 minutes). Depending on the characteristics of the data we're interested in, the data run that follows can take minutes (20GB of data or so) to hours (multiple TB of data). Only then do we get to analyze our results.
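The arithmetic behind those times is just size divided by rate. A quick sketch, with some assumptions: decimal units, a 170 GB file (inferred from 170MB/s for 1000 seconds, not stated outright), 2 TB standing in for "multiple TB", and data moving at the same 170MB/s throughout:

```ruby
# Decimal units: 1 MB = 1_000_000 bytes.
MB = 1_000_000
GB = 1_000 * MB
TB = 1_000 * GB

# Seconds needed to read `bytes` at a sustained rate given in MB/s.
def read_seconds(bytes, mb_per_second)
  bytes.to_f / (mb_per_second * MB)
end

RATE = 170 # MB/s, the average read rate quoted above

load_seconds = read_seconds(170 * GB, RATE)          # assumed file size
small_minutes = read_seconds(20 * GB, RATE) / 60     # "20GB of data or so"
large_hours = read_seconds(2 * TB, RATE) / 3600      # 2 TB is an assumption

puts load_seconds            # 1000.0 seconds
puts small_minutes.round(1)  # 2.0 minutes
puts large_hours.round(1)    # 3.3 hours
```

So a single test can swing from a couple of minutes to an afternoon depending on nothing but the data set size.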

So how do we structure a session around that?

Let's define what the problems really are:
  • Long wait periods (for humans) in our tests mean it's hard to focus on one test or one session at a time
  • Expensive tests mean that we can't do many of them inside a typical session window of an hour or two
  • We frequently find the most interesting information when we compare results across sessions
Now, I can't say we've fully solved the problem. Frankly, our attempts so far have very much been trying to put a square peg in a round hole, and this may not be the technique for us. However, we're trying not to throw the baby out with the bath water. And there are some benefits to structuring what we approach and when so we don't succumb to death by analysis.

Some things we have learned:
  • Separate gathering data from analysis. We focus on setting up test runs and configuring our data one day, then let the tests run overnight. The next day we separately analyze the output from the previous night's runs, from each other's runs, and from other data we've already gathered. By making analysis an explicit later step, we free ourselves to look at a broader spectrum of data, and we shorten the feedback loop on this part of the testing. Charting and calculating on existing data is much faster than gathering it.
  • Go wide. We don't run one test at a time; we run as many as we can get hardware for. That forces us to think a little bit before we run off just poking at the system, but we get more information per unit of time spent. And if a test does something unusual, we have some near peers to compare it to. (Ask me how that helped us track down a time skew issue in the lab that had nothing to do with our software but that had been bugging us for weeks!)
So are we doing session-based testing? Nope, not in the strict sense. The separation of design and analysis sort of kills that. But it works for us, and it works for the system we work on.

Monday, April 11, 2011

Allowing Mistakes

One of the tropes of management is this:

"Allow people the freedom to make mistakes."

The idea is that by making it okay to make mistakes, people will stretch farther and the successes will be larger. Mistakes drive innovation. So we should make sure that everyone who works for us knows that mistakes are acceptable.

Here's the thing: we're all human, so mistakes are inevitable. Every single one of us is going to mess up sometime (let's not discuss the time I miscounted how much test data I had put into a system, overfilled it, and didn't catch it until I came back from vacation to spend days recovering and restarting the test instead of presenting results).

So, yes, mistakes will happen, and that's okay. There's a limit, though. "Oops, that was a mistake" is not an excuse for presenting bad results or becoming that guy who breaks the build at least once a day.

One mistake is a simple mistake. Three mistakes is a bad day. Constant mistakes.... that's just plain sloppy.

When you make a mistake, take it as just what it is: a mistake. Also take it as an opportunity to check your work a little more closely next time. Be the guy who makes mistakes. Don't be the sloppy guy.

Wednesday, April 6, 2011

Customer-Driven Knob

Yesterday I was in a meeting talking about a feature in our product. One of the things that we were discussing was where to set the queue depth. For this particular queue, anything from 64 to 1024 would function just fine and would fit within the resource budgets we had. The trick of this queue was its effect: setting the queue depth higher let us handle more volume; setting the queue depth lower made throughput faster. So what number to choose?

Someone said, "Let's just let the customer set this. We can make it a knob."

Okay, yes, we could do that. But how on earth is the customer going to know what value to choose? This is a knob driven by the internal implementation and design choices the team has made. We could explain the rough effects of various settings to customers, but really, they're going to be cranking it to the "more volume!" or to the "faster faster!" end.... and it'll pretty much be a guess from there.

I'm not anti-knob. Not at all. Letting customers flex the system to their needs is very important. But the customer should drive the knob. After all, more things than just the queue depth affect the throughput versus total volume optimization. If we're going to give the customer a throughput vs. volume knob, we should flex all of those things under the covers.

Put knobs in your software when there are choices the customer could reasonably make. Just make sure the knobs come from the customer's perspective, not from the internal implementation decisions of the software.
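Here's a rough sketch of what a customer-facing knob could look like. Only the 64 to 1024 queue depth range comes from the discussion above. The profile names and the batch_size setting are invented stand-ins for the other internals that would move with it:

```ruby
# Hypothetical settings: the customer picks a goal, and the software maps
# it to internal values. batch_size is an invented example of "the other
# things" that affect the same tradeoff.
PROFILES = {
  throughput: { queue_depth: 64,   batch_size: 8 },
  balanced:   { queue_depth: 256,  batch_size: 32 },
  volume:     { queue_depth: 1024, batch_size: 128 }
}.freeze

# Look up the internal settings for a customer-facing goal.
def settings_for(goal)
  PROFILES.fetch(goal) do
    raise ArgumentError, "unknown goal #{goal.inspect}, expected one of #{PROFILES.keys.inspect}"
  end
end

puts settings_for(:volume)[:queue_depth]      # 1024
puts settings_for(:throughput)[:queue_depth]  # 64
```

The customer only ever sees the goal names. The team stays free to retune the numbers behind each one, or add more internal settings, without changing what the customer chose.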

Friday, April 1, 2011

Escalate With Purpose

One of the best parts of what I do is that I get to talk to a lot of different people with a lot of different styles. I talk with engineers, implementation analysts, support reps, sales people, project managers, and software consumers. Some of these people are incredibly thoughtful about technology and others really just want to accomplish some other task and would like the software to get out of their way.

There are different communication styles as well. Some people tend to downplay things. Others are the Chicken Littles of the world and everything is a huge problem. Still others simply enjoy the attention they get from being dogmatic; you'll hear "always", "never", "criminal", "unethical" a lot.

When we work with issues in a released software product, they generally start in support. If they're bad, they escalate into QA and/or development. The same thing goes with interactions between people.

Discussing an approach, or a problem, or a dilemma can be a great thing. Sometimes, the people in the discussion can't agree or don't agree. At that point it will either be dropped or the discussion will escalate. Someone in that conversation is the first to actually perform an escalation. Conversation escalation can take several forms:
  • bringing in an authority figure
  • expanding the scope of disagreement (think personal attacks)
  • introducing defensive behavior and terminology ("always", "never", "only an idiot would", "it's criminal to"), usually with a goal of goading the other party into agreement
For example, an engineer and a project manager I work with were having a discussion about when they needed some content from the client. The content would be bundled into the software, munged a bit for display purposes, and then shipped. They disagreed. The engineer said that he needed two days to do the munging and integration, so two days before shipping was fine. The project manager wanted to pad that and tell the client eight days before shipping. This is a simple disagreement, and one that shouldn't be a big deal. However, the engineer chose to escalate the conversation. He said it would be unethical to ask the client to rush unnecessarily and probably create bad content, and he wasn't going to be part of that no matter what the project plan said or what the client thought.

That kind of overreacting, defensive behavior shut the conversation down immediately. All of a sudden a small difference of opinion was a big problem. The project manager was in a corner licking his wounds, and the engineer was no longer listening to any reality-based opinion and would brook no dissent. Oh, and the underlying disagreement - which should have taken 10 minutes - wasn't solved for days.

The moral of this story is to beware of escalation in a conversation. Making dogmatic statements or running to the boss is often not the right way to approach a problem.

If you're going to escalate a conversation, make sure you understand why you are doing so and what you are seeking to accomplish.

There are real and legitimate times to escalate. If the client is asking that we do something illegal, for example, that's something that has to go to a higher-level authority figure; it needs to be escalated. If a coworker is implementing an empirically bad design (and "I don't like it" is not a good reason), then that should be escalated to the engineering team or the architect.

Escalate with purpose, or don't escalate at all.