Monday, December 21, 2009

Test Lab Overhead

Let's say we have a test lab, basically a pool of machines available to anyone who wants to develop, run tests, etc. The lab consists of several parts:
  • a pool of machines for running tests
  • various resource machines for specific branches or versions (e.g., a build machine for each version)
  • various common utilities, including a central file server, DNS, a machine reservation system, etc.
In order to successfully use the lab, we've instituted some utilities and checks. For example, every time a machine is released, we run a check script on it to confirm that it has the right packages, DNS, mounts, etc. These kinds of utilities run out of a special mount that the entire lab has access to.
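To make the release check concrete, here's a minimal sketch of what such a script might boil down to. It assumes the machine's state has already been gathered into a dictionary; the function name, the dictionary layout, and the sample values are all made up for illustration and are not from any real lab tooling.

```python
def check_machine(observed, expected):
    """Compare a machine's observed state to the expected state.

    Both arguments are dicts with three keys (a hypothetical layout):
      "packages":    {package name: version string}
      "dns_servers": list of DNS server addresses, in order
      "mounts":      list of mount points
    Returns a list of problem descriptions; an empty list means the machine is clean.
    """
    problems = []

    # Every expected package must be installed at the expected version.
    for name, version in sorted(expected["packages"].items()):
        found = observed["packages"].get(name)
        if found is None:
            problems.append(f"package {name} is missing")
        elif found != version:
            problems.append(f"package {name} is {found}, expected {version}")

    # DNS servers must match exactly, including order.
    if observed["dns_servers"] != expected["dns_servers"]:
        problems.append(
            f"DNS servers are {observed['dns_servers']}, expected {expected['dns_servers']}"
        )

    # Every expected mount must be present; extra mounts are ignored.
    for mount in sorted(set(expected["mounts"]) - set(observed["mounts"])):
        problems.append(f"mount {mount} is missing")

    return problems


if __name__ == "__main__":
    expected = {
        "packages": {"python": "2.6"},
        "dns_servers": ["10.0.0.2"],
        "mounts": ["/mnt/utils"],
    }
    observed = {
        "packages": {"python": "2.5"},
        "dns_servers": ["10.0.0.2"],
        "mounts": [],
    }
    for problem in check_machine(observed, expected):
        print(problem)
```

Returning a list of problems rather than stopping at the first one means a single run reports everything that needs fixing before the machine goes back into the pool.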

When we have one branch or a few very similar branches in active development, the lab works fairly well. There's really only one correct state for a test machine, and possibly a (very) few variations on that state. Updating DNS is a matter of changing one script on one or a few branches and updating that utilities mount. Simple, and works great.

Things get more complicated when you have multiple products or multiple branches that vary widely. All of a sudden your fairly simple check script has a lot of if clauses for different package versions on different branches. Changes (a new DNS server, for example) have to be propagated to a lot of different branches. It all starts to get a bit unwieldy. Is it still possible to work effectively? Sure. It's just more work.
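One way to keep those if clauses from spreading, sketched here under assumptions, is to describe the expected machine state as data: a shared base plus small per-branch overrides. The branch names, package versions, and addresses below are invented for the example; the point is only that a lab-wide change such as a new DNS server gets made in one place.

```python
import copy

# Lab-wide defaults (hypothetical values). A new DNS server is changed here, once.
BASE = {
    "packages": {"python": "2.6", "gcc": "4.4"},
    "dns_servers": ["10.0.0.2"],
    "mounts": ["/mnt/utils"],
}

# Only the differences from BASE are recorded per branch (hypothetical branches).
BRANCH_OVERRIDES = {
    "release-1.0": {"packages": {"python": "2.5"}},
    "release-2.0": {"mounts": ["/mnt/utils", "/mnt/builds-2.0"]},
}


def expected_state(branch):
    """Return the expected machine state for a branch: BASE plus that branch's overrides."""
    state = copy.deepcopy(BASE)  # never mutate the shared defaults
    overrides = BRANCH_OVERRIDES.get(branch, {})

    # Package overrides are merged per package, so a branch lists only what differs.
    state["packages"].update(overrides.get("packages", {}))

    # List-valued settings are replaced wholesale when a branch overrides them.
    for key in ("dns_servers", "mounts"):
        if key in overrides:
            state[key] = list(overrides[key])

    return state


if __name__ == "__main__":
    for branch in ("head", "release-1.0", "release-2.0"):
        print(branch, expected_state(branch))
```

This doesn't make the variation go away. It just moves it out of the control flow and into a table that's easier to read and to diff.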

The other obvious option is to separate the test infrastructure code entirely in source control. Put it in a separate project with separate branching, etc. You still have to maintain test code across branches at the interface points (interfaces will eventually change), but the problem is smaller because it's isolated in that one project. Tests themselves would of course stay with the code they're testing. The downside to this option is that it adds dependencies between projects, which introduces a different kind of management overhead.

You can choose either option - test infrastructure in the main source branch, or test infrastructure separate - but whichever one you choose, there are a few things you should always make sure of, just to keep your odds of success up:
  • Don't support a stagnant branch. If you're going to have to support a branch in the lab, then make sure it gets used periodically (built and a smoke test run). This ensures you don't go too long without checking for breakage and handling it.
  • Notifications must always work. Even if you have to compromise on features or make something harder to run, keep your notifications working. When you have things like build failure notifications on head, the absence of those on a branch can give you a false sense of security and let breakages persist silently for longer.
  • Retirement is an option. Maintaining branches forever is a huge proposition. At some point you have to draw a line and say that you're going to stop supporting a branch. Generally this will coincide with end-of-lifing or end-of-supporting that branch in the field.
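The first point in the list above is easy to automate. Here's a rough sketch, assuming you record the date of the last successful build-and-smoke-test run for each supported branch; the function name, the 14-day default, and the sample branches are all assumptions for illustration.

```python
from datetime import date, timedelta


def stale_branches(last_smoke_run, today, max_age_days=14):
    """Return the sorted names of branches whose last smoke run is too old.

    last_smoke_run maps branch name to the date of its last run, or None
    if the branch has never been run. A run exactly max_age_days old is
    still considered fresh.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        branch
        for branch, last_run in last_smoke_run.items()
        if last_run is None or last_run < cutoff
    )


if __name__ == "__main__":
    runs = {
        "head": date(2009, 12, 20),
        "release-1.0": date(2009, 11, 1),
        "release-0.9": None,
    }
    for branch in stale_branches(runs, today=date(2009, 12, 21)):
        print(f"{branch}: no recent smoke test; run one or consider retiring the branch")
```

Anything this turns up is a candidate for either a fresh smoke run or a retirement conversation, and the output is the kind of thing worth sending through the same notification channel as build failures.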
Maintaining a workable test lab is part of the overhead of a project, and a key part of the success of a project. Doing it with as little fuss as possible is something that takes active thought and maintenance - make sure you're giving it the attention it deserves.
