Friday, September 26, 2008


We have a rather large suite of tests (8500 or so) that starts every night and runs until about noon the next day. These tests are controlled by a single machine.

Yesterday we were having problems with that nightly test controller machine, specifically with the OS on it. So we backed up everything and reinstalled the machine. However, the reinstall wasn't going to be done in time for last night's run, so we grabbed another machine and set it up to run the nightly tests. Being slightly paranoid, I logged in last night after the nightly run started, saw the processes chugging along, saw the log file filling with standard information, said, "great!", and went to bed.

This morning, we got in to discover that the nightly tests had not run.



After digging in some more, we found that the nightly test suite hadn't run because the clean build it does (before it does any tests) had failed. Our machine was configured just fine; it was just a coincidence that we happened to switch test controller machines on a day when the build was going to fail.

The lesson of the day?

Coincidences really do occur.

P.S. I've left out a number of steps in our continuous integration/build/test process because they weren't relevant to this particular story.
