The Back Story:
We run a number of automated tests:
- Build Verification Tests (BVTs). These run after every build. The continuous integration system will not publish the build artifacts (ISOs and .deb packages, in our case) unless these tests pass.
- Nightly Tests. These run every night. They're mostly *Unit-based (JUnit, PerlUnit, etc.).
- Weekly Tests. These are the tests that simply take too long to run every night. They're functionally identical to the Nightly Tests, but each one takes between 6 and 24 hours, so we run them only once a week (the whole set takes about 5 days to finish).
Sometimes there are bugs in these tests. That's fine; we log them and dev fixes them (hooray!).
How many runs does an automated test have to pass before it can be marked as verified?
We do a code review, and the developer runs the test before checking in the fix, but we still want to see the test pass in the BVT/nightly/weekly runs. The question is: how many good runs do we need before we can have confidence in the verification?
I don't think there is a formula for this; there are simply too many subtleties based on the frequency and type of failure, the test structure, the risk of it failing either later in the tests or in the field, etc. So I'm really just looking for a rule of thumb.
The best I've found so far is as follows:
A test must pass on consecutive runs for as many runs as its longest interval between failures, plus one.
So, if a test fails every time, then it needs to run once successfully (an interval of 0 between failures, plus 1).
If a test fails every fifth time, then it needs to run six times successfully (an interval of 5 between failures, plus 1).
If a test usually fails every three or four runs, but one time it went ten runs between failures, then it needs to run eleven times successfully (10 is the longest interval between failures, plus 1).
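To make the rule of thumb concrete, here is a minimal Python sketch of how it could be computed from a test's run history. The function names (`required_passes`, `is_verified`) and the list-of-booleans history format are made up for illustration, not taken from any CI tool. It also assumes an "interval" means the number of passing runs that came right before a failure, so passes at the end of the history that were never followed by a failure are ignored.

```python
def required_passes(history):
    """Return the number of consecutive passes the rule of thumb asks for.

    history: run results from before the fix, oldest first
             (True = pass, False = fail). Hypothetical format.
    """
    longest = 0   # longest streak of passes that ended in a failure
    current = 0   # passes seen since the last failure
    for passed in history:
        if passed:
            current += 1
        else:
            longest = max(longest, current)
            current = 0
    # Trailing passes are not counted: they were not followed by a failure.
    return longest + 1


def is_verified(history, runs_since_fix):
    """True if the most recent runs since the fix satisfy the rule of thumb."""
    needed = required_passes(history)
    return len(runs_since_fix) >= needed and all(runs_since_fix[-needed:])


# A test that fails every time needs one good run.
always_fails = [False, False, False]

# A test that usually fails every three or four runs, but once went ten
# runs between failures, needs eleven good runs.
flaky = ([False] + [True] * 3 + [False] + [True] * 10
         + [False] + [True] * 4 + [False])

print(required_passes(always_fails))
print(required_passes(flaky))
print(is_verified(flaky, [True] * 11))
```

Whether the passes before the very first recorded failure should count as an interval is a judgment call; this sketch counts them, which errs on the side of asking for more runs.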