Monday, June 30, 2008

Bug Dances

There are lots of tricks to clean out a bug queue. Often it's a developer who wants to hit a certain bug count, either from official pressure or from internal, self-imposed pressure. Sometimes it's a QA engineer, and in rare cases it's the project manager "helping" triage bugs. These are some of my favorites:

The Friday Afternoon Duplicate Rush
The classic phrase here is "these look pretty much the same". Two things with the same symptoms are typical candidates here. Two bugs in the same area of code with different symptoms but the same apparent result (e.g., stuff happened, and then it crashed) are also typical candidates here. Combine a duplicate rush with one or two bug fixes and you've got a recipe for a lot of closed bugs!

The Freak of Nature
This is another common one. Typically this happens to intermittent bugs that haven't recurred in a while. The bug will be marked as unreproducible, often with a notation like, "the network must have been congested" or "infrastructure hiccup". Sometimes this isn't closed outright; instead it's sent back to the finder "to try to reproduce it".

Info Any Info
This is nearly identical to "the freak of nature". The major difference is that the person is less sure, so the bug is being kicked back for more information. If this is the first time, then it's probably reasonable. When this happens more than once, the requests start getting a little weird.

The Miracle Cure
This one is my favorite. The code around the bug has changed, so odds are the bug isn't there any more! There are a couple variations on this one. Sometimes it's honest belief: "I was working in this module and changed the method that this bug is in. Given my loop change I think I fixed this one, too." Sometimes it's a little bit hopeful: "I worked on this code last week and cleaned it up. Surely this got fixed then." Sometimes it's outright wishful thinking: "I think [some-other-developer] was working on this module last week. This is probably fixed." I worked with a developer who was notorious for this for several years; his success ratio with this was about 20%.

Please note that all of these reasons to close bugs are perfectly valid - some bugs really are duplicates, some are the result of unreproducible infrastructure hiccups, sometimes more info really is needed, and sometimes fixing one bug also fixes another. But if it happens a lot, you need to look deeper. Make sure you're not incentivizing (directly or indirectly) low bug counts or high closure rates, and watch for patterns that suggest someone is just cleaning out a queue: a couple of duplicates is normal; a lot is not.

Friday, June 27, 2008

Third-Party Products

The product I'm working on uses a number of third-party products. For example, we use SSH extensively, and we use a standard SSH library - it wasn't one of those things where writing our own made any sense. We also use Java and Tomcat. Again, this is not surprising - these are used in a lot of products. Normally, this is a huge time saver and bug reducer; these products have been stressed in a huge variety of production situations and we're getting the benefit.

There's one drawback, however. These products have bugs in them (horrors!). This is one of the more frustrating things to have happen, because this isn't even code you wrote. It's code someone else wrote! Nevertheless...

You are responsible for bugs in products you use.

Your customer doesn't care (usually) what third-party products you use. The system your customer bought is yours and if it fails, then it's your problem to fix it, no matter where the actual issue lies.

So when you're testing, make sure you're testing your third-party products. And when you're fixing bugs, don't forget the ones in the products you depend on.

Thursday, June 26, 2008

Software Ain't the Only Thing With Constraints

I spent yesterday at the dentist's office, and while I was sitting there with my mouth wide open I got to thinking about how much teeth are like software projects.

So we were constrained on resources: the dentist, the assistant and the patient. 

We were time boxed: 90 min and out, according to insurance. 

We were given a task that we didn't fully know the details of: perform a root canal on a tooth with an unknown number of roots (3 - 5, apparently), an unknown number of cracked roots (0 - 3, or maybe 5), and on a patient with an unknown response to anaesthetic.

Sounds like a lot of software projects: constrained on time, resources, AND features.

And now I'm going to go try not to bite my tongue for a bit...

Wednesday, June 25, 2008


When I talk to recruiters and they ask what I'm looking for, my answer invariably includes a phrase like, "I want people who have opinions and don't use their more junior position as an excuse to not express them." Really, I want people who will challenge me. Now I know this is not unusual. I've had more than one boss who thought that his opinions didn't carry more weight just because he was the boss. 


There's a right way to do challenges. And a wrong way to do challenges.

The wrong way (don't do this!)
  • Challenge everything. Don't challenge things that don't matter - "I think we should have this meeting at 11 instead of at 10 just because". 
  • Challenge loudly. Don't shout your challenge to the rafters. People are much less defensive if you talk to them one-on-one.
  • Challenge rudely. Remember, you're challenging the assumption or the conclusion, not the person. So don't challenge the person.
  • Gloat. The words "nanny nanny boo boo" should never escape your lips!

The right way (do this!):
  • Challenge yourself. Before you go rushing around showing other people how wrong they are, challenge yourself. A successful challenger makes darn sure he's right before he goes challenging other people.
  • Challenge with reasons. Bring your point to the table and also why you think it. "Gut" feelings are not good enough.
  • Challenge positively. Don't say that something is wrong with the other point of view; say that something is right with your point of view.
So yeah, don't let rank rule your life. Challenge yourself, challenge your peers, challenge your boss. This can be extremely effective, as long as you're generally correct and you're doing it with a generally positive demeanor. You know, all that standard "works well with others" stuff!

Good luck being insubordinate in a good way!

Tuesday, June 24, 2008

Just Testing

I'm a tester by trade, and quite proud of all the things we can accomplish. However, let me be very clear:

Testing alone is not useful.

A test provides information, which is great, but doesn't actually accomplish anything by itself. The key is what you do about that information.  Gathering data almost always implies some change:
  • Changes in expectations. For example, "the tests say our limit is 1 TB/day". The action may be to adjust expectations; to say, "yes, the system as it stands is fast enough for us".
  • Changes in software. A test finds a bug; the software change is to fix the bug.
  • Changes in product planning or marketing. A test finds that a feature limit is too low for client needs. The marketing materials get changed, or the product plan for the next release gets "enhance this feature" added to it.
But all of those are actions that result from testing. Just testing only tells you the information; it doesn't do anything to deal with the information.

So don't accept requirements that something be tested. The requirement is that the feature works, or the speed is x, or the failure is handled gracefully. Be explicit about that; the test is a means to an end, but it's not an end in and of itself.

Monday, June 23, 2008

Rcov and Heckle

A project I'm working on recently got to 100% coverage according to rcov. Wahoo!..... right?

Actually, rcov only tells you about C0 test coverage. This is great as far as it goes, but 100% of the lines of code executed is not the same as testing every branch of the code. Our tests may not be complete.

So we've started using heckle. Heckle does code mutation to exercise your test coverage. Basically, it changes your code - altering conditionals, changing parameters, etc. - and looks for your tests to fail. If your tests don't fail, then you're not really testing that code path.

Heckle so far has found some areas we're not testing as well as we'd like. This is going to be an adventure!
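To make the gap between line coverage and real coverage concrete, here's a minimal sketch (the method and numbers are invented for illustration) of a suite with 100% C0 coverage and an untested branch:

```ruby
# Hypothetical method: both branches live on one line, so a single test
# executes every line and a line-coverage tool like rcov reports 100%.
def discount(order_total)
  order_total > 100 ? 0.1 : 0.0
end

# This one assertion is enough for 100% line coverage...
raise "unexpected" unless discount(5) == 0.0

# ...but without an assertion like this one, the > 100 branch is never
# actually checked. A mutation tool like heckle could change the
# threshold and the "complete" suite above would still pass.
raise "unexpected" unless discount(150) == 0.1
```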

Friday, June 20, 2008

Test the Entire Method

I'm working on a project that - among other things - takes in a set of data points and spits out a bar chart on the UI. These charts describe things like number of messages per day over the past two weeks. The project is a Ruby on Rails app, and to generate the graphs, we use Gruff. Gruff internally uses RMagick to actually draw the images. Basically, you feed Gruff your data points and it gives you back a PNG.*

So, how to test this thing?

With Gruff as it stands, you can feed it test data and check that the PNG comes back. So my test looks like this:

def test_messages_chart
  get :messages_chart, {}, {:session_id => "1"}
  assert_equal "image/png", @response.content_type
  assert_equal 500, assigns['bar_chart'].width
  assert_equal 250, assigns['bar_chart'].height
end

This is great as far as it goes.  But I'm not the trusting type, and nowhere am I actually asserting that the right values got onto that chart. And there's a lot going on in the messages_chart method. It has to:
  1. figure out how many days of data to show
  2. query the db to get that data
  3. hand the data off to Gruff (which then munges the data and calls RMagick to actually draw the picture)
  4. get the image from Gruff and display it
My test code can tell me that items 3 and 4 worked right - I get a picture, after all. It doesn't tell me that we did items 1 and 2 right. Looks like my test is incomplete. And it was incomplete because Gruff didn't let me assert on the values that had been fed into it. I wound up having to write a mixin for Gruff to allow me to get the values for each data point, and then added that to my test.
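The mixin itself was small. Here's a sketch of the idea - note that the internal @data layout (an array of [label, points, color] entries) is an assumption about Gruff's internals, so check it against your Gruff version. The FakeChart class exists only to make the example self-contained; in the app you'd mix the module into Gruff::Bar.

```ruby
# Expose the raw data points fed into a chart so tests can assert on them.
module ChartInspection
  # Returns { series label => array of data points }.
  def data_points
    @data.inject({}) do |acc, (label, points, _color)|
      acc[label] = points
      acc
    end
  end
end

# Stand-in class mimicking the assumed internal @data layout.
class FakeChart
  include ChartInspection
  def initialize
    @data = [["messages per day", [3, 7, 5], "#cccccc"]]
  end
end

FakeChart.new.data_points  # => {"messages per day" => [3, 7, 5]}
```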

But the moral of the story is that if at all possible, you should look at your method and make sure you're testing all of it. Don't let your tools dictate what you test; make your test dictate what your tools should do.

* As a side note, this is a great way to do these types of graphs. It really eliminates the overhead of maintaining CSS-based graphs across browsers.

Thursday, June 19, 2008

Through the Wall

Some days, the only way to get to the other side of the wall in front of you is to go straight through it.

Put your head down and get it done. And good luck.

Wednesday, June 18, 2008

Beware the Snow

When we have a candidate in for a QA position, I always put two people in the room in the first round interview:
  • A senior team member.
  • The most junior team member.
The senior team member is expected. This is the guy who's going to see how good the candidate really is. Depending on the position, this can be a very technical code-oriented interview. Alternatively, it can be a very test design and testing-oriented interview.

I find I learn some really interesting things putting a junior team member into an interview. It's unexpected, and anything that gets a candidate away from the expected questions and answers is a good thing. 

There are a few candidate reactions to my junior team member:

Talking Down
The candidate starts talking down to the interviewer, usually while trying to show he can be a mentor. Sometimes it's just nervousness, but sometimes it's a real personality conflict.

Snow Job
This is the special province of the insecure candidate. With this candidate, you'll find that the more senior interviewer had a hard time getting concrete information out of the candidate and walked away with the impression that maybe he doesn't know as much as he thinks he knows. The more junior interviewer walks away shaking his head over the plethora of terms that he didn't think went together quite that way.

Mutual Respect
When the right candidate walks out of an interview with the junior team member, they've both learned something. This is a candidate we're almost certainly going to move forward with.

So learn what your candidates know, and watch how they react ... to the people with more experience and to the people with less experience.

Tuesday, June 17, 2008

Perception Is Reality

We got to talking today about a certain feature in a product. And someone said, "Oh, we never use that feature. It's not really good yet."

(Insert here a chorus of developers having angst. This particular feature just went through an overhaul and is actually pretty darn good. Include the use of the phrase, "I haven't seen any bugs logged against it. If there are no bugs, where's the problem?")

So why the disconnect?

After spending some time with the complainer, it came out very simple. The feature itself works just fine every time. However, the complainer is scared to use the feature on a busy system because of other negative consequences that might occur. So there's nothing wrong with the feature itself - we're all confident of that - but the feature is still scary. The end result is that this feature is not usable as is; the customer simply won't use it.

There are two things to realize: (1) there's nothing wrong with the feature, so that's not what needs fixing; and (2) the customer doesn't believe the feature is safe to use. We can't change item 1 - there's nothing to fix - but we can change item 2.  We have two courses of action open to us here:
  • Either change the customer's perception
  • Or change the feature to make it feel even safer
In either case, development has work to do. No matter how wrong you think your customer is, and no matter whether there's evidence to match the customer's perception, you need to keep working until the customer's perception matches reality, and everyone's happy with that state.

The moral of the story?

Your customer's perception is your reality.

* Sorry for the vagueness. Names and features have been changed to protect the innocent.

Monday, June 16, 2008

Rails Asserting Image Encoding

This is a little thing, but it took me a bit to figure out today. Basically, in this project we had an issue where people uploaded photos and they "rendered funny". Turns out the problem was that no matter what was uploaded we set the MIME type to "image/gif". Hence, jpgs (and anything not gif) "rendered funny".

So I wanted to write a test to show the problem. Here's what we wound up with... easy, once you know what on earth that is called:

def test_image_display
  get :image, {:id => "28"}, {:session_id => "3"}
  assert_equal "image/jpeg", @response.content_type
end
This should work for non-image MIME types as well - XML, etc. - although I've only tried images so far.

Friday, June 13, 2008

Ask When You Know the Answer

There are three reasons to ask a question:
  1. because you want or need to know the answer
  2. because you want or need to know what the answer is not
  3. because having asked the question is what matters
The first reason to ask a question is fairly obvious. You are seeking positive information - what something does, when something will, etc. For example:

"Have we fixed bug 3628 yet?"
"Yup. It'll be in the next build."

The second reason to ask a question is to eliminate possibilities. These are the types of questions you ask when positive information is unavailable or unimportant. Much like the use of negative space in art*, the thing you are seeking is defined as much by what is not as by what is. Questions with non-assertive answers come up a lot in debugging problems. For example:

"What could be slowing us down?"
"Well, we don't see any processes in disk wait, so we know it's not raw disk I/O that's causing it."

The third reason is the oddest. Sometimes you ask a question and really don't care about the answer. The point is to have asked the question. This one gets used a lot in human situations, and the point is the interaction rather than the exchange of information.

Usually, it's not important to understand why you're asking a question - you'll get an answer and use it in one of these three ways subconsciously. However, when you're getting frustrated because you just can't get an answer, it can help to remember that sometimes the answer is not the point.

* I was going to include a gratuitous picture showing negative space in art, but it didn't quite fit well, so I'll stick it down here for those who made it this far. (This is MC Escher's Three Worlds)

Thursday, June 12, 2008

Timeline Tools

I wrote a while ago about time-based event modeling as an analysis technique. Basically, identify what the system is doing over time explicitly - draw it out - and the patterns you see will point you toward the problem (or problems). This is great, but getting a tool to do the actual timeline with is somewhat problematic.

So, what are your choices?

Whiteboard/Large Poster/Blackboard
This is a good tool for fairly well-known timelines. However, when you run out of room you can find yourself rewriting a lot. It's also difficult to share with people who aren't physically present.

Sticky Notes on a Whiteboard
This is my preferred method when remote access isn't a concern. You put your events on sticky notes, so when things get crowded or your understanding changes you can simply move them around. One caveat: use good sticky notes. You don't want to walk in one morning and discover that the stickiness has worn off and your timeline is now on the floor.

List-Based Timeline
This can be done on paper, in Excel, on a wiki, etc.; the format choices are numerous. This one has the advantage of being shareable. However, I find it makes it a lot harder to see patterns. Lists can't show gaps, or different rows for different components.

Online Timeline Tools
In general, these tend to be a bit finicky. They show timelines with icons, colors, gaps where nothing happens, etc. However, they're often hard to use. We have just started working with the SIMILE timeline, though, and it's working great. Once it's set up, adding elements and moving them around is very simple. The integration with graphs and charts is very nice, as well. My thanks go to Michael Fortson for pointing out this one. When remote access to the timeline is important, definitely check this one out.

In the end, the tool you use is up to you - go for the sticky notes or the list or GUI timeline tool. Just make sure your tool helps your analysis and doesn't stand in the way of seeing the system patterns.

Wednesday, June 11, 2008

Cost of Running Automated Tests

One of the cliches of testing is that developing automated tests is expensive, but running them is free, or nearly so. That all sounds great, but doing anything eventually costs money. So, how free is it really?

The example test I use is the cost of running one of our weekly tests. This is an automated test that gets run once a week and confirms that with 7 servers we can write a lot of data and track the rate. Let's look at the cost of actually running this thing. I should note that all costs are general and based on Boston averages.

Test Duration:    3 days (72 hours)
Machines Used: 51 (50 servers and one client)
Total Machine Hours: 3672
Test Run Frequency:   once a week (52 times a year)
Machine Lifespan: 3 years

Buying Machines
A server costs about $3000 and lasts for 3 years. This server spends about 3 days a week running this test. It spends about 3 days a week running other tests. It spends about 1 day a week idle (not continuously, of course!). So, the cost of this particular test is about 40% of the cost of the server. We can say that this test will run once a week for the entire life of the server. So our cost per test run is as follows:

$3000 * 40% = $1200 (cost of the server for running this test)
$1200 / 3 (years) / 52 (times per year) = $7.69

So it costs $7.69 per test run per server to run this test.

Cost of Infrastructure
Servers use a lot of electricity, and there's additional electricity for cooling, plus the cost of network use. Our infrastructure costs us about $13.33 per server per month for power, cooling, lights, its share of infrastructure (switches and whatnot), and so on.

$13.33 * 40% = $5.33 (cost of the electricity to run this test)
$5.33 / 4 (times per month) = $1.33

So it costs $1.33 per test run per server to power/cool/etc this test.

Cost of Administration
This is an internal system, not a production system. However, if the servers go down, development basically stops. So we put some pretty good administrators on it. These servers take about 2 hours to set up (including checking, racking, installing, adding to the network, etc). Then they take about an hour a month of maintenance. So we've got 36 hours of maintenance plus 2 hours to set up over the life of the server. A Linux administrator gets about $65,000/year salary (or about $32.50 an hour) in Boston.

38 hours * 40% * $32.50/hour = $494 (cost over three years to administer this server)
$494 / 3 (years) / 52 (times per year) = $3.17 per test run

So the actual cost per test run is as follows:
Server: $7.69
Infrastructure: $1.33
Administration: $3.17
Subtotal Per Server Per Test Run: $12.19
7 servers per test run: $85.33 per test run

So our "free or nearly free" automated test is costing us a little over $85 each time we run it. Manual tests require a lot of the same things, so generally this is still worth it. Just make sure you check your numbers so that your "should I automate" calculations include the true costs of running the automated tests.
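For reference, here's the whole calculation in one place (same inputs as above; the totals differ from the rounded figures by a few cents):

```ruby
share = 0.40     # fraction of each server's life this test consumes
runs  = 3 * 52   # one run a week over a 3-year server lifespan

server_per_run = 3000.0 * share / runs      # server purchase:   ~$7.69
infra_per_run  = 13.33 * share / 4          # power/cooling/etc: ~$1.33
admin_per_run  = 38 * share * 32.50 / runs  # administration:    ~$3.17

per_server = server_per_run + infra_per_run + admin_per_run  # ~$12.19
total      = per_server * 7                                  # ~$85.35
```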

Tuesday, June 10, 2008

Test Estimates

Doing test estimates is hard. In particular, test estimates often come with hedges like, "well, if the code's not too buggy", or "this long each time we have to try it, and we don't know how many times it will have to go back to dev". However, we've come up with a bit of a workaround.

The key to it all is that we don't estimate stories (dev or QA) until there are defined acceptance criteria. (I should note that we use a variant of Extreme Programming as our development process, and one of the things we're very strict about is describing acceptance criteria as we're describing requirements.) Then estimation is done by both development and QA. The development estimate accounts for building the feature and building the tests that prove the feature works; dev is "done" when the automated acceptance tests pass. The QA estimate accounts for any manual tests involved and integration testing (based on the size of the feature and its relationship to the rest of the product).

The biggest problem with this is that there is still a "gut feel" portion of the estimate, based on two things: (1) how buggy is the feature going to be; and (2) how much integration testing is necessary before we are comfortable with the feature?

So, handling of the "how buggy is this feature" question is done basically by folding it into the development portion of the estimate. There may be a lot of bugs in other areas of the code (whoops!), but that feature at least won't spend a lot of its "testing" time waiting for development to fix bugs. This hinges on having thorough acceptance criteria, but practice gets you good at that.

Handling of the "how much integration testing is necessary" question is much more of a dark art, and this one we could use more practice with. Our general approach is to weight our basic estimate by the size of the module, interaction with external programs, and interaction with other modules of the overall product; this tells us how likely the code is to wind up back in dev and for how long. So a change to the UI that doesn't interact with external programs, doesn't interact with other modules in the system, and is a single text item on one screen is considered a near-zero chance of going back to dev, and would be back in dev for a few minutes tops. A change in the core data parsing module that depends on a specific external client's data pattern and touches the main data parser (a large module) is considered a high chance of going back to dev for up to 30% of the original implementation time. But this still boils down to gut feel, and we're definitely looking for a better way.
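As a rough illustration of that weighting (all names, scores, and percentages below are invented for the example, not our actual formula), the heuristic can be sketched like this:

```ruby
# Score a change by module size and its interactions, then map the score
# to the fraction of implementation time likely spent back in dev.
def rework_fraction(module_size, external_interactions, internal_interactions)
  size_score = { small: 0, medium: 1, large: 2 }.fetch(module_size)
  score = size_score + external_interactions + internal_interactions
  case score
  when 0    then 0.0   # e.g., a single text change on one screen
  when 1..2 then 0.1
  else           0.3   # e.g., a core parser change: up to 30% of dev time
  end
end

rework_fraction(:small, 0, 0)  # => 0.0
rework_fraction(:large, 1, 1)  # => 0.3
```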

Monday, June 9, 2008

Best Subject Line Ever

This was in my RSS reader today:

"God has a memory leak"

God, for the record, is a Ruby monitoring tool. Think CPU, memory, etc.

Friday, June 6, 2008


There are several qualities that make for a really good QA engineer:
  • ability to design a test that provides interesting information
  • a dogged desire to track down the little niggling oddities (that often lead to really deep bugs)
  • well-developed, if informal, systems thinking
  • ability to distill a lot of information into clear and concise descriptions
These are all what I think of as product-oriented abilities. They are the mental tools that allow a QA engineer to approach a system, no matter how opaque, and develop a deep understanding of that product (or tool or application or whatever you call your thing under test) and its reactions to internal and external influences.

But the one thing that really sets a QA engineer (or any engineer, I suspect) apart has nothing to do with the product. It has everything to do with the human organization. And that one thing is anticipation.

The ability to anticipate the needs of the human organization is what makes a good QA engineer great. Many of these are not large changes; they're the little things that make the day flow more smoothly. For example:
  • Knowing that the developer has a standup at noon and will want to know the results of last night's automated tests, anticipation says the QA engineer should look at them enough to offer a summary by about 11:30.
  • Knowing that a big client is preparing for go live in a week, the QA engineer should set up a similarly-configured system in order to be prepared for questions and issues that arise as the system goes live.
  • Understanding that a problem at one client will inevitably lead to the question: "will this affect other clients?" and taking the characterization of the problem to other client scenarios before being asked.
So ask yourself not just "what do they want", but "what will they want next". That's what will tip you over the edge from scrambling to keep up to that corporate zen state called being proactive.

Thursday, June 5, 2008

Failing Around Tests

We run a lot of automated tests. In general, these are fairly straightforward, as you would expect. They basically consist of:
  • Setup
  • Perform Test
  • Teardown
(I warned you it was straightforward!)

The problem comes when we start to talk about test failures. Sometimes, this, too, is straightforward: the test itself failed. Maybe you expected to get "2876" and you got "2875"; whoops, obvious test failure. Other times, it's less easy. So the question of the day is:

If a test doesn't set up correctly, did it fail?
If a test doesn't tear down correctly, did it fail?

To address the latter first: my opinion is that a test that doesn't tear down cleanly has failed. If you can't clean up after yourself, then the test has some unintended consequence, and those generally translate to bugs. Usually a failing teardown means you aren't asserting on something, and that something is what changed. So this is a failure.

Tests that don't set up correctly are trickier. A test that didn't set up correctly may just be the victim of some prior test (one that didn't tear down cleanly), but it might also be missing some setup step. So for these I look to see whether the prior test on that machine had a clean teardown. If everything in the previous test came down cleanly, then you've got a setup failure and that's a bug.
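That decision process boils down to a small piece of logic. Here's a sketch (the symbols are just labels invented for this example):

```ruby
# Given which phase failed and whether the previous test on the machine
# tore down cleanly, decide what the failure most likely means.
def classify(failed_phase, prior_teardown_clean)
  case failed_phase
  when :teardown
    :bug  # an unintended side effect the test isn't asserting on
  when :setup
    prior_teardown_clean ? :bug : :victim_of_prior_test
  else
    :ordinary_test_failure
  end
end

classify(:setup, true)   # => :bug
classify(:setup, false)  # => :victim_of_prior_test
```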

So the moral of today's story is this: 

A test doesn't pass unless every part of it passes. 

Just having passing assertions (the test itself) is not good enough. The parts around the test - setup and teardown - have to pass, too.

How do you handle failures around your automated tests?

Wednesday, June 4, 2008

Dirty Glass

There's an anti-crime theory called the "Broken Windows" approach. Basically, the idea is that when something small is wrong, it should be fixed or it will lead to larger things going wrong. (The example that lends this the name is that broken windows in a building should be fixed, or they will lead to people breaking into the building, then squatters, and eventually larger crimes.)

This theory is increasingly being applied to software development, particularly by proponents of continuous integration. Basically, if you skip the small stuff, it rapidly becomes big stuff. Ignore a small bug and all of a sudden you'll have a lot of bugs - leading to the "whoa! That's an impossible bug list!" scenario. Ignore a failed build and eventually you won't be able to reliably build at all. This concept also goes under the term "technical debt". Basically, being a little behind with the "nice to haves" (refactoring, test coverage, etc) isn't too noticeable, but being a little behind often leads to being farther behind, and then big things start to fall apart.

I think it's a great ideal to say that we won't tolerate "broken windows". If a build fails, the top priority is to fix it. If a bug occurs, the top priority is to fix it rather than finish this other new feature.


At some level a zero tolerance policy is unrealistic. Sometimes a build will fail, and the whole team will be in the middle of a major client issue. You just can't look at the build right then. Sometimes you can't do a code review because too many team members are on vacation and you can't get enough fresh eyes on it - so it has to wait a week. And that's okay, as long as you recognize it, account for it, and build in time to catch up.

So, to stretch our metaphor to the limit, what are the software broken windows, and what is just a bit of dirty glass? When is something really a big problem, and when is it just something to keep an eye on and not let it get worse?

Broken Windows:
These are the things that shouldn't be allowed to continue for longer than some very small period of time.
  • Failing builds
  • Tests that don't even run
  • Checking in without code review
  • Checking in without running tests

Dirty Glass:
These are the things that will bite you in the end, but they're not emergencies. As long as you're regularly going back and cleaning them up, it's okay to let these slip for a little while.
  • Build warnings
  • A few failing tests in edge cases
  • Checking in without pairing (we're in an XP shop. As long as we're code reviewing, we're doing okay. But pairing is more desirable.)
I suspect this will be a bit controversial, but I'd rather have a way to handle our imperfections than to simply state that we're just going to have to be perfect. An easy summary:

broken windows = BAD. Get in your car and go buy a new window now.
dirty glass = GONNA HAPPEN. Just make sure your regular chores list includes washing the glass, and that you're actually doing your chores.

* Sorry, this metaphor got pretty darn strained, but I think it's still holding together!

Tuesday, June 3, 2008

All Things Must Come to An End

We spend a lot of time talking about the start of software - defining it, designing it, building it, shipping it, deploying it. One thing that could stand a little more talk is the end of software - phasing out support, upgrading customers, and ultimately end-of-lifing it.

Now, depending on the software and your audience, this is something that can take a few weeks, or it can take several years. Either way, there are several areas to consider:

  • Individual Features. You have two chances to remove this: before anyone is using it, and after it has stopped being used. 
  • Versions of Software. While you cannot enforce upgrades faster than your clients will take them, it is also your responsibility to make your later versions compelling so your clients will want them.
  • Hardware. If your solution includes hardware (as ours does), your hand will eventually be forced by dead machines. If you can no longer purchase the hardware, and you can no longer keep the hardware you have running, you cannot continue to support your clients on it. Better to cease support before it gets to that point.
At least in enterprise software (and I suspect with other software, in which I have less experience), the end must be planned at the beginning. So make sure your release plan includes sending the software out and also bringing it back when the cycle is over.

Monday, June 2, 2008

Automated Automation

In the way of the blogging community, some posts on some random blogs create a lot of traffic. I ran across one of those today*, read it, and, come to find out, there was an interesting nugget buried in there. The way I read the entry, Steve Rowe isn't just advocating test automation (which seems to be what he's catching a fair amount of flak for). He's advocating automated test case automation. I read this a lot like code generation tools, just for testing.

Whether to automate, when to automate, etc. is a rapidly dying horse the test community has been beating for a while (with great fervor on all sides). But code generation for tests? Now that's interesting.  

Code generation as a concept has been around for a long time; it's basically what compilers do. More recently, the concept has been used to generate code at more programmer-readable levels. A code generator takes some elements and builds them out into a more complete program using some sort of a model. The idea is that you provide the business specifics and it uses one or more mechanisms to create a consistent program that conforms to those business needs. There's a much better description on Wikipedia.
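To make the idea concrete, here's a toy sketch of a template-driven generator - not any particular tool, and all the names (`generate_class`, the `User` model) are my own invention. You feed it the business specifics (a class name and its fields), and it emits consistent, complete code from a template:

```python
# Toy code generator: a "model" is just a class name plus field names.
# The template supplies the boilerplate; the model supplies the specifics.
TEMPLATE = """class {name}:
    def __init__(self, {args}):
{assigns}
"""

def generate_class(name, fields):
    args = ", ".join(fields)
    assigns = "\n".join(f"        self.{f} = {f}" for f in fields)
    return TEMPLATE.format(name=name, args=args, assigns=assigns)

# Generate the source, then compile it into a real class.
source = generate_class("User", ["username", "email"])
exec(source)
u = User("bob", "bob@example.com")
assert u.email == "bob@example.com"
```

Every generated class comes out with the same shape and conventions, which is the whole point: the consistency lives in the template, and only the business specifics vary.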

So, how could this apply to tests?

Well, on a certain level there are some tests we write over and over again in a given project. In one of my recent projects there are at least 8 places where we perform what is essentially the same logic: if a member of a group uploads a picture for this journal entry, then use that. If they don't, use the group's picture. If the group picture does not exist, use the default picture. Lather, rinse, repeat, substituting "team" for "group" and "message" or "activity" for "journal entry". Now, when I did this I wound up using a set of helper methods to actually do the work and check the rules. And it worked. But I'll have to do it again (and probably tweak the helper methods) when this project adds some other similar feature. Code generation could be an interesting tool here. It's such a small scale that I'm not sure it's worth it, though.
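The helper-method version of that fallback rule might look something like this sketch (the names `resolve_picture` and `check_picture_rules` are hypothetical, not the project's actual code - the point is that the same checks get reused in all 8 places):

```python
DEFAULT_PICTURE = "default.png"

def resolve_picture(member_picture, group_picture):
    """The member's upload wins; fall back to the group's picture, then the default."""
    return member_picture or group_picture or DEFAULT_PICTURE

def check_picture_rules(resolve):
    """Reusable test helper: run the same three fallback checks against any resolver."""
    assert resolve("member.png", "group.png") == "member.png"
    assert resolve(None, "group.png") == "group.png"
    assert resolve(None, None) == DEFAULT_PICTURE

# The same helper verifies the group version, and would verify the
# "team"/"message"/"activity" variants, too.
check_picture_rules(resolve_picture)
```

A generator would effectively stamp out `check_picture_rules` for each new entity instead of my tweaking the helpers by hand each time.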

On another level, there are the tests that are done time and again in lots of different projects. Think login forms, for example. There are eleventy-zillion username/password/remember me forms out there. Wouldn't it be great to have a code generator that basically set up the tests for you? You just feed it rules about a valid username and a valid password, and it tries all the combinations for you. The combinations are well known (correct, incorrect password, nonexistent user, etc). And since the generator itself wouldn't contain any business logic, it'd be a natural for distribution to your local coding community.

I've never actually done this, but the notion - especially the second one, where it's useful across a fairly broad audience - might be worth a shot. It certainly doesn't solve all our testing problems, but I'm a big fan of getting the easy stuff out of the way, and if code generation can help us all do that together, then I'm up for trying it.

EDITED: I noticed that Visual Studio 2005 has some tools to generate unit tests. I've not used it, but it looks like it generates a positive test for each method. The tool I'm thinking of would generate much deeper tests (actually testing incorrect values and creating multiple tests), but would only work in a given problem space.