Thursday, December 30, 2010

Feels Right and Is Right

I have a reasonably large stash of about 200 cookbooks at home. I'd been organizing them roughly by category (baking, Italian, fancy meals, etc.) but my collection was growing large enough I decided to turn the organization of my bookshelf over to the experts: the Library of Congress.

The Library of Congress catalog puts cookbooks in the TX category (T = Technology, X = Home economics. I'm not making that up.). So I sat down with my cookbooks and the online catalog lookup, and sorted and organized. When I was done, I had a completely correctly ordered bookshelf. It was great! It was perfec.... well, actually, that's funny.

There was a section that had six books in this order:
Northern Italian | Greek | Indian | Italian | Italian | Italian

According to the Library of Congress, that's correct (regional comes before whole country). According to me, that's confusing. So we have something that is right, but it doesn't feel right.

You guessed it: I moved the Northern Italian book to be next to its Italian brethren.

It's a useful thing to remember when we're testing. There's what the spec says, but we also need to consider whether the spec itself matches with the way the software will actually be used. When in doubt, and whenever it's possible, usable trumps correct. Every time.

Monday, December 27, 2010

Some Names Are Protected

This post falls into the category of "I knew that, but I didn't think about it."

I'm working on a project that is a small, fairly simple Rails application. Or seemed to be until I ran into this piece of code:
<% Chart::TYPE_OPTIONS.each do |o| %>
<%= f.radio_button(:type, o) %>
<%= f.label :type, t("activerecord.attributes.chart.values.#{o}"), :value => o %>
<% end %>

This is on the new chart form, and it's one of several fields (but the rest aren't important). Everything was working fine except this one field; the field just wouldn't save. It kept throwing "Can't mass-assign protected attributes" warnings.

Anyone see the problem? This one took me a good 10 minutes.

Yup, "type" is reserved by Rails; it's the column name ActiveRecord uses for single-table inheritance. Don't name attributes "type".

I renamed it to "chart_type" and lo and behold everything works. I could probably have made type work, but in general I'd rather rename the attribute than try to figure out all the places I have to do something odd. The moral of today's story is: know your language and your framework well enough to know what words you just shouldn't use.
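To make the failure mode concrete, here's a toy sketch in plain Ruby of what mass-assignment protection is doing. This is not Rails itself (and the `ToyModel` class is entirely made up); real versions of Rails log a warning and drop the protected attribute, or raise in strict mode, but the toy raises so the problem is impossible to miss:

```ruby
# Hypothetical illustration of mass-assignment protection. "type" is on
# the protected list because Rails reserves it for single-table
# inheritance; any params hash that includes it gets rejected.
class ToyModel
  PROTECTED_NAMES = ["type"].freeze

  attr_reader :attributes

  def initialize(params = {})
    @attributes = {}
    params.each do |key, value|
      if PROTECTED_NAMES.include?(key.to_s)
        raise ArgumentError, "Can't mass-assign protected attributes: #{key}"
      end
      @attributes[key.to_s] = value
    end
  end
end

ToyModel.new("chart_type" => "bar")   # fine after the rename

begin
  ToyModel.new("type" => "bar")       # the original attribute name
rescue ArgumentError => e
  puts e.message                      # Can't mass-assign protected attributes: type
end
```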

Thursday, December 23, 2010

Just How Much Do We Need?

Sometimes when we write a piece of software it runs for ages and ages and it gets used a lot. Other times, we write very "one off" software. Migrations, data cleanup utilities, etc. can often fall into this category.

A friend is working at a company where they license television airing data (think a TV guide) from a third party. They then massage this data and put it into a database, then do stuff with it. They recently switched providers for this data, and are having to do a few one-time things to get the data to be clean throughout. For example, there was about a three-day overlap in provided data, so for three days they have a lot of duplicates (e.g., two database entries that say, "Rehab Wars aired at 8pm on Tuesday"). So they need a one-time job to reconcile these entries, get rid of the duplicates, and update references to this data.

So how much testing is needed?

Well, let's see. On the "we need to test a lot!" side:
  • television data has lots of quirks (weird characters, subtitles for shows, overlapping air times)
  • we can't take down production for this, so it's going to have to be done as other items are changing, which means we can't simply restore from backup if it fails. And undoing it is going to be non-trivial (aka pretty hard).
On the "eh, let's not be dumb about it but let's not go overboard" side:
  • it's a relatively small amount of data
  • we wrote it so that it simply stops on any error, and we're pretty aggressive about error checking (so the odds of it running amok and hurting things are fairly low)
  • we can do it by hand (aka someone in the db reconciling SQL) in about 2 days, so inflating the project much beyond that is kind of silly
I don't actually know what they decided to do in this case. I do know that they finished the project (presumably successfully).

So what tests would I actually run?

Here are the parameters of the project:
  • This is a single SQL script that will be run on a production database once.
  • The script will fail on any error or data mismatch (that it finds) and is restartable from the fail point. (allegedly, of course - you may want to test for this)
  • The script identifies duplicates by name and air time and merges their records in the database.
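For concreteness, here's a minimal sketch of that duplicate-merge step in plain Ruby, with in-memory hashes standing in for database rows. The field names (:id, :name, :air_time) and the merge policy (keep the first record, remember which ids were folded into it) are my assumptions; the real job was a SQL script.

```ruby
# Group airings by (name, air time); keep one survivor per group and
# record the ids of the duplicates merged into it, so references to
# those rows can be repointed afterward.
def merge_duplicates(airings)
  airings.group_by { |a| [a[:name], a[:air_time]] }
         .map do |_key, dupes|
    survivor = dupes.first.dup
    survivor[:merged_ids] = dupes.drop(1).map { |d| d[:id] }
    survivor
  end
end

airings = [
  { id: 1, name: "Rehab Wars", air_time: "Tue 20:00" },
  { id: 2, name: "Rehab Wars", air_time: "Tue 20:00" },  # duplicate
  { id: 3, name: "Rehab Wars", air_time: "Wed 20:00" },
]
merge_duplicates(airings)
# => two records; the Tue 20:00 survivor carries merged_ids [2]
```

Even a sketch like this suggests tests: duplicates with quirky characters in the name, shows that overlap air times, and references that point at a merged-away id.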
So now it's your turn: what tests would you run in this situation? What would you skip? Why?

I'll tell you tomorrow what I would do.

Monday, December 20, 2010

Why Look at Automated Results

A developer I know socially asked me today why QA looks at the output of the nightly automated tests. He said that, at least at the places he's worked, "development mostly looks at the output of the automated tests. After all, the tests say, 'I tried to do X and didn't' and that means development has something to go fix."

That's interesting. After all, where I work, QA looks at the output of the nightly automated tests and logs bugs (or updates them). But why? What does this exercise tell us?

I want to look at nightly results to:
  • identify regressions so they can be fixed
  • see how well new features (and their tests) work in the overall context of the system. Ever see module A checked in and completely - and accidentally - break module C? Often the automated tests show it.
  • catch race conditions and other infrequent bugs, which are more likely to be found the more often I run a test that has a chance to expose them
  • look at coverage reports to see what hasn't been covered at all
  • find areas that I should be exploring more. If, say, the upgrade tests break frequently, then I know they're probably fairly brittle, and so I go explore that area of the code in my own testing.
All of those, except the last one, are things that could be done by developers or testers. But that last one - guidance as to what I should be looking at more deeply - that I'm not willing to give up. I get information out of looking, and that's worth the 15 minutes a day I spend on the task. Of course, development is welcome to look, too! More eyes will see more information.
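The race-condition point is really just arithmetic. If an intermittent bug shows up in any single run with probability p, the chance of seeing it at least once across n runs is 1 - (1 - p)^n, which climbs fast. A quick sketch (the 5% figure is an invented example, not a measured rate):

```ruby
# Probability of catching an intermittent bug at least once in n runs,
# assuming each run independently exposes it with probability p.
def chance_of_catching(p, runs)
  1 - (1 - p)**runs
end

chance_of_catching(0.05, 1)   # ~0.05  -- one run, slim odds
chance_of_catching(0.05, 30)  # ~0.79  -- a month of nightlies
```

That's why a flaky test that "only fails sometimes" tends to surface in the nightly results long before anyone hits it by hand.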

Thursday, December 16, 2010

In Theory

How many times have we heard or said those infamous words: "in theory that shouldn't be a problem"?

That's nice, but I don't live in Theory. I live in Boston.

Now, the only mental leap is to prepend the phrase "in theory" to anything that says "shouldn't be X" or "should be Y". "Should" is a word that indicates uncertainty - and that's a place we're going to want to test! After all, Theory is a pretty empty place.

Tuesday, December 14, 2010

Correcting Vocabulary

Semantics: the words we use, the way we describe things, the jargon we employ. All these are hugely important and entirely trivial at the same time.

Good semantics is precision.
Bad semantics is off-putting.

Often the only difference between those is your audience.

For example, Michael Bolton talks fairly frequently about testing versus checking. He goes on at great length in several forums about how testing is a sapient learning process and checking is simply comparing X to Y (I'm paraphrasing; go read the article). That's all great, for the right audience. Making that distinction provides a learning opportunity for testers, which is wonderful.

And sometimes correcting that vocabulary is a really negative thing to do.

Here's a hint: your average Product Manager who wants to publish a test plan really doesn't care if it says checking or testing. Your average Implementation Manager who wants you to review a product evaluation plan really doesn't care that the client has provided a plan that says "test X" when it should say "check X". He just wants to know if your product is going to pass.

Remember that guy in school who was always correcting everyone? "You don't mean literally got pushed under the bus. You mean metaphorically." "Actually, the appendix does have a purpose; it's vestigial, not useless." He was usually right, but nobody liked being around him. He was so stuck in the details and in proving how smart he was that he completely missed the bigger picture - an expression of frustration, or comfort offered to a sick and worried friend. Don't be that guy.

I recently was reviewing an evaluation plan provided by a potential client, and it was written by someone who evaluates products for a living. He isn't trained as a tester; his background is development and research. His evaluation plan says things like, "test that the memory utilization is 1GB". That's probably a check. So did I send back a bunch of comments that describe what a check is and why these particular things were tests and not checks? Nope. It wouldn't have helped. He was going to do these things during the evaluation, regardless of what they were called (so why bother making him call them something else?). He's not going to apply that knowledge going forward; after all, this evaluation technique has worked for him for years. All I would have accomplished would be to send the message, "I'm scared of something here so I'm trying to throw you off your game by drowning you with vocabulary that isn't going to affect what you do and therefore really isn't important." Later, when we have a relationship and when there's an opportunity to talk about testing versus checking in an environment where he's open to learning, then I'll bring it up. But right now, it just looks like a snow job intended to make him feel dumb.

So when you go to correct someone's vocabulary - to say, for example, "You called that a test but it's really a check and here's why" - make sure you're doing it for the right reasons. Make sure it actually matters. If you feel the need to correct someone's vocabulary, ask yourself three questions first:
  • Does this person care whether he got this term wrong?
  • Will correcting him cause any positive behavioral change?
  • Am I doing this to make him think I'm smart or because there's an actual purpose?
Don't set out to prove you're smart. Set out to help. Accept that the English language is imprecise, and seek increased precision in vocabulary only where precision will help.

Tuesday, December 7, 2010

Tools-Oriented Test Plan

I've been reviewing a lot of test plans lately that, well, weren't written by testers. They were written by fresh computer science graduates or, for the most part, by developers. To be fair, I've also seen junior testers write test plans a lot like these.

They're the tools-oriented test plans. Basically, it's a test plan that says, roughly, "we're going to use these tools." The relatively good ones even say, "we're going to use the tools in these configurations."

Reading a tools-oriented test plan is like asking your kid, "what are you going to play in the park?" and getting the answer, "there's a slide and a swing set and monkey bars!"

Okay. That's nice. But what are you going to actually do when you get there?

The biggest problem with a tools-oriented test plan is that it doesn't describe what you're trying to accomplish. There are no success criteria. There are no intended areas of learning. There are no pass/fail or conformance criteria. Because it doesn't convey intent, you can't know if you've accomplished that intent. There's no way to know if you've succeeded.

The good news is that a tools-oriented test plan is often really good at making you aware of the existence of tools. It's also generally salvageable. You've written a lot about how you're going to accomplish things, so all you need to do is rework it to describe WHY you're going to use this tool and WHY these configurations are interesting or important. To get from a tools-oriented test plan to something usable, I do the following:
  • walk through each section and ask the question, "what is doing this going to tell me?"
  • make sure I put the answer to the question into the test plan
  • create a section called "tools"
  • put the tool name and a description of all of the relevant features into the tools section
  • replace all the tool info with a reference to the appropriate feature(s) in the tools section

But if you find yourself reading and saying, "I see what to do, but what will that tell me?", you've got a tools-oriented test plan. Tools are great, but you don't have to let them run you. Instead, put them in their proper place as servants to the information you need.

Monday, December 6, 2010

Quick Creativity Exercise

Sometimes we get into a rut. The team's the same for a while. The product's basically the same (sure, new features, but pretty much the same idea). The test strategy's the same. The customers are solving roughly the same problems with it. In other words, you have a stable team working on a mid-life product. In many ways this is a good thing. But it's a rut nonetheless.

As testers, we need to stay away from ruts in our thinking. There is an element of creativity in what we do, dreaming up uses, configurations, combinations, deployment strategies, and whatnot for the product. When the test team is in danger of a rut, I like to do a quick creativity exercise that I call "How Many?" (I'm sure there are other names for it, and it's highly unlikely I invented this, but I don't know who did.)

Here's how it works:
1. pick some small thing in an area you all know well
2. ask the team "how many ways can you accomplish this small thing?"
3. give everyone 5-15 minutes to brainstorm. Actually trying it is optional; the point is creativity, not necessarily that it would actually work.
4. compare notes
5. the person with the most things wins a small prize. We tend to put a congratulations message on that person's whiteboard or something (doesn't have to cost money or be a physical prize).

For example, we did one recently that was:
How many ways can you think of that a file might be removed from a Perforce change?

Example answers:
  • the person owning the change deletes it from the file list
  • in doing a p4 resolve, the person might choose to skip the file
  • a trigger on submit might find a problem with that file and remove only it from the change (this, by the way, wouldn't be desirable, but it's possible)
  • the file might never have gotten added to the change in the first place
The game doesn't have to be about your product directly. (Hint: I'm not on a team that builds Perforce; we just use it for source code management.) It should be about something work-relevant, though: your product, your tools, your customer's environment, etc.

If you're worried about a rut, give it a shot. Just a few minutes of explicit creativity can make a difference across tasks.

Thursday, December 2, 2010

Error Messages You Don't Hate

Getting an error in something you're using stinks. You have to stop what you're doing, figure out if you did something wrong or if there's a problem with the tool you're using, figure out a way around it, and only then can you get back to what you were actually trying to do in the first place.

How long this takes and how you feel about it partly depends on the error messages you get. Some are no help at all: "An error occurred. Please try again."
Some are incomprehensible: "Error 0x930432. Contact support."

But there's only one that made me laugh out loud yesterday:
"421 Timeout - try typing a little faster next time"

Go ahead and have fun with error messages. Be informative and a little irreverent. After all, when's the last time an error made you smile?

Wednesday, December 1, 2010

Team Jazz Band

We had the following conversation yesterday:

1: "Hey, look! The mem option is a signed int. Whoops!"
2: "Oh yeah, 3 just logged one for the scanner that the object option is signed"
1: "Cool. Let me see..... oooooh that one core dumped. Mine just threw a bad error message."
3 walks in with some hot dogs (it's lunch time)
2: "3 wins!"
visiting dev: "what does he win?"
3: "hot dogs, apparently"
visiting dev: "you got him hot dogs for a core dump?"
2: "oh! business idea! core dumps spit out hot dogs. We'll make a fortune in the food court!"
1: "sweet"
random other dev coming in for the day: "how'd nightly tests do last night?"
visiting dev: "nightly did great. Those humans are making up for the gap by logging lots of bugs."
1: "Yup, option testing on the different binaries. It's producing kind of a lot."
1: "Hey, 2, did you ever figure out when that performance drop happened?"
2: "I know it was after the 28th, because I just dug up the test results and they were faster then. So I wanna say it was in early November. Still digging."
1: "Visiting dev, I think the boss said you thought it was the 5th of November or so. Was that based on anything or just speculation?"
Visiting dev: "Boss's speculation, maybe. Although the 5th of November sounds familiar."

And the conversation continued on from there. I felt like hanging a sign on the wall: "Serious testing going on here."

Were we having fun? Definitely. We're the laughing-est group in engineering.
Precision language? Nope.
All work related? Nah, but most of it.
Were we testing? You bet.

I like to think of this as the jazz band state for the QA team. We're all working on separate things, and if you asked us what we were doing we would say:
1: Testing options on the reader binary
2: Trying to track down a performance regression
3: Testing options on the scanner binary

But we're doing much more than that. We're creating ideas and bouncing off each other while continuing to work on our separate things. It's like loosely coupled pairing. We're not really pairing, but by keeping an ear open to each other's conversation, we get ideas to improve our own tests and we answer each other's questions.

It's a great flow state for us when we can get there. Our tests improve, our speed improves, and our rhythm improves. So whenever we're all in the office testing (no meetings or other interruptions), we try to work on similar problems in different areas. For example, today most of us were working on testing the options to a binary (--verbose, --mem, etc.). That way our ideas actually make sense to each other for immediate use. We turn from individuals playing our instruments into a jazz band, bouncing solos and harmonies off each other and making great music along the way.

Of course, like all musicians, sometimes we need to practice on our own. That's what the team quiet room is for; we put a computer in the team room and each of us can shut ourselves away for some quiet focus time. But the real magic happens when we come together and play a tune.

Give it a try and see what wonderful music y'all can play together.

Tuesday, November 30, 2010

The Woodchuck Factor

We've been writing our test plan for the next release, and the first draft contained a sentence that might as well have said:

Run the following tests:
  • Identify how much would a woodchuck chuck if a woodchuck could chuck wood, varying the amount of wood and the chuck factor.
  • Confirm that a woodchuck would chuck as much wood as a woodchuck could chuck, if a woodchuck could chuck wood.

Score one for incomprehensible sentences. The actual phrasing was, of course, different, but the effect was the same: completely accurate and a good test, but expressed in a truly confusing way. We had hit the woodchuck factor.

The woodchuck factor is when a sentence in your documentation could be replaced with the woodchuck sentence and make about as much sense. (This is one of those things where we know it when we see it.)

Now first of all, these are the sentences that make me laugh, so that's a reason to look for woodchucks. But there's an actual problem here, too. It's really hard to understand, and it's fairly simple to misinterpret or simply skip because, well, that "doesn't make any sense" at first read.

If you hit the woodchuck factor, look for two underlying causes:
  • Overloaded Vocabulary. Are you using the same word as a noun and as a verb? Are you using the same word to describe multiple different things? Often we get away with describing two things with very similar terms, until we try to use them together - and then there's a woodchuck.
  • "Fancy" Phrasing. Woodchuck factors show up when you're trying to be "formal" or "fancy". Legal documents are notorious for this - "the party of the first part shall hereinafter be referred to as the First Part Party" - but it can happen to any of us. You can use formal language and still be declarative. Start with an example and generalize from there, but keep it simple.
Woodchucks are fixable. So the next time you find yourself reading a document and saying in your head, "How much wood would a woodchuck chuck...", laugh. Then fix it. Future readers will thank you.

Wednesday, November 24, 2010

Weekend Testing Americas Retrospective

On November 13th, I participated in the first Weekend Testing Americas session. Weekend Testing is a challenge we take on, getting together for a few hours to test an application.

Our application was a Dance Dance Revolution clone. First, I have a confession to make: I used to play that game on the PlayStation. And I was sort of okay. However, I was terrible at this game. I had a lot of fun! We paired up and figured out how to test together. It was certainly a target-rich environment, so the question wasn't how to find a bug. It was how to approach this application and how to do it with a remote partner. Joe Harter and I paired for this one.

He and I spent two hours together testing the application and talking about it with the other participants. Our charter was to pair and discover the application.

Things I noticed about pairing with a remote tester:
  • Use a different computer for talking than the one you're using to test.
  • Screen sharing rules.
  • Chat is a bit tough. It's even harder when you type at different speeds or in different ways (Joe was a type-a-lot-at-once guy, and I'm a frequent-short-sentences girl). When in doubt, calling each other helps.
  • Spend some time before you pair setting up sharing.
Things I noticed about testing:
  • In a target-rich environment, double check your charter. It's really easy to stray.
  • Writing up bugs always takes far longer than you think it will, if you're interested in doing it properly.
The single biggest thing I noticed was the sheer diversity of viewpoints. At one point I was sitting there saying, "That is so weird! It must be a bug!" and Joe was telling me, "Actually, I think a lot of video games are that way." Oh, really? Oops, good to know. During the post-test discussion, even more viewpoints came out. Some people spent a lot of time actually playing the game, some people were very bug focused, some people were very focused on comparative tests, and some people were simply overwhelmed.

If you have the ability to do one of these at all, please try Weekend Testing. Shaking things up, working with new people - it's amazing what a change in routine will do.

Tuesday, November 23, 2010

So You Wanna Make a Big Change

There's been a lot of buzz recently about "selling". Let's "sell" agile, and let's "sell" testing, and we're all really "selling brand You" when we look for a job (that last one was a particularly atrocious local news story on what had to be a slow news day).

I think it's great. After all, sales isn't really a dirty word. Sales is simply making people aware of the benefits and costs of something with the end results of asking them to commit to that thing. I'm sure there are more official definitions out there, but this is the one I like.

A lot of the things we're trying to sell, though, amount to a large cultural and strategic change. We want to sell someone on moving to an agile development methodology. We want to sell someone on providing testers with early access to a system. The costs of this type of change are high; it can easily take months or years to change a culture and it's inefficient while it's in progress. Cultural and strategic changes also rarely go unnoticed. Maybe if you're HugeCo and you start with one tiny project, then you can slip it under the radar. But most of us don't work for HugeCo, and most of us don't have a project that is our own domain that we can try new things with.

So let's assume we have to sell this cultural change. We're borrowing the sales verbiage, so let's borrow the techniques, too. If we approach it like a sales challenge, then we need to identify:
- the benefit of the change ("who wins")
- the cost of the change ("who pays")
- the people who - directly or indirectly - have to change along with it ("who cares")

From there, it's generally a matter of identifying the decision maker and decision influencers. The decision maker is the go/no-go person in the end, and it's probably the guy who's going to end up both getting the most benefit and absorbing the most cost. In the case of changing development methodologies, it's probably the head of engineering. It might be the entire executive committee.

Decision influencers are a lot more numerous. It's the sales guys who are counting on a release next quarter, and the marketing types who need a new feature for a trade show in 4 months. It's the support group who will need to handle different ways of getting customer hotfixes through development. And it's your software architects who have to implement this thing. The decision influencers are everyone who wins, everyone who pays, and everyone who cares.

Convince the influencers, and they'll help you convince the decision maker. But you have to hit most of the decision influencers, and you have to start selling. Start showing them why they should care, why they win, and why the cost is something they want to absorb because of the benefit they'll end up getting. Then you're really selling. And who knows - maybe you'll get the eventual decision in your favor.

Monday, November 22, 2010

Chicken Little Eats Risk

Software testing is highly prone to chicken little syndrome. It's relatively simple to translate "we don't know what would happen if..." to "think of all the terrible things that MIGHT happen if...". This isn't a testing problem, per se. Nervous engineers and managers come from all over an organization.

For example:
"We don't know what would happen if a hard drive failed while the system was doing X"

can easily turn into

"If a hard drive fails while the system is doing X, it might crash! And if that happens at our biggest customer, they might decide not to use the system any more because if it crashes they can't trust it! And then they might tell someone they're not using the system any more! And then our other customers might get nervous and decide to stop using our system because if major customers don't trust it then the system must be bad! And then we won't have any customers and we'll go out of business!"

(I should note that doomsday scenarios tend to include a whole lot of exclamation marks.)

"We don't know" means that we don't know terrible things won't happen. But we also don't know that terrible things will happen. This is where risk comes in. For example, if the thing we don't know much about is the static HTML help, then it's probably not too likely that terrible things will happen. If the thing that we don't know much about is the core algorithm of our product, then that's a bigger deal.

This is where testers come in. Testers, after all, are generally the ones who actually work with the software as an entire system frequently. They know which parts of the system are generally well understood, and which parts have a lot of unknowns. Often, that means testers are the first ones to say, "well, we don't really know what would happen if....". This is the point at which we get to affect chicken little syndrome, and we get to do it by describing what we do know.

The panic of chicken little syndrome comes from the perception of a vast area of potential problems; the bigger the grey area, the more likely it is that someone can think of a huge problem. So to avoid chicken littles, make sure you bound the area of unknown as much as possible.

Using our html help example from earlier, we can say, "We don't know much about the static help; we haven't looked at it yet this release." And we can stop and chicken little will come running through the office looking panicky. Or we can keep going, "However, we do know that this is all static html, so the likelihood of any side effects from issues or of any problems beyond the content is pretty small." By restricting the area of the unknown, we have started to state our risk, and we have also decreased the likelihood that anyone in our reporting circle will panic.

So go ahead and report what we know and what we don't know. Just be sure that you report the boundaries clearly. Placing your report in context makes all the difference between calm handling of uncertainty and potential panic.

Tuesday, November 16, 2010

The Forest, The Trees, and the Leaves: Part II

Yesterday, we talked about thinking about the effects of our work on the next biggest goal. When we work on the leaves, we must consider how they help the trees. When we work on the trees, we must consider how they help the forest.

Today an example of a leaf:

I have decided I want to host a brown-bag lunch for our engineering team, providing an overview of a new language I've been playing with.

This is a small thing. It's one lunch, probably an hour of everyone's time.

It's tempting to jump to the forest. How will one brown-bag lunch help the company's bottom line?

The short answer is: it doesn't.

One brown-bag lunch will do nothing at all for the company's earnings this quarter. Or next quarter. Or even next year.

However, one brown-bag lunch just might...
... inspire an attendee to write a little reporting prototype, just to play with the new app
which just might ...
... inspire a passing product manager to make that a new feature
which just might ...
... get the product a "most improved 2012!" writeup in a major industry magazine
which just might ...
... increase sales by 75% that quarter.

Now there's a lot of mights in there. It might not happen. But it might. And if we don't hold that brown bag lunch, then it definitely won't happen.

So our leaf - a brown bag lunch - has helped our tree - employee training. And because we helped our tree, we gave our forest a better chance.

Handle the leaf, and think about the tree. The forest will follow naturally from there.

Monday, November 15, 2010

The Forest, The Trees, and the Leaves: Part I

What's that old saying: "You can't see the forest for the trees"?

The truth is, we have to see the forest sometimes; sometimes we really need to be looking at the trees; and sometimes the leaves are the targets of our investigation.

The smallest unit of work we can do is a leaf. A test session is a leaf. A bug verified is a leaf.

The forest is what we commonly call "vision". The forest is the vision of the company and the ultimate goal: to make money for shareholders. When we look at the forest, we are attempting to directly answer the question, "What will X do to make money for the company?"

The trees are the in-between. The trees are the sum of our actions that together provide value for the business, and indirectly affect the company's vision. Often, we call the tree "strategy", and in a medium way it is. Employee training is a tree. Brand identity is also a tree. Customer service is a tree, too. Quality of product is a tree. There are many trees.

The leaves service the trees, the trees make up the forest. We don't ask what the leaf does for the forest; we ask what the leaf does for the tree. We ask what the tree does for the forest.

We often hear: "Don't do something unless it helps the company." I offer a corollary: "Consider how this helps the next biggest thing."

Over the next few days we'll talk about examples of considering the next biggest thing.

Tuesday, November 9, 2010

Manage People, Not Roles

Alan Page wrote recently wondering where the innovative thoughts in software testing really were (I'm paraphrasing; go read the blog post). In a comment, he mentioned that "The way we manage test teams (entire engineering teams, really) has to change."

He's right. We have to learn how to manage better.

I, like many managers, worked in my discipline (software test) and was promoted up to being a lead, then to being a manager. This makes me a technical manager; I understood the discipline of test very well. I understood the discipline of management rather less well.

Fortunately, I've learned a lot since I was first a manager. The single biggest lesson:

Manage people, not roles.

If I'm a Test Manager, there's a tendency to read it exactly that way: test first, manager second. When this happens, it's easy to approach every problem as a test problem. After all, test is the comfort zone, and manager is the new and slightly mysterious stuff. Things like team makeup become a matter of "ensuring coverage" by hiring different types. Things like correcting a team member's behavior become "logging a bug" and "verifying the fix", which often translates to saying, "here's a problem. Let me know when you've fixed it." That might work when logging a bug for dev, but it isn't likely to work when you're dealing with someone who is frequently late on deadlines.

The better way is to realize that a Test Manager is a manager first. You happen to manage people who test. A Development Manager is a manager first, too; she just happens to manage people who develop. I'll give you three guesses what an Engineering Manager is first!

I don't manage testers. I manage people who happen to test for a living. People have different quirks than tests do. People need reinforcement and guidance and praise and correction and freedom to fail and encouragement to learn and a chance to succeed. This is true for developers, for testers, for support engineers, and probably for accountants and writers, too. (I'm guessing on those last two, but it seems reasonable).

My first responsibility is to the people who need to get the testing done, not to the tests that the people have to do.

If I can get the right people and give them the right environment, the testing will fall from there. Don't worry about the roles. Worry about the people. Get that right and everything else will follow.

Monday, November 8, 2010

Elevator Test: Part II

Apparently there are a lot of really awkward elevators in this world.

Here's an elevator button panel for us to test (courtesy of Robyn's Posterous):
Let's say the entire thing works correctly. I'm pretty sure I'm still going to log at least a few bugs. We've got usability and consistency problems galore.

Just because it functions doesn't mean there isn't a bug!

Friday, November 5, 2010

Goal vs Task

I met a friend last night out at a restaurant he'd never been to. To tell him where to go, I said, "from your office, walk down Boylston until you hit Dartmouth, and make a right. Then walk down Dartmouth and it'll be on your right." This was a mistake. You see, my friend wasn't coming from his office; he ran an errand on the way and all of a sudden my directions weren't very helpful. What I should have said was, "it's at the corner of Dartmouth and Clarendon." That way, no matter where he was, he could figure out the best way to get there. I told him what to do, not what my goal was.

This applies to testing, too. When the product manager asks me, "can you please run a performance test in configuration foo?", that's telling me what to do. It might be right, or I might get some element of it wrong and basically waste the test (and who has time to waste!). If instead, the product manager wants to "run a performance test to update this analyst with the amazing benefits of our new feature foo", that's telling me what you're trying to accomplish. With that I now know to run the exact same test as I did the last time we published results for that analyst - same data sets, same hardware, same system configuration - with the exception that I'll make sure feature foo is enabled. Because I understand what the product manager was seeking to accomplish, I'm much more likely to provide useful test results.

Don't tell me what you want to do. Tell me what you're trying to accomplish.

I'm much more likely to do the right thing, if I understand the end goal. So if I hear something that sounds like a task to me - that sounds like telling me what to do - I'll probably respond with something like, "Just so I make sure I get the information you want, can you tell me about what we're going to do with the results?" You save yourself a lot of time and test repetition that way.

Thursday, November 4, 2010

Expected Bug Levels

I had a conversation with my boss today that started like this:
Him: "Gosh, I've noticed a lot of bugs lately. The bug count is higher and there are lots of new ones."
Me: "Yup."
Him: "Should I be worried?"

Now that's an interesting question. The correct answer changes based on when we have this conversation. (In this case, we're just after feature freeze, so no, he shouldn't be worried.)

Every project I've worked on has a natural bug flow, like a tide in the ocean. Sometimes there are more bugs found, sometimes fewer. With small variations, it usually goes like this:

  • Project start: Early on in a project (or release), there isn't much new. Most of the work being done is analysis and coding on things that simply aren't checked in yet. The average number of bugs found in this period is pretty low.
  • Mid-Development Cycle: As some features start to get checked in, and the testers get some of their tests done, the bug count starts to creep up. Some bugs have been introduced, and some are being fixed, but the focus of the work is generally still on feature implementation, and the number of bugs found only comes up a little.
  • Feature Freeze: Regardless of whether feature freeze (also known as "a first pass at each feature has been done but it isn't baked yet") is formal or informal, there comes a point where pretty much all the basic implementation is complete. At this point the code is just starting to play together, and the volume of code being checked in is pretty high. The number of new bugs logged also goes way up; this phase often represents a high point in terms of quantity of bugs found. Note that this isn't inherently bad - we've got a lot of things and now we have to get them all to play together. It's just a relative high point, and it's to be expected.
  • Polishing: After the features are basically implemented and before release, the number of bugs found and number of bugs open generally trends down. Integration points are smoothed out, bugs that were ignored because someone was heads down in a feature are fixed, and the entire product solidifies toward something release-able.
This isn't a metric, and it isn't a value judgement. It's simply a project heartbeat. Seeing "lots" of bugs logged - regardless of what "lots" means to your project - might be okay or it might be a big problem, depending on where you are in a project. Take a look at your defect tracking system: it's one place that might help you see whether your project is behaving normally or if there's something going on that warrants investigation.

Wednesday, November 3, 2010

Manufacturing Software

Matt Heusser wrote a great riff on "Quality" as interpreted by American business. Go read it and come back.

Back? Great.

So here's the problem with many of the examples in Matt's piece: they're all about manufacturing. Taylor wasn't concerned with the software engineer; he was concerned with paper and steel manufacturing. Lean principles came from car factories. All these things are concerned with building something consistently and doing that as efficiently as possible. Innovation isn't intended to make something different or to make the product better; it's intended to make the product more consistent and to use less effort to make that product more consistent.

So now we apply this to software!

After all, software is about building the same thing over and over and simply doing it more efficiently and.... wait. That doesn't sound right at all.

This is the fundamental breakdown: as software professionals we seek innovation in the product itself, not innovation in the process used to create the product.

There is a process of manufacturing software. There is a compiler that manufactures bytecode. There is a library that calculates a hash. Those are manufacturing processes in the sense that the way we improve quality is to produce the same end result (the same bytecode, the same hash) more efficiently and more consistently. Taylor's "labor" - the workers actually doing the manufacturing - is all software for us now. We, the developers and the testers, we are Taylor's management.

We as management have responsibilities, too. It is our responsibility to seek better products for our workers (compilers) to build. It is our responsibility to seek out inefficiencies in the manufacturing process (are we using SHA-2 hashes when MD5 will do?) and to correct them. We are the ones who must conduct time and motion studies (only we generally call this optimizing the software or performance analysis).

So yes, go ahead and use a manufacturing analogy - just recognize that the humans are all on the "management" side of that analogy. (Yes, even those of us that don't have "manager" in our title.) Taylor wanted to make laborers into highly efficient drones with focused areas of expertise. In software, he has succeeded - a computer program makes a highly efficient drone with a focused area of expertise. It's time to move on; it's time to think about what we're building instead of how we're building it.

Monday, November 1, 2010

Developer and Tester Pay

When I'm building a development team, I'm looking for certain characteristics:
  • ability to learn quickly
  • logical thinking
  • familiarity with common patterns and techniques
  • knowledge of relevant tools
  • self-directed and able to solve problems on their own
When I'm building a test team, I'm looking for certain characteristics:
  • ability to learn quickly
  • logical thinking
  • familiarity with common patterns and techniques
  • knowledge of relevant tools
  • self-directed and able to solve problems on their own
Those look the same!

In many ways, testers and developers are really quite similar. A lot of the expectations we have for developers are the same as the expectations I have for testers.

So why would I pay my testers and my developers differently?

I'm hiring a software professional. Sometimes it's a software professional who happens to have expertise in front-end technologies; sometimes it's a software professional who happens to have experience with file systems; sometimes it's a software professional who happens to have experience with performance analysis. In all cases, I'm prepared to pay according to how rare that expertise is. Your average kernel developer is going to cost more than your average web developer. Your average performance tester is going to cost more than your average GUI tester.

Judge by the expertise you need, not by the title you're going to give. That will tell you how much you need to pay for the next software professional you hire.

Thursday, October 28, 2010

Ant on OS X

I ran into this last night and scratched my head for a little while, so this is for anyone else who runs into this:

I have a Java project that I build with Ant on my OS X machine. It compiles and produces a war file called "project_client.war". After the most recent OS X update late last week, that stopped working. Instead, it started producing a war file called "project_${}.war".


Here's the relevant ant task:

<target name="buildWarFile">
  <delete file="${current.war}" failonerror="false" />

  <war destfile="${current.war}" webxml="${current.resources}/facilitate_core/resources/facilitate.xml" duplicate="preserve" update="false">
    <fileset dir="${current.resources}/${current.secondary_resource_dir}/html" />
    <fileset dir="${current.resources}/facilitate_core/html" />
    <webinf file="${}/templates.jar" prefix="WEB-INF/lib/" />
    <webinf dir="${current.resources}/${current.secondary_resource_dir}/resources/templates" prefix="WEB-INF/resources/templates/" />
    <webinf dir="${current.resources}/facilitate_core/resources/templates" prefix="WEB-INF/resources/templates/" />
    <webinf file="${current.jars}/velocity-1.4-p1.jar" prefix="WEB-INF/lib/" />
    <classes dir="${}">
      <include name="com/spun/**" />
    </classes>
  </war>
</target>

<target name="pushWarFile">
  <isset property="" />
  <ssh host="${}" username="${current.ssh.userName}" password="${current.ssh.password}" version="2">
    <sftp action="put" remotedir="${current.ssh.remoteDir}" verbose="true">
      <fileset file="${current.war}" />
    </sftp>
  </ssh>
</target>

That references two properties files, a (read in first) and a (read in second).

This is

This is

When I ran the ant task with -v, it showed an error:
" is not defined"

Well, that's funny. This worked just before the update, with no code change. Also, I see name right there - it's defined!

The trick turned out to be removing "current" from the war property, making look like this:

And that did it. My best guess is that it's something that has to do with the new namespaces features for properties in ant. But if anyone else runs into something like this, maybe this will help.

Wednesday, October 27, 2010

Beware the Matrix

Matrices are very common in testing. Here's one we use at one of my clients:

It looks simple, and it really is. We have our tests written. All we have to do is run the test in a loop for each of the squares in our matrix. Hooray!

There's a catch. (There's always a catch.)

In this case, the catch is that each test takes about 5 hours to run, and we don't really have the machines to spare to run 100 different configurations at 5 hours each.

That's the problem with matrices; they seem like such a good idea, and they're really easy to design as test cases. Unfortunately, they're often something that simply can't be completed within your constraints, and even presenting a matrix like you see above generates an expectation of completeness on the part of anyone who looks at it. If I show this to my boss, for example, his response will be, "Great! Let me know when we can see the results." Unless he's willing to spend a whole lot of money on machines (and we do have a budget), the answer is "No time soon, and frankly there are better ways to spend our time."

Depending on what's in your matrix, it might be a good candidate for combinatorial testing. In particular, if your equivalence boundaries are accessible, consider using this technique. If your needs are different, then consider flexing along one variable at a time or simply skipping steps. In my example above, we're testing performance, so we don't necessarily care about every single step along the way. We can run with the extremes and a few values out of the middle. Only if something "looks funny" (that's a technical term!) do we need to go back and fill in the blank. In the end, we wound up running every other cache size and only at two chunk sizes. From that we learned that cache size didn't seem to make a major difference, but chunk sizes did. Test complete, information gleaned; the total cost of the effort was 10 tests - 10% of the cost of the entire matrix.
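The "extremes plus a few values out of the middle" approach is easy to mechanize. Here's a minimal Ruby sketch of that sampling; the cache and chunk sizes are made-up stand-ins for whatever is in your real matrix:

```ruby
# Hypothetical parameter ranges - substitute your matrix's real values.
cache_sizes = [64, 128, 256, 512, 1024]   # MB
chunk_sizes = [4, 8, 16, 32]              # KB

# Instead of the full cross product, test the extremes of each
# variable plus one midpoint.
def sample(values)
  [values.first, values[values.length / 2], values.last].uniq
end

runs = []
sample(cache_sizes).each do |cache|
  sample(chunk_sizes).each do |chunk|
    runs << { cache: cache, chunk: chunk }
  end
end

puts "Full matrix: #{cache_sizes.length * chunk_sizes.length} runs"
puts "Sampled:     #{runs.length} runs"
```

If something "looks funny" in the sampled runs, you go back and fill in the neighboring cells; otherwise you've bought most of the information for a fraction of the machine time.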

Sometimes you really do need to fill in the whole matrix, and when that happens, get cranking and start filling in boxes. Just make sure before you present a matrix that you really do need the entire matrix. If you only need part of a matrix, only show your test consumers (dev, management, whoever) the part of the matrix that you actually need and that you actually intend to test.

As with all tests, when you're looking at a test matrix, ask yourself what you're intending to learn from running these tests. Then run only the tests that give you the learning you need. Skip the rest; there are plenty of other things to do!

Tuesday, October 26, 2010

Rspec and Generators

I learned a new trick today!

When working in Rails, I use the generators as easy ways to create models with migrations, and whatnot. I got used to running my generators with "--rspec" since that's the test framework I'm using currently.

I can save myself some time by adding this to my application.rb:

config.generators do |g|
  g.test_framework :rspec
end
Simple, right? But it sure saves me from forgetting and having to hand-make my own spec files. (It's the little things...)

Monday, October 25, 2010

Anybody Need a Pair?

Here's a dirty little secret: testers are probably the biggest generalists on any team. Your average tester can write some code, parse a customer's request into a requirement, draw boxes on a board to explain the system design, install a new package in the test lab, and go through logs with a support engineer to see what on earth the customer might have done. Oh yeah, and test. Generally, your tester isn't your best developer, is not your greatest business analyst, probably doesn't understand the system as well as the architect, but he's pretty good at a lot of things (and often very good at testing).

So if you're a tester and you want that kind of breadth of knowledge, how do you start?

Pair with anyone who will let you.

Okay, maybe you're not in an agile environment and y'all don't do pairing. In that case, the question is "can I sit with you for a while and...?". It's still pairing on some level, we just won't call it that! You get to learn about what they do and about the system from a different perspective, which can only help you test better. They get the benefit of fresh eyes, and of having to explain things to a newbie.

I don't think this was set up intentionally, but in every functional engineering team I've worked with, the test team is the glue that crosses silos. You want to be in that position. You want to be the team that sits with development and says, "You know, I heard the support team complaining about that taking a long time to do serially. I think they said they had a customer that added 200 widgets a day, and the customer was none too happy about having to do it one widget at a time." It may not get the workflow changed, but it starts the conversation, and you have a decent chance at fixing real problems before you ship.

So when in doubt, find someone and go pair. It's amazing what you might find, and you didn't even know you were testing!

Thursday, October 21, 2010

Metric Types

I've been at STPCon this week, giving a few presentations and going to several presentations. One of the more controversial sessions there was on metrics, and how to use metrics to get to "success" (whatever measure of success you're using). The most interesting part to me was the discussion afterward.

You see, metrics sound like they ought to be awesome: we can measure what we're doing and then we'll know whether we get better or not! In practice, they tend to be pretty much unrelated to things that actually help us and actually help the business, and they are either neutral or they can do more harm than good. I'm not quite ready to throw this baby out with the bathwater yet, though.

There are three main types of metrics that you can use:
  • internal metrics
  • process-oriented metrics
  • business-oriented metrics
Internal metrics are the ones you don't tell anyone outside of engineering about. They're the things you track to measure your own performance. Internal metrics include checks on how accurate your estimates are: "On this last project, we were on average 20% over our estimates." or "The five last releases, we found a lot of bugs in the whizbang component relative to all the other components. Maybe we should go looking at that one a bit more". Internal metrics can be great information providing tools. Some of them are really only useful to point out potential problem areas in a one-time look, while others you can use for years. The important part of internal metrics is they provide information to engineering and don't have a lot of relevance to other departments. These are the things you go find when you say, "I wonder what..." and don't need to put up on a big project dashboard.
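That estimate-accuracy check is just arithmetic over data you already have in your tracker. A quick Ruby sketch, with made-up numbers standing in for real task data:

```ruby
# Hypothetical data: estimated vs. actual days for tasks in the last release.
tasks = [
  { estimate: 5.0, actual: 6.5 },
  { estimate: 2.0, actual: 2.0 },
  { estimate: 8.0, actual: 10.0 },
]

# Average overrun as a fraction of the original estimate.
overruns = tasks.map { |t| (t[:actual] - t[:estimate]) / t[:estimate] }
average  = overruns.sum / overruns.length

puts format("On average we were %.0f%% over our estimates", average * 100)
```

The point isn't the script; it's that internal metrics like this are cheap one-off questions, not dashboard material.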

Process metrics are the ones that can be the most dangerous, and they're really common. These are metrics like "number of bugs found per story" or "number of bugs per thousand lines of non-comment code". The problem with process metrics is that they measure how you do things, not what you accomplish. Given that most people are trying to game metrics, gaming these changes how you do things, but not necessarily what you do.

Business metrics are the ones that measure effect on customers. Revenue is an example of a business metric, as is % of returning versus new customers, or brand value (Coke is worth over $66B). These are the metrics that are directly tied to the success of your business. Bringing these metrics into test is often somewhat difficult since better testing rarely can be tied directly to new customers (indirectly, of course, the overall quality of your product affects how many customers you bring in and how many you keep). However, if there is an issue that will cause you to lose customers, or to be publicly embarrassed and hurt your brand value, the effects of that issue can be tied to these business metrics.

Don't throw metrics out entirely. Just be careful that your metric is limited in scope to ways that it will be useful, or that it can actually measure what you affect for the company's overall purpose. Throw the rest of it away - you'll have more time to spend actually testing.

Monday, October 18, 2010

They Don't Have To Justify

Here's the situation:
You found a bug. You think it's a really annoying bug. You reported the bug, including your opinion on the likelihood and the annoyance of having to do this really obnoxious workaround.

Then it got deferred. The official reason: "Not important enough to hold the release. Fix it for the next service pack."

Wait, what?! It's really annoying! Why on earth would you defer it?! It's time to go to the person making the decision and get an answer. You need to understand why they would make such a counterintuitive decision.


Deep breath.

Lesson #1 in corporate politics: you are a tester. The person making the ultimate call is probably a director, VP, or someone else who's been around for a while and seen some things. They do not have to justify themselves to you. It doesn't matter if they've been promoted to the level of their incompetence; they still outrank you. Demanding justification will get you nowhere.

Your job is to make sure they understand the bug and its implications. They have to "get it". Once they understand, they can make a decision. If the person then wants to explain that decision to you, that's wonderful (and not uncommon), but it's not actually a required part of the process.

Seek confirmation of understanding, not justification.

Remember, you are an ally of and an advisor to the person making the release decision. You need to be someone they can trust, someone they want to see. If you're constantly asking for justification and making them defend their decisions, then you can't be an ally. So do what you really need to do - be an advisor and ensure understanding. Don't force your understanding of the ultimate decision; it'll generally come out anyway, and you won't have had to be obnoxious about it.

Friday, October 15, 2010

On Network Debugging

I've been working on a project that involves essentially screen scraping. Basically, I perform an http get, parse the response, form a request, and then perform an http post (launder, rinse, repeat). It's not pretty, but this particular site doesn't offer an API and this was their suggestion.
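The get-parse-post cycle looks roughly like this in Ruby. Everything site-specific here - the URL, the hidden field name, the form parameters - is invented for illustration; the real site obviously differs:

```ruby
require "net/http"
require "uri"

# Hypothetical: scrape a hidden token the site expects echoed back.
TOKEN_PATTERN = /name="auth_token" value="([^"]+)"/

def scrape_token(html)
  html[TOKEN_PATTERN, 1]
end

def post_widget(base_uri)
  # 1. GET the form page.
  page = Net::HTTP.get_response(base_uri)
  # 2. Parse the response for the token.
  token = scrape_token(page.body)
  # 3. Form the request and POST it (launder, rinse, repeat).
  Net::HTTP.post_form(base_uri, "auth_token" => token.to_s,
                                "widget[name]" => "test widget")
end

# e.g. post_widget(URI("http://example.com/widgets/new"))
```

Most of the debugging time goes into step 3 - figuring out which parameter the site wanted that you didn't send - which is exactly where the network monitoring tools below come in.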

As I work through this, I spend a lot of time looking at the requests and responses going over the wire, trying to figure out what parameter I didn't set properly, etc. To do this debugging, I had my choice of network monitoring tools. The ones I use most are:
  • Wireshark
  • tcpmon

Wireshark is a network packet tracing tool. It runs as a wrapper around your network driver and picks up all the traffic. This is great when you are trying to figure out basic connectivity, any sort of network congestion, or the like.

tcpmon is a tool specifically for monitoring TCP traffic. It lets you see the requests and responses, and even lets you modify a request and resend it.

Choosing a network monitoring tool depends on where you think your problem is. The OSI Model for networks has seven layers, and you should aim your tool at the layer(s) where you think you have a problem. Think about the kinds of problems that you might see (bottom to top):
  • Physical Layer
  • Data Link Layer
  • Network Layer - Wireshark shows here up
  • Transport Layer - Tcpmon shows here up
  • Session Layer
  • Presentation Layer
  • Application Layer

For this problem, I only cared about the transport layer and up - what was in my tcp request and what the app did with it. So I was able to use tcpmon rather than Wireshark. Neither is better than the other, but tcpmon showed me what I was looking for without the extraneous information Wireshark offered.

I tend to choose tools that are as high level as possible while still showing me the error I'm seeking. It's not a perfect rule, but as a general rule of thumb, it works pretty well.

Twist Podcast

Matt Heusser and I had a chat not too long ago, and he's put up this Twisted Podcast (free registration required). We talked about... how we got into test, some about test teams in an agile environment, how testers and devs can pair to accomplish all sorts of tasks, and a few other things.

Go have a listen!

Wednesday, October 13, 2010

Test Variations

Once you have a base of test automation, it becomes easy to do variations on a test. What if I ran test X with changed configuration Y? What if I ran it with different hardware Z?

These are one of the benefits of a good test automation stack; it becomes fairly easy to run lots of variations on the same thing, and that can result in a lot of learning about how system behavior changes as various things flex. For example, I recently took a performance test we run and tweaked the setup to make it run with different memory allocations - and in not too long I had a much better understanding of how system throughput changes as a result of adding or constraining that particular resource.
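With a stack in place, a variation run can be as small as a loop over configurations. A sketch of the idea - run_performance_test is a hypothetical stand-in for whatever your harness actually provides, and its fake result is just there to make the sketch self-contained:

```ruby
# Hypothetical harness call - stands in for your real automation entry point.
def run_performance_test(memory_mb)
  # ...configure the system under test with memory_mb, run, collect results...
  { memory_mb: memory_mb, throughput: memory_mb * 10 }  # fake placeholder result
end

# Flex one variable and watch how system behavior changes.
memory_allocations = [512, 1024, 2048, 4096]
results = memory_allocations.map { |mb| run_performance_test(mb) }

results.each do |r|
  puts "#{r[:memory_mb]} MB -> #{r[:throughput]} ops/sec"
end
```

The test logic never changes; only the loop and the configuration do. That's what makes these variations cheap.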

This isn't the kind of test I need to run frequently. I run it every once in a while to make sure the system behavior in this area hasn't changed, but it's overkill to run it every single night with a lot of our automation.

So here's the dilemma: Do I check in the test with my modified configuration?

Pros to checking in:
  • Next time I run it, I'll be sure I have the same setup, so my results are comparable
  • Someone else could run the same test and gather their own information from it (or a variation on it)
Cons to checking in:
  • Checked-in code that doesn't run is likely to get stale and stop working as other code changes around it. This goes for test code, too!
  • Someone might try to run it more frequently, and that will take up a lot of machines for very little benefit
There's no particular right answer here. Which one you choose will depend on how hard it was to set up, how much consistency matters, whether it's really likely to stop working, and how frequently you do want to run it. Pick the one that works for you; just make sure it's a choice you make, not something you fall into.

Tuesday, October 12, 2010

On My Own

Well, it finally happened. I quit my job.

Readers who have been around a while will know that I was an employee during the day and a writer and project tester at night (and some weekends, when it wasn't too pretty out). Through no fault of my employer, I have decided that it's time to pursue my own path.

I've joined Abakas full time as a test consultant.

So what's an Abakas?

Abakas is at its core a group of engineers who really like diving into new code bases and working with a lot of interesting people. We're developers, testers, and a designer or two. We've been doing - actually doing, not just talking about - various forms of agile development (SCRUM, XP, and lots of "tweaked" variations thereon) for years, and we like sharing what we've learned. We've developed everything from a cloud deployment module to a personal health website complete with calorie calculations and exercise logs to a huge restructuring of a multi-language multi-hour build system. Working mostly in Java and in Ruby on Rails, we've done many different things, and we've gotten pretty darn good at it.

Not to sound casual, but engineering can be fun and engineering can be collaborative, and part of what we do is help you get that excitement, too. There's no reason you shouldn't see your work in progress, and there's no reason we shouldn't work toward your end goal iteratively.

So if you have a project and don't have the guys to do it (maybe they're all busy on other things, or maybe you don't have an engineering team yet), or if someone said "hey, we're scrum" and your testers don't know what that means for them, drop us a line. Let's plunge in and get started - we'll see where it goes, together.

Oh, and tomorrow I promise a testing-related post!

Thursday, October 7, 2010

Between Us and Our Creation

Lo back in the mists of time, there was the engineer. And he had punch cards. And that was it. And since then, we've grown many tools. We have compilers and debuggers and libraries and test frameworks and code generation and IDEs and runners and automation tools. And all these things are good. Unfortunately, all these things stand between us and our creation.

I wonder if we really still understand the software we create?

I've been hiring for development and QA positions, and discovering that people can't tell me about the systems they've worked on. They can only describe what their tools showed. They don't really understand what's going on. I find that very disappointing: when I ask a tester what portions of their systems are prone to race conditions, I get a blank stare. When I ask what a failed disk would do to their system, another blank stare. They don't really understand what's happening to a level where they could identify a potential problem area and track it down. They only understand the lumps of things their tools expose.

So here's my challenge to you: Throw away your biggest crutches for a day. Get rid of your IDE and your log parsing tool. Really look at your system and what it's doing rather than just what your tools are telling you. Let's see how good our understanding really is.

Tuesday, October 5, 2010

What Will You Do With the Results?

Tests are good. I like tests. Tests show us things. Sometimes we call those things bugs. Sometimes we call those things information. Sometimes we call those things validation. Sometimes we call those things "someone should tell support because customers aren't going to understand this one".

The point of running a test is that we get something for our efforts. We have, after all, spent time, computing resources, brain cells, and possibly some money on this test. So whenever anyone wants to do a test, I ask one question:

What will you do with the results?

This question serves two purposes:
  1. It helps us understand whether doing the test is worth the cost.
  2. It helps us structure the test so we get the information we need from it.
In short, considering the output - including the reporting - before we start the test helps make sure it's a good test.

For example, if I'm doing a performance test, I might ask "what will I do with the results?" If the answer is "give them to marketing for the website", then I'm going to design a performance test to give me the biggest number I can get out of a configuration that could exist in the real world (hey, it's marketing - we're trying to show off a little bit!). If the answer is "figure out what our biggest customer is likely to see" then I'm going to design the test to match the customer site and workload to the best of my ability.

Thinking about what you want - and it doesn't have to take long - will help make your test results much more usable. So before you start, ask "what will I do with the results?"

Thursday, September 30, 2010

Rails Bundler

I've been working with Rails3, and in particular with the bundler feature. Bundler exists because dependency management is truly obnoxious, and bundler tries to make it easier.

I'll let the tutorial teach you how to set up bundler. Let's look at what we're striving for.

The ideal is that setting up a Ruby application is as simple as:
  • install your database
  • install the right Ruby
  • install the bundler gem
  • check out the source tree
  • configure config/database.yml
  • run "bundle install"
  • run "rake db:setup"
  • Go!
This is really cool. No. Really, it's that cool. It's much easier than clicking on a page and watching it go boom because of some dependency you forgot.

Here's what's going on:
You specify the gems you want in the Gemfile, including the version you want. Then you run "bundle install", and it creates a Gemfile.lock file with all the gems that you're using (including their versions).

This directly addresses the issue of keeping environments consistent: each person runs "bundle install", which retrieves the exact versions specified in the Gemfile.lock and makes them available to the application. This is much more reliable than simply saying "require 'mime-types'" and maybe getting version 1.16 or maybe getting 1.15, depending on what was installed on that developer's system or that server. The only Ruby gem needed, in fact, is bundler itself - all others, including Rails, are loaded through it, in isolation from any other versions of those libraries installed on the system. You can have four versions of some gem installed, and it'll use exactly the one specified in Gemfile.lock.
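To make that concrete, here's a minimal sketch of a Gemfile (the gem names and versions are illustrative, not from any real project):

```ruby
# Gemfile - a minimal sketch; gem names and versions are illustrative.
source "https://rubygems.org"

gem "rails", "3.0.3"          # exact version
gem "mime-types", "~> 1.16"   # pessimistic: >= 1.16 and < 2.0

group :test do
  gem "rspec-rails", "2.3.0"  # only loaded in the test environment
end
```

Running "bundle install" resolves these constraints and writes the exact result - every gem, including dependencies of dependencies - into Gemfile.lock.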

There must be some downside:
The biggest drawback to the bundler arrangement we have now is that deployment to servers (or development environments) depends on the availability of the gem server and of the specific versions named in the project Gemfile. If that server is down, or a required version is no longer available, the "bundle install" will fail.

The fix:
Run "bundle package", which downloads the .gem files your application uses into vendor/cache. You can then check them in. No more external dependency (but your source tree is rather larger).

Now that's cool. If you're on Rails3, give bundler a shot.

This guest post is courtesy Dan Powell.

Wednesday, September 29, 2010

STPCon 2011 - Call for Proposals

Okay, confession: I didn't write this, but it's worth repeating:

Submit! Try your hand at speaking to other testers!

Seriously, if you have anything you like to rant about, that's probably a good topic. Put it together coherently and send it in. I believe STP pays the conference fee for speakers, so you'll even get to all the cool stuff and only have to pay plane and hotel - not a bad gig.


You are invited to submit a speaker proposal for one of the leading events in the software test and quality assurance profession – Software Test Professionals Conference 2011, March 22-24 in Nashville, TN at the Gaylord Opryland Hotel.

If you have a session topic that meets STP's standards, please visit our website at:

STP needs you – active software test and quality assurance practitioners, consultants, professional speakers, trainers, and solution providers– to share your expertise with your peers at one of the many one hour breakout sessions that will be featured at this event.

STP’s conference program board will review and select from these submissions the topics that support existing and emerging trends facing our industry today. We are seeking session proposals with solid content featuring specific how-to’s, and valuable take-aways. We are interested in sessions from industry leaders who have a particular expertise they can share, one that sheds light on an issue, practice or process in our industry. Product pitches will not be considered a viable breakout session. Strong platform and presentation skills are a must!

It does take work to put together a good presentation, but the rewards are great! If selected to present at STP 2011, you will receive a complimentary registration to the conference. Speaking at a Software Test Professionals conference not only gives you and your organization exposure at the conference but also leading up to the conference. Speakers who are chosen to present will have their name, title, organization, photo and bio published on the STP 2011 Conference website and in the full conference brochure. Speakers are provided with additional opportunities to promote their expertise and/or business. For instance, conference speaker articles have been featured on our website and e-blasted to our community of 50,000 members. When the article is featured, we also send out social media alerts to the software testing community-at-large.

Submit your proposal to speak now:

For consideration, please complete the online proposal form by September 30, 2010. Feel free to submit more than one topic. We look forward to receiving your proposal and hope to see you in Nashville in March!

Pass this email along to any of your colleagues who you believe would be an outstanding speaker at STP’s 2011 Annual Conference and Expo.

Thank you,
STP 2011 Conference & Expo Team

Tuesday, September 28, 2010

A Note On Su

I just learned this yesterday, and thought I'd share.

On a Linux system, doing:
(sudo) su -

And doing
(sudo) su

do different things. When you include the dash, it loads root's environment. When you skip the dash, it makes you root but leaves you in your standard user environment. This is obvious once you know what to look for, but it took me a minute to figure out when I had forgotten that dash.

Monday, September 27, 2010

Defect Injection

So you wanna know how good you are.

You know how many bugs you find. You know how severe they are. You know how frequent they are. You know the ones that cause people to say, "oh, okay" and the ones that cause people to say, "Oh boy, am I glad that one didn't get out to the world!"

But you don't know much about the bugs you missed. Maybe you missed 10, maybe 1000. Maybe the ones you missed will be seen in the field, or maybe not. Maybe they're high severity, maybe they're low severity. That's a lot of maybes and a big black hole in your knowledge of testing the system.

Enter defect injection.

Defect injection is a technique designed to see how effective testers are and what kinds of things they find and they miss. Here's how it works:
  1. Go find your friendly local development team
  2. Ask them politely to insert 10-100 bugs throughout the code.
  3. Test as normal
  4. Go back to your friendly local development team and show them what you found
  5. Compare lists
  6. Ask your friendly local development team to fix all the bugs they inserted before shipping, please.
Basically, this is a test of your defect finding skills. Given a known list of bugs, it's possible to make statements about the bugs you missed. It starts to become possible to see patterns. For example, maybe you miss bugs in a given module, or maybe you miss race conditions because your tests tend to be really short.
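The list comparison in step 5 is just set arithmetic. A minimal sketch (the bug IDs here are made up):

```ruby
# Compare the injected-bug list against what the testers found.
# Bug IDs are hypothetical.
injected = ["BUG-1", "BUG-2", "BUG-3", "BUG-4"]
found    = ["BUG-2", "BUG-4", "BUG-9"]

caught      = injected & found   # injected bugs we caught
missed      = injected - found   # injected bugs we never saw
bonus_finds = found - injected   # real bugs found along the way

puts "Catch rate: #{caught.size}/#{injected.size}"  # Catch rate: 2/4
puts "Missed: #{missed.join(', ')}"                 # Missed: BUG-1, BUG-3
```

The missed list is where the patterns live - that's where you look for the module, or the bug type, your testing keeps sliding past.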

If you have a willing development team, and you'd like to know what you find and what you don't in your particular situation, consider using defect injection. Just don't forget to remove the injected defects.... before you ship!

Friday, September 24, 2010

Beware the Zone

Ah.... the zone. That fabled place where we are tuned in to our work, where we're humming along with our system, where synchronicity and productivity bloom.

Beware the Zone

A zone is a temporary communion with your work, yes. But a zone is also a circumscribed place, a place that by definition has boundaries and edges - like a tunnel. A zone is characterized as much by what you do not see as by what you do see.

Many of the bugs I've found, and most of the really serious behaviors I've been able to characterize have started as a distraction. They weren't part of the test I was doing, but rather something that made me say, "hmm, that's funny."

When you're in the zone, you tune out distractions. That means you aren't seeing the guys across the hall playing basketball with a trash can, sure. That also means you're not noticing the odd setting in the log file that isn't what it should be.

It feels great to be in the zone. So let yourself go into it. Just don't stay too long - and ask yourself formally what sidelines you should be pursuing. It takes time to learn, but make sure that anything you accomplish includes the question, "and what else interesting do I see?". If you ask it formally long enough, it'll enter your zone, and then you're really testing.

Leave the basketball players out, but try to make sure the log file oddities are within your zone. After all, we're testers; very little we do is narrow. So let's make sure our zone is good and wide.

Tuesday, September 21, 2010

Ease Into It

I'm working with a support engineer who has some scripting chops. He wants to dig into a much bigger code set now so he can continue improving his development practices. I'm all for this. He's expanding his skills, he's providing help with projects, and he seems to be having a good time doing it.


It's a big code base he's coming into, relative to the code he's been working on. Most of his background is 100 line scripts written to be used in isolation and mostly by himself. The test code base he's going into is about 250,000 lines, is object oriented, and is highly interdependent. It's a brand new environment.

So the first change I asked him to make in there was related to a report it spits out. This change is going to involve a lot of things he already knows how to do - string and file manipulation, searching, etc.

When I give someone a project that involves a lot of new learning, I try to make sure it has three attributes:
  • It's useful.
  • It's straightforward.
  • It relies on techniques he's already comfortable with.
We're trying to set this engineer up to succeed, and a comfortable engineer is much more likely to succeed. So we give him a project that helps him stretch his skills but that leaves him grounded in skills he feels comfortable with. The goal here is to introduce uncertainty only in the areas where there is active learning. Everything else - the requirements of the project, whether anyone will care, and language or other techniques - should be easy or familiar. This helps the engineer learn without increasing the frustration level too much.

Learning something large isn't easy. So ease into it - start from what you know, and change only what you need to learn. There you'll find a useful project and a good learning experience. Repeat that enough, and you'll discover a whole new world of skills.

Monday, September 20, 2010

Explain It To A Newbie

As we work with code, we form habits. Some of them are great habits - run the tests before checkin, for example. Some habits are bad habits - writing only unit tests and no acceptance tests until after all the code is written, for example. Many habits are neutral habits - using rspec, for example. In all cases, they're ruts we fall into.

I'm suspicious of ruts. When I stay in a rut too long I worry that I'm not doing things as efficiently as I could. Maybe there are newer methods that make things faster/better/stronger. Sometimes switching doesn't make sense, and sometimes it does. But you have to think about your habits to even consider changing them. It's shark engineering - keep up with the tools or you'll die!

To evaluate my habits, I'll grab a friend who's new to the codebase and explain what's going on.

For example, this is a recent one (a good habit):
Me: "Run bundle install to grab all the dependencies"
Him: "What does bundle install do?"
Me: "Bundler is a Rails 3 utility that allows you to specify all the gems your project depends on and just install them as you need them."
Him: "Oh, that's pretty cool. Saves discovering dependencies manually or installing them weirdly."

Or another recent one (neutral, this time):
Me: "Yup, we're using rspec here"
Him: "Why rspec?"
Me: "Well, it's in Rails core so there's not much to install, and the feature set is broad enough I haven't found much I can't do yet."
Him: "Oh, well, okay."

And a third (oops - a bad one):
Me: "I'm loading these with a rake task."
Him: "That's seed data, right? Why not use seeds.rb?"
Me: "Well, I taught the person providing the data how to write yaml so it's easy."
Him: "Sure, but seeds.rb can run any Ruby, so you can load the yaml from there, and then a new developer will be able to use the standard method."

This was a bad practice - and I fixed it. He was right, after all.
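The fix itself is only a few lines. A sketch of the seeds.rb approach, with the YAML inlined for illustration (in the real app it would come from the same data file the rake task used, and records would be created rather than printed):

```ruby
require "yaml"

# db/seeds.rb sketch: seeds.rb is plain Ruby, so it can load the
# same YAML the rake task did. Inlined YAML and puts are for
# illustration; the real file would use YAML.load_file and
# create records.
seed_yaml = <<-EOS
- name: revenue
  type: line
- name: headcount
  type: bar
EOS

YAML.load(seed_yaml).each do |attrs|
  puts "#{attrs['name']}: #{attrs['type']}"
end
```

Then a new developer just runs "rake db:seed" and gets the standard behavior, without anyone having to know about the custom rake task.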

The point here is that I'm explaining my practices to a newbie - someone who doesn't know the code but who is an engineer. That innocent question - "Why?" - forces me to revisit a habit and to justify it. If it's good, the habit survives. If it's bad, the habit generally doesn't survive the explanation - and then I know to change it.

So if you're worried about habits, find someone and start explaining yourself - it's amazing what you can learn.

Wednesday, September 15, 2010

The New Toys Stack

As I may have mentioned, I've just started a new project. That means we get to make technology choices. We get to pick what tools we'll use for testing, what language we'll use, what libraries (or gems or plugins) we use. It's fun!

It's also a chance to play with new toys. Wanting to try HAML? Here's my chance. Never really got to mess with Watir? Now I can.


It's easy to go too far. There are so many new things to try, and it's easy to get carried away. You wind up using a technology that dies before it gets off the ground. Or you wind up using technologies that are great but that don't actually add value.

So when I'm building my new toy stack I put in all the new things I want to try that meet two criteria:
  • I'm actually going to use it now. Not eventually. Not soon. Now.
  • It's moving. It's gaining features and being actively worked.
There are no guarantees when we pick a technology that it will do what we want, or that it will grow with us. But we can try to protect ourselves a little bit by looking for projects that are likely to be useful and that are likely to grow with us.

When you're starting a project, feel free to put some new toys in your stack. Just make sure that you'll be happy with your new toys.