Monday, December 31, 2007

Logging Bugs Politely

I've been testing a portion of our software lately that has a lot of potential error messages - that's right, I've been testing a GUI. One of the things I've been looking for is consistency and quality of error messages.

I found a bunch of error messages that look roughly like this:
CommandFailedException: Could not connect to server 127.0.0.1 The username or password was not correct.

The actual cause was usually a mis-entered (or unspecified but required) field. It's a bug, certainly. But you need to be a little careful about how you log it. Usability issues, when you're not doing formal usability analysis (or when you're doing it after the fact), are a bit of a touchy subject because they can come close to being subjective opinions.

Make sure your bug report does this:
  • Be clear about what the error message should say. Consult the product manager or designer if necessary.
  • Be considerate of the structure of the system. Don't ask for information that the system doesn't have at that point (there's a small sketch of what I mean below). 
Make sure your bug report avoids this:
  • Being rude, even if it's funny. "Incomprehensible error" will not make you any friends.
  • Pointing out in excruciating detail what's wrong. Briefly say what it doesn't do and move on to what it should do.
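For what it's worth, the fix this kind of bug asks for is usually plain old validation close to the user. Here's a minimal sketch (the field names are hypothetical, not from our product) of reporting the real problem - a missing required field - before the system ever tries to connect and blames the credentials:

def validate_connection_form(fields)
  errors = []
  errors << "Server address is required." if fields[:server].to_s.strip.empty?
  errors << "Username is required."       if fields[:username].to_s.strip.empty?
  errors << "Password is required."       if fields[:password].to_s.empty?
  errors  # display these next to the offending fields instead of a CommandFailedException
end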

Friday, December 28, 2007

Always Touch It

Let me state up front that I'm a huge fan of automated tests. I love the idea that while I'm fast asleep every night the computers in my lab are humming away doing all sorts of things to the code and generating a nice little log file for me showing what passed and what failed.

BUT

I always test the system as a user would see it.

Automated tests are great and wonderful, but your users aren't automatons. They're humans, or other systems, and as such they have their vagaries. Your job is to see a feature as your customer (human or system) will see it. If you don't try it at least once as a user would see it, how do you know what they'll really experience?

I use these "user proxy" tests to look for:
  • Usability issues. Is the time to accomplish a task just too long? Do I find that I'm clicking through to another screen a lot for information? Is three clicks really too many for this common action?
  • Perceived performance issues. What does performance feel like to the user? Sure, it may start rendering in 2 seconds, but if it takes another 20 seconds to fully render in my browser on my computer, is that really okay? Note that perceived performance may be different from what my performance measurements gather.
  • Context. Does this feature make sense with all the other features? Does it hang together when it's being used, no matter how good the screenshot is? Can I get at the feature where I expect to?
  • Inconsistencies. Does this feature feel like an integral part of the system? Does it have the same UI metaphors? Do the messages - no matter how correct - match up with the messages other parts of the system display?
I'm certainly not advocating avoiding test automation. I'm simply advocating living with a feature for a while. Just like you don't really know a house until you've lived in it for a bit, you don't really know a feature until you've used it for a bit.


SIDE NOTE:
When I first wrote this entry I wrote it as "manual testing" versus "automated testing", but I believe these terms are imprecise. It's more "testing as an end user" versus "suites (scripts and other code) that test the system". I haven't come up with really good shorthand terms for this.

Thursday, December 27, 2007

Official Handbook

I talk about process a fair amount, particularly as it relates to testing and as I've seen it used. To a certain extent, process doesn't matter: you don't ship process; you ship a product. However, the process you use (and the mere fact of using a process at all) can greatly affect the product you ship.
So.
Most processes I've worked on are based on standard processes that we've all heard of - RUP, XP, SCRUM, etc. - but they're pretty much all adapted to fit the needs of the company. And the processes themselves pretty much allow for that. One example is that we claim we're doing XP here, but we don't actually work directly with customers all the time. Instead, we work with a product manager who speaks for the customer. Does this mean we're not doing XP? Sure, in the strictest sense we're not doing XP. But if I tell someone we are doing XP, they will understand 95% of what we do and more importantly how we approach problems.
That got me to thinking, though. I've seen variations on most processes I've worked on. So what's the official handbook for various processes?
  • RUP
  • SCRUM (there are a lot of links on this one, but I think I've included the most official). I find SCRUM to be one of the more adaptive and adaptable processes, and I think that's reflected in the very active community around it.
  • XP. This is another very active community with a lot of proponents and a lot of people evolving the process.
If you're going to deviate from the specifically defined process, whatever it is, that's okay by me. Just make sure you know what the specifically defined process is and why you're deviating from it. If it's for good pragmatic (not lazy!) reasons, then I say do what works for you.
Don't let process adherence keep you from shipping good product.

Wednesday, December 26, 2007

Watch Your Spacing

Remember: with some systems, spacing counts.

foo (bar)

is not the same as

foo(bar)


This is fine in some circumstances, and can be useful when you're using space-delimited options (say, in a command line interface). However, be aware that at some point your users will get this wrong. So if you're going to be picky, you need to provide feedback to users right where they're entering the option.

If it's in a config file, throw an error message when the config is read. If it's on a command line, throw an invalid argument error. If it's in a GUI, put up a nice error message next to the field that is affected.
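
For example, here's a minimal sketch (the syntax is hypothetical, in the spirit of the NFS example below) of catching a stray space at parse time and saying exactly what's wrong, right where the user can act on it:

def parse_access_entry(entry)
  # For this sketch, the valid form is host(option) - no space before the parenthesis.
  if entry =~ /\s\(/
    raise ArgumentError,
          "Access entry #{entry.inspect} has a space before '(' - write it as host(option)"
  end
  match = entry.match(/\A(\S+)\(([^)]*)\)\z/)
  raise ArgumentError, "Could not parse access entry #{entry.inspect}" unless match
  [match[1], match[2]]  # [host, option]
end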

True example:
Today I created an NFS share and put in an access control list with an option. It looked like this:

Access: 10.0.0.20 (my_option)

There was a space in it.

The end result is that the share wouldn't export. It took us 45 minutes to track it down to that space.

So do your users a favor and throw a nice error when the syntax is that picky. Spaces are both picky and hard to spot in documentation.

Friday, December 21, 2007

Heuristic for Verifying Automated Tests

I'm in search of a heuristic for verifying automated tests.

The Back Story:
We run a number of automated tests:
  • Build Verification Tests. These run after every build. The continuous integration system will not spit out the build (isos and debs, in our case), unless these tests pass.
  • Nightly Tests. These run every night. They're mostly *Unit based (JUnit, PerlUnit, etc).
  • Weekly Tests. These are the tests that simply take too long to run every night. They're functionally identical to Nightly tests, but each one takes between 6 and 24 hours to run, so we only run them once a week (and the full weekly run takes about 5 days).
The Bugs:
Sometimes there are bugs in these tests. That's fine: we log them and dev fixes them (hooray!).

The Question:
How many runs does an automated test have to pass before it can be marked as verified? 

We do a code review, and the developer runs the test before he checks in, but we still want to see the tests run in BVT/nightly/weekly. The question is, how many good runs do we need to have confidence in the verification?

My Answer:
I don't think this is a formula; there are simply too many subtleties based on the frequency and type of failure, the test structure, the risk of it failing either later in the tests or in the field, etc. So I'm really just looking for a rule of thumb. 

The best I've found so far is as follows:

A test must run successfully for the longest interval between failures plus one.

So, if a test fails every time, then it needs to run once successfully (0 intervals between failures, plus 1). 

If a test fails every fifth time, then it needs to run six times successfully (5 intervals between failures, plus 1). 

If a test fails every three or four times, but one time it went ten times between failures, then it needs to run eleven times successfully (10 is the longest interval between failures, plus 1).
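
If it helps to see that rule of thumb in code, here's a minimal sketch of my own, measuring the interval as the spacing in runs between consecutive failures (the "fails every time" example above counts it a little differently, so treat this as an illustration of the idea rather than exact arithmetic):

# true = pass, false = fail, ordered oldest to newest
def required_consecutive_passes(history)
  failure_runs = []
  history.each_with_index { |passed, i| failure_runs << i unless passed }
  return 1 if failure_runs.size <= 1  # never failed, or failed only once: one clean pass
  longest_interval = failure_runs.each_cons(2).map { |a, b| b - a }.max
  longest_interval + 1
end

history = [true, true, false, true, true, true, true, false, true]  # failed on its 3rd and 8th runs
puts required_consecutive_passes(history)  # => 6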

Thoughts?

Thursday, December 20, 2007

See Around the Problem

I was talking to a developer today about how I approach a system.

Don't just see the problem. See around the problem.

When I see a problem I have to reproduce it. But often what makes a problem happen isn't what you see as the problem. It's what happened to get you to the problem. For example, a user can't log in. The problem may actually be all the way back at user creation, and the exception that was thrown then.

So it's important to see the context of the problem. Find what else was occurring in the system. I look for different things:
  • Was there an exception earlier that didn't seem to have an effect?
  • What else is the system doing?
  • Is there another system involved?
  • What other processes are running?
  • What was I doing right before this happened? Even if it was in another area of the system?
I don't know how to teach this yet. Any ideas?

Wednesday, December 19, 2007

Testing and XP Generalism

One of the things that XP encourages is generalists rather than specialists. Anyone on the team should be able to do any of the team's activities. This is a bit slower to start, but over time amplifies the team's effectiveness by reducing bottlenecks and lowering the team's truck number.

As a tester coming into an XP team, this can prove to be a challenge. I possess a different set of skills than most of the other team members, so how do I fit into a generalist team? Testing isn't the only discipline with this conundrum. The other role that XP calls out separately from the rest of the team is the customer. So, in a generalist team, we now have two specialized roles: customer and tester.

A tester is really just one person playing multiple roles - in a way, I'm a generalist. Sometimes, I act like a developer. One morning, I pair with a developer and we write unit tests. Then I pair with another developer and we write automated acceptance tests for a completely different module of the software. Late in the afternoon, I sit with the product manager (our "customer") and we work on a new story. It's not exactly what I think most XP types call a generalist, but it's a start.

In some ways, though, I'm a specialist. I exist to provide information. I support development with testing and feedback. I support the business with information about risk and the current state of the system. In general, I find that this ends up making the standard XP test tasks better. My exploratory testing on a new story finds issues that we simply didn't anticipate. Then we add automated tests for a lot of it.

So yes, I'm something of a specialist when I join a team. I don't think this is a bad thing, although it does seem to go against some XP principles. We're not following XP to the letter. I happen to think that this is fine; I'm not dogmatic about the processes I follow. Where it doesn't work for our situation, we'll change it until it does.*




* I know this rubs some process zealots a very wrong way. If you have a real-world solution where you've been able to follow any process 100%, I'd love to hear about it. In the meantime, I will continue to make my customers happy, even if it means that some parts of XP get modified a bit to help us do that.

Tuesday, December 18, 2007

Not Everything's a Nail

"He who is good with a hammer tends to think everything is a nail." -- Abraham Maslow

I've been to a lot of holiday parties lately, and when geeks party, well, apparently we talk about software development process!* It's been really interesting getting people together who come from various schools of thought. The party attendees have ranged from a SCRUM zealot to an XP believer, to someone who thinks waterfall processes in general have gotten an unfairly bad rap.

That got me to thinking. When I consider each process from a test perspective**, I find that none of the software processes I'm aware of adequately addresses every problem and every situation.
  • SCRUM: This one really doesn't address the timing of test very well. Whether you have QA on the team or as a separate team, test tends to get squished in at the end of iterations or (worse) to leak beyond them.
  • XP: There is a dogma here that manual testing is universally bad because it's not repeatable, can't be run constantly by anyone, etc. I'm comforted to see this changing in the XP community, with the value of manual testing for usability and as a first step of acceptance starting to be recognized. This community really seems to be starting to address the problem and trying to figure out how to effectively harness the power of human testers (rather than just coded tests).
  • RUP:  This tends to emphasize pre-defined tests at the expense of exploratory testing. There is also a tendency to wait to start testing until later in the project than I'm a fan of.
  • No Defined Process: This one is catch as catch can for every phase of the process, not just test. It mostly doesn't scale beyond a very few people who are very closely aligned in their goals.
All those risks in all those processes are why I believe that pragmatism will trump dogmatic adherence to process every time. No hammer I know of can turn all problems into nails, and I'd rather solve problems than wield a hammer.

In the end, my company doesn't succeed because we follow any software development process. My company succeeds because we give our customers what they want.

*Disclaimer: I swear, we're not normally a dull bunch! And yes, we talk about other things, too, at least some of the time.
** I think about most things from a test perspective. I'm not a developer or a product manager or a sales person, and I wouldn't presume to understand fully how these various processes apply to them. I can make guesses, but I'm most knowledgeable in the area of software testing. Just FYI.

Monday, December 17, 2007

Livin' the XP Life

I've now been at my current employer for exactly two months. For those of you who are counting, that's:
  • 4 iterations
  • 8 XP Customer team meetings
  • 40 standups
  • 40 pairing sessions (we pair once a day)
Times like this are a good chance to go back and reflect on what it's been like, living the XP life.
Recruiting
XP is fabulous for recruiting. It's a very quick and easy way to express that this is a company built by engineers and for engineers. It's a company that embraces new techniques, and that's exciting for the kind of people I want to have working with me.
Pairing
This is fair to middling. When we don't pair, everything gets code reviewed before checkin. For complex things, the end result is usually pairing after the fact to tighten up the code. In the case of QA, we pair on things like test planning and test infrastructure coding. Pairing for us works best when we're creating acceptance tests for stories. We get better tests and better knowledge of the system. In the end it takes more time, but it's a good way to train new people.
Stories
Stories are one of the elements of XP that is not new to me; other methodologies use stories or similar concepts. The best part about stories is the forced thought and the fact that you wind up with a lot of documentation (ours are all kept on a Wiki). The worst part about stories is that there's little to no information about the system as a whole and how each story fits in. This has resulted in a lot of inconsistencies in the system.
Net Net
In the end, working in an XP shop is a mixed bag. But it's been a fun experiment, and I'm looking forward to helping it continue.

Friday, December 14, 2007

Managing From the Bottom Up

I did a phone screen yesterday, and the candidate asked me "What do you think of as your management style?" I gave my standard answer, but later on I sat down and really thought about it.

As your manager, my job is NOT to:
  • Tell you how to build something
  • Define the structure of your tests
  • Describe for you what you should be doing in any given moment
  • Mediate every technical dispute
All these things are about being the "parent" of the group. They're micromanagement at its best. For me, management is really more about getting all the obstacles out of your way.

As your manager, my job is to:
  • Get the resources our team needs, including the money to buy those resources
  • Point us all in the same direction so we're all marching toward the same goal
  • Hire really smart people who each think about problems a little differently
  • Help remove obstacles
  • Be a source of ideas and thoughts about what to do and how to do it
  • Provide the tools to resolve disputes on their merits
I finally decided that my job is a little different than I had thought. I don't actually manage my team. I manage my team's environment.

In the end, this helps me manage a much larger team effectively. I am not the decider (apologies to George Bush). I'm merely the one who makes sure that everyone can be the decider. I don't have to manage every detail; I just have to build a team that lets me know what details are important. 

I'm most effective when my team is telling me what to do, not the other way around.

Thursday, December 13, 2007

Personal: Snow Day

Today the first real storm of the winter hit Boston. Since I get positively giddy around snow, I'm taking a day off the blog and I'm going to QA a few snowballs!

This is what I see outside.

Enjoy, everyone. Drive safe and have fun!

Wednesday, December 12, 2007

Human Anti-Patterns

In software there are patterns - known good solutions to common design or code problems. There are also anti-patterns - things you really really shouldn't do because they may look great at the start but they lead down a bad path. More generally, anti-patterns are simply patterns to avoid. Patterns and anti-patterns are most commonly referred to as part of software design, but they also apply to QA. Common patterns are things like pairwise, etc.

I like to think about patterns and anti-patterns in software development process, too. These are human patterns and human anti-patterns.

Examples of human anti-patterns are:
  • Manually comparing large datasets. This looks fine, but humans are fallible and easily bored, and they will miss something doing this kind of tedious comparison by hand (see the sketch after this list for what a machine should be doing instead).
  • Allowing an infrastructure problem to persist. See my post from yesterday for an example. Today I went in and we traced the problems and fixed them; not having a reliable build is not something you want hanging around!
  • Testing only right before release. Not testing as you build seems okay, even good. Hey, it doesn't waste developer time testing partially implemented features. But it will extend your release cycle because it leaves a lot of changing code that all has to come together perfectly - not likely.
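Here's the sketch promised above - a minimal illustration (the file names are hypothetical) of handing that tedious dataset comparison to the machine:

require "set"

expected = File.readlines("expected_report.csv").map { |line| line.chomp }.to_set
actual   = File.readlines("actual_report.csv").map { |line| line.chomp }.to_set

(expected - actual).each { |row| puts "Missing row:    #{row}" }
(actual - expected).each { |row| puts "Unexpected row: #{row}" }
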
So when you think about processes, think about your patterns - the repeatable solutions to common problems that you want. Also think about your anti-patterns - repeatable things that you don't want.

Just like designs, humans have anti-patterns, too!

Tuesday, December 11, 2007

QA Is for Optimists

I love my job. And today was not a good day!

Today was one of those days when nothing goes right. I walked in this morning, asked a developer about a feature, and we quickly determined that the code was in the branch but not in the build. You know, the build I had been testing against for about 36 hours. That's 36 hours down the drain because I can't trust that build. Darn.

Then I turned to the automated tests that ran last night. The failure rate was roughly 6 times normal. Almost all of it was traceable to the same root cause, but we had to go through each test log to determine that. Two hours later, we actually were able to publish the results (this usually takes 20-30 min). Darn.

Then I went to my big system that needed upgrading. No problem. I had a newer build (that actually had the right code in it!), so I placed it on the machine and upgraded. It proceeded to fail. Miserably. For unknown reasons. Darn.

By the time 6pm rolled around, if it could have gone wrong, it probably had.

Now, normally I don't talk much about my day unless there's a thought or a lesson in there. So what's the moral of the story?

QA is for optimists.

At the end of the day, when I was retrying the upgrade for the fourth time, there was nothing left to do but laugh at everything that had gone wrong, and talk to the screen (since asking politely sometimes gets a program to do what you want!). And  that's what it takes to do QA. Your job is to find things wrong all day, to seek out problems so they can be fixed. Don't get into this business if you can't do that cheerfully.


* Side note: I walk to work, which is good. I shudder to think what my odds of a car accident would have been on a day like this one!

Monday, December 10, 2007

Just Pick One!

One of the things that characterizes effective workers is the ability to simply make a decision. In many cases it's unclear who should decide what to do about something. A really effective worker (manager, engineer, etc) will make a decision and go with it*. An ineffective worker will be unwilling or unable to decide.

A quick story:

On Friday I went to pick up a friend from his office. He wasn't done yet, so I waited for about an hour. Now, this is an open plan office, so while I was there I was listening to various project managers talking about a new document they were writing and the template for it. The entire topic of conversation for that hour was about the template for that document and who they needed to get to sign off on it (and they weren't done when I left!).

So I started calculating. The average salary for a project manager in a small company in Boston is $85K; at roughly 2,000 working hours a year, that's approximately $42.50 an hour. So that hour, for 5 project managers, cost the company $212.50, and nothing got accomplished. The moral of the story?

Just make a decision.


* Disclaimer: If the decision is not yours to make and you know who should make it, then get the right person to make it. This is really just decisions that don't have a clear owner.

Friday, December 7, 2007

It's Like Wearing Your Skinny Pants

Our company, like many companies doing XP, SCRUM, or some other type of agile development, uses two week iterations*. We develop from stories, and the stories are placed in a queue. There's only one problem:

Our stories don't always fit in an iteration.

This means we don't actually finish a story every iteration. Sometimes a story is half done. Note that this does not mean that the software is broken or unusable; that's never an acceptable state at the end of iteration. What it does mean is that sometimes there's code in there that simply isn't usable by an end user - it's code that will be part of some feature X, but feature X isn't done and therefore isn't enabled. Think incomplete, not broken.

In practice, this works out all right from a code stability standpoint. Our automated tests make sure we haven't broken anything that we've said is done. However, from a process and a product planning standpoint it's actually a big problem.

Let's say I've signed up for a four-week (two iteration) story. Well, that means I should finish 50% of it during the first iteration. The problem is, until I've finished the whole thing, I don't really know what 50% is. Of the multi-iteration stories we've done since I've been at this job, all but one of them involved a last-minute scramble because more than one iteration's worth of work was left in the last planned iteration. Cue the old cliche:

"The first 90% of the code accounts for the first 90% of development time. The remaining 10% of the code accounts for the other 90% of the development time."**

So, my rule for QA is very simple:

All stories must fit in one iteration. 

If a story can't fit in an iteration, we trim it and split it and rework it so it does. Often this is hard to reconcile with the requirement that stories provide some user benefit, but we've always managed to make it happen.


* I'm not sure what's magic about two weeks. Iterations certainly could be just about any length, but two weeks is very popular.

** I don't know who said this, and Google turns up a lot of sources.


Thursday, December 6, 2007

Be Nice: Phrases to Heal All Wounds

Of course it's important for QA and developers to get along. Despite the bugs logged and the "unreproducible" resolutions, in the end, you all want the same thing: to ship a good product.

So what's the fastest way to make friends with a developer and remind him that we're all on the same side?

"I'm just here to prove you're perfect."

The developer will laugh, but the message will come through. You're not there to cast blame; you're there to provide information about the code and how good it really is.

Wednesday, December 5, 2007

In the Zone

I work for a storage company. We spend a lot of time writing data on to systems for testing. So I'm always in search of new tools.

Running along the spectrum of sophistication, we've done the following:
  • Hand-copy files. This one is really easy and gives you really static data. Just don't expect it to be hugely scalable. You also run out of files to copy very quickly. Oh, and don't copy that private employee information accidentally! If you really want to do this, just open up a bash terminal (or ksh or tcsh or whatever) and type: 
cp my_source_dir/* my_dest_dir/
  • Copy junk data. Enter the dd command (we're on Linux). The problem with this is that you wind up with a lot of files of the same size. It is better than copying files, because you can do it as much as you like, you get unique files, and you can name the files. You can't easily vary file size, however, and you can't measure performance of your reads and writes without a lot more scripting. It is still useful as a quick and dirty data generator. Just do the following (change the loop bound to get the desired number of files):
# writes ten 8K files of random junk data (/dev/urandom won't block waiting for entropy)
i=0
until [ $i -ge 10 ]; do
  dd if=/dev/urandom of=my_dest_dir/file$i bs=8K count=1
  i=$((i + 1))
done
  • Use IOZone. This tool is actually intended to measure performance of a disk or a filesystem (basically block and file read, write, etc). It will write files of varying sizes to disk, read them off, rewrite them, etc. It has options, also, to leave the data on disk, so you can use it to fill drives. Also, it will automatically calculate performance of each operation and output it in Excel-compatible (space-delimited) format. Try it out!  www.iozone.org
Good luck, and happy data creation!

Tuesday, December 4, 2007

Keep the Balls in the Air

I was looking at queues today. There are a lot of different areas that QA needs to touch. As a basic example, my team needs to keep on top of all these work queues:
  • team 1 bugs ready for verification
  • team 2 bugs ready for verification
  • team 3 bugs ready for verification
  • team 1 story queue
  • team 2 story queue
  • team 3 story queue
  • nightly test runs
  • weekly test runs
  • customer support escalation
  • QA story queue
  • story stub queue
So how do we keep all these balls in the air?

I've tried two ways:
  • Touch everything every day. Keep on top of your queues and make sure you work them all every day. The goal is to keep any one queue from getting out of control.
  • Pick a queue a day. Avoid context-switching, which is wasteful of your time. Pick one queue each day and get all the way through it, doing everything you possibly can. Sure, each queue will be longer, since you're not touching it as often, but you'll be more efficient about working that queue.
In all practicality, my team winds up touching every queue every day. We wind up needing to touch multiple queues anyway, either because we're blocked on something, or because someone has a question, or because there's an urgent need. So we go with it - pick the highest priority thing across all queues at any point in time.

How do you handle all your queues?

Monday, December 3, 2007

Customers Aren't Testers (Without Your Help)

Imagine if you turned on the TV and you no longer had channels. Instead, you just had lists and lists of programs you could watch. Some of them are available right now. Some of them aren't available yet. Some of them have really detailed descriptions.  Some of them just say "comedy, 30 min". Some of them happened in the past and probably won't come on again, but who can really say? It's just a huge flood of information.

This is roughly what happens trying to define tests in an XP environment. The focus of stories is very short-term and rather isolated to that story. Stories are also defined by the customer (or customer proxy) rather than a trained engineer or tester. Lastly, stories are very positive-outcome focused; they describe what will happen, not what shouldn't happen. What does this mean for the tester?

Your job, as the tester, is to help the customer write good acceptance criteria. You also need to make sure that the negative aspects of the story are accounted for - that all the things that could go wrong are handled appropriately. In short, you have to help the customer make sure that the story reflects what he wants, not just what he asked for.

I've come up with a simple framework for helping customers write acceptance tests. The goals of this framework are to:
  • Make testing accessible. If we bring in complexities, even just in terms like "boundary value analysis" or "threat modeling" or "equivalence class partitioning", we'll intimidate and ultimately lose our customer. Even if we're doing these things, we need to make them seem accessible.
  • Address the external first. Describe your acceptance tests as what the user will see or experience, then as external influencers on the system are affected. Avoid describing how internals of the system will react, because in XP this could change at any minute. Do keep in mind that external influencers might be other components of the same system. If I'm describing a change in my application, my database might be an external influencer.
  • Describe context. No story works in isolation, and unit tests should handle the story itself as it's being written. Acceptance tests need to not only confirm that the story does what it says, but also call out areas of integration and other modules with affected behavior. This is the context in which your tests run, the environment in which they exist. This could be as simple as a dropped network connection mid-test or as complex as a multi-part system failure with several different processes occurring in the background.
  • Be precise. Don't test "bad passwords". Test "blank password", "old password", "another user's password", "too long password". Often this alone will help better describe the feature and ultimately influence the implementation of the story. (There's a small sketch of what I mean right after this list.)
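To make "be precise" concrete, here's a minimal sketch (the attempt_login helper and the sample passwords are hypothetical, not from a real story) of precise, named cases that a customer can read and argue with:

BAD_PASSWORD_CASES = {
  "blank password"          => "",
  "old password"            => "my-previous-password",
  "another user's password" => "somebody-elses-password",
  "too long password"       => "x" * 1024,
}

BAD_PASSWORD_CASES.each do |name, password|
  result = attempt_login("alice", password)  # attempt_login is an assumed test-harness helper
  raise "Login should be rejected for #{name}" unless result == :rejected
end
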
Now that we've figured out what we want to do and that we don't want to intimidate our customers, what are some things we can do? What kind of very light framework can we build?

I find that the two hardest things for customers are (1) getting precise and (2) understanding context. So let's address those:
  • Getting precise. This is easiest when there is a user interface. Ask the user to draw the interface on a whiteboard, then walk through it. Ask lots of "what if I..." questions. Repeat this often enough and you wind up with the user asking those questions themselves.
  • Context. I actually use (real) buckets for this. I put big labels on each bucket describing typical ways that the system might get exercised. Then I put good test cases in each bucket. For example, I might have a bucket labeled "HA failover" for a high availability system. Then all we have to ask ourselves is "if an HA failover happens during this story, will it change anything?". If the story is "new icon", then it won't, and we put the bucket away. If the story is "I can upload a 100MB file", then it will, and we run the test cases in that bucket. I'll talk more about the buckets in another post.
Having pre-made test cases tends to help people riff, and walking through things as the user will helps people feel a story before it's done. All this, and you walk out not only with acceptance criteria, but with a stronger story. Oh, and you haven't scared off your customers with intimidating test jargon.

Bravo!

Sunday, December 2, 2007

Down With Selenium (Core)

I spent a chunk of this weekend trying out some test cases using Selenium. Selenium, for those who aren't familiar with it, is a tool for testing websites.

It was pretty much a disaster.

Let's talk a bit about the application I was testing. It's a browser-based management application for a storage archive. There are approximately 15 pages, with auto-refresh capabilities but no AJAX or other partial page refresh. There is a fair amount of JavaScript, particularly for input validation.

So, what happened?
  • Breaking out of iframes. Selenium depends on putting your app in an iframe and then running all the JavaScript commands through it. All the "_top" methods in our app completely broke the test.
  • Username & Password Problems on IE. The site I was testing displays a dialog asking for the username and password. Selenium's recommendation for getting around this is to put the username and password in the URL. This is disallowed in IE. Although there is a registry setting you can use to get around it, I generally believe that modifying the registry of your test client taints that client and may interfere with other results. So, I find the workaround unsatisfactory.
  • Non-parallelizable. Selenium uses the actual browser, which is great in that it shows true browser behavior. However, this means I can only run one test at once. To scale up (more tests), I have to scale out (more machines). I will have several hundred tests running in continuous integration when this is done, so parallelization is important.
  • Modification of my test server. If I want to run Selenium Core, I have to install things on my test server. This is a HUGE no-no in my world; the sanctity of the test system is inviolate. You don't get to deploy code onto a test server that didn't come from the build system (right down to the OS and drivers). Test tools are no exception. My other options are Selenium IDE (Firefox only) or Selenium Remote Control (runs through a proxy server, which may be a possibility).
My biggest problems with Selenium are really philosophical. The tool demands that I modify both my test client and the server on which my system is deployed in order to allow it to run. I prefer a tool that doesn't force me to compromise the integrity of my system under test.

I will be trying Selenium Remote Control and Canoo WebTest, as I think those are better fits for my testing philosophy.

I much prefer to test things as they will actually be deployed.*


* Yes, we really do this, right down to the same hardware and network configuration.

Thursday, November 29, 2007

Justifying Usability

One of the most frustrating things a QA engineer can do is get into an argument with the product manager over usability. There are several different times this could reasonably occur, and several things you can do about this:
  • When the matter really is preference.
  • When there is difference of expectation about user knowledge or skills.
  • When no one's asked the user yet.
Let's take each in turn:

When the matter is really preference.
You will ultimately lose this one. If it truly is preference, then by the time it gets to QA it represents a change. Why change it when it's not going to make it better? If your way really is better and you can explain why, then it really isn't a matter of preference.

When there is a difference of expectation.
Here you may be right or you may be wrong. Ideally you have a detailed user persona that you can consult. If you don't have these, ask your nearest user proxy. Typically, this is the support guy or the implementations guy. Let that person's input stand. Oh, and develop user personas.

When no one's asked the user yet.
This is when you could do usability testing ("ask the user") but you simply haven't yet. Ideally, you'd schedule some usability testing or convene a focus group to identify this. If you can't, fall back to asking your closest customer proxy.

Sure, it's easy to get annoyed when your usability bugs get closed as "our users wouldn't want it that way". So stop being annoyed and start finding a leg to stand on!

Wednesday, November 28, 2007

Formatting Addresses

I ran across a great resource for international (i.e., non-USA) address formats: http://bitboost.com/ref/international-address-formats.html.

It covers format, available postal codes, preferred line breaks, required and optional fields, and common things that are also found on envelopes (e.g., "do not fold").
Can your app handle all of these?

Tuesday, November 27, 2007

Are You a Wader Or a Diver?

I wrote a post yesterday about learning new languages, and how incredibly valuable it is for deepening your understanding of programming. So today I sat down to start learning Perl*. That brought up an interesting question:

Am I a Wader?

Or

Am I a Diver?

Waders are the kind of people who start to learn a language by getting a good grasp of the fundamentals of the language. They follow tutorials, read articles on the history and philosophy of the language involved, and generally work their way up from the basics.

Divers are the kind of people who jump right in and start working on the project that has caused them to start learning the language. They generally figure that they know what they're trying to accomplish, so they'll pick things up as they go along.

I'm definitely a Diver.

As a Diver, I have a hard time with tutorials and the like. Sure, I've tried them, and sure, they taught me things. But I walk away from a good tutorial figuring that I know a lot... and then I can't do much with it. It doesn't string together for me until I've applied it. Skipping directly to the application of the new language is a great way to identify how it functions. On the downside, I find that my grasp of the language fundamentals ends up lacking.

So all you Waders out there, be sure you know how to apply all your new-found knowledge, how to string it together in a program.

All you Divers out there, be sure you actually understand why things work, instead of just finding that they work. Be very careful of language areas and niceties that you haven't found yet.

Which are you.... a Wader... or a Diver?

* Why Perl? It's very widely used across the company I just joined, and I see no reason to launch a huge and probably futile effort to change that. Plus, it comes up a lot as a quick and easy test automation language, so it's good to know.

Monday, November 26, 2007

Building a Team of Polyglots

There are two classes of developers: those who know one language, and those who know more than one language.

But why? Why would you need to know more than one language when C++/Java/C#/PHP/Perl/Ruby can do it all?!

Except it can't. Not well.

There is a lot of value in learning a language and becoming an effective developer in that language. This is what college CS programs are for. There is even more value in learning a second language. Until you learn a second language, you don't know what is programming and what is your language.

All languages are different (yes, yes, some are more different than others), but the underlying development principles are the same regardless of language. The more languages you learn, the more you'll be able to determine what is a feature or constraint of your language and what is a feature or constraint of programming itself.

So yes, C++/Java/C#/PHP/Perl/Ruby may be perfectly fine for what you're doing. If you really want to understand your profession, though, turn away from that language you know and learn a second, and a third. In the end, it will help all the languages you know.

Disclaimer: Credit for the idea for this entry goes to the New York Times article on child polyglots (free login required). The parallels between a young child learning multiple languages and a young (well, relatively) developer learning multiple languages are quite apt.

Wednesday, November 21, 2007

The Very Core of Testing

Testing is a subject about which people can argue all day. "Real testing follows the same steps every time." "Real testing lets testers follow their noses." "Real testing requires you to know the expected result first." "Real testing can't be done on a developer system." And so on and so on ad infinitum.

Other than the highly amusing use of the emphatic real testing, these arguments are orthogonal to the problem they purport to solve. Arguing like this actually takes us away from the heart of what testing really is. In the end, testing is simple:
  • Create a state in the system you're testing.
  • Perform some action.
  • Identify the resulting state of the system.
All the other arguments are about how we perform those steps, or what we do with the information we've gathered.
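
As a concrete (if contrived) illustration, here's what those three steps look like as a bare test skeleton - the helpers are hypothetical placeholders, not real product code:

def test_rename_share
  share = create_share("docs")            # 1. create a state in the system you're testing
  rename_share(share, "documents")        # 2. perform some action
  renamed = lookup_share(share.id)        # 3. identify the resulting state of the system
  raise "rename did not stick" unless renamed.name == "documents"
end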

So let's stop arguing about whether something is testing, and start arguing about what we really can improve - how we're testing.

Tuesday, November 20, 2007

Attitudinal Bias

One of the interesting things about working in QA is how much you discover about attitudinal bias. People's perspectives are colored by their approach to problems and their approach to a given system. Put simply, if you ask different people the same question, you'll get a different answer from each of them.
  • If you ask marketing, you'll find out what it COULD do.
  • If you ask product management, you'll find out what it SHOULD do.
  • If you ask development, you'll find out HOW it does it.
  • If you ask support, you'll find out WHETHER users actually do it.
  • If you ask QA, you'll find out what it REALLY does.*
The really odd part is that they'll all answer the question by saying, "It does....". So be on the watch for attitudinal bias, and ask the person who will give you the slant on the question that you need.


* Yes, these are generalities and for every generality there is an instance of nonconformance. Call it the exception that proves the rule.

Monday, November 19, 2007

Test Commonalities: Localization

Welcome to part 4 of my Test Commonalities series. In this series we discuss test areas that come up over and over again across many projects. The goal is to create a good cheat sheet so we don't have to reinvent the wheel every single time. Today: localization.

Some people can go their whole career working on English-only applications. Most of us, however, will have to deal with localization. Note that there are degrees of localization, from simple translation to full localization.

So, what are the kinds of things we need to test?
  • Translation Completeness. Are there any words still in English? Be sure to check title bars, prompts, and error messages. Also check logs, if you will have sysadmins looking at the system.
  • Spacing. Other languages often take more space. For example, "Cancel" in English is "Annullieren" in German. Better check that your button is wide enough! Check anything with space constraints - buttons, menus, tabs, field titles, field lengths.
  • Layout. In particular, some cultures prefer a right-to-left layout. For example, your logout button is typically on the upper right in the US; in Dubai that logout button is usually on the upper left. This may go so far as to reverse the entire layout of the screen.*
  • Double-byte character sets. If you're ASCII-encoding everything, you will have a problem. Be sure to check double-byte character sets - Kanji and Arabic are good choices. One thing to look for here is data entry; if your users enter data it may appear to go in fine but be stored incorrectly, so always be sure to read it all the way back out (there's a sketch of what I mean right after this list).
  • Declared encoding. Be sure you haven't declared your encoding as us-en. Ideally you'll declare your encoding based on the user's browser header. Barring that, be as general as possible.
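Here's the round-trip check mentioned above as a minimal sketch - save_record and load_record are hypothetical stand-ins for going through your app's real write and read paths:

samples = ["日本語のテスト", "اختبار", "Dépôt & Größe"]

samples.each do |text|
  id     = save_record(:name => text)  # write through the app, not straight to the database
  stored = load_record(id).name        # read it all the way back out
  raise "Round-trip failed for #{text.inspect}" unless stored == text
end
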
Localization is a huge project for even a small app. So be sure to define how far it goes, and then allot yourself plenty of time for testing, because this is going to touch your app from UI to database.

* Watch out for what this does to your automated tests!

Friday, November 16, 2007

If You Don't See It, Is It Really Gone?

Today's post is all about one of the surprisingly difficult tests:

If you test that something is not there, and you don't find it, did your test succeed or is your test broken?

Let's say, for example, that I want to test that a field is not present in the GUI. I fire up my IDE of choice and language of choice and write something like this*:

def test_field_not_present
    get :detail, {:type => "profile", :id => "1"}, {:user => "1"}
    assert_select "input#go_away_field", :count => 0
end

What I'm doing here is getting the page, and then checking that there are zero inputs with the id "go_away_field" (the field that shouldn't be there). Simple enough.

Here's the problem: I don't know whether the field is really not there or if there's an error in my test. Maybe I had a typo in my test and the field is still there.

I haven't figured out how to solve this one. Any ideas?


* The example is in Ruby. Choose whatever language you like.

Thursday, November 15, 2007

How NOT to Answer an Interview Question

This actually happened today. Names have been changed to protect the perpetrators.

During an interview, the candidate was discussing how the test automation she had worked on performed user actions against a file store. The tests as described were quite extensive and were very sure to hit all possible combinations.

Interviewer: "How did you find that your tests mapped to what clients actually did? Did the tests find most of the issues?"
Candidate: "Well, duh!"

(It should be noted that the interview took a major turn for the worse with those two words.)

Interviewer: "So, your clients didn't report issues that you hadn't found with your tests?"
Candidate: "They did. But we couldn't reproduce them, so we know our tests covered everything."


Now confidence I've seen before, but that level of arrogance was unusual. And in the candidate's own words, that arrogance was certainly not justified! Not being able to reproduce issues doesn't show that your tests are good. On the contrary, it shows that you're missing something - maybe a bug, maybe not - and you haven't got a test environment that truly matches what the customer does yet.

We will not be hiring this candidate.

Wednesday, November 14, 2007

Oh No My Queue Has a Bug!

One of the things about working in a de facto SCRUM environment is how you handle defects.

Basically, at the start of an iteration, you have a force-ranked list of what you're going to work on. The team walks down the list, commits to some portion of it, and the iteration starts. The list of tasks can be features, bugs, overhead work (install computers, etc).

Now, let's add a little twist (just a little one; this kind of thing happens every day):

Someone found a bug.


Okay, so that feature that you thought you had nailed had a bug in it. Now what? There are a lot of ways to handle this. You could:

Put the bug in the product backlog and handle it just like any other task.
  • Pros: Doesn't break the process!
  • Cons: If you have an urgent bug, you're basically stuck until at least the end of that iteration.
  • Net: This is great for non-urgent items. But for emergencies it's really not feasible. If you're really seriously considering this you've either got extremely patient clients or you're being overly optimistic.
Add the bug to the iteration - at the top of the queue.
  • Pros: Bugs get fixed.
  • Cons: All those tasks you committed to? Those aren't going to happen.
  • Net: This is probably swinging too far in favor of bug fixing.* It also will have you doing things your customers want less than all those other backlog items they've asked for.
Allot some amount of time for bug fixing as a task in every iteration.
  • Pros: Allows for bugs to happen, either previously existing or new, without destroying the iteration.
  • Cons: If there are no bugs and you have a lazy team, then you get people idle. Also, the amount of the iteration you need to allot is uncertain until you've done this for a while and learn what your needs really are.
  • Net: No bugs; yeah, right.
So, my preferred method is to allot some amount of time for bug fixing as a task in every iteration.

What have you seen tried? Do you have an answer for this dilemma?


* Disclaimer: Yes, we QA types do get to notice when something goes too far toward bug fixing. It's great when bugs get fixed, but sometimes that's not the best thing to do.
** Disclaimer Part II: The title is a bit sensational, I admit.

Tuesday, November 13, 2007

That's Not My Job

One of the most frustrating phrases I hear come out of people's mouths is "that's not my job". 

I work in startups, and the concept of what is and isn't your job is very flexible. So when you hear "that's not my job", it usually translates to "boy, that really doesn't sound very fun" or "boy I don't think I can do that".

There are two types of things that tend to cause choruses of "that's not my job":
  • Boring, dull or inconvenient tasks. Tonight there was a scheduled power outage in the building. So we stayed to bring the servers back up. I'm not in IT, but you know what, it's my job to get machines up so we can get emails running through and tests started.
  • Tasks with a high risk of public failure. You see this with perfectionists and new, nervous managers a lot.* If they think that they will likely fail, they'll avoid the task. How? "That's not my job."
This type of negative assertion really gets under my skin because it's not helpful. Saying that it isn't my job doesn't help me figure out whose job it is. And now we have a task that needs to get done and we don't know who can or should or will do it. Golly, we haven't accomplished much!

Let's turn this negative assertion around. We now have a declaration about what my job is not. Great. What exactly is my job?

My job is to help the company get to its goals - revenue, profit, exit. Does it help us get to those goals? Then it's my job.


* Disclaimer: Not all new managers are like this. Promise!

Monday, November 12, 2007

Real Options vs Touch It Once

I was reading an article on Real Options today (article can be found here). In a nutshell, real options are like financial options (aka stock options aka chase the startup dream and hope they're not bathroom wallpaper!). Financial options have an expiration date, and the smart holder will avoid deciding whether to exercise or not until the expiration date; until they have to. Real Options are pretty much the same thing, only with non-financial decisions.

The point here is that you should avoid making a decision until the last possible second. Then, once you've made the decision, you should implement it as fast as you possibly can. Sound familiar? Your list of future decisions is your product backlog and your decision point is when an item pops off the top of the queue and into development. The trick of it is that you have to keep watching your future decisions so that you can tell when it's time to make a decision - just like you keep going over your product backlog and prioritizing it to reflect how important each item (or decision) is. It's a long article, but a good one.

Then I got to thinking.

I'm a huge fan of the school of thought that things should be touched as few times as possible. Every time you touch an item, it takes a certain amount of time. The more you touch it, the more overhead you're creating for yourself.

Take email, for example. My inbox is nearly empty. It's not that I don't get a lot of mail, because I do, but that I touch each email message once or twice at most. When it's time to handle my email I go through each message and do one of four things: (1) delete it; (2) file the information it contains into the wiki that is our engineering team's collective brain and then delete it; (3) respond to it and then delete it; or (4) mark it with a due date, put a note in my to do list, and file it into a "long responses" folder. Messages in the first three categories get handled just once. Messages in the last category get handled twice.

So, we've got one approach that says "when something comes in, touch it once" and an approach that says "keep it around in a pending state until you absolutely have to do something about it".  The former brings your decision point much earlier. The latter means you have to touch items many times.

I think the best approach is a hybrid of the two. Use the "touch it once" technique for interruptions and short items. If it's going to take less than an hour, just do it. The overhead simply isn't worth it. The same thing goes for interruptions or unusual events (these also tend to be higher priority or higher urgency, so it coincides nicely). If it's going to take more than an hour, go ahead and postpone it until you have to make a decision.

So go for it, use your Real Options and make more informed decisions. Just don't spend so much time reviewing your options that you don't get around to actually implementing something.

Friday, November 9, 2007

Producing an Impressive Pile of Paper

Writing and maintaining a huge document with highly detailed test cases is a huge pain. Often maintenance of the document gets in the way of actually testing the application!

On a related note, testing time is often what gets crunched. The question "Do you really need two weeks? We've got to ship in one week!" is very common. There is a strong need to open up and show management (and dev and other teams) exactly how much work there is to do. This is more true in testing than in dev, I've found, simply because coding is considered more of a black art than testing.

Sometimes you need to produce that impressive pile of paper to say "This! This is what's going to take us two weeks."

So, if you need to produce an impressive pile of paper and you don't want to spend a long time writing and maintaining test cases, what do you do? You produce paper that shows how you really test.

1. Test checklists. These are the poor man's test cases, and they're a whole lot faster to write.
2. Bug verification lists. Just export this from your defect tracking system.
3. Automated test definitions. Whenever you're doing an automated test, you should be documenting what you're trying to test right there in the code. Run Javadoc, or Sandcastle, or perl2html, or whatever the appropriate doc generation tool is.
4. Test session plan. Are you doing exploratory testing? Put out your test mission schedule. This coordinates nicely with test checklists since test checklists match up to test missions.

And presto! You have a stack of paper that is impressively thick.

Don't fear documentation, just don't write it all by hand!

Thursday, November 8, 2007

Exploratory Testing 101

Many, many people have written about exploratory testing, James Bach foremost among them. Companies then go off and get really excited and try to implement exploratory testing. And it fails miserably.

Why?

It's pretty simple, actually. Most of the time when exploratory testing fails, it's because it's actually ad hoc, undirected testing. Without focus, exploratory testing veers into "just clicking" territory. In some cases, teams are actually doing exploratory testing and they're getting shut down because they can't explain how what they're doing isn't "just clicking around the app".

So let's break it down. It's exploratory testing if you're doing several things:
  • Tests are broken into sessions of reasonably short duration (rarely more than 45 minutes)
  • Each test session has a mission
  • Test missions are specific and accomplishable during a single test session
  • Testers prepare for but do not plan test sessions before they begin. Note that "prepare for" means they understand the system at a level sufficient to be able to find issues. "Plan" means define the details of what will be tested and how in a given session.
  • Testers are able to "follow their noses" off the original mission
  • Testers work with the goal of learning about the system and its behavior

If you're missing any one of those criteria, you're not doing exploratory testing. You may be doing some other form of improvised testing (more on that later), but it's more likely you're just clicking around.

So stop just clicking and start testing!

Wednesday, November 7, 2007

Hiring Criteria

I'm hiring.

Now, the first thing to understand is that I'm in an extremely technical place. Your standard GUI tester is not going to cut it.

So, what am I looking for?
  • Good test instincts
  • Lack of fear
  • Ability to speak developer
  • Ability to speak business
  • Good tracking of systems from the highest level to the nitty-gritty details
  • Joy
These aren't exactly things you can measure. You can't throw it down like an algorithm and see how the candidate solves it. So what do you do?

You ask questions by proxy and you test what you can.

So every candidate who comes before me does the following:
  • Takes a test. Yes, a real test. Sit down in front of a program and show me the bugs you can find.
  • Answers logic questions. Tell me how you think. Show me how you can think at a high level and lower down. A common question will go something like this: "Tell me how gmail works. Walk me through the design and some of the possible pain points." You don't have to know, but show me that you know how to think about it.
  • Gives a technical description of the last system or application they worked on. I usually do this with a developer and allow the developer to ask questions about the system. The candidate should be able to have this kind of conversation easily.
  • Shows passion. The successful candidate can describe a favorite bug, a really cool test, an interesting problem.
  • Is honest about his or her coding skills. If you say you can code, you'll be asked to show it. If you say you can read code, you'll be asked to walk us through some.
This is a pretty special person. What do you get for being this kind of person?

You get to be on a team that cares deeply about pushing software test in directions no one ever has.

You get to work with developers who value your work and actively seek your help.

You get to play - to try new things and to solve new problems. There are very few ideas that aren't at least worth an experiment.

You get a really cool lab. Over 400 machines, all for testing.

Interested? Email me.

Tuesday, November 6, 2007

Processes and Dogma

I've been writing a fair amount lately about development process (especially Extreme Programming) and project management process (SCRUM, mostly), but there is one very important point underlying all of it.

Process is good. Dogma is bad.

Following a process is great. It gives you many many benefits, from predictability to a framework for thinking about problems to rules that prevent infighting. However, processes are subject to the real world, and each implementation needs to be flexible and tuned to your needs.

So, what is the purpose of having a process like XP or SCRUM?
  • Provide consistency. Having a process will help you do things the same way each time.
  • Take advantage of others' mistakes. Every process was refined over years of mistakes and corrections - and you get the benefit of that.
  • Ensure smooth flow throughout the development process. Having a process to provide rules and checks keeps developers from fighting with marketing, fighting amongst themselves, and so on. Ideally none of that would happen anyway, but when it does, having the process to fall back on is nice.
But... don't let process get in the way of what you're really there for - shipping product. Sometimes you have to set aside the process for an emergency. When you do, know why you're doing it, and correct for it afterward so you don't have to step outside your process again - but it will happen.

So have a process. Follow your process. But don't forget that process isn't the point. Shipping your software is the point.

Monday, November 5, 2007

Test Commonalities: Email

Welcome to part 4 of my Test Commonalities series. In this series we discuss the test areas that come up over and over again across many projects. The goal is to create a good cheat sheet for each so we don't have to reinvent the wheel every single time. Today: email.

Email comes up all over applications. It can appear as a login, as a place to send notifications, as a place to receive notifications, even as a unique identifier for users. In many cases, you'll want to ensure emails are valid. In other cases, you'll want to handle the responses to bounced emails and other trouble.

So, what are the kinds of things we need to test?

  • Email format. Is the email well-formed? Technically, this is covered by RFC 2822. However, many mail servers are stricter about what they will accept. Typically, you're looking at allowing some special characters in email: dash ( - ), dot ( . ), plus ( + ). You're also looking at making sure that there is an at symbol ( @ ) with characters on both sides of it. Don't forget to try special characters right next to the @ symbol, which will trip up many email validators. (There's a sketch of these checks after this list.)
  • Handling responses. Test how your system handles various responses. What happens if someone replies to an email you send out? How are bounced messages handled? Delivery delays? Often, you'll want the responses to go somewhere so you can get to them.
  • Unique identifiers. Often emails are used as unique identifiers for users.* Test that each email must be unique, and test across letter cases to make sure the comparison isn't case-sensitive. Also test with and without special characters in emails, since they may be stripped.
  • Comments in emails. It's not uncommon to see emails carrying a comment - for example, a display name in front of <user@example.com>. This is fine for display, but if you actually need to email users, you will need to find and remove standard comments. Be sure to test for the most common comment formats: all caps separated by dots, delimited by <>, and delimited by ( ).
  • Non-spam format. If your system posts emails anywhere they can be seen or retrieved, consider adding a simple anti-spam helper to them. Simple things like spelling out the at symbol ( @ ) as " at " will make your users happier without increasing the test burden hugely. More substantial changes are likely to make the testing burden too heavy unless the goal is to render the email unreadable.
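To make the format and uniqueness items concrete, here's a minimal sketch. is_valid_email and normalize_email are hypothetical stand-ins for whatever your application actually does; they're written just tightly enough for the example to run, and a real validator would be stricter.

    import re
    import unittest


    def is_valid_email(address: str) -> bool:
        """Stand-in validator -- far looser than RFC 2822, just enough to run the tests."""
        pattern = r"^[A-Za-z0-9][A-Za-z0-9.+-]*@[A-Za-z0-9][A-Za-z0-9.-]*\.[A-Za-z]{2,}$"
        return re.match(pattern, address) is not None


    def normalize_email(address: str) -> str:
        """Stand-in for whatever the system uses as its uniqueness key."""
        return address.strip().lower()


    class EmailTests(unittest.TestCase):
        def test_allows_common_special_characters(self):
            for address in ("first.last@example.com", "first+tag@example.com",
                            "first-last@example.com"):
                self.assertTrue(is_valid_email(address), address)

        def test_requires_characters_around_the_at_symbol(self):
            for address in ("@example.com", "user@", "user.example.com"):
                self.assertFalse(is_valid_email(address), address)

        def test_uniqueness_comparison_is_not_case_sensitive(self):
            self.assertEqual(normalize_email("User@Example.COM"),
                             normalize_email("user@example.com"))


    if __name__ == "__main__":
        unittest.main()

The same structure extends naturally to the bounce-handling and comment-stripping cases once you've decided what the system should do with them.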

This is a case where having a good data set is a great start (I've blogged about that before). To really test emails properly, though, you need to consider both the data and the behavior of the data - not just whether the email is formatted well, but what happens when you send, receive or display it.

* This is actually a really dumb idea, because invariably your users will want to consolidate across email addresses. Email does make for a good login; just be sure you have some other unique identifier to tie your users' multiple emails into a single account.

Friday, November 2, 2007

SCRUM and XP

Working in an XP environment, I've come to notice some of its hidden dependencies - in particular, how symbiotic XP can be with SCRUM.

First, a bit of background. XP is a development process that uses the idea of very rapid iterations. The point of XP, in a nutshell, is that code will change, so your development process should be designed to accommodate code change, large and small. See my earlier blog post for more.

SCRUM is a project management process. It tells you nothing about how to write code; it only attempts to describe how to set up an environment around the activity of development such that you can ship. I wrote a blog post about this before.

I've started to notice that the terminology and ideas behind XP show up in SCRUM as well. For example:
  • Iterations (XP) = Sprints (SCRUM)
  • Force-ranked development cards (XP) = Product backlog (SCRUM)
  • Velocity (XP) = Velocity (SCRUM)
  • Stories (XP) = Product backlog items (SCRUM)
But it goes even deeper than terms. There are a fair number of shared goals as well, goals that are fairly specific to these two processes:
  • Shippable product. Sure, every software development process has a shippable product as its goal. SCRUM and XP are more specific and call for a shippable product at the end of every iteration/sprint.
  • Development periodicity. In an ideal SCRUM and XP world, there is very little "crunch time". A given iteration/sprint is short enough and well-understood before it begins, so there should be few surprises and therefore little crunch time. It should be a steady pace throughout. In reality there are crunches, but they tend to be short - a few days instead of several weeks.
  • Small tasks. Because of the shortness of durations and the emphasis placed on estimation, tasks tend to be small - no more than a development week or so. There may be many tasks to accomplish a large feature, but each task is itself fairly short. This is what ensures that tasks can fit into an iteration/sprint.
  • Client emphasis. Both SCRUM and XP emphasize the power of the customer. Whether it's the customer himself (as is ideal in XP) or some proxy (usually product management, and more normal in the real world), the client is the source of all requirements, and requirements are specified in terms the customer can understand.
So, what does all this mean or imply?

The only real conclusion I can draw is that if you're in an XP environment, you should strongly consider SCRUM for your project management process. It will help bring the entire company into the same way of thinking, and lead to fewer process-based clashes.


* Disclaimer: I make no statements about which came first, or which is better. I'm merely noticing similarities.

Thursday, November 1, 2007

Verifying Automated Tests

Let's say you've reached near-nirvana. Not only do you have tests defined, you've defined which ones should be automated. Then you've gotten developer buy-in, and now every feature you receive comes with an implementation of the tests as you've defined them. There's just one more question....

How do you verify automated tests?

There are several things you have to know in order to know that an automated test is sufficient. You need to do the following:
  1. Define the tests before they're written. This way you know that the developer is working from a good spec. Don't expect the person writing the code to also identify the tests. Having more than one person involved will give you better tests because you won't be at the mercy of a single person's assumptions.
  2. Verify that the tests run. The first step once you receive the tests is to prove that they all run and that they pass (there's a sketch of this after the list). This gives you a baseline to know that the tests should pass; it's a simple sanity check. If the tests don't all pass, go back to the developer. There's an error in the code or an error in the assumptions behind the tests.
  3. Verify that all defined tests are implemented. This is, in the end, a code review. Every test defined should be implemented completely. Sometimes certain tests or assertions are difficult or time-consuming, but they shouldn't be ignored. Check the setup, the logic, and the assertions in each test. Also, check that the test data is complete and exercises the code.
  4. Test the feature manually. As a sanity check for the tester, use the feature manually, as your users will. This will help catch anything that the pre-defined tests missed. After all, the automated tests are only as good as your original specification. So test that original specification by using the feature. Often you'll find you missed one or more tests. Occasionally you'll find that you have duplicate or extraneous tests.
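For step 2, here's a minimal sketch of the baseline check, assuming the delivered tests are a pytest-style suite living under tests/; substitute your own runner and path.

    import subprocess
    import sys


    def suite_runs_clean(test_dir: str = "tests") -> bool:
        """Run the delivered suite once; pytest exits 0 only when every test passes."""
        result = subprocess.run([sys.executable, "-m", "pytest", test_dir])
        return result.returncode == 0


    if __name__ == "__main__":
        if not suite_runs_clean():
            sys.exit("Baseline failed: send the tests back before bothering with the review.")

The point isn't the script; it's that nobody starts the code review in step 3 until the whole suite has been seen to run and pass at least once.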
That you have automated tests to verify is great. Just make sure that you don't trust the green bar* until you know you can trust the tests it's running.

* Disclaimer: Not all test frameworks have a green bar for passing tests, but it makes a good metaphor. No complaining; I fully respect your alternate green bar-like passing indicators.

Wednesday, October 31, 2007

Which Came First?

My current job uses an XP (aka Extreme Programming) development process. Now, lots of companies say they do, but these guys really do, right down to stories, a force-ranked queue, pairing, and writing tests first.

There are a lot of tenets of XP that affect test, but the two that I would like to consider today are:
  • Test first.
  • Automated tests are good (yes, this is a simplification, but parsing this better is a job for a later post)
This is all well and good, but what should you really be testing first? There are three ways to approach testing a feature:
  1. write the automated tests, then test manually for usability and context
  2. test manually, then write automated tests for regressions
  3. split automation from manual testing and try to accomplish them roughly in parallel
Extreme Programming lends itself very well to approach #1. You write the test automation as (or even before) you write the code. Then, when it's basically done, you take a look at it as a user would see it - through the GUI (however that's expressed).

More traditional software development methodologies put test automation on QA's shoulders, and approaches 2 and 3 are very common. You look at a feature first as the user would see it, then you automate.

So, now that I find myself in an Extreme Programming shop, how do I change the more traditional approach? Or how do I change the XP approach to provide some of the benefits of the old way?
  1. Do not eliminate manual testing. Yes, I know this violates XP methodologies, but if you don't see what your user sees, then you will make a product that is very hard to use. If you can figure out how to automate the user experience once you're happy with it, great, but do have the courtesy to step into your user's shoes at some point.
  2. Do allow automation to reduce your manual testing time. Automation can't cover everything, but it can eliminate many classes of tests. No more manual boundary value testing of that form - just automate it and move on (there's a sketch of this after the list). Save the manual tests for the things machines can't do.
  3. 100% defect automation is good. If you can describe it well enough to be a defect, you can automate the test for it. This saves a lot of time in regression testing in particular. I wrote a whole blog post on this earlier.
  4. Automate the boring stuff. All those tests that have to be done for every release or on every build and that are the same over and over are good candidates for automation. Humans make mistakes and take shortcuts because they believe they know what it will do, so make the computer do it instead. This also frees up your humans for more advanced, more interesting, more complete testing - and keeps your team engaged.
  5. Test as the user. Figure out what tests of yours users really care about and hit them manually, at least sometimes.
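Here's a minimal sketch of the boundary-value automation mentioned in item 2. field_accepts and the 1-to-64 character limit are hypothetical stand-ins for your real form field; in practice this function would drive the form itself.

    import unittest

    MIN_LEN, MAX_LEN = 1, 64


    def field_accepts(value: str) -> bool:
        """Stand-in for submitting the form field and reading back the result."""
        return MIN_LEN <= len(value) <= MAX_LEN


    class UsernameBoundaryTests(unittest.TestCase):
        def test_boundaries(self):
            cases = {
                "": False,                   # below the minimum
                "a": True,                   # at the minimum
                "a" * MAX_LEN: True,         # at the maximum
                "a" * (MAX_LEN + 1): False,  # just over the maximum
            }
            for value, expected in cases.items():
                self.assertEqual(field_accepts(value), expected, repr(value))


    if __name__ == "__main__":
        unittest.main()

Write it once, run it on every build, and spend the human time on the tests that actually need a human.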
Put these all together and you've accomplished two pretty important things. First, you still put your user at the center of your testing. Second, you build a base of test automation so you can test more and test better.

Tuesday, October 30, 2007

It WILL Get Done

Commitment is very important in QA. Generally by the time QA is in the spotlight, it's close to a release and tensions are starting to get just a little bit high. Part of QA's job at this point is to maintain and encourage calm.

So what's the best way to do this?

Be very very careful of your commitments.

If you say you're going to do something, get it done. If you have to stay late to finish, stay late. In our profession, more than many others, slippage is a problem. It's easy to say that it's getting late, and you have to get home, and tomorrow's good enough. But there are consequences. If you said you would finish something today, and it doesn't happen, then there is often a house of cards that will fall around you.

Just today I was at work late installing a demo system. Why? I had said I would get it done today. So what was the difference between 10pm one night and 2pm the next day?
  • 6 hours less of demo practice time for sales
  • The demo data would finish loading 6 hours later - too late to ship out in time for the demo.
  • QA would seem less dependable.
The last item is by far the most important part. As QA, your job is to eliminate surprise. You should be the most dependable part of the entire software development process. The goal is simple:

If you say you're going to do something, everyone around you needs to know that it will get done.

Monday, October 29, 2007

Test Commonalities: File Attributes

Welcome to part three of my Test Commonalities series! In this series we talk about common items that come up over and over across projects.

A file is generally treated by a program - and by most users - as a unitary item. That is, a file is a thing, one single thing. In general, though, that's simply not true. A file is actually a thing (the file itself) and a bunch of meta-information. Depending on the context, these are given different names, but in general they are attributes of the file.

So, when we test file system attributes, what do we need to consider?

Well, first we need to define the scope of the concern. In some cases, the system will be isolated on a file system. For example, a simple desktop application may never access a remote file system. In other cases the system will involve a remote file system. Examples of this include sync applications and applications allowing or utilizing a network drive.

If we are isolated on a single system, we must consider the following:
  • Performing legal operations. For example, read a hidden file, or write to an archivable file.
  • Attempting illegal operations. For example, write to a read-only file.
  • Displaying things to the user that match what the operating system displays. For example, show a read-only file to the user.
  • Displaying things to the user that don't match what the operating system displays. For example, show a hidden file to the user.
  • Changing an attribute of the file
  • Preserving file attributes across modifications of the file. For example, writing to a hidden file should leave it hidden.
  • Avoiding unintentional modifications to attributes. For example, reading a file should not update the modified time (there's a sketch of this after the list).
  • Inherited attributes. For example, if a directory is read-only, a file inside that directory can be expected to be read-only.
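Here's a minimal sketch of two of those single-system checks, using only portable standard-library calls on a scratch file; the hidden, system, and archive attributes need OS-specific APIs that I'm leaving out, and the read-only check may not fail if you run it with administrator or root privileges.

    import os
    import stat
    import tempfile
    import unittest


    class FileAttributeTests(unittest.TestCase):
        def setUp(self):
            handle, self.path = tempfile.mkstemp()
            os.write(handle, b"some contents")
            os.close(handle)

        def tearDown(self):
            os.chmod(self.path, stat.S_IREAD | stat.S_IWRITE)  # so the scratch file can be deleted
            os.remove(self.path)

        def test_reading_does_not_update_modified_time(self):
            before = os.stat(self.path).st_mtime
            with open(self.path, "rb") as f:
                f.read()
            self.assertEqual(os.stat(self.path).st_mtime, before)

        def test_writing_to_a_read_only_file_is_rejected(self):
            os.chmod(self.path, stat.S_IREAD)  # mark the file read-only
            with self.assertRaises(OSError):   # expect the illegal operation to fail
                open(self.path, "wb")


    if __name__ == "__main__":
        unittest.main()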
When there are multiple systems to consider, we must test everything above and some additional items:
  • Unavailable attributes. Some systems may not support attributes that other systems do. For example, a FAT-formatted drive doesn't support NTFS-only attributes like compression or encryption.
  • Additional attributes. This is the inverse of unavailable attributes. For example, Windows Vista offers extended attributes not available to older versions of the operating system.
  • Different attribute names. In particular, when crossing operating systems (e.g., Mac to Windows), some of the same attributes may have different names.
  • Entirely different attributes. In particular when crossing operating systems. This is a special instance of unavailable attributes and different attributes.
  • Transferring attributes. When transferring a file, the attributes for that file must also be transferred. This is often a separate action from writing the file itself.
In short, there's a lot more to a file than just the file itself, and it all needs testing. Maybe your program deals with attributes already, maybe it doesn't, but no matter how isolated your program is, you have to assume that files and file attributes can change underneath it, so test away!

Friday, October 26, 2007

Self-Governing Groups

Mishkin Berteig published a blog entry yesterday about truthfulness in Agile (blog entry here: http://www.agileadvice.com/archives/2007/10/truthfulness_in.html). His point, put briefly, is that "truth" is an essential part of agile. He defines "truth" as visibility and honesty.

The post got me to thinking about visibility and honesty in software groups. The actual issue isn't so much visibility. The issue is that agile assumes a self-governing group. The only way a self-governing group works is if you have self-governing people. However, that's not the only thing needed to make a real team work effectively.

So, how do these self-governing groups (like agile teams) work effectively?

First, the group must share risk. Either the group succeeds or the group fails, and the entire group has to believe that. If this doesn't happen, you foster competition among the group, and that will destroy the group's effectiveness. After all, if you're competing with me, why would you stay an extra hour to help me finish a task? But if the group will fail with the task undone, then you'll stay the extra hour to help finish. 

Second, the group must have mutual respect. If you think you're better than me, you'll behave accordingly. And I'll react accordingly. All of a sudden, I'm doing the easy tasks and doing them less well than you would - all because you seem better than I am. But if we respect each other, then we'll each push the other to our limits, and that's good for the whole group.

Lastly, the group must have the ability to discipline or expel its own members. This provides the group with sustainability. A group that cannot solve its own problems, even with its own composition, is a group that is governed by some external force (thus defeating the self-governing portion of our group definition).

So, if you want a self-governing group, get good people. Get people who are motivated and able to govern themselves. Then put them together and stand back.

Magic will happen.

Thursday, October 25, 2007

Test Commonalities: Paths and Filenames

This is part two (and the first substantial post) in my Test Commonalities series. In this series, we discuss the test areas that come up over and over again across many projects. The goal is to create a good cheat sheet for each so we don't have to reinvent the wheel every single time. Today: paths and filenames.

Handling paths and filenames is a huge part of testing any time you have interaction with the file system. Every time you have an application that writes a file, or creates a directory, or reads a file, you have a system that should be tested for paths and file names.

So, what is this testing? Basically, when you read or write a file, you subject your system to the rules and conventions of the file system on which you are running. You are interacting with a third party just as surely as when you interface with another system.

So for each type of operating system, you need to understand the rules of the file system.

On Windows:
  • Paths, including file name and extension, must be shorter than 260 characters (MAX_PATH)*
  • File extensions are optional
  • File system is case insensitive
  • Certain paths change per user (e.g., %USERPROFILE% is usually C:\Users\catherine)
  • "Reserved paths" are usually addressed by environment variable (e.g., a user's home directory, the default program storage location, the temp directory location)
  • "Reserved paths" are not in the same actual location across various OS versions
  • Certain directories require special privileges to write to them
  • Hidden and system files are denoted by file system attributes
  • Paths begin with a drive letter
  • Certain characters (e.g., : or / ) are illegal in path and file names
  • File extensions may be of any length, subject to the overall path limit
On UNIX/Linux/Mac**:
  • File system is case sensitive (except on Macs, where the default file system is case-insensitive but case-preserving)
  • Files beginning with a . are generally hidden from the user
  • Hidden and system files are denoted by the file name and location
  • Certain directories require special privileges to write to them
  • Certain paths change per user (e.g., ~/ takes you to a user's home directory)
  • Paths begin with a /
  • Directory delimiters in a path must be /
  • Certain characters (e.g., /) are illegal in path and file names.
You wouldn't skip testing your system's interface with a billing system, or with an HL7 feed. Don't skip testing your system's interface with the file system, either.
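As a sketch of how a few of the Windows rules above turn into tests: is_legal_windows_path here is a hypothetical stand-in for whatever validation your application actually does, included only so the example runs.

    import unittest

    MAX_PATH = 260
    ILLEGAL_NAME_CHARS = set('<>:"|?*')  # in the name portion; the drive-letter colon is handled separately


    def is_legal_windows_path(path: str) -> bool:
        """Stand-in for the application's own path validation."""
        if len(path) >= MAX_PATH:
            return False
        drive, sep, rest = path.partition(":\\")
        if not sep or len(drive) != 1 or not drive.isalpha():
            return False
        return not any(ch in ILLEGAL_NAME_CHARS for ch in rest)


    class WindowsPathTests(unittest.TestCase):
        def test_accepts_an_ordinary_path(self):
            self.assertTrue(is_legal_windows_path(r"C:\Users\catherine\notes.txt"))

        def test_rejects_illegal_characters_in_the_name(self):
            self.assertFalse(is_legal_windows_path(r"C:\Users\catherine\what?.txt"))

        def test_rejects_paths_at_or_over_the_length_limit(self):
            self.assertFalse(is_legal_windows_path("C:\\" + "a" * MAX_PATH))


    if __name__ == "__main__":
        unittest.main()

The UNIX side gets the mirror-image list: case sensitivity, leading-dot hidden files, and / as both the root and the only delimiter.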

*Disclaimer: Yes, there are ways around this, but they're generally ill-supported and will get you in trouble.
** Disclaimer part 2: Yes, I know there are a lot of different file systems available for UNIX and UNIX-like operating systems and they don't all match. This post is intended to cover mostly client systems where this kind of interaction is likely. Do your own research for your own wacky file system if you want to use one.