Friday, January 29, 2010

Irrelevant Imperfections

A developer at work today gave a brown bag lunch talk on a virtualized test environment he's been working on. Part of his introduction included this:

"This takes the imperfect parts of our simulation of production and attempts to move them into memory size, disk space, and other hopefully less interesting areas."

He's talking about irrelevant imperfections, which is a key concept in testing.

Assumption: No matter how hard you try, you will not completely mimic the production environment. Something will be different, whether it's number of users, amount of data in your database, machine configuration, use of self-signed vs purchased certs, etc.

The further goal of successful testing is to make the differences between your test system and your production system irrelevant. It's the tester's job to make this as true as possible.

Analysis of your system can help determine which imperfections are irrelevant:
  • For many systems, the hardware resources won't limit most functional testing. If this is a standard webapp, for example, a machine with 2GB RAM is probably about as good as a machine with 4GB RAM.
  • In general, the patch level of the underlying OS won't matter unless your product is particularly tied to the OS. Sure, Windows 2003 vs Windows 2008 Server will be different, but Windows 2008 Server patched vs unpatched probably wouldn't be relevant.
  • The particular network configuration is probably safe to change in many cases. If you have a 192.168.x network internally and a 215.16.x network in production, that's hopefully irrelevant to any product issues.
All these are imperfections, but for many systems they are irrelevant. Other imperfections are more relevant (e.g., separating the database server to a separate box might matter greatly to your application, or testing with 1 client when production usually has 10 clients simultaneously connected).

Accepting that you won't completely mimic production, the onus is on you as a tester to figure out what is different and make sure that the list of differences is as irrelevant as possible.

Wednesday, January 27, 2010

Expect to Find

Creating test cases is really easy. I can do that all day long, and wind up with dozens or hundreds of 'em if I'm on a roll. For example, I'm currently thinking about how to test a process that should run in a fairly isolated way on a node in our system. There are other background processes on the system, and then there are foreground processes that provide the core user functions while this goes on. So I can write test cases of the form "Run the test process while background process Y occurs" and come up with a lot of them.

But is that really useful? In a truly black-box system where I don't understand the potential interactions between processes, I might have to run all of these. I'm not in a truly black-box environment, though; I understand the architecture of the system. Knowing the architecture of the system lets me intelligently pare down (or reorder) my test cases to the ones that are more likely to show any problems. I can find potential issues much more quickly than I could blindly poking at system functions.

Given a potential test case, to see if it's useful to run, consider the following:

What is an example of an issue that this test might find?

Simple, really. Just like I can imagine a lot of test cases, I can imagine a lot of bugs. Let's put that to use. The purpose of running this test is to gather information, often in the form of unexpected behavior. Speculating as to what information this test case might provide or what issues it might find will help me figure out if the test should be run early, late, or not at all.

For example, in my system when I have this test process running, I know that a process on another machine in the same system is unlikely to affect this test process, which runs completely on one box with no external dependencies. I should confirm or refute that assumption, but that can be done with a few test cases; I don't need 20 test cases to prove the same thing.

So as I'm thinking of tests to run, I ask myself what issues each test might show. That way I know whether I should run the test, or whether it just duplicates something I've already done, or whether it's just poorly designed. Speculate about what you might find; it'll save some time in the end.

Tuesday, January 26, 2010

Reverse Your Tests

We have a set of tests we run every night. They generally run in the same order:
  • Test A
  • Test B
  • Test C
Often this works just fine. However, sometimes bad things happen. Test B might hang, and Test C just doesn't get run at all. Maybe there's a power failure halfway through Test A, so that night you don't get Test B or Test C.

Over time, you accumulate more runs of Test A than of Test B, and more of Test B than of Test C. You're now exercising some code more than the rest. Any defects that are shown by Test C are likely to remain hidden for longer than defects shown by Test A, simply because you have fewer opportunities to discover them. In the short run this isn't a big deal, but in the long run, your testing starts to get lopsided.

If you're not getting through all your tests for one reason or another, try one simple thing to smooth out your coverage:

Reverse your tests.

For a while, run your tests in this order:
  • Test C
  • Test B
  • Test A
It's hardly perfect, but with most test systems this is pretty simple. In our case, it was a matter of adding a call to reverse(@tests).
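For example, in a simple Ruby harness the change could be as small as this (a sketch; the test names and the printout are stand-ins for whatever your nightly system actually does):
# A minimal sketch of a nightly runner; names are illustrative only.
tests = ['test_a', 'test_b', 'test_c']

# Flip this (or drive it from an environment variable) when the run counts
# start looking lopsided, then flip it back later.
tests = tests.reverse if ENV['REVERSE_TESTS']

tests.each do |name|
  # Stand-in for whatever actually invokes one test in your harness.
  puts "running #{name}"
end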

Running your tests backwards actually provides several benefits:
  • It runs your least-run tests first for a while and evens out the frequency of coverage a bit. You probably won't run Test A quite as much, but you'll have more runs on Test C and you'll flush out problems there. When the pendulum has swung far enough, change order again.
  • It tells you whether your tests are really as isolated as you thought. If there are dependencies between tests, then you'll start seeing breakage and failures due to those (now missing) dependencies.
  • It's change. Maybe this one is just me, but even little changes like this make everything I'm doing just a bit more interesting.
Watch your tests, and watch how often you really run them. If it starts looking lopsided, it's time to change things up. It's time to reverse your tests.

Monday, January 25, 2010

Ambiguous Specs

Sometimes I get a spec that is ambiguous. For example, the following is part of a spec I saw recently:

Users have widgets, and each widget must get a license from the central license server. To get a license, each widget must make an activate call to the central license server. ... When the widget license expires, the admin will extend the license, and the widget must then make a reactivate call to the central license server. ... If the user has licenses available for multiple widget types and tries to activate a widget, then the widget is marked ambiguous. The admin should assign the widget a specific type, then the user can reactivate that widget.

Forget for the moment that the work flow is a little odd. The word "reactivate" here is ambiguous. "Reactivate" can mean "use the reactivate call" or it could mean "call activate again".

So how do we disambiguate our requirement so we can confirm it works? We have options:
  • If we have access to the customer (or whoever wrote the spec), we can ask.
  • We can assume either works.
  • We can look at what was implemented.
Obviously, asking the customer is ideal. It's not always possible, however. Sometimes, even when you do get to the customer, the answer you get is, "I don't know" or "It really doesn't matter. Just pick one."

Assuming both cases are valid may happen to work. In the case of a public API that's trying to be really flexible, it may also be the customer's intent. I've certainly had customers that had the attitude that "if someone might assume it works like that, we have to support it." (This is very hard in practice, even though it sure sounds good!)

Or we can look for other sources of information, in this case, what was implemented. Looking at the actual code is a bit of a slippery slope, because it's dangerous to make the assumption that the developer both understood what the customer wanted and then implemented it successfully (or at least successfully enough you can glean the customer's desires from it). However, it's a good source of starter information. For small things, where neither behavior is truly wrong, often the customer doesn't care. Consistency is what's most important. And as long as the behavior the developer picked is consistent with the rest of the user experience, often the customer will be perfectly okay with that.

When faced with an ambiguous spec, remember that you have a source of information right there in the code you're facing. It's not always the correct answer, but then again, it's awfully hard to find a source of information that's right all the time. So take as a starting point what the code does, and validate it the way you would test any requirement. Evaluate it for consistency, efficacy, and reasonableness. If it's good, then it's likely that the code really is telling you what the application ought to do.

(Note that there still may be bugs. We're looking to resolve ambiguities here, not find total perfection just yet!)

Friday, January 22, 2010

Where Is the Line?

Take the following situation.

The product under test is a software library. We have also written a wrapper program that uses the library and provides some helper functions (such as the ability to run it from the command line, and some lightweight reporting). Lastly, we have some automated tests we have written that exercise the library directly as well as through the wrapper program.

We find a bug in the wrapper program. Here's the question of the day:

Should QA just fix it, or not?

Let's posit the following:
  • having the same person write and test code is generally to be avoided; two minds and viewpoints will find more of the problems than one alone
  • QA has the coding and technical skills to fix the problem
  • QA has source code access and can actually check in a fix (this is a useful part of fixing something!)
  • having a tester work in the code "taints" that code for that tester and means someone else needs to start testing it
There are some arguments in favor of fixing it:
  • It reduces turnaround time and gets rid of the problem quickly.
  • The bug isn't strictly in the product, so we're not "tainting" the tester for that product. That is, the tester isn't now testing their own code.
  • In some sense, the wrapper script is test code, and that's completely okay for QA to fix (heck, we work with it constantly!).
There are some arguments against fixing it as well, though:
  • We have a wrapper script distinct from test code for a reason. In our case, the wrapper script is intended to be sample code as well, and that makes it product code.
  • We found a bug. Cool. Now our job is to report it and move on with more testing (that thing we do best).
  • Since we believe that two pairs of eyes should be on each piece of code, we'll need to bring in another tester for this piece of code.
I'm really on the fence with this one. In general, I try to take the middle road and get another set of eyes on the fix. This can be done by pairing with a developer to fix it, or by submitting a patch for review and inclusion. Whatever method you choose, before you go do that quick bug fix, be cognizant of what you're doing.

Thursday, January 21, 2010

Unexpected, Not Wrong

Yesterday I wrote about some behavior in Ruby that I didn't expect. Basically, I was attempting to use "downcase!" (rather than "downcase") and getting nil if nothing changed. A commenter noted that was intentional and kindly linked to the spec.

Here's how I got into the situation:

I was trying to get an attribute off an object, and that attribute took the form "num_[Type]". I already had the type handy in a variable, but I couldn't guarantee it was the same case as the attribute. So I thought to myself, "Self, this is simple. I'll just downcase it since I know the attribute is always all lower case." Then I said to myself, "Self, I'm not going to deal with assigning variables here; I'll just modify that type in place." So I added a "!" to the "type.downcase" I had already written.

It turns out I was wrong. When it failed and I looked at the rdoc for that method, I found out what I was doing wrong (and removed the "!"). It wasn't a big deal. But it did make me scratch my head.
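For what it's worth, the pattern I ended up with is roughly this (a sketch, not my actual code):
type = "foo"

# downcase! mutates the receiver and returns nil when nothing changed,
# so its return value is unsafe to interpolate directly.
attribute = "num_#{type.downcase}"   # downcase always returns the lowercased string

# If you really want in-place mutation, mutate first, then use the variable:
type.downcase!
attribute = "num_#{type}"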

So... is it a bug?

It was unexpected. However, it was also doing exactly what the implementer intended. Intent isn't everything, but it does mean this wasn't a simple mistake (a.k.a. a bug). There are two questions to ask:
  • Is that intent correct?
  • Is that intent expressed to the consumer of the product?
The first question is fairly obvious. Okay, the implementer meant to do that, but is it really the right thing to do? Is it consistent with the remainder of the product? If similar workflows and similar items exhibit similar behavior, then it's probably expected behavior, and that's generally good. Does it meet the consumer's need it was intended to address? This is the harder part of correctness. The product's consumer is going to do something with this particular feature, and the product needs to allow the consumer to accomplish that need or action. This is generally where your customer or your product management needs to weigh in.

The second question is a bit more obscure. Assuming you've crossed the first two hurdles - intent and correctness of intent - there is a third hurdle of actually letting the product consumer understand this intent. In short, we have to document the behavior. That can be through documentation, code samples, comments, etc. It can also be through simple consistency, conforming to customer's expectations by being similar to something familiar.

No matter what product you're using, you'll occasionally run across behavior that makes you scratch your head. Figuring out if that behavior is a bug revolves around figuring out what was intended, whether that was right, and if the intended behavior is conveyed to the consumer. When you know all that, then you'll know whether you have a bug, or just a scratched head.

Wednesday, January 20, 2010

Ruby Downcase Bang

I have some Ruby code that does the following:

type = "Foo"
limit = "num_allowed_#{type.downcase!}"

It's fairly straightforward code to tell me how many allowed widgets a given customer has, or frobbles ("limit" is the attribute that is the max allowed widgets or frobbles or whatever). There's some surprising behavior, though.

If the string I'm attempting to downcase is already entirely lowercase, then it becomes nil.

For example:
penitentes:src cpowell$ irb
>> type = "Foo"
=> "Foo"
>> quantity = "num_#{type.downcase!}"
=> "num_foo"
>> type = "foo"
=> "foo"
>> quantity = "num_#{type.downcase!}"
=> "num_"
>>

That's not what I expected. If you don't use the bang (just use "downcase"), then it works as expected. For example:
penitentes:src cpowell$ irb
>> type = "Foo"
=> "Foo"
>> quantity = "num_#{type.downcase}"
=> "num_foo"
>> type = "foo"
=> "foo"
>> quantity = "num_#{type.downcase}"
=> "num_foo"
>>

I'm not sure it's a bug, strictly, but it's certainly odd.


P.S. I've been coding a lot lately, so the posts are getting a bit technical. The pendulum will swing the other way at some point.


Tuesday, January 19, 2010

Cucumber, Webrat and XML

I've written before about using cucumber and soap4r to test SOAP web services. That still works, but it's annoying when you're trying to test a SOAP web service you're writing. To test your own web service with soap4r, you have to pretend it's remote and run it outside your standard test environment. So to run tests that way, I had to:
  • start my server
  • put the right data into the database my server is pointing at
  • reconcile the data in the server database with my test data
  • run the tests
  • re-reconcile the data in the server database to clean it up
I wound up spending a lot of time on maintenance and tracking down issues that really didn't have anything to do with the software. Plus, the setup requirements certainly limited the incentive to run the tests. A typical run would look like this:
  • run tests. (FAIL)
  • Whoops. Forgot. Start the server.
  • run tests. (FAIL)
  • Whoops. I didn't clean up after the other manual testing I was doing on that server, so my database has unexpected data in it. Clean out the database.
  • run tests (actually find something useful or interesting)
  • etc...
Step back a moment, and let's look at what we're testing. We're exercising the code, yes. We're also testing our system configuration (Is the web server properly configured for ssl? Does it run on the expected ports?). This is useful in some scenarios, but when it's just my development server, testing the system configuration seems less than ideal. I'm really more interested in testing the code at this point.

So why are we doing this? Long story short, I blame one key limitation in soap4r: it expects to make API calls against a full URL. You have to call "http://localhost/webservice/api/MyCall"; you cannot call "webservice/api/MyCall". This makes it really hard to run in the standard cucumber test environment. It's fine as long as you're consuming a truly remote web service (e.g., a Google API or something), but it's rough when you're trying to test your own web service.
But there is another way: skip the problem component (soap4r, in this case). Instead, I use cucumber and webrat's existing infrastructure and feed it my hand-built SOAP requests. For example, this is how I send a call to our Activation web service:
When /^"([^\"]*)" activates a widget with description "([^\"]*)" with password "([^\"]*)"$/ do |email, description, password|
cert_key = "-----BEGIN CERTIFICATE-----
MIIDGDCCAwwCADADBgEAMAAwHhcNMTAwMTE4MDUxMDM0WhcNMTAwMTE4MDYxMDM0
WjCBujELMAkGA1UEBhMCVVMxFjAUBgNVBAgMDU1hc3NhY2h1c2V0dHMxDzANBgNV
BAcMBkJvc3RvbjETMBEGA1UECgwKQWJha2FzIExMQzEtMCsGA1UEAwwkNzhCQUJG
MzgtOEEzOC00MjAwLTlBN0ItNjIwQjQ3MTlGNDkzMT4wPAYJKoZIhvcNAQkBFi83
OEJBQkYzOC04QTM4LTQyMDAtOUE3Qi02MjBCNDcxOUY0OTNAYWJha2FzLmNvbTCC
AiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBALFCwTbWA95OwK1fKFLycsgG
PuawxjKtkYaiSbSSwMgpU8qqqN4pvv4aPzqLJiZweGMnGFZA7zOePUgu1AHBIKy2
cfrOyxpyrxNUy8mK8RnQKdLkJkmLPBIj5i+w2QoTNs/v+hRW2XLvrzntkVO6U7VJ
NrvKTTfck7rR9v1+9ctrDR8+h0qo/N6fy1k+2jsiEb6VcuC8KjNqVaqragiQe7LT
pbwqeaaYKkH6kNmzNX7tRItaPfyn5G3eE+LSjrpyyzOImWRD/PeA1ZGx8HwYI5rG
gz9Dvs5XGwwyjgwJLCYSEoDnZ0F3o/lhDUjrKjwrgJW0uEQwfwLzCL5APneNpPn5
5owXUCzHe+U6xhO4josYPG7Qx5eDHnuxLWCFuxqvIVSisTRNrW9IYHGifRoNeDEy
qSRlCZB35SMXMAW+b3iVd8uz9HO2UELGnirmPTXY1SV7+1TvAbJJG64ISXCrw/dq
v5mU0nLp1AxX7hv2jjq302t7mVHvA+b+D1I8O13HEANkL3ebVuMotCQ52AgAHTDK
8BfCS+lp3Cdm/778dDRq6UAk/ZLIKb+6/m33yYS9j1IS3Xri5qroIC5rMU6IGtLz
XIOy9DA0CRiOSev8c/g83VpOnhCrhkfmKUivvBchEMvlqOFZa1TQqbq/z5KxeCBl
sS1nbjn/2AyCrVcFQFqTAgMBAAEwAwYBAAMBAA==
-----END CERTIFICATE-----
"
# The SOAP envelope itself. I've munged this part: the payload element and its
# children below are placeholders, not our real schema.
body = <<-body
<env:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    <Activation>
      <description>#{description}</description>
      <certificate>#{cert_key}</certificate>
    </Activation>
  </env:Body>
</env:Envelope>
body
post "/csh/api", body, {"HTTP_AUTHORIZATION" => basic_auth(email, password), "HTTP_SOAPACTION" => "soap/csh/api/Activation"}
end
There are a few key things to notice here:
  • The second argument to the post call contains the SOAP message itself (in XML form, not in any object form)
  • The third argument to the post call is where all the headers go. For a SOAP message, you have to set HTTP_SOAPACTION.
  • I'm using basic authentication here, so I had to set that in the headers as well. You could use cert-based auth by setting "SSL_CLIENT_CERT" in the headers instead.
  • Note that I've munged the example a bit so I didn't have to include all the code that constructs the XML; I'll leave that as an exercise for the reader.
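One piece not shown is the basic_auth helper that builds the Authorization header. If your environment doesn't already provide one, it's just the standard Basic scheme; a possible sketch:
require 'base64'

# Returns the value for the HTTP_AUTHORIZATION header (standard Basic auth).
def basic_auth(user, password)
  "Basic " + Base64.encode64("#{user}:#{password}").delete("\n")
end

# basic_auth("alice@example.com", "secret")
#  => "Basic YWxpY2VAZXhhbXBsZS5jb206c2VjcmV0"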
There is a place in this world for soap4r and the like, but for someone developing their own web service, this other way might be useful.

Monday, January 18, 2010

Passenger and Apache Load Order

On OS X (Snow Leopard), I'm trying to use Apache and Passenger to serve my Rails app. I ran into an issue where it just wouldn't load. Instead, I got errors saying that the server simply wasn't started (and Apache wasn't running).

Trick 1: What the Passenger PrefPane does
The first trick to getting it working was understanding exactly what the Passenger pref pane really does. When you add a Rails app to the Passenger pref pane, it:
  • Adds this to the bottom of httpd.conf
NameVirtualHost *:80
<VirtualHost *:80>
  ServerName _default_
</VirtualHost>
Include /private/etc/apache2/passenger_pane_vhosts/*.conf
  • Creates a vhost file named with the address you specified in the pref pane
  • Puts the following in the vhost file

<VirtualHost *:80>
  ServerName csh.local
  DocumentRoot "/Users/cpowell/projects/csh/public"
  RailsEnv development
  <Directory "/Users/cpowell/projects/csh/public">
    Order allow,deny
    Allow from all
  </Directory>
</VirtualHost>

Trick 2: Load Order
When you set up Passenger, if you followed the instructions, you made a file called other/passenger.conf. Your passenger.conf looks something like this:
LoadModule passenger_module /Library/Ruby/Gems/1.8/gems/passenger-2.2.9/ext/apache2/mod_passenger.so
PassengerRoot /Library/Ruby/Gems/1.8/gems/passenger-2.2.9
PassengerRuby /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby
The trick is that this file has to be loaded before any of your Passenger vhosts (aka Rails apps running under Passenger), so the load order in httpd.conf is important.

There are three relevant lines in httpd.conf. If they are not in this order, your application simply won't run.
1. Include /private/etc/apache2/other/*.conf (this loads your Passenger module and all the Rails goodness your apps need)
2. Include /private/etc/apache2/extra/httpd-vhosts.conf (this loads your vhosts, if you're putting them in the file named, logically, httpd-vhosts.conf. This is where I put things when I wasn't using the Passenger pref pane. Note that this is commented out by default, so if you're going to put things there, you need to uncomment it.)
3. Include /private/etc/apache2/passenger_pane_vhosts/*.conf (this is where the Passenger pref pane puts your apps)

Trick 3: Passenger Pref Pane appends
After you've set up your application in the Passenger pref pane, if you modify it, the behavior is slightly odd. For example, I might switch the app from running in Development to Production. When I do this, the pref pane adds the following text to the vhost configuration in passenger_pane_vhosts/[my app].conf:
<Directory "/Users/cpowell/projects/csh/public">
  Order allow,deny
  Allow from all
</Directory>
It doesn't replace the original entry; it adds a new one. This means I accumulate several Directory entries over time, so I have to go back and clean up the vhost file by hand.

None of these things is a killer, but I at least found myself rather frustrated by them. Hopefully this saves someone else some time!

Friday, January 15, 2010

Ritual Question

It's easy to make tasks into habits over time. For example, our team stands up every day at about 9:30 am and talks about what we've done and what we're doing. It's similar to a SCRUM standup. When we started it was a little odd, but over time we got the hang of it and we have a rhythm. Now it's a daily ritual.

This is great for us. We get a view into what each of us is working on. There's a chance to crow about some really cool thing we did. A struggle or frustration becomes apparent and can be noticed so it can be handled after the meeting.

Sometimes, though, updates start to get sketchy. "I'm working on the frobble, still going." Or "I did this awesome thing!" (response: mumbles). Has our daily standup turned into just a meaningless ritual?

Every so often - once every three months or so - it's useful to ask whether the ritual still has purpose, or whether it has become rote. Are we really getting things out of it? Is this 5 or 10 minutes still valuable? After all, if it's useless, we should stop, or change things.

Identify your rituals and your routines, and make it a point to ask yourself periodically whether each one is still serving a purpose, and whether that purpose is valuable. Sometimes it's better to stop than to let your rituals go stale.

Thursday, January 14, 2010

Ticket Clutter

We use a ticketing system (Jira) that uses a pretty common method of displaying ticket information. Basically, it looks like this:

Ticket Summary: Bug in widget
Description: When the frobble in the widget grows to over 32 MB, the widget starts to wobble and it all falls down.
Comment 1: Failed in nightly on 2010-01-05
stack trace here
Comment 2: Failed in nightly on 2010-01-06
stack trace here
Same failure - maybe it's consistent?
Comment 3: (Assigned to Mr. Ed)
Comment 4: I can reproduce this 80% of the time if I grow the frobble over 40MB, but only 50% of the time at 33MB.
Comment 5: Failed in nightly on 2010-01-07
stack trace here
Comment 6: Failed in nightly on 2010-01-08
stack trace here

Etc...

Jira is of course not the only system that uses this basic format; many different systems do. For the most part it makes sense. You have a basic understanding of what's going on at the top, and then a running commentary of what happened and what people working on the bug are thinking and doing. It gets awfully cluttered, though. If a bug is around for a while, or if it causes a lot of test failures, you can get pages and pages of comments that basically amount to "yup, failed again", interspersed with comments from humans who are actually working on the issue and not simply recording that it still is an issue.

So how do we handle this?
  • Combine comments where possible. In particular, the "still happening" comments all basically say the same thing, so they can be combined. If, for example, the same problem takes out 5 tests in the automated suite in one night, I'll make that one comment. That helps some, once you're sure they're the same problem.
  • Defer the test. If the test is failing consistently and not providing more information, I'll defer it until the bug is fixed. It's a little unnerving to decide not to run a test, but if it's not telling you anything new, then it's okay to not run it until something changes (probably an attempt at a fix or some additional debugging code going in).
  • Filter comments. I haven't actually seen a good defect tracking system that does this, but I'd love it if I could look at a ticket and filter the comments to only show those logged by humans, or by a particular person. That way I can skip the "still happening" status updates (all logged by our automated system) and just look at the actual work being done. A rough sketch of the idea follows this list.
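Until a tracker supports that, a crude version is possible against an export of the comments. A minimal sketch, assuming a hypothetical author/body structure and a known automation account:
# Hypothetical data: comments pulled from the tracker as author/body pairs.
AUTOMATION_AUTHORS = ['nightly-bot'].freeze

comments = [
  { :author => 'nightly-bot', :body => 'Failed in nightly on 2010-01-07' },
  { :author => 'mr.ed',       :body => 'Reproduces 80% of the time over 40MB' },
]

# Keep only the comments written by humans.
human_comments = comments.reject { |c| AUTOMATION_AUTHORS.include?(c[:author]) }
human_comments.each { |c| puts "#{c[:author]}: #{c[:body]}" }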

How do you handle ticket clutter?

Wednesday, January 13, 2010

Looks Funny

"Looks funny" is a term with a specific meaning:
"Has a result or value that appears incorrect but has not been shown to be wrong."

Something that looks funny might really be a problem. Alternatively, it might be something that is perfectly correct, but your understanding needs some clarification.

For example, we were testing data ingest rates, and we got results like these:
Plain text = 49 MB/s
Office = 48 MB/s
VMWare Images = 76 MB/s

That VMWare image number "looks funny". We didn't yet know it was wrong, but it was certainly not in line with our other results. After digging in, we identified that a significant portion of the time was being spent opening files. So in data sets made up of relatively many smaller files (plain text and office), the amount of time spent opening files as a percentage of the total ingest time was high. For larger files (VMWare images), the time spent opening files was a lower percentage of the total ingest time, which resulted in an overall faster speed.
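To see why, it helps to sketch the math. The numbers below are made up purely for illustration, but they show how per-file open time eats into the effective rate for data sets full of small files:
# Illustrative numbers only: a rough model of why per-file open overhead
# drags down ingest rates for data sets made of many small files.
raw_rate_mb_s = 80.0   # assumed throughput once a file is open
open_cost_s   = 0.002  # assumed cost to open one file

[
  { :name => 'Plain text',    :total_mb => 10_000, :files => 200_000 },
  { :name => 'VMWare images', :total_mb => 10_000, :files => 50 },
].each do |set|
  transfer_time = set[:total_mb] / raw_rate_mb_s
  open_time     = set[:files] * open_cost_s
  effective     = set[:total_mb] / (transfer_time + open_time)
  printf("%-14s %5.1f MB/s\n", set[:name], effective)
end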

Other times, "looks funny" is the first sign of a bug. For example, we noticed that sometimes doing actions was a bit slow. "Sometimes", by the way, is a general sign that you might have something that "looks funny". It took a long time to track it down, but it eventually turned out to be a background process running at too high a priority. Most of the time that didn't cause a problem, but every once in a while there would be enough other stuff going on that the thing you were trying to do had to wait around for a while.

As you're testing, keep an eye out. Something that looks funny is probably worth following up on.

Tuesday, January 12, 2010

Rails, Apache, Passenger, and SSL

I had a Ruby on Rails project that I wanted to test using SSL. This is how I configured it to work with SSL, Apache, Passenger, and my app.

You will need:
- OS X Snow Leopard
- Apache (the version that comes with Leopard is fine)
- Passenger
- the Passenger Pref Pane (optional, but useful)
- your application, running successfully locally. Your "application path" is your RAILS_ROOT here (i.e., the folder that contains app, config, etc.).

Note that this might work on other versions of OS X, or on other similar OSes, but I haven't tried it.

Install Prerequisites
This is just about getting the software you need. If you already have any of these, just skip that step.
1. Install Passenger from here: http://www.modrails.com/. Just download and follow the README.
2. Install the Passenger PrefPane from here: http://github.com/alloy/passengerpane or here: http://www.fngtps.com/passenger-preference-pane. Just download and follow the README. Note that this is 32 bit, so when you access the prefpane, it will restart System Preferences in 32 bit mode. If this annoys you, skip this step. It's your choice whether you install this for just you (runs in user mode) or for everyone (needs an admin password to change configuration).
3. Your application can live just about anywhere. Make sure the full path to your RAILS_ROOT is world executable. For example, my application is in ~cpowell/projects/csh, so I ran the following in a terminal:
chmod o+x ~cpowell/projects/
You can see if this is right by doing
ls -l ~cpowell/projects/
and looking for this output:
CMP:~ cpowell$ ls -l ~cpowell/projects/
total 0
drwxr-xr-x 19 cpowell staff 646 Jan 10 20:34 csh

Add Your Application to Passenger and Apache
This will let Apache (with Passenger) serve your application.
1. Open System Preferences.
2. Click on the Passenger icon. This will display a dialog that says "To use the Passenger preferences pane, System Preferences must quit and reopen." Click OK.
3. Click on the plus sign to add an application.
4. In the address, you can call it basically whatever you like. I use the .local ending just because that's an easy Mac convention. In my case, I called it csh.local.
5. Put the path to your Rails project in the Folder field. In my case, it's ~cpowell/projects/csh.
6. Choose the development or production config. This one is purely up to you; I use development because I'm just testing, but you can choose production if you prefer.
7. In the main System Preferences page, click Sharing. "Web Sharing" is what OS X calls Apache.
8. Select the Web Sharing checkbox to start Apache.

At this point you should be able to open a browser and go to http://csh.local and see your application. (Substitute the address you chose, of course.)

Configure SSL
Once you have this running, we can add SSL support, if we need it.
Add an SSL vhost to your passenger config.
1. Open your passenger apache config. It's probably in /etc/apache2/passenger_pane_vhosts/csh.local.vhost.conf (substitute your application name). You'll need sudo for this. It will have a virtual host entry for port 80.
2. Add a virtual host for port 443 that looks like this. Note the paths to your application, to your certificates, and to the ssl log. All those paths should exist.
<VirtualHost *:443>
  ServerName cloudswitch-home.local
  DocumentRoot "/Users/cpowell/projects/csh/public"
  RailsEnv development
  <Directory "/Users/cpowell/projects/csh/public">
    Order allow,deny
    Allow from all
  </Directory>
  SSLEngine on
  SSLProtocol all -SSLv2
  SSLCipherSuite ALL:!ADH:!EXPORT:!SSLv2:RC4+RSA:+HIGH:+MEDIUM:+LOW
  SSLCertificateFile /Users/cpowell/projects/csh/server.crt
  SSLCertificateKeyFile /Users/cpowell/projects/csh/server.key
  SSLVerifyClient optional_no_ca
  SSLOptions +ExportCertData +StdEnvVars
  SetEnvIf User-Agent ".*MSIE.*" \
    nokeepalive ssl-unclean-shutdown \
    downgrade-1.0 force-response-1.0
  CustomLog /var/log/apache2/ssl_request_log \
    "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
</VirtualHost>
3. Generate a self-signed certificate according to these instructions (http://www.tc.umn.edu/~brams006/selfsign.html). Use section 1A or 1B, your call.
4. Put those generated certificates and key in the locations you indicated in your passenger vhost.
5. Enable SSL in Apache by uncommenting this line:
Include /private/etc/apache2/extra/httpd-ssl.conf
6. In the /private/etc/apache2/extra/httpd-ssl.conf file, get rid of the vhost for port 443. We have to do this because otherwise it will conflict with the port 443 vhost we just set up in the passenger config. To do this, get rid of everything below:
##
## SSL Virtual Host Context
##
7. Restart Apache.

Note that if you have difficulties here, check out the logs in /var/log/apache2 to help troubleshoot.

Try It
Open a browser and go to https://csh.local. You should see your application.

Monday, January 11, 2010

Who Was On My System?

At work, we often wind up sharing systems. Multiple people will be doing things on the same system at the same time, or we'll borrow each other's systems to run tests. This reveals each of our quirks very clearly....

The guy who always creates an enterprise user (one of our user types) with the name "Kirk".

The guy who comes in behind him and creates another user named "Picard".

The guy who names things starting with "moo" (e.g., "moo1", "moo3").

The guy who names things thematically after fruit (e.g., "banana" and "apple" and "pear").

It's fun to look at a system that's been around for a while and see our fingerprints on it. It's actually useful, too, because you can look at a system that's doing something unexpected and ask the previous user about it; because we have the fingerprints, we know who to ask!

Friday, January 8, 2010

Classic: Anticipation

There are several qualities that make for a really good QA engineer:
  • ability to design a test that provides interesting information
  • a dogged desire to track down the little niggling oddities (that often lead to really deep bugs)
  • well-developed, if informal, systems thinking
  • ability to distill a lot of information into clear and concise descriptions
These are all what I think of as product-oriented abilities. They are the mental tools that allow a QA engineer to approach a system, no matter how opaque, and develop a deep understanding of that product (or tool or application or whatever you call your thing under test) and its reactions to internal and external influences.

But the one thing that really sets a QA engineer (or any engineer, I suspect) apart has nothing to do with the product. It has everything to do with the human organization. And that one thing is anticipation.

The ability to anticipate the needs of the human organization is what makes a good QA engineer great. Many of these are not large changes; they're the little things that make the day flow more smoothly. For example:
  • Knowing that the developer has a standup at noon and will want to know the results of last night's automated tests, anticipation says the QA engineer should look at them enough to offer a summary by about 11:30.
  • Knowing that a big client is preparing for go live in a week, the QA engineer should set up a similarly-configured system in order to be prepared for questions and issues that arise as the system goes live.
  • Understanding that a problem at one client will inevitably lead to the question: "will this affect other clients?" and taking the characterization of the problem to other client scenarios before being asked.
So ask yourself not just "what do they want", but "what will they want next". That's what will tip you over the edge from scrambling to keep up to that corporate zen state called being proactive.



(For those of you paying attention, I first published this about 18 months ago, but I've been thinking about it again lately, and it still rings true.)

Thursday, January 7, 2010

Testing for Marketing

Much of the testing we do is to provide information about the product that is targeted at engineering or others involved directly in releasing and supporting software. We gather information about unexpected states (bugs) to help developers resolve issues. We provide deployment information to help services build successful production configurations. We help do root cause analysis on field problems and feed that into support for workarounds (and developers for fixes). For all of these things we're trying to come up with scenarios that might occur in the real world. They may be extreme or rare, but our goal is to provide information that is predictive of how the software might behave in the field.

Sometimes we're testing for a different purpose, though. Sometimes we're looking for tests that help marketing or sales. These are what I call "testing for marketing": tests you run not just because they're valid but also because you need them to help you look good. For example, marketing may be producing a white paper about your system performance. They want the fastest performance numbers that are reasonably possible. This doesn't mean you should fake anything; it means you need to build your tests appropriately. You may use data that is "real" (e.g., non-generated) so that it's understandable to an analyst when you explain it. For example, "We get 60 MB/s on data that is real office data. We took a couple of backups from our file server" sounds better than "We get 60 MB/s on data that we created." (This is true even if you created data with real-world characteristics.)

As you're creating your tests and your test data, remember all your audiences, and you'll be better able to provide usable information.

Wednesday, January 6, 2010

Cold Turkey

As we learn new tools, we're often transitioning from old ways of accomplishing the same task. Maybe, for example, we used to work on Windows and now we're working on Linux. Or we were a QTP shop but we're moving to Selenium. Perhaps we used to manually install machines and we're trying to create (or use) an automated installation and deployment mechanism.

As always with learning something new, things are hard for a while. It's a pain to figure out how to get good resource usage information out of Linux if all you know is Windows. Selenium feels just plain weird if you're used to QTP metaphors. And for a good while it's faster to just install the darn machine by hand rather than try to use an automated system that's probably still a bit picky (read: buggy).

Go cold turkey.

It's easy to cling to the old way of doing things. You may just spend time "maintaining" the QTP tests you have, and figure you'll convert to Selenium later. Or you might install a machine by hand "just this once" rather than use the framework. Don't.

You'll learn by actually using the new thing you've chosen. Only actually doing it for a while will make your new tool or your new process comfortable. It's going to hurt for a while, but it's going to hurt for a lot longer if you don't commit to the new way. So go cold turkey on the old way, and embrace the new.

Tuesday, January 5, 2010

I Am Not (Only) A Tester

One of the benefits of taking some time off is the chance it affords you to look at yourself and see what really is there without being caught up in the moment. I took the last week off, looked at myself, and discovered that I'm not really a tester. Or more precisely, I am a tester, but that's only part of what I do.

Look, for example, at the recent things I've done or been involved with:
  • drafted management, monitoring, packaging and configuration requirements for a new system we're working on (I'm the one who thinks about the infrastructure stuff)
  • modified and extended test infrastructure (and fixed a couple bugs along the way)
  • been the general email/document reviewer for my boss and several others on some business development work we do. They call it the "grammarian" hat.
  • tested software (the "I am a tester" part)
  • worked with some team members to help them learn how to find trends in field and internal issues
  • written a couple blog posts
What I am is what many testers are - someone who is skilled at looking at systems, breaking them down into their component pieces and then building them up to see the larger picture. This systems thinking plays across the software development life cycle, from requirements elicitation to development, deployment, and into supporting and maintaining the system.

I'm a tester, and so much more than that.

Monday, January 4, 2010

How Many Bugs Will We Find?

When we do an iteration or a release or a sprint or a [insert unit of work here], it generally goes something like this:
  • Figure out what you're going to do and commit to it
  • Start building it
  • Start testing it
  • Find urgent bug in the previous output and squeeze it in
  • Finish "first implementation"
  • [scramble to fix issues found, finish implementation, etc]
  • Done
  • Repeat
I'll note that most of the books you'll read on development methodologies skip one or two of those steps, in particular the defect handling and near-the-end juggling steps. Life is generally less orderly than the books.

After a few tries at this, someone generally comes to engineering and says something like, "I've noticed we have a crunch at the end and we're not really finishing what we signed up for because we're having to deal with issues, and testers are saying that with features rolling off the line right up until the end of the sprint we're going to keep having integration issues after the iteration is over. What do y'all think we should do about it?" After talking for a while, the answer generally comes down to one of two things:
1. Leave a bit of slack in this iteration and use that to fix issues found in the field or found in the iteration. Generally this will happen at the end of an iteration.
2. Create inter-iterations or some other slack period between iterations for issue resolution, either found in the field or found in iteration. For example, an iteration might be 12 days long (2.5 weeks) and leave 3 days "between iterations" for resolving issues.
The basic idea in either case is to leave some room in the schedule for handling the issues that will inevitably come up. The next question, of course, will be:

How many bugs are we going to find?

This is not an easy question. Keep in mind we're in planning and scheduling mode, so the real question is how much time the team is going to need to allocate to bug fixing. After all, finding 5 bugs could mean 15 minutes of work, or it could mean 15 days of work, depending on what the bugs are. So let's start thinking in terms of time. How much time should the team plan to spend on bugs?

There are three major inputs to this equation:
  • historical data
  • contents of the iteration
  • tolerance for risk of things slipping out of the iteration
Historical data is your most valuable way of keeping yourself honest. This is what you look at to know how well you - your team, your testers, your management - will actually do in a given scenario. Look at past iterations and ask yourself how much time you spent resolving bugs. Break your iterations into two types: those right after a release hits the field, and those with no major new field work. You'll probably find your time spent on field issues is higher right after a release, so these two numbers might be quite different. This is your baseline number, and you want it in man days. As a rough baseline, this is probably on the order of 10-20% of your total man days (at least, in my experience).

The contents of the iteration will help identify any unusual characteristics of the iteration. A major new feature or work in a problem area will increase the number of issues (and possibly the time to fix them). An iteration that's fairly safe may require less budgeted bug fix time. Generally this will change your baseline number by up to 25%. You can do this by function points or task hours, but in the end also listen to your gut. That "eek!" you hear is your gut saying, "bug time needed here!". Listen to it.

Lastly, consider the "slippability" of the iteration. Sometimes it's okay if things change in the middle of an iteration. Sometimes it's okay if bugs wait. If that's the culture or the iteration you have, that's fine. Schedule less time for defect handling. Other times there are major things riding on hitting an iteration end, no matter what. If your culture is that conservative, or if revenue depends on hitting the targets you say you're going to hit, then schedule more time for defect handling during the iteration, because the alternative - not making it - is less acceptable.
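Putting those three inputs together is simple arithmetic. A back-of-the-envelope sketch, with purely illustrative numbers:
# Back-of-the-envelope version of the math above. Every number here is
# illustrative, not a recommendation; plug in your own history.
total_man_days    = 48     # e.g., 4 people x a 12-day iteration
baseline_fraction = 0.15   # historical: roughly 10-20% goes to bug fixing
content_factor    = 1.25   # risky new feature this iteration: pad up to 25%
slip_factor       = 1.2    # low tolerance for slipping: pad a bit more

bug_budget = total_man_days * baseline_fraction * content_factor * slip_factor
puts "Plan roughly #{bug_budget.round} man days for defect handling"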

Much as we'd all like to have perfect code that does everything that the customer wants, there will be defects and we have to handle them. How you handle them is up to you, but better to think about that before you're in the thick of it.