When we're doing estimations, one of the tools I use is what I call a breadbox sizer. It's a tool to help us figure out how much work we should commit to in an iteration; in other words, it's a measure of our velocity. Here's how it works:
Categorize the things you've done.
The idea here is to create a list of the things you work on. Usually these are features or product areas, but you can also include performance testing or other categories. This is basically how you categorize your testing effort. For example, my categories include:
- GUI management
- Test Infrastructure
- Replication and snapshots
Try to keep this between 10 and 20 categories. Fewer than that and you'll be putting too much in each category. More than that and it gets unwieldy to work with.
Put your past stories in the categories.
Go back over the last three or four iterations (or more if you can) and sort the stories/features/tasks you worked on into the categories you've created. For example, we just did a story "add separate timeouts to setup, test run, and teardown". That would go in the "Test Infrastructure" category.
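To make the bookkeeping concrete, here's a minimal sketch in Python. Only the timeout story comes from the text; the other story titles are hypothetical.

```python
from collections import Counter

# Map each past story to one of our categories. Only the timeout
# story is from the text; the other two are made up for illustration.
stories = {
    "add separate timeouts to setup, test run, and teardown": "Test Infrastructure",
    "verify snapshot restore on a loaded system": "Replication and snapshots",
    "smoke test the new admin screens": "GUI management",
}

# Tally how many stories landed in each category.
per_category = Counter(stories.values())
print(per_category["Test Infrastructure"])  # 1
```

A spreadsheet works just as well; the point is simply that every past story ends up in exactly one category.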
Layer in the estimate sizes for each item.
This part is important. Take the size of the story as you estimated it, and add that to the category for that iteration. I don't care how long it actually took you. I care about how much time you thought it was going to take up front. (We're going to use this for estimates, so I care about how much of your estimated time you actually got done. We don't need another layer in the middle to translate "estimated time" to "actual time" worked.)
For each story, decide whether it was "small" (1-3 units), "medium" (4-5 units), "large" (6-10 units), or "extra large" (bigger than that). If one category had multiple stories, add them up and record the overall time spent in that category. You'll wind up with something that looks like this:
Translate that into work done.
On the spreadsheet we're going to total the work done. Basically, we add 3 for every small, 5 for every medium, 10 for every large, and so on. This gives us the total amount of estimated work that we actually wound up doing in each iteration. Our example now looks like this:
In our first iteration, we managed 8 units of work, total. In our second, we got 15 units of work. In our third iteration, we did 16 units of work.
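The totaling step can be sketched in a few lines. The 3-per-small and 5-per-medium values are from the text; charging 10 per large is an assumption taken from the top of its range, and the particular mix of story sizes per iteration below is hypothetical, chosen to reproduce the totals above.

```python
# Units charged per size bucket: 3 and 5 are from the text;
# 10 for large is assumed from the top of its range.
UNITS = {"small": 3, "medium": 5, "large": 10}

# One hypothetical mix of story sizes per iteration that reproduces
# the totals in the text (8, 15, and 16 units).
iterations = [
    ["small", "medium"],          # iteration 1
    ["medium", "large"],          # iteration 2
    ["small", "small", "large"],  # iteration 3
]

totals = [sum(UNITS[size] for size in it) for it in iterations]
print(totals)  # [8, 15, 16]
```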
Get to a common denominator.
Now we start dividing by the things that change. For example, during iteration 1, one of our QA engineers was on vacation, so we had two engineers. During iterations 2 and 3, we were at full strength. So we're going to divide this up to define how much work per engineer per iteration we can do. In addition, our iteration 3 was three weeks long (doesn't matter why for this example), but the first two iterations were two weeks long each. So we divide that up, as well. This gives us a number per person per week.
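Dividing out headcount and iteration length looks like this. The headcounts (two engineers in iteration 1, three at full strength) and the week counts come from the example above; the rates vary a bit from iteration to iteration, and settling on a single working figure is a judgment call.

```python
# (total units, engineers, weeks) for each iteration, from the example.
iterations = [
    (8, 2, 2),   # iteration 1: one engineer on vacation, two weeks
    (15, 3, 2),  # iteration 2: full strength, two weeks
    (16, 3, 3),  # iteration 3: full strength, three weeks
]

# Units of estimated work per engineer per week, for each iteration.
rates = [units / (engineers * weeks) for units, engineers, weeks in iterations]
print([round(r, 2) for r in rates])  # [2.0, 2.5, 1.78]

# A pooled rate: total units over total person-weeks.
pooled = sum(u for u, _, _ in iterations) / sum(e * w for _, e, w in iterations)
```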
So now we know that each QA engineer can do about 2.5 units of estimated work each week. When we go into the next estimation session, that's where we'll draw the line for test work. We estimate just like we always do, and then we walk down the list committing to 2.5 units of work per person per week. When we run out of allotted time, we stop. (By the way, our units are days, so 2.5 units out of a five-day week means we're getting about 50% effectiveness; my general experience says that's about right, with the rest of the work day going to email, meetings, escalations, sales assistance, and other tasks.)
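The "walk down the list until you run out" step can be sketched too. The 2.5 units/person/week figure is from the text; the story names, estimates, and team size below are hypothetical.

```python
# Capacity for the coming iteration: rate * engineers * weeks.
RATE = 2.5  # units per engineer per week, from the text
engineers, weeks = 3, 2
capacity = RATE * engineers * weeks  # 15 units

# Hypothetical estimated stories, already in priority order.
backlog = [
    ("story A", 5),
    ("story B", 3),
    ("story C", 5),
    ("story D", 10),
]

committed, used = [], 0
for name, estimate in backlog:
    if used + estimate > capacity:
        break  # out of allotted time; stop committing
    committed.append(name)
    used += estimate

print(committed, used)  # ['story A', 'story B', 'story C'] 13
```

Stopping at the first story that doesn't fit (rather than skipping ahead to smaller ones) keeps the commitment in priority order, which matches the "walk down the list and stop" description in the text.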
There are a number of ways to calculate velocity. Many of those methods involve calculating how much time you actually spent, getting a velocity of actual time, and then working to get your estimates to match up with how much time you spent. You can choose to go this way; it's perfectly legitimate. However, I prefer the way I've outlined here simply because it avoids the fuzziness you get when trying to figure out how much time you actually spent (which I find really hard to do).
So see what you find when you calculate your velocity. Good luck!