Monday, November 30, 2009

Helpfulness Balance

Interacting with your team and with external parties is all about preserving balance. You want to be helpful, but not stifling. You want to get help but not be seen as leeching off someone. Sometimes you'll be the giver of help, and other times the asker of help, but it all needs to even out in the end. If it doesn't, you start to get a bad rap.

So for everyone you interact with, you've got a little green bar showing your balance of helpfulness. Think of it like this:

You can ask for help, which drags the green bars down, or give help, which moves the green bars up.

So what happens when you first meet someone? Where's the bar then? After all, you haven't had time to establish a balance yet. Where you sit depends on your relationship with the person. Let's look at a few examples:
  • The person is a potential client. You're getting ready to ask for something (and you can bet the potential client knows it), so you're already implicitly in debt. Your green bar is pretty low.
  • The person is a formal mentor. You're in a mentorship program and both of you know it. This person entered the relationship seeking to help you. Your green bar is quite high.
Okay, so we have to give to receive, and vice versa. So what?

So use this to figure out your behavior. If your green bar is low, keep your requests and questions few and carefully worded. Look for ways to be helpful, so you bring the bar up. If your green bar is high, don't feel guilty about asking questions and making requests. Help where you can, of course, but don't feel like you can't do anything until you've helped.

None of this is rocket science, but as you're getting ready to ask for help, it pays to think about where your helpfulness balance is with someone so you can make your request in an appropriate way.

Wednesday, November 25, 2009

Steel Threading

This is a true story. Names of people and components have been changed to protect the innocent.

Background:
There are a few things you need to know:
  • We work in two week iterations.
  • We basically work from stories in the XP(ish) style.
  • Stories as written have customer benefit.
The Problem:
We wanted to put in a major feature (let's call it the "Behemoth"). The Behemoth was going to basically be a UI replacement. It was going to be great for our customers, and give us better UI scalability and testability, too. There was only one downside: the Behemoth was huge. As in a year or so, rough estimate. There's no way it was going to fit in an iteration, or even in a single release.

Options:
As with most things, we have options. We could....
  • ... branch and send a team off to work on this. When it's done, merge, test, and release!
  • ... hold the release until it's done (eep!).
  • ... break it down into parts.
Now that last one, that's interesting. What if we could break the Behemoth down to some size that would fit in a release, or even better, in an iteration? That sounds good except we have a mandate to not put anything in that doesn't help the customer.

Normally, when we break a Behemoth down, we'd do it into components - say, a coordinator, a renderer, and a config module. We'd then build each component, and at the end we'd string them all together and we'd have a Behemoth. Trouble is, a coordinator is no good to any of our customers, so now we have unused code in the system and we're not providing customer benefit. That's not particularly good of us.

Enter steel threading.

Steel threading is when you break down a project into the smallest end-to-end thing you can do. Then you do another small end-to-end thing. Repeat until you have the thing you're looking for. I don't actually know where the name came from, but I think of it like bridge cable - lots of long skinny steel threads all wrapped up together to make a huge cable that holds up a bridge.


We can use the same trick on our Behemoth. Instead of building a coordinator and then a renderer, we're going to do one tiny use case. For example, we're going to do "view a single widget" in the Behemoth, and we're going to write just enough code to be able to do that. It's far from the customer's full need, but it provides some marginal benefit to the customer, and we can write it small. Next iteration, when we've done "view a single widget", we're going to do "view 2 widgets". Then we'll do "add a widget", followed by "view 1000 widgets". And we'll just keep going until the whole Behemoth is built. This also tends to reduce integration problems, because you've had to integrate your components the whole way along, so if your coordinator can't talk to your renderer, you find out before you've written a whole lot of code.

As with any technique, steel threading is not universally appropriate. In cases where you're faced with a huge task, though, give it a shot.

Tuesday, November 24, 2009

Partial Payment

So we have this piece of code. It's been around for a while, and it's been hacked on by a number of people. Some of them knew what they were doing, some not so much. It started out as one thing, and then suffered from "wouldn't it be neat if we also..." syndrome. Now it does several things, mostly related. And it's getting a bit.... crufty. It's not awful, but it is definitely starting to smell a little bit. (I'm pretty sure we all have a piece of code like this!)

So the right thing to do here is go in and refactor it when we next need to make a change. In principle, no problem.

As luck would have it, I happen to be the next person daring to venture into this piece of code. I want to add something fairly simple that makes it generate a list of successes in addition to the list of failures. Should be straightforward. Oh, except I need to go do some refactoring.

My task used to be this:
  • add a method to get a list of passed tests
  • add a few lines to some pre-existing methods to generate the file and header information
  • add a few lines to dump the list of passed tests into the output file.
But because I need to do the refactoring, my task now looks like this:
  • generalize the getFailedTests method to get either passes or failures, and refactor all the calls to that method
  • add a few lines to some pre-existing methods to generate the file information (separate output for successes)
  • refactor the existing generateHeader method to give me a different header based on whether I want successes or failures or both
  • refactor out the formatting of the output so it can be HTML (failures show up on a web page) or plain text (passed tests just go into an archive)
All these are good things, but my 2 hour task just became most of a day. Ouch!
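
To make the first item concrete, here's a rough sketch of the generalization. This is hypothetical Python, since the real code isn't shown here; the TestResult shape and the getTestsByStatus name are invented.

    # Hypothetical sketch only; the real method names and structure differ.
    class TestResult:
        def __init__(self, name, status):
            self.name = name
            self.status = status  # "PASS" or "FAIL"

    # Before, there was effectively just getFailedTests(). The refactoring
    # generalizes it and keeps the old name as a thin wrapper so existing
    # callers keep working while they get migrated.
    def getTestsByStatus(results, status):
        return [t for t in results if t.status == status]

    def getFailedTests(results):
        return getTestsByStatus(results, "FAIL")

    def getPassedTests(results):
        return getTestsByStatus(results, "PASS")

The other items follow the same pattern: pull the common piece out, parameterize it, and leave the old entry points in place until their callers have been updated.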

The risk of deferring a refactoring to the next time you touch the code is the same as the reason you didn't refactor the last time you touched it - you're just out of time. Yes, this is how technical debt grows and stays around.

So how do we overcome this? We don't have time to pay back as much technical debt as we should here, and there's a risk that I'm simply not going to add the feature because I don't have a day to give to the feature+refactoring. So we compromise. I did some of the refactoring, but not all of it, and I added my feature. Total time: 4 hours. The code isn't where it should be, but it's closer.

Just like any other debt, technical debt takes time to pay down. You don't have to pay it all at once, but every little bit you pay is helpful. So don't be afraid of how much is there. Just start paying it off, bit by bit. The goal isn't to get rid of everything crufty in one fell swoop. The goal is to leave the code you touch better than you found it. Do that often enough, and you'll pay off your debt.

Monday, November 23, 2009

Fail Safe

Like many people, we have scripts that do various tasks. They update libraries on lab machines, check and clean out temporary directories, archive old test results, and myriad other things.

There's one thing that all our scripts must have before we'll begin using them:

A fail safe.

That's right. The problem with these kinds of background cleanup mechanisms is that when they go bad, they go really, really bad. The updater installs a package that leaves the machine inaccessible over the network? Multiply that by several hundred machines and you have a real problem. The test archiver fills up its target? Continuing to flood it with requests isn't going to get you anything but network traffic.

Having learned that lesson the hard way, we now require every utility we use to have a simple fail safe. The utilities already check their operations to make sure there's no problem; if an operation fails, or fails more than n times, the utility shuts itself down. This prevents all kinds of runaway code problems, and things move a lot more smoothly.
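
Here's a minimal sketch of the idea. This is not our actual scripts; the threshold and the operation are placeholders.

    import sys

    MAX_FAILURES = 3  # the "n" above; purely a placeholder value

    def run_with_fail_safe(operation, targets):
        """Run operation against each target, but stop if failures pile up."""
        failures = 0
        for target in targets:
            try:
                operation(target)
            except Exception as err:
                failures += 1
                print(f"operation failed on {target}: {err}", file=sys.stderr)
                if failures >= MAX_FAILURES:
                    # The fail safe: stop before a bad run rolls out to the whole lab.
                    print("too many failures; shutting down", file=sys.stderr)
                    sys.exit(1)

Wrap the updater, the archiver, and the cleanup scripts in something like this, and a bad package or a full disk stops the run instead of multiplying across several hundred machines.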

You'll need utilities and scripts to perform cleanup and maintenance tasks. Go ahead and write them. Just be sure that you put in a fail safe so they don't go from helpful to nasty!




Friday, November 20, 2009

Find the Heartbeat

Projects generally have heartbeats. These are the rhythms of a team, and they're both large and small. You probably have a small team heartbeat - a daily standup, a weekly meeting. Then the project probably has a larger heartbeat - a bi-weekly iteration, or a monthly release.

A heartbeat implies conformance, repeatability, closure. Think about what happens, for a minute, when the human heart beats:
  • the heart expands, allowing blood in
  • the heart squeezes, moving the blood around inside the heart
  • the heart squeezes differently, moving the blood out of the heart into the body
  • during all this, the valves of the heart open and close
(I'll note I'm definitely NOT a doctor.)

When a heart does things in a kinda sorta way, you've got a problem. A leaky heart valve, that's a problem. A heart that doesn't pump all the blood through, that's a problem. These are inefficiencies in the human heart, and when they get bad enough then you're in serious trouble.

The same thing is true of a development project. We don't create and maintain a project heartbeat because it feels good. We do it because if we don't have rhythm, then we're showing inefficiencies, and when those get bad enough, we're in serious trouble. For example, if our stories no longer fit in an iteration, that's a leak. Once or twice is okay, but if it happens a lot or badly, then the project is in trouble, and we're likely to be late or broken, or both.

Don't have a project heart attack. Keep an eye on your project's heartbeat.

Thursday, November 19, 2009

Cheap Usability

Usability testing is an art form. To do it properly takes significant experience, time and resources. But....

If you lack the time, resources, expertise, or will to do full usability testing, don't give up. You can do one thing that will be a huge first step:

Draw what you're talking about.

That's it. Simple, huh?

You can use a whole lot of words describing something. But if you just draw it out, you'll start to see some of the big problems. It might or might not be the most usable thing in the world, but shy of doing actual usability testing, it's a good start.

Wednesday, November 18, 2009

The Untrained Tester

I'm pretty much an untrained tester. And yet somehow I generally know how to test stuff. Huh? Okay, so my testing classroom time is limited. I am, after all, an autodidact in test. I'm untrained. But I'm not unlearned. That's an important distinction.

There are many ways to learn how to test something:
  • Training. Yes, this can work. Classroom or online, both count.
  • Books. I still swear by "The Complete Guide to Software Testing" by Bill Hetzel. It's a bit outdated in some ways, but it's got a lot of things I can poke at and say, "oooh, is that relevant and how would I apply it?"
  • Blogs and other online guides.
  • Google. This one's great for learning specifics, like how to work with certain tools. I tend to hit this one late in the process.
  • Past experiences. Things we tried that worked, or dev techniques, or things that failed. I learn a lot from coworkers, both testers and developers.
So keep in mind that learning counts, whether it's formal training or something a lot more informal. If you can say, "what do I need to know?", then you can go learn it. Don't wait for the formal training. Just go learn.

Monday, November 16, 2009

One New Thing

One of the amazing things about testing is that you get a chance to try something over and over again. Every release, you get a new chance to try a similar process. Every time you run automation, you get a chance to make it stronger, whether that's per build, nightly, weekly, whatever. We're lucky to have so many opportunities to do roughly the same thing.... better.

Take advantage of that opportunity.

Every time you test, ask yourself what one thing you can do better this time. If you're feeling ambitious and things are generally under control, go for two or three things better. Try a new test technique. Try a new ordering of the test plan to shake out problems earlier. Fix a couple of the reporting oddities in your test infrastructure that have been bothering you.

This isn't news. I've written about changing up your test plan before. It still bears repeating. Do what you were doing... and do one thing better.

Friday, November 13, 2009

Project Doldrums

Sometimes a project gets into a pretty frustrating state, in which:
  • it's "mostly" done
  • it's highly visible
  • it's just starting to get tried by a broad audience, and not all of them know the background and details of the project, just this thing they have been asked to try.
If you're not careful, this is where you get stuck in the project doldrums. Now is the time to avoid stagnating. You're getting feedback, which is probably introducing new requirements or ideas. You're probably finding a few issues. It's likely that you have one or two things you already knew you needed to do. And those things just sort of keep piling on each other.

It's now up to you to get control of it and get momentum again. (I could keep the doldrums metaphor going and say that you have to turn on the motor and get out of the listless winds.)

Getting momentum isn't hard, really. There are only three key things that you must do:
  1. Time box it. You should be happy to take feedback, but you're only giving people a set amount of time to provide it. After that, no new requirements, no whining. It will go into production as it is spec'd.
  2. Make your task list public. You have a set of things you now need to do (fix bugs, update config, add a few features). Publish it, and publish where you are on that list. That way you don't get the same complaints over and over, and when it's fixed, you can tell people to try again. It's a way to show that feedback is not ignored, that you will get to it, and that you are making progress.
  3. Do only your tasks. Don't make random or unrelated changes. Every change you make should be based on a task in your list. It is imperative that your task list be complete. If you find something else, add it to the task list, then do it. You don't want to give your (now very public) audience the impression that you're flailing around making random changes. It makes them lose confidence in you.
Projects can hit the doldrums. It will happen eventually. Don't worry about it overly; you can get out of them. Just do it with momentum, and do it with confidence.

Thursday, November 12, 2009

Rituals

After you work in a team for long enough, you start to develop rituals, formal and informal. A standup is a daily ritual. An iteration retrospective is a ritual. The guy who brings in donuts most Wednesdays is a ritual.

Rituals are great. They are the affirmation that you're a team, and that the team is almost a living organism. It has a heartbeat and habits - and those are your rituals.

But...

Rituals are only affirming if they continue to have meaning. There's no point to having a retrospective if you're no longer coming to small stopping points every iteration. Otherwise it's just a standup, only longer; you can't retrospect in the middle of something. There's no point to having donuts on Wednesdays if you're forced to bring them in; the beauty of that ritual is the small thrill of informality.

Embrace the rituals you have. But evaluate them to make sure there is still meaning behind your rituals. As soon as they lose their meaning, stop doing them. There is no affirmation in empty rituals.

Wednesday, November 11, 2009

Break It Down

We're working on an internal project that involves (among other things) sending notifications programmatically to Jabber users. It's at that stage where it works for some people and not for others. There are two versions of code doing the sending (different OSes). There are two OSes on the clients, and there are about 10 different clients.

ACK!

So it's time to break it down. It's overwhelming to try to tackle it all at once, but if we make a table we can see what works and what doesn't, and start to try to get all "Y"s into the table, and see if there are any patterns. It gives us a base to work from.
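
If you want to keep that table somewhere other than a whiteboard photo, even a trivial script will do. Here's a sketch with made-up sender and client names, where every cell starts out unknown and gets filled in as each combination is tried:

    # Sketch only: the senders, clients, and results here are placeholders.
    senders = ["sender-os-1", "sender-os-2"]
    clients = ["client-A", "client-B", "client-C"]  # ...about 10 in our case

    # "Y" = works, "N" = fails, "?" = not tried yet
    results = {(s, c): "?" for s in senders for c in clients}

    def print_matrix():
        print(" " * 14 + "".join(f"{c:>10}" for c in clients))
        for s in senders:
            row = "".join(f"{results[(s, c)]:>10}" for c in clients)
            print(f"{s:<14}{row}")

Fill in cells as you test (for example, results[("sender-os-1", "client-B")] = "Y"), print the matrix, and the pattern - or the hole in your coverage - tends to jump out at you.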



When you're lost, write down what you know, manipulate the data to visualize it, and you'll see a way out.

Tuesday, November 10, 2009

Sufficient Quality

How do you measure yourself? How do you know your release is of acceptable quality? You've found a lot of bugs, and you've fixed a lot of bugs. You have a set of great new features, and you've done all sorts of interesting security and usability testing. It's a great release! Or is it?

Your release is of sufficient quality if your customers are sufficiently happy.

The real trick here is to define "sufficient". You could have hundreds or thousands of bugs in the product, and if your customers don't hit them or don't mind them (or think a bug is a feature!), then it's still a release of acceptable quality. You could have a total of 5 bugs, but if your customers hit them a lot and they're bad, then this is not a release of sufficient quality.

So if you want to know how you as an engineering (and requirements gathering and sales) team are doing, ask your customers. They're the ultimate arbiters.

Monday, November 9, 2009

Grunt Work

I've been working a bit on some data analytics projects. I've been looking at two major things:
(1) what kinds of issues we find in the field; and (2) what kinds of issues we find late in a release. To do this, I go diving through our defect tracking system. We use Jira, so this is mostly creating filters, and generally runs along the lines of "show me all the issues by client in the customer escalations project".

The problem - and this happens with many things - is that our reporting now has more data than it used to. For example, we didn't use to track which client an escalation was opened at as a query-able field (it was just in the text). We now track it as a separate field, but that means that all the old issues were never updated. So I have two choices: I can either construct a special query that pulls the info out of the comments, or I can backfill the new field on all the old issues where it isn't populated.

The advantage to a special query is that I can construct it and I don't have to touch a lot of bugs. The disadvantage is that I have to reuse and maintain that fairly complex query every time I need the information. (And if someone else wants to use it, well I hope they can figure out how!) So instead I'm going to make our old issues comply with our new practices - and populate the field we're now using.
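
For what it's worth, even "manual" grunt work like this can often be half-scripted. Here's a sketch of what the backfill might look like with the Python jira library; the server, project key, custom field id, and the way the client name is buried in the text are all invented for illustration.

    from jira import JIRA

    CLIENT_FIELD = "customfield_10042"  # hypothetical id of the new "Client" field

    def extract_client(text):
        # Placeholder parser: pull the client name out of the free text,
        # however it happened to be recorded back then.
        for line in text.splitlines():
            if line.lower().startswith("client:"):
                return line.split(":", 1)[1].strip()
        return None

    jira = JIRA(server="https://jira.example.com", basic_auth=("me", "api-token"))
    old_issues = jira.search_issues('project = ESC AND "Client" is EMPTY', maxResults=False)

    for issue in old_issues:
        client = extract_client(issue.fields.description or "")
        if client:
            # Assumes a plain text field; a select field would need {"value": client}.
            issue.update(fields={CLIENT_FIELD: client})

A script like this still leaves plenty of issues to fix by hand (the ones where the client never made it into the text at all), but it cuts the grunt work down to the parts that actually need a human.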

The moral of today's story is:
Sometimes, you just have to do the grunt work.

It's not fun, and sometimes it's more manual than I'd like, but your future self will thank you.

Thursday, November 5, 2009

Window of Opportunity

It's relatively easy to decide to change things. It's even fairly easy to decide what you're going to do differently, generally. In order to be successful, though, you also have to consider when you make the change.

Any change has a window of opportunity, a time period during which it is most likely to be effective.

For example, let's say you decide to change your release process. Instead of simply sending an email to operations letting them know that the release is ready, you're going to appoint a "development liaison" who will work with operations on getting the release into production. The goal of this change is to prevent unintentional misconfigurations (which you've had a problem with in the past). You could make this change right at the beginning of your development effort, but it wouldn't really buy you a lot - after all, you're not releasing yet, so you're not going to exercise your great new change. No, instead your window of opportunity is a bit before release.

As another example, let's say you're doing iterations and you're not quite perfect at it yet, so the end of an iteration is a bit... frantic. Don't introduce change when you're frantic - it'll only make you more frantic. Your window of opportunity is earlier in the iteration.

So describe your goal, describe your change, and then think of your window of opportunity. All those together will help you gain success.

Wednesday, November 4, 2009

State Your Purpose

Being a tester, I see a lot of tickets. Some tickets, unfortunately, hang around for a while, and tend to be worked on by multiple people. These wind up with the basic ticket writeup and a series of comments by different people. Particularly when the ticket is a difficult one, there are theories being tried and discarded.

Let's use an example:
We had an issue where access to the system, either for standard use (reading and writing data over mounts) or for diagnosis (logging in to the box), was slow. The system had several exported mounts, was performing replication, and was deployed in our lab. That's about all we knew going into it.

As we worked on the ticket, a lot of theories came up, ranging from load on the box to a kernel problem to a network issue (it turned out to be saturation of the switch when other systems using that same switch were engaging in network-intensive operations).

So the question becomes, how do we talk about this in the ticket? There are good and bad ways to write this up.

A Poorly Written Comment
The replication schedule is:
- 20:00 (average duration: 90 min)
- 07:00 (average duration: 45 min)
- 13:15 (average duration: 80 min)

A Well Written Comment
We noticed that the slowness described only occurs sometimes. Looking at what the box is doing at the time, it always seems to be replicating.

The replication schedule is:
- 20:00 (average duration: 90 min)
- 07:00 (average duration: 45 min)
- 13:15 (average duration: 80 min)

We've seen slowness at:
9/12 21:10
9/13 07:08
9/16 07:14

Earlier comments in this ticket indicate that load average is not the problem, but what else might replication be triggering? Early thoughts: increased threads, increased memory use, increased network use...

The Six Month Test
A good comment is one that makes sense six months later, after you've forgotten all the details. This means it needs to:
  • describe how it relates to the issue as a whole
  • describe what the reader is intended to do or take away from the comment
Just like your bugs, write your comments for posterity. Future you will thank you.

Tuesday, November 3, 2009

Paralysis

There are days when I walk into work and have a whole lot of different things that need doing, none of them short. A typical list would look like:
- reinstall an object store (1 hour)
- finalize a task list (45 min, and needs people)
- run a scanner utility against a test data set to gather a baseline (2 hours)
- write up how to do a big configuration I've been working on (3 hours)
- provide feedback on a document (1 hour or so)

And I don't want to start any of 'em because that means I'm not making progress on the others! This is a form of paralysis. Fortunately, it's mild.

There's only one way I know to get out of it, and that's to write down my task list for the day, pick one, and start. It doesn't matter which one I pick, as long as it's one single item.

In this case, I invoked my particular prioritization method:
  • first the stuff that's blocking other people
  • then the stuff I'm going to forget if I don't do
  • then the stuff that others are waiting for but not blocked by
  • then everything else.
In this case, I did the feedback on the document, followed by the task list (also needed by others). After that, I did the configuration writeup, and then started on the scanner utility (and then the day was over!).

Does this ever happen to you? How do you deal with the "too much to do to even get started" problem?

Monday, November 2, 2009

The Rest of the Product

We have a test plan, and it's great. It covers all the features, and all the workflows of the application. We've got stories, we've accepted them. We've written some automated tests.

Congratulations, you're now half done with the product.

The product is not useful until you can actually use it in production. So now that you've built the darn thing, it's time to think about:
  • What's production going to look like? How many machines? What configuration?
  • How are you going to get the software into production?
  • How about the config info? How're you going to go from dev to test to prod? (Hope it's not hard-coded into the war file or rpm somewhere! See the sketch after this list.)
  • Okay, once you've got it into production, how're you going to start it?
  • Come to think of it, how're you going to stop it?
  • One day you're going to have to maintain this thing. Got a plan for that? Is down time okay? Can you do some rolling upgrades or maintenance?
  • How are you going to see what's going on? Got logs? Got a way to get logs back to dev for analysis?
  • How will you know it's running? Any monitoring? Notifications?
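
On the config question above: the usual trick is to keep the environment-specific values outside the build artifact entirely. A minimal sketch, assuming one config file per environment (the paths and keys are made up):

    import json
    import os

    # Sketch only: the filenames and keys are invented.
    env = os.environ.get("APP_ENV", "dev")        # dev / test / prod
    with open(f"/etc/myapp/{env}.json") as f:     # shipped separately from the war/rpm
        config = json.load(f)

    db_url = config["database_url"]

Same artifact in every environment; only the file sitting next to it changes.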

There are a lot of questions to answer once you've done the basic implementation. Don't forget to include those when you're thinking about testing it, too.