Thursday, August 28, 2008

Seeing Everything That Matters

Imagine you could see absolutely everything about your system.

Every message.
Every button.
Every process.
Every state.

How cool would that be? 

Except that it wouldn't be cool at all. We'd never get anything done. 

I've written several times about the dangers of NOT seeing things - how when you look at a screen certain things just don't jump out, and how testing in the same way can cause you to only see what you expect to see. There's an equal danger in seeing everything. 

Even the simplest commercial system is not small. Imagine all the things you can see:
  • System logs
  • OS logs
  • Files in and files out
  • Network traffic - to and from your box and broadcast traffic
  • User actions and reactions
  • Mouse movements

And now we're overwhelmed - too much information. This is actually a standard human state, not one reserved for us hardworking engineers!

When asked to remember cars and faces, car lovers couldn't do both effectively. When asked to make a choice between 6 kinds of jam, consumers picked one. When asked to make a choice between 24 kinds of jam, consumers simply walked away. Full study is here (pdf).

Human perception is about filtering. The difference between an effective engineer and an ineffective engineer is the ability to filter appropriately. Eliminate what's unimportant and see what's important. 

Our goal isn't to see everything. Our goal is to see everything that matters.

Of course, this is far easier said than done. So how do we build our filters and learn what's important and what's not?
  • Look through old issues. What were the clues that led to the real answer?
  • Spend time with the system. Getting a sense of what the system does when it's behaving normally will help you understand what is an anomaly and what's not. Look in a different area every time - logs one day, process table the next, GUI the day after.
  • Mix it up. Work in different areas of the product so you don't become overly familiar with any one area and start to miss things. You want to keep a sense of slight surprise about each area of the product.
  • Explain what's going on. Pair with someone and walk them through what's happening to the system. Make sure you can explain the entire workflow, start to finish.
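
One way to practice the "spend time with the system" habit is to let a script remember what normal looks like, so that only the unfamiliar lines reach your eyes. The sketch below is just an illustration, not a recommendation of any particular tool: the function names (`template`, `learn_baseline`, `surprising`) and the sample log lines are made up, and it assumes that masking out numbers is enough to group similar log lines together, which won't hold for every log format.

```python
import re
from collections import Counter

# Assumption: lines that differ only in their numbers (timestamps, ids,
# counts) are "the same kind of line". Real logs may need smarter masking.
_NUMBER = re.compile(r"\d+")


def template(line):
    """Collapse a log line to its shape by masking every run of digits."""
    return _NUMBER.sub("<n>", line.strip())


def learn_baseline(normal_lines):
    """Count how often each line shape appears in logs from a normal run."""
    return Counter(template(line) for line in normal_lines if line.strip())


def surprising(lines, baseline, min_seen=1):
    """Return the lines whose shape appeared fewer than min_seen times
    in the baseline, i.e. the ones worth a human look."""
    return [
        line
        for line in lines
        if line.strip() and baseline[template(line)] < min_seen
    ]


# Hypothetical sample data, invented for this example.
normal_run = [
    "12:00:01 worker 3 started",
    "12:00:02 worker 4 started",
    "12:00:05 heartbeat ok",
]
todays_log = [
    "09:15:00 worker 7 started",
    "09:15:03 heartbeat ok",
    "09:15:04 disk /dev/sda1 at 97% capacity",
]

baseline = learn_baseline(normal_run)
for line in surprising(todays_log, baseline):
    print(line)
```

With the sample data above, only the disk line gets printed; the worker and heartbeat lines match shapes the baseline has already seen. The filter is only as good as the "normal" logs you feed it, which is another reason to keep looking at the raw system yourself now and then.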

It takes time to learn what matters and what doesn't. When you've done it, though, you'll find that you can track down (and fix!) issues more quickly, and your "gut" will get really good. So take the time and learn to see your system with good filters.
