Tuesday, September 11, 2012

Status Message Messaging

For those of us who use hosted services, status pages are an essential communication point. They're a way to indicate service health ("it's not us, it's you") and, when there are problems, they provide a forum for disseminating updates quickly and loudly. The "everything is okay" update is pretty easy. The "stuff's broken" update is a much more delicate thing to write. It has to reflect your relationship with your users, but also reflect the gravity of the situation.

Here's part of a status update Beanstalk published this morning:

"Sorry for the continued problems and ruining your morning." 

Oh man. That's an update you don't want to have to publish. To provide some context, we'll just say that Beanstalk has had a bad few days. Beanstalk provides Git and Subversion hosting; that makes them a high-volume service. Pushing, pulling, committing, checking in/out, etc. happen very frequently and, well, software teams on deadline are not known for being nice when their tools get in the way. The last few days have been hard on Beanstalk: they got hit by the GoDaddy attack, then had server load trouble traced to an internal problem, and now are again having difficulties with the file servers that host repos.

And you can see it in that status update. "[R]uining your morning" is the phrasing of someone who is seriously exasperated. That update does some things well: it shows they understand the problem is severe, and it reflects the increasing frustration users are likely experiencing. The tone is escalating, just as users' tempers probably are. However, it goes too far for my taste. It reeks of frustration on the part of whoever wrote the update, and that's clearly not a good head space for actually solving the problem. It also implies a sense of fatalism. That update was at 9:23am; my morning might still be salvaged if they can get the system back up relatively quickly. Don't give up, guys!

There's an art to writing the status update when things are going poorly. When I'm working with a team fixing a major problem, I'll dedicate someone to doing just that. They sit in the middle of the war room (or chat session or conference call) and do nothing but manage status communications. Let the people fixing the problem focus on fixing the problem. Hand off status to someone who will maintain a level head and write a good status update, considering:

  • Frequency of updates. Post at least every hour for most issues, and again whenever something changes. Silence does not inspire confidence.
  • Location of updates. The more channels, the better. Use Twitter, the status page, email or phone calls for your most important customers, and any other tools at your disposal.
  • Tone of updates. This needs to match your general tone of communication (see the Twitter fail whale - still cute and fun, even in error!) but also show that you know your customers are getting frustrated.
  • Inspiring confidence. Providing enough information to look like you are getting a grip on the problem and will get it fixed is important. Providing a proper postmortem also helps inspire confidence.
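
If the person managing status wants a reminder rather than a gut feeling, the frequency rule above is simple enough to sketch in a few lines of Python. This is only an illustration: the function name `update_due` and the one-hour `MAX_SILENCE` threshold are my own assumptions based on the "at least every hour" guideline, not anything Beanstalk or any status page tool actually uses.

```python
from datetime import datetime, timedelta

# Assumed cadence, taken from the "at least every hour" guideline above.
MAX_SILENCE = timedelta(hours=1)


def update_due(last_update: datetime, now: datetime, something_changed: bool) -> bool:
    """Return True when a new status update should be published.

    An update is due if the situation has changed, or if the time since
    the last update has reached the maximum allowed silence.
    """
    return something_changed or (now - last_update) >= MAX_SILENCE
```

A timer or chat bot could call something like this every few minutes and nudge the status owner when it returns True. Writing the update itself, with the right tone, is still a job for a person.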

