Monday, November 23, 2009

Fail Safe

Like many people, we have scripts that do various tasks. They update libraries on lab machines, check and clean out temporary directories, archive old test results, and myriad other things.

There's one thing that all our scripts must have before we'll begin using them:

A fail safe.

That's right. The problem with these kinds of background cleanup mechanisms is that when they go bad, they go really, really bad. Updater installs a package that leaves the machine inaccessible over the network? Multiply that by several hundred and you have a real problem. Test archiver fills up its target? Continuing to send it requests gets you nothing but network traffic.

Having learned that lesson the hard way, we require every utility we use to have a simple fail safe. Each one already checks the result of its operations to make sure nothing went wrong. If an operation fails, or fails more than n times, the utility shuts itself down. This prevents all kinds of runaway code problems, and things run a lot more smoothly.
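
Here is a minimal sketch of what such a fail safe might look like in Python. It is illustrative only, not our production code: the names FailSafe, FailSafeTripped, update_machines, and max_failures are all made up for this example, and the sketch assumes an operation signals failure by raising an exception.

```python
class FailSafeTripped(Exception):
    """Raised when the failure limit is exceeded and the script must stop."""


class FailSafe:
    """Counts failed operations and halts once more than max_failures occur."""

    def __init__(self, max_failures=0):
        # Hypothetical parameter: max_failures=0 means the very first
        # failure shuts the script down.
        self.max_failures = max_failures
        self.failures = 0

    @property
    def tripped(self):
        return self.failures > self.max_failures

    def run(self, operation, *args):
        """Run one operation.

        Returns True on success and False on a tolerated failure.
        Raises FailSafeTripped once the failure limit is exceeded.
        """
        if self.tripped:
            raise FailSafeTripped("fail safe already tripped")
        try:
            operation(*args)
        except Exception as exc:
            self.failures += 1
            if self.tripped:
                raise FailSafeTripped(
                    f"{self.failures} failures, limit is {self.max_failures}"
                ) from exc
            return False
        return True


def update_machines(machines, update, max_failures=2):
    """Apply update() to each machine, stopping early if the fail safe trips."""
    guard = FailSafe(max_failures)
    updated = []
    for machine in machines:
        try:
            if guard.run(update, machine):
                updated.append(machine)
        except FailSafeTripped:
            # Shut down rather than risk the remaining machines.
            break
    return updated
```

The point of the design is that the script stops touching machines the moment the limit is exceeded, so a bad update reaches a handful of machines instead of several hundred.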

You'll need utilities and scripts to perform cleanup and maintenance tasks. Go ahead and write them. Just be sure that you put in a fail safe so they don't go from helpful to nasty!
