So, we have a number of things on tiered storage. For example:
- test logs are stored on the machine that runs the test for 5 days, then deleted (gasp!)
- logs for tests that failed are stored on a network server (think NAS), then backed up to archive storage (our own product)
- logs for issues that have happened at clients are stored on a network server, then backed up to archive storage
- generated test data, syslogs, and other non-test artifacts are stored on a network server, then deleted
In many of these cases, we have scripts that actually do the work: they monitor how full the file systems are and then back things up. Basically, they check every half hour to see if the primary store is more than 90% full. If it is, we email the group a notification and then start cleaning it up.
When we originally wrote the cleanup script, we wrote it to loop through and clean things out until it got just below 90% full. As a result, we were getting notified multiple times a day: it would clean to just under 90%, then as soon as someone wrote to the file system it would go back over 90%, we'd get notified, and the whole cycle would start over.
Here's the small trick:
Notify at 90%. Clean to 80%.
We changed the script and notifications dropped from two or three times a day to once a week or so. That's a lot less email.
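In case it helps to see the shape of it, here's a minimal sketch of that kind of check in Python. The path, thresholds, check interval, notifier, and the "delete oldest first" policy are all assumptions for illustration (our real scripts back things up before removing anything), but the hysteresis idea is the same: trigger at 90%, keep cleaning until you're down to 80%.

```python
import shutil
import time
from pathlib import Path

# Hypothetical values -- your store path, thresholds, and notifier will differ.
STORE = Path("/mnt/primary_store")
NOTIFY_AT = 0.90          # send the email when usage crosses this
CLEAN_TO = 0.80           # keep cleaning until usage drops below this
CHECK_INTERVAL = 30 * 60  # seconds between checks

def usage_fraction(path: Path) -> float:
    """Fraction of the file system at `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    return used / total

def notify(message: str) -> None:
    """Stand-in for the group email; wire this up to your mailer."""
    print(message)

def oldest_artifacts(path: Path):
    """Files under `path`, oldest first -- the ones to remove first."""
    files = (p for p in path.rglob("*") if p.is_file())
    return sorted(files, key=lambda p: p.stat().st_mtime)

def clean(path: Path) -> None:
    """Remove the oldest artifacts until usage falls below CLEAN_TO."""
    for artifact in oldest_artifacts(path):
        if usage_fraction(path) < CLEAN_TO:
            break
        artifact.unlink()

if __name__ == "__main__":
    while True:
        if usage_fraction(STORE) > NOTIFY_AT:
            notify(f"{STORE} is over {NOTIFY_AT:.0%} full; cleaning down to {CLEAN_TO:.0%}")
            clean(STORE)
        time.sleep(CHECK_INTERVAL)
```

The gap between the two thresholds is what buys the quiet: the file system has to fill back up by a full 10% before anyone gets another email.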
Small change, big effect.
What small change can you make today that just might have a big effect?