Wednesday, December 5, 2007

In the Zone

I work for a storage company, and we spend a lot of time writing data onto systems for testing, so I'm always in search of new tools.

Running along the spectrum of sophistication, we've done the following:
  • Hand-copy files. This one is really easy and gives you static data. Just don't expect it to scale, and you run out of files to copy very quickly. Oh, and don't copy that private employee information accidentally! If you really want to do this, just open up a bash terminal (or ksh or tcsh or whatever) and type:
cp my_source_dir/* my_dest_dir/
  • Copy junk data. Enter the dd command (we're on Linux). It is better than copying files, because you can do it as much as you like, you get unique files, and you can name the files. The problem is that you wind up with a lot of files of the same size: you can't easily vary file size, and you can't measure the performance of your reads and writes without a lot more scripting. It is useful as a quick and dirty data generator, however. Just do the following (change your loop to get the desired number of files):
i=1
until [ $i -gt 10 ]; do
dd if=/dev/random of=my_dest_dir/file$i bs=8K count=1
let i=i+1
done
  • Use IOZone. This tool is actually intended to measure the performance of a disk or a filesystem (basically block and file reads, writes, etc.). It will write files of varying sizes to disk, read them back, rewrite them, and so on. It also has options to leave the data on disk, so you can use it to fill drives. On top of that, it automatically calculates the performance of each operation and outputs it in an Excel-compatible (space-delimited) format. Try it out!
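If IOZone is more than you need, the dd approach from the second item can be stretched a little to vary file sizes. Here is a minimal sketch, assuming bash and /dev/urandom as the data source. The function name make_files, its arguments, and the 8K to 32K size cycle are all made up for illustration, so adjust to taste:

```shell
#!/usr/bin/env bash
# Sketch only: make_files and its size pattern are hypothetical, not from any tool.
# Usage: make_files DEST_DIR NUMBER_OF_FILES

make_files() {
    local dest=$1 count=$2
    local i size_kb
    mkdir -p "$dest"
    for ((i = 1; i <= count; i++)); do
        # Cycle file sizes through 8K, 16K, 24K, 32K (an arbitrary pattern).
        size_kb=$(( ((i - 1) % 4 + 1) * 8 ))
        # Write size_kb blocks of 1024 bytes each; silence dd's transfer summary.
        dd if=/dev/urandom of="$dest/file$i" bs=1024 count="$size_kb" 2>/dev/null
    done
}
```

Running `time make_files my_dest_dir 10` in bash gives a rough wall-clock figure for the writes, which is no substitute for a real benchmark but is often enough for a quick check.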
Good luck, and happy data creation!

1 comment:

  1. That ought be /dev/urandom, or you'll be waiting a very long time!