Tuesday, July 20, 2010

Find the Oracle

In testing, an oracle is something we look to that can define the "should" of an application. The oracle (or oracles) tells us what an application should do, how the UI ought to look, how the application ought to perform, etc.

Oracles can take many forms:
  • requirements specifications
  • screen mockups
  • other (competing or complementary) products
  • style guides (yours or someone else's)
  • similar or related functions in the same application
  • a previous or alternate version of the application being tested (think porting projects)
When you start to test your application, one of your first jobs is to find the oracle(s). They will guide the tests you write, the bugs you identify, and the behavior you expect.

Sometimes finding the oracle is easy; for example, if you're handed a requirements specification, then you've got at least one of your oracles. Other times, you'll have to get creative.

For example, a friend recently approached me with a problem:

"We're processing files, and we discovered that on this file server with 573 million files, we detected 19 million ZIP files. Is that reasonable?"

This is a problem in search of an oracle. It's infeasible at that scale to hand-check all of the files. We're not yet sure we can trust our detection software, though; after all, that's what we're testing. So what's our oracle?

Seems hopeless. "Hopeless" is just another word for "needs creativity". We can do this.

Our problem revolves around the percentage of files on a file server that are of a certain type. I can think of two ways to go about this:
1. run one or more other file type detectors and see how much they agree with my results
2. find other file servers (or reports about other file servers) with a typical file type breakdown. Academic papers, storage vendors, and OS vendors are great sources for this kind of information.
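
Both ideas can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the detection software my friend is testing: the two detectors below (one that trusts the file extension, one that uses the standard library's `zipfile.is_zipfile`) are hypothetical stand-ins, and `compare_detectors` and `zip_share` are names I made up for the example.

```python
import zipfile
from pathlib import Path


def detect_by_extension(path: Path) -> bool:
    """Hypothetical detector A: trust the file name."""
    return path.suffix.lower() == ".zip"


def detect_by_content(path: Path) -> bool:
    """Hypothetical detector B: let the standard library inspect the file."""
    return zipfile.is_zipfile(path)


def compare_detectors(paths):
    """Return the agreement rate of the two detectors and every disagreement."""
    total = 0
    agree = 0
    disagreements = []
    for path in paths:
        total += 1
        a = detect_by_extension(path)
        b = detect_by_content(path)
        if a == b:
            agree += 1
        else:
            # These are the files worth a closer look by hand.
            disagreements.append((path, a, b))
    rate = agree / total if total else 1.0
    return rate, disagreements


def zip_share(zip_count: int, total_count: int) -> float:
    """Percentage of files reported as ZIP, for comparison with published breakdowns."""
    return 100.0 * zip_count / total_count


# The numbers from the question: 19 million ZIP files out of 573 million.
print(f"{zip_share(19_000_000, 573_000_000):.1f}% of files were detected as ZIP")
```

The disagreement list is the useful part: it is usually short enough to inspect by hand, and each entry points at a place where at least one detector is wrong. The percentage, roughly 3.3% here, is the figure to hold up against whatever file type breakdowns you can find for other servers.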

What would you use for an oracle in this situation?


  1. I like your definition of Oracle, Catherine.

    Have you come across Michael Bolton/James Bach HICCUPPS? (http://www.developsense.com/articles/2005-01-TestingWithoutAMap.pdf)

    With regard to your question about which Oracle I would use: I think one of the most important would be the user.

    Then you would need to look at the design of the system - how is a 'zip' file defined?

    What about rar, gzip, arc, arj, tar?

    Is it just zip files you are looking for or compressed files?

    Is it possible to save compressed (Zip?) files with an incorrect extension?

  2. When finding an Oracle, it is important to understand the context. What's missing from the description is why it is important to know how reasonable the number of zip files is.

    If it's a rhetorical question to satisfy someone's curiosity, finding an oracle shouldn't be a 300-hour task (of company time anyway...)

    If it's something that has audit, security or cost impacts, then it's a bit more complicated.

    Assuming that there is some very large impact that is going to be mitigated, I would add Bing/Google/Yahoo. The internet search engine is the de facto oracle for many people. While it isn't the ultimate source of the information, it is the first thing consulted.