Storage performance testing: Not all IOPS are created equal

Every once in a while I come across posts boasting astonishingly high IOPS numbers achieved with a relatively modest setup. Often the reason is that the benchmark ran a “maximum throughput” IO pattern, which says little about actual storage performance. Such patterns issue idealized, sequential IO (often with small IO sizes) to measure the theoretical maximum throughput of the storage subsystem. Unfortunately, this kind of IO pattern practically never occurs with real-world applications.

An IO stream hitting a storage device:

  • Is random to a certain degree
  • Has a certain read/write ratio
  • Can have variable IO sizes between 512 bytes and several MiB (though the upper end is hardly relevant in the real world; up to 128 KiB is normal)
  • Is stuffed into queues at various levels in the storage stack with various queue depths
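To make these four properties concrete, here is a minimal Python sketch of how such an IO stream could be described and generated. The names `IOPattern` and `generate_requests`, the 512-byte alignment and all example values are my own assumptions for illustration; this is not how any particular benchmark tool models its patterns.

```python
import random
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class IOPattern:
    """Hypothetical descriptor for the properties listed above."""
    random_pct: int            # 0 = fully sequential, 100 = fully random
    read_pct: int              # 100 = reads only, 0 = writes only
    io_sizes: Tuple[int, ...]  # candidate IO sizes in bytes
    queue_depth: int           # outstanding IOs a load generator would keep
                               # in flight (recorded here, not used below)

    def __post_init__(self) -> None:
        if not 0 <= self.random_pct <= 100 or not 0 <= self.read_pct <= 100:
            raise ValueError("percentages must be between 0 and 100")
        if not self.io_sizes or min(self.io_sizes) < 512:
            raise ValueError("IO sizes must be at least 512 bytes")
        if self.queue_depth < 1:
            raise ValueError("queue depth must be at least 1")


def generate_requests(
    pattern: IOPattern, device_bytes: int, count: int, seed: int = 0
) -> Iterator[Tuple[str, int, int]]:
    """Yield (operation, offset, size) tuples that follow the pattern.

    Assumes device_bytes is at least as large as the biggest IO size.
    """
    rng = random.Random(seed)
    next_sequential = 0
    for _ in range(count):
        size = rng.choice(pattern.io_sizes)
        if rng.randrange(100) < pattern.random_pct:
            # Random IO: jump to an arbitrary 512-byte-aligned offset.
            offset = rng.randrange((device_bytes - size) // 512 + 1) * 512
        else:
            # Sequential IO: continue right after the previous request,
            # wrapping to the start when the end of the device is reached.
            offset = next_sequential
            if offset + size > device_bytes:
                offset = 0
        op = "read" if rng.randrange(100) < pattern.read_pct else "write"
        next_sequential = offset + size
        yield op, offset, size


# Illustrative values only: an idealized pattern and a mixed one.
MAX_THROUGHPUT = IOPattern(random_pct=0, read_pct=100,
                           io_sizes=(512,), queue_depth=32)
MIXED = IOPattern(random_pct=100, read_pct=30,
                  io_sizes=(4096, 8192, 16384, 32768), queue_depth=16)
```

Calling `list(generate_requests(MAX_THROUGHPUT, 1 << 20, 4))` produces four back-to-back 512-byte reads starting at offset 0, which is exactly the kind of stream a real application almost never issues.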

All these factors have a huge influence on what everyone casually lumps together as “IO Operations per Second”, or IOPS. The basic rule of thumb is:
The smaller the IO size and the more sequential the workload, the higher the IOPS number will be.
It should be noted that the randomness of IO has little effect on flash-based storage like SSDs, but impacts traditional spinning disks to a very large degree. Also, for a single (spinning) disk, write IOs are not necessarily more expensive than read IOs, considering that the local cache on every disk is a pure read cache (and don’t you dare enable write caching on the disks themselves). Of course this is where RAID penalties come in, but in this article I’m trying to focus on IO itself, regardless of physical storage properties.
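Both halves of the rule of thumb can be backed with simple arithmetic. The sketch below uses two textbook relations: throughput equals IOPS times IO size, and a random IO on a spinning disk costs roughly one average seek plus half a rotation. The function names and the example figures (100 MiB/s of bandwidth, a 7200 rpm disk with 8.5 ms average seek time) are assumptions for illustration, not measurements.

```python
def iops_at_bandwidth(bandwidth_bytes_per_s: float, io_size_bytes: int) -> float:
    """IOPS needed to fill a given bandwidth: throughput = IOPS * IO size."""
    return bandwidth_bytes_per_s / io_size_bytes


def spinning_disk_random_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough random-IO ceiling for one spinning disk.

    Each random IO pays an average seek plus, on average, half a rotation.
    Transfer time, queueing and caching are ignored.
    """
    half_rotation_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + half_rotation_ms)


if __name__ == "__main__":
    MIB = 1024 * 1024
    bandwidth = 100 * MIB  # assumed sequential bandwidth, purely illustrative
    for size in (512, 4096, 32 * 1024, 128 * 1024):
        print(f"{size:>7} B: {iops_at_bandwidth(bandwidth, size):>9.0f} IOPS")
    # Assumed drive: 7200 rpm with 8.5 ms average seek time.
    print(f" random: {spinning_disk_random_iops(7200, 8.5):>9.0f} IOPS")
```

Under these assumptions the same 100 MiB/s works out to 204800 IOPS at 512 bytes but only 800 IOPS at 128 KiB, while the fully random case on that single disk lands at roughly 79 IOPS. An IOPS number without the IO size and access pattern attached tells you very little.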

When vendors claim their system or disk delivers (up to, hehe) [insert arbitrary number here] IOPS, they usually refer to a workload that is completely random, consists mainly of writes and uses an IO block size of 4 to 32 KiB.
If you conduct your own storage IO testing, you should use similar IO patterns if you want to compare real-world storage performance. Otherwise you may end up with unrealistically huge numbers, as in the maximum throughput cases.

A great resource for comparing storage IO performance is this thread on the VMware community forums, where hundreds of results using a common IOmeter configuration are posted.
You can get the IOmeter config file here and also paste your resulting CSV files there to have the numbers summarized.

There is also the VMware I/O Analyzer virtual appliance, which provides various predefined IO pattern configs for applications like Exchange or SQL Server, but the last time I tested it, it behaved a bit “funky”. (It runs the Windows version of IOmeter on Linux via Wine. Urks.)

Last but not least, a tip if you want to analyze just what kind of IO workload your specific VM/application is issuing: vscsiStats is your friend and provides tremendously detailed information on this.
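vscsiStats reports its findings as histograms (IO length, seek distance, outstanding IOs, latency and so on). To illustrate what such a characterization boils down to, here is a small Python sketch that derives the read ratio, the degree of randomness and an IO size histogram from a list of requests. The trace format and the `summarize` function are hypothetical; the sketch neither parses vscsiStats output nor mimics its exact bucketing.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (operation, offset in bytes, size in bytes): a hypothetical trace format
Request = Tuple[str, int, int]


def summarize(trace: Iterable[Request]) -> Dict[str, object]:
    """Characterize a trace by read ratio, randomness and IO sizes."""
    sizes: Counter = Counter()
    total = reads = sequential = 0
    expected_offset = None
    for op, offset, size in trace:
        total += 1
        reads += op == "read"
        sizes[size] += 1
        # An IO counts as sequential when it starts exactly where the
        # previous one ended; everything else is treated as random.
        if expected_offset is not None and offset == expected_offset:
            sequential += 1
        expected_offset = offset + size
    if total == 0:
        raise ValueError("empty trace")
    # Randomness is measured over transitions between consecutive IOs.
    random_pct = 100 * (1 - sequential / (total - 1)) if total > 1 else 0.0
    return {
        "read_pct": 100 * reads / total,
        "random_pct": random_pct,
        "size_histogram": dict(sorted(sizes.items())),
    }


if __name__ == "__main__":
    example = [
        ("read", 0, 4096),
        ("read", 4096, 4096),       # sequential
        ("write", 1048576, 8192),   # jump
        ("read", 8192, 4096),       # jump
        ("write", 12288, 4096),     # sequential
    ]
    print(summarize(example))
```

For the example trace this yields 60% reads, 50% random transitions and a size histogram of four 4 KiB IOs and one 8 KiB IO. Numbers like these are what you would feed into a benchmark pattern if you want it to resemble your actual application.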

Further recommended reading: