This post is a 100% copy of a message that Tanel Poder sent to the Oracle-L mailing list in response to a thread about the performance of SSD. It’s not just a good answer to the question, it’s a wonderfully succinct insight into how to think about what you’re really testing and it displays the mind-set that should be adopted by everyone.
If you measure write performance on an idle Exadata machine without any other load going on, you are not comparing flash vs disk, you are comparing flash vs the battery-backed 512MB RAM cache in the “RAID” controllers within each storage cell!
This is how the “disk” that’s supposed to have a couple of milliseconds of average latency (it still rotates and needs to seek + calibrate to next track even in sequential writes) gives you sub-millisecond write latencies… it’s not the disk write, it’s the controller’s RAM write that gets acknowledged.
And now when you run a real workload on the machine (lots of random IOs on the disk and Smart Scans hammering them too), your disk writes won’t be always acknowledged by the controller RAM cache. When comparing *busy* flash disks to *busy* spinning disks vs. *idle* flash disks vs *idle* spinning disks (with non-dirty write cache) you will get different results.
So, I’m not arguing here that flash is somehow faster for sequential writes than a bunch of disks when talking about throughput. But if you care about latency (of your commits) you need to be aware of everything else that will be going on on these disks (and account for this in your benchmarks).
Without queueing time included, a busy flash device will “seek” where needed and perform the write in under a millisecond, a busy disk device in 6-10 milliseconds. So your commits will end up having to wait for longer (yes, your throughput will be ok due to the LGWR writing multiple transactions redo out in a single write, but this doesn’t change the fact that individual commit latency suffers).
This latency issue of course will be mitigated when you are using a decent storage array with enough (well-managed) write cache.
So I’d say there are the following things you can compare (and need to be aware of which hardware are you really benchmarking):
1) Flash storage
2) Disk storage without (write) cache
3) Disk storage with crappy (write) cache
4) Disk storage with lots of well-managed & isolated (write) cache
And the second thing to be aware of:
1) Are you the single user on an idle storage array
2) Are you just one of the many users in a heavily utilized (and randomly seeking) storage array
So, as usual, run a realistic workload and test it out yourself (if you have the hardware :)