I made a few comments recently on a post in the Oracle Forum that raised the issue of CPU usage and how busy a CPU can get.
Shortly afterwards my blog dashboard showed a couple of incoming references from a blog entry that Chen Shapira had made about my comments. Her blog had received a couple of follow-up comments (from Nuno Souto, among others) that were worth collating, and also prompted me to demonstrate how nasty things can get if you start to work very hard in a very small area of the database.
If you follow the blog trail, you’ll see that I’ve mentioned a query that jumped from 15 seconds of CPU usage to 45 seconds – purely because of the effect of spin_gets on latches.
If you want to try the same test on different platforms and different versions of Oracle the code to build the test case is shown below. (It’s based on a simple script I wrote many years ago to demonstrate how pointless it was to depend on the buffer cache hit ratio as a source of meanigful information – subsequently hi-jacked by Connor McDonald for his ‘pick a hit ratio’ procedure).
alter session set "_old_connect_by_enabled" = true; create table kill_cpu_&1(n primary key) organization index as select rownum n from all_objects where rownum <= 25 -- Adjust number (slightly) to suit ; pause Press return set timing on spool latch_test_&1 select count(*) X from kill_cpu_&1 connect by n > prior n start with n = 1 ; set timing off spool off
A couple of notes:
The code expects an input parameter; uses that parameter as part of the table name it creates and queries, and as part of the name of an output file. This means you can run multiple copies of the job against the ‘same’ table, although each copy will drop and recreate the table created by the previous copy.
To ensure that no query starts until all the copies of the script have dropped and recreated their tables, I have a ‘synchronisation’ step in the code. The code above shows a simple “pause” – I’ll be posting a note some time in the future about the method I usually use.
Something else I’ve omitted from the sample is the code I use to take snapshots of various dynamic performance views to see where the work goes. Looking at v$session_event, v$mystat (your session’s slice of v$sesstat), and v$latch is very informative. (In 10g, watch out for variations in results that depend on whether or not your table happens to collide with the “simulator” latches).
Warning – on my laptop with a Turion dual-core CPU running at 1.6Ghz, the query (with rownum <= 25) ran for 91 seconds under 10.2.0.3. The run time will double for every row you add to the index organized table (IOT) that I’ve used to hold the driving data.
As a flavour of what you might see, here are a few results from my (dual-core) laptop; the first set compares a few versions of Oracle, the second set compares a few scenarios in the same version:
- 220.127.116.11: Single run 59.35 CPU seconds
- 18.104.22.168: Single run 64.56 CPU seconds
- 10.2.0.3: Single run 91.5 CPU seconds
Now trying to cause collisions (using 22.214.171.124, which was the worst case by a significant margin):
- Single run: 64.56 CPU seconds
- Two concurrent runs, using different tables: 70.5 CPU seconds each
- Two concurrent runs against the same table: 177.8 CPU seconds each, plus 7.7 seconds each on “latch free” waits.
The extra CPU time for the last test was basically the consequence of 33 million misses and spin_gets (out of 67M gets) for the cache buffers chains latch. The latch sleep time, by the way, came from 1,500 sleeps (each).
Moral: Although the test is deliberately designed to exaggerate the issue, latch sleeps aren’t necessarily your biggest performance problem when your system is running with a high degree of concurrency – misses and spin_gets can have a significant impact on your CPU usage: and the performance impact can be exacerbated if your basic CPU load is high.
Footnote: Here’s a link to an interesting little note from Hermant Chitale about CPU disappearing “outside” Oracle.
Update – June 2011
Following a question on OTN about creating waits for “Latch: cache buffers chains”, I’d like to point out that you can use the code, with minor changes, to demonstrate that problem. Simply create a single table (chop the _&1 of the end of the table name) once, and then run several sessions that query the same table. They will all be competing to access one block millions of times.
It’s quite instructive to see the way the number of “session logical reads” and latch collisions vary with versions of Oracle as the internal code changes. It might also be interesting to see how different platforms (or even just differences in CPU speeds) cause changes in results when using the same version of Oracle.