Oracle Scratchpad

May 10, 2008

CPU usage

Filed under: Performance, Troubleshooting — Jonathan Lewis @ 10:32 am UTC May 10,2008

I made a few comments recently on a post in the Oracle Forum that raised the issue of CPU usage and how busy a CPU can get.

Shortly afterwards my blog dashboard showed a couple of incoming references from a blog entry that Chen Shapira had made about my comments. Her blog had received a couple of follow-up comments (from Nuno Souto, among others) that were worth collating, and also prompted me to demonstrate how nasty things can get if you start to work very hard in a very small area of the database.

If you follow the blog trail, you’ll see that I’ve mentioned a query that jumped from 15 seconds of CPU usage to 45 seconds - purely because of the effect of spin_gets on latches.

If you want to try the same test on different platforms and different versions of Oracle the code to build the test case is shown below. (It’s based on a simple script I wrote many years ago to demonstrate how pointless it was to depend on the buffer cache hit ratio as a source of meanigful information - subsequently hi-jacked by Connor McDonald for his ‘pick a hit ratio’ procedure).


alter session set "_old_connect_by_enabled" = true;

create table kill_cpu_&1(n primary key)
organization index
as
select
	rownum n
from
	all_objects
where
	rownum <= 25  	--  Adjust number (slightly) to suit
;

pause Press return

set timing on
spool latch_test_&1

select
	count(*) X
from
	kill_cpu_&1
connect by
	n > prior n
start with
	n = 1
;

set timing off
spool off

A couple of notes:

The code expect an input parameter, and uses that parameter as part of the table name it creates and queries, and also in the name of an output file.  This means you can run multiple copies of the job against the ’same’ table, although each copy will drop and recreate the table created by the previous copy.

To ensure that no query starts until all the copies of the script have dropped and recreated their tables, I have a ’synchronisation’ step in the code.  The code above shows a simple “pause” - I’ll be posting a note some time in the future about the method I usually use.

Something else I’ve omitted from the sample is the code I use to take snapshots of various dynamic performance views to see where the work goes. Looking at v$session_event, v$mystat (your session’s slice of v$sesstat), and v$latch is very informative.  (In 10g, watch out for  variations in results that depend on whether or not your table happens to collide with the “simulator” latches).

Warning - on my laptop with a Turion dual-core CPU running at 1.6Ghz, the query (with rownum <= 25) ran for 91 seconds under 10.2.0.3. The run time will double for every row you add to the index organized table (IOT) that I’ve used to hold the driving data.

As a flavour of what you might see, here are a few results from my (dual-core) laptop; the first set compares a few versions of Oracle, the second set compares a few scenarios in the same version:

  • 8.1.7.4: Single run 59.35 CPU seconds
  • 9.2.0.8: Single run 64.56 CPU seconds
  • 10.2.0.3: Single run 91.5 CPU seconds

Now trying to cause collisions (using 9.2.0.8, which was the worst case by a significant margin):

  • Single run: 64.56 CPU seconds
  • Two concurrent runs, using different tables: 70.5 CPU seconds each
  • Two concurrent runs against the same table: 177.8 CPU seconds each, plus 7.7 seconds each on “latch free” waits.

The extra CPU time for the last test was basically the consequence of 33 million misses and spin_gets (out of 67M gets) for the cache buffers chains latch. The latch sleep time, by the way, came from 1,500 sleeps (each).

Moral: Although the test is deliberately designed to exaggerate the issue, latch sleeps aren’t necessarily your biggest performance problem when your system is running with a high degree of concurrency - misses and spin_gets can have a significant impact on your CPU usage: and the performance impact can be exacerbated if your basic CPU load is high.

 
Footnote: Here’s a link to an interesting little note from Hermant Chitale about CPU disappearing “outside” Oracle.

 

Blog at WordPress.com.