Oracle Scratchpad

May 26, 2009

CPU used

Filed under: Statspack,Troubleshooting — Jonathan Lewis @ 11:31 am BST May 26,2009

[Further Reading on Statspack]

From time to time users of Statspack on the newer versions of Oracle are surprised to see the “CPU Time” in the “Top N Timed Events” section of the report looking very different from the “BUSY_TIME” that appears in the “OS Statistics” part of the report.

There are various reasons why the numbers can differ, but one of the reasons is simple and highly beneficial – prior to 10g Oracle usually updated time figures at the end of each database call; but from 10g there are some views where time is updated more regularly.

We can see this difference with a simple test (in this┬ácase using the session-based views views v$sesstat (old style) and v$sess_time_model (new style)) by executing a long-running, CPU-intensive query (such as the code from my “kill CPU” routine) in one session, then monitoring it from another.

After starting my killer query – I ran the following statement roughly once every second:

select
	sn.name, ss.value
from
	v$sesstat ss,
	v$statname sn
where
	sn.statistic# = ss.statistic#
and	sn.name = 'CPU used by this session'
and	ss.sid = &m_sid
union all
select
	stat_name , round(value/10000,0)
from
	V$sess_time_model
where
	sid = &m_sid
and	stat_name = 'DB CPU'
;

This produced a string of results, which I’ve trimmed down to show only the results where critical figures change:

NAME                           VALUE
------------------------- ----------
CPU used by this session        3456
DB CPU                          3452

NAME                           VALUE
------------------------- ----------
CPU used by this session        3456
DB CPU                          3988

NAME                           VALUE
------------------------- ----------
CPU used by this session        3456
DB CPU                          4588

NAME                           VALUE
------------------------- ----------
CPU used by this session        3456
DB CPU                          5187

NAME                           VALUE
------------------------- ----------
CPU used by this session        5739
DB CPU                          5735

Notice how the “DB CPU” from v$sess_time_model increases every six seconds, while the “CPU used by this session” from v$sesstat changes only at the end of the test when the killer query completes.

As a quick check – if your BUSY_TIME time is much larger than the CPU time in the Top N, it’s possible that you’ve had some CPU intensive SQL in the period that didn’t complete in the period. Conversely if your CPU Time is much larger than your BUSY_TIME then it’s possible that a CPU intensive query that ran for a long time in the previus period completed in this period.

(There are other reasons for the difference, of course, but it’s worth remembering that this one is a possibility).

[Further Reading on Statspack]

14 Comments »

  1. Another reason is that v$osstat reports the OS level system wide CPU usage. But Oracle’s v$sysstat and v$sys_time_model report CPU usage of current INSTANCE only. So if you have other stuff going on in the server, you’ll see discrepancy.

    Comment by Tanel Poder — May 26, 2009 @ 9:34 pm BST May 26,2009 | Reply

    • Tanel,

      I guess it would have been sensible to make it clear at the start that the OS CPU isn’t restricted to just the the single instance CPU and therefore ought to be greater than the instance CPU – but I forgot that this wasn’t something that everyone would be aware of. Thanks for highlighting the point.

      Of course, there’s a more significant point that people might expect the “DB CPU Time” from the “Time model” section to be similar to the “CPU Time”/”CPU used by this session” – and the difference in the moment of recording is more appropriate for that comparison.

      Comment by Jonathan Lewis — May 27, 2009 @ 8:01 pm BST May 27,2009 | Reply

  2. [...] the Oracle Scratchpad, Jonathan Lewis describes “CPU used,” which demonstrates how, “‘CPU Time’ in the ‘Top N Timed Events’ [...]

    Pingback by Log Buffer #148: a Carnival of the Vanities for DBAs | Pythian Group Blog — May 29, 2009 @ 4:47 pm BST May 29,2009 | Reply

  3. Hi Jonathan,

    I have a question nobody has answered yet.

    I have a 9.2.0.8 db on a Solaris 10 server. I review Statspack reports in a weekly basis, since last week, the CPU Time dropped to a 10% of what it used to be so, I got an increase on controlfile and log file waits… CPU Time dropped from 90% to 60%!

    What could be the reason for that?
    I know the workload from the application hasn’t changed, neither anything else on the Statspack reports.

    Thanks in advance for any tip!

    Comment by German Minicucci — June 25, 2009 @ 12:56 pm BST Jun 25,2009 | Reply

    • German,

      I can imagine a lot of people reading your note and thinking: “How can a drop from 90% to 60% be the same as dropping to one tenth of what it used to be”. I think I can guess what you’re looking at when you say this – but it’s not obvious. I can also guess why this might happen – but it’s not compatible with no change in the workload, application, or statspack output.

      If you want to email me a pair of one-hour reports from before and after the change that you think are good examples of the difference, I’ll take a quick look at them – and write a note about them if they demonstrate a useful principle.

      Comment by Jonathan Lewis — June 26, 2009 @ 7:41 am BST Jun 26,2009 | Reply

  4. Could an increase in the Controlfile and Logfile waits (IO?) be preventing the applications (whatever they are) from making as much use of CPU – i.e. acting as a botlteneck to processing?

    Comment by Nigel — June 25, 2009 @ 4:17 pm BST Jun 25,2009 | Reply

  5. Just to clarify, the CPU Time in seconds dropped to a 10% (from 100 seconds an hour to 10 seconds an hour).
    That impacted the Top 5 Wait Events causing the controlfile and log file wait events to increase in % but not in amount of waits, so, the CPU Time % of Total Elapsed Time dropped to 60%.

    Jonathan, I’ve just sent you an e-mail with Statspack reports so you can check them out.

    Could a decrease like that in the “CPU used by this session” statistic mean that the database is not being used?

    Comment by German Minicucci — June 29, 2009 @ 4:39 pm BST Jun 29,2009 | Reply

    • German,

      I’ve had a quick look at an early and a late report. First thought 100 CPU seconds in one hour is virtually idle anyway.

      But looking at where that CPU time came from (I checked SQL ordered by buffer gets as a first cross-check) most of it is coming from a couple of queries that look like monitoring queries (queries against v$lock and v$session, v$sql) and don’t look like application code.

      These disappear from the later statspack. So maybe the person who was checking on the database decided it wasn’t worth the effort.

      If you check the times you mention, the average wait time for each of those events hasn’t changed – it’s just that the reduction in CPU usage pushes their total time higher up the list and makes their percentage contribution higher.

      Comment by Jonathan Lewis — June 29, 2009 @ 7:34 pm BST Jun 29,2009 | Reply

      • Jonathan,

        I think you’re right! I’m going to investigate further on that!
        The explanation was simple then… A monitoring script was disabled and a lot of the CPU it was using got free, so now the instance is more idle than before.

        Thanks so much for your help!

        Comment by German Minicucci — June 29, 2009 @ 8:19 pm BST Jun 29,2009 | Reply

  6. Can I derive “DB time” (available from 10g) by Oracle9i statistics?”

    Comment by Roberto — October 2, 2009 @ 2:07 pm BST Oct 2,2009 | Reply

  7. Hi,
    If , in AWR, we have CPU TIME =36000s and total CPU capacity is 46. so Total CPU for 30 minutes elapsed time is 46*30*60=82800s.
    So % of CPU TIME taken by oracle in 30 minutes is 36000/82800*100=43.47%.
    Is this calculation right?
    also if % of busy time is 70% then is it right to say that 70-43.47=26.53% of CPU was taken by non oracle processes?

    Comment by Yasir — May 9, 2012 @ 8:32 am BST May 9,2012 | Reply

    • Yasir,

      Sorry about the long delay in replying.

      Your calculations look correct. However, to follow up the 70% question, I would also at the section on OS Statistics to cross-check the BUSY_TIME, and the split between SYS_TIME and USER_TIME – I like to ensure that different sources of information give consistent values before I jump to any conclusions.

      Comment by Jonathan Lewis — May 24, 2012 @ 10:07 am BST May 24,2012 | Reply

  8. Hi Jonathan,

    If I’m right, “CPU used by this session” is measured in tens of milliseconds and user calls under ten milliseconds are not added to this metric whereas “DB CPU” is measured in microseconds.

    Could that difference in accuracy explain why “DB CPU” could be significantly higher than “CPU used by this session” ?
    Would that mean that a lot of user calls complete in less than 10 millisecond in the following example ?

    SQL> select cpu1, cpu2, cpu2 - cpu1
      2  from (
      3    select round(value / 10000) cpu2
      4    from   v$sys_time_model
      5    where  stat_name = 'DB CPU'),
      6       (
      7    select value cpu1
      8    from   v$sysstat
      9    where name = 'CPU used by this session');
    
          CPU1       CPU2  CPU2-CPU1
    ---------- ---------- ----------
       4591465    5878762    1287297
    

    Thank you for your help

    Comment by Vincent — January 7, 2013 @ 1:47 pm BST Jan 7,2013 | Reply

    • Vincent,

      I think it’s impossible to say – and detailed interpretation may depend on Version and Platform.

      I would assume that Oracle used a rounding mechanism to convert whatever time it acuired for CPU, so that the boundary case would be 5,000 ms rather than 10,000, and we might hope that on average the errors summed to zero across all sessions. We might also question whether Oracle populated the session value for “CPU Used by this session” by accumulating figures like “how many hundredths of CPU seconds have I used since last time” or whether it echoed the o/s result for “CPU time current recorded against this process”. The former would tend to introduce a cumulative error that might explain your results.

      A couple of reference points:

      a) you could query v$sess_time_model and compare it with v$sesstat to see if there were particular sessions that showed the largest discrepancy. This might give you some further ideas on the cause.

      b) when I ran your query on the next 11.2.0.3 I used, the error was the other way – i.e. DB CPU was smaller (by about 5%) than CPU used. (I also had a few sessions which recorded CPU used, but showed zero DB CPU.

      Comment by Jonathan Lewis — January 12, 2013 @ 11:19 am BST Jan 12,2013 | Reply


RSS feed for comments on this post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,508 other followers