Oracle Scratchpad

October 1, 2008

Analysing Statspack (10)

Filed under: Performance,Statspack,Troubleshooting — Jonathan Lewis @ 6:48 pm BST Oct 1,2008

[Further Reading on Statspack]

The following question appeared some time ago on the Oracle Forum:


Event                       Waits   Time(s) Avg Wait(ms) % Total Call Time Wait Class
CPU time                              1,167                           86.9
db file sequential read   111,047       223            2              16.6 User I/O
log file sync              77,390       132            2               9.9 Commit
log file parallel write    78,060       124            2               9.2 System I/O
db file scattered read     79,233        10            0               0.7 User I/O

From AWR I am seeing a lot of CPU taking most call time %. Can anyone explain what this means? I have looked on Metalink and there are a number of docs regarding top events in general.

So what could we guess from this minimal slice of an AWR report?

Something has to be top of the list. In this case it’s CPU – but if this is a typical AWR install (which means hourly snapshots), it’s reported 1,167 seconds of CPU in one hour, which is about 30% of the CPU available (1,167 out of 3,600 seconds) if it’s running on a single-CPU system. (But we haven’t been told the length of the snapshot, or the number of CPUs, or whether this is supposed to represent a fully loaded production system or a lightly loaded development system.)
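
If you can get at the instance you can fill in both of those gaps. The following is just a sketch (assuming 10g or later, and the licence needed to query the AWR views): v$osstat reports the CPU count, and dba_hist_snapshot shows the real snapshot interval.

-- Sketch only: how many CPUs, and how long is the snapshot interval really?
select stat_name, value
from   v$osstat
where  stat_name in ('NUM_CPUS', 'NUM_CPU_CORES');

select snap_id, begin_interval_time, end_interval_time
from   dba_hist_snapshot
order by snap_id;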

Second is the single block read time – which is averaging 2 milliseconds per read – so it’s (mostly) not coming from disc, it’s coming out of memory (possibly a SAN cache). That might be pushing up CPU usage outside Oracle, and could be using memory that would be better placed inside Oracle; but it’s not giving any obvious threat indications.
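
As a cross-check on a live system you could read the same average straight from v$system_event. Again this is just a sketch, assuming a version recent enough to expose time_waited_micro:

-- Sketch: current average single-block read time, in milliseconds.
select event, total_waits,
       round(time_waited_micro / total_waits / 1000, 2) avg_ms
from   v$system_event
where  event = 'db file sequential read';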

Log file sync – and its driver “log file parallel write” – are also averaging 2 ms, so we aren’t overloading the discs with log file writes, and (most) users probably don’t even notice the pause on commit. But we can’t tell whether the counts are appropriate for the work being done – so we can’t say if this is good or bad.
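
One way to decide whether the count is appropriate is to compare the number of log file syncs with the number of commits: roughly one sync per commit is what you would expect. A sketch using the instance-wide statistics (in practice you would want the delta between two snapshots, not the absolute figures):

-- Sketch: compare these counts with the 'log file sync' wait count.
select name, value
from   v$sysstat
where  name in ('user commits', 'user rollbacks');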

The multiblock reads are blindingly fast – so they must be coming from a local file-system cache. But perhaps this is supposed to be an OLTP system, and that could be a lot of inefficient data accesses being used to collect a very small amount of data.

All in all, we might say that the hardware is not under stress – but it is possible that the application is so badly written that the work done per transaction is far higher than it should be. These stats say you might (for example) be doing a 128-block tablescan with one subsequent row lookup and update with commit as you work your way through a batch job. The hardware seems to be coping, but that doesn’t mean the application is healthy.

Without a context, a small collection of numbers is not helpful.

But if you want a couple of guesses:

  • it’s an OLTP system (77,000 log file syncs) and there are some SQL statements that need tuning to do less work for their result (79,000 db file scattered reads).
  • Fixing a few SQL statements will probably eliminate a lot of db file scattered reads and a fair bit of CPU (a sketch of one way to find the candidates follows this list).
  • At the same time, it looks like the machine is surviving on the back of a large file system cache – put some more memory into the buffer cache and the CPU will drop further as Oracle does fewer ‘physical reads’ to the file system.
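
Following up the second guess, here is a sketch of the sort of query that picks out the statements responsible for most of the multiblock reads. It uses v$sql, so it only sees what is currently in the library cache; with AWR installed you might query dba_hist_sqlstat across the same pair of snapshots instead.

-- Sketch: top ten statements in the library cache by disk reads.
select *
from   (
       select sql_id, disk_reads, buffer_gets, executions
       from   v$sql
       order by disk_reads desc
       )
where  rownum <= 10;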

Two simple questions you can always ask about performance figures are these: “Can I see any resource usage that seems unreasonable for the result achieved?” and “Where are the gaps in the information that might make me jump to the wrong conclusion?”

[Further Reading on Statspack]

Footnote: Any advice about reading AWR reports is almost always relevant when reading Statspack reports (and vice versa).

12 Comments

  1. Jonathan, this reminds me very much of a situation I saw just last week (of course, I had access to more information). In my case, excessive parsing seemed to be an additional factor in the high CPU; something else to look for here?

    Also, even if the buffer cache is increased there will still be “physical reads” if full scans are made, unless the tables are in “cache” mode.

    Comment by Stew Ashton — October 2, 2008 @ 7:07 pm BST Oct 2,2008

  2. Stew,

    I think it’s always a good idea, when looking at the headline figures, to start by creating a list of possible causes and then checking for evidence that dismisses or confirms your ideas.

    CPU could be a side effect of heavy parsing (both hard – i.e. optimisation – and soft). We can even see that we must have executed at least 77,000 data-modifying statements (or we wouldn’t be able to get 77,000 log file syncs).

    Of course, with the full report, we could just cross check a couple of relevant figures (parse time CPU, library cache latching, parse count(hard), session cursor cache hits etc.), but even when you do have the whole report it’s quite educational (and entertaining) to look at just the first page (load profile and waits) and try to guess what you’re going to see elsewhere.
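
    For reference, a sketch of the sort of cross-check I mean – the instance-wide parse figures, which you would normally examine as the delta between the two snapshots rather than as absolute values:

    -- Sketch: headline parse statistics at the instance level.
    select name, value
    from   v$sysstat
    where  name in ('parse time cpu', 'parse count (total)',
                    'parse count (hard)', 'session cursor cache hits');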

    Comment by Jonathan Lewis — October 3, 2008 @ 7:49 pm BST Oct 3,2008

  3. Jonathan,

    I was under the impression that log file syncs were also triggered by DBWn flushing dirty blocks. I haven’t checked on that behaviour since 8i, but it stands to reason: no block on disk should be allowed to contain changes protected by redo that has not yet been flushed (i.e., no block should “get ahead of” the redo stream on disk).

    So – if this is still correct – we may have executed a single data-modifying statement which, together with subsequent demand for free blocks, may have caused 77,000 blocks to be modified and then written to disk, triggering 77,000 log file syncs (in the background).

    I’d look at the background wait events before concluding that 77,000 log file syncs = 77,000 commits. But of course we don’t see that information either.
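
    A sketch of that check – looking at which sessions are actually clocking up the ‘log file sync’ waits; if they turn out to be background processes then the “77,000 syncs = 77,000 commits” inference breaks down:

    -- Sketch: who is waiting on 'log file sync'?
    select s.sid, s.program, se.total_waits, se.time_waited
    from   v$session_event se, v$session s
    where  s.sid = se.sid
    and    se.event = 'log file sync'
    order by se.total_waits desc;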

    Cheers!
    Flado

    Comment by Flado — October 4, 2008 @ 7:45 pm BST Oct 4,2008

  4. Flado,

    Good point – both the suggestion for possible cause and strategy for follow-up.

    The behaviour is still largely the same – the database writer can’t write a block if the log buffer still describes some changes that have been applied to it.

    The dbwr algorithm did change some time in 8, though – possibly 8.0 – so that the database writer didn’t have to wait for the log write to finish before carrying on writing other blocks.

    I haven’t checked recently, but I think that dbwr posts lgwr, without a sync, sets aside the problem block, then comes back to it when it’s finished dealing with the rest of the write list to see if lgwr has completed the write.

    I’ve copied your correction back into the first post and deleted the second post to minimise the risk for someone reading the incorrect version and not spotting the correction. Let me know if you want it changed.

    Comment by Jonathan Lewis — October 5, 2008 @ 7:20 pm BST Oct 5,2008

  5. Jonathan,

    you said: “At the same time, it looks like the machine is surviving on the back of a large file system cache – put some more memory into the buffer cache and the CPU will drop further as Oracle does fewer ‘physical reads’ to the file system.” In my experience %CPU will go up if more requests can be satisfied from buffer cache because the time for physical IO drops and Oracle spends relatively more time getting blocks from buffer cache etc. I am probably missing something here (for example “CPU” means “overall CPU” and not “Oracle CPU”). Can you clarify this? Thanks a lot!

    Kind regards

    Comment by Robert Klemme — October 14, 2008 @ 2:14 pm BST Oct 14,2008

  6. Robert,

    My comment was ambiguous – I was thinking of global effects.

    As you point out – if my session has to keep stopping to read from the file system, then my CPU usage is constrained. If my session can keep its data in the buffer cache, it doesn’t have to stop, so it can (probably) use more CPU in a given time interval. (See the final paragraphs on Statspack 9 for an example of what this can do).

    On the other hand, if my session has to read blocks from the file system before doing its “logical I/Os”, then some code has to kick a block off the LRU (possibly promoting a few before doing so), move a buffer to the LRU mid-point, detach the buffer from one hash chain and re-attach it to another, and move data from the file system to Oracle’s memory space. Getting rid of that reduces the total CPU requirement for the job.

    Comment by Jonathan Lewis — October 18, 2008 @ 7:30 am BST Oct 18,2008

  7. Hi,

    I have a small part (output) of an AWR report. Can anyone help me interpret this AWR output?

                   Snap Id  Snap Time           Sessions  Cursors/Session
    Begin Snap:       8193  17-Oct-08 16:00:54        90             62.7
    End Snap:         8194  17-Oct-08 16:01:26        89             63.4
    Elapsed:          0.52 (mins)
    DB Time:          1.63 (mins)

    With Regards,
    Raja.

    Comment by Raja — November 6, 2008 @ 9:47 am BST Nov 6,2008

  8. Hi Raja,
    The interval between the snapshots (32 seconds) is too small for the report to be of much use. Also, it would take no less than a book to explain an AWR report.
    Cheers,
    Flado

    Comment by Flado — November 7, 2008 @ 10:57 am BST Nov 7,2008

  9. Raja,

    This is not the appropriate forum for posting Statspack/AWR reports. I have deleted most of the content, but left the part to which Flado has responded. His observation about 30 seconds being too short is relevant – if you have a problem with this granularity you should probably be looking at specific sessions.

    You could try posting your report on the OTN Oracle database forum (see the “Oracle DB Forum” link at the right of the page). If you do, remember to use the {code} tags when posting your report so that it appears in a readable form using a fixed font.

    Comment by Jonathan Lewis — November 7, 2008 @ 2:13 pm BST Nov 7,2008

  10. Jonathan,
    thank you for posting. It really allows us to learn from your deep experience.

    I coded this statement:


    select *
    from   (
           select event
                 ,waits                       "Waits"
                 ,round(time / 1000000, 2)    "Wait Time (s)"
                 ,round(pct * 100, 2)         "Percent of Total"
                 ,waitclass                   "Wait Class"
                 ,round(time / waits / 1000)  "avg (ms)"
           from   (
                  select e.event_name                                      event
                        ,e.total_waits - nvl(b.total_waits, 0)             waits
                        ,e.time_waited_micro - nvl(b.time_waited_micro, 0) time
                        ,(e.time_waited_micro - nvl(b.time_waited_micro, 0)) /
                         (
                         select sum(e1.time_waited_micro - nvl(b1.time_waited_micro, 0))
                         from   dba_hist_system_event b1
                               ,dba_hist_system_event e1
                         where  b1.snap_id(+)         = b.snap_id
                         and    e1.snap_id            = e.snap_id
                         and    b1.dbid(+)            = b.dbid
                         and    e1.dbid               = e.dbid
                         and    b1.instance_number(+) = b.instance_number
                         and    e1.instance_number    = e.instance_number
                         and    b1.event_id(+)        = e1.event_id
                         and    e1.total_waits        > nvl(b1.total_waits, 0)
                         and    e1.wait_class         <> 'Idle'
                         )                                                 pct
                        ,e.wait_class                                      waitclass
                  from   dba_hist_system_event b
                        ,dba_hist_system_event e
                        ,(
                         with dhs as (
                              select *
                              from   dba_hist_snapshot
                              where  instance_number = &pinstnum
                         )
                         select max(dbid)                dbid
                               ,max(begin_interval_time) maxdata
                               ,max(snap_id)             da_snap
                         from   dhs
                         where  trunc(begin_interval_time, 'MI') =
                                to_date('&ada', 'dd-mm-yyyy hh24:mi')
                         ) s1
                  where  b.snap_id = s1.da_snap
                  and    e.snap_id = (
                              select min(snap_id)
                              from   dba_hist_snapshot
                              where  trunc(begin_interval_time, 'MI') >=
                                     to_date('&ada', 'dd-mm-yyyy hh24:mi')
                              and    instance_number = &pinstnum
                         )  -- &pendsnap
                  and    b.dbid               = s1.dbid
                  and    e.dbid               = s1.dbid
                  and    b.instance_number(+) = &pinstnum
                  and    e.instance_number    = &pinstnum
                  and    b.event_id(+)        = e.event_id
                  and    e.total_waits        > nvl(b.total_waits, 0)
                  and    e.wait_class         <> 'Idle'
                  order by time desc, waits desc
                  )
           )
    where  event like '%&evt%';

    With this I can extract average wait times for a specific and “interesting” period (&dad = from-date, &ada = to-date).
    If I set &evt to “%read%” or “%write%” I can check the average disk access time in milliseconds and see whether there is any slow disk access (for example, whether it exceeds a 5-10 ms threshold). Is that right? Can you comment on this please?
    Can you suggest any optimisation of this statement?

    Thanks a lot for your work!

    Comment by Sandro — December 19, 2008 @ 5:07 pm BST Dec 19,2008

    • Sandro,

      Thanks for the contribution – I’ve edited it to try and make it more readable within the limits of WordPress comments.

      I think you could make the code both simpler and more efficient by making more use of subquery factoring (the “With Subquery” approach you have used on the dba_hist_snapshot view). I note that you’ve accessed the dba_hist_system_event view four times to get a relatively simple report from it – you could make use of analytic functions rather than self-joins to do this; see, for example: http://www.jlcomp.demon.co.uk/statspack_trend.html

      Comment by Jonathan Lewis — December 21, 2008 @ 8:39 pm BST Dec 21,2008

  11. Hi Jonathan,
    thank you for posting.
    I have rewritten my previous statement using your tip:
    select event_name
          ,total_waits
          ,time_waited
          ,total_waits - prev_waits  snap_waits
          ,time_waited - prev_time   snap_waited
          ,round(
               (time_waited - prev_time) /
               decode((total_waits - prev_waits), 0, null, (total_waits - prev_waits))
           , 2)                      wait_average
          ,mind
          ,maxd
          ,deltat
    from   (
           select mind
                 ,maxd
                 ,(maxd - mind) deltat
                 ,snap_id
                 ,dbid
                 ,total_waits
                 ,event_name
                 ,lag(total_waits, 1)       over (partition by event_name order by event_name, snap_id) prev_waits
                 ,time_waited_micro         time_waited
                 ,lag(time_waited_micro, 1) over (partition by event_name order by event_name, snap_id) prev_time
           from   dba_hist_system_event
                 ,(
                  with dhs as (
                       select *
                       from   dba_hist_snapshot
                       where  instance_number = &inr
                       and    dbid = &dbi
                  )
                  select max(dbid)                dbi
                        ,max(instance_number)     inr
                        ,min(begin_interval_time) mind
                        ,max(end_interval_time)   maxd
                        ,min(snap_id)             from_snap
                        ,max(snap_id)             to_snap
                  from   dhs
                  where  begin_interval_time >= to_date('&fromdate', 'dd-mm-yyyy hh24:mi:ss')
                  and    begin_interval_time <= to_date('&todate', 'dd-mm-yyyy hh24:mi:ss')
                  ) s
           )
    where  event_name like '%&evt%'
    and    total_waits - prev_waits >= 0  -- dirty trick to skip instance restarts
    order by event_name, snap_id;

    But I think that my results are incorrect. I want to extract the average wait between the first snap (from_snap) and the last snap (to_snap) of my input date range (&fromdate and &todate), for events matching a partial event name. If I set &evt to “%read%” or “%write%” I can check the average disk access time in milliseconds and see whether there is any slow disk access (for example, whether it exceeds a 5-10 ms threshold).

    If my inputs are:
    - inr = 1
    - dbi =
    - fromdate = 22-12-2008 00:00:01
    - todate = 22-12-2008 23:59:59

    I receive this output:

    EVENT_NAME               TOTAL_WAITS TIME_WAITED SNAP_WAITS SNAP_WAITED WAIT_AVERAGE MIND                      MAXD                      DELTAT              DBID
    control file heartbeat             1     3999591          0        2671              22-DEC-08 01.22.00.000 PM 22-DEC-08 11.32.13.592 PM +000000000 10:10:13 1217438681
    db file parallel write          1217      792095        396     -128762      -325,16 22-DEC-08 01.22.00.000 PM 22-DEC-08 11.32.13.592 PM +000000000 10:10:13 1217438681
    db file scattered read          1513     8563709       1067     6870606      6439,18 22-DEC-08 01.22.00.000 PM 22-DEC-08 11.32.13.592 PM +000000000 10:10:13 1217438681
    db file sequential read        11060    66535913       5522    46396942       8402,2 22-DEC-08 01.22.00.000 PM 22-DEC-08 11.32.13.592 PM +000000000 10:10:13 1217438681
    log file parallel write         1464     2623153        321     1837649      5724,76 22-DEC-08 01.22.00.000 PM 22-DEC-08 11.32.13.592 PM +000000000 10:10:13 1217438681
    log file sequential read          54     1791689          4       23665      5916,25 22-DEC-08 01.22.00.000 PM 22-DEC-08 11.32.13.592 PM +000000000 10:10:13 1217438681

    Very strange results.
    What do you think?

    Thanks!

    Comment by Sandro — December 22, 2008 @ 11:23 pm BST Dec 22,2008

