Oracle Scratchpad

September 20, 2007

Hit Ratios (3)

Filed under: Ratios,Statspack,Troubleshooting — Jonathan Lewis @ 11:56 am BST Sep 20,2007

 A little while ago I wrote some notes on why monitoring the buffer cache hit ratio (BCHR) was a pointless exercise. I’ve recently come across a lovely example that demonstrates very clearly the point that Connor McDonald was trying to make in his ‘pick a hit ratio’ script.

After starting up Statspack on a system which had “a performance problem in the overnight runs” I picked up the following from a one-hour snapshot during the critical time-period.


Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:  100.00       Redo NoWait %:    100.00
            Buffer  Hit   %:   99.40    In-memory Sort %:     99.97                


Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                        5,223    26.59
db file sequential read                         1,761,467       4,791    24.39
db file scattered read                            169,915       3,427    17.45               


                                                           Free    Write  Buffer
     Number of Cache      Buffer    Physical   Physical  Buffer Complete    Busy
P      Buffers Hit %        Gets       Reads     Writes   Waits    Waits   Waits
--- ---------- ----- ----------- ----------- ---------- ------- --------  ------
D      100,000  99.4 757,626,201   4,570,027  1,062,747       0       67   6,514
          -------------------------------------------------------------                

 
                                                    CPU      Elapsd
  Buffer Gets    Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
    270,710,940          141    1,919,935.7   35.7   851.67    951.05 3004918185
{a nasty little procedure call}

Notice how “nice” the buffer cache hit ratio is.

Notice, however, that 35% of the logical I/O could be attributed to just one pl/sql procedure that (I later learned) had once been quite efficient. It turned out that the procedure was redundant and should not have been running.

So pretend that code wasn’t running, and subtract 270 million from the logical I/O count: the adjusted buffer cache hit ratio would be 99.1% rather than 99.4%. 

This example highlights the flaw in the argument “we monitor the ratio over time and only look for problems if it changes significantly”.  How much of a change do you want to see before you start looking for a problem? 

Is 0.3% enough to make you worry? If not, why not?  (It happens that a drop in the ratio actually saved 851 CPU seconds – but at some point in the past the opposite effect would have taken place – would you have looked for a problem because the ratio had gone up by 0.3%) 

We come back to the same conclusion that we saw in Hit Ratios (2). Sometimes you will get false positives (and waste your time), sometimes you will not take action because a change in a ratio is insignifcant despite there being a significant change in the underlying events. Monitoring the underlying figures gives you some information, monitor (just) the ratio and you’re likely to hide the information.

Footnote: if you do a little arithmetic with the file wait times shown, you’ll get an average of 2.7 milliseconds for the single block reads and 20.1 milliseconds for the multi-block reads. So there’s probably a fair bit of help coming from the file system cache on the single-block reads, and some read-ahead help from the SAN on the multi-block reads. The figures are hinting at a problem with inefficient SQL rather than an inherent problem with the hardware or configuration.

[Further reading on ratios]

8 Comments »

  1. Excellent!

    Comment by Mirjana — September 21, 2007 @ 10:00 am BST Sep 21,2007 | Reply

  2. @ Jonathan:

    “A little while ago I wrote some notes on why monitoring the buffer cache hit ratio (BCHR) was a pointless exercise. ”

    But in your CBO Fundamentals book (P. 25), you say:

    “… plotting trend lines from regular snapshots can give you important clues that something is changing, or that a performance anomaly occurred at some point in time.”

    What am I missing here?

    Comment by Nigel West — January 16, 2009 @ 10:30 am BST Jan 16,2009 | Reply

    • Nigel,

      It’s one of those comments where I should have left it out completely or written half a dozen pages explaining it. I make the comment in the 2nd edition volume of the book.

      Comment by Jonathan Lewis — January 30, 2009 @ 5:41 pm BST Jan 30,2009 | Reply

  3. Jonathan,

    Any chance of the 2nd edition being made available on O’ Reilly Safari?

    The 1st edition is there.

    Incidentally, if affordable, I recommend Safari to all. It’s a genuine boon.

    Comment by Nigel — February 4, 2009 @ 3:26 pm BST Feb 4,2009 | Reply

  4. Nigel,

    That should have been “2nd Volume” – which is not yet complete.

    I hadn’t come across Safari before – I’ll have to have a look into it. Once the book has gone to the publishers the subsequent appearances are up to them really; there have been a couple of deals on the book that I haven’t heard about – beyond receiving items with no apparent origin on the Royalty statement.

    Comment by Jonathan Lewis — February 4, 2009 @ 8:17 pm BST Feb 4,2009 | Reply

  5. Safari is about 40 UK pounds per month for the full “access any number of books” offering; a bit cheaper for a limited number of books per month.

    Saves me lugging a hundredweight of books around, people forgetting to return them, spilling coffee over them, pages falling out, etc.

    Not exactly low cost, but having a rather brilliant selection of books on all topics that I may want to know something of (e.g. UNIX, TCP/IP, Java, C++, etc., etc.), available wherever there’s an Internet connection, it seems good value. One can also download a few titles per month, to print off, I suppose.

    Businesses can buy corporate licences, too, which, I think, is a no-brainer for any business that wants its staff to do well in their work.

    Comment by Nigel — February 5, 2009 @ 7:54 am BST Feb 5,2009 | Reply

  6. Books:

    Without wishing to pry, is there any useful money in writing a technical book?

    From what you wrote, it all looks to be a bit of a punt, with regard to revenue streams.

    By the way, have you thought of producing some sort of visual Tuning course on, say, CD-ROM/DVD/whatever?

    Sometimes pictures really are worth a thousand words. A company called CBT Nuggets does that sort of thing (as do several others). I bought a (gasp!) SQL Server course from them a few years back: it’s a very useful format, and I learnt much faster, and with less uncertainty, than by using books alone.

    Food for thought!

    Comment by Nigel — February 6, 2009 @ 5:18 pm BST Feb 6,2009 | Reply

  7. I was surprised to hear from an author I respect that he received no money at all from a certain book, he only published for reputation enhancement.

    This old paper still seems relevant: http://oraclemagician.com/white_papers/book_article.doc

    No flamebait intended.

    Comment by joel garry — February 10, 2009 @ 10:06 pm BST Feb 10,2009 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,508 other followers