Oracle Scratchpad

January 7, 2013

Analysing Statspack 13

Filed under: Oracle,Statspack,Troubleshooting — Jonathan Lewis @ 6:44 pm GMT Jan 7,2013

A recent (Jan 2013) post on the OTN database forum reported a performance problem on Oracle 9.2.0.6 (so no AWR), and posted a complete statspack report to one of the public file-sharing sites. It’s been some time since I did a quick run through the highlights of trouble-shooting with statspack, so I’ve picked out a few points from this one to comment on.

As usual, although this specific report is Statspack, the same analysis would have gone into looking at a more modern AWR report; I will, though, make a couple of comments at the end about the extra material that an AWR report would have made available by default, and that would have helped us help the OP.

First, here are the Load Profile and “Top 5” parts of the report. To make sense of them, you also need to know that the report covered a two-hour interval, and the machine had 12 CPUs; there’s also a little clue in the fact that this is 32-bit Oracle with a 600MB SGA (with the sga_max_size set to 1.6GB) running (alone) on a machine with 12GB of memory.

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:             54,437.02             65,579.53
              Logical reads:          2,053,336.52          2,473,626.15
              Block changes:                383.39                461.86
             Physical reads:             43,270.67             52,127.57
            Physical writes:                 46.63                 56.18
                 User calls:                 35.83                 43.16
                     Parses:                 25.53                 30.76
                Hard parses:                  0.18                  0.21
                      Sorts:                186.25                224.38
                     Logons:                  0.08                  0.10
                   Executes:                108.76                131.02
               Transactions:                  0.83

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                       74,717    84.80
latch free                                        181,788       8,101     9.19
db file scattered read                         47,546,136       2,893     3.28
db file sequential read                        23,092,208       2,006     2.28
buffer busy waits                               1,812,920         193      .22
          -------------------------------------------------------------

Working from the top down, the first thing we notice is that 2M logical I/Os per second is quite busy, and that 43,000 blocks read per second is pretty amazing – especially since the number of user calls and executes isn’t particularly large.

A cross-check to the Top 5 then shows that a huge fraction of the work done by the database is CPU. At 2 hours and 12 cores we have 12 * 2 * 3600 = 86,400 CPU seconds available, and we’ve used 74,700 of them – and that’s into Cary Millsap’s “knee of the curve” for queueing problems. Since this isn’t the AWR, of course, we don’t have the OS Stats available, so we can’t see if the rest of the machine’s CPU is being used up outside the database – but looking at the db file read figures I’ll bet it is. If you have extreme CPU usage in an Oracle system, then latch contention problems are very likely to show up – and that’s our next highest figure – AWR would probably tell us which latch but, again based on the db file read figures, I’ll guess it’s the cache buffers chains latches.
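Since 9i reports only the generic “latch free” event in the Top 5, confirming which latch is the problem needs the latch activity section of the report (see below) or a direct query. As a minimal sketch – run from a suitably privileged session, and remembering that the v$latch figures are cumulative since instance startup – something like this would make the culprit obvious:

rem  Which latches have accumulated the most sleeps (9i-compatible).
select  *
from    (
        select  name, gets, misses, sleeps
        from    v$latch
        order by
                sleeps desc
        )
where   rownum <= 10
;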

So we look at the db file read figures and do the arithmetic that the AWR would show us, to work out that we’re seeing multiblock read times of 0.06 milliseconds and single block read times of 0.08 milliseconds. These have to be coming from the file system cache to be this fast. (Even then I think we may also be seeing a timing problem – perhaps a measurement bug – that’s managing to lose wait time.) First thoughts, then, are that there is some code that is so inefficient that it’s doing far more physical I/O than it should; on the plus side the excess memory (12GB) means we’re being protected from the worst effects of disk I/O, but the caching effect means we’re able to burn up CPU at an astonishing rate to do the work.
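For reference, the arithmetic is simply time waited divided by the number of waits – for example 2,893 s / 47,546,136 waits ≈ 0.06 ms for the multiblock reads. If you want the database to do the sums for you, a sketch like the following against v$system_event gives the same kind of answer (in 9i time_waited is in centiseconds, and the figures are cumulative since startup rather than restricted to the Statspack interval):

select  event, total_waits, time_waited,
        round(10 * time_waited / total_waits, 3)   avg_ms
from    v$system_event
where   event in ('db file scattered read', 'db file sequential read')
;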

So let’s look at the SQL to see where the CPU and disk activity is likely to be going. Here’s an extract from the SQL ordered by Gets, and the SQL ordered by Reads. (We don’t have SQL ordered by Time in Statspack, of course):

                                                     CPU      Elapsd
  Buffer Gets    Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
  5,636,117,159           28  201,289,898.5   38.1 ######## #########  835260576
Module: RAXTRX
 INSERT INTO RA_INTERFACE_ERRORS  (INTERFACE_LINE_ID,   MESSAGE_
TEXT,   INVALID_VALUE) SELECT  INTERFACE_LINE_ID, :b_err_msg6, '
trx_number='||T.TRX_NUMBER||','||'customer_trx_id='||TL.CUSTOMER
_TRX_ID FROM RA_INTERFACE_LINES_GT IL, RA_CUSTOMER_TRX_LINES TL,
 RA_CUSTOMER_TRX T WHERE  IL.REQUEST_ID = :b1 AND    IL.INTERFAC

    511,626,129           20   25,581,306.5    3.5  8155.90  15045.56  205460129
Module: JDBC Thin Client
select  TRANSACTION_BRANCH , BRANCH_ADDRESS_1 , BRANCH_ADDRESS_2
 , BRANCH_PHONES , BRANCH_FAX , BRANCH_TIN , TRX_NUMBER , TRX_NU
MBER_DISP , TRX_DATE , TRX_DATE_DISP , BILL_TO_CUSTOMER_ID , BIL
L_TO_CUSTOMER_NAME , SHIP_DATE_ACTUAL_DISP , SHIP_TO_NAME , SHIP
_TO_ADDRESS1 , SHIP_TO_ADDRESS2 , SHIP_TO_ADDRESS3 , SHIP_TO_ADD

                                                     CPU      Elapsd
 Physical Reads  Executions  Reads per Exec %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
    281,953,754           20   14,097,687.7   90.5  8155.90  15045.56  205460129
Module: JDBC Thin Client
select  TRANSACTION_BRANCH , BRANCH_ADDRESS_1 , BRANCH_ADDRESS_2
 , BRANCH_PHONES , BRANCH_FAX , BRANCH_TIN , TRX_NUMBER , TRX_NU
MBER_DISP , TRX_DATE , TRX_DATE_DISP , BILL_TO_CUSTOMER_ID , BIL
L_TO_CUSTOMER_NAME , SHIP_DATE_ACTUAL_DISP , SHIP_TO_NAME , SHIP
_TO_ADDRESS1 , SHIP_TO_ADDRESS2 , SHIP_TO_ADDRESS3 , SHIP_TO_ADD

     94,281,089        8,550       11,027.0   30.3  2924.20   4550.22 3761220362
Module: JDBC Thin Client
SELECT DELV_ADDR.ADDRESS1 FROM APPS.OE_ORDER_HEADERS_ALL OE_HEAD
ERS , APPS.RA_SITE_USES_ALL DELV_SITE , APPS.RA_ADDRESSES_ALL DE
LV_ADDR WHERE TO_CHAR(ORDER_NUMBER) = :B1 AND DELV_SITE.SITE_USE
_ID(+) = OE_HEADERS.DELIVER_TO_ORG_ID AND DELV_ADDR.ADDRESS_ID(+
) = DELV_SITE.ADDRESS_ID

     94,011,901        8,544       11,003.3   30.2  2919.47   4527.54  803969757
Module: JDBC Thin Client
SELECT COALESCE(DELV_ADDR.CITY, DELV_ADDR.PROVINCE) FROM APPS.OE
_ORDER_HEADERS_ALL OE_HEADERS , APPS.RA_SITE_USES_ALL DELV_SITE
, APPS.RA_ADDRESSES_ALL DELV_ADDR WHERE TO_CHAR(ORDER_NUMBER) =
:B1 AND DELV_SITE.SITE_USE_ID(+) = OE_HEADERS.DELIVER_TO_ORG_ID
AND DELV_ADDR.ADDRESS_ID(+) = DELV_SITE.ADDRESS_ID

     93,981,254        8,551       10,990.7   30.2  2919.13   4533.69 4273350236
Module: JDBC Thin Client
SELECT DELV_ADDR.ADDRESS2 FROM APPS.OE_ORDER_HEADERS_ALL OE_HEAD
ERS , APPS.RA_SITE_USES_ALL DELV_SITE , APPS.RA_ADDRESSES_ALL DE
LV_ADDR WHERE TO_CHAR(ORDER_NUMBER) = :B1 AND DELV_SITE.SITE_USE
_ID(+) = OE_HEADERS.DELIVER_TO_ORG_ID AND DELV_ADDR.ADDRESS_ID(+
) = DELV_SITE.ADDRESS_ID

The first statement reported by Gets is responsible for 5.6 billion buffer gets in the interval (and even then that’s only about 40% of the total – so there may be a couple more big ones that have fallen out of the library cache in the interim; perhaps an hourly report on the previous hour would catch them). Look at the table that the SQL is inserting into – RA_INTERFACE_ERRORS. We might hope that an error table wasn’t going to receive many errors – so perhaps this is just an example of a statement with an insanely bad execution plan; clearly one to look at. Trouble is, though, that unless you set statspack to run at level 6 you don’t capture the execution plans – unlike the AWR, which captures them by default. The batch took a couple of days to complete, though, so it would have been possible either to change the snapshot level or to pull the execution plans from memory.
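Pulling the plan from memory in 9.2 is easy enough if the cursor is still in the library cache – dbms_xplan can’t format it directly in 9i, but an ordered dump of the raw v$sql_plan lines is usually enough. A minimal sketch, using the hash value from the Statspack report:

rem  Raw plan lines for the big insert, keyed by the hash value Statspack reports.
select  child_number, id, parent_id, operation, options,
        object_name, cost, cardinality
from    v$sql_plan
where   hash_value = 835260576
order by
        child_number, id
;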

Looking at the SQL ordered by Reads, we see another example of the top statement being much more expensive than anything following it – but there’s an odd little pattern that caught my eye as I scanned the list. The next three statements look very similar to each other, and have similar statistics, and if you add up their stats the answer is very close to the figures for the first statement. Moreover, the report says that the top statement is responsible for 90% of the work – I think the top statement is actually calling the next three, so we need to look at those statements to see how to make them efficient in order to fix the first statement.

You might note, by the way, the capitals and the bind variables (:B1) in the next three statements – I suspect that there is a PL/SQL function in the first statement that is being called too many times (20 executions of the first statement, 8,500 of the other three). Perhaps a change in the execution plan of the main query has produced a change in table order which has resulted in some predicate – one with a PL/SQL function in it – being tested much more frequently than usual.
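As a reminder of why that signature appears: static SQL inside PL/SQL is stored in upper case with generated bind names like :B1, and a function called from a predicate (or select list) executes its embedded query once per call. A purely hypothetical sketch – the function name and body are invented to illustrate the mechanism, not taken from the OP’s system – would look something like this, and would produce exactly the pattern above of a handful of parent executions driving thousands of child executions:

create or replace function delv_address1 (p_order_number in varchar2)
return varchar2
is
        v_address1   apps.ra_addresses_all.address1%type;
begin
        --  This static SQL appears in v$sql in upper case with a :B1 bind,
        --  executed once for every call to the function.
        select  delv_addr.address1
        into    v_address1
        from    apps.oe_order_headers_all  oe_headers,
                apps.ra_site_uses_all      delv_site,
                apps.ra_addresses_all      delv_addr
        where   to_char(oe_headers.order_number) = p_order_number
        and     delv_site.site_use_id (+) = oe_headers.deliver_to_org_id
        and     delv_addr.address_id  (+) = delv_site.address_id
        ;
        return v_address1;
end;
/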

A quick cross-reference to the instance activity is in order:

physical reads                           311,462,252       43,270.7     52,127.6
prefetched blocks                        240,396,452       33,397.7     40,233.7

table fetch by rowid                  12,348,226,523    1,715,508.0  2,066,648.8
table fetch continued row             12,047,845,308    1,673,776.8  2,016,375.8

table scan blocks gotten                 361,692,971       50,249.1     60,534.4
table scan rows gotten                 7,787,418,493    1,081,886.4  1,303,333.6
table scans (long tables)                     32,041            4.5          5.4

A huge fraction of the physical reads seem to be for prefetched blocks – that’s a good indication that they are for tablescans (or index fast full scans), and when we compare rowid access with tablescan access we see that there are a lot of “long” tablescans (maybe the expensive three queries are all doing tablescans of a long table – that would account for roughly 25,600 of the 32,041 reported), with 360M blocks scanned. Apart from anything else, all those block scans will be using a big chunk of CPU. The AWR would show us which segments were being scanned, of course, as would statspack if we were running it at level 7.
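Even without level 7 snapshots you can get a point-in-time view of which segments are being hammered: 9.2 introduced v$segment_statistics. As a sketch (the figures are cumulative since instance startup, so two samples a few minutes apart are more informative than a single query):

rem  Top ten segments by physical reads (9.2 onwards).
select  *
from    (
        select  owner, object_name, object_type, value
        from    v$segment_statistics
        where   statistic_name = 'physical reads'
        order by
                value desc
        )
where   rownum <= 10
;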

I’ve included the table fetch by rowid and table fetch continued row. Normally we should be concerned by “continued fetches” – but in this case I’m prepared to overlook them temporarily: it’s possible that after the critical SQL is corrected we may still see that most of the fetches by rowid turn into continued fetches, and at that point it will be worth worrying; but at present, even though they’re probably responsible for a serious fraction of the CPU usage, those continued fetches may be for rows that we’re not going to be fetching once the SQL is doing the right thing.

The last little detail, just for confirmation, comes from the latch activity part of the report; I’ve just selected the busiest 4 latches:


                                           Pct    Avg   Wait                 Pct
                              Get          Get   Slps   Time       NoWait NoWait
Latch                       Requests      Miss  /Miss    (s)     Requests   Miss
------------------------ -------------- ------ ------ ------ ------------ ------
cache buffers chains     ##############    0.0    0.0      0  577,860,644    0.1
cache buffers lru chain         157,333    0.3    0.0      0  687,799,033    0.8
multiblock read objects     114,602,621    0.6    0.0     12            0
simulator hash latch        811,998,194    0.0    0.0      2            0

                                      Get                            Spin &
Latch Name                       Requests      Misses      Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
cache buffers chains       ##############  12,262,452     180,671 0/0/0/0/0
multiblock read objects       114,602,621     719,477         115 719369/101/7/0/0
simulator hash latch          811,998,194       3,889          19 3872/15/2/0/

As you can see, and as we had guessed, they’re all about the buffer cache – doing multiblock reads and loading/dumping blocks from the cache; and almost all of the sleeps are on the cache buffers chains.

Next Steps

As a quick and dirty fix which might help a bit – put more of the sga_max_size into the buffer cache – it doesn’t happen automatically in 9.2 – and that may reduce the interaction between the O/S and Oracle and save a bit of latching and CPU.
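In 9.2 the buffer cache can be grown dynamically within sga_max_size, so the quick and dirty change is a single command – the 1,200M below is purely illustrative, and would have to stay within the 32-bit SGA limit for the platform while leaving room for the shared pool:

rem  Illustrative only - grow the buffer cache within the existing sga_max_size.
alter system set db_cache_size = 1200M;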

Address the three statements doing all the physical I/O, and check the execution path of the statement that causes them to be executed. Fix the statement that has done 40% of the buffer gets. In all cases the problem may simply be an execution plan change – but perhaps the machine has been hiding an inherent performance problem, and on this occasion a small change in data (a few more errors, perhaps) resulted in the excess work generating a massive increase in workload and contention.

We do know, however, that there is a lot of work done on buffer gets that didn’t show up in this report – so run off the report for the previous hour and see if we can find a couple more big hitters. Once we’ve sorted out the worst SQL, we might then want to see if the volume of “continued rows” is high – and find out why they’re appearing. It’s possible that it’s due to a table with more than 254 columns, of course, or a long column, but we’ll worry about that only when we need to.
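If the continued fetches do turn out to matter, the old-fashioned checks still work in 9.2. A sketch, assuming the chained_rows table has been created with ?/rdbms/admin/utlchain.sql, and using one of the tables from the report purely as an example candidate:

rem  Count migrated/chained rows for a candidate table.
analyze table apps.ra_customer_trx_lines list chained rows into chained_rows;

select  count(*) from chained_rows;

rem  Check for tables with more than 254 columns, which chain within the block by design.
select  owner, table_name, count(*)   column_count
from    dba_tab_columns
group by
        owner, table_name
having  count(*) > 254
;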

One final thought – it’s easy to fix the wrong thing when you’re looking at statspack reports. Although we can see some SQL that is “obviously” overloading the machine, maybe the business critical task that has to complete promptly is simply hidden because it does (relatively) little work. Unless we can fix the big problems extremely quickly, we ought to take a little time to check with the business and the developers that what we’re looking at really is the code that’s running the critical business task.

Longer term, of course, upgrade to 9.2.0.8 at least and then start running Statspack at level 7 for better information. Then upgrade some more.
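For reference, changing the snapshot level is a one-off call to the standard Statspack API as the perfstat user – a sketch:

rem  Take an ad-hoc level 7 snapshot ...
execute statspack.snap(i_snap_level => 7)

rem  ... or make level 7 the default for all future snapshots.
execute statspack.modify_statspack_parameter(i_snap_level => 7)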

9 Comments »

  1. These things must attract each other! Just finished a tuning exercise on a third party j2ee app here.

    A single SQL was in its 3rd day of uninterrupted execution. Finally got permission from the business to shoot it and fix the blessed thing.

    Typical j2ee “framework” disaster: auto-generated, 3.8Mrow-table joined to 4Mrow-GTT, with multiple subqueries on both predicates AND select column list coming from equally monster-joins. None of them with the slightest chance of ever using an index.
    What I call a “legacy query”: our grandchildren will see its results.

    The execution plan in EM was a very long, sorry list of nested over nested over nested loops of million row FTS…
    But because the buffer cache was quite large, the whole lot fit in cache. Hence, a LOT of CPU being used and negligible reads.
    And of course: lots of buffer cache waits and even more CPU waits!

    Which prompted the supplier’s “Oracle guru” to *categorically* claim “you are starved of CPU – add more of it”.
    Of course, that got shot in the bud.
    After a little judicious re-coding and trimming of totally unnecessary nested subqueries, the whole lot finished in around 10 minutes.

    More CPU. Yeah! Right…
    (save us from “j2ee Oracle gurus”!…)

    Comment by Noons — January 9, 2013 @ 6:34 am GMT Jan 9,2013 | Reply

    • Noons,

      In many ways this boils down to the question: “Who knows what the system is supposed to do?”

      Intuitively we feel that a query that takes 10,000,000 buffer gets and runs hundreds of times per day in an OLTP system has to be wrong – either it should only be running rarely, or it should be taking thousands of LIOs rather than millions. There are special cases, of course, and there’s a very large grey area where the resource usage looks suspicious but not necessarily outrageous. When in doubt, we need to be able to get a good answer to the question: “does this seem reasonable?”, and I often find that the answer to that question is that nobody knows, and you have to go through a lot of work trying to reverse engineer the requirement and the intent of the programmer.

      Comment by Jonathan Lewis — January 12, 2013 @ 12:33 pm GMT Jan 12,2013 | Reply

      • Very true. In this case, it was quite clear: it’s a daily aggregation batch job that starts executing at 11pm and is supposed to be finished by 8am.
        Used to work well within that window. Until the Xmas load from our centres made the “demo data” in the tables involved grow from 100 rows to 3.8 million.
        And of course: archiving older data is unheard of – apparently, adding CPU to enable the thing to “scale” was the recommended way to address this.
        Conveniently ignoring that the process does not use parallel query and that, of the 4 cores available, only one was flat out.
        Given I got a say in this one, I preferred to tune the SQL: now down to 10 minutes, from never finishing after 72 hours. I reckon that was a better improvement than “adding CPU”.
        Although of course the Oracle sales rep hates that! :)

        Comment by Noons — January 12, 2013 @ 1:24 pm GMT Jan 12,2013 | Reply

  2. Hello Jonathan
    “Since this isn’t the AWR, of course, we don’t have the OS Stats available..”

    Not generally true – since version 10, Statspack also captures OS stats in the table stats$osstat.

    Best Regards,
    Petr

    Comment by PetrN — January 11, 2013 @ 12:25 pm GMT Jan 11,2013 | Reply

    • PetrN,

      Thanks for the comment – it’s too easy to lose track of which bits appear in which version (though this was a 9i system, of course). The only bit I don’t forget is that Statspack 10 reported the event histograms that didn’t get into the AWR until 11 ;)

      Comment by Jonathan Lewis — January 12, 2013 @ 1:15 pm GMT Jan 12,2013 | Reply

  3. Hi Jonathan,

    Happy new year.

    Could you please explain further your thoughts when you wrote the following sentence:
    “and even then that’s 40% of the total – so there may be a couple more big ones that have fallen out of the library cache in the interim, perhaps an hourly report on the previous hour would catch them”

    Do you think 5 billion logical reads should represent more than 40% of the total?

    Comment by ahmed aangour — January 11, 2013 @ 2:10 pm GMT Jan 11,2013 | Reply

    • Ahmed,

      The point of that comment was that the instance activity reports something like 14.4 Billion LIOs, so somewhere we’ve “lost” 8 or 9 billion. This means that solving the 5B problem might be nice (and it might even be easy) but there’s still a huge amount of work hiding somewhere that we’re still going to have to address.

      Maybe the big one that we can see all happened in the second hour of the report, and there’s a similarly sized query that took place in the first hour. It’s always worth thinking about the things that we can’t see, and wondering whether there’s a way in which we can find them.

      Comment by Jonathan Lewis — January 12, 2013 @ 1:18 pm GMT Jan 12,2013 | Reply

  4. Hello Jonathan

    and the machine had 12 CPUs

    Where did you get this number? The original thread doesn’t show the available core count.

    Working from the top down, the first thing we notice is that 2M logical I/Os per second is quite busy

    Just translating it into bytes makes sense: that’s around 16GiB/s, which could tell you more if you happen to know the chipset – maybe that’s close to its limit.

    Since this isn’t the AWR, of course, we don’t have the OS Stats available, so we can’t see if the rest of the machine’s CPU is being used up outside the database – but looking at the db file read figures I’ll bet it is

    Could you please clarify what it is in the db file read figures that makes you think something else is eating up CPU outside the DB? BTW Statspack in 11g provides OS stats (10g doesn’t, AFAIR), so it’s not only the tool that is an issue.

    If you have extreme CPU usage in an Oracle system, then latch contention problems are very likely to show up – and that’s our next highest figure – AWR would probably tell us which latch but, again based on the db file read figures, I’ll guess it’s the cache buffers chains latches.

    1. It’s not AWR that tells us the latch name, but the modern Oracle kernel, which has specific names for particular latch free events, so Statspack in 10g or 11g would show the CBC latch in the Top 5 as well.
    2. It’s hard to decide which is the cause and which is the consequence – whether latching is causing CPU saturation, or extreme CPU usage is causing latch waits to appear in the Top 5. In this case I think it is the first scenario, i.e. too many block gets cause too much CPU usage due to too many latch acquisitions.

    so there may be a couple more big ones that have fallen out of the library cache in the interim, perhaps an hourly report on the previous hour would catch them

    That is a very likely situation, especially with a relatively small shared pool. Another possibility is that something is still being executed, and in this case the Oracle kernel in 9i could not help Statspack identify a query that is still running at the end snapshot.

    Trouble is, though, that unless you set statspack to run at level 6 you don’t capture the execution plans – unlike the AWR, which captures them by default

    … but it doesn’t help in some situations as it doesn’t capture predicates (Bug 7493519: ACCESS_PREDICATES AND FILTER_PREDICATES MISSING IN DBA_SQL_PLAN_HIST, fixed in 12c). I don’t remember if Statspack gathers the predicates correctly under level 6 – if it does, then Statspack is the preferable tool for this task.

    As a quick and dirty which might help a bit – put more of the sga_max_size into the buffer cache – it doesn’t happen automatically in 9.2 – and that may reduce the interaction between the O/S and Oracle and save a bit of latching and CPU.

    If it is indeed 32-bit binaries then I’m not sure it’s possible to allocate more than 2.7G to the SGA. If it is 64-bit then an SGA increase should also be combined with switching to direct IO, and possibly to huge pages if it’s Linux (the hostname suggests so). But without fixing the SQL it’s possible to make things worse: with the IO removed completely, the app will (try to) use more CPU, resulting in potentially worse latch contention and more problems.

    What looks a bit suspicious to me in the report are the Parallel operations: there were 3/3 DDL/DML statements parallelized, 54 DFO trees parallelized, 48 queries parallelized. I’m not sure if these numbers are consistent – but I think that even 9 PX slaves on a 12 core system could utilize 75% of the CPU if they catch the wrong plan. It could be related to a DDL on an index changing its degree, perhaps, although this is just a guess. Anyway, I think it’s worth looking at whether Parallel Execution is acceptable on this system or not.

    Comment by Timur Akhmadeev — January 11, 2013 @ 4:03 pm GMT Jan 11,2013 | Reply

    • Timur,

      The original question came up in (at least) two places on OTN – this is the one I saw and responded to.

      Your comments about me saying “AWR” when really I should have been saying “9i” are correct, of course. There is some OS information in 10g, though the presentation is different; in fact if you enable it you can get some OS stats appearing in the 8i and 9i Instance Activity.

      My inference from the db file read figures is that since they’re so fast they must be coming from the file system cache – and if that’s the case then there’s going to be a big chunk of CPU being recorded against the OS time for file system manipulation. (I’ve assumed, of course, that the huge excess of memory, and the low version of Oracle, are good indicators of a simple set up.).

      I think I made the point about confusing cause and effect in the OTN thread, but at this point I think it ought to be fairly clear which is which. If anyone doesn’t get your point, though, it’s worth mentioning that if you have a bit of a latch sleep problem and everyone starts chewing up all the CPU (without necessarily doing anything related to latches), then the latch sleeps get worse: it takes longer for a sleeper that has woken up to get to the top of the run queue because of all the processes ahead of it that are using up their time slices.

      Good alternative explanation for how we could lose about 8 Billion LIOs that took place in this time interval. The OP might benefit from running off the next two or three (or more) hours reports as well.

      Predicate capture (statspack doesn’t do it): at this point, with such extreme problems, I’m only really interested in the big picture – are today’s (or this interval’s) execution plans the ones we’ve been seeing for weeks, or is today a very special day. I’ll worry about WHY there’s been a change after I’ve determined that a (massive) change has occurred.

      Regarding SGA size: it may not be in the thread you saw, but the sga_max_size is set to 1.7GB, and the db_cache_size and shared_pool_size are set (and reported) as 300MB – it looks as if someone has allowed for manual adjustment up to the legal 32-bit max for the platform, but has not actually allocated it. So there seems to be about 1GB of memory doing nothing that ought to have been in the db cache to reduce the constant copying from file system to Oracle cache. But it is possible that an increase might make the wrong thing go faster – with the side effect (see “top of run queue” comment) that the thing we’re worried about might slow down even more.

      In fact there’s a follow-up post from the same person comparing two consecutive days. I’m trying to find time to write some comments about it as “Statspack 13a”. It’s very educational.

      Comment by Jonathan Lewis — January 12, 2013 @ 6:42 pm GMT Jan 12,2013 | Reply

