Case Study

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 12:46 pm GMT Jan 28,2023

This example appeared quite recently on one of the public Oracle Forums, and the person who posted it had already worked out (the important bit of) what was going on before anyone asked for more information or supplied any suggestions for action (although there was one response pointing to a MOS Note that might have had some relevance and did contain some interesting, though unrelated, background information).

The title of the question was: “KTSJ process running select dbms_rowid.rowid_type(rowid) causing huge load”, and the opening post contained a number of bits of information about what KTSJ was and how it relates to SMCO. The critical details, though, were the SQL, and what it was running on.

The system was a 4 node RAC, running 19.12. The problem query was running multiple times between 4:00 a.m. and 9:00 a.m. and the table it references is about 4 GB and “not fragmented”. The table is “basic compressed” and partitioned on a date column, there are no LOB columns, and there are about 200,000 updates on the table “each morning” (but no comment about whether this is before, during, or after the performance problem). The query was simple:

select dbms_rowid.rowid_type(rowid) from <owner>.<table_name> where rownum=1;

In a follow-up post we got the following information about a typical run of the query:

Global Stats
============================================================================
| Elapsed |    Cpu  |      IO  |  Cluster | Fetch | Buffer |  Read |  Read |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |   Gets |  Reqs | Bytes |
============================================================================
|     123 |      16 |       94 |       13 |     1 |     1M | 81215 |  10GB |
============================================================================

If you’re familiar with the SQL Monitor you’ll recognise this as the summary information from an SQL Monitor report – which means, of course, that the OP also has an execution plan for the query.

It’s easy of course to do the Bart Simpson “I knew that” claim after the OP has given the answer, but even with this tiny amount of information there are several clues to tell you what’s probably going on and why.

Inferences

In this case the query is so simple there are virtually no options for the optimizer. It’s a reference to a single table, it’s asking only for the rowid, and it’s going to stop after one row is returned – there are only three possible (unhinted) execution plans: a tablescan, an index fast full scan, or an index full scan – all with a stopkey. In fact, it’s only in earlier versions of Oracle that the optimizer would choose an index full scan over an index fast full scan.
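
As a sketch of what we might expect to see (table and index names here are invented, and the exact choice depends on the indexes that actually exist), an explain plan for this type of query typically shows a count stopkey operation sitting above the scan:

explain plan for
select  dbms_rowid.rowid_type(rowid)
from    t1                              -- hypothetical table name
where   rownum = 1;

select * from table(dbms_xplan.display);

-- Typical shape of the output:
-- -----------------------------------------
-- | Id  | Operation              | Name   |
-- -----------------------------------------
-- |   0 | SELECT STATEMENT       |        |
-- |*  1 |  COUNT STOPKEY         |        |
-- |   2 |   INDEX FAST FULL SCAN | T1_I1  |
-- -----------------------------------------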

The idea of a segment scan (whether tablescan or index fast full scan) may sound a bit extreme as a path to return just one row, but the optimizer knows that it’s going to stop after one row and adjusts the cost accordingly. Since indexes on a table are (almost always) smaller than the table itself we will probably see an index fast full scan on whichever index reports the smallest number of leaf blocks (and included at least one column declared not null), and we should see the scan stopping after a very small number of block visits.
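
If you wanted to predict which index the optimizer would pick you could run something like the following sketch (the table name is invented) to list the indexes that include at least one mandatory column, smallest leaf_blocks first:

select  i.index_name, i.leaf_blocks, i.blevel
from    user_indexes i
where   i.table_name = 'T1'                     -- hypothetical table name
and     exists (
                select  null
                from    user_ind_columns ic,
                        user_tab_columns tc
                where   ic.index_name  = i.index_name
                and     tc.table_name  = ic.table_name
                and     tc.column_name = ic.column_name
                and     tc.nullable    = 'N'
        )
order by
        i.leaf_blocks
;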

The Global Stats tell us a completely different story. We might be expecting one multiblock read of 128 blocks, and possibly a few more single block reads for read-consistency (undo records applied) reasons; we’ve actually done 81,215 Read Requests totalling about 10GB (which really means 9.5GB to 10.5GB), or roughly 1.3 million blocks. This average read size of fractionally over 16 blocks suggests a huge segment scan that didn’t stop early. As a quick sanity check you’ll note we’ve also reported 1M Buffer Gets (which means 0.5M to 1.5M), which is consistent with a segment scan into the buffer cache.
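
The arithmetic, assuming the default 8KB block size, is simple enough to check from SQL*Plus:

select
        round(10 * power(1024,3) / 8192)                as blocks_read,         -- roughly 1.3 million blocks
        round(10 * power(1024,3) / 8192 / 81215, 2)     as avg_blocks_per_read  -- fractionally over 16
from    dual
;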

  • Question: Why would Oracle do a huge segment scan when we know it should have stopped scanning almost immediately?
  • Answer: there was a huge amount of empty space at the “start” of the segment.

One detail the OP told us was that the table was only 4GB and “not fragmented” – I don’t think you’d say that if there were 10GB of empty blocks in the table before a scan got to the first row. On the other hand, the OP didn’t say anything about the size or state of any indexes and we’re expecting an index fast full scan of the “smallest” index on the table.

So how does the smallest index on a 4GB table turn into a 10GB fast full scan? The answer is in three parts:

  • First is that the optimizer is looking at user_indexes.leaf_blocks to calculate the cost of a fast full scan.
  • Second is that the code to generate the value of leaf_blocks doesn’t see empty leaf blocks (the query sketched below shows one way of spotting the resulting mismatch).
  • Third is that there are patterns of data processing that can leave an index in a state where it holds a huge number of empty leaf blocks that haven’t been flagged for re-use.
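
A quick way of spotting the resulting mismatch (sketched here with an invented table name, and summing across segments in case the indexes are locally partitioned) is to compare leaf_blocks from the statistics with the physical size of each index:

select
        i.index_name,
        i.leaf_blocks,                                  -- what the optimizer uses to cost a fast full scan
        sum(s.blocks)                   as seg_blocks,  -- what a fast full scan actually has to visit
        round(sum(s.bytes)/1048576)     as seg_mb
from    user_indexes  i,
        user_segments s
where   i.table_name   = 'T1'                           -- hypothetical table name
and     s.segment_name = i.index_name
group by
        i.index_name, i.leaf_blocks
order by
        sum(s.blocks) - i.leaf_blocks desc
;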

The OP said there were about 200K updates on the table each morning – what if the activity was something like:

  • Insert a batch of rows with a status column holding a starting value
  • The column is indexed to allow rapid access to “new” (and other interesting, “rare”) status values
  • The column cycles through a few different status values before getting to a “final” status.

Depending on batch sizes, timing, concurrency effects, version of Oracle, and the possible values for status you could end up with an index that had a huge number of empty leaf blocks at the “low value” end and all the populated leaf blocks at the “high value” end. It’s not supposed to happen, but occasionally it does. If you gathered index stats Oracle would count only the blocks that held the few high values but when you did an index fast full scan you’d have to scan through a load of empty leaf blocks before you found one with an index entry.
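
If you suspected that an index had got into this state you could confirm it with a validate structure call, bearing in mind that (without the online option, which doesn’t populate index_stats) this locks the underlying table against DML while it runs, so it’s not something to fire off casually in production. The index name below is invented:

analyze index t1_status_idx validate structure;

select
        lf_rows,                -- index entries, including ones marked as deleted
        del_lf_rows,            -- entries marked as deleted but not yet cleaned out
        lf_blks,                -- leaf blocks currently linked into the index structure
        blocks,                 -- blocks allocated to the segment
        pct_used
from    index_stats             -- one row, describing the most recent validate structure
;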

Summary

Realistically the correct diagnostic method for this problem would have been to check the plan (and activity) from the SQL Monitor report, notice the index fast full scan and (if they were visible) multiblock read waits, then check the state of the index. This note was just a little lesson on the way in which you can make reasonable guesses (inferences / deductions) from insufficient data so that you have some idea of what might have gone wrong.
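
For reference, the report the OP would have been looking at can be pulled with a call like the following (the sql_id is just a placeholder):

set long 1000000 longchunksize 1000000 linesize 250 pagesize 0

select
        dbms_sqltune.report_sql_monitor(
                sql_id       => '0123456789abc',        -- placeholder sql_id
                type         => 'TEXT',
                report_level => 'ALL'
        )
from    dual
;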

In this case the steps were:

  1. Know that there are (realistically) only two possible execution paths for the stated query
  2. Do some arithmetic to recognise the activity as a large segment scan
  3. Eliminate the tablescan because the description of its state doesn’t allow for the observed phenomenon
  4. Recognise the potential for an index to get into a state that matches the phenomenon

Lagniappe

The OP had already replied to his own question, explaining that he had identified an index that had grown to 13GB (when the table was only 4GB), that the index was only 0.03GB (30MB) after a rebuild, and that the problem then disappeared.

This suggests a need for two further investigations:

  • Why does the index grow so much, and does it need to be coalesced or shrunk on a very regular basis (e.g. daily, or once per week)?
  • An average table row is 133 (= 4 / 0.03) times as long as its index entry. That sounds like the rows may be rather long. (At least 1,470 bytes, since the minimum Oracle allows for an index entry is 11 bytes.) Has the table’s pctfree been set for optimum storage, and might the table be suffering from lots of migrated rows? (A couple of starting points are sketched below.)
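
As a sketch of the sort of follow-up actions (object names invented, and the shrink option assumes the index is in an ASSM tablespace):

alter index t1_status_idx coalesce;             -- merge sparse leaf blocks, space stays with the index
-- or
alter index t1_status_idx shrink space;         -- also hands freed space back to the tablespace (ASSM only)

-- Checking for migrated/chained rows (the chained_rows table comes from utlchain.sql):
analyze table t1 list chained rows into chained_rows;
select count(*) from chained_rows;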
