Going too fast

October 14, 2008

Going too fast

Filed under: CBO,Execution plans,Performance,Statistics,Troubleshooting,Tuning — Jonathan Lewis @ 6:10 pm BST Oct 14,2008

I received an email a litle while ago with an unusual problem. It said:

“One of the jobs which used to take more than one hour to complete is now completing in less than 10 minutes. Neither the application developer nor we (the DBA’s) made *any* changes in the environment/code/database. I can’t work out why it’s got better!”

It’s not often that “going faster” is a problem – but there’s a very good reason for being worried about jobs that go faster for no apparent reason – one day your luck is going to run out and the jobs are going to go slower again – and people really notice when things slow down.

There was one critical query and the DBA told me that I could publish it if I camouflaged it – luckily it was a very simple query: so here’s the obfuscated version, with a set of table and column names that should give you some ideas about what’s going on.


update
	bal_extract	ext
set
	ext.daily_close = (
		select
			nvl(cbl.credit_total,0) + nvl(cbl.debit_total,0)
		from	cust_bal	cbl
		where
			cbl.transaction_date	= :b1
		and	cbl.cust_id(+)		= ext.cust_id
	)
where
	ext.batch_date = :b1
;

The DBA was licensed for AWR so he was able to send me all sorts of extra information. Even when the critical query ran well, it appeared in the AWR reports, which meant that the DBA could use the awrsqrpt.sql script to get resource figures and execution plans for the query from the repository. This is what he found (in fifteen minute snapshots).

Bad Run (when the job took more than an hour to complete):


Stat Name                                Statement   Per Execution % Snap
---------------------------------------- ---------- -------------- -------
Elapsed Time (ms)                           644,561            N/A    50.4
CPU Time (ms)                               626,440            N/A    54.6
Executions                                        0            N/A     N/A
Buffer Gets                               4,719,655            N/A    28.1
Rows                                              0            N/A     N/A
          -------------------------------------------------------------

Execution Plan
---------------------------------------------------------------------------------
| Id |Operation                    | Name        |Rows|Bytes|Cost (%CPU)|Time    |
----------------------------------------------------------------------------------
|  0 |UPDATE STATEMENT             |             |    |     |    2 (100)|        |
|  1 | UPDATE                      | BAL_EXTRACT |    |     |           |        |
|  2 |  TABLE ACCESS BY INDEX ROWID| BAL_EXTRACT |  1 |  28 |    2   (0)|00:00:01|
|  3 |   INDEX RANGE SCAN          | EXT_DATE_IX |  1 |     |    1   (0)|00:00:01|
|  4 |  TABLE ACCESS BY INDEX ROWID| CUST_BAL    |  1 |  26 |    4   (0)|00:00:01|
|  5 |   INDEX RANGE SCAN          | CBL_DATE_IX |  1 |     |    3   (0)|00:00:01|
----------------------------------------------------------------------------------

Good Run (when the job took ten minutes to complete):


Stat Name                                Statement   Per Execution % Snap
---------------------------------------- ---------- -------------- -------
Elapsed Time (ms)                            25,131       25,130.7     3.4
CPU Time (ms)                                23,270       23,270.2     3.9
Executions                                        1            N/A     N/A
Buffer Gets                               2,626,166    2,626,166.0    14.6
Rows                                        371,735      371,735.0     N/A
          -------------------------------------------------------------

Execution Plan
-----------------------------------------------------------------------------------------
| Id |Operation                    | Name             |Rows |Bytes |Cost (%CPU)|Time    |
-----------------------------------------------------------------------------------------
|  0 |UPDATE STATEMENT             |                  |     |      | 1110 (100)|        |
|  1 | UPDATE                      | BAL_EXTRACT      |     |      |           |        |
|  2 |  TABLE ACCESS FULL          | BAL_EXTRACT      | 116K| 2740K| 1110   (2)|00:00:14|
|  3 |  TABLE ACCESS BY INDEX ROWID| CUST_BAL         |   1 |   26 |    5   (0)|00:00:01|
|  4 |   INDEX RANGE SCAN          | CBL_DATE_CUST_IX |   1 |      |    4   (0)|00:00:01|
-----------------------------------------------------------------------------------------

The query completed in 25 seconds in the good run, telling us that we should expect to update about 372,000 rows. In the bad run, the query has not yet completed, although in 900 elapsed seconds (15 minute snapshot) we have used 644 CPU seconds.

You might be surprisied, though, that the bad execution plan has only done 4.7M buffer gets compared to the 2.6M from the good plan – and that may make you wonder how it’s managed to use so much more CPU.

We can look at the AWR report – and particularly the “Instance Statistics” to explain this apparent anomaly. I’ve picked out the numbers that (after eliminating some other possibilities that I won’t mention) are the ones that really matter:

Good Run:


Statistic 			Total 		per Second	per Trans
buffer is pinned count		1,649,019	1,846.60	1,088.46
table fetch by rowid		1,261,329	1,412.46	  832.56

Bad Run:


Statistic 			Total 		per Second	per Trans
buffer is pinned count		1,055,086,358 	1,174,806.15 	709,540.25
table fetch by rowid		  529,519,236	  589,603.36	356,099.02

People often forget that “buffer is pinned count” is another statistic that shows Oracle visiting buffer blocks – in this case blocks that have already been pinned, are still pinned, and don’t require any further latch activity before the latest visit takes place. These visits are less costly on CPU than “gets” because of the absence of latch activity, but if you do a lot of them, the CPU consumption can still become rather large. Large values for “buffer is pinned count” often go hand in hand with (large) index range scans.

So let’s go back to the execution plans.

Notice how the bad plan has an estimate of ONE row for the bal_extract table – when we know that we’re going to get 372 thousand rows. The good plan has an estimate of 116,000 rows – which is a much better estimate. The difference in estimates gives us a good clue about why this part of the plan has switched from an index range scan to a full table scan. (The difference in performance of this part of the query is not necessarily significant, but we might as well note the error).

For the second part of the execution the bad plan chooses to do a range scan of an index that looks (from its name) as if it is likely to be defined as just (transaction_date), while the good plan chooses an index the looks (from the name) as if it is likely to be defined as (transaction_date, cust_id). In both cases, the optimizer thinks that the result of using the chosen index is a single row from the table – and a single index entry that requires 3 or 4 block visits. In fact, we could reasonably guess that there are about 372,000 index entries (and table entries) per date in the cust_bal table, which explains why we see “buffer is pinned count” climb to one billion when we use index cbl_date_ix.

The source of the excessive CPU usage in the bad plan is that Oracle is using the wrong index to access the data. For each row in the bal_extract table it does a large range scan of the cbl_date_ix index, visiting lots of blocks in the cust_bal table. The underlying mechanisms of the nested loop, combined with the size of the data, the pattern of data scatter, and current activity on the system mean that most of this work is done whilst buffers are pinnned – which has helped keep the buffer “gets” to a reasonable level (which has helped to conceal the problem, perhaps) – but we still do far too much work.

The next question is: why has Oracle chosen the wrong index ?

The most obvious guess is that the statistics are wrong – and the most likely error is in the high value for the two date columns. With such simple predicates on the table, the only way that the optimizer could estimate one row from the bal_extract table when it’s really going to get 370 thousand is becauses of misleading stats, or a bug. Now, there are some odd bugs in some versions of Oracle relating to frequency histograms on columns that have only a single value – and we might be seeing something of this type. A more common issue, though, is that time-based columns (which include date and sequence columns) can easily end up with a “high_value” that goes out of date the moment after it has been set by a call to the dbms_stats package.

Using the 10g automatic stats collection routine, it might be days or weeks before enough data has changed to refresh the high value, and in that interval it is possible that plans can change from “high volume” plans (such as the tablescan we see in the good plan) to a “low volume” plan (such as the indexed access path in the bad plan) because the optimizer decides that the query is asking for data that is less and less likely to exist as the days go by.

I did say, though, that it was the second step in the plan that was more likely to be causing the problem. The same statistical issue could be the cause here. We (have guessed we) have one index on (transaction_date), and another on (transaction_date, cust_id). The single-column index is probably about half the size of the multi-column index in this case and seems to have a slightly smaller height, so when Oracle thinks there is very little (or no) data for a given transaction date, the arithmetic that chooses the index is much more likely to pick the smaller (inappropriate) index.

The solution for this user: first check my hypothesis – I think the stats have changed even though the DBAs “haven’t made any changes”. A quick check of the “last_analyzed_date” on the objects will tell you when the stats last changed, and it was probably just before the good plan appeared. In this case, given the licences to use the AWR, the DBA can also check the optimizer stats history (wri$_optstat%history views) in the AWR to see what the stats looked like before the last change.

If my hypothesis is correct then, at some arbitrary time in the future, the plans may change back to bad plans as the stats, data, and query get out of sync. So it would be a good idea to set up a schedule which forces a stats collection on this table at regular intervals, or write a little program that calls dbms_stats.set_column_stats to correct the “high_value” for the critical date columns every day before running the batch.

Of course, the details I have suggested to explain what’s happened in this example may be wrong – I don’t have access to the database to cross-check what I’ve assumed; but I can pretty certain that this change in plan relates in some way to a change in the statistics – and the scenario I’ve proposed is one which can happen quite easily.

Footnote 1: it might be possible to drop the (transaction_date) index, as its job will probably be handled by the (transaction_date, cust_id) index, and it could be quite benefitical to compress the latter on its first column – which is clearly highly repetitive.

Footnote 2: I have been advising people for the last seven or eight years that it’s perfectly reasonable to write programs to create statistics using the set_table_stats, set_index_stats and set_column_stats procedures in the dbms_stats package. Looking at Greg Rahn’s blog recently, I noticed that he has published links to some pdf files of presentations given at OOW by the Real World Performance Group which happened to include an example of how to do this – so it’s official, it’s legal, and it’s sanctioned by the top performance team in Oracle.

Comments (8)

8 Comments »

Another alternative (assuming your assumptions about the high value is correct) since the query uses bind variables might be to get Oracle’s blessing to disable bind variable peeking (_optim_peek_user_binds=false). In my experience it causes far more problems with unstable plans in a production environment than it fixes. I’m sure others’ mileage will vary.

Comment by Anthony Cordell — October 14, 2008 @ 8:41 pm BST Oct 14,2008 | Reply
How is disabling bind variable peeking going to fix issues when the bad execution plan is based on bad stats? If the query has not been parsed yet, it will be parsed with the bad plan. Once it is parsed, it’s parsed. If the query you are executing parses the same, it will use that plan. I may be wrong, but that’s how I see it.

Comment by Tom — October 17, 2008 @ 11:17 pm BST Oct 17,2008 | Reply
Jonathan,

I thought that adding outer join in the subquery add no meaning in the updating process. Is this my opinion correct ?

Thank you,
Rudi Kristanto

Comment by Rudi Kristanto — October 18, 2008 @ 5:20 am BST Oct 18,2008 | Reply
Rudi,

I believe you’re right about the outer/inner thing. When I saw it, I couldn’t think of an example of this type where it would make a difference (even if you have a nvl(sum()) in the subquery) and it wouldn’t make any difference in this case.

In fact, if you check the original SQL you’ll see that one predicate is “outered” and the other is “innered”. Because of the latter, Oracle will remove the “outer” anyway – but you would only be able to see that if you checked the filter predicates of the execution plan.

Comment by Jonathan Lewis — October 18, 2008 @ 6:53 am BST Oct 18,2008 | Reply
Tom,

If you disable bind-peeking for this update, the optimizer changes from:

"batch_date = {peeked value which is higher than the high_value}"

to

"batch_date = {unknown value that is assumed to be one of the num_distinct values in the column}"

so the aritmetic can change. In this particular case, if the “badness” of the stats is in the high value, but the rest of the stats (e.g. num_rows, num_distinct) are about right then the optimizer could produce a better plan.

Anthony,
I wouldn’t do this to solve problems with one or two queries. But on a temporary basis, to handle an urgent migration, it would be something I’d test if I had a huge number of problems that couldn’t be addressed any other way.

The experiences I’ve had of problems caused by peeking are largely related to the 10g upgrade resulting in lots of “unexpected” histograms being generated by the automatic stats collection.

Comment by Jonathan Lewis — October 18, 2008 @ 7:12 am BST Oct 18,2008 | Reply
Jonathan,

In 10g Oracle also takes automatic backups when gathering new statistics, so even without AWR license you could look at the backup statistics to compare them.
In 10.2.0.4 there are even diff_table_stats_* functions that you can use to compare statistics.

regards

Freek

Comment by Freek — October 18, 2008 @ 11:37 am BST Oct 18,2008 | Reply
Freek,

Thanks for the comment – I mentioned the relevant tables in the original note (wri$_opsstat_%_history) but I always think of wri$ tables as being part of AWR and therefore needing the licence. The correction and the comment about the new functions in dbms_stats in 10.2.0.4 are both useful.

For users of 9i, I wrote a note some time ago about how you can easily capture old stats as you create new stats.

Comment by Jonathan Lewis — October 18, 2008 @ 4:45 pm BST Oct 18,2008 | Reply
“For users of 9i, I wrote a note some time ago about how you can easily capture old stats as you create new stats.”

[unbelievable shameless plug]
or they van use my analyzethis package: http://freekdhooge.wordpress.com/analyzethis/
[/unbelievable shameless plug]

:)

Comment by dhoogfr — October 18, 2008 @ 6:34 pm BST Oct 18,2008 | Reply