Oracle Scratchpad

March 11, 2008

Everything Changes

Filed under: CBO,Execution plans,Statistics — Jonathan Lewis @ 8:59 pm GMT Mar 11,2008

Sometimes you look at a change in the optimizer arithmetic and think “Why did that take so long to appear?” Here’s an example that’s probably going to cause a lot of heartache over the next couple of years as people go through the upgrade process. I have a query that looks like this:

select
	padding
from
	t1
where
	n1	= 2
and	n2	= 2
;

Here are the execution plans from 10.1.0.4, 10.2.0.3, and 11.1.0.6 respectively:

Execution plan (10.1.0.4)
---------------------------------------------------------------------------
| Id  | Operation                   |  Name       | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |    10 |  2470 |    12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1          |    10 |  2470 |    12 |
|*  2 |   INDEX RANGE SCAN          | T1_I1       |    10 |       |     4 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"=2 AND "T1"."N2"=2)
.
.
Execution plan (10.2.0.3)
---------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    10 |  2470 |   168 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    10 |  2470 |   168 |
|*  2 |   INDEX RANGE SCAN          | T1_I1 |   200 |       |    25 |
---------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=2 AND "N2"=2)
.
.
Execution plan (11.1.0.6)
---------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   200 | 49400 |   168 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |   200 | 49400 |   168 |
|*  2 |   INDEX RANGE SCAN          | T1_I1 |   200 |       |    25 |
---------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=2 AND "N2"=2)

To get the 10.2 and 11.1 plans, I actually had to hint the queries. (And I’ve cheated just a little bit – I had an extra, constant, “padding” column in the middle of the index to make it larger than the actual values suggest; but I’ve removed that column reference from the predicate list)

Notice how the cardinalities (rows) and costs change. I have 4,000 rows in the table, and the n1 and n2 columns have 20 distinct values each. However, the way I’ve defined the data means that n1 and n2 are always the same.

In 10.1, the optimizer does it’s arithmetic by considering the n1 and n2 values separately – so it decides there are 400 (20 * 20) possible distinct combinations of n1 and n2, so for both the index and table selectivity it decides that the cardinality will be 10 (4,000 / 400). This means the index scan will be small (cost = 4) and we will visit just a few blocks in the table (incremental cost = 8).

In 10.2, the optimizer uses the index distinct_keys to calculate the number of index entries scanned. There are 20 distinct combinations, so we visit 200 entries (4,000 / 20). This is a larger range scan (cost = 25) and we have to visit a lot more blocks in the table (incremental cost = 143). Unfortunately, when we get to the table the optimizer decides that the number of rows we find is derived by considering the number of distinct column values separately – so we are back to 10 (4,000 / (20 * 20)).

Finally, in 11.1, the optimizer calculates the index selectivity and the table selectivity from the number of distinct keys in the index; so not only are the index and table costs high, the estimate for the number of rows in the table is also 200.

In many cases the incremental enhancements to the optimizer will make no difference to most people – but occasionally the increased costs or the increased cardinalities will change an execution plan, possibly making it better, possibly making it worse.

It’s important to find the differences in simple examples like this, because it makes it so much easier to understand what’s changed in complex examples.

Footnote: I had forgotten that I previously published this example (without the 11g change) some time ago.

10 Comments »

  1. I have a curiosity if you don’t mind and have some time to “spare” – does the CBO using distinct_keys show up in the 10053 trace as something like “using concatenated index cardinality”, or is this improvement used silently ?

    Comment by Alberto Dell'Era — March 11, 2008 @ 10:44 pm GMT Mar 11,2008 | Reply

  2. I’m not able to reproduce this behavior on my 10.2.0.3.1 instance. Would you please post your table and index create scripts?

    Comment by Scott — March 12, 2008 @ 7:39 pm GMT Mar 12,2008 | Reply

  3. Alberto,
    There are no obvious clues in the 10g trace, but the 11g trace has a comment about ‘Column Groups’ – so the last change may simply be a part of the general enhancement relating to the “extended statistics” (the reference to PredCnt 3 is explained by the actual code I used – see below):

    ColGroup Usage:: PredCnt: 3 Matches Full: #0 Partial: Sel: 0.0500

    Scott,
    The SQL, excluding the call to dbms_stats, is as follows:

    create table t1
    as
    select
            mod(rownum,20)          n1,
            mod(rownum,20)          n2,
            rpad('x',40,'x')        ind_pad,
            rpad('x',200)           padding
    from
            all_objects
    where
            rownum <= 4000
    ;

    create index t1_i1 on t1(n1, ind_pad, n2)
    pctfree 91
    ;

    select
            /*+ 
                    index(t1) 
            */
            padding
    from
            t1
    where
            n1      = 2
    and     ind_pad = rpad('x',40,'x')
    and     n2      = 2
    ;

    Comment by Jonathan Lewis — March 12, 2008 @ 8:23 pm GMT Mar 12,2008 | Reply

  4. […] by Everything Changes « Oracle Scratchpad — March 13, 2008 @ 1:31 pm UTC Mar […]

    Pingback by Index not used (10g) « Oracle Scratchpad — March 13, 2008 @ 1:33 pm GMT Mar 13,2008 | Reply

  5. […] is Oracle’s ever-evolving Cost Based Optimizer. Jonathan Lewis at Oracle Scratchpad looks at CBO and its erratic growing pains from version 9 onwards and why it behaves the way it […]

    Pingback by Log Buffer #89: a Carnival of the Vanities for DBAs — March 21, 2008 @ 5:00 pm GMT Mar 21,2008 | Reply

  6. Greetings Jonathan,

    Thanks for the excellent tip on how the execution plans display differently between Oracle 10gR1 and Oracle 11gR1. By the way hope to see you in San Diego sometime in the future.

    Cheers,
    Ben Prusinski, Oracle DBA/Consultant

    Comment by Ben Prusinski — April 25, 2008 @ 8:12 pm BST Apr 25,2008 | Reply

  7. I have a basic doubt in the example you have illustrated above.

    Is it not true that in 11.1 the Optimizer sees a more realistic value of the index & table selectivity ? Given this, is this plan not better than the ones in 10g. The 10g plans seem to have a lesser cost but they don’t seem to have calculated the selectivity correctly ?

    Please evaluate my understanding.

    Comment by Mukund — December 15, 2009 @ 3:30 pm GMT Dec 15,2009 | Reply

    • Mukund,

      Your observation is correct – this cost is more realistic, and the change is an improvement in the optimizer’s algorithms.

      The point I was trying to make, though, was that any change (even a logical improvement) will be bad news for someone because any change in an execution path may be unlucky for their particular case; so knowing about the existence of the change, and some details of how the change works, is important.

      Comment by Jonathan Lewis — December 16, 2009 @ 10:37 am GMT Dec 16,2009 | Reply

  8. […] is Oracle’s ever-evolving Cost Based Optimizer. Jonathan Lewis at Oracle Scratchpad looks at the CBO and its erratic growing pains from version 9 onwards and why it behaves the way it […]

    Pingback by Log Buffer #89: A Carnival of the Vanities for DBAs — February 12, 2013 @ 5:12 pm GMT Feb 12,2013 | Reply

  9. […] is actually an example of a feature of the optimizer that I described a few years ago. The original note described a change as you moved from 10.1 to […]

    Pingback by Index Selectivity | Oracle Scratchpad — February 9, 2014 @ 5:47 pm GMT Feb 9,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.