Oracle Scratchpad

March 11, 2008

Everything Changes

Filed under: CBO, Execution plans, Statistics — Jonathan Lewis @ 8:59 pm UTC Mar 11,2008

Sometimes you look at a change in the optimizer arithmetic and think “Why did that take so long to appear?” Here’s an example that’s probably going to cause a lot of heartache over the next couple of years as people go through the upgrade process. I have a query that looks like this:

select
	padding
from
	t1
where
	n1	= 2
and	n2	= 2
;

Here are the execution plans from 10.1.0.4, 10.2.0.3, and 11.1.0.6 respectively:


Execution plan (10.1.0.4)
---------------------------------------------------------------------------
| Id  | Operation                   |  Name       | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |    10 |  2470 |    12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1          |    10 |  2470 |    12 |
|*  2 |   INDEX RANGE SCAN          | T1_I1       |    10 |       |     4 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"=2 AND "T1"."N2"=2)
.
.
Execution plan (10.2.0.3)
---------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    10 |  2470 |   168 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    10 |  2470 |   168 |
|*  2 |   INDEX RANGE SCAN          | T1_I1 |   200 |       |    25 |
---------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=2 AND "N2"=2)
.
.
Execution plan (11.1.0.6_)
---------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   200 | 49400 |   168 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |   200 | 49400 |   168 |
|*  2 |   INDEX RANGE SCAN          | T1_I1 |   200 |       |    25 |
---------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=2 AND "N2"=2)

To get the 10.2 and 11.1 plans, I actually had to hint the queries. (And I’ve cheated just a little bit – I had an extra constant, padding, column in the middle of the index to make it larger than the actual values suggest; but I’ve removed that column reference from the predicate list)

Notice how the cardinalities (rows) and costs change. I have 4,000 rows in the table, and the n1 and n2 columns have 20 distinct values each. However, the way I’ve defined the data means that n1 and n2 are always the same.

In 10.1, the optimizer does it’s arithmetic by considering the n1 and n2 values separately – so it decides there are 400 (20 * 20) possible distinct combinations of n1 and n2, so for both the index and table selectivity it decides that the cardinality will be 10 (4,000 / 400). This means the index scan will be small (cost = 4) and we will visit just a few blocks in the table (incremental cost = 8).

In 10.2, the optimizer uses the index distinct_keys to calculate the number of index entries scanned. There are 20 distinct combinations, so we visit 200 entries (4,000 / 20). This is a larger range scan (cost = 25) and we have to visit a lot more blocks in the table (incremental cost = 143). Unfortunately, when we get to the table the optimizer decides that the number of rows we find is derived by considering the number of distinct column values separately – so we are back to 10 (4,000 / (20 * 20)).

Finally, in 11.1, the optimizer calculates the index selectivity and the table selectivity from the number of distinct keys in the index; so not only are the index and table costs high, the estimate for the number of rows in the table is also 200.

In many cases the incremental enhancements to the optimizer will make no difference to most people – but occasionally the increased costs or the increased cardinalities will change an execution plan, possibly making it better, possibly making it worse.

It’s important to find the differences in simple examples like this, because it makes it so much easier to understand what’s changed in complex examples.

Footnote: I had forgotten that I previously published this example (without the 11g change) some time ago.

6 Comments »

  1. I have a curiosity if you don’t mind and have some time to “spare” – does the CBO using distinct_keys show up in the 10053 trace as something like “using concatenated index cardinality”, or is this improvement used silently ?

    Comment by Alberto Dell'Era — March 11, 2008 @ 10:44 pm UTC Mar 11,2008 | Reply

  2. I’m not able to reproduce this behavior on my 10.2.0.3.1 instance. Would you please post your table and index create scripts?

    Comment by Scott — March 12, 2008 @ 7:39 pm UTC Mar 12,2008 | Reply

  3. Alberto,
    There are no obvious clues in the 10g trace, but the 11g trace has a comment about ‘Column Groups’ – so the last change may simply be a part of the general enhancement relating to the “extended statistics” (the reference to PredCnt 3 is explained by the actual code I used – see below):

    ColGroup Usage:: PredCnt: 3 Matches Full: #0 Partial: Sel: 0.0500

    Scott,
    The SQL, excluding the call to dbms_stats, is as follows:

    create table t1
    as
    select
            mod(rownum,20)          n1,
            mod(rownum,20)          n2,
            rpad('x',40,'x')        ind_pad,
            rpad('x',200)           padding
    from
            all_objects
    where
            rownum <= 4000
    ;

    create index t1_i1 on t1(n1, ind_pad, n2)
    pctfree 91
    ;

    select
            /*+ 
                    index(t1) 
            */
            padding
    from
            t1
    where
            n1      = 2
    and     ind_pad = rpad('x',40,'x')
    and     n2      = 2
    ;

    Comment by Jonathan Lewis — March 12, 2008 @ 8:23 pm UTC Mar 12,2008 | Reply

  4. [...] by Everything Changes « Oracle Scratchpad — March 13, 2008 @ 1:31 pm UTC Mar [...]

    Pingback by Index not used (10g) « Oracle Scratchpad — March 13, 2008 @ 1:33 pm UTC Mar 13,2008 | Reply

  5. [...] is Oracle’s ever-evolving Cost Based Optimizer. Jonathan Lewis at Oracle Scratchpad looks at CBO and its erratic growing pains from version 9 onwards and why it behaves the way it [...]

    Pingback by Log Buffer #89: a Carnival of the Vanities for DBAs — March 21, 2008 @ 5:00 pm UTC Mar 21,2008 | Reply

  6. Greetings Jonathan,

    Thanks for the excellent tip on how the execution plans display differently between Oracle 10gR1 and Oracle 11gR1. By the way hope to see you in San Diego sometime in the future.

    Cheers,
    Ben Prusinski, Oracle DBA/Consultant

    Comment by Ben Prusinski — April 25, 2008 @ 8:12 pm UTC Apr 25,2008 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a comment

Blog at WordPress.com.