Oracle Scratchpad

September 7, 2010

CBO Surprise

Filed under: CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 7:00 pm BST Sep 7,2010

Well, it surprised me!

I’ve said for a very long time that in principle, ignoring bugs and restrictions, the optimizer will always choose the lowest cost option during its search for an execution path. It turns out that this isn’t true. In a comment attached to a note I had written about a possible bug relating to function-based indexes I was told that there are cases where the optimizer follows a rule that allows it to ignore the lowest cost path if it is derived from a range-based predicate involving unpeekable bind variables.

The trouble is, any statement made about “bind variables” may also apply in any circumstances where the optimizer sees: “unknown value”. Here’s a simplified example that I find a little worrying (running on 11.1):

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		rownum <= 10000
)
select
	rownum			id,
	mod(rownum-1,20)	scattered,
	trunc(rownum / 500)	clustered,
	lpad(mod(rownum,2),10)	ind_pad,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 40000
;

create index t1_equi  on t1(scattered, ind_pad, clustered);
create index t1_range on t1(clustered, scattered);

-- call dbms_stats to compute table stats with no histograms

set autotrace traceonly explain

select
	small_vc
from
	t1
where
	scattered = 10
and	clustered between (select 40 from dual)
	          and     (select 41 from dual)
;

select
	/*+ index(t1 t1_range) */
	small_vc
from
	t1
where
	scattered = 10
and	clustered between (select 40 from dual)
	          and     (select 41 from dual)
;

The SQL with its “select constant from dual” may look a little artificial – but it represents a strategy that is used quite commonly; and it’s just one of several code patterns that can kick the optimizer into using the arithmetic for “unknown value at compile time” (another would the case where you use sys_context(), possibly in fine-grained access control – a.k.a. row-level security or virtual private database).

I’ve got two possible indexes on the table that could be used to assist this query – and one of them is significantly larger than the other because it has an extra column (ind_pad) in the middle which does not appear in the where clause. which index is the optimizer going to use ?

-- Default path
-----------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     5 |    85 |    17 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1      |     5 |    85 |    13 |
|*  2 |   INDEX RANGE SCAN          | T1_EQUI |     9 |       |     9 |
|   3 |    FAST DUAL                |         |     1 |       |     2 |
|   4 |    FAST DUAL                |         |     1 |       |     2 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SCATTERED"=10 AND "CLUSTERED">= (SELECT 40 FROM
              "SYS"."DUAL" "DUAL") AND "CLUSTERED"<= (SELECT 41
              FROM "SYS"."DUAL" "DUAL"))
       filter("CLUSTERED">= (SELECT 40 FROM "SYS"."DUAL" "DUAL") AND
              "CLUSTERED"<= (SELECT 41 FROM "SYS"."DUAL" "DUAL"))

-- Hinted to user t1_range index
------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |     5 |    85 |    10 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1       |     5 |    85 |     6 |
|*  2 |   INDEX RANGE SCAN          | T1_RANGE |     9 |       |     2 |
|   3 |    FAST DUAL                |          |     1 |       |     2 |
|   4 |    FAST DUAL                |          |     1 |       |     2 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CLUSTERED">= (SELECT 40 FROM "SYS"."DUAL" "DUAL") AND
              "SCATTERED"=10 AND "CLUSTERED"<= (SELECT 41 FROM "SYS"."DUAL" "DUAL"))
       filter("SCATTERED"=10)

The optimizer has selected the path with the higher cost – and it’s hard-coded to do this in response to the nature of the predicates and the available choice of indexes. We have two indexes, with predicate usage as follows:

t1_equi (scattered, ind_pad, clustered) with predicates that are: (equality, omitted, range)
t1_range (clustered, scattered) with predicates that are: (range, equality)

The important point is that we have a predicate on column clustered which is range-based with values that are unknown at optimisation time (even though the human eye can see that the values are going to be 40 and 41). Because of this uncertainty the optimizer is coded to bypass the index that starts with this suspect column and use the (higher cost) index where there is a known starting predicate. In effect the optimizer seems to be saying: “in the worst case scenario t1_range might end up doing an index full scan but t1_equi will only have to do a partial range scan”.

I’ve still got a lot of examples I want to work through to see how far ranging this rule is and if it ever gets ignored (what happens, for example, to joins with range-based predicates), but I hope that this preliminary note acts a useful clue when you next see Oracle ignoring the lowest cost path.

Update Sept 2013

I’ve used a subquery “(select constant from dual)” to hide a value from the optimizer in the examples above – but I’ve recently discovered that this no longer works in 12c or 11.2.0.4. It looks as if those versions know that dual is a special case and peek inside the subquery to get the value before doing the rest of the optimization.  (Fix control 11813257  looks relevant; it has the description : “selectivity of predicate with subquery returning constant” so perhaps it’s not just dual that makes a difference.)

 

7 Comments »

  1. I’ve csme across this once in a complex statement that I didn’t have time to figure out what odd CBO “quirk” I was dealing with- Excellent examples and clear explanations, thank you, Jonathan!

    Comment by Kellyn Pedersen — September 8, 2010 @ 2:20 am BST Sep 8,2010 | Reply

  2. Jonathan,

    I am afraid revelations like these are making me “atheist” about Oracle :)
    On a serious note, thanks for sharing this.

    Comment by Narendra — September 8, 2010 @ 10:38 am BST Sep 8,2010 | Reply

  3. Hi Jonathan,

    I tested it on 11.2 and it seems to be fixed in 11.2.

    See my test (I changed the name of t1 to t1xx)

    SQL&gt; startup force
    ORACLE instance started.
    Total System Global Area 1068937216 bytes
    Fixed Size                  2220200 bytes
    Variable Size             977276760 bytes
    Database Buffers           83886080 bytes
    Redo Buffers                5554176 bytes
    Database mounted.
    Database opened.
    SQL&gt;
    SQL&gt; connect scott/regit
    Connected.
    SQL&gt; set line 200
    SQL&gt; set pagesize 0
    SQL&gt; alter session set optimizer_use_sql_plan_baselines = false
      2  ;
    
    Session altered.
    
    SQL&gt; select
        small_vc
    from
        t1xx
    where
        scattered = 10
    and clustered between (select 40 from dual)
                  and     (select 41 from dual)  2    3    4    5    6    7    8  ;
    0000020011
    0000020031
    0000020051
    ……..
    0000020071
    
    50 rows selected.
    
    SQL&gt; select * from table(dbms_xplan.display_cursor)
      2  ;
    SQL_ID  bsvcwujxd02p6, child number 0
    -------------------------------------
    select     small_vc from     t1xx where     scattered = 10 and
    clustered between (select 40 from dual)               and     (select
    41 from dual)
    
    Plan hash value: 1241395774
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |       |       |    17 (100)|          |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T1XX    |     5 |   140 |    13   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN          | T1_EQUI |     9 |       |     9   (0)| 00:00:01 |
    |   3 |    FAST DUAL                |         |     1 |       |     2   (0)| 00:00:01 |
    |   4 |    FAST DUAL                |         |     1 |       |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("SCATTERED"=10 AND "CLUSTERED"&gt;= AND "CLUSTERED"= AND "CLUSTERED" select
        /*+ index(t1 t1_range) */
        small_vc
    from
        t1xx
    where
        scattered = 10
    and clustered between (select 40 from dual)
                  and     (select 41 from dual)  2    3    4    5    6    7    8    9  ;
    0000020011
    0000020031
    …..
    0000020951
    0000020971
    0000020991
    
    50 rows selected.
    
    SQL&gt; select * from table(dbms_xplan.display_cursor) ;
    SQL_ID  da0446kv1h1v8, child number 0
    -------------------------------------
    select     /*+ index(t1 t1_range) */     small_vc from     t1xx where
      scattered = 10 and clustered between (select 40 from dual)
       and     (select 41 from dual)
    
    Plan hash value: 1241395774
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |       |       |    17 (100)|          |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T1XX    |     5 |   140 |    13   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN          | T1_EQUI |     9 |       |     9   (0)| 00:00:01 |
    |   3 |    FAST DUAL                |         |     1 |       |     2   (0)| 00:00:01 |
    |   4 |    FAST DUAL                |         |     1 |       |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("SCATTERED"=10 AND "CLUSTERED"&gt;= AND "CLUSTERED"= AND "CLUSTERED"&lt;=))
    
    
    24 rows selected.
    

    Regards Hans-Peter

    Comment by Hans-Peter — September 9, 2010 @ 11:57 am BST Sep 9,2010 | Reply

  4. Dear Hans-Peter,

    the problem reproduces even on 11.2.0.1.2. You have accidentally not given the right table name/alias in the hint for the second example. This is why the optimizer did not honour your hint:

    should be :

    /*+ index(t1xx t1_range) */
    instead of:
    /*+ index(t1 t1_range) */

    Regards,
    Martin

    Comment by Martin Decker — October 1, 2010 @ 6:56 am BST Oct 1,2010 | Reply

  5. [...] Update (31st Oct 2010):  I did a little extra work on this issue a few weeks later ago (see comments and pingback below), but forgot to link forward to the new article. My original conclusions were wrong; for more details see: http://jonathanlewis.wordpress.com/2010/09/07/cbo-surprise-3/ [...]

    Pingback by FBI Bug « Oracle Scratchpad — October 31, 2010 @ 8:31 am BST Oct 31,2010 | Reply

  6. [...] this version of Oracle this resulted in the predicates being treated as “guesses on an index range scan” – which resulted in the optimizer ignoring the appropriate index (until hinted) even though [...]

    Pingback by FBI Bug « Oracle Scratchpad — December 30, 2011 @ 5:48 pm BST Dec 30,2011 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,267 other followers