Oracle Scratchpad

September 7, 2010

CBO Surprise

Filed under: CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 7:00 pm BST Sep 7,2010

Well, it surprised me!

I’ve said for a very long time that in principle, ignoring bugs and restrictions, the optimizer will always choose the lowest cost option during its search for an execution path. It turns out that this isn’t true. I learned from comment 5 of this note about a possible bug relating to function-based indexes that there are cases where the optimizer follows a rule that allows it to ignore the lowest cost path if it is derived from a range-based predicate involving unpeekable bind variables. In short this means that:

there are cases where the optimizer can choose to discard an execution plan with a lower cost in favour of an execution plan with a higher cost.


It’s worth remembering, though, that any statement made about “bind variables” may also apply in any circumstances where the optimizer sees: “unknown/unknowable value”. Here’s a simplified example that I find a little worrying (running on 11.1):

rem
rem     Script:         index_range.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2010
rem 

create table t1
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                rownum <= 10000   -- > comment to avoid wordpress format issue
)
select
        rownum                  id,
        mod(rownum-1,20)        scattered,
        trunc(rownum / 500)     clustered,
        lpad(mod(rownum,2),10)  ind_pad,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum  <= 40000 -- > comment to avoid WordPress format issue
;

create index t1_equi  on t1(scattered, ind_pad, clustered);
create index t1_range on t1(clustered, scattered);

begin
        dbms_stats.gather_table_stats(
                ownname      => user,
                tabname      => 'T1',
                cascade      => true,
                method_opt   => 'for all columns size 1'
        );
end;
/

set autotrace traceonly explain

select
        small_vc
from
        t1
where
        scattered = 10
and     clustered between (select 40 from dual)
                  and     (select 41 from dual)
;

select
        /*+ index(t1 t1_range) */
        small_vc
from
        t1
where
        scattered = 10
and     clustered between (select 40 from dual)
                  and     (select 41 from dual)
;

set autotrace off


This SQL with its “select constant from dual” subqueries may look a little artificial – but it represents a strategy that is used quite commonly; and it’s just one of several code patterns that can kick the optimizer into using the arithmetic for “unknown value at compile time” (another case would be where you use sys_context(), possibly in fine-grained access control (fgac) – a.k.a. row-level security (rls) or virtual private database(vpd) – or its big brother Oracle Label Security (ols)).

I’ve got two possible indexes on the table that could be used to assist this query – and one of them is significantly larger than the other because it has an extra column (ind_pad) in the middle which does not appear in the where clause. which index is the optimizer going to use ?

Default path

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     5 |    85 |    17   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1      |     5 |    85 |    13   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_EQUI |     9 |       |     9   (0)| 00:00:01 |
|   3 |    FAST DUAL                |         |     1 |       |     2   (0)| 00:00:01 |
|   4 |    FAST DUAL                |         |     1 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SCATTERED"=10 AND "CLUSTERED" .ge. (SELECT 40 FROM "SYS"."DUAL"
              "DUAL") AND "CLUSTERED".le. (SELECT 41 FROM "SYS"."DUAL" "DUAL"))
       filter("CLUSTERED".ge. (SELECT 40 FROM "SYS"."DUAL" "DUAL") AND
              "CLUSTERED".le. (SELECT 41 FROM "SYS"."DUAL" "DUAL"))

Hinted path

----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |     5 |    85 |    10   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1       |     5 |    85 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_RANGE |     9 |       |     2   (0)| 00:00:01 |
|   3 |    FAST DUAL                |          |     1 |       |     2   (0)| 00:00:01 |
|   4 |    FAST DUAL                |          |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CLUSTERED".ge. (SELECT 40 FROM "SYS"."DUAL" "DUAL") AND
              "SCATTERED"=10 AND "CLUSTERED".le. (SELECT 41 FROM "SYS"."DUAL" "DUAL"))
       filter("SCATTERED"=10)

NOTE: I’ve edited “greater than or equal to” to “.ge.” and “less than or equal to” to “.le” to avoid losing lots of text to the wordpress HTML formatter. NOTE also that this example will not produce the same effect on versions above 11.1.0..7 for reasons given below, and slightly a more complex example is needed.

The optimizer has selected the path with the higher cost – and it’s hard-coded to do this in response to the nature of the predicates and the available choice of indexes. We have two indexes, with predicate usage as follows:

t1_equi (scattered, ind_pad, clustered) with predicates that are: (equality, omitted, range)
t1_range (clustered, scattered) with predicates that are: (range, equality)

The important point is that we have a predicate on column clustered which is range-based with values that are unknown at optimisation time (even though the human eye can see that the values are going to be 40 and 41). Because of this uncertainty the optimizer is coded to bypass the index that starts with this suspect column and use the (higher cost) index where there is a known starting predicate. In effect the optimizer seems to be saying: “in the worst case scenario t1_range might end up doing an index full scan but t1_equi will only have to do a partial range scan”.

I’ve still got a lot of examples I want to work through to see how far ranging this rule is and if it ever gets ignored (what happens, for example, to joins with range-based predicates), but I hope that this preliminary note acts a useful clue when you next see Oracle ignoring the lowest cost path. If you want to check back to comment 8 of the previous note you’ll see that document 4112254.8 gives us some information about event 38068 and setting a level which controls a “cut-over point” (given as a percentage) that Oracle uses to decide whether or not to use a higher cost path.

Update Sept 2013

I’ve used a subquery “(select {constant} from dual)” to hide a value from the optimizer in the examples above – but I’ve recently discovered that this strategy no longer works in 12c or 11.2.0.4. It looks as if those versions know that dual is a special case and “peek” inside the subquery to get the value before doing the rest of the optimization.  (Fix control 11813257  looks relevant; it has the description : “selectivity of predicate with subquery returning constant” and it turns out that it’s not just dual that makes a difference – though it’s the most likely one that people will use.)

7 Comments »

  1. I’ve csme across this once in a complex statement that I didn’t have time to figure out what odd CBO “quirk” I was dealing with- Excellent examples and clear explanations, thank you, Jonathan!

    Comment by Kellyn Pedersen — September 8, 2010 @ 2:20 am BST Sep 8,2010 | Reply

  2. Jonathan,

    I am afraid revelations like these are making me “atheist” about Oracle :)
    On a serious note, thanks for sharing this.

    Comment by Narendra — September 8, 2010 @ 10:38 am BST Sep 8,2010 | Reply

  3. […] Update (31st Oct 2010):  I did a little extra work on this issue a few weeks later (see comments and pingback below), but forgot to link forward to the new article. My original conclusions were wrong; for more details see: https://jonathanlewis.wordpress.com/2010/09/07/cbo-surprise-3/ […]

    Pingback by FBI Bug « Oracle Scratchpad — October 31, 2010 @ 8:31 am GMT Oct 31,2010 | Reply

  4. […] this version of Oracle this resulted in the predicates being treated as “guesses on an index range scan” – which resulted in the optimizer ignoring the appropriate index (until hinted) even though […]

    Pingback by FBI Bug « Oracle Scratchpad — December 30, 2011 @ 5:48 pm GMT Dec 30,2011 | Reply

  5. […] The hinted path has a lower cost than the default path. I think this may be another case of an “unknowable” range scan being ignored in favour of a known […]

    Pingback by Conditional SQL – 5 | Oracle Scratchpad — March 2, 2018 @ 12:49 pm GMT Mar 2,2018 | Reply

  6. […] is an update on a post I wrote nearly 10 years ago describing how the optimizer could choose to ignore a lower cost indexed access path and use a […]

    Pingback by Lower Cost Ignored | Oracle Scratchpad — June 17, 2020 @ 3:49 pm BST Jun 17,2020 | Reply

  7. […] But after a moment’s thought you can dismiss this one (possibly correctly) as an example of the optimizer being cautious about the cost of access paths that are dictated by bind variables or unpeekable inputs. (But these […]

    Pingback by Hints and Costs | Oracle Scratchpad — October 7, 2021 @ 12:06 pm BST Oct 7,2021 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.