Oracle Scratchpad

March 12, 2007

Methods

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 9:52 pm GMT Mar 12,2007

The following question appeared a little while ago on comp.databases.oracle.server, with reference to Oracle 10.2.0.1:

 
drop table t1 purge;        

create table t1 
as 
select * from all_objects;         

create index t1_fbi1 on t1(lower(object_name)); 
create index t1_i1   on t1(object_name);         

-- now generate statistics 

Given the above data and indexes, why do the following two queries show significantly different execution plans ?

 
SQL> set autotrace traceonly explain       

SQL> select  max(object_name) 
  2  from    t1 
  3  ;       

Execution Plan 
---------------------------------------------------------- 
Plan hash value: 1743745495       

------------------------------------------------------------------------------------ 
| Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     | 
------------------------------------------------------------------------------------ 
|   0 | SELECT STATEMENT           |       |     1 |    25 |     2   (0)| 00:00:01 | 
|   1 |  SORT AGGREGATE            |       |     1 |    25 |            |          | 
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I1 | 46531 |  1136K|     2   (0)| 00:00:01 | 
------------------------------------------------------------------------------------       

SQL> select  max(lower(object_name)) 
  2  from    t1 
  3  ;       

Execution Plan 
---------------------------------------------------------- 
Plan hash value: 3308075536       

--------------------------------------------------------------------------------- 
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT      |         |     1 |    25 |    65   (2)| 00:00:01 | 
|   1 |  SORT AGGREGATE       |         |     1 |    25 |            |          | 
|   2 |   INDEX FAST FULL SCAN| T1_FBI1 | 46531 |  1136K|    65   (2)| 00:00:01 | 
--------------------------------------------------------------------------------- 

Note how the query that can use the simple index uses the special min/max optimisation for index range/full scans that appeared in the 8.1 timeline, but the query that can use the function-based index does a fast full scan and sort of the index – and a check of the resource usage shows that autotrace is telling us the truth about the plans in both cases.

To investigate this type of problem, one of my first “tricks” is simply to tell the optimizer to do what I think it should do. In this case, give it a hint to use the index properly.

 
SQL> select  /*+ index(t1) */ 
  2  	     max(lower(object_name)) 
  3  from    t1 
  4  ;      

Execution Plan 
---------------------------------------------------------- 
Plan hash value: 1546143440      

--------------------------------------------------------------------------------------- 
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT            |         |     1 |    25 |   235   (1)| 00:00:03 | 
|   1 |  SORT AGGREGATE             |         |     1 |    25 |            |          | 
|   2 |   FIRST ROW                 |         | 46531 |  1136K|   235   (1)| 00:00:03 | 
|   3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 46531 |  1136K|   235   (1)| 00:00:03 | 
--------------------------------------------------------------------------------------- 

The path I was expecting has appeared – with an interesting “first row” operation and a surprising cost ! A quick check of resource usage shows that Oracle used the path given, with minimal resource usage, confirming that the cost is a serious miscalculation. So where does that cost come from.

Change the query slightly, and you’ll see:

 
SQL> select	/*+ index(t1) */ 
  2  	object_name 
  3  from	t1 
  4  ;     

Execution Plan 
---------------------------------------------------------- 
Plan hash value: 2969740442     

-------------------------------------------------------------------------- 
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     | 
-------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT |       | 46531 |  1136K|   235   (1)| 00:00:03 | 
|   1 |  INDEX FULL SCAN | T1_I1 | 46531 |  1136K|   235   (1)| 00:00:03 | 
-------------------------------------------------------------------------- 

The index-hinted query to find the max() did actually use the min/max access path at run-time – we can see that from the resource usage – but the optimizer used the cost for a simple full scan, which is rather expensive and made the default behaviour switch to the fast full scan with sort. It’s some sort of bug in the optimizer.

Interestingly, we can get the min/max plan to appear by adding a predicate to the query that (notionally) addresses any problems that might be caused by nulls:

 
SQL> select  max(lower(object_name)) 
  2  from    t1 
  3  where   lower(object_name) is not null 
  4  ;    

Execution Plan 
---------------------------------------------------------- 
Plan hash value: 1546143440    

--------------------------------------------------------------------------------------- 
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     | 
--------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT            |         |     1 |    25 |     2   (0)| 00:00:01 | 
|   1 |  SORT AGGREGATE             |         |     1 |    25 |            |          | 
|   2 |   FIRST ROW                 |         | 46531 |  1136K|     2   (0)| 00:00:01 | 
|*  3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 46531 |  1136K|     2   (0)| 00:00:01 | 
---------------------------------------------------------------------------------------    

Predicate Information (identified by operation id): 
--------------------------------------------------- 
   3 - filter(LOWER("OBJECT_NAME") IS NOT NULL)    

So ultimately we have to go to the 10053 trace file to try and pin down the exact nature of the bug.  In the trace file, we can see that the version of code with the ‘not null’ predicate is the only one that results in a section of the Single Table Access Path calculation that examines the possibility of the min/max path.

 
  Access Path: index (Min/Max) 
    Index: T1_FBI1 
    resc_io: 2.00  resc_cpu: 14443 
    ix_sel: 2.1491e-005  ix_sel_with_filters: 2.1491e-005 
    Cost: 2.00  Resp: 2.00  Degree: 1 

From this, we can conclude the problem lies in the optimizer failing to spot the option for using the min/max path in the default scenario, rather than the optimizer doing the wrong calculation for the path.

It doesn’t however, tell us why the run-time engine can apparently use the min/max optimisation when the optimizer obeyed our hint and generated a plan that included a full index scan. The trace file in the hinted case only showed the full scan calculation, it didn’t suggest a min/max, nor a “first row”, nor a descending scan.

And just one final thought – the problem shouldn’t have anything to do with the fact that we have supplied an explicit ‘not null’ predicate. If this were an example of a “nulls not in index” problem, the optimizer would have to fall back to using a tablescan in every case.

And at this point, I usually pass the buck to Oracle support.

4 Comments »

  1. Hi Jonathan,

    your blog entry led me to another example, where Oracle does not seem to be able to get to the best execution strategy.

    create table test (aaa varchar2(30), bbb integer ); 
     begin 
     for i in 1..10000 loop 
      insert into test values ('TESTETSTESTSTS'||i, i); 
      end loop; 
      end; 
      / 
    
    insert /*+APPEND*/ into test select * from test; 
    [.. repeat...] 
    
    640000 rows created.
    
    INDEX_NAME          BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS 
    --------------- ---------- ----------- ------------- ----------------- ---------- 
    I_TEST_A                 2        5337         10000           1280000    1280000 
    I_TEST_B                 2        2675         10000           1280000    1280000 
    
    TABLE_NAME        NUM_ROWS     BLOCKS 
    --------------- ---------- ---------- 
    TEST               1280000       5035 
    
    COLUMN_NAME     NUM_DISTINCT LOW_VALUE  HIGH_VALUE    DENSITY  NUM_NULLS 
    --------------- ------------ ---------- ---------- ---------- ---------- 
    AAA                    10000 5445535445 5445535445      .0001          0 
                                 5453544553 5453544553 
                                 5453545331 5453545339 
                                            393939 
    
    BBB                    10000 C102       C302            .0001          0 
    
    Now the query should get the minimum AND the maximum of test.aaa at once:
    select max(aaa), min(aaa) from test;
    
    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |     1 |    18 |   926   (5)| 00:00:10 |
    |   1 |  SORT AGGREGATE    |      |     1 |    18 |            |          |
    |   2 |   TABLE ACCESS FULL| TEST |  1280K|    21M|   926   (5)| 00:00:10 |
    ---------------------------------------------------------------------------
    Unfortunately the index is not chosen.
    
    But:
    select (select max(aaa) from test), (select min(aaa) from test) from dual;
    ---------------------------------------------------------------------------------------
    | Id  | Operation                  | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT           |          |     1 |       |     2   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE            |          |     1 |    18 |            |          |
    |   2 |   INDEX FULL SCAN (MIN/MAX)| I_TEST_A |  1280K|    21M|     2   (0)| 00:00:01 |
    |   3 |  SORT AGGREGATE            |          |     1 |    18 |            |          |
    |   4 |   INDEX FULL SCAN (MIN/MAX)| I_TEST_A |  1280K|    21M|     2   (0)| 00:00:01 |
    |   5 |  FAST DUAL                 |          |     1 |       |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    

    Does the job – and is quite faster with it.

    So, what’s the issue here?

    Thanks and best regards,
    Lars

    Comment by Lars — March 14, 2007 @ 10:53 am GMT Mar 14,2007 | Reply

  2. Lars, the main issue is that it would need special coding for a very specific case – and it’s probably not a good investment in programmer time for the optimizer group.

    In the generic case, you would have to decompose the query into two separate operations that could be optimized separately, then recombine the results correctly.

    It may come eventually – Oracle 11g is likely to have some interesting options for converting set-based operations to joins and vice versa, and that’s a step in the right direction.

    Comment by Jonathan Lewis — March 14, 2007 @ 8:48 pm GMT Mar 14,2007 | Reply

  3. Hi Jonathan,
    I have tested the same on 10g 10.2.0.3 with results a little bit different.

    Note that:

    select max(lower(object_name)) from t1;

    in my case uses the special min/max optimisation for index range/full scans with no hint or “where lower(object_name) is not null”, even when statistics were not calculated. Better optimizer?

    valasekd@DBBAFIR> select * from v$version;
    
    BANNER
    ----------------------------------------------------------------
    Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi
    PL/SQL Release 10.2.0.3.0 - Production
    CORE	10.2.0.3.0	Production
    TNS for IBM/AIX RISC System/6000: Version 10.2.0.3.0 - Productio
    NLSRTL Version 10.2.0.3.0 - Production
    
    
    valasekd@DBBAFIR> show parameter optimizer
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ----------
    optimizer_dynamic_sampling           integer     2
    optimizer_features_enable            string      10.2.0.3
    optimizer_index_caching              integer     50
    optimizer_index_cost_adj             integer     20
    optimizer_mode                       string      ALL_ROWS
    optimizer_secure_view_merging        boolean     TRUE
    
    valasekd@DBBAFIR> drop table t1 purge;
    
    Table dropped.
    
    Elapsed: 00:00:00.21
    valasekd@DBBAFIR> 
    valasekd@DBBAFIR> create table t1
      2  as
      3  select * from all_objects;
    
    Table created.
    
    valasekd@DBBAFIR> create index t1_fbi1   on t1(lower(object_name));
    
    Index created.
    
    valasekd@DBBAFIR> create index t1_i1 on t1(object_name);
    
    Index created.
    
    valasekd@DBBAFIR> set autotrace traceonly explain
    valasekd@DBBAFIR> select  max(object_name) from t1;
    Elapsed: 00:00:00.00
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 575687492
    
    ------------------------------------------------------------------------------------
    | Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT           |       |     1 |    17 |     1   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE            |       |     1 |    17 |            |          |
    |   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I1 | 66718 |  1107K|     1   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------
    
    Note
    -----
       - dynamic sampling used for this statement
    
    valasekd@DBBAFIR> select  max(lower(object_name)) from t1;
    Elapsed: 00:00:00.01
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 921562405
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |     1 |    17 |    79   (2)| 00:00:01 |
    |   1 |  SORT AGGREGATE             |         |     1 |    17 |            |          |
    |   2 |   FIRST ROW                 |         | 66718 |  1107K|    79   (2)| 00:00:01 |
    |   3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 66718 |  1107K|    79   (2)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Note
    -----
       - dynamic sampling used for this statement
    
    valasekd@DBBAFIR> select  /*+ index(t1) */ max(lower(object_name)) from t1;
    Elapsed: 00:00:00.00
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 921562405
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |     1 |    17 |    79   (2)| 00:00:01 |
    |   1 |  SORT AGGREGATE             |         |     1 |    17 |            |          |
    |   2 |   FIRST ROW                 |         | 66718 |  1107K|    79   (2)| 00:00:01 |
    |   3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 66718 |  1107K|    79   (2)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Note
    -----
       - dynamic sampling used for this statement
    
    valasekd@DBBAFIR> set autotrace off
    valasekd@DBBAFIR> exec dbms_stats.gather_table_stats(user,'T1',cascade=>true);
    
    PL/SQL procedure successfully completed.
    
    Elapsed: 00:00:02.07
    valasekd@DBBAFIR> set autotrace traceonly explain
    valasekd@DBBAFIR> select  max(object_name) from t1;
    Elapsed: 00:00:00.00
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 575687492
    
    ------------------------------------------------------------------------------------
    | Id  | Operation                  | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT           |       |     1 |    21 |     1   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE            |       |     1 |    21 |            |          |
    |   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I1 | 84045 |  1723K|     1   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------
    
    valasekd@DBBAFIR> select  max(lower(object_name)) from t1;
    Elapsed: 00:00:00.00
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 921562405
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |     1 |    21 |    75   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE             |         |     1 |    21 |            |          |
    |   2 |   FIRST ROW                 |         | 84045 |  1723K|    75   (0)| 00:00:01 |
    |   3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 84045 |  1723K|    75   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    valasekd@DBBAFIR> select  /*+ index(t1) */ max(lower(object_name)) from t1;
    Elapsed: 00:00:00.00
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 921562405
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |         |     1 |    21 |    75   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE             |         |     1 |    21 |            |          |
    |   2 |   FIRST ROW                 |         | 84045 |  1723K|    75   (0)| 00:00:01 |
    |   3 |    INDEX FULL SCAN (MIN/MAX)| T1_FBI1 | 84045 |  1723K|    75   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    

    Comment by Dusan Valasek — March 15, 2007 @ 9:22 am GMT Mar 15,2007 | Reply

  4. Dusan, thanks for the update; it looks like a minor correction to the optimizer.

    It is interesting to note that the plan still shows the same FIRST ROW operator for the lower() option. This suggests that the change is simply to recognise that lower(object_name) is null if, and only if, name is null, rather than a complete fix for the min/max optimisation path.

    Comment by Jonathan Lewis — March 15, 2007 @ 9:16 pm GMT Mar 15,2007 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.