Oracle Scratchpad

September 9, 2020

Bloom Upgrade

Filed under: 18c,CBO,Joins,Oracle,Tuning — Jonathan Lewis @ 5:18 pm BST Sep 9,2020

It’s a common pattern of Oracle features that they start with various restrictions or limitations that disappear over time. This note is about an enhancement to Bloom filter processing that appeared in the 18.1 optimizer and, for some people, may be a good enough reason for upgrading to a newer version of Oracle. This enhancement came to my attention through a question on the Oracle Developer forum asking how to get a Bloom filter pushed inside a UNION ALL view. The original requirement wasn’t a trivial one so I’ll demonstrate the problem with a very simple example – first the data set:

rem
rem     Script:         bloom_pushdown.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sep 2020
rem
rem     Last tested:
rem             19.3.0.0
rem

create table t1 as select * from all_objects where rownum <= 50000;
create table t2 as select t1.* from t1, (select rownum n1 from dual connect by level <= 4);
create table t3 as select t1.* from t1, (select rownum n1 from dual connect by level <= 4); -- > comment to avoid wordpress format issue

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1 for columns object_type size 254'
        );
end;
/

I’ve been a bit lazy here, copying data from view all_objects. I’ve gathered stats on t1 so that I can generate a histogram on the object_type column because I’m going to query for a rare object_type and I want the optimizer to get a reasonable estimate of rows. I’m going to hint a parallel query to join t1 to t2 (aliased, trivially, as v1 for reasons that will become apparent soon):

select
        /*+ 
                parallel(2) 
        */
        t1.object_name, v1.object_name
from
        t1,
        t2 v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/

In my case the optimizer chooses to do a hash join between these two tables, and creates a Bloom filter to try to minimise the data passing through the data flow operation. The result set in my 12.2.0.1 database is only 16 rows, so it would be nice if the parallel scan could eliminate most of the 200,000 rows in t2 early – here’s the execution plan pulled from memory after running the query with rowsource execution stats enabled:


----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |      1 |        |   371 (100)|     16 |00:00:00.06 |      20 |      0 |       |       |          |
|   1 |  PX COORDINATOR        |          |      1 |        |            |     16 |00:00:00.06 |      20 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)  | :TQ10000 |      0 |     16 |   371   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN           |          |      2 |     16 |   371   (5)|     16 |00:00:00.05 |    6278 |   3988 |  1250K|  1250K|     2/0/0|
|   4 |     JOIN FILTER CREATE | :BF0000  |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    2034 |      0 |       |       |          |
|*  5 |      TABLE ACCESS FULL | T1       |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    2034 |      0 |       |       |          |
|   6 |     JOIN FILTER USE    | :BF0000  |      2 |    200K|   292   (4)|     16 |00:00:00.04 |    4244 |   3988 |       |       |          |
|   7 |      PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|     16 |00:00:00.04 |    4244 |   3988 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2       |     32 |    200K|   292   (4)|     16 |00:00:00.03 |    4244 |   3988 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   5 - filter("T1"."OBJECT_TYPE"='SCHEDULE')
   8 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"V1"."OBJECT_ID"))

We see that Oracle has generated a Bloom filter at operation 4 from the data returned from t1 at operation 5, and then used that Bloom filter at operation 6 to eliminate most of the data from t2 before passing the remaining few rows up to the hash join.
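For anyone who wants to reproduce a plan like this one, here is a minimal sketch of pulling a plan with rowsource execution statistics from memory. The format string is an assumption, not necessarily the one used for this note. The usual "last" option is left out because, as far as I know, including it for a parallel query reports only the query coordinator's work instead of the figures summed across the PX servers.

```sql
-- SQL*Plus: stop dbms_output calls from becoming the "last statement executed"
set serveroutput off

-- enable rowsource execution statistics for the session
alter session set statistics_level = all;

-- ... run the parallel query joining t1 to t2 here ...

-- report the plan of the most recently executed statement in this session;
-- 'allstats' (without 'last') is an assumed choice for parallel execution,
-- 'cost' adds the optimizer's cost column to the report
select  *
from    table(dbms_xplan.display_cursor(null, null, 'allstats cost'))
/
```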

Let’s make the query more interesting – what if you want to use a UNION ALL of t2 and t3 in the query (for example, one might be “current data” while the other is “historic data”)? Here’s the query and plan from 12.2.0.1:

select
        /*+ 
                parallel(2) 
        */
        t1.object_name, v1.object_name
from
        t1,
        (select * from t2 union all select * from t3) v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |      1 |        |   667 (100)|     32 |00:00:00.37 |      40 |      0 |       |       |          |
|   1 |  PX COORDINATOR         |          |      1 |        |            |     32 |00:00:00.37 |      40 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)   | :TQ10000 |      0 |     32 |   667   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN            |          |      1 |     32 |   667   (5)|     32 |00:00:00.34 |    5125 |   3860 |  1250K|  1250K|     2/0/0|
|*  4 |     TABLE ACCESS FULL   | T1       |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    2034 |      0 |       |       |          |
|   5 |     VIEW                |          |      2 |    400K|   584   (4)|    400K|00:00:00.52 |    8488 |   7976 |       |       |          |
|   6 |      UNION-ALL          |          |      2 |        |            |    400K|00:00:00.24 |    8488 |   7976 |       |       |          |
|   7 |       PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|    200K|00:00:00.11 |    4244 |   3988 |       |       |          |
|*  8 |        TABLE ACCESS FULL| T2       |     32 |    200K|   292   (4)|    200K|00:00:00.07 |    4244 |   3988 |       |       |          |
|   9 |       PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|    200K|00:00:00.11 |    4244 |   3988 |       |       |          |
|* 10 |        TABLE ACCESS FULL| T3       |     32 |    200K|   292   (4)|    200K|00:00:00.03 |    4244 |   3988 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   4 - filter("T1"."OBJECT_TYPE"='SCHEDULE')
   8 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue
  10 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue

No Bloom filter – so all 400,000 rows feed up the plan and through the hash join. This won’t matter too much for my sub-second tiny data set but on a pair of 50GB tables, with the potential to offload the Bloom filter to storage in Exadata and, perhaps, eliminate 99% of the data at the cell servers, this could make a huge difference to performance.

Since Bloom filters are all about hashing data (in Oracle the standard Bloom filter is the bitmap summarising the build table in a hash join) let’s try pushing the optimizer into a hash distribution for the parallel join to see if that has any effect:


select
        /*+ 
                parallel(2) 
                gather_plan_statistics
                leading(@sel$1 t1@sel$1 v1@sel$1)
                use_hash(@sel$1 v1@sel$1)
                pq_distribute(@sel$1 v1@sel$1 hash hash)
        */
        t1.object_name, v1.object_name
from
        t1,
        (select * from t2 union all select * from t3) v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/

---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |      1 |        |   667 (100)|     32 |00:00:00.43 |      60 |      0 |       |       |          |
|   1 |  PX COORDINATOR             |          |      1 |        |            |     32 |00:00:00.43 |      60 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)       | :TQ10002 |      0 |     32 |   667   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN BUFFERED       |          |      1 |     32 |   667   (5)|     32 |00:00:00.38 |    4000 |   3752 |  2290K|  2082K|     2/0/0|
|   4 |     JOIN FILTER CREATE      | :BF0000  |      2 |      4 |    75   (4)|      8 |00:00:00.01 |       0 |      0 |       |       |          |
|   5 |      PX RECEIVE             |          |      2 |      4 |    75   (4)|      8 |00:00:00.01 |       0 |      0 |       |       |          |
|   6 |       PX SEND HYBRID HASH   | :TQ10000 |      0 |      4 |    75   (4)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   7 |        STATISTICS COLLECTOR |          |      2 |        |            |      4 |00:00:00.01 |    1517 |      0 |       |       |          |
|   8 |         PX BLOCK ITERATOR   |          |      2 |      4 |    75   (4)|      4 |00:00:00.01 |    1517 |      0 |       |       |          |
|*  9 |          TABLE ACCESS FULL  | T1       |     26 |      4 |    75   (4)|      4 |00:00:00.01 |    1517 |      0 |       |       |          |
|  10 |     PX RECEIVE              |          |      2 |    400K|   584   (4)|     66 |00:00:00.77 |    8488 |   7976 |       |       |          |
|  11 |      PX SEND HYBRID HASH    | :TQ10001 |      2 |    400K|   584   (4)|     66 |00:00:00.77 |    8488 |   7976 |       |       |          |
|  12 |       JOIN FILTER USE       | :BF0000  |      2 |    400K|   584   (4)|     66 |00:00:00.77 |    8488 |   7976 |       |       |          |
|  13 |        VIEW                 |          |      2 |    400K|   584   (4)|    400K|00:00:00.68 |    8488 |   7976 |       |       |          |
|  14 |         UNION-ALL           |          |      2 |        |            |    400K|00:00:00.59 |    8488 |   7976 |       |       |          |
|  15 |          PX BLOCK ITERATOR  |          |      2 |    200K|   292   (4)|    200K|00:00:00.18 |    4244 |   3988 |       |       |          |
|* 16 |           TABLE ACCESS FULL | T2       |     32 |    200K|   292   (4)|    200K|00:00:00.06 |    4244 |   3988 |       |       |          |
|  17 |          PX BLOCK ITERATOR  |          |      2 |    200K|   292   (4)|    200K|00:00:00.12 |    4244 |   3988 |       |       |          |
|* 18 |           TABLE ACCESS FULL | T3       |     32 |    200K|   292   (4)|    200K|00:00:00.08 |    4244 |   3988 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   9 - access(:Z>=:Z AND :Z<=:Z)   -- > edit to avoid wordpress format issue 
       filter("T1"."OBJECT_TYPE"='SCHEDULE') 
  16 - access(:Z>=:Z AND :Z<=:Z)   -- > edit to avoid wordpress format issue
  18 - access(:Z>=:Z AND :Z<=:Z)   -- > edit to avoid wordpress format issue

We’ve managed to introduce a Bloom filter (which is visible as :BF0000 in the plan, even though there’s no reference to sys_op_bloom_filter() in the predicate information) but there’s a problem: we’re still passing 400,000 rows up the plan and the Bloom filter is only being applied at (or just after) the VIEW operator, discarding all but 66 rows before doing the hash join. It’s an improvement but not ideal; we’d like to see the Bloom filter applied to each of the two tables separately to eliminate rows as early as possible.

This can’t be done in 12.2, and you’d have to rewrite the query, changing a “join with union” into a “union of joins”, and that’s not really a desirable strategy.
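To make the “union of joins” idea concrete, here is an untested sketch of the rewrite for the demonstration query, assuming the statement-level parallel hint in the first query block is enough to cover the whole statement. Each branch is now a simple two-table join of the kind that got a Bloom filter in the first example, but t1 has to be scanned (and filtered) once per branch, which is one reason the rewrite is unattractive.

```sql
-- "join with union" rewritten as a "union of joins":
-- each branch joins t1 directly to one of the two tables
select
        /*+
                parallel(2)
        */
        t1.object_name, t2.object_name
from
        t1,
        t2
where
        t1.object_type = 'SCHEDULE'
and     t2.object_id = t1.object_id
union all
select
        t1.object_name, t3.object_name
from
        t1,
        t3
where
        t1.object_type = 'SCHEDULE'
and     t3.object_id = t1.object_id
/
```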

Next Steps

Searching MOS, though, you will be able to find the following note:

Doc ID 18849313.8 – ENH : bloom filters/pruning are pushed through union-all view

There’s an enhancement request to do what we want in 18.1, and the enhancement has got into the software. Here’s the (unhinted) plan from 19.3 (the plan stays the same when optimizer_features_enable is set back to 18.1.0, but drops back to the 12.2 plan when OFE is set to 12.2.0.1):

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |   666 (100)|     32 |00:00:00.11 |      10 |      0 |       |       |          |
|   1 |  PX COORDINATOR          |          |      1 |        |            |     32 |00:00:00.11 |      10 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)    | :TQ10000 |      0 |     32 |   666   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN             |          |      2 |     32 |   666   (5)|     32 |00:00:00.05 |   10020 |   7958 |  1250K|  1250K|     2/0/0|
|   4 |     JOIN FILTER CREATE   | :BF0000  |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    1998 |      0 |       |       |          |
|*  5 |      TABLE ACCESS FULL   | T1       |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    1998 |      0 |       |       |          |
|   6 |     VIEW                 |          |      2 |    400K|   583   (4)|     32 |00:00:00.04 |    8022 |   7958 |       |       |          |
|   7 |      UNION-ALL           |          |      1 |        |            |     12 |00:00:00.02 |    4011 |   3979 |       |       |          |
|   8 |       JOIN FILTER USE    | :BF0000  |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|   9 |        PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|* 10 |         TABLE ACCESS FULL| T2       |     32 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|  11 |       JOIN FILTER USE    | :BF0000  |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|  12 |        PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|* 13 |         TABLE ACCESS FULL| T3       |     32 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   5 - filter("T1"."OBJECT_TYPE"='SCHEDULE')
  10 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."OBJECT_ID")) 
  13 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T3"."OBJECT_ID"))

As you can see, we create a Bloom filter at operation 4, and use it twice at operations 8 and 11 – with the sys_op_bloom_filter() functions clearly visible in the predicate information showing us that the Bloom filter is applied to the object_id column in both cases.
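The comparison across optimizer versions can be repeated at the session level. This is only a sketch of the approach, using the two values mentioned in the text; the query is re-run and its plan pulled from memory after each change.

```sql
-- 18.1.0: the Bloom filter should still be pushed inside the UNION ALL view
alter session set optimizer_features_enable = '18.1.0';

-- ... re-run the UNION ALL query and report its plan ...

-- 12.2.0.1: the plan should revert to the one without the pushed filter
alter session set optimizer_features_enable = '12.2.0.1';

-- ... re-run the UNION ALL query and report its plan ...

-- afterwards, start a new session (or reset the parameter) before doing other work
```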

If you want to disable this enhancement for some reason there are two hidden parameters available (which you might set to false for a single query using the opt_param() hint):

  • _bloom_filter_setops_enabled = true
  • _bloom_pruning_setops_enabled = true

The first is for Bloom filters in the situation shown; I assume the second deals with Bloom filters for partition pruning.
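As an illustration, this is how the first parameter might be switched off for the demonstration query alone. It is an untested sketch that assumes the opt_param() hint honours this particular hidden parameter; the session-level alternative would be alter session set "_bloom_filter_setops_enabled" = false. As with any hidden parameter, check with Oracle Support before using it on a production system.

```sql
select
        /*+
                parallel(2)
                opt_param('_bloom_filter_setops_enabled' 'false')
        */
        t1.object_name, v1.object_name
from
        t1,
        (select * from t2 union all select * from t3) v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/
```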

Summary

In versions prior to 18.1 the optimizer is unable to push Bloom filters down to the individual tables in a UNION ALL view, but this limitation was removed in the 18.1 code set.

 

July 10, 2020

Recursive WITH upgrade

Filed under: ANSI Standard,CBO,Execution plans,Oracle,Subquery Factoring,Upgrades — Jonathan Lewis @ 4:19 pm BST Jul 10,2020

There’s a notable change in the way the optimizer does cost and cardinality calculations for recursive subquery factoring that may make some of your execution plans change – with a massive impact on performance – as you upgrade to any version of Oracle from 12.2.0.1 onwards. The problem appeared in a question on the Oracle Developer Community forum a little while ago, with a demonstration script to model the issue.

I’ve copied the script – with a little editing – and reproduced the change in execution plan described by the OP. Here’s my copy of the script, with the insert statements that generate the data (all 1,580 of them) removed.

rem
rem     Script:         recursive_with_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:        
rem
rem     Last tested 
rem             12.2.0.1
rem             12.1.0.2
rem
rem     Notes:
rem     https://community.oracle.com/thread/4338248
rem
rem     The upgrade to 12.2.0.1 made this query much slower (on 15,000 rows)
rem     Setting OFE to 12.1.0.1 is a first possible fix for the issue.
rem     The scale is too small to see much difference in this case
rem

drop table test_folder purge;

create table test_folder(
        fldr_key                number(16,0)            not null        enable,                 
        fldr_id                 varchar2(255 byte)      not null        enable,                 
        fldr_desc_tx            varchar2(255 byte),                     
        par_fldr_key            number(16,0),                   
        seus_key                number(16,0)            not null        enable,                 
        fldr_private_flg        varchar2(1 byte)        not null        enable,                 
        last_updt_dt            date                    not null        enable,                 
        last_upby_seus_key      number(16,0)            not null        enable,                 
        lock_seq_nbr            number(9,0) default 0   not null        enable,                 
        content_guid            raw(16),                
        constraint test_folder_pk primary key (fldr_key)                
)       
;              

-- list of insert statements

alter table test_folder add constraint test_folder_fk  
        foreign key (par_fldr_key) references test_folder(fldr_key)
;  
  
create or replace force editionable view test_folder_vw (fldr_key) as   
with rec_path(fldr_key)  as (
        select  tf.fldr_key  
        from    test_folder tf where tf.par_fldr_key is null  
        union all  
        select  tf.fldr_key  
        from    test_folder tf, rec_path  
        where   rec_path.fldr_key = tf.par_fldr_key
)  
select fldr_key  
from rec_path   
; 

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'TEST_FOLDER',
                method_opt  => 'for all columns size 1'
        );
end;
/


select * from test_folder_vw where fldr_key = -41;  

I’ve run the test 3 times: first in 12.2.0.1 with no tweaking; then in 12.2.0.1 with the hint /*+ optimizer_features_enable('12.1.0.2') */; and finally in a genuine 12.1.0.2 environment. In all three cases I enabled rowsource execution stats ('alter session set statistics_level = all') and pulled the plans from memory – with the following results.

First, the base result from 12.1.0.2

----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |             |      1 |        |      1 |00:00:00.03 |     604 |       |       |          |
|*  1 |  VIEW                                     |             |      1 |    801 |      1 |00:00:00.03 |     604 |       |       |          |
|   2 |   UNION ALL (RECURSIVE WITH) BREADTH FIRST|             |      1 |        |   1580 |00:00:00.03 |     604 | 36864 | 36864 |  102K (0)|
|*  3 |    TABLE ACCESS FULL                      | TEST_FOLDER |      1 |    161 |    161 |00:00:00.01 |      68 |       |       |          |
|*  4 |    HASH JOIN                              |             |      8 |    640 |   1419 |00:00:00.02 |     536 |  1696K|  1696K| 1488K (0)|
|   5 |     RECURSIVE WITH PUMP                   |             |      8 |        |   1580 |00:00:00.01 |       0 |       |       |          |
|*  6 |     TABLE ACCESS FULL                     | TEST_FOLDER |      8 |   1419 |  11352 |00:00:00.01 |     536 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("FLDR_KEY"=(-41))
   3 - filter("TF"."PAR_FLDR_KEY" IS NULL)
   4 - access("REC_PATH"."FLDR_KEY"="TF"."PAR_FLDR_KEY")
   6 - filter("TF"."PAR_FLDR_KEY" IS NOT NULL)

Two points to note, in particular. First that the hash join has the recursive with pump as its first (build table) child and the table access full of test_folder as its second child (probe table); secondly that there is no value given for E-Rows for the recursive with pump.

Now the 12.2.0.1 plan:

----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |             |      1 |        |      1 |00:00:00.01 |      47 |       |       |          |
|*  1 |  VIEW                                     |             |      1 |   2524K|      1 |00:00:00.01 |      47 |       |       |          |
|   2 |   UNION ALL (RECURSIVE WITH) BREADTH FIRST|             |      1 |        |   1580 |00:00:00.01 |      47 | 36864 | 36864 |  102K (0)|
|*  3 |    TABLE ACCESS FULL                      | TEST_FOLDER |      1 |    161 |    161 |00:00:00.01 |      24 |       |       |          |
|*  4 |    HASH JOIN                              |             |      8 |   2524K|   1419 |00:00:00.01 |      23 |  1743K|  1743K| 1632K (0)|
|   5 |     BUFFER SORT (REUSE)                   |             |      8 |        |  11352 |00:00:00.01 |      23 | 73728 | 73728 |          |
|*  6 |      TABLE ACCESS FULL                    | TEST_FOLDER |      1 |   1419 |   1419 |00:00:00.01 |      23 |       |       |          |
|   7 |     RECURSIVE WITH PUMP                   |             |      8 |        |   1580 |00:00:00.01 |       0 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("FLDR_KEY"=(-41))
   3 - filter("TF"."PAR_FLDR_KEY" IS NULL)
   4 - access("REC_PATH"."FLDR_KEY"="TF"."PAR_FLDR_KEY")
   6 - filter("TF"."PAR_FLDR_KEY" IS NOT NULL)

There are three changes to notice in this plan – which (for the OP) was much slower than the 12.1.0.2 plan. First, the order of the hash join has changed: the recursive with pump is now the second child (probe table) in the join (and again shows no value for E-Rows); secondly, Oracle has introduced an extra operation – the buffer sort (reuse) – populated by the table access full – as the build table; thirdly (presumably the point of the buffer sort (reuse) operation) the number of buffer visits has dropped from a few hundred to a couple of dozen.

Finally let’s check what happens if we set the OFE (optimizer_features_enable) to 12.1.0.2 while running 12.2.0.1

----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |             |      1 |        |      1 |00:00:00.01 |      47 |       |       |          |
|*  1 |  VIEW                                     |             |      1 |    801 |      1 |00:00:00.01 |      47 |       |       |          |
|   2 |   UNION ALL (RECURSIVE WITH) BREADTH FIRST|             |      1 |        |   1580 |00:00:00.01 |      47 | 36864 | 36864 |  102K (0)|
|*  3 |    TABLE ACCESS FULL                      | TEST_FOLDER |      1 |    161 |    161 |00:00:00.01 |      24 |       |       |          |
|*  4 |    HASH JOIN                              |             |      8 |    640 |   1419 |00:00:00.01 |      23 |  1797K|  1797K| 1573K (0)|
|   5 |     RECURSIVE WITH PUMP                   |             |      8 |        |   1580 |00:00:00.01 |       0 |       |       |          |
|   6 |     BUFFER SORT (REUSE)                   |             |      8 |        |  11352 |00:00:00.01 |      23 | 73728 | 73728 |          |
|*  7 |      TABLE ACCESS FULL                    | TEST_FOLDER |      1 |   1419 |   1419 |00:00:00.01 |      23 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("FLDR_KEY"=(-41))
   3 - filter("TF"."PAR_FLDR_KEY" IS NULL)
   4 - access("REC_PATH"."FLDR_KEY"="TF"."PAR_FLDR_KEY")
   7 - filter("TF"."PAR_FLDR_KEY" IS NOT NULL)

In these conditions the recursive with pump has gone back to being the build table (first child); but it’s worth noting that the 12.2 buffer sort (reuse) is still in place – saving us a few hundred buffer gets (and, for a bigger table, a number of disc reads possibly). Downgrading the optimizer_features_enable has given us the plan we needed, but we’ve got an example that shows that hacking the parameter isn’t a guarantee that we will get exactly the path we used to get in the older version.

The story so far.

It seems that we can address the performance problem that the OP had by setting the optimizer_features_enable to the older version – possibly through a hint embedded in the SQL, perhaps through an SQL Baseline or SQL Patch. Maybe we’ll have to have a logon trigger that sets the parameter for particular users or, worst case scenario, maybe we’ll have to set the parameter at the system level. Given how undesirable the last option could be it would be nice to know exactly what is causing the change in plan.
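As a sketch of the SQL Patch option, the following assumes the 12.2 (and later) form of dbms_sqldiag.create_sql_patch, which takes the statement text and a hint text. The patch name is hypothetical, and the SQL text has to match the statement the application actually issues.

```sql
declare
        m_patch_name    varchar2(128);
begin
        -- attach the OFE hint to the statement without editing the application code
        m_patch_name := dbms_sqldiag.create_sql_patch(
                sql_text  => q'[select * from test_folder_vw where fldr_key = -41]',
                hint_text => q'[optimizer_features_enable('12.1.0.2')]',
                name      => 'test_folder_vw_ofe_patch'         -- hypothetical name
        );
end;
/
```

If the patch is picked up, the Note section of the execution plan should report that a SQL patch was used for the statement.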

As a basic clue – if the order of tables in a hash join reverses itself this usually means that the byte (not row) estimates have changed. The optimizer will use the table with the lower byte count as the build table in a hash join. So the recursive with pump – whose row and byte estimates don’t appear – must have produced larger numbers in 12.2.0.1.

A change in the 12.2 plan that I haven’t yet mentioned is the E-Rows for the hash join; it’s gone up from 640 (12.1.0.2) to 2.5 million! So let’s repeat the tests with the CBO (10053) trace file enabled and see if we can find a number like 2524K appearing as a join estimate in the trace file. Having created the two trace files (in 12.2.0.1, one with the OFE set backwards) I executed the following grep command against the trace files:

grep -n "^Join Card - Rounded" orcl12c_ora_5524_ofe.trc
grep -n "^Join Card - Rounded" orcl12c_ora_5524_base.trc
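For reference, here is one way the two trace files might be generated. It is a sketch, not the script used for this note: setting the OFE at the session level and putting a distinguishing comment in each statement are assumptions. The 10053 trace is written only when the statement is optimised (hard parsed), which is why each run uses text that has not been parsed before.

```sql
-- 12.2.0.1 defaults: trace file name will include "base"
alter session set tracefile_identifier = 'base';
alter session set events '10053 trace name context forever, level 1';

select /* base */ * from test_folder_vw where fldr_key = -41;

alter session set events '10053 trace name context off';

-- OFE set backwards: trace file name will include "ofe"
alter session set tracefile_identifier = 'ofe';
alter session set optimizer_features_enable = '12.1.0.2';
alter session set events '10053 trace name context forever, level 1';

select /* ofe */ * from test_folder_vw where fldr_key = -41;

alter session set events '10053 trace name context off';
```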

I’d set the tracefile_identifier to ‘ofe’ and ‘base’ respectively for the 12.1.0.2 and 12.2.0.1 tests, and here are the results:

grep -n "^Join Card - Rounded" orcl12c_ora_5524_ofe.trc
1166:Join Card - Rounded: 640 Computed: 639.941176

grep -n "^Join Card - Rounded" orcl12c_ora_5524_base.trc
1195:Join Card - Rounded: 640 Computed: 639.941176
1391:Join Card - Rounded: 2544 Computed: 2543.865546
1576:Join Card - Rounded: 10112 Computed: 10111.865546
1737:Join Card - Rounded: 40193 Computed: 40193.075630
1898:Join Card - Rounded: 159759 Computed: 159758.731092
2059:Join Card - Rounded: 635008 Computed: 635008.462185
2220:Join Card - Rounded: 2524023 Computed: 2524023.394958
2269:Join Card - Rounded: 2524023 Computed: 2524023.394958

That’s an interesting clue. Something seems to be calculating a larger and larger value in the 12.2.0.1 trace, starting with the hash join cardinality that appeared in 12.1.0.2, growing by a factor of nearly 4 each time, and ending with the hash join cardinality we saw in the 12.2.0.1 plan.

Taking a closer look at the content of the 12.2.0.1 trace file it turned out that every stage in that escalation was Oracle recalculating the cost and cardinality of joining test_folder (the table) and rec_path (the “with” subquery) using the figures from the previous join calculation as the new base figures for rec_path. In effect the optimizer was calculating the cost of following the recursive subquery down to its 7th level of recursion.
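The escalation can be modelled with a little arithmetic, written here (appropriately enough) as a recursive “with” query. This is a sketch based on figures backed out of the numbers above, not on anything read from the trace: 161 and 1419 are the E-Rows values for the two scans of test_folder, while the divisor 357 is an assumption (plausibly a distinct-value count used in the join selectivity) chosen because it makes the sums work. The model also assumes that each iteration starts from the rounded cardinality of the previous one.

```sql
with card_iter (n, computed_card) as (
        -- first iteration: root rows * child rows / assumed divisor
        select  1, 161 * 1419 / 357
        from    dual
        union all
        -- later iterations: feed the previous rounded result back in
        select  n + 1, round(computed_card) * 1419 / 357
        from    card_iter
        where   n < 7
)
select
        n,
        round(computed_card)            rounded,
        round(computed_card, 6)         computed
from
        card_iter
order by
        n
/
```

Under those assumptions the seven rows should reproduce the seven distinct "Rounded" figures reported in the 12.2.0.1 trace, from 640 up to 2524023.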

Side note: in agreement with my comment about the smaller (in bytes) rowsource being used as the build table, the initial join order started as (test_folder, rec_path) in the first two iterations, but switched to (rec_path, test_folder) from the 3rd iteration onwards.

So we’ve identified the mechanics that cause the change in plan; the question now is: why 7 iterations to the final cost? (Briefly I did a quick check to see how many circles of hell there were in Dante’s Inferno – but it’s 9, or 10 depending how you count.) A quick check of v$parameter (and the x$ tables for the hidden parameters) revealed the following:

Name                                     Value
------------------------------------ ---------
_recursive_with_branch_iterations            7
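
For reference, a listing like the one above usually comes from the x$ structures, since hidden parameters generally don't appear in v$parameter unless they have been set. The following is a minimal sketch of the commonly used query, assuming a connection as SYS (the x$ objects are not normally visible to other users); the column aliases and the filter pattern are arbitrary choices for illustration:

```sql
-- Sketch: report hidden parameters whose names start with "_recursive_with".
-- Assumes you are connected as SYS so that the x$ structures are visible.
select
        i.ksppinm       name,
        v.ksppstvl      value
from
        x$ksppi         i,
        x$ksppcv        v
where
        v.indx = i.indx
and     i.ksppinm like '\_recursive\_with%' escape '\'
order by
        i.ksppinm
;
```

The escape clause is there because the underscore is a single-character wildcard in a LIKE pattern.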

Setting this parameter to 1 in the session, or adding the hint /*+ opt_param('_recursive_with_branch_iterations' 1) */ to the query resulted in the 12.1.0.2 plan appearing in 12.2.0.1 – and this is a much less intrusive way of getting the plan we’re interested in than setting the entire OFE back to 12.1.0.2. One might even set the parameter in the spfile (after receiving approval from Oracle Corp., of course) given how precisely targeted it is (and knowing that it doesn’t switch off that nice little buffering trick).
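
As a sketch, the two options look like this. The recursive subquery used here (rec_demo) is a hypothetical stand-in, not the OP's query; only the parameter name and the value 1 come from the tests described above:

```sql
-- Option 1: session-level setting.
-- Hidden parameter names have to be enclosed in double quotes.
alter session set "_recursive_with_branch_iterations" = 1;

-- Option 2: statement-level setting through the opt_param() hint.
-- rec_demo is a hypothetical recursive "with" subquery used for illustration.
with rec_demo (n) as (
        select 1 from dual
        union all
        select n + 1 from rec_demo where n < 5
)
select
        /*+ opt_param('_recursive_with_branch_iterations' 1) */
        n
from
        rec_demo
;
```

The hint has the advantage of limiting the change to the one statement that needs it, which is usually easier to justify than a session or system level change to a hidden parameter.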

Summary

From 12.2 onwards the optimizer does recursive recosting of recursive “with” subqueries. This means the cost and cardinality estimates of a plan may change and the impact may cause a significant change in performance – it certainly did for the OP.

The change seems to be driven by the hidden parameter _recursive_with_branch_iterations, which was introduced in 12.2.0.0 with a default value of 7. Setting this parameter to 1 reverts the optimizer to the pre-12.2 behaviour. If you run into a problem of recursive “with” subqueries changing plans and causing performance problems on an upgrade from pre-12.2 to a newer version of Oracle then it would be worth investigating this parameter as the least intrusive way of reverting back to the original plans.

Update (July 2020)

I’ve just been prompted to check MOS for any references to the hidden parameter – and discovered a note that was published in September 2018 and updated in Feb 2019. It’s amazing how easy it can be to find an answer on MOS when you already know what the answer is ;) The note is: Oracle 12.2.0.1 CBO calculating high cost/CPU for queries with recursive sub-query (Doc ID 2443466.1)

This gives two workarounds to the problem of a change in cost in 12.2 – set the optimizer_features_enable to 12.1.0.2, or set the hidden parameter to 1. It references two bugs (one a duplicate of the other, both apparently unpublished):

  • Bug 23515289 : PERFORMANCE REGRESSION OBSERVED WITH RECURSIVE WITH SERIAL PLAN
  • Bug 24566985 : UPG: QUERY PERFORMANCE ON ALL_TSTZ_TABLES 160 TIMES SLOWER THAN 11.2.0.4

and the Permanent Fix for the problem is to install the patch for Bug 24566985 on 12.2.0.1.

June 29, 2020

Most Recent – 2

Filed under: CBO,Execution plans,Oracle,Tuning — Jonathan Lewis @ 1:02 pm BST Jun 29,2020

A question arrived in my email a few days ago with the following observations on a statement that was supposed to query the data dictionary for some information about a specified composite partitioned table. The query was wrapped in a little PL/SQL, similar to the following:

declare
        v_src_part      varchar2(30) := null;
        v_tab           varchar2(30)  := 'PT_COMPOSITE_1';
begin

        select
                /*+ qb_name(main) */
                uts1.subpartition_name
        into    v_src_part
        from
                user_tab_subpartitions uts1
        where
                uts1.table_name = v_tab
        and     uts1.last_analyzed is not null
        and     uts1.num_rows = (
                        select
                                /*+ qb_name(max_subq) */
                                max (uts2.num_rows)
                        from
                                user_tab_subpartitions uts2
                        where
                                uts2.table_name = /* v_tab */ uts1.table_name
                )
        and     rownum = 1
        ;
end;
/

The requirement is simple: identify the subpartitions of a specific table that have the largest number of rows of any subpartition of the table – but report only the first match.

You’ll notice that the where clause of the subquery has a commented “v_tab” in it. This is the PL/SQL variable used in the outer query block to identify the target table, and it shouldn’t really make any difference if I use the PL/SQL variable in the subquery rather than using a correlating column. However, the question that came with this block of code was as follows:

All the partitions and subpartitions had their stats when running the test. On a first run using the correlated subquery the block reported oracle error ORA-01403: no data found. Changing the code to use the PL/SQL variable the block reported a specific subpartition as expected. A few hours later (after changing the code back to use the correlated subquery) the block reported the same subpartition. Have you ever seen anything like this? The Oracle version is 12.1.0.2.

Rule 1, of course, is to be a little sceptical when someone says “Honest, Guv, the stats are all okay”. But I’m going to assume that the statistics on this table really were complete and that there was no “data-related” reason for this query to behave in such a surprising way.

The email is an invitation to consider two points.

  1. This looks like a bug: the two versions of the query are logically equivalent, they should return the same results if the underlying data had not changed. (In fact, I think the only “legal” way that the query could return ORA-01403 is if there were no stats on any subpartitions of the table in question – any ordinary usage of the dbms_stats package other than delete_table_stats() would have ensured that the query had to find something.) So, the first run of the correlated subquery produced no data while the modified query did get a result. That suggests a problem with some transformation in the 12.1.0.2 code to handle correlated aggregate subqueries.
  2. How could the second execution of the version with the correlated subquery produce a result a few hours later? Here are a couple of possibilities:
    • Someone had gathered dictionary stats (i.e. on the tables used by the query, not on the subpartitioned table) in the “few hours” gap so the optimizer picked a different execution plan which bypassed the bug.
    • (minor variation on previous) Someone had gathered dictionary stats when the first execution plan was already in memory but the “auto_invalidate” option for cursor invalidation meant that the query didn’t get re-optimised for a few hours.
    • Nothing changed, but the query had been flushed from the library cache and did need re-optimisation a few hours later. Since the version is 12.1.0.2 this means statistics feedback or SQL plan directives could have had an impact – which means there may be dynamic sampling during optimisation – and a different set of random samples could have resulted in a different execution plan.
    • Other …

The interesting bit

There is a generic feature about this question that is more interesting than the “what went wrong, how could I get different results”, and it’s in the choice you can make between using a correlation column and repeating a pl/sql variable (or literal value).

The switch to using a pl/sql variable turns the subquery into a single-row, “standalone”, subquery – one that could be run without any reference to the outer query – and this imposes a dramatic change on what the optimizer can do. Sometimes that change will make a huge difference to the optimisation time and the run time.

As a correlated subquery the notional “first strategy” for the optimizer is:

“for each row in the outer query execute the inner query as a filter subquery passing in the correlation value”

If you take the “standalone” approach the optimizer will be looking for a plan that says (in effect):

“run the subquery once to generate a constant that you will need to execute the rest of the query”

Running the subquery once rather than once per row is likely to be a good idea – on the other hand Oracle can do “scalar subquery caching” so if the value of the correlation column is always the same the correlated subquery will actually run only once anyway.

More importantly, when the optimizer sees a correlated subquery it will consider unnesting it and then transforming it in various other ways; and it might take the optimizer a long time to work out what it can and can’t do, and the plan it finally does produce may be much slower than what it could have done if it had not unnested the subquery.

Some test results

So I ran 3 variations of the PL/SQL block on Oracle 19.3.0.0 with the CBO trace (10053) enabled and picked out a few highlights. The three tests in order were:

  1. Use the pl/sql variable so the subquery could run as a standalone query
  2. Use the correlating column to make the subquery a correlated subquery
  3. Use the correlating column, but add the hint /*+ no_unnest */ to the subquery.
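
As an illustration, the third variation might look like the following sketch, wrapped in the calls that switch the CBO trace on and off. The tracefile_identifier value is an arbitrary label chosen for this example; the PL/SQL block is the one shown earlier with the no_unnest hint added to the subquery:

```sql
-- Label the trace file and switch on the CBO (10053) trace for the session.
-- The identifier 'no_unnest' is an arbitrary label for this example.
alter session set tracefile_identifier = 'no_unnest';
alter session set events '10053 trace name context forever';

declare
        v_src_part      varchar2(30) := null;
        v_tab           varchar2(30) := 'PT_COMPOSITE_1';
begin
        select
                /*+ qb_name(main) */
                uts1.subpartition_name
        into    v_src_part
        from
                user_tab_subpartitions uts1
        where
                uts1.table_name = v_tab
        and     uts1.last_analyzed is not null
        and     uts1.num_rows = (
                        select
                                /*+ qb_name(max_subq) no_unnest */
                                max (uts2.num_rows)
                        from
                                user_tab_subpartitions uts2
                        where
                                uts2.table_name = uts1.table_name
                )
        and     rownum = 1
        ;
end;
/

-- Switch the CBO trace off again
alter session set events '10053 trace name context off';
```

Bear in mind that the 10053 trace is written only when the statement is hard parsed, so a repeat run of an identical statement may produce no trace unless the cursor has been invalidated or the text has been changed.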

The results were as follows – first the timing, then a critical measure that explains the timing:

  • Case 1 – standalone subquery – total time 0.82 seconds
  • Case 2 – correlated subquery – total time 5.76 seconds
  • Case 3 – correlated subquery with no_unnest hint – total time 0.84 seconds

Where did all that extra time go? A lot of it went in optimisation. How many “Join Orders” were examined for each query?

  • Case 1 – standalone subquery – 90 join orders
  • Case 2 – correlated subquery – 863 join orders
  • Case 3 – correlated subquery with no_unnest hint – 90 join orders

If you’re wondering what the 773 extra join orders were about here’s a clue. I extracted all the lines from the case 2 trace file that started with “SU:” – those are the lines tagged for “Subquery Unnest” – using a call to grep -n "^SU:" {tracefile name} and this is the result:


  2945:SU: Unnesting query blocks in query block SEL$071BB01A (#1) that are valid to unnest.
  2947:SU: Considering subquery unnest on query block SEL$071BB01A (#1).
  2948:SU:   Checking validity of unnesting subquery SEL$4F5F2F29 (#2)
  2949:SU:   Passed validity checks, but requires costing.
  2950:SU: Using search type: exhaustive
  2951:SU: Starting iteration 1, state space = (2) : (1)
  2952:SU:   Unnesting subquery query block SEL$4F5F2F29 (#2)Subquery removal for query block SEL$4F5F2F29 (#2)
  3089:SU: Costing transformed query.
 66112:SU: Considering interleaved complex view merging
 66113:SU:   Unnesting subquery query block SEL$4F5F2F29 (#2)Subquery removal for query block SEL$4F5F2F29 (#2)
 66366:SU: Costing transformed query.
129372:SU: Finished interleaved complex view merging
129373:SU: Considering interleaved distinct placement
129374:SU: Finished interleaved distinct placement
129375:SU: Considering interleaved join pred push down
129376:SU:   Unnesting subquery query block SEL$4F5F2F29 (#2)Subquery removal for query block SEL$4F5F2F29 (#2)
251638:SU: Rejected interleaved query.
251640:SU: Finished interleaved join pred push down
251641:SU: Considering interleaved OR Expansion
251642:SU:   Unnesting subquery query block SEL$4F5F2F29 (#2)Subquery removal for query block SEL$4F5F2F29 (#2)
251651:SU: Finished interleaved OR Expansion
251653:SU: Updated best state, Cost = 19.085153
251654:SU: Starting iteration 2, state space = (2) : (0)
251665:SU: Costing transformed query.
310395:SU: Not update best state, Cost = 20.083998
310396:SU: Will unnest subquery SEL$4F5F2F29 (#2)

The optimizer checks the validity of unnesting (generated) query block SEL$4F5F2F29 at line 2948 of the trace and decides, 308,000 lines later after an exhaustive examination of the possibilities, that it will unnest the subquery. Since this is a recent version of Oracle we can take one simple extra step by checking for “TIMER” information, again using a “grep -n” call:

251639:TIMER:  SU: Interleaved JPPD SEL$B73B51DC cpu: 1.263 sec elapsed: 1.263 sec
251652:TIMER: SU: iteration (#1) SEL$B73B51DC cpu: 2.607 sec elapsed: 2.607 sec
310577:TIMER: CBQT SU and CVM SEL$071BB01A cpu: 3.323 sec elapsed: 3.323 sec
433371:TIMER: Cost-Based Join Predicate Push-Down SEL$12B6FE6C cpu: 1.307 sec elapsed: 1.306 sec
433477:TIMER: Cost-Based Transformations (Overall) SEL$12B6FE6C cpu: 4.731 sec elapsed: 4.731 sec
496189:TIMER: SQL Optimization (Overall) SEL$12B6FE6C cpu: 5.306 sec elapsed: 5.306 sec

Of course most of the time spent in this particular example was a result of optimising (and writing the optimizer trace), but for my tiny example (table definition below) the final figures I’ll show are the buffer gets and CPU time reported by a basic 10046 trace file after optimisation, when all the relevant data was cached:

  • Case 1 – standalone subquery – 89 buffer gets / 0.00 CPU seconds
  • Case 2 – correlated subquery – 130 buffer gets / 0.53 CPU seconds
  • Case 3 – correlated subquery with no_unnest hint – 121 buffer gets / 0.08 CPU seconds

The sub-centisecond time is a little suspect, of course, but the others seem fairly trustworthy.

Conclusion

The title of this piece is “Most Recent” because the commonest requirement for a query of this shape is “find the most recent row matching the following predicates”, even though in this case the interpretation is “find me the row matching the largest value”.

The “standard” pattern for writing a “most recent” query is to use a correlated subquery – but it’s worth remembering that you may reduce optimisation time and run time by “copying down the constant” rather than using the correlation mechanism.

(There are alternative strategies to the subquery approach, of course, and the analytic max() – introduced in Oracle 8i – is gaining traction as one of the popular alternatives.)
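
As a minimal sketch of the analytic alternative, assuming the same target table name as the PL/SQL block above, the maximum can be calculated once across all the table's subpartitions in an inline view and then used as a filter. The alias max_rows is an arbitrary name for illustration:

```sql
-- Sketch of the analytic alternative: calculate the maximum num_rows once
-- across all subpartitions of the table, then keep one row that matches it.
select
        subpartition_name
from    (
        select
                subpartition_name,
                num_rows,
                last_analyzed,
                max(num_rows) over ()   max_rows
        from
                user_tab_subpartitions
        where
                table_name = 'PT_COMPOSITE_1'
        )
where
        num_rows = max_rows
and     last_analyzed is not null
and     rownum = 1
;
```

The last_analyzed predicate is applied outside the inline view so that, as in the subquery version, the maximum is taken over every subpartition of the table rather than just the analyzed ones.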

Footnote 1

If you want to re-run my test on different platforms and versions of Oracle, here’s the code to generate the table. (Don’t be surprised if you don’t get completely consistent results – much of the optimization will depend on the size of all the relevant tables (tab$, tabcompart$, etc.) in the data dictionary, rather than on the actual definition of this partitioned table.)


rem
rem     Script:         most_recent_3.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2020
rem
rem     Last tested 
rem             19.3.0.0
rem

create table pt_composite_1 (
        id,
        grp,
        small_vc,
        padding
)
nologging
partition by range(id) 
subpartition by hash (grp)
subpartitions 4
(
        partition p2 values less than (400),
        partition p3 values less than (800),
        partition p4 values less than (1600),
        partition p5 values less than (3200)
)
as
select
        rownum                          id,
        trunc(rownum/50)                grp,
        to_char(trunc(rownum/20))       small_vc,
        rpad('x',100)                   padding
from
        all_objects
where 
        rownum <= 3000 -- > comment to avoid wordpress format issue
;

execute dbms_stats.gather_table_stats(user,'pt_composite_1',granularity=>'ALL')


Footnote 2

For reference, here are the outputs I got from executing egrep -n -e "^SU:" -e "TIMER" against the other two CBO trace files.

First for the “standalone” form – note how line 3130 tells us that “there is no correlation”.


806:SU: Considering subquery unnesting in query block MISC$1 (#0)
2947:SU: Unnesting query blocks in query block SEL$071BB01A (#1) that are valid to unnest.
2952:SU: Considering subquery unnest on query block SEL$071BB01A (#1).
2953:SU:   Checking validity of unnesting subquery SEL$4F5F2F29 (#2)
2954:SU:     SU bypassed: No correlation to immediate outer subquery.
2955:SU:     SU bypassed: Failed basic validity checks.
2956:SU:   Validity checks failed.
3130:SU:     SU bypassed: No correlation to immediate outer subquery.

Then for the correlated subquery with /*+ no_unnest */ hint; and line 3122 tells us that SU was bypassed because of a hint/parameter:


809:SU: Considering subquery unnesting in query block MISC$1 (#0)
2945:SU: Unnesting query blocks in query block SEL$071BB01A (#1) that are valid to unnest.
2947:SU: Considering subquery unnest on query block SEL$071BB01A (#1).
2948:SU:   Checking validity of unnesting subquery SEL$4F5F2F29 (#2)
2949:SU:     SU bypassed: Not enabled by hint/parameter.
2950:SU:     SU bypassed: Failed basic validity checks.
2951:SU:   Validity checks failed.
3122:SU:     SU bypassed: Not enabled by hint/parameter.

Neither file showed any “TIMER” information since that appears, by default, only for steps that take longer than one second. (If you want to adjust the granularity, see Franck Pachot’s note on parse time that describes bug/fix_control 16923858.)

June 17, 2020

Lower Cost Ignored

Filed under: CBO,Oracle — Jonathan Lewis @ 3:49 pm BST Jun 17,2020

This is an update on a post I wrote nearly 10 years ago describing how the optimizer could choose to ignore a lower cost indexed access path and use a higher cost index if the lower cost were based on guesswork. The original article used (select {constant} from dual) as a way of supplying an “unpeekable bind” in a predicate, but that mechanism stopped working in 11gR2, hence the update.

The update also goes into a little more detail about event 38036 which can be used to modify this behaviour by defining a “cut-off” percentage where Oracle will switch back to using the lower cost path.

We start with the code to generate the data, including, in this case, a table that I can query to supply “hidden constants” to the optimizer:

rem
rem     Script:         index_range_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2020
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem

create table t1
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual 
        connect by 
                rownum <= 10000 -- > comment to avoid wordpress format issue
)
select
        rownum                  id,
        9 + mod(rownum-1,3)     scattered,
        trunc(rownum / 500)     clustered,
        lpad(mod(rownum,2),10)  ind_pad,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 40000 -- > comment to avoid wordpress format issue
;

create index t1_equi  on t1(scattered, ind_pad, clustered);
create index t1_range on t1(clustered, scattered);

create table driver(name varchar2(20), value number(4,0));
create index drv_i1 on driver(name);

insert into driver values('Forty',40);
insert into driver values('FortyOne',41);
commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'DRIVER',
                method_opt       => 'for all columns size 1'
        );
end;
/


The driver table models a pattern that I see fairly frequently of a table holding a set of “constants” that are subject to infrequent changes. (An alternative strategy is to have a PL/SQL package of constants and a function to return a value when supplied with a name.)

The key columns in the table are

  • scattered: 3 distinct values evenly but randomly scattered across the table – not a column you would index separately
  • clustered: 500 consecutive rows each for 80 distinct values, so very well clustered data

There are two indexes on the main data table

  • t1_range (clustered, scattered) — roughly (80 * 3 =) 240 distinct keys
  • t1_equi (scattered, ind_pad, clustered) — roughly (3 * 2 * 80) = 480 distinct keys

If I execute the query: “select * from t1 where clustered between 40 and 41 and scattered = 10” Oracle uses the t1_range index to return an estimated 496 rows at a cost of 36. The estimates aren’t too far out since in my case the query returned 334 rows in 26 buffer visits (when my SQL*Plus arraysize was 500).

------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |   496 |  8432 |    36   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1       |   496 |  8432 |    36   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_RANGE |   496 |       |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CLUSTERED">=40 AND "SCATTERED"=10 AND "CLUSTERED".le.41)
       filter("SCATTERED"=10)

But what happens if we decide to keep the start and end values for clustered in the driver table:


set autotrace traceonly explain

select 
        /*  index(t1 t1_range) */
        small_vc
from
        t1
where
        scattered = 10
and     clustered between (select value from driver where name = 'Forty')
                  and     (select value from driver where name = 'FortyOne')
;

set autotrace off



You’ll notice the “hint that isn’t a hint”: I’ve tested two versions of the query, one without a hint and one where I inserted the necessary “+” to have a hint instead of a comment. Here are the execution plans (with “<=” edited to “.le.” in the Predicate Information):


=======
Default
=======
-------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |    33 |   561 |    67   (3)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED  | T1      |    33 |   561 |    63   (4)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                    | T1_EQUI |    60 |       |    56   (4)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| DRIVER  |     1 |    11 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | DRV_I1  |     1 |       |     1   (0)| 00:00:01 |
|   5 |    TABLE ACCESS BY INDEX ROWID BATCHED| DRIVER  |     1 |    11 |     2   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN                  | DRV_I1  |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SCATTERED"=10 AND "CLUSTERED">= (SELECT "VALUE" FROM "DRIVER" "DRIVER"
              WHERE "NAME"='Forty') AND "CLUSTERED".le. (SELECT "VALUE" FROM "DRIVER" "DRIVER" WHERE
              "NAME"='FortyOne'))
       filter("CLUSTERED">= (SELECT "VALUE" FROM "DRIVER" "DRIVER" WHERE "NAME"='Forty')
              AND "CLUSTERED".le. (SELECT "VALUE" FROM "DRIVER" "DRIVER" WHERE "NAME"='FortyOne'))
   4 - access("NAME"='Forty')
   6 - access("NAME"='FortyOne')

=============
Hint t1_range
=============
--------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |          |    33 |   561 |    10   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED  | T1       |    33 |   561 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                    | T1_RANGE |    60 |       |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| DRIVER   |     1 |    11 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | DRV_I1   |     1 |       |     1   (0)| 00:00:01 |
|   5 |    TABLE ACCESS BY INDEX ROWID BATCHED| DRIVER   |     1 |    11 |     2   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN                  | DRV_I1   |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("CLUSTERED">= (SELECT "VALUE" FROM "DRIVER" "DRIVER" WHERE "NAME"='Forty')
              AND "SCATTERED"=10 AND "CLUSTERED".le. (SELECT "VALUE" FROM "DRIVER" "DRIVER" WHERE
              "NAME"='FortyOne'))
       filter("SCATTERED"=10)
   4 - access("NAME"='Forty')
   6 - access("NAME"='FortyOne')
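
For reference, the second plan comes from the same statement with the “+” added so that the comment becomes a genuine hint. Nothing else in the text changes:

```sql
set autotrace traceonly explain

select
        /*+ index(t1 t1_range) */
        small_vc
from
        t1
where
        scattered = 10
and     clustered between (select value from driver where name = 'Forty')
                  and     (select value from driver where name = 'FortyOne')
;

set autotrace off
```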

The big difference occurs at operation 2. You should notice two things – first, the estimate of rows is wrong by a factor of 10, but that’s not the point I want to chase. Secondly by default the optimizer has selected the plan using the (far) more expensive index. This is a deliberately coded choice and if you have access to MOS then you should check Document ID 4112254.8: “CBO may not choose best index for single table access with unpeeked bind/function”.

Rerun the test (unhinted) after issuing:

alter session set events '38036 trace name context forever, level 10';

You will find that the optimizer automatically takes the path using the lower cost index. For values of 10 or higher Oracle will pick the lower cost index, for values of 9 or less the optimizer will stick with the higher cost index.
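
A sketch of the complete test cycle, using the tables created above, might look like this. The final statement uses the generic syntax for switching a numeric event off; the level value is the only thing that needs to change between tests:

```sql
-- Set the event with a cut-off level of 10, then re-run the unhinted query
alter session set events '38036 trace name context forever, level 10';

set autotrace traceonly explain

select
        small_vc
from
        t1
where
        scattered = 10
and     clustered between (select value from driver where name = 'Forty')
                  and     (select value from driver where name = 'FortyOne')
;

set autotrace off

-- Switch the event off again to restore the default behaviour
alter session set events '38036 trace name context off';
```

Repeating the cycle with level 9 in place of level 10 is the simplest way to check where the cut-off falls.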

The question, of course, is what’s special about the value 10. Given the clue in the MOS document, and the costs from the plans above, and allowing for some fiddling with the arithmetic we can note the following:

  • Cost of using the expensive index for the table access is 63 (From the 10053 trace it’s actually 62.525996)
  • Cost of using the cheaper index for the table access is 6 (Again, from the 10053 trace, 6.037011)
  • Express 6.037011 / 62.525996 as a percentage and you get 9.655 which rounds to 10.

Probably not a coincidence! (Though it might take a few more tests to decide whether it’s round() or ceiling())
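
The arithmetic is easy to reproduce with a query against dual, using the two costs quoted from the 10053 trace. The column aliases are arbitrary:

```sql
-- Ratio of the two table access costs, expressed as a percentage
select
        round(100 * 6.037011 / 62.525996, 3)    pct,            -- 9.655
        round(100 * 6.037011 / 62.525996)       pct_round,      -- 10
        ceil (100 * 6.037011 / 62.525996)       pct_ceil        -- 10
from
        dual
;
```

Both round() and ceil() give 10 for these figures, which is why this pair of costs can't distinguish between the two possibilities.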

So if the optimizer is picking the wrong index, and the arithmetic for the one it should be using is dictated by guessing, then calculate the ratio of the two costs for the table access, express as a percentage and round up (and add some if you want to) then set the event.

I’m not really sure how much use you could, or should, make of this event. Possibly it’s one of those things you use temporarily to allow you to generate an SQL Baseline (the event setting doesn’t get captured as an “opt_param()” hint unlike other fiddles of this type that you might use). Maybe there are a few special cases in batch jobs where the optimizer is a little inconsistent about which index to pick and the event is relevant and could be set through an ‘alter session’ command. It’s nice to know it’s there – but may be hard to use in practice.

Footnote

If you were wondering about the cardinality estimates of 60 rowids from the index and 33 rows from the table, this is the effect of the standard “guesses for unpeeked binds” the optimizer uses. For a bounded (between) range on an index the selectivity is 0.0045, for a table it’s 0.0025; in this case we then have to include a factor of 1/3 in the arithmetic to allow for the “scattered=10” predicate:

40,000 * 0.0045 * 1/3 = 60

40,000 * 0.0025 * 1/3 = 33.33….

June 5, 2020

Analytic cost error

Filed under: CBO,Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 1:06 pm BST Jun 5,2020

Here’s a surprising costing error that was raised on the Oracle Developer Forum a few days ago. There’s a glitch in the cost attributed to sorting when an analytic over() clause – with corresponding “window sort” operation – makes a “sort order by” operation redundant. Here’s a script to generate the data set I’ll use for a demonstration with a template for a few queries I’ll be running against the data.


rem
rem     Script:         window_sort_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Apr 2020
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem

create table t1
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                    id,
        cast(lpad(rownum,30,'0') as varchar2(30)) vc30,
        cast(lpad(rownum,65,'0') as varchar2(65)) vc65,
        lpad('x',100,'x')                         padding
from
        generator
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

select
--      row_number() over (order by vc65) rn1,
--      row_number() over (order by vc30) rn2,
--      vc30,
--      vc65,
        id
from
        t1
-- order by
--      vc65
--      vc30
/


I’m (optionally) using the row_number() analytic function over the entire data set and for each row_number() I include in the select list Oracle will have to sort the data; I’ve also got an (optional) order by on the two columns that appear in the row_number() functions and that may introduce some sorting as well. Here are a few examples of the queries I might run:


prompt  ===========================================
prompt  Select vc30, order by vc30
prompt  ===========================================

explain plan for
select 
        vc30,
        id 
from 
        t1
order by
        vc30
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes'));

prompt  ==========================================
prompt  Select row_number over vc30 - no ordering
prompt  ==========================================

explain plan for
select 
        row_number() over (order by vc30) rn2, 
        id 
from 
        t1
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes'));

prompt  ===========================================
prompt  Select row_number over vc30 - order by vc65
prompt  ===========================================

explain plan for
select 
        row_number() over (order by vc30) rn2, 
        id 
from 
        t1
order by
        vc65
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes'));

In the first query we select and sort vc30 so (approximately) we sort 10,000 rows x 30(-ish) bytes for 300K of sort space. In the second query we generate the row_number() based on sorting vc30 – the size of the output is much smaller (it’s only 10,000 numbers between 1 and 10,000) but to generate those numbers we do have to select and sort vc30, so the workload (predicted and actual) will probably be similar to that of the first query. In the final query we have to select and sort vc30 to generate the row_number() but we also have to select (without reporting) and sort vc65 in order to report the results in the right order – so we should expect the workload to be roughly 3 times the size (approximately 10,000 * (30 + 65) bytes). Here, from 12.2.0.1, are the execution plans (with a little cosmetic tidying):

===========================================
Select vc30, order by vc30
===========================================
------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |   341K|       |   130   (5)|
|   1 |  SORT ORDER BY     |      | 10000 |   341K|   448K|   130   (5)|
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   341K|       |    42   (5)|
------------------------------------------------------------------------

==========================================
Select row_number over vc30 - no ordering
==========================================
------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |   341K|       |   130   (5)|
|   1 |  WINDOW SORT       |      | 10000 |   341K|   448K|   130   (5)|
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   341K|       |    42   (5)|
------------------------------------------------------------------------

===========================================
Select row_number over vc30 - order by vc65
===========================================
-------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 10000 |   986K|       |   463   (3)|
|   1 |  SORT ORDER BY      |      | 10000 |   986K|  1120K|   463   (3)|
|   2 |   WINDOW SORT       |      | 10000 |   986K|  1120K|   463   (3)|
|   3 |    TABLE ACCESS FULL| T1   | 10000 |   986K|       |    42   (5)|
-------------------------------------------------------------------------

As expected, the execution plans are roughly consistent with the estimates I gave for volume of data – the agreement between the query with order by vc30 and the query with over(order by vc30), and the increased load of ordering by vc65 when selecting the row_number(over vc30) is good.

So let’s see what the plan looks like when we select row_number(over vc30) and then order by vc30. If the optimizer is smart it will recognise that it’s possible to adopt a strategy that allows it to take advantage of the sorting from the over() clause to avoid a separate sort order by:


explain plan for
select 
        row_number() over (order by vc30) rn2, 
        id 
from 
        t1
order by
        vc30
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes'));

----------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |   341K|    42   (5)|
|   1 |  WINDOW SORT       |      | 10000 |   341K|    42   (5)|
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   341K|    42   (5)|
----------------------------------------------------------------

The plan shows us that Oracle has used the 10gR2 “order by elimination” feature to bypass the need for a “sort order by” operation because it knows the data will be arriving in the right order from the “Window Sort” operation. Unfortunately it also shows us that Oracle has lost the cost of doing the Window Sort!

Let’s try another experiment – let’s generate two different row_number() columns, with and without ordering:


prompt  =====================================
prompt  Select Both row_numbers - no ordering
prompt  =====================================

explain plan for
select 
        row_number() over (order by vc65) rn1, 
        row_number() over (order by vc30) rn2, 
        id 
from 
        t1
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes projection'));

prompt  ======================================
prompt  Select Both row_numbers order by vc30
prompt  ======================================

explain plan for
select 
        row_number() over (order by vc65) rn1, 
        row_number() over (order by vc30) rn2, 
        id 
from 
        t1
order by
        vc30
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes projection'));

prompt  ============================================
prompt  Select Both row_numbers order by vc65
prompt  ============================================

explain plan for
select 
        row_number() over (order by vc65) rn1, 
        row_number() over (order by vc30) rn2, 
        id 
from 
        t1
order by
        vc65
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes projection'));

You’ll notice that I’ve included a request for the projection information in the plans for these examples so that you can see what columns are passed up from each operation to its parent. Again, though, we’ll start by focusing on just the costs:


-------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 10000 |   986K|       |   463   (3)|
|   1 |  WINDOW SORT        |      | 10000 |   986K|  1120K|   463   (3)|
|   2 |   WINDOW SORT       |      | 10000 |   986K|  1120K|   463   (3)|
|   3 |    TABLE ACCESS FULL| T1   | 10000 |   986K|       |    42   (5)|
-------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=1) "VC30"[VARCHAR2,30], "VC65"[VARCHAR2,65],
       "ID"[NUMBER,22], ROW_NUMBER() OVER ( ORDER BY "VC65")[22], ROW_NUMBER()
       OVER ( ORDER BY "VC30")[22]
   2 - (#keys=1) "VC65"[VARCHAR2,65], "ID"[NUMBER,22],
       "VC30"[VARCHAR2,30], ROW_NUMBER() OVER ( ORDER BY "VC65")[22]
   3 - "ID"[NUMBER,22], "VC30"[VARCHAR2,30], "VC65"[VARCHAR2,65]


-------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 10000 |   986K|       |   253   (3)|
|   1 |  WINDOW SORT        |      | 10000 |   986K|  1120K|   253   (3)|
|   2 |   WINDOW SORT       |      | 10000 |   986K|  1120K|   253   (3)|
|   3 |    TABLE ACCESS FULL| T1   | 10000 |   986K|       |    42   (5)|
-------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=1) "VC30"[VARCHAR2,30], "VC65"[VARCHAR2,65],
       "ID"[NUMBER,22], ROW_NUMBER() OVER ( ORDER BY "VC65")[22], ROW_NUMBER()
       OVER ( ORDER BY "VC30")[22]
   2 - (#keys=1) "VC65"[VARCHAR2,65], "ID"[NUMBER,22],
       "VC30"[VARCHAR2,30], ROW_NUMBER() OVER ( ORDER BY "VC65")[22]
   3 - "ID"[NUMBER,22], "VC30"[VARCHAR2,30], "VC65"[VARCHAR2,65]


-------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 10000 |   986K|       |   253   (3)|
|   1 |  WINDOW SORT        |      | 10000 |   986K|  1120K|   253   (3)|
|   2 |   WINDOW SORT       |      | 10000 |   986K|  1120K|   253   (3)|
|   3 |    TABLE ACCESS FULL| T1   | 10000 |   986K|       |    42   (5)|
-------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=1) "VC65"[VARCHAR2,65], "VC30"[VARCHAR2,30],
       "ID"[NUMBER,22], ROW_NUMBER() OVER ( ORDER BY "VC30")[22], ROW_NUMBER()
       OVER ( ORDER BY "VC65")[22]
   2 - (#keys=1) "VC30"[VARCHAR2,30], "ID"[NUMBER,22],
       "VC65"[VARCHAR2,65], ROW_NUMBER() OVER ( ORDER BY "VC30")[22]
   3 - "ID"[NUMBER,22], "VC30"[VARCHAR2,30], "VC65"[VARCHAR2,65]

The first query – without an “order by” clause – reports a cost of 463; add an “order by” clause and the cost drops to 253 (and the “order by” clause doesn’t appear as a sort order by operation in the plan). The cost differential between the ordered and “unordered” plans, by the way, is 210 (and from there down to the base tablescan is another 211) – and here’s another way to see that number (+/- 1) appearing:


explain plan for
select
        vc65,
        id
from
        t1
order by
        vc30
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes'));

------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |   986K|       |   253   (3)|
|   1 |  SORT ORDER BY     |      | 10000 |   986K|  1120K|   253   (3)|
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   986K|       |    42   (5)|
------------------------------------------------------------------------

The cost of the query with no “order by” clause is basically the cost of a table scan plus two sorts of (vc65, vc30, plus a few bits). When you add in an “order by” clause the optimizer discards the “order by” clause and then subtracts one of the sort costs as well.

CBO trace file

Every time I say something about 10053 (CBO) trace files I feel compelled to remind everyone that I rarely look at them, and then it’s usually because I think I know there’s a bug and where I’ll find it in the trace. That’s exactly the case here.

I’m expecting to see two differences in the trace files between the “no order” query and a query where I’ve added in an “order by” clause. The first is that one trace file will have an “OBYE” (eliminate order by) comment which won’t be in the other trace; the second is that one trace file will have a “cost for SORT” calculation which won’t be in the other.

So here are the relevant bits – first from the query without the order by clause:


OBYE:   Considering Order-by Elimination from view SEL$1 (#0)
***************************
Order-by elimination (OBYE)
***************************
OBYE:     OBYE bypassed: no order by to eliminate.

...

GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
Join order[1]:  T1[T1]#0
WiF sort
    SORT ressource         Sort statistics
      Sort width:         497 Area size:      435200 Max Area size:    87240704
      Degree:               1
      Blocks to Sort: 150 Row size:     122 Total Rows:          10000
      Initial runs:   2 Merge passes:  1 IO Cost / pass:         56
      Total IO sort cost: 206.000000      Total CPU sort cost: 12180621
      Total Temp space used: 1147000
    SORT ressource         Sort statistics
      Sort width:         497 Area size:      435200 Max Area size:    87240704
      Degree:               1
      Blocks to Sort: 150 Row size:     122 Total Rows:          10000
      Initial runs:   2 Merge passes:  1 IO Cost / pass:         56
      Total IO sort cost: 206.000000      Total CPU sort cost: 12180621
      Total Temp space used: 1147000
***********************
Best so far:  Table#: 0  cost: 463.384707  card: 10000.000000  bytes: 1010000.000000
***********************

And from one of the plans with an order by:


OBYE:   Considering Order-by Elimination from view SEL$1 (#0)
***************************
Order-by elimination (OBYE)
***************************
OBYE:     OBYE performed.

...

GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
Join order[1]:  T1[T1]#0
WiF sort
    SORT ressource         Sort statistics
      Sort width:         497 Area size:      435200 Max Area size:    87240704
      Degree:               1
      Blocks to Sort: 150 Row size:     122 Total Rows:          10000
      Initial runs:   2 Merge passes:  1 IO Cost / pass:         56
      Total IO sort cost: 206.000000      Total CPU sort cost: 12180621
      Total Temp space used: 1147000
***********************
Best so far:  Table#: 0  cost: 252.512458  card: 10000.000000  bytes: 1010000.000000
***********************

As you can see, the first (no order) trace file has two sort calculations under WiF sort (Window Function?), while the second (order by) trace file reports “OBYE performed” and then loses one of its WiF sorts.

Note: If I had ordered by ID I would have seen two calculations of cost under the heading of WiF sort and a third calculation below that with the heading ORDER BY sort. Unfortunately when I ran the test to check this I also found that the OBYE report said: “OBYE performed” when it wasn’t relevant and there wasn’t an “order by” available for elimination.
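For reference, the “order by id” variant mentioned in the note would look something like the sketch below. It reuses the t1 table and the dbms_xplan call from the earlier tests; the only change is the final “order by” column. The event syntax for enabling the 10053 trace is one common way of doing it and is my assumption, not necessarily the method used for the tests in this note.

```sql
-- Sketch only: enable the CBO trace for this session (assumed method)
alter session set events '10053 trace name context forever';

-- Same two analytic functions as before, but ordered by a column (id)
-- that matches neither over() clause.
explain plan for
select
        row_number() over (order by vc65) rn1,
        row_number() over (order by vc30) rn2,
        id
from
        t1
order by
        id
/

select * from table(dbms_xplan.display(null,null,'basic cost rows bytes projection'));

-- Switch the trace off again
alter session set events '10053 trace name context off';
```

The WiF sort and ORDER BY sort calculations appear only in the trace file, not in the dbms_xplan output, so the plan on its own won’t show the three separate costings.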

Interestingly I tested to see if I could change the optimizer’s behaviour by adding the no_eliminate_oby(@sel$1) hint to the “order by” queries but the GENERAL PLANS section didn’t change, even though the trace file reported: “OBYE: OBYE bypassed: hinted”, and the “Hint Report” from the Oracle 19c execution plan acknowledged the hint as legal and used.
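As an illustration, here is roughly what one of the hinted tests would look like. This is a sketch based on the “order by vc30” query shown earlier; the exact text used for the test isn’t reproduced in this note, and the hint_report format option is my assumption about how to get the Hint Report into the output (it is only available from 19c).

```sql
-- Sketch: ask the optimizer NOT to eliminate the order by in query block sel$1
explain plan for
select
        /*+ no_eliminate_oby(@sel$1) */
        row_number() over (order by vc65) rn1,
        row_number() over (order by vc30) rn2,
        id
from
        t1
order by
        vc30
/

-- "+hint_report" (19c onwards) adds the Hint Report section to the plan output
select * from table(dbms_xplan.display(null,null,'basic cost rows bytes projection +hint_report'));
```

If the behaviour described above holds on your version, the Hint Report should list the hint without flagging it as unused or invalid, while the cost in the plan stays at the “one sort missing” value.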

Summary

If you have an “order by” clause in a query block that includes analytic functions and the optimizer decides that it can eliminate the “order by” and rely on the side effect of an analytic over() clause you may find that the cost of the query block is reduced by the cost of one of the Window Sort operations. (Technically this might lead to cases where the optimizer then made some poor choices in overall shape of the execution plan – though such cases might be very rare given that this costing error doesn’t affect the cardinality estimates.)

Lagniappe

In the last set of tests I added in the projection information as a simple example of a case where it can help you understand a little more of what the plan is supposed to achieve.  If you examine the last two query plans carefully (select both row_number() values and order by vc30 / vc65 respectively) Operation 2 of the first plan reports:

2 - (#keys=1) "VC65"[VARCHAR2,65], "ID"[NUMBER,22], "VC30"[VARCHAR2,30], ROW_NUMBER() OVER ( ORDER BY "VC65")[22]

while operation 2 of the second plan reports:

2 - (#keys=1) "VC30"[VARCHAR2,30], "ID"[NUMBER,22], "VC65"[VARCHAR2,65], ROW_NUMBER() OVER ( ORDER BY "VC30")[22]

It’s not until you look at these two lines that the plans show any differences – operations 1 and 2 simply say “Window Sort” without giving any clue about which window sort is for which over() clause. The projection information, though, tells you which way around the over() clauses operated – in the first query the over(order by vc65) is applied to the result of the tablescan first, while in the second query it’s the over(order by vc30) that is applied first.

Lagniappe 2

There’s another little oddity you might spot when you look at the projection information and think about the WiF sort costs from the unordered query. The columns passed from operation 3 to operation 2 are:

 3 - "ID"[NUMBER,22], "VC30"[VARCHAR2,30], "VC65"[VARCHAR2,65]

The columns passed from operation 2 to operation 1 are one of:

2 - (#keys=1) "VC65"[VARCHAR2,65], "ID"[NUMBER,22], "VC30"[VARCHAR2,30], ROW_NUMBER() OVER ( ORDER BY "VC65")[22]

2 - (#keys=1) "VC30"[VARCHAR2,30], "ID"[NUMBER,22], "VC65"[VARCHAR2,65], ROW_NUMBER() OVER ( ORDER BY "VC30")[22]

Operation 2 sorts the data from operation 3, and operation 1 sorts the data from operation 2 – but the columns arriving from operation 2 have an extra row_number()[22] added to them. So when you look in the trace file at the two Wif Sort calculations why do they both show:

Blocks to Sort: 150 Row size: 122 Total Rows: 10000

Shouldn’t one of them show a Row size that’s (at least) 22 longer than the other?

June 1, 2020

Order By

Filed under: Bugs,CBO,Indexing,Oracle — Jonathan Lewis @ 1:05 pm BST Jun 1,2020

This is a brief note with an odd history – and the history is more significant than the note.

While searching my library for an example of an odd costing effect for the “order by” clause I discovered a script that looked as if I’d written it for 11.1.0.6 in 2008 to demonstrate a redundant sort operation appearing in an execution plan; and then I discovered a second script written for 11.2.0.4 in 2014 demonstrating a variant of the same thing (presumably because I’d not found the original script in 2014) and the second script referenced a MOS bug number:

Bug 18701129 : SORT ORDER BY ISN’T AVOIDED WHEN ROWID IS ADDED TO ORDER BY CLAUSE

Whenever I “discover” an old bug test I tend to re-run it to check whether or not the bug has been fixed.  So that’s what I did, and found that the anomaly was still present in 19.3.0.0. The really odd thing, though, was that the bug note no longer existed – and even after I’d done a few searches involving the text in the description I couldn’t manage to find it!

For the record, here’s the original 2008 script (with a couple of minor edits)


rem
rem     Script:         order_by_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2008
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0        Still sorting
rem             12.2.0.1
rem             11.1.0.6
rem

set linesize 180
set pagesize 60

create table test 
as 
select  * 
from    all_objects 
where   rownum <= 10000 -- >  comment to avoid wordpress format issue
;

alter table test modify object_name not null;
create index i_test_1 on test(object_name);

analyze table test compute statistics;

set serveroutput off
alter session set statistics_level = all;

select  * 
from    (select * from test order by object_name) 
where 
        rownum < 11 -- > comment to avoid wordpress format issue
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));



select  * 
from    (select /*+ index(test) */ * from test order by object_name,rowid) 
where
        rownum < 11 -- > comment to avoid wordpress format issue
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter session set statistics_level = typical;
set serveroutput on

Yes, that is an analyze command – it’s a pretty old script and I must have been a bit lazy about writing it. (Or, possibly, it’s a script from an Oracle-l or Oracle forum posting and I hadn’t re-engineered it.)

I’ve run two queries – the first uses an inline view to impose an order on some data and then selects the first 10 rows. The second query does nearly the same thing but adds an extra column to the “order by” clause – except it’s not a real column, it’s the rowid pseudo-column. Conveniently there’s an index on the table that is a perfect match for the “order by” clause and it’s on a non-null column so the optimizer can walk the index in order and stop after 10 rows.

Adding the rowid to the “order by” clause shouldn’t make any difference to the plan as the index Oracle is using is a single column non-unique index, which means that the internal representation makes it a two-column index where the rowid is (quite literally) stored as the second column. But here are the two execution plans:


----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |      1 |        |     10 |00:00:00.01 |       7 |
|*  1 |  COUNT STOPKEY                |          |      1 |        |     10 |00:00:00.01 |       7 |
|   2 |   VIEW                        |          |      1 |     10 |     10 |00:00:00.01 |       7 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TEST     |      1 |  10000 |     10 |00:00:00.01 |       7 |
|   4 |     INDEX FULL SCAN           | I_TEST_1 |      1 |     10 |     10 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<11)



----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |          |      1 |        |     10 |00:00:00.01 |    4717 |       |       |          |
|*  1 |  COUNT STOPKEY                         |          |      1 |        |     10 |00:00:00.01 |    4717 |       |       |          |
|   2 |   VIEW                                 |          |      1 |  10000 |     10 |00:00:00.01 |    4717 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY               |          |      1 |  10000 |     10 |00:00:00.01 |    4717 |  4096 |  4096 | 4096  (0)|
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| TEST     |      1 |  10000 |  10000 |00:00:00.01 |    4717 |       |       |          |
|   5 |      INDEX FULL SCAN                   | I_TEST_1 |      1 |  10000 |  10000 |00:00:00.01 |      44 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<11)
   3 - filter(ROWNUM<11)


When I add the rowid to the “order by” clause the optimizer no longer sees walking the index as an option for avoiding work; it wants to collect all the rows from the table, sort them, and then report the first 10. In fact walking the index became such an expensive option that I had to hint the index usage (hence the non-null declaration) to make the optimizer choose it; the default plan for 19.3 was a full tablescan and sort.

It’s just a little example of an edge case, of course. It’s a pity that the code doesn’t recognise the rowid as (effectively) a no-op addition to the ordering when the rest of the “order by” clause matches the index declaration, but in those circumstances the rowid needn’t be there at all and you wouldn’t expect anyone to include it.

As I said at the start – the interesting thing about this behaviour is that it was once described in a bug note that has since disappeared from public view.

 

February 17, 2020

Join Elimination bug

Filed under: Bugs,CBO,Execution plans,Oracle — Jonathan Lewis @ 3:37 pm GMT Feb 17,2020

It is possible to take subquery factoring (common table expressions / CTEs) too far. The most important purpose of factoring is to make a complex query easier to understand – especially if you can identify a messy piece of text that is used in more than one part of the query – but I have seen a couple of patterns appearing that make the SQL harder to read.

  • In one pattern each table is given its own factored subquery holding the non-join predicates (and, possibly, filter subqueries) and then the main query is just a simple join of the factored subqueries with nothing but join (ON) predicates.
  • In another pattern each factored subquery consists of the previous subquery with one more table added to it, so every subquery is no more than a “two-table” query block and the final subquery is a simple “select from last_factored_subquery”.

Neither of these patterns is helpful – but today’s blog note is not going to be about going to extremes with subquery factoring; instead it’s an example of a fairly reasonable use of subquery factoring that ran into a “wrong results” bug.

Consider a system that collects data from some type of meters. Here’s a two-table definition for meters and meter readings:


create table meters (
        meter_id        number,
        meter_type      varchar2(10),  -- references meter_types
        date_installed  date,
        padding         varchar2(100),
        constraint met_pk primary key(meter_id)
)
;

create table meter_readings (
        meter_id        number,
        date_read       date,
        reading         number(10,3),
        padding         varchar2(100),
        constraint      mrd_pk primary key(meter_id, date_read),
        constraint      mrd_fk_met foreign key (meter_id) references meters
)
;

insert into meters
select
        1e6 + rownum,
        case mod(rownum,3)
                when 0 then 'A'
                when 1 then 'B'
                       else 'C'
        end,
        trunc(sysdate) - mod(rownum,5),
        rpad('x',100,'x')
from
        dual 
connect by 
        level <= 10
;

execute dbms_stats.gather_table_stats(null,'meters')

insert into meter_readings 
select
        met.meter_id,
        met.date_installed - v.id + 2,
        dbms_random.value,
        rpad('x',100,'x')
from
        meters met,
        (select rownum id from dual connect by level <= 4) v
;

commit;

execute dbms_stats.gather_table_stats(null,'meter_readings')

I’ve picked the obvious primary keys for the two tables and defined the appropriate referential integrity constraint – which means the optimizer should be able to choose the best possible strategies for any query that joins the two tables.

I’ve created a very small data set – a few meters installed in the last few days, and a few readings per meter over the last few days. So let’s report the readings for the last 48 hours, and include in the output any meters that haven’t logged a reading in that interval.

Here’s the query I wrote, with its output, running on a 19.3 instance on 17th Feb 2020:


with mrd_cte as (
        select 
                meter_id, date_read, reading
        from 
                meter_readings
        where 
                date_read in (trunc(sysdate), trunc(sysdate)+1)
)
select
        met.meter_id, met.date_installed, mrd_cte.date_read, reading
from
        meters met
left join 
        mrd_cte
on      mrd_cte.meter_id = met.meter_id
;

  METER_ID DATE_INST DATE_READ    READING
---------- --------- --------- ----------
   1000001 16-FEB-20 17-FEB-20       .063
   1000002 15-FEB-20
   1000003 14-FEB-20
   1000004 13-FEB-20
   1000005 17-FEB-20 18-FEB-20        .37
   1000005 17-FEB-20 17-FEB-20       .824
   1000006 16-FEB-20 17-FEB-20       .069
   1000007 15-FEB-20
   1000008 14-FEB-20
   1000009 13-FEB-20
   1000010 17-FEB-20 17-FEB-20       .161
   1000010 17-FEB-20 18-FEB-20       .818

12 rows selected.

The query returns 12 rows – which SQL*Plus can report because it counts them as it fetches them so it can give you the total at the end of the query.

Of course, sometimes people write preliminary queries to find out how big the result set would be before they run the query to acquire the result set itself. In cases like that (where they’re just going to select a “count(*)”) the optimizer may choose a different execution path from the base query – perhaps finding a way to do an index-only execution, and maybe eliminating a few table joins from the query. So let’s execute a count of the above query:


rem
rem     Script:         join_elimination_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2020
rem

with mrd_cte as (
        select 
                meter_id, date_read, reading
        from 
                meter_readings
        where 
                date_read in (trunc(sysdate), trunc(sysdate)+1)
)
select count(*) from (
        select
                met.meter_id, met.date_installed, mrd_cte.date_read, mrd_cte.reading
        from
                meters met
        left join 
                mrd_cte
        on      mrd_cte.meter_id = met.meter_id
)
;

  COUNT(*)
----------
        10

1 row selected.

You’ll have to take my word for it, of course, but no-one else was using this database while I was running this test, and no-one else has access to the schema I was using anyway. Unfortunately when I count the 12 rows instead of reporting them Oracle thinks there are only 10 rows. Oops!

Step 1 in investigating the problem – check the execution plans to see if there are any differences in the structure of the plan, the use of predicates, or the outline information. I won’t bother with the plan for the base query because it was very obvious from the count query where the problem lay.


-------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |       |     1 (100)|          |
|   1 |  SORT AGGREGATE  |        |     1 |            |          |
|   2 |   INDEX FULL SCAN| MET_PK |    10 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$69B21C86")
      ELIMINATE_JOIN(@"SEL$00F67CF8" "METER_READINGS"@"SEL$1")
      OUTLINE(@"SEL$00F67CF8")
      MERGE(@"SEL$C43CA2CA" >"SEL$2")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$C43CA2CA")
      MERGE(@"SEL$D28F6BD4" >"SEL$E6E74641")
      OUTLINE(@"SEL$E6E74641")
      ANSI_REARCH(@"SEL$4")
      OUTLINE(@"SEL$D28F6BD4")
      MERGE(@"SEL$1" >"SEL$006708EA")
      OUTLINE(@"SEL$4")
      OUTLINE(@"SEL$006708EA")
      ANSI_REARCH(@"SEL$3")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$3")
      INDEX(@"SEL$69B21C86" "MET"@"SEL$3" ("METERS"."METER_ID"))
      END_OUTLINE_DATA
  */

This is the plan as pulled from memory by a call to dbms_xplan.display_cursor(). We note particularly the following: meter_readings doesn’t appear in the plan, there is no predicate section (and no asterisks against any of the operations that would tell us there ought to be some predicate information), and there’s a very revealing ELIMINATE_JOIN(@"SEL$00F67CF8" "METER_READINGS"@"SEL$1") in the outline information.

For some reason the optimizer has decided that it’s okay to remove meter_readings from the query (even though there may be many meter readings for each meter), so it was inevitable that it produced the wrong result.

Despite my opening note, this is not an issue with subquery factoring – it just looked that way when I started poking at the problem. In fact, if you rewrite the query using an inline view you get the same error, if you turn the inline view into a stored view you get the error, and if you turn the whole query into a simple (left) join with the date predicate as part of the ON clause you still get the error.
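To make the last of those variations concrete, here is one way the “simple (left) join with the date predicate as part of the ON clause” might be written. This is my reading of that description rather than the exact text of the test, and the alias mrd is just a name picked for the illustration.

```sql
-- Sketch: no CTE, no inline view; the date filter on meter_readings
-- sits in the ON clause so that meters without a recent reading
-- are still preserved by the outer join.
select
        count(*)
from
        meters met
left join
        meter_readings mrd
on      mrd.meter_id  = met.meter_id
and     mrd.date_read in (trunc(sysdate), trunc(sysdate)+1)
;
```

The correct answer is the number of rows the uncounted join returns; on an affected version the count may come back as just the number of rows in meters.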

The problem lies somewhere in the join elimination transformation. If you go back to the outline information from the bad plan you’ll see the line: ELIMINATE_JOIN(@"SEL$00F67CF8" "METER_READINGS"@"SEL$1") – by changing this to NO_ELIMINATE_JOIN(…) and adding it to the main query block I got a suitable plan joining the two tables and producing the right result.
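A sketch of that workaround, assuming the hint goes into the outermost select and that the query block and alias names are copied from the outline of the bad plan. Those names are generated by the optimizer, so check the outline on your own system before borrowing them.

```sql
with mrd_cte as (
        select
                meter_id, date_read, reading
        from
                meter_readings
        where
                date_read in (trunc(sysdate), trunc(sysdate)+1)
)
select
        /*+
                -- names taken from the ELIMINATE_JOIN line in the outline;
                -- they may differ on another version or for a different query text
                no_eliminate_join(@"SEL$00F67CF8" "METER_READINGS"@"SEL$1")
        */
        count(*)
from (
        select
                met.meter_id, met.date_installed, mrd_cte.date_read, mrd_cte.reading
        from
                meters met
        left join
                mrd_cte
        on      mrd_cte.meter_id = met.meter_id
)
;
```

It’s worth pulling the plan from memory again after adding the hint to confirm that meter_readings (or its primary key index) now appears in the plan.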

The problem appears in 12.2.0.1 and 19.5.0.0 (tested on livesql) – but does not appear in 12.1.0.2 or 11.2.0.4

There is a known bug associated with this problem:

Bug: 29182901
Abstract: WRONG COUNT WAS RETURNED WHEN _OPTIMIZER_ENHANCED_JOIN_ELIMINATION=TRUE
Query with Outer Join Returned a Wrong Result due to Join Elimination (Doc ID 29182901.8)

The bug is fixed in 20.1, with a fix that has been backported into the Jan 2020 patches for 19, 18, and 12.1

December 30, 2019

Scalar Subq Bug

Filed under: CBO,dbms_xplan,Execution plans,Oracle,subqueries — Jonathan Lewis @ 9:30 am GMT Dec 30,2019

This is an observation that came up on the Oracle Developer Forum a couple of days ago, starting life as the fairly common problem:

I have a “select” that runs quickly  but when I use in a “create as select” it runs very slowly.

In many cases this simply means that the query was a distributed query and the plan changed because the driving site changed from the remote to the local server. There are a couple of other reasons, but distributed DML is the one most commonly seen.

In this example, though, the query was not a distributed query, it was a fully local query. There were three features to the query that were possibly suspect, though:

  • “ANSI” syntax
  • scalar subqueries in the select list
  • redundant “order by” clauses in inline views

The OP had supplied the (horrible) SQL in a text format along with images from the Enterprise Manager SQL Monitor screen showing the two execution plans and two things were obvious from the plans – first that the simple select had eliminated the scalar subqueries (which were redundant) while the CTAS had kept them in the plan, and secondly most of the elapsed time for the CTAS was spent in lots of executions of the scalar subqueries.

My first thought was that the problem was probably a quirk of how the optimizer translates “ANSI” SQL to Oracle-standard SQL, so I created a model that captured the key features of the problem – starting with 3 tables:

rem
rem     Script:         ctas_scalar_subq.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2019
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem             11.2.0.4
rem

create table t1 as
select * from all_objects
where rownum <= 10000 -- > comment to avoid wordpress format issue
;

alter table t1 add constraint t1_pk primary key(object_id);

create table t2 as
select * from t1
;

alter table t2 add constraint t2_pk primary key(object_id);

create table t3 as
select * from all_objects
where rownum <= 500 -- > comment to avoid wordpress format issue
;

alter table t3 add constraint t3_pk primary key(object_id);

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T2',
                method_opt  => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T3',
                method_opt  => 'for all columns size 1'
        );
end;
/

I’m going to use the small t3 table as the target for a simple scalar subquery in the select list of a query that selects some columns from t2; then I’m going to use that query as an inline view in a join to t1 and select some columns from the result. Here’s the starting query that’s going to become an inline view:


select 
        t2.*,
        (
        select  t3.object_type 
        from    t3 
        where   t3.object_id = t2.object_id
        )       t3_type
from
        t2
order by
        t2.object_id
;

And here’s how I join the result to t1:


explain plan for
        select
                v2.*
        from    (
                select
                        t1.object_id,
                        t1.object_name  t1_name,
                        v1.object_name  t2_name,
                        t1.object_type  t1_type,
                        v1.object_type  t2_type
                from
                        t1
                join (
                        select 
                                t2.*,
                                (
                                select  t3.object_type 
                                from    t3 
                                where   t3.object_id = t2.object_id
                                )       t3_type
                        from
                                t2
                        order by
                                t2.object_id
                )       v1
                on
                        v1.object_id = t1.object_id
                and     v1.object_type = 'TABLE'
                )       v2
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

The initial t2 query becomes an inline view called v1, and that becomes the second table in a join with t1. I’ve got the table and view in this order because initially the OP had an outer (left) join preserving t1 and I thought that that might be significant, but it turned out that it wasn’t.

Having joined t1 and v1 I’ve selected a small number of columns from the t1 and t2 tables and ignored the column that was generated by the inline scalar subquery. (This may seem a little stupid – but the same problem appears when the inline view is replaced with a stored view, which is a more realistic possibility.) Here’s the resulting execution plan (taken from 11.2.0.4 in this case):


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   476 | 31416 |    45  (12)| 00:00:01 |
|*  1 |  HASH JOIN           |      |   476 | 31416 |    45  (12)| 00:00:01 |
|   2 |   VIEW               |      |   476 | 15708 |    23  (14)| 00:00:01 |
|   3 |    SORT ORDER BY     |      |   476 | 41888 |    23  (14)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| T2   |   476 | 41888 |    22  (10)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | T1   | 10000 |   322K|    21   (5)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   4 - filter("T2"."OBJECT_TYPE"='TABLE')

I was a little surprised by this plan as I had expected the optimizer to eliminate the in-line “order by” in view v1 – but even when I changed the code to traditional Oracle join syntax the redundant and wasteful sort at operation 3 still took place. (You might note that the data will be reported in an order dictated by the order of the data arriving from the t1 tablescan thanks to the mechanism of the hash join, so the sort is a total waste of effort.)
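For reference, here is a sketch of what the "traditional" version of the statement might look like. It is my reconstruction of the rewrite rather than a copy of the script used for the test: the ANSI join has simply been replaced by a comma-separated from list with the join and filter predicates moved into a where clause.

```sql
explain plan for
        select
                v2.*
        from    (
                select
                        t1.object_id,
                        t1.object_name  t1_name,
                        v1.object_name  t2_name,
                        t1.object_type  t1_type,
                        v1.object_type  t2_type
                from
                        t1,
                        (
                        select
                                t2.*,
                                (
                                select  t3.object_type
                                from    t3
                                where   t3.object_id = t2.object_id
                                )       t3_type
                        from
                                t2
                        order by
                                t2.object_id
                        )       v1
                where
                        v1.object_id = t1.object_id
                and     v1.object_type = 'TABLE'
                )       v2
;

select * from table(dbms_xplan.display(null,null,'outline alias'));
```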

The plus point, of course, is that the optimizer had been smart enough to eliminate the scalar subquery referencing t3. The value returned from t3 is not needed anywhere in the course of the execution, so it simply disappears.

Now we change from a simple select to a Create as Select which I’ve run, with rowsource execution stats enabled, using Oracle 19.3 for this output:

set serveroutput off
set linesize 156
set trimspool on
set pagesize 60

alter session set statistics_level = all;

create table t4 as
        select  
                v2.*
        from    (
                select
                        t1.object_id,
                        t1.object_name  t1_name,
                        v1.object_name  t2_name,
                        t1.object_type  t1_type,
                        v1.object_type  t2_type
                from
                        t1
                join (
                        select 
                                t2.*,
                                (
                                select  t3.object_type 
                                from    t3 
                                where   t3.object_id = t2.object_id
                                )       t3_type
                        from
                                t2
                        order by 
                                t2.object_id
                )       v1
                on
                        v1.object_id = t1.object_id
                and     v1.object_type = 'TABLE'
                )       v2
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter session set statistics_level = typical;

And here’s the run-time execution plan – showing the critical error and statistics to prove that it really happened:

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Writes |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT           |       |      1 |        |      0 |00:00:00.01 |     471 |      3 |       |       |          |
|   1 |  LOAD AS SELECT                  | T4    |      1 |        |      0 |00:00:00.01 |     471 |      3 |  1042K|  1042K| 1042K (0)|
|   2 |   OPTIMIZER STATISTICS GATHERING |       |      1 |    435 |    294 |00:00:00.01 |     414 |      0 |   256K|   256K|  640K (0)|
|*  3 |    HASH JOIN                     |       |      1 |    435 |    294 |00:00:00.01 |     414 |      0 |  1265K|  1265K| 1375K (0)|
|   4 |     VIEW                         |       |      1 |    435 |    294 |00:00:00.01 |     234 |      0 |       |       |          |
|   5 |      TABLE ACCESS BY INDEX ROWID | T3    |    294 |      1 |     50 |00:00:00.01 |      54 |      0 |       |       |          |
|*  6 |       INDEX UNIQUE SCAN          | T3_PK |    294 |      1 |     50 |00:00:00.01 |       4 |      0 |       |       |          |
|   7 |      SORT ORDER BY               |       |      1 |    435 |    294 |00:00:00.01 |     234 |      0 | 80896 | 80896 |71680  (0)|
|*  8 |       TABLE ACCESS FULL          | T2    |      1 |    435 |    294 |00:00:00.01 |     180 |      0 |       |       |          |
|   9 |     TABLE ACCESS FULL            | T1    |      1 |  10000 |  10000 |00:00:00.01 |     180 |      0 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   6 - access("T3"."OBJECT_ID"=:B1)
   8 - filter("T2"."OBJECT_TYPE"='TABLE')

You’ll notice that the VIEW at operation 4 reports the inline scalar subquery as operations 5 and 6, and the Starts column shows that the scalar subquery executes 294 times – which is the number of rows returned by the scan of table t2. Although my first thought was that this was an artefact of the transformation from ANSI to Oracle syntax it turned out that when I modified the two statements to use traditional Oracle syntax the same difference appeared. Finally I re-ran the CTAS after removing the order by clause in the in-line view and the redundant subquery disappeared from the execution plan.
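A sketch of that final test follows. The only change from the CTAS above is the removal of the order by in the inline view; the target table name t5 is an arbitrary choice of mine so that the statement doesn't collide with the existing t4.

```sql
alter session set statistics_level = all;

create table t5 as              -- t5: hypothetical name for this test
        select
                v2.*
        from    (
                select
                        t1.object_id,
                        t1.object_name  t1_name,
                        v1.object_name  t2_name,
                        t1.object_type  t1_type,
                        v1.object_type  t2_type
                from
                        t1
                join (
                        select
                                t2.*,
                                (
                                select  t3.object_type
                                from    t3
                                where   t3.object_id = t2.object_id
                                )       t3_type
                        from
                                t2
                        -- no "order by" here, so the view is no longer forced to stay unmerged
                )       v1
                on
                        v1.object_id = t1.object_id
                and     v1.object_type = 'TABLE'
                )       v2
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter session set statistics_level = typical;
```

If the behaviour matches the description, the plan should no longer show any operations against t3 or t3_pk.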

Tiny Geek bit

It’s not immediately obvious why there should be such a difference between the select and the CTAS in this case, but the 10053 trace files do give a couple of tiny clues. The CTAS trace file includes the lines:

ORE: bypassed - Top query block of a DML.
TE: Bypassed: Top query block of a DML.
SQT:    SQT bypassed: in a transaction.

The first two suggest that we should expect some cases where DML statements optimise differently from simple queries. The last one is a further indication that differences may appear. (SQT – might this be subquery transformation? It doesn’t appear in the list of abbreviations in the trace file.)

Unfortunately the SELECT trace file also included the line:


SQT:     SQT bypassed: Disabled by parameter.

So “SQT” – whatever that is – being in or out of a transaction may not have anything to do with the difference.
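If you want to look for the same lines in your own system, the following is a minimal sketch of one way to produce a 10053 trace from SQL*Plus. The tracefile_identifier value is an arbitrary tag of mine, and the query against v$diag_info assumes your account has the privilege to see that view.

```sql
alter session set tracefile_identifier = 'ctas_10053';  -- arbitrary tag to make the file easy to find
alter session set events '10053 trace name context forever';

-- Run the statement of interest here. It has to be optimised (hard parsed)
-- while the event is set, e.g. the CTAS above using a table name that
-- doesn't yet exist.

alter session set events '10053 trace name context off';

select  value
from    v$diag_info
where   name = 'Default Trace File'
;
```

Searching the resulting file for "bypassed" should then show whether the same messages appear for your statement.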

Summary

There are cases where optimising a select statement is not sufficient as a strategy for optimising a CTAS statement. In this case it looks as if an inline view which was non-mergeable (thanks to a redundant order by clause) produced the unexpected side-effect that a completely redundant scalar subquery in the select list of the inline view was executed during the CTAS even though it was transformed out of existence for the simple select.

There are some unexpected performance threats in “cut-and-paste” coding and in re-using stored views if you haven’t checked carefully what they do and how they’re supposed to be used.

 

 

October 30, 2019

Strange Estimates.

Filed under: CBO,extended stats,Oracle,Statistics — Jonathan Lewis @ 1:10 pm GMT Oct 30,2019

A question came up on the Oracle-L list server a couple of days ago expressing some surprise at the following execution plan:


--------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                 |       |       |   845 (100)|          |
|   1 |  SORT AGGREGATE                      |                 |     1 |     7 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| ANY_TABLE       | 84827 |   579K|   845   (1)| 00:00:01 |
|   3 |    SORT CLUSTER BY ROWID             |                 | 68418 |       |    76   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                 | ANY_INDEX       | 68418 |       |    76   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("X"."ANY_COLUMN1"='J')
   4 - access("X"."ANY_COLUMN2"=89155)

You’ll notice that this is a very simple query accessing a table by index, yet the estimated table rows found exceeds the estimated number of index entries used to probe the table. How can this happen? The answer (most frequently) is that there’s a mismatch between the table (or, more commonly, column) statistics and the index statistics. This seems to happen very frequently when you start mixing partitioned tables with global (or globally partitioned) indexes but it can happen in very simple cases, especially since a call to gather_table_stats() with cascade set to true and using the auto_sample_size will take a small sample from the index while using a 100% “sample” from the table.

Here’s an example I engineered very quickly to demonstrate the point. There’s no particular reason for the choice of DML I’ve used on the data beyond a rough idea of setting up a percentage of nulls and deleting a non-uniform pattern of rows.


rem
rem     Script:         table_index_mismatch.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2019
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem
create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,1000)                n1,
        mod(rownum,1000)                n2,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1, for columns (n1,n2) size 1'
        );
end;
/

create index t1_i1 on t1(n1);

delete from t1 where mod(trunc(sqrt(n1)),7) = 0;
update t1 set n1 = null where mod(n1,10) = 0;
delete from t1 where mod(n1,10) = trunc(dbms_random.value(0,10));

execute dbms_stats.gather_table_stats(user,'t1',estimate_percent=>1)
execute dbms_stats.gather_index_stats(null,'t1_i1',estimate_percent=> 0.01)

Of course you’re not supposed to collect stats with arbitrary samples in any recent version of Oracle, so going for a 1% and 0.01% sample seems a little daft but I’m just doing that to demonstrate the problem with a very small data set.

After generating the data and gathering the stats I ran a few queries to pick out some critical numbers.


select
        table_name, sample_size, num_rows
from
        user_tables
where
        table_name = 'T1'
/

select 
        index_name, sample_size, num_rows, distinct_keys
from
        user_indexes
where
        table_name = 'T1'
and     index_name = 'T1_I1'
/

select
        column_name, sample_size, num_nulls, num_distinct
from
        user_tab_cols
where
        table_name = 'T1'
and     (
            column_name = 'N1'
         or virtual_column = 'YES'
        )
order by
        column_name
/

You’ll notice that I’ve only picked one of my original columns and any virtual columns. My gather_table_stats() call had a method_opt that included the creation of extended stats for the column group (n1, n2) and I want to report the stats on the resulting virtual column.


TABLE_NAME           SAMPLE_SIZE   NUM_ROWS
-------------------- ----------- ----------
T1                          7865     786500


INDEX_NAME           SAMPLE_SIZE   NUM_ROWS DISTINCT_KEYS
-------------------- ----------- ---------- -------------
T1_I1                     385779     713292           714


COLUMN_NAME                      SAMPLE_SIZE  NUM_NULLS NUM_DISTINCT
-------------------------------- ----------- ---------- ------------
N1                                      7012      85300          771
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS          7865          0          855
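
The system-generated name in the last row isn't very informative on its own. As a small sketch (not part of the original script), a query against user_stat_extensions should confirm that it is the column group on (n1, n2):

```sql
select
        extension_name, extension
from
        user_stat_extensions
where
        table_name = 'T1'
/
```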

A couple of observations on the stats

  • The table sample size is, as expected, 1% of the reported num_rows (the actual count is 778,154).
  • The index sample size is much bigger than expected – but that’s probably related to the normal “select 1,100 leaf blocks strategy”. Because of the skew in the pattern of deleted values it’s possible for the sample size in this model to vary between 694,154 and something in the region of 380,000.
  • The n1 sample size is about 10% smaller than the table sample size – but that’s because I set 10% of the column to null.
  • The column group sample size matches the table sample size because column group hash values are never null, even if an underlying column is null.

So let’s check the execution plan for a very simple query:


set autotrace on explain
select id from t1 where n1 = 140 and n2 = 140;
set autotrace off


---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |   920 | 11960 |   918   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   920 | 11960 |   918   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |   909 |       |     5   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=140)
   2 - access("N1"=140)

The estimate for relevant index rowids is smaller than the estimate for the number of table rows! The difference is not as extreme as the case reported on Oracle-l, but I’m only trying to demonstrate a principle, not reproduce the exact results.

There are several ways in which contradictory results like this can appear – but in this case we can see the following:

  • For the table access: table.num_rows / column.num_distinct = 786,500 / 855 = 919.88 (using the column group num_distinct)
  • For the index range scan: (table.num_rows – column.num_nulls) / column.num_distinct = (786500 – 85300) / 771 = 909.47 (using the n1 statistics)
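
The two calculations are easy to check with a quick query. This is just the arithmetic from the bullet points, using the figures reported in the statistics above as literals:

```sql
select
        round(786500 / 855, 2)                  table_estimate,  -- num_rows / column group num_distinct
        round((786500 - 85300) / 771, 2)        index_estimate   -- (num_rows - n1 num_nulls) / n1 num_distinct
from
        dual
;

-- TABLE_ESTIMATE INDEX_ESTIMATE
-- -------------- --------------
--         919.88         909.47
```

Rounded up these give the 920 and 909 that appear in the Rows column of the plan.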

So the change in strategy as it becomes possible for the optimizer to take advantage of the column group means the index and table have been using incompatible sets of stats (in particular there’s that loss of information about NULLs) as their cardinalities are calculated. The question, then, is “how much is that likely to matter”, and the follow-up if it can matter is “in what circumstances could the effect be large enough to cause problems”. But that’s a topic for another day.

Update / Footnote

In the case of the Oracle-l example, there was no column group, and in some cases the optimizer would produce a plan where the table estimate was much smaller than the index estimate, and in other cases (like the opening plan above) the table estimate was significantly greater than the index estimate. This was a side effect of adaptive statistics: the low table estimate was due to the basic “multiply separate selectivities” strategy; but with adaptive statistics enabled Oracle started sampling the table to check the correlation between the two columns, and then produced an SQL Plan Directive to do so and got to the higher (and correct) result.

 

 

October 23, 2019

Clustering_Factor

Filed under: CBO,Indexing,Oracle — Jonathan Lewis @ 9:56 pm BST Oct 23,2019

A few days ago I published a little note of a script I wrote some time ago to estimate the clustering_factor of an index before it had been built. At the time I pointed out that one of its limitations was that it would not handle cases where you were planning to set the table_cached_blocks preference, but a couple of days later I decided that I’d write another version of the code that would cater for the new feature – and that’s how I made an embarrassing discovery.

Having reviewed a number of notes I’ve published about the table_cached_blocks preference and its impact on the clustering_factor I’ve realised that what I’ve written has always been open to two interpretations – the one that I had in mind as I was writing, and the correct one. I made this discovery because I had written a simple SQL statement – using the match_recognize() mechanism – to do what I considered to be the appropriate calculation. After testing the query with a few sets of sample data that produced the correct results I emailed Stew Ashton (my “go-to” person for match_recognize() questions) asking if he would do a sanity check on the code because it was rather slow and I wondered if there was a better way of writing it.

His reply was roughly:

“I’ve read the notes you and Richard Foote have written about the clustering_factor and table_cached_blocks, and this isn’t doing what your description says it should.”

Then he explained what he had inferred from what I had written … and it made more sense than what I had been thinking when I wrote it. He also supplied some code to implement his interpretation – so I designed a couple of data models that would produce the wrong prediction for whichever piece of code implemented the wrong interpretation. His code gave the right answers, mine didn’t.

So here’s the difference in interpretation – the wrong one first – using 16 as a discussion value for the table_cached_blocks:

  • WRONG interpretation:  As you walk through index entries in order remember the last 16 rowids (that’s rowid for the rows in the table that the index is pointing to) you’ve seen. If the current rowid has a block id component that doesn’t match the block id from one of the remembered 16 rowids then increment the counter for the clustering_factor.
    • The simplicity of this algorithm means you can fix a “circular” array of 16 entries and keep walking around the circle overwriting the oldest entry each time you read a new one. It’s a pity that it’s the wrong idea because there’s a simple (though massively CPU-intensive) match_recognize() strategy for implementing it – and if you were using an internal library mechanism during a proper gather_index_stats() it could be incredibly efficient.
  • RIGHT interpretation: set up an array for 16 block ids, each with an associated “row-number”. Walk through the index in order – giving each entry a row-number as you go. Extract the block id from the current entry and search through the array for a matching block id. If you find a match then update its entry with the current row-number (so you can remember how recently you saw the block id); if you don’t find a match then replace the entry that has the smallest (i.e. greatest distance into the past) row-number with the current block id and row-number and increment the counter for the clustering_factor.
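
To make the difference concrete, here is a small sketch of my own (not taken from either piece of code mentioned in this note) that applies both interpretations to a hard-coded list of block ids. The block sequence and the cache size of 2 are invented purely to keep the example short enough to trace by hand.

```sql
set serveroutput on

declare
        type t_num_list is table of pls_integer;

        -- Invented sequence of table block ids, in index order
        l_blocks        t_num_list := t_num_list(1, 1, 1, 2, 2, 2, 1);
        c_cache_size    constant pls_integer := 2;      -- stands in for table_cached_blocks

        l_window        t_num_list := t_num_list();     -- WRONG: block ids of the last N entries
        l_lru_block     t_num_list := t_num_list();     -- RIGHT: the last N distinct block ids ...
        l_lru_hit       t_num_list := t_num_list();     -- ... and the row-number when each was last seen

        l_cf_entries    pls_integer := 0;
        l_cf_blocks     pls_integer := 0;
        l_found         boolean;
        l_oldest        pls_integer;
begin
        for i in 1 .. l_blocks.count loop

                -- WRONG interpretation: compare with the last N index entries
                l_found := false;
                for j in 1 .. l_window.count loop
                        if l_window(j) = l_blocks(i) then
                                l_found := true;
                        end if;
                end loop;
                if not l_found then
                        l_cf_entries := l_cf_entries + 1;
                end if;

                -- Remember the current entry, discarding the oldest one if full
                if l_window.count = c_cache_size then
                        for j in 1 .. c_cache_size - 1 loop
                                l_window(j) := l_window(j + 1);
                        end loop;
                        l_window(c_cache_size) := l_blocks(i);
                else
                        l_window.extend;
                        l_window(l_window.count) := l_blocks(i);
                end if;

                -- RIGHT interpretation: compare with the last N distinct block ids
                l_found  := false;
                l_oldest := 1;
                for j in 1 .. l_lru_block.count loop
                        if l_lru_block(j) = l_blocks(i) then
                                l_found := true;
                                l_lru_hit(j) := i;
                        end if;
                        if l_lru_hit(j) < l_lru_hit(l_oldest) then
                                l_oldest := j;
                        end if;
                end loop;
                if not l_found then
                        l_cf_blocks := l_cf_blocks + 1;
                        if l_lru_block.count < c_cache_size then
                                l_lru_block.extend;
                                l_lru_hit.extend;
                                l_oldest := l_lru_block.count;
                        end if;
                        -- Overwrite the least recently seen block id (or fill the new slot)
                        l_lru_block(l_oldest) := l_blocks(i);
                        l_lru_hit(l_oldest)   := i;
                end if;

        end loop;

        dbms_output.put_line('Last N entries  : ' || l_cf_entries);
        dbms_output.put_line('Last N block ids: ' || l_cf_blocks);
end;
/

-- Last N entries  : 3
-- Last N block ids: 2
```

The final entry is where the two diverge: block 1 has dropped out of the list of the last two entries, but it is still one of the last two distinct blocks seen.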

The first piece of code that Stew Ashton sent me was an anonymous PL/SQL block that included some hard-coded fragments and embedded SQL to use a test table and index that I had defined, but he then sent a second piece of code that creates a generic function that uses dynamic SQL to construct a query against a table and an index definition that you want to test. The latter is the code I’ve published (with permission) below:


create or replace function predict_clustering_factor(
/*
Function to predict the clustering factor of an index,
taking into account the intended value of
the TABLE_CACHED_BLOCKS parameter of DBMS_STATS.SET_TABLE_PREFS.

Input is the table name, the list of column names
and the intended value of TABLE_CACHED_BLOCKS.

The function collects the last N block ids (not the last N entries).
When there is no more room, it increments the clustering factor
and replaces the least recently used block id with the current one.

Note: here a "block id" is a rowid with the row_number portion set to 0.
It is effectively a "truncated" rowid.
*/
  p_table_name in varchar2,
  p_column_list in varchar2,
  p_table_cached_blocks in number
) return number authid current_user is

  rc sys_refcursor;
  type tt_rids is table of rowid;
  lt_rids tt_rids;
  
  type t_block_list is record(
    rid rowid,
    last_hit number
  );

  type tt_block_list is table of t_block_list;
  lt_block_list tt_block_list := new tt_block_list();

  l_rid rowid;
  l_clustering_factor number := 0;
  b_block_found boolean;
  l_rn number := 0;
  l_oldest_hit number;
  i_oldest_hit binary_integer := 0;
  
  function truncated_rid(p_rid in rowid) return rowid is
    rowid_type number;
    object_number NUMBER;
    relative_fno NUMBER;
    block_number NUMBER;
    row_number NUMBER;
    rid rowid;

  begin

    DBMS_ROWID.ROWID_INFO (
      p_rid,
      rowid_type,
      object_number,
      relative_fno,
      block_number,
      row_number
    );

    rid := DBMS_ROWID.ROWID_CREATE (
      rowid_type,
      object_number,
      relative_fno,
      block_number,
      0
    );

    return rid;

  end truncated_rid;
  
begin
  if p_table_cached_blocks != trunc(p_table_cached_blocks)
  or p_table_cached_blocks not between 1 and 255 then
    raise_application_error(
      -20001, 
      'input parameter p_table_cached_blocks must be an integer between 1 and 255'
    );
  end if;

  open rc for 'select rowid from '||p_table_name||' order by '||p_column_list||', rowid';
  loop
    fetch rc bulk collect into lt_rids limit 1000;

    for irid in 1..lt_rids.count loop
      l_rn := l_rn + 1;
      l_rid := truncated_rid(lt_rids(irid));
      b_block_found := false;
      l_oldest_hit := l_rn;

      if l_rn = 1 then
        l_clustering_factor := l_clustering_factor + 1;
        lt_block_list.extend;
        lt_block_list(1).rid := l_rid;
        lt_block_list(1).last_hit := l_rn;

      else

        for i in 1..lt_block_list.count loop
          if l_oldest_hit > lt_block_list(i).last_hit then
            l_oldest_hit := lt_block_list(i).last_hit;
            i_oldest_hit := i;
          end if;
          if lt_block_list(i).rid = l_rid then
            b_block_found := true;
            lt_block_list(i).last_hit := l_rn;
            exit;
          end if;
        end loop;

        if not b_block_found then
          l_clustering_factor := l_clustering_factor + 1;
          if lt_block_list.count < p_table_cached_blocks then
            lt_block_list.extend;
            lt_block_list(lt_block_list.count).rid := l_rid;
            lt_block_list(lt_block_list.count).last_hit := l_rn; 
          else         
            lt_block_list(i_oldest_hit).rid := l_rid;
            lt_block_list(i_oldest_hit).last_hit := l_rn;
          end if;
        end if;

      end if;

    end loop;
    exit when rc%notfound;
  end loop;

  close rc;
  return l_clustering_factor;

exception when others then
  if rc%isopen then
    close rc;
  end if;
  raise;

end predict_clustering_factor;
/

After executing the above to create the function, here’s an example of usage:

rem
rem     Script:         clustering_factor_est_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2019
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem

create table t1
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        cast(rownum as varchar2(10))            v1,
        trunc(dbms_random.value(0,10000))       rand,
        rpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
/

-- -------------------------------------------------------------------

SQL> execute dbms_output.put_line('Predicted cf for t1(rand, id): ' || predict_clustering_factor('t1','rand, id',16))
Predicted cf for t1(rand, id): 997218
Elapsed: 00:00:07.54

SQL> execute dbms_output.put_line('Predicted cf for t1(rand, id): ' || predict_clustering_factor('t1','rand, id',255))
Predicted cf for t1(rand, id): 985607
Elapsed: 00:00:50.61

You’ll notice that the larger the setting for the “table_cached_blocks” parameter the more time it takes to predict the clustering_factor – and it was all CPU time in my example. This isn’t surprising given the need to search through an array holding the previous history. In this example the table t1 holds 1,000,000 rows, and the number and scatter of distinct values is so arranged that the code will hardly ever find a cached block id – essentially it’s the sort of index that isn’t going to cause much confusion to the optimizer and isn’t likely to need special attention to make the optimizer use it when it should and ignore it when it’s inappropriate.

Finally a cut-n-paste to show the accuracy of the two predictions:

SQL> create index t1_i on t1(rand, id);
Elapsed: 00:00:02.96

SQL> execute dbms_stats.set_table_prefs(null,'t1','table_cached_blocks',16)
Elapsed: 00:00:00.01

SQL> execute dbms_stats.gather_index_stats(null,'t1_i')
Elapsed: 00:00:09.55

SQL> select clustering_factor from user_indexes where index_name = 'T1_I';

CLUSTERING_FACTOR
-----------------
           997218

Elapsed: 00:00:00.11

SQL> execute dbms_stats.set_table_prefs(null,'t1','table_cached_blocks',255)
Elapsed: 00:00:00.01

SQL> execute dbms_stats.gather_index_stats(null,'t1_i')
Elapsed: 00:00:07.80

SQL> select clustering_factor from user_indexes where index_name = 'T1_I';

CLUSTERING_FACTOR
-----------------
           985607

Elapsed: 00:00:00.00

Both match perfectly – but you might notice that creating the index and gathering the stats was much faster than predicting the clustering factor for the case where we set table_cached_blocks = 255.
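
When repeating a comparison like this it's easy to lose track of which preference is currently in force. As a hedged aside (these calls are not part of the original script), dbms_stats.get_prefs() reports the current setting and dbms_stats.delete_table_prefs() removes the table-level preference again:

```sql
select
        dbms_stats.get_prefs('TABLE_CACHED_BLOCKS', user, 'T1') table_cached_blocks
from
        dual
;

-- Remove the table-level preference once the test is complete
execute dbms_stats.delete_table_prefs(user, 't1', 'table_cached_blocks')
```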

(If you’re wondering, my “simple but irrelevant” match_recognize() query took 370 CPU seconds to complete for table_cached_blocks = 200 – and a limit on match_recognize() meant that 200 was the maximum value I was allowed to use – so now you know why I emailed Stew Ashton (and, just for lagniappe, he also told me about a simple workaround for the 200 limit).)

 

 

October 22, 2019

ANSI Plans

Filed under: CBO,Execution plans,Hints,Oracle — Jonathan Lewis @ 6:59 pm BST Oct 22,2019

Here’s a thought that falls somewhere between philosophical and pragmatic. It came up while I was playing around with a problem from the Oracle database forum that was asking about options for rewriting a query with a certain type of predicate. This note isn’t really about that question but the OP supplied a convenient script to demonstrate their requirement and I’ve hi-jacked most of the code for my own purposes so that I can ask the question:

Should the presence of an intermediate view name generated by the optimizer in the course of cost-based query transformation cause two plans, which are otherwise identical and do exactly the same thing, to have different plan hash values ?

To demonstrate the issue let’s start with a simple script to create some data and generate an execution plan.


rem
rem     Script:         or_expand_plans.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2019
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem
rem     Notes:
rem     Data creation copied from ODC
rem     https://community.oracle.com/thread/4297365
rem

create table t as   
select 1 as id, to_date('2019-10-11', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 2 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 3 as id, to_date('2019-10-15', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 4 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
;  

create table t_others as  
select 1 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'Blue'  as color, 'Zone 7' as zoneid from dual  
union all  
select 2 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'Red'   as color, 'Zone 7' as zoneid from dual  
union all  
select 3 as id, to_date('2019-10-16', 'YYYY-MM-DD') as lastupdated, 'White' as color, 'Zone 7' as zoneid from dual  
union all  
select 4 as id, to_date('2019-10-17', 'YYYY-MM-DD') as lastupdated, 'Green' as color, 'Zone 7' as zoneid from dual  
;  

create table t_further_info as  
select 1 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'A' as typeinfo from dual  
union all   
select 2 as id, to_date('2019-10-14', 'YYYY-MM-DD') as lastupdated, 'C' as typeinfo from dual  
union all  
select 3 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'D' as typeinfo from dual  
union all  
select 4 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'E' as typeinfo from dual  
;  

prompt  ====================
prompt  "Traditional" syntax
prompt  ====================

explain plan for
select 
        /*+ or_expand */ 
        * 
from
        t,
        t_others        pt,
        t_further_info  fi  
/*
where   (
             t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')   
         or pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
         or fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
        )  
*/
where   to_date('2019-10-21', 'YYYY-MM-DD') <= any(t.lastupdated, pt.lastupdated, fi.lastupdated)   
and     pt.id = t.id  
and     fi.id = t.id  
;

select * from table(dbms_xplan.display(null,null,'outline'));


You’ll see that I have a simple three-table join with the nasty little detail that I have a “non-join” predicate that may require Oracle to check across all three tables before it can decide whether or not a row should be discarded. I’ve shown two variants on a theme – they both have exactly the same effect but the ANY() presentation is just a little bit neater and more compact.

Essentially Oracle can use one of two strategies for this type of query/predicate; the first is to join all three tables and wait until the final join rowsource appears and then apply the check, or it can split the query into a union all of three separate queries where each query drives off a different table selecting only the rows from that table that match “its” part of the predicate.

In the latter case the second and third branches of the union all have to be modified to ensure that they discard any rows already returned by preceding parts of the union all; this can mean lots of new predicates appearing that use the lnnvl() function. (Of course there are variants between these two extremes, but the general principle is unchanged.)
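As a rough illustration of the second strategy, a hand-written equivalent of the three-part union all might look something like the sketch below. The column list and its aliases are assumptions made to keep the example short, and the optimizer's internal rewrite may differ in detail. The point to note is the way each later branch uses lnnvl() (which returns true when its condition is false or evaluates to null) to discard rows that an earlier branch has already returned.

```sql
-- Illustrative hand-written equivalent of a three-branch OR-expansion.
-- Column list and aliases are assumptions; this is not the optimizer's
-- literal internal rewrite.

-- Branch 1: drive off t
select  t.id, t.lastupdated t_updated, pt.lastupdated pt_updated, fi.lastupdated fi_updated
from    t, t_others pt, t_further_info fi
where   pt.id = t.id
and     fi.id = t.id
and     t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')
union all
-- Branch 2: drive off t_others, skipping rows already returned by branch 1
select  t.id, t.lastupdated, pt.lastupdated, fi.lastupdated
from    t, t_others pt, t_further_info fi
where   pt.id = t.id
and     fi.id = t.id
and     pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')
and     lnnvl(t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD'))
union all
-- Branch 3: drive off t_further_info, skipping rows returned by branches 1 and 2
select  t.id, t.lastupdated, pt.lastupdated, fi.lastupdated
from    t, t_others pt, t_further_info fi
where   pt.id = t.id
and     fi.id = t.id
and     fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')
and     lnnvl(t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD'))
and     lnnvl(pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD'))
;
```

Using lnnvl() rather than a simple "not (...)" matters when the date columns can be null: a row where the earlier predicate evaluates to null was not returned by the earlier branch, so the later branch must not discard it.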

The (incomplete) /*+ or_expand */ hint in my query is there to tell the optimizer that it should attempt to transform the query into the three-part union all. This, by the way, is a 12c feature, though older versions of Oracle could get similar effects in some cases from the /*+ use_concat */ hint. Here’s the plan, with outline, that I got from 12.2.0.1:


Plan hash value: 3181357500

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                    |     3 |   270 |    17   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_67EF6547    |     3 |   270 |    17   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                    |       |       |            |          |
|*  3 |    HASH JOIN            |                    |     1 |    55 |     6   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN|                    |     4 |   144 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T_FURTHER_INFO     |     1 |    13 |     2   (0)| 00:00:01 |
|   6 |      BUFFER SORT        |                    |     3 |    69 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL | T_OTHERS           |     3 |    69 |     2   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL   | T                  |     3 |    57 |     2   (0)| 00:00:01 |
|*  9 |    HASH JOIN            |                    |     2 |   182 |    11  (10)| 00:00:01 |
|  10 |     VIEW                | VW_JF_SET$FB5125FC |     2 |   156 |     9  (12)| 00:00:01 |
|  11 |      UNION-ALL          |                    |       |       |            |          |
|* 12 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 13 |        TABLE ACCESS FULL| T_OTHERS           |     1 |    23 |     2   (0)| 00:00:01 |
|* 14 |        TABLE ACCESS FULL| T                  |     3 |    57 |     2   (0)| 00:00:01 |
|* 15 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 16 |        TABLE ACCESS FULL| T                  |     1 |    19 |     2   (0)| 00:00:01 |
|  17 |        TABLE ACCESS FULL| T_OTHERS           |     4 |    92 |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS FULL   | T_FURTHER_INFO     |     4 |    52 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SEL$7C4216F7" "PT"@"SEL$1")
      LEADING(@"SEL$7C4216F7" "T"@"SEL$1" "PT"@"SEL$1")
      FULL(@"SEL$7C4216F7" "PT"@"SEL$1")
      FULL(@"SEL$7C4216F7" "T"@"SEL$1")
      USE_HASH(@"SEL$A4A33BE0" "T"@"SEL$1")
      LEADING(@"SEL$A4A33BE0" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SEL$A4A33BE0" "T"@"SEL$1")
      FULL(@"SEL$A4A33BE0" "PT"@"SEL$1")
      USE_HASH(@"SET$49E1C21B_3" "T"@"SEL$1")
      USE_MERGE_CARTESIAN(@"SET$49E1C21B_3" "PT"@"SEL$1")
      LEADING(@"SET$49E1C21B_3" "FI"@"SEL$1" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "T"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "PT"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "FI"@"SEL$1")
      USE_HASH(@"SEL$5FCD2D3C" "FI"@"SEL$1")
      LEADING(@"SEL$5FCD2D3C" "VW_JF_SET$FB5125FC"@"SEL$81DF0931" "FI"@"SEL$1")
      FULL(@"SEL$5FCD2D3C" "FI"@"SEL$1")
      NO_ACCESS(@"SEL$5FCD2D3C" "VW_JF_SET$FB5125FC"@"SEL$81DF0931")
      NO_ACCESS(@"SEL$67EF6547" "VW_ORE_67EF6547"@"SEL$67EF6547")
      OUTLINE(@"SET$49E1C21B_2")
      OUTLINE(@"SET$49E1C21B_1")
      OUTLINE(@"SEL$1")
      FACTORIZE_JOIN(@"SET$49E1C21B"("FI"@"SET$49E1C21B_2" "FI"@"SET$49E1C21B_1"))
      OUTLINE(@"SET$0E101D56")
      OUTLINE(@"SEL$81DF0931")
      OUTLINE(@"SEL$5AB42CD1")
      OR_EXPAND(@"SEL$1" (1) (2) (3))
      OUTLINE(@"SET$49E1C21B")
      OUTLINE_LEAF(@"SEL$67EF6547")
      FACTORIZE_JOIN(@"SET$49E1C21B"("FI"@"SET$49E1C21B_2" "FI"@"SET$49E1C21B_1"))
      OUTLINE_LEAF(@"SET$0E101D56")
      OUTLINE_LEAF(@"SEL$5FCD2D3C")
      OUTLINE_LEAF(@"SET$FB5125FC")
      OUTLINE_LEAF(@"SEL$A4A33BE0")
      OUTLINE_LEAF(@"SEL$7C4216F7")
      OUTLINE_LEAF(@"SET$49E1C21B_3")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
   5 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   7 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   8 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   9 - access("FI"."ID"="ITEM_1")
  12 - access("PT"."ID"="T"."ID")
  13 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  14 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
  15 - access("PT"."ID"="T"."ID")
  16 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


This is a wonderful demonstration of how brilliant the optimizer can be. The query has gone through several transformations and two of them have very high visibility. First, you can see the name VW_ORE_67EF6547 at operation 1. This is a view name that Oracle generates to express “(cost-based) OR-Expansion”, so the optimizer has clearly obeyed my hint. As a consequence of OR-expansion we can also see several examples of the lnnvl() function appearing in the Predicate Information section of the output; we can also see the hint re-appearing in the completed form of OR_EXPAND(@"SEL$1" (1) (2) (3)) in the Outline Data.

However, we don’t have the union all of three pieces that we might have expected; we have a union all of two pieces and the second piece is a hash join between the table t_further_info and a view called VW_JF_SET$FB5125FC. This view is the result of “join factorization”. The optimizer has taken the 2nd and 3rd sections of our union all view and decided that it would be cost-effective to “factor out” a common table, so this:

select from t_others, t, t_further_info ... where t_others.date_predicate ...
union all
select from t, t_others, t_further_info ... where t.date_predicate ...

changes to this:

select from 
        (
        select from t_others, t   ... where t_others.date_predicate ...
        union all
        select from t, t_others   ... where t.date_predicate ...
        ),
        t_further_info 
where   ...

Having said all that, I now have to change the code because the original query was written using “ANSI” style joins – like this:


explain plan for
select
        /*+ or_expand */
        * 
from
        t   
inner join 
        t_others       pt  
on      pt.id = t.id  
inner join 
        t_further_info fi  
on      fi.id = t.id  
where
        (
             t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')   
         or pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
         or fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
        )
;

select  * from table(dbms_xplan.display(null,null,'outline'));


In the ANSI example I happen to have used the explicit “OR” list for the date predicates but that’s not really significant. Here’s the plan produced by this query – and the first thing I’d like you to note is the Plan hash value:

Plan hash value: 3309788271

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                    |     3 |   270 |    17   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_31069B60    |     3 |   270 |    17   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                    |       |       |            |          |
|*  3 |    HASH JOIN            |                    |     1 |    55 |     6   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN|                    |     4 |   144 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T_FURTHER_INFO     |     1 |    13 |     2   (0)| 00:00:01 |
|   6 |      BUFFER SORT        |                    |     3 |    69 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL | T_OTHERS           |     3 |    69 |     2   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL   | T                  |     3 |    57 |     2   (0)| 00:00:01 |
|*  9 |    HASH JOIN            |                    |     2 |   182 |    11  (10)| 00:00:01 |
|  10 |     VIEW                | VW_JF_SET$997549B1 |     2 |   156 |     9  (12)| 00:00:01 |
|  11 |      UNION-ALL          |                    |       |       |            |          |
|* 12 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 13 |        TABLE ACCESS FULL| T_OTHERS           |     1 |    23 |     2   (0)| 00:00:01 |
|* 14 |        TABLE ACCESS FULL| T                  |     3 |    57 |     2   (0)| 00:00:01 |
|* 15 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 16 |        TABLE ACCESS FULL| T                  |     1 |    19 |     2   (0)| 00:00:01 |
|  17 |        TABLE ACCESS FULL| T_OTHERS           |     4 |    92 |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS FULL   | T_FURTHER_INFO     |     4 |    52 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SEL$D12FC97A" "PT"@"SEL$1")
      LEADING(@"SEL$D12FC97A" "T"@"SEL$1" "PT"@"SEL$1")
      FULL(@"SEL$D12FC97A" "PT"@"SEL$1")
      FULL(@"SEL$D12FC97A" "T"@"SEL$1")
      USE_HASH(@"SEL$09C9729D" "T"@"SEL$1")
      LEADING(@"SEL$09C9729D" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SEL$09C9729D" "T"@"SEL$1")
      FULL(@"SEL$09C9729D" "PT"@"SEL$1")
      USE_HASH(@"SET$E8D85587_3" "T"@"SEL$1")
      USE_MERGE_CARTESIAN(@"SET$E8D85587_3" "PT"@"SEL$1")
      LEADING(@"SET$E8D85587_3" "FI"@"SEL$2" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SET$E8D85587_3" "T"@"SEL$1")
      FULL(@"SET$E8D85587_3" "PT"@"SEL$1")
      FULL(@"SET$E8D85587_3" "FI"@"SEL$2")
      USE_HASH(@"SEL$95B99BAF" "FI"@"SEL$2")
      LEADING(@"SEL$95B99BAF" "VW_JF_SET$997549B1"@"SEL$BB7F1ECF" "FI"@"SEL$2")
      FULL(@"SEL$95B99BAF" "FI"@"SEL$2")
      NO_ACCESS(@"SEL$95B99BAF" "VW_JF_SET$997549B1"@"SEL$BB7F1ECF")
      NO_ACCESS(@"SEL$31069B60" "VW_ORE_31069B60"@"SEL$31069B60")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      MERGE(@"SEL$1" >"SEL$2")
      OUTLINE(@"SEL$58A6D7F6")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SET$E8D85587_2")
      OUTLINE(@"SET$E8D85587_1")
      MERGE(@"SEL$58A6D7F6" >"SEL$3")
      OUTLINE(@"SEL$9E43CB6E")
      FACTORIZE_JOIN(@"SET$E8D85587"("FI"@"SET$E8D85587_2" "FI"@"SET$E8D85587_1"))
      OUTLINE(@"SET$6117B24C")
      OUTLINE(@"SEL$BB7F1ECF")
      OUTLINE(@"SEL$344003E3")
      OR_EXPAND(@"SEL$9E43CB6E" (1) (2) (3))
      OUTLINE(@"SET$E8D85587")
      OUTLINE_LEAF(@"SEL$31069B60")
      FACTORIZE_JOIN(@"SET$E8D85587"("FI"@"SET$E8D85587_2" "FI"@"SET$E8D85587_1"))
      OUTLINE_LEAF(@"SET$6117B24C")
      OUTLINE_LEAF(@"SEL$95B99BAF")
      OUTLINE_LEAF(@"SET$997549B1")
      OUTLINE_LEAF(@"SEL$09C9729D")
      OUTLINE_LEAF(@"SEL$D12FC97A")
      OUTLINE_LEAF(@"SET$E8D85587_3")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("FI"."ID"="T"."ID" AND "PT"."ID"="T"."ID")
   5 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   7 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   8 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   9 - access("FI"."ID"="ITEM_1")
  12 - access("PT"."ID"="T"."ID")
  13 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  14 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
  15 - access("PT"."ID"="T"."ID")
  16 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


Is the plan for the “ANSI” version of the query the same as the plan for the “traditional” version? How carefully have you checked – after simply noting that the two Plan hash values were different?

The plans are the same – in that the mechanics are exactly the same and that ought to be the criterion on which we should judge them. But the hash values are different because of the change from traditional to ANSI syntax. The traditional form of the query starts life with a single query block while the ANSI form is considered to be two query blocks, so the initial ANSI query goes through three stages:


1) select from t1 join t2 join t3

2) select /*+ qb_name(sel$2) */ from (select /*+ qb_name(sel$1) */ from t1, t2), t3

3) select /*+ qb_name(sel$9E43CB6E) */ from t1, t2, t3

So the query is rapidly transformed to the equivalent traditional syntax but we now have a query block name of SEL$9E43CB6E instead of the SEL$1 that the traditional query (in the absence of a /*+ qb_name() */ hint) would have had. This is why you see the difference in the two or_expand() hints in the Outline Data section. One reads: OR_EXPAND(@"SEL$1" (1) (2) (3)), the other reads OR_EXPAND(@"SEL$9E43CB6E" (1) (2) (3)), and all the subsequent query block name differences follow on from this initial transformation. (Note: the value “sel$9e43cb6e” is derived from the names of the input query blocks, sel$1 and sel$2, that were merged to produce the new query block.)
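If you want to see where the generated query block names end up in a plan, one option is to add the alias keyword to the dbms_xplan format string, which adds a “Query Block Name / Object Alias” section to the report. The sketch below simply re-runs the ANSI form of the query with that extra keyword; treat it as a suggestion for investigation rather than part of the original test script.

```sql
explain plan for
select
        /*+ or_expand */
        *
from
        t
inner join
        t_others       pt
on      pt.id = t.id
inner join
        t_further_info fi
on      fi.id = t.id
where
        (
             t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')
         or pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')
         or fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')
        )
;

-- 'alias' reports the query block name and object alias for each operation,
-- 'outline' reports the Outline Data with the generated query block names.
select * from table(dbms_xplan.display(null, null, 'outline alias'));
```

Comparing the alias section for the traditional and ANSI versions side by side makes it easier to confirm that only the generated names differ.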

You may decide that this mismatch isn’t really terribly important. If you’ve modified the code to switch to ANSI style joins then you may be prepared to put in a little extra effort to check the plan in detail to see that it hasn’t changed; but it would be a lot nicer if the hash value wasn’t dependent on generated view names. You may recall that at one time Oracle had problems with plans that used materialized CTEs (“with” subqueries) because the plan hash value was affected by object names like sys_temp_0fd9d6791_dfc12da. The same principle ought, I think, to apply here.

If you don’t mind the ANSI/traditional switch though, you might find that you’re less happy when you upgrade to 19c, because the same effect appears there too, only it’s worse. Not only do “identical” traditional and ANSI plans have different hash values, they don’t match the values from 12c because the generated names for the join factorization views (VW_JF) change in the upgrade. So if you’re depending on SQL Plan Baselines to reproduce 12c plans on 19c when you upgrade you may find cases where you know the stored baseline is giving you the same plan but Oracle thinks it isn’t and refuses to use it.

tl;dr

Plans which are functionally identical can have different plan hash values because the plans were reached through a different series of transformations. In particular if you rewrite queries from “traditional” Oracle syntax to “ANSI” syntax you will find cases where the plan doesn’t change but the plan hash value does, thanks to a change in the names of views generated by some transformations.

More significantly, if you upgrade from 12c to 19c there are cases where the names of views generated by transformations may change, which could cause the optimizer to discard some of your carefully constructed SQL Plan Baselines as “not reproducible”.
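A minimal sketch of one way to look for affected baselines after an upgrade, assuming you have the privileges to query dba_sql_plan_baselines: the reproduced column is set to NO when the optimizer has tried and failed to reproduce the stored plan. The choice of columns and the ordering are purely illustrative.

```sql
-- List baselines that the optimizer has been unable to reproduce.
-- Requires access to dba_sql_plan_baselines; the column selection is illustrative.
select
        sql_handle,
        plan_name,
        enabled,
        accepted,
        reproduced,
        last_executed
from
        dba_sql_plan_baselines
where
        reproduced = 'NO'
order by
        sql_handle,
        plan_name
;
```

A baseline reported here is only a candidate for investigation; you would still need to compare the stored plan with the current plan to see whether the difference is just a generated view name.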

Footnote

For reference, if I add the hint /*+ opt_param('_optimizer_join_factorization' 'false') */ to the “traditional” query then I get the following plan, which shows more clearly the three branches that the original query has been split into – each section starting with a different choice for the driving table:


-------------------------------------------------------------------------------------------
| Id  | Operation               | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                 |     3 |   270 |    19   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_67EF6547 |     3 |   270 |    19   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                 |       |       |            |          |
|*  3 |    HASH JOIN            |                 |     1 |    55 |     7  (15)| 00:00:01 |
|*  4 |     HASH JOIN           |                 |     1 |    32 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T               |     1 |    19 |     2   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL  | T_FURTHER_INFO  |     4 |    52 |     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL   | T_OTHERS        |     4 |    92 |     2   (0)| 00:00:01 |
|*  8 |    HASH JOIN            |                 |     1 |    55 |     6   (0)| 00:00:01 |
|   9 |     MERGE JOIN CARTESIAN|                 |     4 |   144 |     4   (0)| 00:00:01 |
|* 10 |      TABLE ACCESS FULL  | T_OTHERS        |     1 |    23 |     2   (0)| 00:00:01 |
|  11 |      BUFFER SORT        |                 |     4 |    52 |     2   (0)| 00:00:01 |
|  12 |       TABLE ACCESS FULL | T_FURTHER_INFO  |     4 |    52 |     2   (0)| 00:00:01 |
|* 13 |     TABLE ACCESS FULL   | T               |     3 |    57 |     2   (0)| 00:00:01 |
|* 14 |    HASH JOIN            |                 |     1 |    55 |     6   (0)| 00:00:01 |
|  15 |     MERGE JOIN CARTESIAN|                 |     4 |   144 |     4   (0)| 00:00:01 |
|* 16 |      TABLE ACCESS FULL  | T_FURTHER_INFO  |     1 |    13 |     2   (0)| 00:00:01 |
|  17 |      BUFFER SORT        |                 |     3 |    69 |     2   (0)| 00:00:01 |
|* 18 |       TABLE ACCESS FULL | T_OTHERS        |     3 |    69 |     2   (0)| 00:00:01 |
|* 19 |     TABLE ACCESS FULL   | T               |     3 |    57 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SET$BB614FD2_3" "T"@"SET$BB614FD2_3")
      USE_MERGE_CARTESIAN(@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3")
      LEADING(@"SET$BB614FD2_3" "FI"@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3"
              "T"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "T"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "FI"@"SET$BB614FD2_3")
      USE_HASH(@"SET$BB614FD2_2" "T"@"SET$BB614FD2_2")
      USE_MERGE_CARTESIAN(@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2")
      LEADING(@"SET$BB614FD2_2" "PT"@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2"
              "T"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "T"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "PT"@"SET$BB614FD2_2")
      USE_HASH(@"SET$BB614FD2_1" "PT"@"SET$BB614FD2_1")
      USE_HASH(@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1")
      LEADING(@"SET$BB614FD2_1" "T"@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1"
              "PT"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "PT"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "T"@"SET$BB614FD2_1")
      NO_ACCESS(@"SEL$49E1C21B" "VW_ORE_67EF6547"@"SEL$67EF6547")
      OUTLINE(@"SEL$1")
      OR_EXPAND(@"SEL$1" (1) (2) (3))
      OUTLINE_LEAF(@"SEL$49E1C21B")
      OUTLINE_LEAF(@"SET$BB614FD2")
      OUTLINE_LEAF(@"SET$BB614FD2_1")
      OUTLINE_LEAF(@"SET$BB614FD2_2")
      OUTLINE_LEAF(@"SET$BB614FD2_3")
      ALL_ROWS
      OPT_PARAM('_optimizer_join_factorization' 'false')
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PT"."ID"="T"."ID")
   4 - access("FI"."ID"="T"."ID")
   5 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   8 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
  10 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  13 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
  14 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
  16 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  18 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
  19 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))

88 rows selected.

Although the “traditional” and “ANSI” plans still show a difference in their Plan hash values when join factorization is blocked, the absence of the join factorization view means that the plan hash values are now consistent between 12c and 19c (the output above came from 19.3.0.0 as you can see in the Outline information).
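For anyone who wants to repeat the footnote test, the hinted statement would look something like the following. This is a reconstruction from the description above: the use of the ANY() form of the predicate is an assumption, and the explicit “OR” list should behave the same way.

```sql
explain plan for
select
        /*+
                or_expand
                opt_param('_optimizer_join_factorization' 'false')
        */
        *
from
        t,
        t_others        pt,
        t_further_info  fi
where   to_date('2019-10-21', 'YYYY-MM-DD') <= any(t.lastupdated, pt.lastupdated, fi.lastupdated)
and     pt.id = t.id
and     fi.id = t.id
;

select * from table(dbms_xplan.display(null,null,'outline'));
```

Because the parameter is set through opt_param() it affects only this statement, so join factorization remains available to every other query in the session.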

October 18, 2019

CBO Oddities – 1

Filed under: CBO,Oracle — Jonathan Lewis @ 6:10 pm BST Oct 18,2019

I’ve decided to do a little rewriting and collating so that I can catalogue related ideas in an order that makes for a better narrative. So this is the first in a series of notes designed to help you understand why the optimizer has made a particular choice and why that choice is (from your perspective) a bad one, and what you can do either to help the optimizer find a better plan, or subvert the optimizer and force a better plan.

If you’re wondering why I choose to differentiate between “help the optimizer” and “subvert the optimizer” consider the following examples.

  • A query is joining two tables in the wrong order with a hash join when you know that a nested loop join in the opposite order would be far better because you know that the data you want is very nicely clustered and there’s a really good index that would make access to that data very efficient. You check the table preferences and discover that the table_cached_blocks preference (see end notes) is at its default value of 1, so you set it to 16 and gather fresh stats on the indexes on the table. Oracle now recognises the effectiveness of this index and changes plan accordingly.
  • The optimizer has done a surprising transformation of a query, aggregating a table before joining to a couple of other tables when you were expecting it to use the joins to eliminate a huge fraction of the data before aggregating it.  After a little investigation you find that setting hidden parameter _optimizer_distinct_placement to false stops this happening.

You may find the distinction unnecessarily fussy, but I’d call the first example “helping the optimizer” – it gives the optimizer some truthful information about your data that is potentially going to result in better decisions in many different statements – and the second example “subverting the optimizer” – you’ve brute-forced it into not taking a path you didn’t like but at the same time you may have stopped that feature from appearing in other ways or in other queries. Of course, you might have minimised the impact of setting the parameter by using the opt_param() hint to apply the restriction to just this one query; nevertheless it’s possible that there is a better plan for the query that would have used the feature at some other point in the query if you’d managed to do something to help the optimizer rather than constraining it.
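The two approaches might be sketched as follows. All the object names are placeholders: T1 and T1_I stand for your own table and index, and the SALES and PRODUCTS tables (with their columns) are hypothetical, introduced only to give the hint a statement to sit in.

```sql
-- "Helping": give the optimizer truthful information about data clustering.
-- T1 and T1_I are placeholder names for your own table and index.
execute dbms_stats.set_table_prefs(null, 't1', 'table_cached_blocks', 16)
execute dbms_stats.gather_index_stats(null, 't1_i')

-- Check whether the clustering_factor has dropped as a result.
select clustering_factor from user_indexes where index_name = 'T1_I';

-- "Subverting", but limited to one statement through opt_param().
-- SALES, PRODUCTS and their columns are hypothetical.
select
        /*+ opt_param('_optimizer_distinct_placement' 'false') */
        distinct p.category, s.customer_id
from
        sales s, products p
where
        p.product_id = s.product_id
;
```

The first change is persistent and influences every statement that uses the index; the second is visible in the SQL text and affects nothing else, which is usually the safer way to apply a hidden parameter.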

What’s up with the Optimizer

It’s likely that most of the articles will be based around interpreting execution plans since those are the things that tell us what the optimizer thinks will happen when it executes a statement, and within execution plans there are three critical aspects to consider –

  1. the numbers (most particularly Cost and Rows),
  2. the shape of the plan,
  3. the Predicate Information.

I want to use this note to make a couple of points about just the first of the three.

  • First – the estimates on any one line of an execution plan are “per start” of the line; some lines of an execution plan will be called many times in the course of a statement. In many cases the Rows estimate from one line of a plan will dictate the number of times that some other line of the plan will be executed – so a bad estimate of “how much data” can double up as a bad estimate of “how many times”, leading to a plan that looks efficient on paper but does far too much work at run-time. A line in a plan that looks a little inefficient may be fine if it executes only once; a line that looks very efficient may be a disaster if it executes a million times. Being able to read a plan and spot the places where the optimizer has produced a poor estimate of Rows is a critical skill – and there are many reasons why the optimizer produces poor estimates. Being able to spot poor estimates depends fairly heavily on knowing the data, but if you know the generic reasons for the optimizer producing poor estimates you’ve got a head start for recognising and addressing the errors when they appear.
  • Second – Cost is synonymous with Time. For a given instance at a given moment there is a simple, linear, relationship between the figure that the optimizer reports for the Cost of a statement (or subsection of a statement) and the Time that the optimizer reports. For many systems (those that have not run the calibrate_io procedure) the Time is simply the Cost multiplied by the time the optimizer thinks it will take to satisfy a single block read request, and the Cost is the optimizer’s estimate of the I/O requirement to satisfy the statement – with a fudge factor introduced to recognise the fact that a “single block” read request ought to complete in less time than a “multiblock” read request. Generally speaking the optimizer will consider many possible plans for a statement and pick the plan with the lowest estimated cost – but there is at least one exception to this rule, and it is an unfortunate weakness in the optimizer that there are many valid reasons why its estimates of Cost/Time are poor. Of course, you will note that the values that Oracle reports for the Time column are only accurate to the second – which isn’t particularly helpful when a single block read typically operates in the range of a few milliseconds.
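If you want to see the figures the optimizer has available for converting between Cost and Time on your own system, the system statistics are visible in sys.aux_stats$ (assuming you have the privileges to query it). This is only a starting point for investigation: exactly how the values are combined depends on the version and on whether workload statistics or calibrate_io results are present.

```sql
-- System statistics used by the optimizer for its I/O and CPU arithmetic.
-- Requires select privilege on sys.aux_stats$.
select
        pname, pval1
from
        sys.aux_stats$
where
        sname = 'SYSSTATS_MAIN'
order by
        pname
;
```

On a system that has never gathered workload statistics you would typically expect values for CPUSPEEDNW, IOSEEKTIM and IOTFRSPEED, with the workload figures such as SREADTIM and MREADTIM left null.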

To a large degree the optimizer’s task boils down to:

  • What’s the volume and scatter of the data I need
  • What access paths, with what wastage, are available to get to that data
  • How much time will I spend on I/O reading (and possibly discarding) data to extract the bit I want

Of course there are other considerations like the amount of CPU needed for a sort, the potential for I/O as sorts or hash joins spill to disk, the time to handle a round-trip to a remote system, and RAC variations on the basic theme. But for many statements the driving issue is that any bad estimates of “how much data” and “how much (real) I/O” will lead to bad, potentially catastrophic, choices of execution plan. In the next article I’ll list all the different reasons (that I can think of at the time) why the optimizer can produce bad estimates of volume and time.

References for Cost vs. Time

References for table_cached_blocks:

 

October 9, 2019

Cursor_sharing

Filed under: CBO,Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 4:58 pm BST Oct 9,2019

Here’s a funny little detail that I don’t think I’ve noticed before – needing only a simple demo script:


rem
rem     Script:         cursor_sharing_oddity.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2019
rem
rem     Last tested 
rem             12.2.0.1
rem

create table t1 as
select  * 
from    all_objects 
;

set serveroutput off
alter system flush shared_pool;
alter session set cursor_sharing = force;

select  *
from    t1
where
        created between date'2019-06-01' and date'2019-06-30'
;

select * from table(dbms_xplan.display_cursor);

Given that I’ve set cursor_sharing to FORCE (and flushed the shared pool just in case), what SQL do you expect to see if I pull the plan from memory, and what sort of thing do you expect to see in the Predicate Information? Probably some references to system-constructed bind variables like :"SYS_B_0". This is what I got on 12.2.0.1:


SQL_ID  9qwa9gpg9rmjv, child number 0
-------------------------------------
select * from t1 where  created between date:"SYS_B_0" and
date:"SYS_B_1"

Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   170 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |  1906 |   251K|   170   (8)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("CREATED">=TO_DATE(' 2019-06-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "CREATED"<=TO_DATE(' 2019-06-30 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))


Somehow I’ve got system-generated bind variables in the SQL (and v$sql – when I checked), but the original literal values are still present (in a different form) in the predicate information. Then, when I re-ran the query changing 1st June to 15th June I got the same SQL_ID (and generated bind variables) but with child number 1 and suitably modified filter predicates.
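One way to confirm the extra child cursor is to list the children for the SQL_ID reported above. This is a minimal sketch, assuming you have access to v$sql; the choice of columns is mine and is not taken from the original test script.

```sql
select
        sql_id, child_number, plan_hash_value, executions
from
        v$sql
where
        sql_id = '9qwa9gpg9rmjv'
order by
        child_number
;
```

After running the query with each pair of dates you would expect to see one row per distinct pair of literals, all with the same sql_id.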

Of course, just for completeness, if I write the query using the “old-fashioned” to_date() approach I end up with a single statement with lots of system-generated bind variables that are consistent between the SQL and the Predicate Information.
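The to_date() version of the statement isn't reproduced in the note. Something of the following shape would be consistent with the four bind variables in the output; the date strings and the format mask are assumptions, and any valid pair would do.

```sql
select  *
from    t1
where
        created between to_date('01-Jun-2019','dd-mon-yyyy')
                    and to_date('30-Jun-2019','dd-mon-yyyy')
;

select * from table(dbms_xplan.display_cursor);
```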

SQL_ID  10sfymvwv00qx, child number 0
-------------------------------------
select * from t1 where  created between to_date(:"SYS_B_0",:"SYS_B_1")
                   and to_date(:"SYS_B_2",:"SYS_B_3")

Plan hash value: 3332582666

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   189 (100)|          |
|*  1 |  FILTER            |      |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  1029 |   135K|   189  (17)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(TO_DATE(:SYS_B_2,:SYS_B_3)>=TO_DATE(:SYS_B_0,:SYS_B_1))
   2 - filter(("CREATED">=TO_DATE(:SYS_B_0,:SYS_B_1) AND
              "CREATED"<=TO_DATE(:SYS_B_2,:SYS_B_3)))

If you are planning to do anything with cursor_sharing, watch out for the side effects of the “ANSI” date and time operators.

Update (3 hours later)

It turns out that I have come across this before – and written about it because it appeared as a performance problem to someone on the Oracle Developer Community forum.

The same behaviour is shown by timestamp literals and interval literals. For details on the two types of literal here’s a link to the 12.2 SQL Language Reference Manual section on literals.
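For anyone who wants to repeat the test with the other types of literal, here is a sketch of a query that uses both a timestamp literal and an interval literal against the same t1 table. The particular values are arbitrary, and this exact statement is an illustration rather than something taken from the original tests, so check the Predicate Information on your own version.

```sql
alter session set cursor_sharing = force;

select  *
from    t1
where
        created between timestamp'2019-06-01 00:00:00'
                    and timestamp'2019-06-01 00:00:00' + interval '29' day
;

select * from table(dbms_xplan.display_cursor);
```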

October 4, 2019

opt_estimate catalogue

Filed under: CBO,Hints,Oracle — Jonathan Lewis @ 10:10 am BST Oct 4,2019

This is just a list of the notes I’ve written about the opt_estimate() hint.

  • opt_estimate – using the hint to affect index calculations: index_scan and index_filter
  • opt_estimate 2 – applying the hint to nested loop joins, options: nlj_index_scan and nlj_index_filter
  • opt_estimate 3 – a couple of little-known options for the hint, “group_by” and “having”.
  • opt_estimate 4 – applying the hint at the query block level: particularly useful for CTEs (“with subquery”) and non-mergeable views.
  • opt_estimate 5 – a story of failure: trying to use opt_estimate to push predicates into a union all view.

I have a couple more drafts on the topic awaiting completion, but if you know of any other articles that would be a good addition to the list feel free to reference them in the comments.

 

August 26, 2019

Troubleshooting

Filed under: CBO,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 12:19 pm BST Aug 26,2019

A recent thread on the Oracle Developer Community starts with the statement that a query is taking a very long time (with the question “how do I make it go faster?” implied rather than asked). It’s 12.1.0.2 (not that that’s particularly relevant to this blog note), and we have been given a number that quantifies “very long time” (again not particularly relevant to this blog note – but worth mentioning because your “slow” might be my “wow! that was fast” and far too many people use qualitative adjectives when the important detail is quantitative). The query had already been running for 15 hours – and here it is:


SELECT 
        OWNER, TABLE_NAME 
FROM
        DBA_LOGSTDBY_NOT_UNIQUE 
WHERE
        (OWNER, TABLE_NAME) NOT IN (
                SELECT 
                        DISTINCT OWNER, TABLE_NAME 
                        FROM     DBA_LOGSTDBY_UNSUPPORTED
        ) 
AND     BAD_COLUMN = 'Y'

There are many obvious suggestions anyone could make for things to do to investigate the problem – start with the execution plan, check whether the object statistics are reasonably representative, run a trace with wait state tracing enabled to see where the time goes; but sometimes there are a couple of very simple observations you can make that point you to simple solutions.

Looking at this query we can recognise that it’s (almost certainly) about a couple of Oracle data dictionary views (which means it’s probably very messy under the covers with a horrendous execution plan) and, as I’ve commented from time to time in the past, Oracle Corp. developers create views for their own purposes so you should take great care when you re-purpose them. This query also has the very convenient feature that it looks like two simpler queries stitched together – so a very simple step in trouble-shooting, before going into any fine detail, is to unstitch the query and run the two parts separately to see how much data they return and how long they take to complete:


SELECT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_NOT_UNIQUE WHERE BAD_COLUMN = 'Y'

SELECT DISTINCT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_UNSUPPORTED

It’s quite possible that the worst case scenario for the total run time of the original query could be reduced to the sum of the run time of these two queries. One strategy to achieve this would be a rewrite of the form:

select  * 
from    (
        SELECT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_NOT_UNIQUE WHERE BAD_COLUMN = 'Y'
        minus
        SELECT DISTINCT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_UNSUPPORTED
)

Unfortunately the immediately obvious alternative may be illegal thanks to things like duplicates (which disappear in MINUS operations) or NULLs (which can make ALL the data “disappear” in some cases). In this case the original query might be capable of returning duplicates of (owner, table_name) from dba_logstdby_not_unique which would collapse to a single occurrence each in my rewrite – so my version of the query is not logically equivalent (unless the definition of the view enforces uniqueness); on the other hand, tracking back through the original thread to the MoS article where this query comes from, we can see that even if the query could return duplicates we don’t actually need to see them.
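The effect of duplicates and NULLs is easy to see with a tiny made-up data set. In this sketch the names driver and exclude, and the rows in them, are invented for the demonstration and have nothing to do with the contents of the real dictionary views.

```sql
rem     NOT IN version: the (null, null) row in "exclude" means the
rem     NOT IN condition can never evaluate to TRUE, so no rows come back.

with
driver as (
        select 'A' owner, 'T1' table_name from dual union all
        select 'A',       'T1'            from dual union all   -- deliberate duplicate
        select 'B',       'T2'            from dual
),
exclude as (
        select 'B' owner, 'T2' table_name from dual union all
        select null,      null            from dual             -- deliberate nulls
)
select  owner, table_name
from    driver
where   (owner, table_name) not in (select owner, table_name from exclude)
;

rem     MINUS version: nulls do no damage, but the duplicate (A, T1)
rem     collapses to a single row.

with
driver as (
        select 'A' owner, 'T1' table_name from dual union all
        select 'A',       'T1'            from dual union all
        select 'B',       'T2'            from dual
),
exclude as (
        select 'B' owner, 'T2' table_name from dual union all
        select null,      null            from dual
)
select owner, table_name from driver
minus
select owner, table_name from exclude
;
```

Remove the (null, null) row from exclude and the NOT IN version should return (A, T1) twice where the MINUS version returns it once, which is the duplicates half of the problem.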

And this is the point of the blog note – it’s a general principle (that happens to be a very obvious strategy in this case): if a query takes too long, how does it compare with a simplified version of the query that might be a couple of steps short of the final target? If it’s easy to spot the options for simplification, and if the simplified version operates efficiently, then isolate it (using a no_merge hint if necessary), and work forwards from there. Just be careful that your rewrite remains logically equivalent to the original (if it really needs to).
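As a sketch of what “isolate it” could look like for this query, each of the two simple parts can be wrapped in an inline view carrying a no_merge hint, with the original NOT IN logic rebuilt on top. The aliases nu and us are arbitrary, and this is one possible arrangement under the assumption that the optimizer honours the hints, not a tested fix for the original poster's system.

```sql
select
        nu.owner, nu.table_name
from    (
        select  /*+ no_merge */
                owner, table_name
        from    dba_logstdby_not_unique
        where   bad_column = 'Y'
        )       nu
where
        (nu.owner, nu.table_name) not in (
                select  us.owner, us.table_name
                from    (
                        select  /*+ no_merge */
                                distinct owner, table_name
                        from    dba_logstdby_unsupported
                        )       us
        )
;
```

Unlike the MINUS rewrite this keeps any duplicates from the first view, so it stays closer to the original query while still giving the optimizer two pieces that it should handle separately. The execution plan is still the thing to check.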

In the case of this query, the two parts took 5 seconds and 9 seconds to complete, returning 209 rows and 815 rows respectively. Combining the two queries with a minus really should get the required result in no more than 14 seconds.

Footnote

The “distinct” in the second query is technically redundant as the minus operation applies a sort unique operation to both of the intermediate result sets before comparing them. Similarly the “distinct” was also redundant when the second query was used for the “in subquery” construction – again there would be an implied uniqueness operation if the optimizer decided to do a simple unnest of the subquery.

 

 

 

 
