Oracle Scratchpad

March 21, 2019

Lost time

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 12:51 pm GMT Mar 21,2019

Here’s a little puzzle that came up in the ODC database forum yesterday – I’ve got a query that has been captured by SQL Monitor, and it’s taking much longer to run than it should, but the monitoring report isn’t telling me what I need to know about the time.

Here’s a little model to demonstrate the problem – I’m going to join a table to itself (the self join isn’t a necessary feature of the demonstration; I’ve just been a bit lazy in preparing data). Here’s a (completely truthful) description of the table:

SQL> desc t1
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ID                                     NUMBER
 MOD_1000                               NUMBER
 V1                                     VARCHAR2(40)
 V2                                     VARCHAR2(40)
 PADDING                                VARCHAR2(100)

SQL> select num_rows, blocks from user_tables where table_name = 'T1';

  NUM_ROWS     BLOCKS
---------- ----------
    400000       7798

1 row selected.


And here’s the output I get from dbms_monitor.report_sql_monitor() for the query running on 18.3.0.0:


Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  TEST_USER (261:36685)
 SQL ID              :  g6j671u7zc9mn
 SQL Execution ID    :  16777218
 Execution Started   :  03/21/2019 08:54:56
 First Refresh Time  :  03/21/2019 08:54:56
 Last Refresh Time   :  03/21/2019 08:55:17
 Duration            :  21s
 Module/Action       :  MyModule/MyAction
 Service             :  SYS$USERS
 Program             :  sqlplus@linux183.localdomain (TNS V1-V3)
 Fetch Calls         :  2

Global Stats
==========================================================================
| Elapsed |   Cpu   |    IO    | PL/SQL  | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Time(s) | Calls |  Gets  | Reqs | Bytes |
==========================================================================
|      21 |      21 |     0.04 |    0.07 |     2 |     3M |  215 | 180MB |
==========================================================================

SQL Plan Monitoring Details (Plan Hash Value=83896840)
==================================================================================================================================================
| Id |       Operation       | Name |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Mem  | Activity | Activity Detail |
|    |                       |      | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | (Max) |   (%)    |   (# samples)   |
==================================================================================================================================================
|  0 | SELECT STATEMENT      |      |         |      |        21 |     +1 |     1 |        1 |      |       |     . |          |                 |
|  1 |   FILTER              |      |         |      |        21 |     +1 |     1 |        1 |      |       |     . |          |                 |
|  2 |    HASH JOIN OUTER    |      |     400 | 2014 |        21 |     +1 |     1 |      400 |      |       |   2MB |          |                 |
|  3 |     TABLE ACCESS FULL | T1   |     400 | 1005 |         1 |     +1 |     1 |      400 |   62 |  60MB |     . |          |                 |
|  4 |     TABLE ACCESS FULL | T1   |    400K | 1001 |        21 |     +1 |     1 |     400K |      |       |     . |          |                 |
==================================================================================================================================================

As you can see, the total elapsed time is 21 seconds, of which the CPU time accounts for the whole 21 seconds.

This seems a little large for a simple hash join, so we should look at the “Activity Detail” section of the plan because that will report any ASH samples that can be found for this execution of this query. And that’s where the problem lies: despite using 21 CPU seconds in 21 seconds there are no ASH samples for the execution! Moreover – looking for other numerical oddities – the plan says we read 60MB in 62 read requests (that’s the first tablescan of the 7,798 blocks of t1), but the summary says we read 180MB – where did we lose (or gain) 120MB ?

It might help to see the query (which is why I didn’t show it in the output above) and it might help to see the predicate section (which SQL Monitor doesn’t report). So here’s the query and its plan, with the resulting predicate section, pulled from memory:


SQL_ID  g6j671u7zc9mn, child number 0
-------------------------------------
select  /*+ monitor */  t1a.mod_1000, t1b.mod_1000 from  t1 t1a,  t1
t1b where  t1a.mod_1000 = 500 and t1b.id(+) = t1a.id and
nvl(t1b.mod_1000,0) + f2(t1a.id) + f3(t1a.id) > 0

Plan hash value: 83896840

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |  2014 (100)|          |
|*  1 |  FILTER             |      |       |       |            |          |
|*  2 |   HASH JOIN OUTER   |      |   400 |  7200 |  2014   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   400 |  3600 |  1005   (3)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1   |   400K|  3515K|  1001   (3)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NVL("T1B"."MOD_1000",0)+"F2"("T1A"."ID")+"F3"("T1A"."ID")>0)
   2 - access("T1B"."ID"="T1A"."ID")
   3 - filter("T1A"."MOD_1000"=500)


Notice the two function calls that are part of the FILTER operation. That’s where the time is going – but I don’t think you can infer that from the SQL Monitor report (which rather suggests that the problem might be with the second full tablescan of t1 at operation 4).

Functions f2() and f3() each do a tablescan of another table that is as large as t1 – and it looks as if they’re both going to be called 400 times, which is a lot of work. The 120MB of “lost” reads comes from the first tablescan of each of the two tables, and the 3M buffer gets (in the summary, if you didn’t notice it) come from the repeated tablescans as each row comes out of the hash join and the filter is applied.

Why doesn’t the report show us the CPU (and disk) samples? Because it’s querying ASH (v$active_session_history) by the SQL_ID of the principal SQL statement – and the work is being done by two other statements with different SQL_IDs.

Since it’s PL/SQL functions doing all the work, why isn’t the 21 CPU seconds showing up in the “PL/SQL time(s)” summary figure? Because the time isn’t being spent in PL/SQL, it’s being spent in the SQL being run by the PL/SQL.
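(As an aside, ASH itself can identify those other statements: each sample in v$active_session_history also records a top_level_sql_id, so a sketch along the following lines – assuming you’re licensed to query ASH – ought to show where the samples have gone.)


-- A sketch, not part of the original test: top_level_sql_id records the
-- outermost user statement, so the recursive SQL run by the functions
-- should show up against the SQL_ID of the monitored query.
select  sql_id, count(*) sample_count
from    v$active_session_history
where   top_level_sql_id = 'g6j671u7zc9mn'
group by
        sql_id
order by
        sample_count desc
;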

Just to finish off, let’s try to find the SQL being run by the PL/SQL. I’ll post the script to recreate the example at the end of the posting but for the moment I’ll just show you the query against v$sql that I ran to extract the SQL that’s embedded in the functions:


select  sql_id, executions, disk_reads, buffer_gets, sql_text
from    V$sql 
where   sql_text like 'SELECT%T2%' 
or      sql_text like 'SELECT%T3%'
;

SQL_ID        EXECUTIONS DISK_READS BUFFER_GETS SQL_TEXT
------------- ---------- ---------- ----------- --------------------------------------------------
12ytf1rry45d9        400       7683     3072817 SELECT MAX(ID) FROM T2 WHERE MOD_1000 = :B1
85dmsgqg3bh4w          1       7680        7698 SELECT MAX(ID) FROM T3 WHERE MOD_1000 = :B1

As you can see, there are roughly 120MB of disk I/O and 3M buffer gets due to these two statements – and one of them has run the 400 times we expected. It looks as if Oracle has done a cute little optimisation with the other function, though. If you look at the predicate it says:

  • NVL("T1B"."MOD_1000",0)+"F2"("T1A"."ID")+"F3"("T1A"."ID")>0

It looks as if Oracle has either cached the result of the f3() function call (in which case why didn’t it also cache the f2() result) or it’s stopped evaluating the predicate as soon as the running total exceeded zero (but that might be a problem since f3() could return a negative number !). I don’t know exactly why Oracle is doing what it’s doing – but if I reverse the order of the functions in the predicates the f3() query will run 400 times and the f2() query will run once.
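If you’d rather not depend on whatever caching Oracle decides to do, a common workaround – sketched below against this example, though I haven’t tested this exact variant – is to wrap each function call in a scalar subquery so that scalar subquery caching can reduce the number of calls. (A caveat: in this particular demo every t1a.id is distinct, so the cache would give little benefit here; the technique pays off when the function inputs repeat.)


select
        /*+ monitor */
        t1a.mod_1000, t1b.mod_1000
from
        t1      t1a,
        t1      t1b
where
        t1a.mod_1000 = 500
and     t1b.id(+) = t1a.id
and     nvl(t1b.mod_1000,0)
      + (select f2(t1a.id) from dual)   -- scalar subquery caching
      + (select f3(t1a.id) from dual)   -- scalar subquery caching
      > 0
;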

Appendix

If you want to re-run some tests here’s the code I used to create the demo:


create table t1
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum-1,1000)              mod_1000,
        lpad(rownum,10,'0')             v1,
        lpad(rownum,10,'0')             v2,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 4e5 -- > comment to avoid WordPress format issue
;

-- alter table t1 add constraint t1_pk primary key(id);

create table t2 nologging as select * from t1;
create table t3 nologging as select * from t1;

create function f2 (i_in number) return number
as
        m_ret number;
begin
        select max(id)
        into    m_ret
        from    t2
        where   mod_1000 = i_in
        ;

        return m_ret;
end;
/

create function f3 (i_in number) return number
as
        m_ret number;
begin
        select  max(id)
        into    m_ret
        from    t3
        where   mod_1000 = i_in
        ;

        return m_ret;
end;
/


begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T2',
                method_opt  => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T3',
                method_opt  => 'for all columns size 1'
        );
end;
/

alter system flush buffer_cache;

variable b1 number
variable b2 number

set timing on

select
        /*+ monitor */
        t1a.mod_1000, t1b.mod_1000
from
        t1      t1a,
        t1      t1b
where
        t1a.mod_1000 = 500
and     t1b.id(+) = t1a.id
and     nvl(t1b.mod_1000,0) + f3(t1a.id) + f2(t1a.id) > 0
;


One interesting little detail, if you try the tests, is that the join has to be an outer join for the FILTER operation to appear.
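If you want to check that for yourself, here’s a sketch of the inner join variant – the same query with the outer join operator removed – which you can run against the same data to see what happens to the compound predicate once the standalone FILTER operation disappears:


select
        /*+ monitor */
        t1a.mod_1000, t1b.mod_1000
from
        t1      t1a,
        t1      t1b
where
        t1a.mod_1000 = 500
and     t1b.id = t1a.id
and     nvl(t1b.mod_1000,0) + f3(t1a.id) + f2(t1a.id) > 0
;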

March 4, 2019

Cartesian Join

Filed under: Execution plans,Oracle — Jonathan Lewis @ 1:37 pm GMT Mar 4,2019

I wrote this note a little over 4 years ago (Jan 2015) but failed to publish it for some reason. I’ve just rediscovered it and it’s got a couple of details that are worth mentioning, so I’ve decided to go ahead and publish it now.

A recent [ed: 4 year old] question on the OTN SQL forum asked for help in “multiplying up” data – producing multiple rows from a single row source. This is something I’ve done fairly often when modelling a problem, for example by generating an orders table and then generating an order_lines table from the orders table, and there are a couple of traps to consider.

The problem the OP had was that their base data was the result set from a complex query – which ran “fine”, but took 10 minutes to complete when a Cartesian join to a table holding just three rows was included. Unfortunately the OP didn’t supply, or even comment on, the execution plans. The obvious guess, of course, is that the extra table resulted in a completely different execution plan rather than the expected “do the original query then multiply by 3” plan, in which case the solution to the problem is (probably) simple – stick the original query into a non-mergeable view before doing the join.

Assume we have the following tables: t1 has 100,000 rows (generated from the SQL in this article), t2 has 4 rows where column id2 holds the values from 1 to 4, and t3 is empty – we can model the basic requirement with the query shown below:


SQL> desc t1
 Name                    Null?    Type
 ----------------------- -------- ----------------
 ID                               NUMBER
 C1                               CHAR(2)
 C2                               CHAR(2)
 C3                               CHAR(2)
 C4                               CHAR(2)
 PADDING                          VARCHAR2(100)

SQL> desc t2
 Name                    Null?    Type
 ----------------------- -------- ----------------
 ID2                              NUMBER

SQL> desc t3
 Name                    Null?    Type
 ----------------------- -------- ----------------
 ID                               NUMBER
 ID2                              NUMBER
 C1                               CHAR(2)
 C2                               CHAR(2)
 C3                               CHAR(2)
 C4                               CHAR(2)
 PADDING                          VARCHAR2(100)


insert into t3
select
        t1.id, t2.id2, t1.c1, t1.c2, c3, t1.c4, t1.padding
from
       (select * from t1) t1,
        t2
;

If extending the original “select * from t1” (assuming t1 was really a complicated view) to include the Cartesian join makes us “lose” the plan, all we need to do is the following:


insert into t3
select
        /*+ leading(t1 t2) */
        t1.id, t2.id2, t1.c1, t1.c2, c3, t1.c4, t1.padding
from
        (select /*+ no_merge */ * from t1) t1,
        t2
;

This is where the problem starts to get a little interesting. The /*+ no_merge */ hint is (usually) a winner in situations like this – but why have I included a /*+ leading() */ hint choosing to put t2 (the small table) second in the join order? It’s because of the way that Cartesian Merge Joins work, combined with an eye to where my most important resource bottleneck is likely to be. Here’s the execution plan, taken from memory after executing this statement with statistics_level set to all (11.2.0.4):


SQL_ID  azu8ntfjg9pwj, child number 0
-------------------------------------
insert into t3 select   /*+ leading(t1 t2) */  t1.id, t2.id2, t1.c1,
t1.c2, c3, t1.c4, t1.padding from  (select /*+ no_merge */ * from t1)
t1,   t2

Plan hash value: 1055157753

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |      1 |        |      0 |00:00:10.28 |   48255 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |      |      1 |        |      0 |00:00:10.28 |   48255 |       |       |          |
|   2 |   MERGE JOIN CARTESIAN   |      |      1 |    400K|    400K|00:00:06.30 |    1727 |       |       |          |
|   3 |    VIEW                  |      |      1 |    100K|    100K|00:00:00.94 |    1725 |       |       |          |
|   4 |     TABLE ACCESS FULL    | T1   |      1 |    100K|    100K|00:00:00.38 |    1725 |       |       |          |
|   5 |    BUFFER SORT           |      |    100K|      4 |    400K|00:00:01.78 |       2 |  3072 |  3072 | 2048  (0)|
|   6 |     TABLE ACCESS FULL    | T2   |      1 |      4 |      4 |00:00:00.01 |       2 |       |       |          |
----------------------------------------------------------------------------------------------------------------------

Let’s try that again (building from scratch, of course) with the table order reversed in the leading() hint:


SQL_ID  52qaydutssvn5, child number 0
-------------------------------------
insert into t3 select   /*+ leading(t2 t1) */  t1.id, t2.id2, t1.c1,
t1.c2, c3, t1.c4, t1.padding from  (select /*+ no_merge */ * from t1)
t1,   t2

Plan hash value: 2126214450

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |      1 |        |      0 |00:00:12.29 |   48311 |   6352 |   1588 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |      |      1 |        |      0 |00:00:12.29 |   48311 |   6352 |   1588 |       |       |          |
|   2 |   MERGE JOIN CARTESIAN   |      |      1 |    400K|    400K|00:00:06.64 |    1729 |   6352 |   1588 |       |       |          |
|   3 |    TABLE ACCESS FULL     | T2   |      1 |      4 |      4 |00:00:00.01 |       2 |      0 |      0 |       |       |          |
|   4 |    BUFFER SORT           |      |      4 |    100K|    400K|00:00:04.45 |    1727 |   6352 |   1588 |    13M|  1416K| 9244K (0)|
|   5 |     VIEW                 |      |      1 |    100K|    100K|00:00:00.80 |    1725 |      0 |      0 |       |       |          |
|   6 |      TABLE ACCESS FULL   | T1   |      1 |    100K|    100K|00:00:00.28 |    1725 |      0 |      0 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

There’s one important detail that’s not explicit in the execution plans – I’ve set the workarea_size_policy to manual and the sort_area_size to 10MB to demonstrate the impact of having a dataset that is too large for the session’s workarea limit.

The results, in terms of timing, are borderline. With the “correct” choice of order the completion time is 10.28 seconds compared to 12.29 seconds, though if you look at the time for the Merge Join Cartesian operation the difference is much less significant. The critical point, though, appears at operation 4 – the Buffer Sort. I set my sort_area_size to something that I knew was smaller than the data set I needed to buffer – so the operation had to spill to disc. Ignoring overheads and rounding errors, the data from the 1,727 blocks I read from the table at pctfree = 10 was dumped to the temporary space in 1,588 packed blocks (sanity check: 1,727 * 0.9 = 1,554); and then those blocks were read back once for each row from the driving t2 table (sanity check: 1,588 * 4 = 6,352).

With my setup I had a choice of bottlenecks: scan a very small data set in memory 100,000 times to burn CPU, or scan a large data set from disc 4 times. There wasn’t much difference in my case, but the difference could be significant on a full-scale production system. By default the optimizer happened to pick the “wrong” path with my data sets.

But there’s something even more important than this difference in resource usage to generate the data: what does the data look like after it’s been generated? Here’s a simple query to show you the first few rows of the stored result sets from the two different tests:


SQL> select id, id2, c1, c2, c3, c4 from t3 where rownum <= 8;

Data from leading (t1 t2)
=========================
        ID        ID2 C1 C2 C3 C4
---------- ---------- -- -- -- --
         1          1 BV GF JB LY
         1          2 BV GF JB LY
         1          3 BV GF JB LY
         1          4 BV GF JB LY
         2          1 YV LH MT VM
         2          2 YV LH MT VM
         2          3 YV LH MT VM
         2          4 YV LH MT VM


Data from leading (t2 t1)
=========================
        ID        ID2 C1 C2 C3 C4
---------- ---------- -- -- -- --
         1          1 BV GF JB LY
         2          1 YV LH MT VM
         3          1 IE YE TS DP
         4          1 DA JY GS AW
         5          1 ZA DC KD CF
         6          1 VJ JI TC RI
         7          1 DN RY KC BE
         8          1 WP EQ UM VY

If we had been using code like this to generate an order_lines table from an orders table, with leading(orders t2) we would have “order lines” clustered around the “order number” – which is a realistic model; with leading(t2 orders) the clustering disappears (technically the order numbers are clustered around the order lines). It’s this aspect of the data, rather than a little time lost or gained in generating it, that might have a much more important impact on the suitability (and timing) of any testing you may be doing.
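To make that concrete, here’s a hypothetical sketch of the pattern with the driving table first – the orders and order_lines names and columns are invented for illustration, not taken from any real schema:


-- Hypothetical tables and columns, for illustration only.
insert into order_lines (order_id, line_number, line_value)
select
        /*+ leading(ord gen) */
        ord.order_id,
        gen.line_number,
        mod(ord.order_id + gen.line_number, 97)
from
        (select /*+ no_merge */ order_id from orders)                 ord,
        (select rownum line_number from dual connect by level <= 3)  gen
;


With this join order each order_id appears in three consecutive rows of the result, so the “order lines” stay clustered by order number.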

Footnote

If you try to repeat this test on your own system don’t expect my timing to match yours. Bear in mind, also, that with statistics_level set to all there’s going to be a CPU overhead that varies between the two options for the leading() hint – the CPU usage on rowsource execution stats could be much higher for the case where one of the operations starts 100,000 times.

 

January 31, 2019

Descending Problem

Filed under: Execution plans,Indexing,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 3:34 pm GMT Jan 31,2019

I’ve written in the past about oddities with descending indexes ( here, here, and here, for example) but I’ve just come across a case where I may have to introduce a descending index that really shouldn’t need to exist. As so often happens it’s at the boundary where two Oracle features collide. I have a table that handles data for a large number of customers, who record a reasonable number of transactions per year, and I have a query that displays the most recent transactions for a customer. Conveniently the table is partitioned by hash on the customer ID, and I have an index that starts with the customer_id and transaction_date columns. So here’s my query or, to be a little more accurate, the client’s query – simplified and camouflaged:


select  /*+ gather_plan_statistics */
        *
from    (
             select
                    v1.*,
                    rownum rn
             from   (
                             select   /*
                                         no_eliminate_oby
                                         index_rs_desc(t1 (customer_id, transaction_date))
                                      */
                                      t1.*
                             from     t1
                             where    customer_id = 50
                             and      transaction_date >= to_date('1900-01-01','yyyy-mm-dd')
                             order by transaction_date DESC
                ) v1
                where  rownum <= 10 -- > comment to avoid WordPress format issue
         )
where    rn >= 1
;

You’ll notice some hinting – the /*+ gather_plan_statistics */ will allow me to report the rowsource execution stats when I pull the plan from memory, and the hints in the inline view (which I’ve commented out in the above) will force a particular execution plan – walking through the index on (customer_id, transaction_date) in descending order.

If I create t1 as a simple (non-partitioned) heap table I get the following plan unhinted (I’ve had to edit a “less than or equal to” symbol to avoid a WordPress format issue):

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |       |      1 |        |    14 (100)|     10 |00:00:00.01 |      14 |
|*  1 |  VIEW                           |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  2 |   COUNT STOPKEY                 |       |      1 |        |            |     10 |00:00:00.01 |      14 |
|   3 |    VIEW                         |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|   4 |     TABLE ACCESS BY INDEX ROWID | T1    |      1 |    340 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  5 |      INDEX RANGE SCAN DESCENDING| T1_I1 |      1 |     10 |     3   (0)|     10 |00:00:00.01 |       4 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   5 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Notice the descending range scan of the index – just as I wanted it – the minimal number of buffer visits, and only 10 rows (and rowids) examined from the table. But what happens if I recreate t1 as a hash-partitioned table with local index – here’s the new plan, again without hinting the SQL:


----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |       |      1 |        |   207 (100)|     10 |00:00:00.01 |     138 |       |       |          |
|*  1 |  VIEW                                          |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  2 |   COUNT STOPKEY                                |       |      1 |        |            |     10 |00:00:00.01 |     138 |       |       |          |
|   3 |    VIEW                                        |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  4 |     SORT ORDER BY STOPKEY                      |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |  2048 |  2048 | 2048  (0)|
|   5 |      PARTITION HASH SINGLE                     |       |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|   6 |       TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1    |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|*  7 |        INDEX RANGE SCAN                        | T1_I1 |      1 |    340 |     4   (0)|    340 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   4 - filter(ROWNUM .LE. 10)
   7 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE">=TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TRANSACTION_DATE" IS NOT NULL)

Even though the optimizer has recognised that it will be visiting a single partition through a local index it has not chosen a descending index range scan, though it has used the appropriate index; so it’s fetched all the relevant rows from the table in the wrong order, then sorted them, discarding all but the top 10. We’ve done 138 buffer visits (which would turn into disk I/Os, and far more of them, in the production system).

Does this mean that the optimizer can’t use the descending index when the table is partitioned – or has the costing somehow gone wrong? Here’s the plan with the hints in place to see what happens when we demand a descending range scan:


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |      1 |        |   207 (100)|     10 |00:00:00.01 |       8 |
|*  1 |  VIEW                                 |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |       8 |
|*  2 |   COUNT STOPKEY                       |       |      1 |        |            |     10 |00:00:00.01 |       8 |
|   3 |    VIEW                               |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |       8 |
|   4 |     PARTITION HASH SINGLE             |       |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID| T1    |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|*  6 |       INDEX RANGE SCAN DESCENDING     | T1_I1 |      1 |    340 |     4   (0)|     16 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   6 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The optimizer is happy to oblige with the descending range scan – we can see that we’ve visited only 8 buffers, and fetched only 10 rows from the table. The cost, however, hasn’t made any allowance for the limited range scan. Check back to the plan for the simple (non-partitioned) table and you’ll see that the optimizer did allow for the reduced range scan. So the problem here is a costing one – we have to hint the index range scan if we want Oracle to limit the work it does.

You might notice, by the way, that the number of rowids returned in the index range scan descending operation is 16 rather than 10 – a little variation that didn’t show up when the table wasn’t partitioned. I don’t know why this happened, but when I changed the requirement to 20 rows the range scan returned 31 rowids, when I changed it to 34 rows the range scan returned 46 rowids, and a request for 47 rows returned 61 index rowids – you can see the pattern: the number of rowids returned by the index range scan seems to be 1 + 15*N.

Footnote:

If you want to avoid hinting the code (or adding an SQL patch) you need only re-create the index with the transaction_date column declared as descending (“desc”), at which point the optimizer automatically chooses the correct strategy and the run-time engine returns exactly 10 rowids and doesn’t need to do any sorting. But who wants to create a descending index when they don’t really need it !
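(For reference, here’s a sketch of what that index might look like, based on the index definition in the script below – I’ve invented the name t1_i1_desc to avoid a clash with the original:)


create index t1_i1_desc on t1(customer_id, transaction_date desc)
local
nologging
;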

If you want to reproduce the experiments, here’s the script to create my test data.


rem
rem     Script:         pt_ind_desc_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem     Purpose:        
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1 (
        customer_id,
        transaction_date,
        small_vc,
        padding 
)
partition by hash(customer_id) partitions 4
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        mod(rownum,128)                         customer_id,
        (trunc(sysdate) - 1e6) + rownum         transaction_date,
        lpad(rownum,10,'0')                     v1,
        lpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(customer_id, transaction_date) 
local 
nologging
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

I’ve run this test on 12.1.0.2, 12.2.0.1, and 18.3.0.0 – the behaviour is the same in all three versions.

Update (1st Feb 2019)

As the client reminded me after reading the post, it’s worth pointing out that for more complex SQL you still have to worry about the errors in the cardinality and cost calculations that could easily push the optimizer into the wrong join order and/or join method – whether you choose to hint the ascending index or create a descending index.  Getting the plan you want for this type of “pagination” query can be a messy process.

January 17, 2019

Hint Reports

Filed under: dbms_xplan,Execution plans,Hints,Oracle — Jonathan Lewis @ 9:59 am GMT Jan 17,2019

Nigel Bayliss has posted a note about a frequently requested feature that has now appeared in Oracle 19c – a mechanism to help people understand what has happened to their hints. It’s very easy to use: it’s just another format option to the “display_xxx()” calls in dbms_xplan; so I thought I’d run up a little demonstration (using an example I first generated 18 years and 11 versions ago) to make three points: first, to show the sort of report you get; second, to show you that the report may tell you what has happened, but that doesn’t necessarily tell you why it has happened; and third, to remind you that you should have stopped using the /*+ ordered */ hint 18 years ago.

I’ve run the following code on livesql:


rem
rem     Script:         c_ignorehint.sql
rem     Author:         Jonathan Lewis
rem     Dated:          March 2001
rem


drop table ignore_1;
drop table ignore_2;

create table ignore_1
nologging
as
select
        rownum          id,
        rownum          val,
        rpad('x',500)   padding
from    all_objects
where   rownum <= 3000
;

create table ignore_2
nologging
as
select
        rownum          id,
        rownum          val,
        rpad('x',500)   padding
from    all_objects
where   rownum <= 500
;

alter table ignore_2
add constraint ig2_pk primary key (id);


explain plan for
update
        (
                select
                        /*+
                                ordered
                                use_nl(i2)
                                index(i2,ig2_pk)
                        */
                        i1.val  val1,
                        i2.val  val2
                from
                        ignore_1        i1,
                        ignore_2        i2
                where
                        i2.id = i1.id
                and     i1.val <= 10
        )
set     val1 = val2
;

select * from table(dbms_xplan.display(null,null,'hint_report'));

explain plan for
update
        (
                select
                        /*+
                                use_nl(i2)
                                index(i2,ig2_pk)
                        */
                        i1.val  val1,
                        i2.val  val2
                from
                        ignore_1        i1,
                        ignore_2        i2
                where
                        i2.id = i1.id
                and     i1.val <= 10
        )
set     val1 = val2
;

select * from table(dbms_xplan.display(null,null,'hint_report'));

As you can see I’ve simply added the format option “hint_report” to the call to dbms_xplan.display(). Before showing you the output I’ll just say a few words about the plans we might expect from the two versions of the update statement.

Given the /*+ ordered */ hint in the first statement we might expect Oracle to do a full tablescan of ignore_1 then do a nested loop into ignore_2 (obeying the use_nl() hint) using the (hinted) ig2_pk index. In the second version of the statement, and in the absence of the ordered hint, it’s possible that the optimizer will still use the same path but, in principle, it might find some other path.

So what do we get ? In order here are the two execution plans:


Plan hash value: 3679612214
 
--------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                      |          |    10 |   160 |   111   (0)| 00:00:01 |
|   1 |  UPDATE                               | IGNORE_1 |       |       |            |          |
|*  2 |   HASH JOIN                           |          |    10 |   160 |   111   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| IGNORE_2 |   500 |  4000 |    37   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN                   | IG2_PK   |   500 |       |     1   (0)| 00:00:01 |
|*  5 |    TABLE ACCESS STORAGE FULL          | IGNORE_1 |    10 |    80 |    74   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("I2"."ID"="I1"."ID")
   5 - storage("I1"."VAL"<=10)
       filter("I1"."VAL"<=10)
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$DA9F4B51
           -  ordered
 
   3 -  SEL$DA9F4B51 / I2@SEL$1
         U -  use_nl(i2)
           -  index(i2,ig2_pk)




Plan hash value: 1232653668
 
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT              |          |    10 |   160 |    76   (0)| 00:00:01 |
|   1 |  UPDATE                       | IGNORE_1 |       |       |            |          |
|   2 |   NESTED LOOPS                |          |    10 |   160 |    76   (0)| 00:00:01 |
|   3 |    NESTED LOOPS               |          |    10 |   160 |    76   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL | IGNORE_1 |    10 |    80 |    74   (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN         | IG2_PK   |     1 |       |     0   (0)| 00:00:01 |
|   6 |    TABLE ACCESS BY INDEX ROWID| IGNORE_2 |     1 |     8 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - storage("I1"."VAL"<=10)
       filter("I1"."VAL"<=10)
   5 - access("I2"."ID"="I1"."ID")
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   5 -  SEL$DA9F4B51 / I2@SEL$1
           -  index(i2,ig2_pk)
           -  use_nl(i2)

As you can see, the “Hint Report” shows us how many hints have been seen in the SQL text, then the body of the report shows us which query block, operation and table (where relevant) each hint has been associated with, and whether it has been used or not.

The second query has followed exactly the plan I predicted for the first query and the report has shown us that Oracle noted, and used, the use_nl() and index() hints to access table ignore_2, deciding for itself to visit the tables in the order ignore_1 -> ignore_2, and doing a full tablescan on ignore_1.

The first query reports three hints, but flags the use_nl() hint as unused. (There is (at least) one other flag that could appear against a hint – “E” for error (probably syntax error) – so we can assume that this hint is not being ignored because there’s something wrong with it.) Strangely, the report tells us that the optimizer has used the ordered hint, but we can see from the plan that the tables appear in the opposite order to the order we specified in the from clause, and the chosen order has forced the optimizer into using an index full scan on ig2_pk because it had to obey our index() hint. Bottom line – the optimizer has managed to find a more costly plan by “using but apparently ignoring” a hint that described the cheaper plan that we would have got if we hadn’t used the hint.

Explanation

Query transformation can really mess things up and you shouldn’t be using the ordered hint.

I’ve explained many times over the years that the optimizer evaluates the cost of an update statement by calculating the cost of selecting the rowids of the rows to be updated. In this case, which uses an updatable join view, the steps taken to follow this mechanism are slightly more complex. Here are two small but critical extracts from the 10053 trace file (taken from an 18c instance):


CVM:   Merging SPJ view SEL$1 (#0) into UPD$1 (#0)
Registered qb: SEL$DA9F4B51 0x9c9966e8 (VIEW MERGE UPD$1; SEL$1; UPD$1)

...

SQE: Trying SQ elimination.
Query after View Removal
******* UNPARSED QUERY IS *******
SELECT
        /*+ ORDERED INDEX ("I2" "IG2_PK") USE_NL ("I2") */
        0
FROM    "TEST_USER"."IGNORE_2" "I2",
        "TEST_USER"."IGNORE_1" "I1"
WHERE   "I2"."ID"="I1"."ID"
AND     "I1"."VAL"<=10


The optimizer has merged the UPDATE query block with the SELECT query block to produce a select statement that will produce the necessary plan (I had thought that i1.rowid would appear in the select list, but the ‘0’ will do for costing purposes). Notice that the hints have been preserved as the update and select were merged but, unfortunately, the merge mechanism has reversed the order of the tables in the from clause. So the optimizer has messed up our select statement, then obeyed the original ordered hint!

Bottom line – the hint report is likely to be very helpful in most cases but you will still have to think about what it is telling you, and you may still have to look at the occasional 10053 to understand why the report is showing you puzzling results. You should also stop using a hint that was replaced by a far superior hint more than 18 years ago – the ordered hint in my example should have been changed to /*+ leading(i1 i2) */ in Oracle 9i.
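For completeness, here’s the first statement from the demo rewritten with the replacement hint – a sketch only, but leading(i1 i2) names the tables explicitly instead of relying on their textual order in the from clause, so it shouldn’t be derailed when the view merge rewrites that order:


explain plan for
update
        (
                select
                        /*+
                                leading(i1 i2)
                                use_nl(i2)
                                index(i2,ig2_pk)
                        */
                        i1.val  val1,
                        i2.val  val2
                from
                        ignore_1        i1,
                        ignore_2        i2
                where
                        i2.id = i1.id
                and     i1.val <= 10
        )
set     val1 = val2
;

select * from table(dbms_xplan.display(null,null,'hint_report'));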

December 20, 2018

Transitive Closure

Filed under: CBO,Execution plans,Oracle — Jonathan Lewis @ 1:19 pm GMT Dec 20,2018

This is a follow-up to a note I wrote nearly 12 years ago, looking at the problems of transitive closure (or absence thereof) from the opposite direction. Transitive closure gives the optimizer one way of generating new predicates from the predicates you supply in your where clause (or, in some cases, your constraints); but it’s a mechanism with some limitations. Consider the following pairs of predicates:


    t1.col1 = t2.col2
and t2.col2 = t3.col3

    t1.col1 = t2.col2
and t2.col2 = 'X'

A person can see that the first pair of predicates allows us to infer that “t1.col1 = t3.col3” and the second pair of predicates allows us to infer that “t1.col1 = ‘X’”. The optimizer is coded only to recognize the second inference. This has an important side effect that can have a dramatic impact on performance in a way that’s far more likely to appear if your SQL is generated by code. Consider this sample data set (reproduced from the 2006 article):

rem
rem     Script:         transitive_loop.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2006
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem

create table t1 
as
select
        mod(rownum,100) col1,
        rpad('x',200)   v1
from
        all_objects
where   
        rownum <= 2000
;

create table t2
as
select
        mod(rownum,100) col2,
        rpad('x',200)   v2
from
        all_objects
where   
        rownum <= 2000
;

create table t3
as
select
        mod(rownum,100) col3,
        rpad('x',200)   v3
from
        all_objects
where   
        rownum <= 2000
;

-- gather stats if necessary

set autotrace traceonly explain

prompt  =========================
prompt  Baseline - two hash joins
prompt  =========================

select 
        t1.*, t2.*, t3.*
from
        t1, t2, t3
where
        t2.col2 = t1.col1
and     t3.col3 = t2.col2
;

prompt  ================================================
prompt  Force mismatch between predicates and join order
prompt  ================================================

select 
        /*+
                leading(t1 t3 t2)
        */
        t1.*, t2.*, t3.*
from
        t1, t2, t3
where
        t2.col2 = t1.col1
and     t3.col3 = t2.col2
;

The first query simply joins the tables in the from clause order on a column we know will have 20 rows for each distinct value, so the result sets will grow from 2,000 rows to 40,000 rows to 800,000 rows. Looking at the second query we would like to think that when we force Oracle to use the join order t1 -> t3 -> t2 it would be able to use the existing predicates to generate the predicate “t3.col3 = t1.col1” and therefore be able to do the same amount of work as the first query (and, perhaps, manage to produce the same final cardinality estimate).

Here are the two plans, taken from an instance of 12.2.0.1:


=========================
Baseline - two hash joins
=========================

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   800K|   466M|    48  (38)| 00:00:01 |
|*  1 |  HASH JOIN          |      |   800K|   466M|    48  (38)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T3   |  2000 |   398K|    10   (0)| 00:00:01 |
|*  3 |   HASH JOIN         |      | 40000 |    15M|    21   (5)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1   |  2000 |   398K|    10   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| T2   |  2000 |   398K|    10   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T3"."COL3"="T2"."COL2")
   3 - access("T2"."COL2"="T1"."COL1")

================================================
Force mismatch between predicates and join order
================================================

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |   800K|   466M| 16926   (3)| 00:00:01 |
|*  1 |  HASH JOIN            |      |   800K|   466M| 16926   (3)| 00:00:01 |
|   2 |   TABLE ACCESS FULL   | T2   |  2000 |   398K|    10   (0)| 00:00:01 |
|   3 |   MERGE JOIN CARTESIAN|      |  4000K|  1556M| 16835   (2)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T1   |  2000 |   398K|    10   (0)| 00:00:01 |
|   5 |    BUFFER SORT        |      |  2000 |   398K| 16825   (2)| 00:00:01 |
|   6 |     TABLE ACCESS FULL | T3   |  2000 |   398K|     8   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."COL2"="T1"."COL1" AND "T3"."COL3"="T2"."COL2")

As you can see, there’s a dramatic difference between the two plans, and a huge difference in cost (though the predicted time for both is still no more than 1 second).

The first plan, where we leave Oracle to choose the join order, builds an in-memory hash table from t3, then joins t1 to t2 with a hash table and uses the result to join to t3 by probing the in-memory hash table.

The second plan, where we force Oracle to use a join order that (I am pretending) we believe to be better, results in Oracle doing a Cartesian merge join between t1 and t3 that explodes the intermediate result set up to 4 million rows (and the optimizer’s estimate is correct) before eliminating a huge amount of redundant data.

As far as performance is concerned, the first query took 0.81 seconds to generate its result set, the second query took 8.81 seconds. In both cases CPU time was close to 100% of the total time.

As a follow-up demo I added the extra predicate “t3.col3 = t1.col1” to the second query, allowing the optimizer to use a hash join with the join order t1 -> t3 -> t2, and this brought the run time back down (with a slight increase due to the extra predicate check on the second join).
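For reference, a sketch of that follow-up – the extra predicate is redundant to the human eye, but it hands the optimizer the join condition that transitive closure couldn’t generate:


select
        /*+
                leading(t1 t3 t2)
        */
        t1.*, t2.*, t3.*
from
        t1, t2, t3
where
        t2.col2 = t1.col1
and     t3.col3 = t2.col2
and     t3.col3 = t1.col1       -- manually supplied "transitive" predicate
;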

Summary

The choice of columns in join predicates may stop Oracle from choosing the best join order because it is not able to use transitive closure to generate all the extra predicates that the human eye can see. If you are using programs to generate SQL rather than writing SQL by hand you are more likely to see this limitation resulting in some execution plans being less efficient than they could be.

 

 

 

 

December 18, 2018

NULL predicate

Filed under: CBO,Execution plans,Indexing,Oracle — Jonathan Lewis @ 1:13 pm GMT Dec 18,2018

People ask me from time to time if I’m going to write another book on the Cost Based Optimizer – and I think the answer has to be no, because the product keeps growing so fast that it’s not possible to keep up, and because there are always more and more little details that might have been around for years and finally show up when someone asks me a question about some little oddity I’ve never noticed before.

The difficulty with the “little oddities” is the amount of time you could spend trying to work out whether or not they matter and whether it’s worth writing about them. Here’s a little example to show what I mean – first the data set:


rem
rem     Script:         null_filter.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0
rem             12.1.0.2
rem

create table t1
nologging
as
select  *
from    all_objects
where   rownum <= 50000 -- > comment to avoid wordpress format issue
;

insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
commit;

create index t1_i1 on t1(object_type, data_object_id, object_id, created);

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                cascade     => true,
                method_opt  => 'for all columns size 1'
        );
end;
/

It’s a simple data set with a single index. The only significant thing about the index is that the second column (data_object_id) is frequently null. This leads to a little quirk in the execution plans for a very similar pair of statements:


set serveroutput off
alter session set statistics_level = all;

select
        object_name, owner
from
        t1
where
        object_type = 'TABLE'
and     data_object_id = 20002
and     object_id = 20002
and     created > trunc(sysdate - 90)
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

select
        object_name, owner
from
        t1
where
        object_type = 'TABLE'
and     data_object_id is null
and     object_id = 20002
and     created > trunc(sysdate - 90)
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

How much difference would you expect in the execution plans for these two queries? There is, of course, the side effect of the “is null” predicate disabling the “implicit column group” that is the index distinct_keys value, but in this case I’ve got a range-based predicate on one of the columns so Oracle won’t be using the distinct_keys anyway.

Of course there’s the point that you can’t use the equality operator with null, you have to use “is null” – and that might make a difference, but how ? Here are the two execution plans:


----------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |      1 |        |      0 |00:00:00.01 |       3 |      1 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      1 |      1 |      0 |00:00:00.01 |       3 |      1 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |      1 |      1 |      0 |00:00:00.01 |       3 |      1 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE' AND "DATA_OBJECT_ID"=20002 AND "OBJECT_ID"=20002 AND
              "CREATED">TRUNC(SYSDATE@!-90))

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |      1 |        |      0 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      1 |      1 |      0 |00:00:00.01 |       3 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |      1 |      1 |      0 |00:00:00.01 |       3 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE' AND "DATA_OBJECT_ID" IS NULL AND "OBJECT_ID"=20002 AND
              "CREATED">TRUNC(SYSDATE@!-90))
       filter(("OBJECT_ID"=20002 AND "CREATED">TRUNC(SYSDATE@!-90)))

The query with the predicate “data_object_id is null” repeats the object_id and sysdate predicates as access predicates and filter predicates. This seems a little surprising and a potential performance threat. In the first query the run-time engine will hit the correct index leaf block in exactly the right place very efficiently and then walk along it supplying every rowid to the parent operator until it hits the end of the range.

With the “is null” plan the run-time engine will be checking the actual value of object_id and created for every index entry on the way – how much extra CPU will this use and, more importantly, might Oracle start with the first index entry where object_type = ‘TABLE’ and data_object_id is null and walk through every index entry that has that null, checking for the correct object_id as it goes ?

That last question is the reason for running the query with rowsource execution stats enabled. The first query did a single physical read while the second didn’t have to, but the more important detail is that both queries did the same number of buffer gets – and there is, by the way, a set of eight rows where the object_id and data_object_id are 20,002, but they were created several years ago so the index range scan returns no rows in both cases.

Based on that comparison, how do we show that Oracle has not walked all the way from the first index entry where object_type = ‘TABLE’ and data_object_id is null, checking every entry on the way? Or, to put it another way, has Oracle really managed to prune down the index range scan to the minimum “wedge” indicated by the presence of the predicates “OBJECT_ID”=20002 AND “CREATED”>TRUNC(SYSDATE@!-90) as access predicates?

Let’s just count the number of leaf blocks that might be relevant, using the sys_op_lbid() function (last seen here) that Oracle uses internally to count the number of leaf blocks in an index. First we get the index object_id, then we scan it to see how many leaf blocks hold entries that match our object_type and data_object_id predicates but appear in the index before our target value of 20,002:


column object_id new_value m_index_id

select
        object_id
from
        user_objects
where
        object_type = 'INDEX'
and     object_name = 'T1_I1'
;

select  distinct sys_op_lbid(&m_index_id, 'L', rowid)
from    t1
where   object_type    = 'TABLE'
and     data_object_id is null
and     object_id      < 20002
;


SYS_OP_LBID(159271
------------------
AAAm4nAAFAAACGDAAA
AAAm4nAAFAAACF9AAA
AAAm4nAAFAAACGCAAA
AAAm4nAAFAAACF/AAA
AAAm4nAAFAAACF+AAA
AAAm4nAAFAAACGFAAA
AAAm4nAAFAAACGEAAA
AAAm4nAAFAAACGGAAA

8 rows selected.


This tells us that there are 8 leaf blocks in the index that we would have to range through before we found object_id 20,002 – so we would have seen 8 buffer gets, rather than 3, in the rowsource execution stats if Oracle had not been clever with its access predicates and narrowed down the wedge of the index it was probing.
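
To put those 8 blocks in context you could also check the total number of leaf blocks in the index – a minimal sketch, bearing in mind that validate structure locks the table against DML while it runs, and index_stats is only visible to the session that ran the analyze:

analyze index t1_i1 validate structure;

select  lf_blks, br_blks
from    index_stats
;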

Bottom line: for a multi-column index there seems to be a difference in execution plans between “column is null” and “column = constant” when the column is one of the earlier columns in the index – but even though the “is null” option results in some access predicates re-appearing as filter predicates in the index range scan, the extra workload is probably not significant: Oracle still uses the minimum number of index leaf blocks in the range scan.

 

December 10, 2018

Case Study

Filed under: Execution plans,Oracle,Statistics — Jonathan Lewis @ 1:10 pm GMT Dec 10,2018

A recent thread on the ODC database forum highlighted a case where the optimizer was estimating 83,000 for a particular index full scan when the SQL Monitor output for the operation showed that it was returning 11,000,000 rows.

Apart from the minor detail that the OP didn’t specifically ask a question, the information supplied was pretty good. The OP had given us a list of bind variables, with values, and the SQL statement, followed by the text output of the Monitor’ed SQL and, to get the predicate section of the plan, the output from a call to dbms_xplan. This was followed by the DDL for the critical index and a list of the stats for all the columns in the index.

Here’s the critical line of the plan (from the SQL Monitor report) followed by its predicate section (from the dbms_xplan output, but cosmetically enhanced) and some details of the columns used in the predicate:

SQL Plan Monitoring Details (Plan Hash Value=3210215320)
=================================================================================================================================================================================================================================
| Id    |            Operation            |         Name            |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write | Mem  | Temp | Activity |       Activity Detail       | Progress | 
|       |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes |      |      |   (%)    |         (# samples)         |          |
=================================================================================================================================================================================================================================
|    11 |             INDEX FULL SCAN     | PK_HOUSEHOLD_GDC        |   83917 | 22799 |        86 |     +1 |     1 |      11M |     9 | 73728 |       |       |      |      |    24.21 | Cpu (77)                    |          |
=================================================================================================================================================================================================================================

  11 - filter(
        (    TO_DATE(:SYS_B_00||TO_CHAR("MONTH")||:SYS_B_01||TO_CHAR("YEAR"),:SYS_B_02)>=ADD_MONTHS(TRUNC(TO_DATE(:SYS_B_03,:SYS_B_04),:SYS_B_05),(-:SYS_B_06)) 
         AND TO_DATE(:SYS_B_00||TO_CHAR("MONTH")||:SYS_B_01||TO_CHAR("YEAR"),:SYS_B_02)<=TRUNC(TO_DATE(:SYS_B_07,:SYS_B_08),:SYS_B_09)-:SYS_B_10)
        )

COLUMN_NAME                    DATA_TYPE       NUM_DISTINCT  DENSITY  NUM_NULLS LAST_ANALYZED       HISTOGRAM
------------------------------ --------------- ------------ -------- ---------- ------------------- ---------------
YEAR                           NUMBER                     5        0          0 2018-12-02 13:19:10 FREQUENCY
MONTH                          NUMBER                    12        0          0 2018-12-02 13:19:10 FREQUENCY

I’ve included the full Monitor output at the end of the posting, or you could visit the ODC page if you want to see it, but if we look at just this line we can see that the index full scan starts running in the first second of the query (‘Start Active’), runs once (‘Execs’) and, as the OP said, retrieved 11M rows in that one scan compared to an estimated 83,917.

When we examine the predicate section we can understand why the optimizer could make such a large error – the SQL requires Oracle to combine two columns from the table with various bits of bind variables to construct a date, which is then compared with a couple of constant dates (derived from several more bind variables) using range-based comparisons.

This is an example of Oracle using a fixed estimate of 5% for the selectivity of “unknown range-based comparison” – but with two comparisons the selectivity becomes 5% of 5% = 0.25% (i.e. 1/400).

If we look at the column definitions and stats we see that we seem to have 5 possible years and 12 possible months (which could mean a range as small as 3 years and 2 months) – so a selectivity of 1/400 would be in the right ballpark if we were querying for a date range of roughly 4.5 days. Working the figures the other way around – if 83,917 is 1/400 of the data then there are about 33.5M rows in the table and we are querying for something more like 1/3 of the table.

Observations

I find it curious that the optimizer used an “index full scan” to fetch a huge amount of data from the index when there is no requirement for sorting (there is a subsequent “hash unique”, rather than a “sort unique nosort”). I would have expected an “index fast full scan”, so I am curious to know whether some optimizer parameters have been fiddled with to make the optimizer bypass the fast full scan. Possibly a change in parameter settings would result in a very different plan.
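
If you wanted to check for fiddling, a query against v$parameter covering the usual suspects would be a reasonable first step – a sketch (this is just the list of parameters I’d look at first, not a definitive set):

select  name, value, isdefault
from    v$parameter
where   name in (
                'optimizer_index_cost_adj',       -- my shortlist of suspects
                'optimizer_index_caching',
                'optimizer_mode',
                'db_file_multiblock_read_count'
        )
order by
        name
;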

The names of the bind variables are of the form “SYS_B_nn” – which means that the original query has been subject to the effects of forced cursor sharing. Since we are apparently expecting to identify and manipulate millions of rows this looks like the type of query where you don’t want to use cursor sharing. If the session can set “cursor_sharing=exact” before running the query, or inject the hint /*+ cursor_sharing_exact */ into the query, then perhaps we’d get a better estimate of rows (and a better plan). If hinting or setting session parameters is possible then setting optimizer_dynamic_sampling to level 3, or possibly 4, might be sufficient.
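
For reference, here’s what those options look like in practice – a minimal sketch (the query is just a stand-in to carry the hints):

alter session set cursor_sharing = exact;
alter session set optimizer_dynamic_sampling = 3;

-- or, at the statement level (stand-in query):
select  /*+ cursor_sharing_exact dynamic_sampling(4) */
        count(*)
from    dual
;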

The messy expression combining month and year is a crippling handicap to the optimizer – so fixing the query to make the literals visible isn’t actually going to help. This is Oracle 12c, though – so we could add a virtual date column (declared as invisible to avoid the threat of inserts that don’t specify column lists) and gather stats on it. The combination of virtual column and literal values might give the optimizer the information it really needs. Here’s a little script to demonstrate:


rem
rem     Script:         virtual_study.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem     Purpose:
rem
rem     Last tested
rem             12.1.0.2

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        sysdate - (5 * 365) + rownum / 550      d1,
        to_number(
                to_char(
                        (sysdate - (5 * 365) + rownum / 550),
                        'MM'
                )
        )                                       month,
        to_number(
                to_char(
                        (sysdate - (5 * 365) + rownum / 550),
                        'YYYY'
                )
        )                                       year,
        lpad(rownum,10,'0')                     v1
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1 for columns month size 12 for columns year size 6'
        );
end;
/

I’ve created a table with a million rows with data going back roughly 5 years from current date, which means I need roughly 550 rows per day. I’ve then created histograms on the month and year columns to match the original posting. Now I’ll set up the bind variables and values specified by the OP and run a simple query to show the date information that the bind variables give, and the 1/400 selectivity of the OP’s predicate:


var SYS_B_00 varchar2(32);
var SYS_B_01 varchar2(32);
var SYS_B_02 varchar2(32);
var SYS_B_03 varchar2(32);
var SYS_B_04 varchar2(32);
var SYS_B_05 varchar2(32);
var SYS_B_06 number;
var SYS_B_07 varchar2(32);
var SYS_B_08 varchar2(32);
var SYS_B_09 varchar2(32);
var SYS_B_10 number;

exec :SYS_B_00:='01/';
exec :SYS_B_01:='/';
exec :SYS_B_02:='dd/MM/yyyy';
exec :SYS_B_03:='10/04/2018';
exec :SYS_B_04:='MM/dd/yyyy';
exec :SYS_B_05:='q';
exec :SYS_B_06:=12;
exec :SYS_B_07:='10/04/2018';
exec :SYS_B_08:='MM/dd/yyyy';
exec :SYS_B_09:='q';
exec :SYS_B_10:=1;

select
        to_date(:sys_b_00||to_char(month)||:sys_b_01||to_char(year),:sys_b_02)  d1, 
        add_months(trunc(to_date(:sys_b_03,:sys_b_04),:sys_b_05),(-:sys_b_06))  c1,
        to_date(:sys_b_00||to_char(month)||:sys_b_01||to_char(year),:sys_b_02)  d2,
        trunc(to_date(:sys_b_07,:sys_b_08),:sys_b_09)-:sys_b_10                 c2
from
        t1
where
        rownum = 1
;

set serveroutput off
alter session set statistics_level = all;

select  count(*)
from    t1
where
        (    to_date(:sys_b_00||to_char(month)||:sys_b_01||to_char(year),:sys_b_02) >= add_months(trunc(to_date(:sys_b_03,:sys_b_04),:sys_b_05),(-:sys_b_06)) 
         and to_date(:sys_b_00||to_char(month)||:sys_b_01||to_char(year),:sys_b_02) <= trunc(to_date(:sys_b_07,:sys_b_08),:sys_b_09)-:sys_b_10 )
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

===========================================

D1        C1        D2        C2
--------- --------- --------- ---------
01-DEC-13 01-OCT-17 01-DEC-13 30-SEP-18


  COUNT(*)
----------
    200750

--------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:07.39 |    4980 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:07.39 |    4980 |
|*  2 |   FILTER            |      |      1 |        |    200K|00:00:06.42 |    4980 |
|*  3 |    TABLE ACCESS FULL| T1   |      1 |   2500 |    200K|00:00:04.59 |    4980 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(TRUNC(TO_DATE(:SYS_B_07,:SYS_B_08),:SYS_B_09)-:SYS_B_10 .ge. ADD_MON
              THS(TRUNC(TO_DATE(:SYS_B_03,:SYS_B_04),:SYS_B_05),(-:SYS_B_06)))
   3 - filter((TO_DATE(:SYS_B_00||TO_CHAR("MONTH")||:SYS_B_01||TO_CHAR("YEAR")
              ,:SYS_B_02) .ge. ADD_MONTHS(TRUNC(TO_DATE(:SYS_B_03,:SYS_B_04),:SYS_B_05),(-:SYS_B
              _06)) AND TO_DATE(:SYS_B_00||TO_CHAR("MONTH")||:SYS_B_01||TO_CHAR("YEAR"),:SYS
              _B_02) .le. TRUNC(TO_DATE(:SYS_B_07,:SYS_B_08),:SYS_B_09)-:SYS_B_10))


Note: in this and subsequent execution plans I’ve had to use .le. to represent “less than or equal to” and .ge. to represent “greater than or equal to”.

This shows us that the first row in my table has a date component of 1st Dec 2013, while the date range required by the OP was one year’s worth of data between 1st Oct 2017 and 30th Sept 2018. The optimizer’s estimate of 2,500 rows out of 1M is the 1/400 we expect.

Let’s test the effect of running the query using literals (i.e. stopping the “cursor_sharing = force” effect that was operating in the OP’s environment):


select
        count(*)
from    t1
where
        (    to_date('01/'||to_char(month)||'/'||to_char(year),'dd/MM/yyyy') >= add_months(trunc(to_date('10/04/2018','dd/MM/yyyy'),'q'),(-12)) 
         and to_date('01/'||to_char(month)||'/'||to_char(year),'dd/MM/yyyy') <= trunc(to_date('10/04/2018','dd/MM/yyyy'),'q')-1 )
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

========================================================

 COUNT(*)
----------
    200750


--------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |   892 (100)|      1 |00:00:05.17 |    4980 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |            |      1 |00:00:05.17 |    4980 |
|*  2 |   TABLE ACCESS FULL| T1   |      1 |   2500 |   892  (30)|    200K|00:00:04.30 |    4980 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((TO_DATE('01/'||TO_CHAR("MONTH")||'/'||TO_CHAR("YEAR"),'dd/MM/yyyy') .ge. TO_DAT
              E(' 2017-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              TO_DATE('01/'||TO_CHAR("MONTH")||'/'||TO_CHAR("YEAR"),'dd/MM/yyyy') .le. TO_DATE(' 2018-03-31
              00:00:00', 'syyyy-mm-dd hh24:mi:ss')))


We can see that the literals have echoed through the plan to the predicate section, but the optimizer hasn’t changed its estimate. Let’s create the virtual column, gather stats on it, and try again:


alter table t1 add v_date invisible generated always as (
        to_date('01/'||to_char(month)||'/'||to_char(year),'dd/MM/yyyy')
) virtual
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for columns v_date size 1')

select  /* virtual column */
        count(*)
from    t1
where
        (    to_date('01/'||to_char(month)||'/'||to_char(year),'dd/MM/yyyy') >= add_months(trunc(to_date('10/04/2018','dd/MM/yyyy'),'q'),(-12)) 
         and to_date('01/'||to_char(month)||'/'||to_char(year),'dd/MM/yyyy') <= trunc(to_date('10/04/2018','dd/MM/yyyy'),'q')-1 )
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

=======================================================================

 COUNT(*)
----------
    200750

--------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |   950 (100)|      1 |00:00:06.27 |    4980 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |            |      1 |00:00:06.27 |    4980 |
|*  2 |   TABLE ACCESS FULL| T1   |      1 |    236K|   950  (34)|    200K|00:00:04.78 |    4980 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((TO_DATE('01/'||TO_CHAR("MONTH")||'/'||TO_CHAR("YEAR"),'dd/MM/yyyy') .ge. TO_DAT
              E(' 2017-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              TO_DATE('01/'||TO_CHAR("MONTH")||'/'||TO_CHAR("YEAR"),'dd/MM/yyyy') .le. TO_DATE(' 2018-03-31
              00:00:00', 'syyyy-mm-dd hh24:mi:ss')))



The optimizer sees that the expression involving month and year matches the virtual column definition, evaluates the two date expressions to produce simple constants, and gives us a cardinality estimate in the right ballpark.
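
If you want to confirm that the (invisible) virtual column is in place and has statistics, a quick query against user_tab_cols will do – a minimal sketch:

select  column_name, virtual_column, hidden_column, num_distinct, histogram
from    user_tab_cols
where   table_name  = 'T1'
and     column_name = 'V_DATE'
;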

Conclusion

Cursor sharing and “big” queries don’t mix. If you have queries that have to manipulate large volumes of data then the overhead of optimising each one separately is likely to be insignificant, and the threat of cardinality errors introduced by bind variables being re-used could be significant.

If you have to make use of an existing (bad) table definition, and can’t manage to write predicates that allow the optimizer to use existing column statistics, remember that you might be able to create a virtual (and invisible) column that captures the necessary definition, thereby allowing you to give Oracle some statistics about the critical predicate.

Footnote

In case you didn’t want to scan through the ODC page, here’s the full SQL Monitor output for the original query:


Global Stats
==============================================================================================
| Elapsed |   Cpu   |    IO    | Cluster  |  Other   | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) |  Gets  | Reqs | Bytes | Reqs  | Bytes |
==============================================================================================
|     320 |      76 |      140 |       39 |       66 |     8M | 257K |   2GB |  1528 | 306MB |
==============================================================================================
 
 
SQL Plan Monitoring Details (Plan Hash Value=3210215320)
=================================================================================================================================================================================================================================
| Id    |            Operation            |         Name            |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Write | Write | Mem  | Temp | Activity |       Activity Detail       | Progress | 
|       |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes | Reqs  | Bytes |      |      |   (%)    |         (# samples)         |          |
=================================================================================================================================================================================================================================
|  -> 0 | SELECT STATEMENT                |                         |         |       |       180 |   +142 |     1 |        0 |       |       |       |       |      |      |          |                             |          |
|  -> 1 |   SORT UNIQUE                   |                         |    1093 | 52574 |       180 |   +142 |     1 |        0 |       |       |   534 | 107MB |   2M | 113M |     0.94 | Cpu (3)                     |          |
|  -> 2 |    NESTED LOOPS                 |                         |    1093 | 52573 |       180 |   +142 |     1 |       3M |       |       |       |       |      |      |     0.31 | Cpu (1)                     |          |
|  -> 3 |     NESTED LOOPS                |                         |    1118 | 52573 |       180 |   +142 |     1 |       3M |       |       |       |       |      |      |     0.31 | Cpu (1)                     |          |
|  -> 4 |      HASH JOIN RIGHT SEMI       |                         |    1118 | 52238 |       189 |   +133 |     1 |       3M |       |       |       |       | 153M |      |     1.57 | Cpu (5)                     |          |
|     5 |       VIEW                      |                         |    157K | 31145 |         9 |   +134 |     1 |       2M |       |       |       |       |      |      |          |                             |          |
|     6 |        WINDOW SORT              |                         |    157K | 31145 |        57 |    +86 |     1 |       4M |  3777 | 199MB |   994 | 199MB |      |      |     3.14 | Cpu (5)                     |     100% |
|       |                                 |                         |         |       |           |        |       |          |       |       |       |       |      |      |          | direct path read temp (5)   |          |
|     7 |         HASH JOIN               |                         |    157K | 29653 |        50 |    +85 |     1 |       4M |       |       |       |       |      |      |     1.26 | Cpu (4)                     |          |
|     8 |          VIEW                   |                         |   81771 | 23273 |         1 |    +86 |     1 |       1M |       |       |       |       |      |      |          |                             |          |
|     9 |           HASH UNIQUE           |                         |   81771 | 23273 |        75 |    +12 |     1 |       1M |       |       |       |       |      |      |     1.89 | Cpu (6)                     |          |
|    10 |            FILTER               |                         |         |       |        78 |     +9 |     1 |      11M |       |       |       |       |      |      |     0.31 | Cpu (1)                     |          |
|    11 |             INDEX FULL SCAN     | PK_HOUSEHOLD_GDC        |   83917 | 22799 |        86 |     +1 |     1 |      11M |     9 | 73728 |       |       |      |      |    24.21 | Cpu (77)                    |          |
|    12 |          INDEX FULL SCAN        | PK_ADV_HOUSEHOLD_ACCT   |      8M |  6332 |        49 |    +86 |     1 |       8M |       |       |       |       |      |      |    12.58 | gc cr block 2-way (37)      |          |
|       |                                 |                         |         |       |           |        |       |          |       |       |       |       |      |      |          | gc current block 2-way (3)  |          |
| -> 13 |       INDEX FULL SCAN           | PK_ADV_HOUSEHOLD_ACCT   |      8M |  6332 |       180 |   +142 |     1 |       7M |       |       |       |       |      |      |     0.63 | Cpu (2)                     |          |
| -> 14 |      INDEX RANGE SCAN           | IDX4_LPL_BETA_CUST_RLTN |       1 |     1 |       181 |   +141 |    3M |       3M | 75759 | 592MB |       |       |      |      |    23.27 | gc current grant 2-way (1)  |          |
|       |                                 |                         |         |       |           |        |       |          |       |       |       |       |      |      |          | Cpu (21)                    |          |
|       |                                 |                         |         |       |           |        |       |          |       |       |       |       |      |      |          | db file parallel read (52)  |          |
| -> 15 |     TABLE ACCESS BY INDEX ROWID | IMPL_LPL_BETA_CUST_RLTN |       1 |     1 |       180 |   +142 |    3M |       3M |  177K |   1GB |       |       |      |      |    29.56 | Cpu (12)                    |          |
|       |                                 |                         |         |       |           |        |       |          |       |       |       |       |      |      |          | db file parallel read (81)  |          |
|       |                                 |                         |         |       |           |        |       |          |       |       |       |       |      |      |          | db file sequential read (1) |          |
=================================================================================================================================================================================================================================

December 7, 2018

Plans and Trees

Filed under: Execution plans,Oracle — Jonathan Lewis @ 5:58 pm GMT Dec 7,2018

Prompted by a question on the ODC database forum – and also because I failed to get to the “Bonus slides” in my presentation on basic execution plans at both the DOAG and UKOUG conferences – here’s a small set of slides demonstrating how to convert a text execution plan into a tree that you can read using the mechanism described in Oracle’s white paper by the phrase: “start from the bottom left and work across and then up”.

The file is a Microsoft Powerpoint file (early version).

 

November 26, 2018

Shrink Space

Filed under: dbms_xplan,Execution plans,Oracle,Performance,subqueries — Jonathan Lewis @ 4:37 pm GMT Nov 26,2018

I have never been keen on the option to “shrink space” for a table because of the negative impact it can have on performance.

I don’t seem to have written about it in the blog but I think there’s something in one of my books pointing out that the command moves data from the “end” of the table (high extent ids) to the “start” of the table (low extent ids) by scanning the table backwards to find data that can be moved and scanning forwards to find space to put it. This strategy can have the effect of increasing the scattering of the data that you’re interested in querying if most of your queries are about “recent” data and you have a pattern of slowly deleting aging data. (You may end up doing a range scan through a couple of hundred table blocks for data at the start of the table that was once packed into a few blocks near the end of the table.)

In a discussion with a member of the audience at the recent DOAG conference (we were talking about execution plans for queries that included filter subqueries) I suddenly thought of another reason why (for an unlucky person) the shrink space command could be a disaster – here’s a little fragment of code and output to demonstrate the point.


rem
rem     Script:         shrink_scalar_subq.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2018
rem     Purpose:
rem
rem     Versions tested
rem             12.2.0.1
rem

select
        /*+ gather_plan_statistics pre-shrink */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from
                emp outer
        where
                outer.sal > (
                        select  /*+ no_unnest */
                                avg(inner.sal)
                        from
                                emp inner
                        where
                                inner.dept_no = outer.dept_no
                )
        )
;

alter table emp enable row movement;
alter table emp shrink space compact;

select
        /*+ gather_plan_statistics post-shrink  */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from emp outer
        where outer.sal >
                (
                        select /*+ no_unnest */ avg(inner.sal)
                        from emp inner
                        where inner.dept_no = outer.dept_no
                )
        )
;

The two queries are the same and the execution plans are the same (the shrink command doesn’t change the object statistics, after all), but the execution time jumped from 0.05 seconds to 9.43 seconds – and the difference in timing wasn’t about delayed block cleanout or other exotic side effects.


  COUNT(*)
----------
      9498

Elapsed: 00:00:00.05


  COUNT(*)
----------
      9498

Elapsed: 00:00:09.43

The query is engineered to have a problem, of course, and enabling rowsource execution statistics exaggerates the anomaly – but the threat is genuine. You may have seen my posting (now 12 years old) about the effects of scalar subquery caching – this is another example of the wrong item of data appearing in the wrong place making us lose the caching benefit. (Roughly speaking, the run-time engine caches subquery results in a small hash table keyed on the input value – dept_no here – so if an unlucky value arrives early and collides with a popular value, the subquery has to be re-executed for every row holding the popular value.) The emp table I’ve used here is (nearly) the same emp table I used in the 2006 posting, but the difference between the two cases is that in 2006 I updated a carefully selected row to an unlucky value, while here in 2018 the side effects of a call to shrink space moved a row from the end of the table (where it was doing no harm) to the start of the table (where it had a disastrous impact).
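
If you repeat the experiment you can watch the unlucky row move – a minimal sketch checking the block number of the row inserted at the end of the footnote code (emp_no = 20001) before and after the shrink; the actual block numbers will vary:

select  dbms_rowid.rowid_block_number(rowid) block_no   -- before the shrink
from    emp
where   emp_no = 20001
;

alter table emp enable row movement;
alter table emp shrink space compact;

select  dbms_rowid.rowid_block_number(rowid) block_no   -- after the shrink
from    emp
where   emp_no = 20001
;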

Here are the two execution plans – before and after the shrink space – showing the rowsource execution stats. Note particularly the number of times the filter subquery ran – jumping from 7 to 3172 – the impact this has on the buffer gets, and the change in time recorded:

----------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |      1 |00:00:00.03 |    1880 |
|   1 |  SORT AGGREGATE       |      |      1 |      1 |      1 |00:00:00.03 |    1880 |
|   2 |   VIEW                |      |      1 |    136 |   9498 |00:00:00.03 |    1880 |
|*  3 |    FILTER             |      |      1 |        |   9498 |00:00:00.03 |    1880 |
|   4 |     TABLE ACCESS FULL | EMP  |      1 |  19001 |  19001 |00:00:00.01 |     235 |
|   5 |     SORT AGGREGATE    |      |      7 |      1 |      7 |00:00:00.02 |    1645 |
|*  6 |      TABLE ACCESS FULL| EMP  |      7 |   2714 |  19001 |00:00:00.02 |    1645 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("OUTER"."SAL">)
   6 - filter("INNER"."DEPT_NO"=:B1)


----------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |      1 |00:00:09.42 |     745K|
|   1 |  SORT AGGREGATE       |      |      1 |      1 |      1 |00:00:09.42 |     745K|
|   2 |   VIEW                |      |      1 |    136 |   9498 |00:00:11.71 |     745K|
|*  3 |    FILTER             |      |      1 |        |   9498 |00:00:11.70 |     745K|
|   4 |     TABLE ACCESS FULL | EMP  |      1 |  19001 |  19001 |00:00:00.01 |     235 |
|   5 |     SORT AGGREGATE    |      |   3172 |      1 |   3172 |00:00:09.40 |     745K|
|*  6 |      TABLE ACCESS FULL| EMP  |   3172 |   2714 |     10M|00:00:04.33 |     745K|
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("OUTER"."SAL">)
   6 - filter("INNER"."DEPT_NO"=:B1)


Footnote:

For completeness, here’s the code to generate the emp table. It’s sitting in a tablespace using system managed extents and automatic segment space management.


create table emp(
        dept_no         not null,
        sal,
        emp_no          not null,
        padding,
        constraint e_pk primary key(emp_no)
)
as
with generator as (
        select  null
        from    dual
        connect by
                level <= 1e4 -- > comment to avoid wordpress format issue
)
select
        mod(rownum,6),
        rownum,
        rownum,
        rpad('x',60)
from
        generator       v1,
        generator       v2
where
        rownum <= 2e4 -- > comment to avoid wordpress format issue
;


insert into emp values(432, 20001, 20001, rpad('x',60));
delete /*+ full(emp) */ from emp where emp_no <= 1000;      -- > comment to avoid wordpress format issue
commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'EMP',
                method_opt       => 'for all columns size 1'
        );
end;
/



 

November 19, 2018

Table order

Filed under: ANSI Standard,Execution plans,Oracle,Tuning — Jonathan Lewis @ 1:30 pm GMT Nov 19,2018

Over the last few days I’ve highlighted on Twitter a couple of older posts showing how a change in the order that tables appear in the from clause could affect the execution plan of a query. In one case the note was purely theoretical describing a feature of the way the optimizer works with simple query blocks, in the other case the note was about an anomaly with table elimination that could appear with both “ANSI” and “traditional” Oracle syntax.

Here’s another note that might be more generally useful – an example of an odd side effect of ordering and “ANSI” syntax, with a suggestion for a pattern for writing ANSI SQL. It’s based on a test I wrote to play around with a problem that showed up on the Oracle database forum more than six years ago and shows a strange inconsistency. The setup is a little long-winded as the example involves 4 tables, so I’ll leave the script to create, load and index the tables to the end of the note. Here’s the query that introduced the problem; it’s a fairly straightforward 4 table join with two (left) outer joins:


select
        episode.episode_id , episode.cross_ref_id , episode.date_required ,
        product.number_required,
        request.site_id
from
        episode
left join
        request
on      episode.cross_ref_id = request.cross_ref_id
join
        product
ON      episode.episode_id = product.episode_id
left join
        product_sub_type
ON      product.prod_sub_type_id = product_sub_type.prod_sub_type_id
where
        episode.department_id = 2
and     product.status = 'I'
order by
        episode.date_required
;

And here’s the execution plan:


----------------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         | 33333 |  1725K|       | 17135   (4)| 00:00:01 |
|   1 |  SORT ORDER BY       |         | 33333 |  1725K|  2112K| 17135   (4)| 00:00:01 |
|*  2 |   HASH JOIN OUTER    |         | 33333 |  1725K|  1632K| 16742   (4)| 00:00:01 |
|*  3 |    HASH JOIN         |         | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| PRODUCT | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| EPISODE |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    TABLE ACCESS FULL | REQUEST |  4000K|    57M|       | 13542   (3)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EPISODE"."CROSS_REF_ID"="REQUEST"."CROSS_REF_ID"(+))
   3 - access("EPISODE"."EPISODE_ID"="PRODUCT"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

The first thing you’ll notice, of course, is that the plan reports a three table join. Thanks to various referential integrity constraints, the absence of the table in the final select list, and the nature of the join to that table, the optimizer has determined that the product_sub_type table could be eliminated from the join without changing the result set.

What you can’t tell from the plan is that there’s an index on the request table that holds all the columns needed to satisfy the query, and an index fast full scan on the index would be significantly more efficient than the tablescan that appears at operation 6.

Having noticed from the plan that product_sub_type is redundant, the obvious thing to do before investigating further is to rewrite the statement to remove the table. Here’s the resulting query, with execution plan:
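
(Reconstructed for completeness – the rewrite is simply the original statement with the left join to product_sub_type removed:)

select
        episode.episode_id , episode.cross_ref_id , episode.date_required ,
        product.number_required,
        request.site_id
from
        episode
left join
        request
on      episode.cross_ref_id = request.cross_ref_id
join
        product
ON      episode.episode_id = product.episode_id
where
        episode.department_id = 2
and     product.status = 'I'
order by
        episode.date_required
;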

----------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 33333 |  1725K|       |  5525   (6)| 00:00:01 |
|   1 |  SORT ORDER BY         |             | 33333 |  1725K|  2112K|  5525   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |             | 33333 |  1725K|  1632K|  5132   (7)| 00:00:01 |
|*  3 |    HASH JOIN           |             | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | PRODUCT     | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL  | EPISODE     |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    INDEX FAST FULL SCAN| IX4_REQUEST |  4000K|    57M|       |  1932   (7)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EPISODE"."CROSS_REF_ID"="REQUEST"."CROSS_REF_ID"(+))
   3 - access("EPISODE"."EPISODE_ID"="PRODUCT"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

So, when the optimizer removes the product_sub_type table from the query the plan reports a tablescan of request, but when we remove product_sub_type ourselves the plan reports an index fast full scan of an appropriate index – which, judging by the costs (1,932 vs. 13,542), is roughly one seventh of the size of the table. It’s a little surprising that the optimizer didn’t get it right by itself – but “ANSI” style SQL often displays quirky little side effects because of the way the optimizer transforms it into traditional Oracle style.

We could stop at that point, of course, but then you’d wonder about the significance of the title of the post. So let’s play around with the join order of the original query, without removing the product_sub_type table.

As a general strategy (though not an absolute rule) I tend to arrange code so that outer joins don’t appear before “inner” joins. In this example that means I would have written the original statement as follows:


select
        episode.episode_id, episode.cross_ref_id, episode.date_required,
        product.number_required,
        request.site_id
from
        episode
join
        product
ON      product.episode_id = episode.episode_id
left join
        product_sub_type
ON      product_sub_type.prod_sub_type_id = product.prod_sub_type_id
left join
        request
on      request.cross_ref_id = episode.cross_ref_id
where
        episode.department_id = 2
and     product.status        = 'I'
order by
        episode.date_required
;

All I’ve done is move the join between episode and product up the SQL, following it with the outer join to product_sub_type, finally closing with the outer join between episode and request. Here’s the execution plan – which you might expect to look exactly like the original plan:


----------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 33333 |  1725K|       |  5525   (6)| 00:00:01 |
|   1 |  SORT ORDER BY         |             | 33333 |  1725K|  2112K|  5525   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |             | 33333 |  1725K|  1632K|  5132   (7)| 00:00:01 |
|*  3 |    HASH JOIN           |             | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | PRODUCT     | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL  | EPISODE     |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    INDEX FAST FULL SCAN| IX4_REQUEST |  4000K|    57M|       |  1932   (7)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("REQUEST"."CROSS_REF_ID"(+)="EPISODE"."CROSS_REF_ID")
   3 - access("PRODUCT"."EPISODE_ID"="EPISODE"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

The product_sub_type table has been eliminated and we’re doing an index fast full scan of the ix4_request index instead of a tablescan of the much larger request table.

tl;dr

Changing the order of the tables in an ANSI join – especially when there are outer joins involved – could make a significant difference to the way the query is transformed and optimised. While it is nice to write the table ordering so that “chains” of joins are easily visible, bear in mind that re-ordering the join to postpone outer joins may be enough to help the optimizer produce a better execution plan.

Footnote

If you want to play around with the example, here’s the code to create and load the tables. The code doesn’t follow my usual style as most of it is cut-n-pasted from the Oracle forum thread:


rem
rem     script:         Ansi_outer_5.sql
rem     Dated:          July 2012
rem     Author:         Jonathan Lewis
rem
rem     Last tested
rem             18.3.0.0        iffs still not used by default
rem             12.2.0.1        iffs still not used by default
rem

create table episode (
        episode_id number (*,0),
        department_id number (*,0),
        date_required date,
        cross_ref_id varchar2 (11),
        padding varchar2 (80),
        constraint pk_episode primary key (episode_id)
)
;

create table product_sub_type (
        prod_sub_type_id number (*,0),
        sub_type_name varchar2 (20),
        units varchar2 (20),
        padding varchar2 (80),
        constraint pk_product_sub_type primary key (prod_sub_type_id)
)
;

create table product (
        product_id number (*,0),
        prod_type_id number (*,0),
        prod_sub_type_id number (*,0),
        episode_id number (*,0),
        status varchar2 (1),
        number_required number (*,0),
        padding varchar2 (80),
        constraint pk_product primary key (product_id),
        constraint nn_product_episode check (episode_id is not null) 
)
;

alter table product add constraint fk_product 
        foreign key (episode_id) references episode (episode_id)
;

alter table product add constraint fk_prod_sub_type
        foreign key (prod_sub_type_id) references product_sub_type (prod_sub_type_id)
;

create table request (
        request_id number (*,0),
        department_id number (*,0),
        site_id number (*,0),
        cross_ref_id varchar2 (11),
        padding varchar2 (80),
        padding2 varchar2 (80),
        constraint pk_request primary key (request_id),
        constraint nn_request_department check (department_id is not null),
        constraint nn_request_site_id check (site_id is not null)
)
;

prompt  ===================
prompt  Loading episode ...
prompt  ===================

insert /*+ append */ into episode
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, 2,
    sysdate + mod (r, 14),
    to_char (r, '0000000000'),
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ' || to_char (r, '000000')
  from generator g
where g.r <= 3e5
/ 

commit;

prompt  ============================
prompt  Loading product_sub_type ...
prompt  ============================

insert /*+ append */ into product_sub_type
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, 
       to_char (r, '000000'),
       to_char (mod (r, 3), '000000'),
       'ABCDE' || to_char (r, '000000')
  from generator g
where g.r <= 15
/ 

commit;

prompt  ===================
prompt  Loading product ...
prompt  ===================

insert /*+ append */ into product
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, mod (r, 12) + 1, mod (r, 15) + 1, mod (r, 300000) + 1,
       decode (mod (r, 3), 0, 'I', 1, 'C', 2, 'X', 'U'),
       dbms_random.value (1, 100), NULL
  from generator g
where g.r <= 1e5
/ 

commit;

prompt  ===================
prompt  Loading request ...
prompt  ===================

insert /*+ append */ into request
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e7
       ) 
select 
        r, mod (r, 4) + 1, 1, to_char (r, '0000000000'),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234567890123456789' || to_char (r, '000000'),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789012345678' || to_char (r, '000000')
  from generator g
where g.r <= 4e6
/ 

commit;

create index ix1_episode_cross_ref on episode (cross_ref_id);

create index ix1_product_episode on product (episode_id);
create index ix2_product_type on product (prod_type_id);

create index ix1_request_site on request (site_id);
create index ix2_request_dept on request (department_id);
create index ix3_request_cross_ref on request (cross_ref_id);
create index ix4_request on request (cross_ref_id, site_id);

exec dbms_stats.gather_schema_stats ('test_user')

Note that there is a call to gather_schema_stats() at the end, rather than a set of 4 calls to gather_table_stats(); you may want to change this. The entire data set, including indexes, will need about 1.5GB of free space.
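
If you do want the per-table approach, it would look something like this – a minimal sketch using the same schema name as the script:

exec dbms_stats.gather_table_stats('test_user', 'EPISODE')
exec dbms_stats.gather_table_stats('test_user', 'PRODUCT_SUB_TYPE')
exec dbms_stats.gather_table_stats('test_user', 'PRODUCT')
exec dbms_stats.gather_table_stats('test_user', 'REQUEST')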

 

November 15, 2018

num_index_keys

Filed under: 12c,Bugs,CBO,Execution plans,Hints,Oracle — Jonathan Lewis @ 1:13 pm GMT Nov 15,2018

The title is the name of an Oracle hint that came into existence in Oracle 10.2.0.3 and made an appearance recently in a question on the rarely used “My Oracle Support” Community forum (you’ll need a MOS account to be able to read the original). I wouldn’t have found it, but the author also emailed me the link asking if I could take a look at it. (If you want to ask me for help – without paying me, that is – then posting a public question in the Oracle (ODC) General Database or SQL forums and emailing me a private link is the strategy most likely to get an answer, by the way.)

The question was about a very simple query using a straightforward index – with a quirky change of plan after upgrading from 10.2.0.3 to 12.2.0.1. Setting the optimizer_features_enable to ‘10.2.0.3’ in the 12.2.0.1 system re-introduced the 10g execution plan. Here’s the query:


SELECT t1.*
   FROM   DW1.t1
  WHERE   t1.C1 = '0001' 
    AND   t1.C2 IN ('P', 'F', 'C')
    AND   t1.C3 IN (
                    '18110034450001',
                    '18110034450101',
                    '18110034450201',
                    '18110034450301',
                    '18110034450401',
                    '18110034450501'
          );
 

Information supplied: t1 holds about 500 million rows at roughly 20 rows per block, the primary key index is (c1, c2, c3, c4), there are just a few values for each of c1, c2 and c4, while c3 is “nearly unique” (which, for clarity, was expanded to “the number of distinct values of c3 is virtually the same as the number of rows in the table”).

At the moment we don’t have any information about histograms and we don’t know whether or not “nearly unique” might still allow a few values of c3 to have a large number of duplicates, so that’s something we might want to follow up on later.
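
A query along the following lines would be the obvious follow-up – a sketch using the owner and table name from the original post:

select  column_name, num_distinct, num_nulls, density, histogram
from    all_tab_col_statistics
where   owner       = 'DW1'          -- owner/table as supplied by the OP
and     table_name  = 'T1'
and     column_name in ('C1', 'C2', 'C3', 'C4')
order by
        column_name
;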

Here are the execution plans – the fast one (from 10g) first, then the slow (12c) plan – and you should look carefully at the predicate section of the two plans:


10g (pulled from memory with rowsource execution statistics enabled)
--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |      1 |        |      6 |00:00:00.01 |      58 |      5 |
|   1 |  INLIST ITERATOR             |                  |      1 |        |      6 |00:00:00.01 |      58 |      5 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1               |     18 |      5 |      6 |00:00:00.01 |      58 |      5 |
|*  3 |    INDEX RANGE SCAN          | PK_T1            |     18 |      5 |      6 |00:00:00.01 |      52 |      4 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."C1"='0001' AND (("T1"."C2"='C' OR "T1"."C2"='F' OR
              "T1"."C2"='P')) AND (("C3"='18110034450001' OR "C3"='18110034450101' OR
              "C3"='18110034450201' OR "C3"='18110034450301' OR "C3"='18110034450401' OR
              "C3"='18110034450501')))

 

12c (from explain plan)
---------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                  |     1 |   359 |     7   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR                     |                  |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1               |     1 |   359 |     7   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | PK_T1            |     1 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."C1"='0001' AND ("T1"."C2"='C' OR "T1"."C2"='F' OR
              "T1"."C2"='P'))
       filter("C3"='18110034450001' OR "C3"='18110034450101' OR
              "C3"='18110034450201' OR "C3"='18110034450301' OR
              "C3"='18110034450401' OR "C3"='18110034450501')
  

When comparing plans it’s better, of course, to present the same sources from the two systems; it’s not entirely helpful to have the generated plan from explain plan in one version and a run-time plan with stats in the other – given the choice I’d like to see the run-time plan from both. Despite this, I felt fairly confident that the prediction would match the run-time for 12c and that I could at least guess the “starts” figure for 12c.

The important thing to notice is the way that the access predicate in 10g has split into an access predicate followed by a filter predicate in 12c. So 12c is going to iterate three times (once for each of the values ‘C’, ‘F’, ‘P’) and then walk a potentially huge linked list of index leaf blocks looking for 6 values of c3, while 10g is going to probe the index 18 times (3 values of c2 x 6 values of c3) to find “nearly unique” rows – which probably means one leaf block per probe.

The 12c plan was taking minutes to run, the 10g plan was taking less than a second. The difference in execution time was probably the effect of the 12c plan ranging through (literally) thousands of index leaf blocks.

There are many bugs and anomalies relating to in-list iteration and index range scans and cardinality calculations – here’s a quick sample of v$system_fix_control in 12.2.0.1:


select optimizer_feature_enable ofe, sql_feature, bugno, description
from v$system_fix_control
where
	optimizer_feature_enable between '10.2.0.4' and '12.2.0.1'
and	(   sql_feature like '%CBO%'
	 or sql_feature like '%CARDINALITY%'
	)
and	(    lower(description) like '%list%'
	 or  lower(description) like '%iterat%'
	 or  lower(description) like '%multi%col%'
	)
order by optimizer_feature_enable, sql_feature, bugno
;

OFE        SQL_FEATURE                      BUGNO DESCRIPTION
---------- --------------------------- ---------- ----------------------------------------------------------------
10.2.0.4   QKSFM_CBO_5259048              5259048 undo unused inlist
           QKSFM_CBO_5634346              5634346 Relax equality operator restrictions for multicolumn inlists

10.2.0.5   QKSFM_CBO_7148689              7148689 Allow fix of bug 2218788 for in-list predicates

11.1.0.6   QKSFM_CBO_5139520              5139520 kkoDMcos: For PWJ on list dimension, use part/subpart bits

11.2.0.1   QKSFM_CBO_6818410              6818410 eliminate redundant inlist predicates

11.2.0.2   QKSFM_CBO_9069046              9069046 amend histogram column tracking for multicolumn stats

11.2.0.3   QKSFM_CARDINALITY_11876260    11876260 use index filter inlists with extended statistics
           QKSFM_CBO_10134677            10134677 No selectivity for transitive inlist predicate from equijoin
           QKSFM_CBO_11834739            11834739 adjust NDV for list partition key column after pruning
           QKSFM_CBO_11853331            11853331 amend index cost compare with inlists as filters
           QKSFM_CBO_12591120            12591120 check inlist out-of-range values with extended statistics

11.2.0.4   QKSFM_CARDINALITY_12828479    12828479 use dynamic sampling cardinality for multi-column join key check
           QKSFM_CARDINALITY_12864791    12864791 adjust for NULLs once for multiple inequalities on nullable colu
           QKSFM_CARDINALITY_13362020    13362020 fix selectivity for skip scan filter with multi column stats
           QKSFM_CARDINALITY_14723910    14723910 limit multi column group selectivity due to NDV of inlist column
           QKSFM_CARDINALITY_6873091      6873091 trim histograms based on in-list predicates
           QKSFM_CBO_13850256            13850256 correct estimates for transitive inlist predicate with equijoin

12.2.0.1   QKSFM_CARDINALITY_19847091    19847091 selectivity caching for inlists
           QKSFM_CARDINALITY_22533539    22533539 multi-column join sanity checks for table functions
           QKSFM_CARDINALITY_23019286    23019286 Fix cdn estimation with multi column stats on fixed data types
           QKSFM_CARDINALITY_23102649    23102649 correction to inlist element counting with constant expressions
           QKSFM_CBO_17973658            17973658 allow partition pruning due to multi-inlist iterator
           QKSFM_CBO_21057343            21057343 order predicate list
           QKSFM_CBO_22272439            22272439 correction to inlist element counting with bind variables

There are also a number of system parameters relating to inlists that are new (or have changed values) in 12.2.0.1 when compared with 10.2.0.3 – but I’m not going to go into those right now.

I was sufficiently curious about this anomaly that I emailed the OP to say I would be happy to take a look at the 10053 trace files for the query – the files probably weren’t going to be very large given that it was only a single table query – but in the end it turned out that I solved the problem before he’d had time to email them. (Warning – don’t email me a 10053 file on spec; if I want one I’ll ask for it.)

Based on the description I created an initial model of the problem – it took about 10 minutes to code:


rem     Tested on 12.2.0.1, 18.3.0.1

drop table t1 purge;

create table t1 (
	c1 varchar2(4) not null,
	c2 varchar2(1) not null,
	c3 varchar2(15) not null,
	c4 varchar2(4)  not null,
	v1 varchar2(250)
)
;

insert into t1
with g as (
	select rownum id 
	from dual
	connect by level <= 1e4 -- > hint to avoid wordpress format issue
)
select
	'0001',
	chr(65 + mod(rownum,11)),
	'18110034'||lpad(1+100*rownum,7,'0'),
	lpad(mod(rownum,9),4,'0'),
	rpad('x',250,'x')
from
	g,g
where
        rownum <= 1e5 -- > hint to avoid wordpress format issue
;


create unique index t1_i1 on t1(c1, c2, c3, c4);

begin
        dbms_stats.gather_table_stats(
                null,
                't1',
                method_opt => 'for all columns size 1'
        );
end;
/

alter session set statistics_level = all;
set serveroutput off

prompt	==========================
prompt	Default optimizer features
prompt	==========================

select
        /*+ optimizer_features_enable('12.2.0.1') */
	t1.*
FROM	t1
WHERE
	t1.c1 = '0001' 
AND	t1.c2 in ('H', 'F', 'C')
AND	t1.c3 in (
		'18110034450001',
		'18110034450101',
		'18110034450201',
		'18110034450301',
		'18110034450401',
		'18110034450501'
	)
;

select * from table(dbms_xplan.display_cursor(null,null,'cost allstats last'));

select 
        /*+ optimizer_features_enable('10.2.0.3') */
	t1.*
FROM	t1
WHERE
	t1.c1 = '0001' 
AND	t1.c2 in ('H', 'F', 'C')
AND	t1.c3 in (
		'18110034450001',
		'18110034450101',
		'18110034450201',
		'18110034450301',
		'18110034450401',
		'18110034450501'
	)
;

select * from table(dbms_xplan.display_cursor(null,null,'cost allstats last'));

alter session set statistics_level = typical;
set serveroutput on

The two queries produced the same plan – regardless of the setting for optimizer_features_enable – and it was the plan the OP had originally seen with the 10g setting:


-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |      1 |        |    20 (100)|      0 |00:00:00.01 |      35 |
|   1 |  INLIST ITERATOR             |       |      1 |        |            |      0 |00:00:00.01 |      35 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |     18 |      2 |    20   (0)|      0 |00:00:00.01 |      35 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |     18 |      2 |    19   (0)|      0 |00:00:00.01 |      35 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."C1"='0001' AND (("T1"."C2"='C' OR "T1"."C2"='F' OR "T1"."C2"='H')) AND
              (("T1"."C3"='18110034450001' OR "T1"."C3"='18110034450101' OR "T1"."C3"='18110034450201' OR
              "T1"."C3"='18110034450301' OR "T1"."C3"='18110034450401' OR "T1"."C3"='18110034450501')))

There was one important difference between the 10g and the 12c plans – in 10g the cost of the table access (hence the cost of the total query) was 20; in 12c it jumped to 28 – maybe there’s a change in the arithmetic for costing the iterator, and maybe that’s sufficient to cause a problem.

Before going further it’s worth checking what the costs would look like (and, indeed, if the plan is possible in both versions) if we force Oracle into the “bad” plan. That’s where we finally get to the hint in the title of this piece. If I add the hint /*+ num_index_keys(t1 t1_i1 2) */ what’s going to happen ? (Technically I’ve included a hint to use the index, and specified the query block name to make sure Oracle doesn’t decide to switch to a tablescan):


select
        /*+
            optimizer_features_enable('12.2.0.1')
            index_rs_asc(@sel$1 t1@sel$1 (t1.c1 t1.c2 t1.c3 t1.c4))
            num_index_keys(@sel$1 t1@sel$1 t1_i1 2)
        */
        t1.*
FROM        t1
WHERE
        t1.c1 = '0001'
AND        t1.c2 in ('H', 'F', 'C')
AND        t1.c3 in (
                '18110034450001',
                '18110034450101',
                '18110034450201',
                '18110034450301',
                '18110034450401',
                '18110034450501'
        )
;

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |      1 |        |   150 (100)|      0 |00:00:00.01 |     154 |      1 |
|   1 |  INLIST ITERATOR                     |       |      1 |        |            |      0 |00:00:00.01 |     154 |      1 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      3 |     18 |   150   (2)|      0 |00:00:00.01 |     154 |      1 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 |      3 |     18 |   142   (3)|      0 |00:00:00.01 |     154 |      1 |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."C1"='0001' AND (("T1"."C2"='C' OR "T1"."C2"='F' OR "T1"."C2"='H')))
       filter(("T1"."C3"='18110034450001' OR "T1"."C3"='18110034450101' OR "T1"."C3"='18110034450201' OR
              "T1"."C3"='18110034450301' OR "T1"."C3"='18110034450401' OR "T1"."C3"='18110034450501'))

This was the plan from 12.2.0.1 – and again the plan for 10.2.0.3 was identical except for costs which became 140 for the index range scan and 141 for the table access. At first sight it looks like 10g may be using the total selectivity of the entire query as the scaling factor for the index clustering_factor to find the table cost while 12c uses the cost of accessing the table for one iteration (rounding up) before multiplying by the number of iterations.
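
As a sketch of that hypothesis (my own notation, not Oracle’s published arithmetic): if s is the total selectivity of the predicates, CF the index clustering_factor, and N the number of combinations the inlist iterator has to walk through, the two table-cost calculations would look something like:

10g:    table cost  ~  ceil(s * CF)
12c:    table cost  ~  N * ceil((s / N) * CF)

Because the 12c form rounds up once per iteration the result can exceed the single rounded value by anything up to N-1, which would be consistent with the jump from 20 to 28 that we saw with 18 iterations.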

Having observed this detail I thought I’d do a quick test of what happened by default if I requested 145 distinct values of c3. Both versions defaulted to the access/filter path rather than the pure access path – but again there was a difference in costs. The 10g index cost was 140 with a table access cost of 158, while 12c had an index cost of 179 and a table cost of 372. So both versions switch plans at some point – do they switch at the same point ? Reader, I could not resist temptation, so I ran a test loop. With my data set the 12c version switched paths at 61 values in the in-list and 10g switched at 53 values –
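
I haven’t reproduced the loop here in its full glory, but a minimal sketch of the idea looks like this – build the in-list dynamically, explain the statement, and check whether the index line has picked up a filter predicate. (The plan line id of 3 and the pattern for generating c3 values are specific to my model.)

set serveroutput on

declare
        m_sql           varchar2(32767);
        m_filter        varchar2(4000);
begin
        for n in 2 .. 100 loop

                m_sql := q'[explain plan for select t1.* from t1 where t1.c1 = '0001' and t1.c2 in ('H','F','C') and t1.c3 in (]';

                for i in 1 .. n loop
                        m_sql := m_sql || '''18110034' || lpad(1 + 100 * i, 7, '0') || ''',';
                end loop;

                m_sql := rtrim(m_sql, ',') || ')';
                execute immediate m_sql;

                --  If the index line reports a filter predicate then c3 has
                --  dropped out of the access predicates - the "bad" path.
                select  max(filter_predicates)
                into    m_filter
                from    plan_table
                where   id = 3;

                dbms_output.put_line(
                        lpad(n, 4) || ' : ' ||
                        case when m_filter is null then 'access only' else 'access/filter' end
                );

                delete from plan_table;

        end loop;
end;
/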

Conclusion: there’s been a change in the selectivity calculations for the use of in-list iterators, which leads to a change in costs, which can lead to a change in plans; the OP was just unlucky with his data set and stats. Possibly there’s something about his data or stats that makes the switch appear with a much smaller in-list than mine.

Footnote:

When I responded to the thread on MOSC with the suggestion that the problem was in part due to statistics and might be affected by out of date stats (or a histogram on the (low-frequency) c2 column) the OP noted that stats hadn’t been gathered since some time in August – and found that the 12c path changed to the efficient (10g) one after re-gathering stats on the table.

 

November 8, 2018

Where / Having

Filed under: CBO,Conditional SQL,Execution plans,Oracle — Jonathan Lewis @ 12:11 pm GMT Nov 8,2018

There’s a very old mantra about the use of the “having” clause that tells us that if it’s valid (i.e. will always give the same results) then any predicate that could be moved from the having clause to the where clause should be moved. In recent versions of Oracle the optimizer will do this for itself in some cases but (for reasons that I’m not going to mention) I came across a silly example recently where a little manual editing produced a massive performance improvement.

Here’s a quick demo:


rem
rem     Script:         where_having.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2018
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0
rem             12.2.0.1
rem             11.2.0.4
rem

create table t1
as
select * 
from all_objects 
where rownum <= 50000   -- > comment to avoid WordPress format issue
;

spool where_having.lst

set serveroutput off

select /*+ gather_plan_statistics */ 
        object_type, count(*) 
from    t1 
group by 
        object_type 
having  count(*) > 0 
and     1 = 2
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
;

The big question is: will Oracle do a full tablescan of t1, or will it apply a “null is not null” filter early to bypass that part of the plan? Here’s the plan pulled from memory, with run-time statistics (all versions from 11g to 18c):


--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      0 |00:00:00.02 |     957 |    955 |       |       |          |
|*  1 |  FILTER             |      |      1 |        |      0 |00:00:00.02 |     957 |    955 |       |       |          |
|   2 |   HASH GROUP BY     |      |      1 |      1 |     27 |00:00:00.02 |     957 |    955 |  1186K|  1186K| 1397K (0)|
|   3 |    TABLE ACCESS FULL| T1   |      1 |  50000 |  50000 |00:00:00.01 |     957 |    955 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter((COUNT(*)>0 AND 1=2))


As you can see, the filter at operation 1 includes the contradiction “1=2”, but Oracle tests this only after doing the full tablescan and aggregation. If you move the “1=2” into the where clause the tablescan doesn’t happen.
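
For completeness, this is the version I mean – with the contradiction in the where clause the plan shows a filter(NULL IS NOT NULL) operation above a tablescan that never starts (compare the plan for the in-line view example below):

select  /*+ gather_plan_statistics */
        object_type, count(*)
from    t1
where   1 = 2
group by
        object_type
having  count(*) > 0
;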

Interestingly, if you write the query with an in-line view and trailing where clause:


select /*+ gather_plan_statistics */
        *
from    (
        select
                object_type, count(*)
        from    t1
        group by
                object_type
        having  count(*) > 0
        )
where
        1 = 2
;

The optimizer is clever enough to push the final predicate inside the view (where you might expect it to become part of the having clause) and push it all the way down into a where clause on the base table.


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      0 |00:00:00.01 |
|*  1 |  FILTER              |      |      1 |        |      0 |00:00:00.01 |
|   2 |   HASH GROUP BY      |      |      1 |      1 |      0 |00:00:00.01 |
|*  3 |    FILTER            |      |      1 |        |      0 |00:00:00.01 |
|   4 |     TABLE ACCESS FULL| T1   |      0 |  50000 |      0 |00:00:00.01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(COUNT(*)>0)
   3 - filter(NULL IS NOT NULL)



A quirky case of the optimizer handling the (apparently) more complex query better than it handles the simpler one.

September 30, 2018

Case Study

Filed under: 12c,Execution plans,Oracle,subqueries,Troubleshooting — Jonathan Lewis @ 7:59 pm GMT Sep 30,2018

A question about reading execution plans and optimising queries arrived on the ODC database forum a little while ago; the owner says the following statement is taking 14 minutes to return 30,000 rows and wants some help understanding why.

If you look at the original posting you’ll see that we’ve been given the text of the query and the execution plan including rowsource execution stats. There’s an inconsistency between the supplied information and the question asked, and I’ll get back to that shortly, but to keep this note fairly short I’ve excluded the 2nd half of the query (which is a UNION ALL) because the plan says the first part of the query took 13 minutes and 20 seconds and the user is worried about a total of 14 minutes.

SELECT /*+ gather_plan_statistics*/ DISTINCT
                rct.org_id,
                hzp.party_name,
                hca.account_number,
                rct.interface_header_attribute1 order_number,
                rct.customer_trx_id,
                rct.trx_number,
                rct.trx_date,
                rctd.gl_date,
                rct.creation_date,
                rctl.line_number,
                rct.invoice_currency_code inv_currency,
                (
                       SELECT SUM (rct_1.extended_amount)
                       FROM   apps.ra_customer_trx_lines_all rct_1
                       WHERE  rct_1.customer_trx_id = rct.customer_trx_id
                       AND    rct_1.line_type = 'LINE') inv_net_amount,
                (
                       SELECT SUM (rct_2.extended_amount)
                       FROM   apps.ra_customer_trx_lines_all rct_2
                       WHERE  rct_2.customer_trx_id = rct.customer_trx_id
                       AND    rct_2.line_type = 'TAX') inv_tax_amount,
                (
                       SELECT SUM (rct_3.extended_amount)
                       FROM   apps.ra_customer_trx_lines_all rct_3
                       WHERE  rct_3.customer_trx_id = rct.customer_trx_id) inv_gross_amount,
                gll.currency_code                                    func_currency,
                Round((
                        (
                        SELECT SUM (rct_4.extended_amount)
                        FROM   apps.ra_customer_trx_lines_all rct_4
                        WHERE  rct_4.customer_trx_id = rct.customer_trx_id
                        AND    rct_4.line_type = 'LINE')*gdr.conversion_rate),2) func_net_amount,
                Round((
                        (
                        SELECT SUM (rct_5.extended_amount)
                        FROM   apps.ra_customer_trx_lines_all rct_5
                        WHERE  rct_5.customer_trx_id = rct.customer_trx_id
                        AND    rct_5.line_type = 'TAX')*gdr.conversion_rate),2) func_tax_amount,
                Round((
                        (
                        SELECT SUM (rct_6.extended_amount)
                        FROM   apps.ra_customer_trx_lines_all rct_6
                        WHERE  rct_6.customer_trx_id = rct.customer_trx_id)*gdr.conversion_rate),2) func_gross_amount,
                glcc.segment1                                                                 company,
                glcc.segment2                                                                 account,
                hg.geography_name                                                             billing_country,
                gdr.conversion_rate
FROM            apps.hz_parties hzp,
                apps.hz_cust_accounts hca,
                apps.ra_customer_trx_all rct,
                apps.ra_customer_trx_lines_all rctl,
                apps.ra_cust_trx_line_gl_dist_all rctd,
                apps.gl_code_combinations_kfv glcc,
                apps.hz_cust_site_uses_all hcsua,
                apps.hz_cust_acct_sites_all hcasa,
                apps.hz_party_sites hps,
                apps.hz_locations hl,
                apps.hz_geographies hg,
                apps.gl_ledgers gll,
                apps.gl_daily_rates gdr
WHERE           hzp.party_id = hca.party_id
AND             hca.cust_account_id = rct.bill_to_customer_id
AND             hca.cust_account_id = hcasa.cust_account_id
AND             rct.customer_trx_id = rctl.customer_trx_id
AND             rctl.customer_trx_line_id = rctd.customer_trx_line_id
AND             glcc.code_combination_id = rctd.code_combination_id
AND             rct.bill_to_site_use_id = hcsua.site_use_id
AND             hcsua.cust_acct_site_id = hcasa.cust_acct_site_id
AND             hcasa.party_site_id = hps.party_site_id
AND             hps.location_id = hl.location_id
AND             hl.country = hg.country_code
AND             hg.geography_type = 'COUNTRY'
AND             rctl.line_type = 'TAX'
AND             gll.ledger_id = rct.set_of_books_id
AND             gdr.from_currency = rct.invoice_currency_code
AND             gdr.to_currency = gll.currency_code
AND             to_date(gdr.conversion_date) = to_date(rctd.gl_date)
AND             gdr.conversion_type = 'Corporate'
AND             rctd.gl_date BETWEEN To_date ('01-JAN-2018', 'DD-MON-YYYY') AND  To_date ('31-JAN-2018', 'DD-MON-YYYY')
AND             glcc.segment1 = '2600'
AND             glcc.segment2 = '206911'
GROUP BY        hzp.party_name,
                hca.account_number,
                rct.interface_header_attribute1,
                rct.trx_number,
                rct.trx_date,
                rct.creation_date,
                rctl.line_number,
                rctl.unit_selling_price,
                rct.org_id,
                rctd.gl_date,
                rct.customer_trx_id,
                glcc.segment1,
                glcc.segment2,
                hg.geography_name,
                rct.invoice_currency_code,
                gll.currency_code,
                gdr.conversion_rate 

We note that there are six scalar subqueries in the text I’ve reported – and they form two groups of three, and the difference between the two groups is that one group is multiplied by a conversion rate while the other isn’t; moreover in each group the three subqueries are simply querying subsets of the same correlated data set. So it looks as if all 6 scalar subqueries could be eliminated and replaced by the inclusion of an aggregate view in the from clause and the projection of 6 columns from that view.
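
As a sketch (which I haven’t tested – I don’t have the OP’s schema) the view needs only three aggregated columns, since the three “func” values are just the three “inv” values multiplied by gdr.conversion_rate and rounded; placed as an in-line view in the from clause and joined on customer_trx_id it could supply all six projected values:

select
        customer_trx_id,
        sum(case when line_type = 'LINE' then extended_amount end)     inv_net_amount,
        sum(case when line_type = 'TAX'  then extended_amount end)     inv_tax_amount,
        sum(extended_amount)                                           inv_gross_amount
from
        apps.ra_customer_trx_lines_all
group by
        customer_trx_id
;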

However, before pursuing that option, take a look at the plan with the rowsource execution stats – where is the time going ?


-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                                           |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   1 |  UNION-ALL                                                 |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   2 |   HASH UNIQUE                                              |                              |      1 |      1 |    501 |00:13:20.17 |    3579K|  
|   3 |    HASH GROUP BY                                           |                              |      1 |      1 |  19827 |00:13:20.15 |    3579K|  
|   4 |     NESTED LOOPS                                           |                              |      1 |        |  21808 |00:13:10.26 |    3579K|  
|   5 |      NESTED LOOPS                                          |                              |      1 |      1 |  21808 |00:13:10.11 |    3578K|  
|   6 |       NESTED LOOPS OUTER                                   |                              |      1 |      1 |  21808 |00:13:09.90 |    3576K|  
|   7 |        NESTED LOOPS OUTER                                  |                              |      1 |      1 |  21808 |00:13:09.25 |    3501K|  
|   8 |         NESTED LOOPS OUTER                                 |                              |      1 |      1 |  21808 |00:13:08.48 |    3426K|  
|   9 |          NESTED LOOPS OUTER                                |                              |      1 |      1 |  21808 |00:13:07.66 |    3333K|  
|  10 |           NESTED LOOPS OUTER                               |                              |      1 |      1 |  21808 |00:13:06.92 |    3258K|  
|  11 |            NESTED LOOPS OUTER                              |                              |      1 |      1 |  21808 |00:13:06.08 |    3183K|  
|  12 |             NESTED LOOPS                                   |                              |      1 |      1 |  21808 |00:13:04.69 |    3090K|  
|  13 |              NESTED LOOPS                                  |                              |      1 |      1 |  21808 |00:13:05.75 |    3026K|  
|  14 |               NESTED LOOPS                                 |                              |      1 |      1 |  21808 |00:13:03.30 |    2961K|  
|  15 |                NESTED LOOPS                                |                              |      1 |      1 |  33459 |00:00:04.33 |    1123K|  
|  16 |                 NESTED LOOPS                               |                              |      1 |    351 |  33459 |00:00:03.67 |    1025K|  
|  17 |                  NESTED LOOPS                              |                              |      1 |    351 |  33459 |00:00:03.06 |     926K|  
|  18 |                   NESTED LOOPS                             |                              |      1 |    351 |  33459 |00:00:02.47 |     827K|  
|* 19 |                    HASH JOIN                               |                              |      1 |    351 |  33459 |00:00:01.90 |     730K|  
|  20 |                     TABLE ACCESS FULL                      | GL_LEDGERS                   |      1 |     38 |     39 |00:00:00.01 |      15 |  
|  21 |                     NESTED LOOPS                           |                              |      1 |        |  33459 |00:00:01.75 |     730K|  
|  22 |                      NESTED LOOPS                          |                              |      1 |    351 |  33459 |00:00:01.44 |     696K|  
|  23 |                       NESTED LOOPS                         |                              |      1 |    351 |  33459 |00:00:01.11 |     646K|  
|* 24 |                        HASH JOIN                           |                              |      1 |    385 |  33459 |00:00:00.40 |     526K|  
|* 25 |                         TABLE ACCESS BY INDEX ROWID BATCHED| GL_CODE_COMBINATIONS         |      1 |     35 |      1 |00:00:00.01 |     108 |  
|* 26 |                          INDEX RANGE SCAN                  | GL_CODE_COMBINATIONS_N2      |      1 |    499 |     77 |00:00:00.01 |       3 |  
|* 27 |                         TABLE ACCESS BY INDEX ROWID BATCHED| RA_CUST_TRX_LINE_GL_DIST_ALL |      1 |    651K|   1458K|00:00:02.22 |     526K|  
|* 28 |                          INDEX RANGE SCAN                  | RA_CUST_TRX_LINE_GL_DIST_N2  |      1 |    728K|   1820K|00:00:01.60 |   11147 |  
|* 29 |                        TABLE ACCESS BY INDEX ROWID         | RA_CUSTOMER_TRX_LINES_ALL    |  33459 |      1 |  33459 |00:00:00.53 |     119K|  
|* 30 |                         INDEX UNIQUE SCAN                  | RA_CUSTOMER_TRX_LINES_U1     |  33459 |      1 |  33459 |00:00:00.31 |   86364 |  
|* 31 |                       INDEX UNIQUE SCAN                    | RA_CUSTOMER_TRX_U1           |  33459 |      1 |  33459 |00:00:00.21 |   49850 |  
|  32 |                      TABLE ACCESS BY INDEX ROWID           | RA_CUSTOMER_TRX_ALL          |  33459 |      1 |  33459 |00:00:00.20 |   33459 |  
|  33 |                    TABLE ACCESS BY INDEX ROWID             | HZ_CUST_ACCOUNTS             |  33459 |      1 |  33459 |00:00:00.42 |   97887 |  
|* 34 |                     INDEX UNIQUE SCAN                      | HZ_CUST_ACCOUNTS_U1          |  33459 |      1 |  33459 |00:00:00.24 |   64428 |  
|  35 |                   TABLE ACCESS BY INDEX ROWID              | HZ_PARTIES                   |  33459 |      1 |  33459 |00:00:00.44 |   98783 |  
|* 36 |                    INDEX UNIQUE SCAN                       | HZ_PARTIES_U1                |  33459 |      1 |  33459 |00:00:00.26 |   65175 |  
|  37 |                  TABLE ACCESS BY INDEX ROWID               | HZ_CUST_SITE_USES_ALL        |  33459 |      1 |  33459 |00:00:00.46 |   98374 |  
|* 38 |                   INDEX UNIQUE SCAN                        | HZ_CUST_SITE_USES_U1         |  33459 |      1 |  33459 |00:00:00.28 |   64915 |  
|* 39 |                 TABLE ACCESS BY INDEX ROWID                | HZ_CUST_ACCT_SITES_ALL       |  33459 |      1 |  33459 |00:00:00.45 |   98195 |  
|* 40 |                  INDEX UNIQUE SCAN                         | HZ_CUST_ACCT_SITES_U1        |  33459 |      1 |  33459 |00:00:00.26 |   64736 |  
|  41 |                TABLE ACCESS BY INDEX ROWID BATCHED         | GL_DAILY_RATES               |  33459 |      1 |  21808 |00:12:44.59 |    1838K|  
|* 42 |                 INDEX RANGE SCAN                           | GL_DAILY_RATES_U1            |  33459 |      1 |  21808 |00:13:08.16 |    1837K|  
|  43 |               TABLE ACCESS BY INDEX ROWID                  | HZ_PARTY_SITES               |  21808 |      1 |  21808 |00:00:00.35 |   64339 |  
|* 44 |                INDEX UNIQUE SCAN                           | HZ_PARTY_SITES_U1            |  21808 |      1 |  21808 |00:00:00.23 |   42531 |  
|  45 |              TABLE ACCESS BY INDEX ROWID                   | HZ_LOCATIONS                 |  21808 |      1 |  21808 |00:00:00.33 |   64353 |  
|* 46 |               INDEX UNIQUE SCAN                            | HZ_LOCATIONS_U1              |  21808 |      1 |  21808 |00:00:00.18 |   42545 |  
|  47 |             VIEW PUSHED PREDICATE                          | VW_SSQ_1                     |  21808 |      1 |  21808 |00:00:01.17 |   93476 |  
|  48 |              SORT GROUP BY                                 |                              |  21808 |      1 |  21808 |00:00:01.06 |   93476 |  
|  49 |               TABLE ACCESS BY INDEX ROWID BATCHED          | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |     16 |    145K|00:00:00.84 |   93476 |  
|* 50 |                INDEX RANGE SCAN                            | XXC_CUSTOMER_GETPAID         |  21808 |     16 |    145K|00:00:00.36 |   59938 |  
|  51 |            VIEW PUSHED PREDICATE                           | VW_SSQ_2                     |  21808 |      1 |  21808 |00:00:00.69 |   74433 |  
|  52 |             SORT GROUP BY                                  |                              |  21808 |      1 |  21808 |00:00:00.59 |   74433 |  
|  53 |              TABLE ACCESS BY INDEX ROWID BATCHED           | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  92201 |00:00:00.49 |   74433 |  
|* 54 |               INDEX RANGE SCAN                             | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  92201 |00:00:00.24 |   59903 |  
|  55 |           VIEW PUSHED PREDICATE                            | VW_SSQ_3                     |  21808 |      1 |  21808 |00:00:00.61 |   74852 |  
|  56 |            SORT GROUP BY                                   |                              |  21808 |      1 |  21808 |00:00:00.51 |   74852 |  
|  57 |             TABLE ACCESS BY INDEX ROWID BATCHED            | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  53060 |00:00:00.38 |   74852 |  
|* 58 |              INDEX RANGE SCAN                              | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  53060 |00:00:00.19 |   59148 |  
|  59 |          VIEW PUSHED PREDICATE                             | VW_SSQ_4                     |  21808 |      1 |  21808 |00:00:00.70 |   93490 |  
|  60 |           SORT GROUP BY                                    |                              |  21808 |      1 |  21808 |00:00:00.61 |   93490 |  
|  61 |            TABLE ACCESS BY INDEX ROWID BATCHED             | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |     16 |    145K|00:00:00.63 |   93490 |  
|* 62 |             INDEX RANGE SCAN                               | XXC_CUSTOMER_GETPAID         |  21808 |     16 |    145K|00:00:00.25 |   59950 |  
|  63 |         VIEW PUSHED PREDICATE                              | VW_SSQ_5                     |  21808 |      1 |  21808 |00:00:00.63 |   74427 |  
|  64 |          SORT GROUP BY                                     |                              |  21808 |      1 |  21808 |00:00:00.54 |   74427 |  
|  65 |           TABLE ACCESS BY INDEX ROWID BATCHED              | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  92201 |00:00:00.44 |   74427 |  
|* 66 |            INDEX RANGE SCAN                                | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  92201 |00:00:00.21 |   59900 |  
|  67 |        VIEW PUSHED PREDICATE                               | VW_SSQ_6                     |  21808 |      1 |  21808 |00:00:00.59 |   74846 |  
|  68 |         SORT GROUP BY                                      |                              |  21808 |      1 |  21808 |00:00:00.50 |   74846 |  
|  69 |          TABLE ACCESS BY INDEX ROWID BATCHED               | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  53060 |00:00:00.35 |   74846 |  
|* 70 |           INDEX RANGE SCAN                                 | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  53060 |00:00:00.17 |   59144 |  
|* 71 |       INDEX RANGE SCAN                                     | HZ_GEOGRAPHIES_N11           |  21808 |   5812 |  21808 |00:00:00.13 |    2684 |  
|  72 |      TABLE ACCESS BY INDEX ROWID                           | HZ_GEOGRAPHIES               |  21808 |    168 |  21808 |00:00:00.07 |     620 |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  

Let’s start by raising some concerns about the quality of information available.

First, the OP says it takes 14 minutes to return 30,000 rows: but the top line of the plan says it has taken 13 minutes and 20 seconds to return the first 501 rows, and if we look a little further down the plan operation 3 (Hash Group By) reports 00:13:20.15 to aggregate down to 19,827 rows. So this half of the plan cannot return more than 19,827 rows, and the half I have discarded (for the moment) must be returning the other 10,000+ rows. The information we have is incomplete.

Of course you may think that whatever the rest of the plan does is fairly irrelevant – it’s only going to be responsible for at most another 40 seconds of processing – except my previous experience of rowsource execution statistics tells me that when you do a large number of small operations the times reported can be subject to fairly large rounding errors and that enabling the measurement can increase the execution time by a factor of three or four. It’s perfectly feasible that this half of the query is actually the faster half under normal run-time circumstances but runs much more slowly (with a much higher level of CPU utilisation) when rowsource execution statistics are enabled. So let’s not get too confident.

With that warning in mind, what can we see in this half of the plan?

Big picture: the inline scalar subqueries have disappeared. In 12c the optimizer can unnest scalar subqueries in the select list and turn them into outer joins, and we can see that there are 6 “Nested Loop Outer” operations, corresponding to 6 “View Pushed Predicate” operations against views labelled VW_SSQ_1 through to VW_SSQ_6 (SSQ = Scalar Sub Query ?). This goes back to my earlier comment – a person could probably rewrite the 6 scalar subqueries as a single aggregate view in the from clause: the optimizer isn’t quite clever enough to manage that in this case, but in simpler cases it might be able to do exactly that.

Big picture 2: most of the 13 minutes 20 seconds appears at operation 14 as it processes the 33,459 rows supplied to it from the 4.33 seconds of work done by operation 15 and its descendants. Reducing this part of the execution plan to the smallest relevant section we get the following:

-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|  14 |               NESTED LOOPS                                 |                              |      1 |      1 |  21808 |00:13:03.30 |    2961K|  
|  15 |                NESTED LOOPS                                |                              |      1 |      1 |  33459 |00:00:04.33 |    1123K|  
|  41 |                TABLE ACCESS BY INDEX ROWID BATCHED         | GL_DAILY_RATES               |  33459 |      1 |  21808 |00:12:44.59 |    1838K|  
|* 42 |                 INDEX RANGE SCAN                           | GL_DAILY_RATES_U1            |  33459 |      1 |  21808 |00:13:08.16 |    1837K|  
-----------------------------------------------------------------------------------------------------------------------------------------------------  

For each row supplied by operation 15 Oracle calls operation 41, which calls operation 42 to do an index range scan to supply a set of rowids so that operation 41 can access a table and return rows. Apparently the total time spent by operation 41 waiting for operation 42 to return rowids and then doing its own work is 12 minutes 44 seconds, while the range scans alone (all 33,459 of them) take 13 minutes and 8 seconds. Remember, though, that “lots of small operations = scope of rounding errors” when you look at these timings. Despite the inconsistency between the timings for operations 41 and 42 it’s reasonable to conclude that between them that’s where most of the execution time went.

Two questions – (a) can we refine our analysis of how the time is split between the two operations and (b) why do these lines take so much time?

Check the Starts and the A-rows: (reminder: for comparison, we expect A-rows to be approximately E-rows * Starts) for both operations we see 33,459 starts and 21,808 rows. The index range scans return (on average) a single rowid about two-thirds of the time, and every time a range scan returns a rowid the corresponding row is returned from the table (If you check the Id column there’s no asterisk on operation 41 so no extra predicate is applied as Oracle accesses the table row – but even if there were an extra predicate we’d still be happy to infer that if 21,808 rowids returned from operation 42 turned into 21,808 rows returned from the table then there are no wasted accesses to the table).

Now look at the Buffers for the index range scan – 1.837M: that’s roughly 56 buffers per range scan – that’s a lot of index to range through to find one rowid, which is a good clue that perhaps we do a lot of work with each Start and really do use up a lot of CPU on this operation. Let’s see what the Predicate Section of the plan tells us about this range scan:


Predicate Information (identified by operation id):  
---------------------------------------------------  
  42 - access("GDR"."FROM_CURRENCY"="RCT"."INVOICE_CURRENCY_CODE" AND "GDR"."TO_CURRENCY"="GLL"."CURRENCY_CODE" AND   
              "GDR"."CONVERSION_TYPE"='Corporate')  
       filter(("GDR"."CONVERSION_TYPE"='Corporate' AND TO_DATE(INTERNAL_FUNCTION("GDR"."CONVERSION_DATE"))=TO_DATE(INTERNAL_FUNCTION("RCTD"."  
              GL_DATE"))))  

We have access predicates (things which narrow down the number of leaf blocks that we walk through) and filter predicates (things we do to test every key entry we access). Notably the gdr.conversion_type is a filter predicate as well as an access predicate – and that suggests that our set of predicates has “skipped over” a column in the index: from_currency and to_currency might be the first two columns in the index, but conversion_type is then NOT the third.
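
If we could get at the OP’s system a quick check of the index definition would confirm or refute that guess – something like:

select
        column_position, column_name
from
        dba_ind_columns
where
        index_name = 'GL_DAILY_RATES_U1'
order by
        column_position
;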

More significantly, though, there’s a column called conversion_date in the index (maybe that’s column 3 in the index – it feels like it ought to be); but for every index entry we’ve selected from the 56 blocks we walk through we do some sort of internal conversion (or un-translated transformation) to the column then convert the result to a date to compare it with another date (similarly processed from an earlier operation). What is that “internal function” – let’s check the query:


AND             gdr.from_currency = rct.invoice_currency_code
AND             gdr.to_currency = gll.currency_code
AND             gdr.conversion_type = 'Corporate'
AND             to_date(gdr.conversion_date) = to_date(rctd.gl_date)
AND             rctd.gl_date BETWEEN To_date ('01-JAN-2018', 'DD-MON-YYYY') AND  To_date ('31-JAN-2018', 'DD-MON-YYYY')

(I’ve swapped the order of a couple of lines to highlight a detail).

The filter predicate is comparing gdr.conversion_date with rctd.gl_date – and we can probably assume that both columns really are dates because (a) the word “date” is in their names and (b) the rctd.gl_date is being compared with genuine date values in the next predicate down (and – though I haven’t shown it – the way the plan reports the next predicate proves that the column really is a date datatype).

So the predicate in the SQL applies the to_date() function to two columns that are dates – which means Oracle has to convert the date columns to some default character format and then convert them back to dates. The “internal function” is a to_char() call. Conversions between date and character formats are CPU-intensive, and we’re doing a double conversion (to_date(to_char(column_value))) to every data value in roughly 56 blocks of an index each time we call that line of the plan. It’s not surprising we spend a lot of time in that line.

Initial strategy:

Check the column types for those two columns; if they are both date types, decide whether or not the predicate could be modified to a simple gdr.conversion_date = rctd.gl_date (though it’s possible that something slightly more sophisticated should be used) – but whatever you do, avoid the redundant conversion through character format.
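
As a sketch of the options (assuming both columns really are of type DATE): if either column might carry a non-midnight time component then trunc() preserves the “same day” intent of the original predicate without the round trip through character format (and without any dependency on nls_date_format); and if conversion_date is known to be midnight-only – quite plausible for a daily rates table – the column can be left bare, which also leaves it usable as an access predicate:

--      same-day comparison, no character conversion
AND     trunc(gdr.conversion_date) = trunc(rctd.gl_date)

--      if gdr.conversion_date never has a time component
AND     gdr.conversion_date = trunc(rctd.gl_date)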

Ideally, of course, if we can avoid this conversion we may find that Oracle can be more accurate in its range scan through the index, but we may still find that we do a large range scan even if we do manage to do it a little more efficiently, in which case we may want to see if there is an alternative index which will allow us to pick the one rowid we need from the index without visiting so many leaf blocks in the index.

Warning

Simply eliminating the to_date() calls may change the results. Here’s a demonstration of how nasty things happen when you apply to_date() to a date:


SQL> desc t1
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 D1                                     DATE
 D2                                     DATE

SQL> insert into t1 values(sysdate, sysdate + 10/86400);

1 row created.

SQL> select * from t1 where d1 = d2;

no rows selected

SQL> select * from t1 where to_date(d1) = to_date(d2);

D1        D2
--------- ---------
30-SEP-18 30-SEP-18

1 row selected.

SQL> alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select * from t1 where to_date(d1) = to_date(d2);

no rows selected

Different users could get different results because they have different settings for their nls_date_format.
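
The mechanism is easy to demonstrate: to_date() applied to a date forces an implicit round trip through character format, and if the session’s nls_date_format has no time elements the time component simply disappears:

alter session set nls_date_format = 'DD-MON-RR';

select
        to_char(sysdate,          'yyyy-mm-dd hh24:mi:ss')     raw_value,
        to_char(to_date(sysdate), 'yyyy-mm-dd hh24:mi:ss')     round_trip
from
        dual
;

The round_trip value comes back with a time of 00:00:00 – which is why d1 and d2 above (ten seconds apart) matched once they had been through to_date(), and why they stopped matching as soon as the session’s format included the time.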

Reminder

I started my analysis with two comments about the quality of information – first, we don’t really know whether or not this half of the union all would be responsible for most of the time if rowsource execution statistics were not enabled; secondly, a large number of small operations can lead to a lot of rounding errors in timing. There are six occurrences of unnested scalar subqueries which are all called 21,808 times – and the recorded time for all 6 of them is remarkably small given the number of executions, even when allowing for the precision with which they operate; it’s possible that these subqueries take a larger fraction of the total time than the plan indicates, in which case it might become necessary (rather than just nice) to do a manual unnesting and reduce the number of inline views to 3 (one for each line_type), 2 (one with, one without, conversion_rate) or just one.

Footnote

Once again I’ve spent a couple of hours writing notes to explain the thoughts that went through my mind in roughly 10 minutes of reading the original posting. It’s a measure really of how many bits of information you can pull together, and possibly discard, very quickly once you understand how many things the optimizer is capable of doing and how the execution plan tries to show you how a statement was (or will be) handled.

Update (5th Oct 2018)

Another way of looking for the best strategy for tuning this statement, given the available information, is this:

Where, in the sequence of events, does the data volume we’re processing drop to the right scale for the output? If we don’t drop to the right scale very early in the plan execution then we may need to re-arrange the order in which we visit tables; if we are operating at the right volume almost immediately then there’s a good chance that we’ve started the right way. Take a look at the first few lines of this plan (remembering that the query was interrupted before returning the whole result set):

-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                                           |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   1 |  UNION-ALL                                                 |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   2 |   HASH UNIQUE                                              |                              |      1 |      1 |    501 |00:13:20.17 |    3579K|  
|   3 |    HASH GROUP BY                                           |                              |      1 |      1 |  19827 |00:13:20.15 |    3579K|  
|   4 |     NESTED LOOPS                                           |                              |      1 |        |  21808 |00:13:10.26 |    3579K|
-----------------------------------------------------------------------------------------------------------------------------------------------------    

At line 4 we generate 21,808 rows which we aggregate down to 19,827, which we then hash down to distinct values – the original user told us that the query returns 30,000 rows so we shouldn’t assume that the uniqueness requirement has reduced 19,827 rows to the 501 reported so far; there may be more to come. What we can say about these numbers, particularly lines 3 and 4, is that prior to the aggregation we need to find about 22,000 rows and carry them through the rest of the plan.

Now look at lines 24 – 28 where the heavy-duty action starts (the first physical operation is actually at lines 19/20 where (thanks to swapping join inputs) Oracle scans the gl_ledgers table and hashes it into memory in anticipation of incoming probe data – but that’s a tiny blip on the way to the big join):

-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|* 24 |                        HASH JOIN                           |                              |      1 |    385 |  33459 |00:00:00.40 |     526K|  
|* 25 |                         TABLE ACCESS BY INDEX ROWID BATCHED| GL_CODE_COMBINATIONS         |      1 |     35 |      1 |00:00:00.01 |     108 |  
|* 26 |                          INDEX RANGE SCAN                  | GL_CODE_COMBINATIONS_N2      |      1 |    499 |     77 |00:00:00.01 |       3 |  
|* 27 |                         TABLE ACCESS BY INDEX ROWID BATCHED| RA_CUST_TRX_LINE_GL_DIST_ALL |      1 |    651K|   1458K|00:00:02.22 |     526K|  
|* 28 |                          INDEX RANGE SCAN                  | RA_CUST_TRX_LINE_GL_DIST_N2  |      1 |    728K|   1820K|00:00:01.60 |   11147 | 
-----------------------------------------------------------------------------------------------------------------------------------------------------    

The important thing we see here is that the very first hash join identifies 33,459 rows: we’re immediately into the right ball-park for the final output. The timings are a bit suspect – I really don’t like seeing the time for hash join (0.4 seconds) being smaller than one of its direct child operations (the 2.22 seconds) – but this bit of the work seems to get to the right scale very quickly: this looks as if it’s likely to be a good way to start the final join order.

We might question whether the optimizer has been wise to use an index range scan to identify 1.45 million rows in a table (and probing it 1.82 million times). Maybe that was quick because all the data had previously been buffered and perhaps this range scan will be extremely slow on a busy production system; maybe a tablescan would be better, maybe there’s a way of getting to this big table through a different join order that means we only visit it roughly 33,459 times through an index that identifies exactly the rows we really need. Without good knowledge of what the data looks like (and without understanding what the query is supposed to achieve and how often it runs) we can only look at the supplied execution plan and work out where the time went and whether that suggests the plan is doing roughly the right thing or doing something that is clearly silly. This plan looks like a reasonable starting point with one minor (we hope) glitch around line 42 that we identified earlier on.

 

September 5, 2018

Subquery Order

Filed under: 12c,Execution plans,Hints,Oracle,subqueries — Jonathan Lewis @ 1:09 pm GMT Sep 5,2018

From time to time I’ve wanted to optimize a query by forcing Oracle to execute existence (or non-existence) subqueries in the correct order because I know which subquery will eliminate most data most efficiently, and it’s always a good idea to look for ways to eliminate early. I’ve only just discovered (while doing some tests on 18c) that Oracle 12.2.0.1 introduced the /*+ order_subq() */ hint that seems to be engineered to do exactly that.

Here’s a very simple (and completely artificial) demonstration of use.


rem
rem     Script:         122_order_subq.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sep 2018
rem

create table t1 as select * from all_objects;
create table t2 as select * from all_objects;
create table t3 as select * from all_objects;

create index t2_i1 on t2(object_id);
create index t3_i1 on t3(object_id);

prompt  =============================
prompt  order_subq(@main subq2 subq3)
prompt  =============================

explain plan for
select
        /*+
                qb_name(main)
                no_unnest(@subq2)
                no_unnest(@subq3)
                order_subq(@main subq2 subq3)
        */
        t1.object_name, t1.object_type
from
        t1
where
        exists (
                select
                        /*+ qb_name(subq2) */
                        null
                from    t2
                where   t2.object_id = t1.object_id * 5
        )
and     exists (
                select
                        /*+ qb_name(subq3) */
                        null
                from    t3
                where   t3.object_id = t1.object_id * 13
        )
;

select * from table(dbms_xplan.display(null,null,'outline'));

=============================
order_subq(@main subq2 subq3)
=============================

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2585036931

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |    53 | 51090   (1)| 00:00:02 |
|*  1 |  FILTER            |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1    | 61765 |  3196K|   163   (4)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | T2_I1 |     1 |     5 |     1   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN | T3_I1 |     1 |     5 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
...
      ORDER_SUBQ(@"MAIN" "SUBQ2" "SUBQ3")
...
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ NO_UNNEST QB_NAME ("SUBQ2") */ 0 FROM
              "T2" "T2" WHERE "T2"."OBJECT_ID"=:B1*5) AND  EXISTS (SELECT /*+
              NO_UNNEST QB_NAME ("SUBQ3") */ 0 FROM "T3" "T3" WHERE
              "T3"."OBJECT_ID"=:B2*13))
   3 - access("T2"."OBJECT_ID"=:B1*5)
   4 - access("T3"."OBJECT_ID"=:B1*13)

I’ve blocked subquery unnesting for the purposes of the demo and given a query block name to the two subqueries (using a name that identifies the associated table). As you can see, the execution plan uses the subqueries as filter subqueries, operating them in the order I’ve specified in my hint. You can also see that the hint is echoed down into the Outline section of the plan.

It’s possible that this is the plan that the optimizer would have chosen without the order_subq hint, so I ought to see if I can also use the hint to make the subqueries filter in the opposite order. I happen to know that executing the subquery against t3 is likely to eliminate more rows than executing the subquery against t2. (The “* 13” compared to the “* 5” is significant) so I really want the subqueries to be used in the opposite order anyway – so here’s what happens when I reverse the order in the hint:


prompt  =============================
prompt  order_subq(@main subq3 subq2)
prompt  =============================

explain plan for
select
        /*+
                qb_name(main)
                no_unnest(@subq2)
                no_unnest(@subq3)
                order_subq(@main subq3 subq2)
        */
        t1.object_name, t1.object_type
from
        t1
where
        exists (
                select
                        /*+ qb_name(subq2) */
                        null
                from    t2
                where   t2.object_id = t1.object_id * 5
        )
and     exists (
                select
                        /*+ qb_name(subq3) */
                        null
                from    t3
                where   t3.object_id = t1.object_id * 13
        )
;

select * from table(dbms_xplan.display(null,null,'outline'));

=============================
order_subq(@main subq3 subq2)
=============================

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3585049451

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |    53 | 51090   (1)| 00:00:02 |
|*  1 |  FILTER            |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1    | 61765 |  3196K|   163   (4)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | T3_I1 |     1 |     5 |     1   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN | T2_I1 |     1 |     5 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
...
      ORDER_SUBQ(@"MAIN" "SUBQ3" "SUBQ2")
...
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ NO_UNNEST QB_NAME ("SUBQ3") */ 0 FROM
              "T3" "T3" WHERE "T3"."OBJECT_ID"=:B1*13) AND  EXISTS (SELECT /*+
              NO_UNNEST QB_NAME ("SUBQ2") */ 0 FROM "T2" "T2" WHERE
              "T2"."OBJECT_ID"=:B2*5))
   3 - access("T3"."OBJECT_ID"=:B1*13)
   4 - access("T2"."OBJECT_ID"=:B1*5)

With the modified hint in place the order of the filter subqueries is reversed. Notice how the Predicate section also echoes the ordering of the subqueries.

Footnote

It should be noted that the order_subq() hint doesn’t get mentioned in the 18c SQL Language Reference “Alphabetical List of Hints”. If it were then one of the little oddities that might get a mention is that the optimizer seems to ignore the hint if you disable CPU costing (not that anyone should have been doing that since 10g).

August 22, 2018

Descending bug

Filed under: Bugs,Execution plans,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 1:20 pm GMT Aug 22,2018

Following on from Monday’s posting about reading execution plans and related information, I noticed a question on the ODC database forum asking about the difference between “in ({list of values})” and a list of “column = {constant}” predicates connected by OR. The answer to the question is that there’s essentially no difference as you would be able to see from the predicate section of an execution plan:


SELECT  c1, c2, c3, c4, c5, c6, c7, c8..  
FROM    TAB1  
WHERE   STS IN ( 'A', 'B')  
AND     cnt < '4'  
AND     dt < sysdate  
and     rownum <=1;  
  
---------------------------------------------------------------------------------------------------------  
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 33399   (1)| 00:03:14 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   114K|   201M| 33399   (1)| 00:03:14 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   114K|       |   723   (1)| 00:00:05 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"<SYSDATE@!)  
   4 - access("STS"='A' OR "STS"='B')  

Note how the predicate section tells you that the original “sts in ( ‘A’, ‘B’ )” has been transformed into “sts = ‘A’ or sts = ‘B'”.

A further point I made about IN-lists in Monday’s post was that as one step in the transformation Oracle would sort the list and eliminate duplicates, and it suddenly occurred to me to wonder whether Oracle would sort the list in descending order if the only relevant index were defined to start with a descending column. Naturally I had to try it, so here’s a suitable script to prepare some data:

rem
rem     Script:         descending_bug_06.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2018
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0        Crashes
rem             12.2.0.1        Crashes
rem             12.1.0.2        Crashes
rem             11.2.0.4        Bad Plan
rem

create table t1
nologging
pctfree 95 pctused 5
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        lpad(rownum,10,'0')             v1,
        case mod(rownum,1000)
                when 0 then 'A'
                when 3 then 'B'
                when 6 then 'C'
                       else 'D'
        end                             sts,
        case mod(rownum,1000)
                when 0 then '1'
                when 3 then '2'
                when 6 then '3'
                       else '4'
        end                             cnt,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5 -- > comment to avoid WordPress format issue
;

create index t1_i1a on t1(sts) nologging;
create index t1_i1d on t1(sts desc) nologging;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size skewonly'
        );
end;
/

There is one oddity in this script – if you’ve got every column in an index declared as DESC you’ve made a mistake; none of the columns should be declared as DESC. The feature is relevant only if you want a mixture of ascending and descending columns in a single index.
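In other words the legitimate use of the feature looks more like this purely illustrative definition (not part of the test):

create index t1_mixed on t1(sts, id desc) nologging;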

An important detail of the script is that I’ve gathered stats AFTER creating the objects – it’s important to do this, even in 18.3, because (a) creating the “descending” index will result in a hidden virtual column being created to represent the descending column and I want to make sure I have stats on that column, and (b) the “stats on creation” code doesn’t generate histograms and I want a (frequency) histogram on column sts and on the hidden, virtual, descending version of the column.
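If you want to check the hidden column and its histogram for yourself, something like the following will show them (just a sketch – the generated name sys_nc00006$ that you’ll see later varies with the table definition):

select
        column_name, hidden_column, virtual_column, histogram
from
        user_tab_cols
where
        table_name = 'T1'
order by
        internal_column_id
;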

After generating the data and checking that I have the correct histograms for sts and sys_nc00006$ (the relevant hidden column) I can then run the following test:

set serveroutput off
alter session set statistics_level = all;

alter index t1_i1d invisible;

select  sts, count(*)
from    t1
where   sts in ('B','C')
group by
        sts
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter index t1_i1d   visible;
alter index t1_i1a invisible;

select  sts, count(*)
from    t1
where   sts in ('B','C')
group by
        sts
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter session set statistics_level = typical;
set serveroutput on


The code makes one index invisible then runs a query that should use an inlist iterator; then it switches indexes making the invisible one visible and vice versa and repeats the query. I’ve enabled rowsource execution statistics and pulled the execution plans from memory to make sure I don’t get fooled by any odd glitches that might exist within “explain plan”. Here are the results from 11.2.0.4 – normal index, then descending index – with a little cosmetic cleaning:


S   COUNT(*)
- ----------
B        100
C        100

SQL_ID  f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |      1 |        |      2 |00:00:00.01 |       5 |
|   1 |  SORT GROUP BY NOSORT|        |      1 |      2 |      2 |00:00:00.01 |       5 |
|   2 |   INLIST ITERATOR    |        |      1 |        |    200 |00:00:00.01 |       5 |
|*  3 |    INDEX RANGE SCAN  | T1_I1A |      2 |    178 |    200 |00:00:00.01 |       5 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("STS"='B' OR "STS"='C'))


Index altered.
Index altered.


S   COUNT(*)
- ----------
C        100
B        100


SQL_ID  f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |      1 |        |      2 |00:00:00.02 |     198 |
|   1 |  SORT GROUP BY NOSORT|        |      1 |      2 |      2 |00:00:00.02 |     198 |
|*  2 |   INDEX FULL SCAN    | T1_I1D |      1 |   1000 |    200 |00:00:00.02 |     198 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='B' OR
              SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='C'))


As expected we see counts of 100 for the ‘B’s and ‘C’s, and we also see that the “sort group by nosort” operation with the descending index has produced the results in reverse order. The problem, though, is that the optimizer has decided to use an “index full scan” on the descending index: the estimate of rows returned is terribly wrong (it seems to be the common “5% guess”, used once for each target value) and the number of buffer visits is huge compared to the figure from the normal index – Oracle really did walk every leaf block in the index to get this result. The predicate section also looks rather silly – why hasn’t the optimizer produced predicates more like “sys_nc00006$ = sys_op_descend(‘B’)”?
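For reference, sys_op_descend() generates the value that is actually stored in the index for a descending column – as far as I can tell it’s essentially the one’s complement of the column’s bytes with an 0xFF byte appended – and sys_op_undescend() reverses the operation. Both functions can be called from SQL if you want to see the values; a quick sketch:

select
        sys_op_descend('B')                     descend_b,
        sys_op_undescend(sys_op_descend('B'))   undescend_b
from
        dual
;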

In passing you’ll notice that the estimated rows in the plan using the normal index is a little low. This is the result of Oracle using a small sample (ca. 5,500 rows) in 11g to gather histogram stats. 12c will do better for a frequency histogram with the fast algorithm it uses for a 100% (auto) sample size.
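If you want to see the sample size that was actually used for a column’s stats you can query the dictionary – another quick sketch:

select
        column_name, sample_size, num_distinct, histogram
from
        user_tab_col_statistics
where
        table_name = 'T1'
;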

So 11g doesn’t do very well but we’ve got 12.1.0.2, 12.2.0.1, and (in the last couple of weeks) 18.3 to play with. Here’s the result from 12.1.0.2 and 12.2.0.1 for the query that should use the descending index:


select  sts, count(*)
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [qernsRowP], [1], [], [], [], [], [], [], [], [], [], []


SQL_ID  f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

-----------------------------------------------------------------------------------------
| Id  | Operation            | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |      1 |        |      0 |00:00:00.01 |       0 |
|   1 |  SORT GROUP BY NOSORT|        |      1 |      2 |      0 |00:00:00.01 |       0 |
|   2 |   INLIST ITERATOR    |        |      1 |        |    101 |00:00:00.03 |       5 |
|*  3 |    INDEX RANGE SCAN  | T1_I1D |      2 |    200 |    101 |00:00:00.01 |       5 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("T1"."SYS_NC00006$"=SYS_OP_DESCEND('B') OR
              "T1"."SYS_NC00006$"=SYS_OP_DESCEND('C')))
       filter((SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='B' OR
              SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='C'))

The query crashed! The plan, however, did look appropriate – the optimizer picked an inlist iterator, picked an index range scan, got the correct estimate of rows (index entries), and did better with the predicate section (though having used a sensible predicate as the access predicate it then used the bizarre version as the filter predicate). Judging from the A-rows column the query seems to have crashed at roughly the point where the run-time engine was switching from the range scan for the first iteration into the range scan for the second iteration.

And then there’s Oracle 18.3 – which does the same as the 12c versions :(

To make sure that my silly “single column that shouldn’t have been declared descending” index wasn’t the sole cause of the problem I repeated the tests using a two-column index on (sts, cnt) – see the sketch below.
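(The definitions for that test would have been something like the following pair – purely a sketch, since there’s more than one way you could mix the column directions:

create index t1_i2a on t1(sts, cnt) nologging;
create index t1_i2d on t1(sts desc, cnt) nologging;

)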

Conclusion:

Descending indexes or (to be more accurate) indexes with descending columns can still produce problems even in the very latest version of Oracle.

Footnote

Oracle MoS has the wonderful “ORA-600/ORA-7445/ORA-700 Error Look-up Tool (Doc ID 153788.1)” (which doesn’t yet allow you to choose 18.3 as a version) so I used this to do a look-up for ORA-00600 errors with first parameter qernsRowP in 12.2.0.1 and got the following suggestion from doc ID 285913.1: “set event:10119 to disable no-sort fetch and then reparse the failing SQL.” The example suggested setting the event to level 12, and this solved the problem for all three failing versions – but the suggestion came with a warning: “Setting this event at system level may impact the performance of database.” The execution plan (taken, in this case, from 18.2) may explain the warning:

 

S   COUNT(*)
- ----------
B        100
C        100

SQL_ID  f20u42pmw1z6w, child number 0
-------------------------------------
select sts, count(*) from t1 where sts in ('B','C') group by  sts

------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |      1 |        |      2 |00:00:00.01 |       4 |       |       |          |
|   1 |  HASH GROUP BY     |        |      1 |      2 |      2 |00:00:00.01 |       4 |  1558K|  1558K|  659K (0)|
|   2 |   INLIST ITERATOR  |        |      1 |        |    200 |00:00:00.01 |       4 |       |       |          |
|*  3 |    INDEX RANGE SCAN| T1_I1D |      2 |    200 |    200 |00:00:00.01 |       4 |       |       |          |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(("T1"."SYS_NC00006$"=SYS_OP_DESCEND('B') OR "T1"."SYS_NC00006$"=SYS_OP_DESCEND('C')))
       filter((SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='B' OR SYS_OP_UNDESCEND("T1"."SYS_NC00006$")='C'))


The plan has changed from using a “sort group by nosort” – which effectively means just keeping a running count as you go – to a real “hash group by”, which means doing the hashing arithmetic for every value (though maybe there’s a deterministic trick that means Oracle won’t repeat the arithmetic when the next value to be hashed is the same as the previous one). And the actual memory used (659K) does seem a little extreme for counting two distinct values.
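For reference, the event can be set (and cleared) at session level using the standard event syntax – just a sketch, and bear in mind the warning above before doing anything more ambitious than a session-level test:

alter session set events '10119 trace name context forever, level 12';

-- run the problem query, then clear the event:

alter session set events '10119 trace name context off';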
