Oracle Scratchpad

March 21, 2019

Lost time

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 12:51 pm GMT Mar 21,2019

Here’s a little puzzle that came up in the ODC database forum yesterday – I’ve got a query that has been captured by SQL Monitor, and it’s taking much longer to run than it should but the monitoring report isn’t telling me what I need to know about the time.

Here’s a little model to demonstrate the problem – I’m going to join a table to itself (the self join isn’t a necessary feature of the demonstration, I’ve just been a bit lazy in preparing data). Here’s a (completely truthful) description of the table:

SQL> desc t1
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ID                                     NUMBER
 MOD_1000                               NUMBER
 V1                                     VARCHAR2(40)
 V2                                     VARCHAR2(40)
 PADDING                                VARCHAR2(100)

SQL> select num_rows, blocks from user_tables where table_name = 'T1';

  NUM_ROWS     BLOCKS
---------- ----------
    400000       7798

1 row selected.


And here’s the output I get from dbms_sql_monitor.report_sql_monitor() for the query running on 18.3.0.0:


Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  TEST_USER (261:36685)
 SQL ID              :  g6j671u7zc9mn
 SQL Execution ID    :  16777218
 Execution Started   :  03/21/2019 08:54:56
 First Refresh Time  :  03/21/2019 08:54:56
 Last Refresh Time   :  03/21/2019 08:55:17
 Duration            :  21s
 Module/Action       :  MyModule/MyAction
 Service             :  SYS$USERS
 Program             :  sqlplus@linux183.localdomain (TNS V1-V3)
 Fetch Calls         :  2

Global Stats
==========================================================================
| Elapsed |   Cpu   |    IO    | PL/SQL  | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Time(s) | Calls |  Gets  | Reqs | Bytes |
==========================================================================
|      21 |      21 |     0.04 |    0.07 |     2 |     3M |  215 | 180MB |
==========================================================================

SQL Plan Monitoring Details (Plan Hash Value=83896840)
==================================================================================================================================================
| Id |       Operation       | Name |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Mem  | Activity | Activity Detail |
|    |                       |      | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | (Max) |   (%)    |   (# samples)   |
==================================================================================================================================================
|  0 | SELECT STATEMENT      |      |         |      |        21 |     +1 |     1 |        1 |      |       |     . |          |                 |
|  1 |   FILTER              |      |         |      |        21 |     +1 |     1 |        1 |      |       |     . |          |                 |
|  2 |    HASH JOIN OUTER    |      |     400 | 2014 |        21 |     +1 |     1 |      400 |      |       |   2MB |          |                 |
|  3 |     TABLE ACCESS FULL | T1   |     400 | 1005 |         1 |     +1 |     1 |      400 |   62 |  60MB |     . |          |                 |
|  4 |     TABLE ACCESS FULL | T1   |    400K | 1001 |        21 |     +1 |     1 |     400K |      |       |     . |          |                 |
==================================================================================================================================================

As you can see, the total elapsed time is 21 seconds – and the CPU time accounts for the whole 21 seconds.

This seems a little large for a simple hash join, so we should look at the “Activity Detail” section of the plan because that will report any ASH samples that can be found for this execution of this query. And that’s where the problem lies: despite using 21 CPU seconds in 21 seconds there are no ASH samples for the execution! Moreover – looking for other numerical oddities – the plan says we read 60MB in 62 read requests (that’s the first tablescan of the 7,798 blocks of t1), but the summary says we read 180MB – where did we lose (or gain) 120MB ?

It might help to see the query (which is why I didn’t show it in the output above) and it might help to see the predicate section (which SQL Monitor doesn’t report). So here’s the query and its plan, with the resulting predicate section, pulled from memory:


SQL_ID  g6j671u7zc9mn, child number 0
-------------------------------------
select  /*+ monitor */  t1a.mod_1000, t1b.mod_1000 from  t1 t1a,  t1
t1b where  t1a.mod_1000 = 500 and t1b.id(+) = t1a.id and
nvl(t1b.mod_1000,0) + f2(t1a.id) + f3(t1a.id) > 0

Plan hash value: 83896840

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |  2014 (100)|          |
|*  1 |  FILTER             |      |       |       |            |          |
|*  2 |   HASH JOIN OUTER   |      |   400 |  7200 |  2014   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   400 |  3600 |  1005   (3)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T1   |   400K|  3515K|  1001   (3)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NVL("T1B"."MOD_1000",0)+"F2"("T1A"."ID")+"F3"("T1A"."ID")>0)
   2 - access("T1B"."ID"="T1A"."ID")
   3 - filter("T1A"."MOD_1000"=500)


Notice the two function calls that are part of the FILTER operation. That’s where the time is going – but I don’t think you can infer that from the SQL Monitor report (which rather suggests that the problem might be with the second full tablescan of t1 at operation 4).

Functions f2() and f3() both do a table scan of another table that is as large as t1 – and it looks as if they’re both going to be called 400 times, which is a lot of work. The 120MB of “lost” reads comes from the first tablescan of each of the two tables; the 3M buffer gets in the summary (if you didn’t notice it) come from the repeated tablescans as each row comes out of the hash join and the filter is applied – 400 executions of a tablescan of roughly 7,700 blocks is about 3.1M buffer gets.

Why doesn’t the report show us the CPU (and disk) samples? Because it’s querying ASH (v$active_session_history) by the SQL_ID of the principal SQL statement – and the work is being done by two other statements with different SQL_IDs.
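If you want to see where the samples went you can query ASH by the top-level SQL_ID instead – v$active_session_history has carried a top_level_sql_id column since 11g. Something like the following sketch (using the SQL_ID from the report above) would break the activity down by the statements actually doing the work:


select
        sql_id, sql_plan_hash_value, count(*)   sample_count
from
        v$active_session_history
where
        top_level_sql_id = 'g6j671u7zc9mn'
group by
        sql_id, sql_plan_hash_value
order by
        count(*) desc
;
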

Since it’s PL/SQL functions doing all the work why isn’t the 21 CPU seconds showing up in the “PL/SQL time(s)” summary figure? Because the time isn’t being spent in PL/SQL, it’s being spent in the SQL being run by the PL/SQL.

Just to finish off, let’s try to find the SQL being run by the PL/SQL. I’ll post the script to recreate the example at the end of the posting but for the moment I’ll just show you the query against v$sql that I ran to extract the SQL that’s embedded in the functions:


select  sql_id, executions, disk_reads, buffer_gets, sql_text
from    V$sql 
where   sql_text like 'SELECT%T2%' 
or      sql_text like 'SELECT%T3%'
;

SQL_ID        EXECUTIONS DISK_READS BUFFER_GETS SQL_TEXT
------------- ---------- ---------- ----------- --------------------------------------------------
12ytf1rry45d9        400       7683     3072817 SELECT MAX(ID) FROM T2 WHERE MOD_1000 = :B1
85dmsgqg3bh4w          1       7680        7698 SELECT MAX(ID) FROM T3 WHERE MOD_1000 = :B1

As you can see, there are roughly 120MB of disk I/O and 3M buffer gets due to these two statements – and one of them has run the 400 times we expected. It looks as if Oracle has done a cute little optimisation with the other function, though. If you look at the predicate it says:

  • NVL("T1B"."MOD_1000",0)+"F2"("T1A"."ID")+"F3"("T1A"."ID")>0

It looks as if Oracle has either cached the result of the f3() function call (in which case why didn’t it also cache the f2() result) or it’s stopped evaluating the predicate as soon as the running total exceeded zero (but that might be a problem since f3() could return a negative number !). I don’t know exactly why Oracle is doing what it’s doing – but if I reverse the order of the functions in the predicate the f3() query will run 400 times and the f2() query will run once.

Appendix

If you want to re-run some tests here’s the code I used to create the demo:


create table t1
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum-1,1000)              mod_1000,
        lpad(rownum,10,'0')             v1,
        lpad(rownum,10,'0')             v2,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 4e5 -- > comment to avoid WordPress format issue
;

-- alter table t1 add constraint t1_pk primary key(id);

create table t2 nologging as select * from t1;
create table t3 nologging as select * from t1;

create function f2 (i_in number) return number
as
        m_ret number;
begin
        select max(id)
        into    m_ret
        from    t2
        where   mod_1000 = i_in
        ;

        return m_ret;
end;
/

create function f3 (i_in number) return number
as
        m_ret number;
begin
        select  max(id)
        into    m_ret
        from    t3
        where   mod_1000 = i_in
        ;

        return m_ret;
end;
/


begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T2',
                method_opt  => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T3',
                method_opt  => 'for all columns size 1'
        );
end;
/

alter system flush buffer_cache;

variable b1 number
variable b2 number

set timing on

select
        /*+ monitor */
        t1a.mod_1000, t1b.mod_1000
from
        t1      t1a,
        t1      t1b
where
        t1a.mod_1000 = 500
and     t1b.id(+) = t1a.id
and     nvl(t1b.mod_1000,0) + f3(t1a.id) + f2(t1a.id) > 0
;


One interesting little detail, if you try the tests, is that the join has to be an outer join for the FILTER operation to appear.

March 19, 2019

IM_DOMAIN$

Filed under: 18c,Oracle,Troubleshooting — Jonathan Lewis @ 12:05 pm GMT Mar 19,2019

A few months ago Franck Pachot wrote about a recursive SQL statement that kept appearing in the library cache. I discovered the note today because I had just found a client site where the following statement suddenly appeared near the top of the “SQL ordered by Executions” section of their AWR reports after they had upgraded to 18c.


select domain# from sys.im_domain$ where objn = :1 and col# = :2

I found Franck’s article by the simple expedient of typing the entire query into a Google search – his note was the first hit on the list, and he had a convenient example (based on the SCOTT schema) to demonstrate the effect, so I built the tables from the schema and ran a simple test with extended SQL tracing (event 10046) enabled.

Here’s an extract (with no deletions) from the resulting trace file:

PARSING IN CURSOR #139819795591784 len=110 dep=0 uid=104 oct=3 lid=104 tim=31306461773 hv=3572295767 ad='6bf8b8a0' sqlid='8n2bcc3aftu2r'
select /*+ leading(EMP DEPT) USE_HASH(DEPT) USE_HASH(BONUS) */ * from DEPT natural join EMP natural join BONUS
END OF STMT
PARSE #139819795591784:c=59,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306461772

PARSING IN CURSOR #139819795585328 len=64 dep=1 uid=0 oct=3 lid=0 tim=31306461966 hv=1240924087 ad='69a8b760' sqlid='0b639nx4zdzxr'
select domain# from sys.im_domain$ where objn = :1 and col# = :2
END OF STMT
PARSE #139819795585328:c=37,e=37,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306461965
EXEC #139819795585328:c=32,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306462058
FETCH #139819795585328:c=17,e=17,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306462098
STAT #139819795585328 id=1 cnt=0 pid=0 pos=1 obj=10422 op='TABLE ACCESS BY INDEX ROWID IM_DOMAIN$ (cr=1 pr=0 pw=0 str=1 time=21 us cost=0 size=39 card=1)'
STAT #139819795585328 id=2 cnt=0 pid=1 pos=1 obj=10423 op='INDEX UNIQUE SCAN IM_DOMAIN_UK (cr=1 pr=0 pw=0 str=1 time=18 us cost=0 size=0 card=1)'
CLOSE #139819795585328:c=5,e=5,dep=1,type=1,tim=31306462287

EXEC #139819795591784:c=484,e=484,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306462316
FETCH #139819795591784:c=0,e=804,p=0,cr=44,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306463191
STAT #139819795591784 id=1 cnt=0 pid=0 pos=1 obj=0 op='HASH JOIN  (cr=45 pr=0 pw=0 str=1 time=1222 us cost=72 size=97 card=1)'
STAT #139819795591784 id=2 cnt=4 pid=1 pos=1 obj=0 op='HASH JOIN  (cr=45 pr=0 pw=0 str=1 time=1001 us cost=70 size=232 card=4)'
STAT #139819795591784 id=3 cnt=4 pid=2 pos=1 obj=117764 op='TABLE ACCESS FULL EMP (cr=22 pr=0 pw=0 str=1 time=259 us cost=35 size=152 card=4)'
STAT #139819795591784 id=4 cnt=4 pid=2 pos=2 obj=117765 op='TABLE ACCESS FULL DEPT (cr=22 pr=0 pw=0 str=1 time=81 us cost=35 size=80 card=4)'
STAT #139819795591784 id=5 cnt=0 pid=1 pos=2 obj=117766 op='TABLE ACCESS FULL BONUS (cr=0 pr=0 pw=0 str=1 time=4 us cost=2 size=39 card=1)'
CLOSE #139819795591784:c=24,e=24,dep=0,type=1,tim=31306508552

PARSE #139819795591784:c=41,e=42,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306508798
PARSE #139819795585328:c=21,e=22,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306509010
EXEC #139819795585328:c=132,e=132,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306509220
FETCH #139819795585328:c=20,e=19,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306509415
CLOSE #139819795585328:c=8,e=8,dep=1,type=3,tim=31306509494
EXEC #139819795591784:c=682,e=704,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306509558
FETCH #139819795591784:c=588,e=1246,p=0,cr=44,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306511014
CLOSE #139819795591784:c=23,e=22,dep=0,type=3,ti

As you can see, every time I do a parse call for the query against the SCOTT schema (PARSE #139819795591784), Oracle does a parse/exec/fetch/close for the query against im_domain$ (PARSE #139819795585328) – and this happens even when the SCOTT query is in the session cursor cache!

As Franck points out, this looks like something to do with the In Memory option even though the option wasn’t enabled in his database, and wasn’t enabled in my client’s database. Once you’ve got a reproducible example of a problem, though, you can start fiddling to see if you can bypass it. In this case I decided to check all the parameters to do with the in-memory option – which is a bit like hard work because there are 208 parameters that include the word “inmemory”. After checking the descriptions of the first twenty or so I decided there was an easier option – if Oracle is asking about “domains” for columns then possibly it’s something to do with the relatively new “join group” feature for in-memory columnar compression, so I ran a query to produce the list of parameter names and descriptions for parameters with the words “join” and “group” in their names – there are two:


_sqlexec_join_group_aware_hj_enabled              enable/disable join group aware hash join
_sqlexec_join_group_aware_hj_unencoded_rowset     minimum number of unencoded rowsets processed before adaptation 
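
A query of roughly this shape would produce that list – a sketch, not necessarily the exact statement I used, and it has to be run as SYS because it references the x$ structures:


select
        x.ksppinm       parameter_name,
        x.ksppdesc      description
from
        x$ksppi         x
where
        x.ksppinm like '%join%group%'
order by
        x.ksppinm
;
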

The first one looks rather promising – it has a default value of TRUE, and it can be changed by “alter session” or “alter system”. So I executed:


alter session set "_sqlexec_join_group_aware_hj_enabled" = false;
alter system flush shared_pool;

Then I ran my test again and voila! there it wasn’t. No more queries against im_domain$.

Problem solved (until the client decides they want to use the in-memory option, perhaps).

There may be other reasons why this recursive query appears which aren’t to do with hash joins, of course, but join groups exist specifically to allow efficient hash joins with the in-memory option (it’s a trick to do with common encoding for compression that allows Bloom filtering to eliminate CUs without decoding), so I’m hoping I won’t have to track down and eliminate other sources of the query.
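
For context, a join group is a declared pairing of columns that will be joined between in-memory tables; the 12.2+ DDL looks something like the following sketch (the table and column names are invented purely for illustration):


create inmemory join group sales_products_jg (
        sales(product_id),
        products(product_id)
);
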

 

January 31, 2019

Descending Problem

Filed under: Execution plans,Indexing,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 3:34 pm GMT Jan 31,2019

I’ve written in the past about oddities with descending indexes ( here, here, and here, for example) but I’ve just come across a case where I may have to introduce a descending index that really shouldn’t need to exist. As so often happens it’s at the boundary where two Oracle features collide. I have a table that handles data for a large number of customers, who record a reasonable number of transactions per year, and I have a query that displays the most recent transactions for a customer. Conveniently the table is partitioned by hash on the customer ID, and I have an index that starts with the customer_id and transaction_date columns. So here’s my query or, to be a little more accurate, the client’s query – simplified and camouflaged:


select  /*+ gather_plan_statistics */
        *
from    (
             select
                    v1.*,
                    rownum rn
             from   (
                             select   /*
                                         no_eliminate_oby
                                         index_rs_desc(t1 (customer_id, transaction_date))
                                      */
                                      t1.*
                             from     t1
                             where    customer_id = 50
                             and      transaction_date >= to_date('1900-01-01','yyyy-mm-dd')
                             order by transaction_date DESC
                ) v1
                where  rownum <= 10 -- > comment to avoid WordPress format issue
         )
where    rn >= 1
;

You’ll notice some hinting – the /*+ gather_plan_statistics */ will allow me to report the rowsource execution stats when I pull the plan from memory, and the hints in the inline view (which I’ve commented out in the text above) will force a particular execution plan – walking through the index on (customer_id, transaction_date) in descending order.

If I create t1 as a simple (non-partitioned) heap table I get the following plan unhinted (I’ve had to edit a “less than or equal to” symbol to avoid a WordPress format issue):

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |       |      1 |        |    14 (100)|     10 |00:00:00.01 |      14 |
|*  1 |  VIEW                           |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  2 |   COUNT STOPKEY                 |       |      1 |        |            |     10 |00:00:00.01 |      14 |
|   3 |    VIEW                         |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|   4 |     TABLE ACCESS BY INDEX ROWID | T1    |      1 |    340 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  5 |      INDEX RANGE SCAN DESCENDING| T1_I1 |      1 |     10 |     3   (0)|     10 |00:00:00.01 |       4 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   5 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Notice the descending range scan of the index – just as I wanted it – the minimal number of buffer visits, and only 10 rows (and rowids) examined from the table. But what happens if I recreate t1 as a hash-partitioned table with local index – here’s the new plan, again without hinting the SQL:


----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |       |      1 |        |   207 (100)|     10 |00:00:00.01 |     138 |       |       |          |
|*  1 |  VIEW                                          |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  2 |   COUNT STOPKEY                                |       |      1 |        |            |     10 |00:00:00.01 |     138 |       |       |          |
|   3 |    VIEW                                        |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  4 |     SORT ORDER BY STOPKEY                      |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |  2048 |  2048 | 2048  (0)|
|   5 |      PARTITION HASH SINGLE                     |       |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|   6 |       TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1    |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|*  7 |        INDEX RANGE SCAN                        | T1_I1 |      1 |    340 |     4   (0)|    340 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   4 - filter(ROWNUM .LE. 10)
   7 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE">=TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TRANSACTION_DATE" IS NOT NULL)

Even though the optimizer has recognised that it will be visiting a single partition through a local index it has not chosen a descending index range scan, though it has used the appropriate index; so it has fetched all the relevant rows from the table in the wrong order then sorted them, discarding all but the top 10. We’ve done 138 buffer visits (which would turn into disk I/Os, and far more of them, in the production system).

Does this mean that the optimizer can’t use the descending index when the table is partitioned – or that somehow the costing has gone wrong? Here’s the plan with the hints in place to see what happens when we demand a descending range scan:


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |      1 |        |   207 (100)|     10 |00:00:00.01 |       8 |
|*  1 |  VIEW                                 |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |       8 |
|*  2 |   COUNT STOPKEY                       |       |      1 |        |            |     10 |00:00:00.01 |       8 |
|   3 |    VIEW                               |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |       8 |
|   4 |     PARTITION HASH SINGLE             |       |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID| T1    |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|*  6 |       INDEX RANGE SCAN DESCENDING     | T1_I1 |      1 |    340 |     4   (0)|     16 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   6 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The optimizer is happy to oblige with the descending range scan – we can see that we’ve visited only 8 buffers, and fetched only 10 rows from the table. The cost, however, hasn’t made any allowance for the limited range scan. Check back to the plan for the simple (non-partitioned) table and you’ll see that the optimizer did allow for the reduced range scan. So the problem here is a costing one – we have to hint the index range scan if we want Oracle to limit the work it does.

You might notice, by the way, that the number of rowids returned in the index range scan descending operation is 16 rather than 10 – a little variation that didn’t show up when the table wasn’t partitioned. I don’t know why this happened, but when I changed the requirement to 20 rows the range scan returned 31 rowids, when I changed it to 34 rows the range scan returned 46 rowids, and a request for 47 rows returned 61 index rowids – you can see the pattern: the number of rowids returned by the index range scan seems to be 1 + 15*N.

Footnote:

If you want to avoid hinting the code (or adding an SQL patch) you need only re-create the index with the transaction_date column declared as descending (“desc”), at which point the optimizer automatically chooses the correct strategy and the run-time engine returns exactly 10 rowids and doesn’t need to do any sorting. But who wants to create a descending index when they don’t really need it !
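
In case it helps, the re-creation would look something like the following sketch (using the definitions from the script below, and keeping the original index name):


drop index t1_i1;

create index t1_i1 on t1(customer_id, transaction_date desc)
local
nologging
;
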

If you want to reproduce the experiments, here’s the script to create my test data.


rem
rem     Script:         pt_ind_desc_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem     Purpose:        
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1 (
        customer_id,
        transaction_date,
        small_vc,
        padding 
)
partition by hash(customer_id) partitions 4
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        mod(rownum,128)                         customer_id,
        (trunc(sysdate) - 1e6) + rownum         transaction_date,
        lpad(rownum,10,'0')                     v1,
        lpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(customer_id, transaction_date) 
local 
nologging
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

I’ve run this test on 12.1.0.2, 12.2.0.1, and 18.3.0.0 – the behaviour is the same in all three versions.

Update (1st Feb 2019)

As the client reminded me after reading the post, it’s worth pointing out that for more complex SQL you still have to worry about the errors in the cardinality and cost calculations that could easily push the optimizer into the wrong join order and/or join method – whether you choose to hint the ascending index or create a descending index.  Getting the plan you want for this type of “pagination” query can be a messy process.

December 7, 2018

Misdirection

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 11:48 am GMT Dec 7,2018

A recent post on the ODC database forum prompted me to write a short note about a trap that catches everyone from time to time. The trap is following the obvious – and it’s a trap because it’s only previous experience that lets you decide what’s obvious, and the similarity between what you’re looking at and your previous experience may be purely coincidental.

The question on OTN (paraphrased) was as follows:

When I run the first query below Oracle doesn’t use the index on column AF and is slow, but when I run the second query Oracle uses the index and it’s fast. So when the input starts with ‘\\’ the indexes are not used. What’s going on ?


SELECT * FROM T WHERE AF = '\\domain\test\1123.pdf';
SELECT * FROM T WHERE AF = 'a\\domain\test\1123.pdf';

Looking at the two queries my first thought was that it’s obvious what’s (probably) happening, and my second thought was the more interesting question: “why does this person think that the ‘\\’ is significant ?”

The cause of the difference in behaviour is probably related to the way that Oracle stores statistics (specifically histograms) about character columns, and the way in which the cardinality calculations can go wrong. If two character strings match over the first few characters the numeric representation of those strings that Oracle uses in a histogram is identical, and if they are long enough even the “actual value” stored would be identical. It looks as if this person is storing URLs, and it’s quite likely that there are a lot of long URLs that start with the same (long) string of characters – it’s a very old problem – and it’s an example of a column where you probably want to be absolutely sure that you don’t gather a histogram.
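
Purely as an illustration – the table name T and column name AF come from the OP’s queries, and the rest of the method_opt string is an assumption about what suits the other columns – a table preference like the following would stop future stats collections from building a histogram on AF, and a subsequent regather would then leave the column without one:


begin
        dbms_stats.set_table_prefs(
                ownname => user,
                tabname => 'T',
                pname   => 'METHOD_OPT',
                pvalue  => 'for all columns size auto for columns size 1 AF'
        );
end;
/
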

But why did the OP decide that the ‘\\’ was the significant bit ? I don’t know, of course, but how about this as a guess:

  • No contrary tests: Perhaps every single time the query misbehaved the value started with ‘\\’ and it never went wrong for any other starting values. And maybe the OP tested several different domain names – it would be much easier to see the ‘\\’ as the common denominator rather than “repetitive leading character string” if you tested with values that spanned different domains.

combined with

  • An easily available “justification”: In many programming languages (including SQL) ‘\’ is an escape character – if you don’t really know much about how the optimizer works you might believe that that could be enough to confuse the optimizer.

It can be very difficult when you spot an obvious pattern to pause long enough to consider whether you’ve identified the whole pattern, or whether you’re looking at a special case that’s going to take you in the wrong direction.

 

November 27, 2018

Counting Rows

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 9:08 pm GMT Nov 27,2018

Here’s another little utility I use from time to time (usually for small tables) to check how many rows there are in each block of the table, and which blocks are used. It doesn’t do anything clever – it just calls routines in the dbms_rowid package for each rowid in the table:


rem
rem     Rowid_count.sql
rem     Generic code to count rows per block in a table
rem     Ordered by file and block
rem

define m_table = '&1'

spool rowid_count

select 
        dbms_rowid.rowid_relative_fno(rowid)    rel_file_no, 
        dbms_rowid.rowid_block_number(rowid)    block_no,
        count(*)                                rows_starting_in_block
from 
        &m_table        t1
group by 
        dbms_rowid.rowid_relative_fno(rowid), 
        dbms_rowid.rowid_block_number(rowid) 
order by 
        dbms_rowid.rowid_relative_fno(rowid), 
        dbms_rowid.rowid_block_number(rowid)
;


select
        rows_starting_in_block,
        count(*)        blocks
from
        (
        select 
                dbms_rowid.rowid_relative_fno(rowid), 
                dbms_rowid.rowid_block_number(rowid),
                count(*)                                rows_starting_in_block
        from 
                &m_table        t1
        group by 
                dbms_rowid.rowid_relative_fno(rowid), 
                dbms_rowid.rowid_block_number(rowid) 
        )
group by
        rows_starting_in_block
order by
        rows_starting_in_block
;

spool off


And here’s a sample of the output:


REL_FILE_NO   BLOCK_NO ROWS_STARTING_IN_BLOCK
----------- ---------- ----------------------
         22        131                    199
         22        132                    199
         22        133                    199
         22        134                    199
         22        135                     88
         22        138                    111

6 rows selected.


ROWS_STARTING_IN_BLOCK     BLOCKS
---------------------- ----------
                    88          1
                   111          1
                   199          4

3 rows selected.


Obviously it could take quite a lot of I/O and CPU to run the two queries against a large table – generally I use it when I want to pick a block to dump afterwards.
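
For reference, dumping one of the blocks from the sample output above would look something like this (file 22 and block 133 come from the listing; the dump ends up in the session’s trace file):


alter system dump datafile 22 block 133;
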

October 19, 2018

add_colored_sql

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 3:08 pm GMT Oct 19,2018

The following request appeared recently on the Oracle-L mailing list:

I have one scenario related to capturing of sql statement in history table..  Like dba_hist_sqltext capture the queries that ran for 10 sec or more..  How do I get the sql stmt which took less time say in  millisecond..  Any idea please share.

An AWR snapshot captures statements that (a) meet some workload criteria such as “lots of executions” and (b) happen to be in the library cache when the snapshot takes place; but if you have some statements that you think are important or interesting enough to keep an eye on, and that don’t do enough work to meet the normal workload requirements of the AWR snapshots, it’s still possible to tell Oracle to capture them by “coloring” them. (Apologies for the American spelling – it’s necessary to avoid error ‘PLS_00302: component %s must be declared’.)

Somewhere in the 11gR1 timeline the package dbms_workload_repository acquired the following two procedures:


PROCEDURE ADD_COLORED_SQL
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 SQL_ID                         VARCHAR2                IN
 DBID                           NUMBER                  IN     DEFAULT

PROCEDURE REMOVE_COLORED_SQL
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 SQL_ID                         VARCHAR2                IN
 DBID                           NUMBER                  IN     DEFAULT

You have to be licensed to use the workload repository, of course, but if you are you can call the first procedure to mark an SQL statement as “interesting”, after which its execution statistics will be captured whenever it’s still in the library cache at snapshot time. The second procedure lets you stop the capture – and you will probably want to use this procedure from time to time because there’s a limit (currently 100) to the number of statements you’re allowed to color and if you try to exceed the limit your call will raise Oracle error ORA-13534.


ORA-13534: Current SQL count(100) reached maximum allowed (100)
ORA-06512: at "SYS.DBMS_WORKLOAD_REPOSITORY", line 751
ORA-06512: at line 3
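
Basic usage is a one-line call to each procedure – a minimal sketch, where the sql_id is just a made-up placeholder:


begin
        -- 'abcd0123wxyz9' is a placeholder; substitute the sql_id you care about
        dbms_workload_repository.add_colored_sql(sql_id => 'abcd0123wxyz9');
end;
/

begin
        dbms_workload_repository.remove_colored_sql(sql_id => 'abcd0123wxyz9');
end;
/
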

If you want to see the list of statements currently marked as colored you can query table wrm$_colored_sql, exposed through the views dba_hist_colored_sql and (in 12c) cdb_hist_colored_sql. (Note: I haven’t yet tested whether the limit of 100 statements is per PDB or summed across the entire CDB [but see comment #2 below] – and the answer may vary with version of Oracle, of course).


SQL> select * from sys.wrm$_colored_sql;

      DBID SQL_ID             OWNER CREATE_TI
---------- ------------- ---------- ---------
3089296639 aedf339438ww3          1 28-SEP-18

1 row selected.

If you’ve had to color a statement to force the AWR snapshot to capture it, the statement probably won’t appear in the standard AWR reports; but it will be available to the “AWR SQL” report (which I usually generate from SQL*Plus with a call to $ORACLE_HOME/rdbms/admin/awrsqrpt.sql).

Footnote

If the statement you’re interested in executes very infrequently and often drops out of the library cache before it can be captured in an AWR snapshot then an alternative strategy is to enable system-wide tracing for that statement so that you can capture every execution in a trace file.
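
One way of doing that – a sketch only, with a placeholder sql_id, and the event syntax can be fussy and varies a little with version – is the sql_trace event scoped to a single SQL_ID:


alter system set events 'sql_trace [sql:abcd0123wxyz9] wait=true, bind=true';

-- ... and later, to switch it off:

alter system set events 'sql_trace [sql:abcd0123wxyz9] off';
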

 

October 17, 2018

Problem Solving

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 4:11 pm GMT Oct 17,2018

Here’s a little question that popped up on the Oracle-L list server a few days ago:

I am facing this issue running this command in 11.2.0.4.0 (also in 12c R2 I got the same error)

SQL> SELECT TO_TIMESTAMP('1970-01-01 00:00:00.0','YYYY-MM-DD HH24:MI:SS.FF') + NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL;
SELECT TO_TIMESTAMP('1970-01-01 00:00:00.0','YYYY-MM-DD HH24:MI:SS.FF') + NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL
ORA-01873: a precisão precedente do intervalo é pequena demais

 

How do you go about finding out what’s going on ? In my case the first thing is to check the translation of the error message (two options):

SQL> execute dbms_output.put_line(sqlerrm(-1873))
ORA-01873: the leading precision of the interval is too small

SQL> SELECT TO_TIMESTAMP('1970-01-01 00:00:00.0','YYYY-MM-DD HH24:MI:SS.FF') + NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL;
SELECT TO_TIMESTAMP('1970-01-01 00:00:00.0','YYYY-MM-DD HH24:MI:SS.FF') + NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL
                                                                                                       *
ERROR at line 1:
ORA-01873: the leading precision of the interval is too small

That didn’t quite match my guess, but it was similar; I had been guessing that it was saying something about precision – but it doesn’t really strike me as an intuitively self-explanatory message, so maybe a quick check in $ORACLE_HOME/rdbms/mesg/oraus.msg to find the error number with its cause and action will help:


01873, 00000, "the leading precision of the interval is too small"
// *Cause: The leading precision of the interval is too small to store the
//  specified interval.
// *Action: Increase the leading precision of the interval or specify an
//  interval with a smaller leading precision.

Well, that doesn’t really add value – and I can’t help feeling that if the leading precision of the interval is too small it won’t help to make it smaller. So all I’m left to go on is that there’s a precision problem of some sort and it’s something to do with the interval, and probably NOT with adding the interval to the timestamp. So let’s check that bit alone:


SQL> SELECT NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL;
SELECT NUMTODSINTERVAL(2850166802000/1000, 'SECOND') FROM DUAL
                                    *
ERROR at line 1:
ORA-01873: the leading precision of the interval is too small


So the interval bit is the problem. Since the problem is about “precision”, let’s try messing about with the big number. First I’ll do a bit of cosmetic tidying by doing the division to knock off the trailing zeros, then I’ll see what happens when I divide by 10:

SQL> SELECT NUMTODSINTERVAL(285016680, 'SECOND') from dual;

NUMTODSINTERVAL(285016680,'SECOND')
---------------------------------------------------------------------------
+000003298 19:18:00.000000000

So 285 million works, but 2.85 billion doesn’t. The value that works gives an interval of about 3,298 days – just over nine years – so maybe there’s an undocumented limit of 100 years on the input value; on the other hand the jump from 285 million to 2.85 billion does take you through a critical computer-oriented limit: 2^31 – 1, the maximum signed 32-bit integer (2147483647), so let’s try using that value, and that value plus 1, in the expression:


SQL> SELECT NUMTODSINTERVAL(power(2,31), 'SECOND') from dual;
SELECT NUMTODSINTERVAL(power(2,31), 'SECOND') from dual
                       *
ERROR at line 1:
ORA-01873: the leading precision of the interval is too small


SQL> SELECT NUMTODSINTERVAL(power(2,31)-1, 'SECOND') from dual;

NUMTODSINTERVAL(POWER(2,31)-1,'SECOND')
---------------------------------------------------------------------------
+000024855 03:14:07.000000000

1 row selected.

Problem identified – it’s a numeric limit of the numtodsinterval() function. Interestingly it’s not documented in the Oracle manuals; in fact the SQL Reference manual suggests that this shouldn’t be a limit because it says that “any number value or anything that can be cast as a number is legal” – and in Oracle-speak a number allows for roughly 38 digits of precision.

Whilst we’ve identified the problem we still need a way to turn the input number into the timestamp we need – the OP didn’t need help with that one: divide by sixty and convert using minutes instead of seconds:


SQL> SELECT TO_TIMESTAMP('1970-01-01 00:00:00.0','YYYY-MM-DD HH24:MI:SS.FF') + NUMTODSINTERVAL(2850166802000/1000/60, 'MINUTE') FROM DUAL;

TO_TIMESTAMP('1970-01-0100:00:00.0','YYYY-MM-DDHH24:MI:SS.FF')+NUMTODSINTER
---------------------------------------------------------------------------
26-APR-60 01.00.02.000000000 AM

1 row selected

Job done.

September 30, 2018

Case Study

Filed under: 12c,Execution plans,Oracle,subqueries,Troubleshooting — Jonathan Lewis @ 7:59 pm GMT Sep 30,2018

A question about reading execution plans and optimising queries arrived on the ODC database forum a little while ago; the owner says the following statement is taking 14 minutes to return 30,000 rows and wants some help understanding why.

If you look at the original posting you’ll see that we’ve been given the text of the query and the execution plan including rowsource execution stats. There’s an inconsistency between the supplied information and the question asked, and I’ll get back to that shortly, but to keep this note fairly short I’ve excluded the 2nd half of the query (which is a UNION ALL) because the plan says the first part of the query took 13 minutes and 20 seconds and the user is worried about a total of 14 minutes.

SELECT /*+ gather_plan_statistics*/ DISTINCT
                rct.org_id,
                hzp.party_name,
                hca.account_number,
                rct.interface_header_attribute1 order_number,
                rct.customer_trx_id,
                rct.trx_number,
                rct.trx_date,
                rctd.gl_date,
                rct.creation_date,
                rctl.line_number,
                rct.invoice_currency_code inv_currency,
                (
                       SELECT SUM (rct_1.extended_amount)
                       FROM   apps.ra_customer_trx_lines_all rct_1
                       WHERE  rct_1.customer_trx_id = rct.customer_trx_id
                       AND    rct_1.line_type = 'LINE') inv_net_amount,
                (
                       SELECT SUM (rct_2.extended_amount)
                       FROM   apps.ra_customer_trx_lines_all rct_2
                       WHERE  rct_2.customer_trx_id = rct.customer_trx_id
                       AND    rct_2.line_type = 'TAX') inv_tax_amount,
                (
                       SELECT SUM (rct_3.extended_amount)
                       FROM   apps.ra_customer_trx_lines_all rct_3
                       WHERE  rct_3.customer_trx_id = rct.customer_trx_id) inv_gross_amount,
                gll.currency_code                                    func_currency,
                Round((
                        (
                        SELECT SUM (rct_4.extended_amount)
                        FROM   apps.ra_customer_trx_lines_all rct_4
                        WHERE  rct_4.customer_trx_id = rct.customer_trx_id
                        AND    rct_4.line_type = 'LINE')*gdr.conversion_rate),2) func_net_amount,
                Round((
                        (
                        SELECT SUM (rct_5.extended_amount)
                        FROM   apps.ra_customer_trx_lines_all rct_5
                        WHERE  rct_5.customer_trx_id = rct.customer_trx_id
                        AND    rct_5.line_type = 'TAX')*gdr.conversion_rate),2) func_tax_amount,
                Round((
                        (
                        SELECT SUM (rct_6.extended_amount)
                        FROM   apps.ra_customer_trx_lines_all rct_6
                        WHERE  rct_6.customer_trx_id = rct.customer_trx_id)*gdr.conversion_rate),2) func_gross_amount,
                glcc.segment1                                                                 company,
                glcc.segment2                                                                 account,
                hg.geography_name                                                             billing_country,
                gdr.conversion_rate
FROM            apps.hz_parties hzp,
                apps.hz_cust_accounts hca,
                apps.ra_customer_trx_all rct,
                apps.ra_customer_trx_lines_all rctl,
                apps.ra_cust_trx_line_gl_dist_all rctd,
                apps.gl_code_combinations_kfv glcc,
                apps.hz_cust_site_uses_all hcsua,
                apps.hz_cust_acct_sites_all hcasa,
                apps.hz_party_sites hps,
                apps.hz_locations hl,
                apps.hz_geographies hg,
                apps.gl_ledgers gll,
                apps.gl_daily_rates gdr
WHERE           hzp.party_id = hca.party_id
AND             hca.cust_account_id = rct.bill_to_customer_id
AND             hca.cust_account_id = hcasa.cust_account_id
AND             rct.customer_trx_id = rctl.customer_trx_id
AND             rctl.customer_trx_line_id = rctd.customer_trx_line_id
AND             glcc.code_combination_id = rctd.code_combination_id
AND             rct.bill_to_site_use_id = hcsua.site_use_id
AND             hcsua.cust_acct_site_id = hcasa.cust_acct_site_id
AND             hcasa.party_site_id = hps.party_site_id
AND             hps.location_id = hl.location_id
AND             hl.country = hg.country_code
AND             hg.geography_type = 'COUNTRY'
AND             rctl.line_type = 'TAX'
AND             gll.ledger_id = rct.set_of_books_id
AND             gdr.from_currency = rct.invoice_currency_code
AND             gdr.to_currency = gll.currency_code
AND             to_date(gdr.conversion_date) = to_date(rctd.gl_date)
AND             gdr.conversion_type = 'Corporate'
AND             rctd.gl_date BETWEEN To_date ('01-JAN-2018', 'DD-MON-YYYY') AND  To_date ('31-JAN-2018', 'DD-MON-YYYY')
AND             glcc.segment1 = '2600'
AND             glcc.segment2 = '206911'
GROUP BY        hzp.party_name,
                hca.account_number,
                rct.interface_header_attribute1,
                rct.trx_number,
                rct.trx_date,
                rct.creation_date,
                rctl.line_number,
                rctl.unit_selling_price,
                rct.org_id,
                rctd.gl_date,
                rct.customer_trx_id,
                glcc.segment1,
                glcc.segment2,
                hg.geography_name,
                rct.invoice_currency_code,
                gll.currency_code,
                gdr.conversion_rate 

We note that there are six scalar subqueries in the text I’ve reported – they form two groups of three, and the difference between the two groups is that one group is multiplied by a conversion rate while the other isn’t; moreover, in each group the three subqueries are simply querying subsets of the same correlated data set. So it looks as if all six scalar subqueries could be eliminated and replaced by the inclusion of an aggregate view in the from clause and the projection of six columns from that view.
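
Just as a sketch of the idea (not a tested rewrite of the OP’s statement): the six subqueries collapse into a single aggregation over ra_customer_trx_lines_all, and an inline view like the following could be joined to the rest of the query on customer_trx_id = rct.customer_trx_id, with the three “func_” columns derived by multiplying the corresponding sums by gdr.conversion_rate:


select
        customer_trx_id,
        sum(case when line_type = 'LINE' then extended_amount end)     inv_net_amount,
        sum(case when line_type = 'TAX'  then extended_amount end)     inv_tax_amount,
        sum(extended_amount)                                            inv_gross_amount
from
        apps.ra_customer_trx_lines_all
group by
        customer_trx_id
;
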

However, before pursuing that option, take a look at the plan with the rowsource execution stats – where is the time going ?


-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                                           |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   1 |  UNION-ALL                                                 |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   2 |   HASH UNIQUE                                              |                              |      1 |      1 |    501 |00:13:20.17 |    3579K|  
|   3 |    HASH GROUP BY                                           |                              |      1 |      1 |  19827 |00:13:20.15 |    3579K|  
|   4 |     NESTED LOOPS                                           |                              |      1 |        |  21808 |00:13:10.26 |    3579K|  
|   5 |      NESTED LOOPS                                          |                              |      1 |      1 |  21808 |00:13:10.11 |    3578K|  
|   6 |       NESTED LOOPS OUTER                                   |                              |      1 |      1 |  21808 |00:13:09.90 |    3576K|  
|   7 |        NESTED LOOPS OUTER                                  |                              |      1 |      1 |  21808 |00:13:09.25 |    3501K|  
|   8 |         NESTED LOOPS OUTER                                 |                              |      1 |      1 |  21808 |00:13:08.48 |    3426K|  
|   9 |          NESTED LOOPS OUTER                                |                              |      1 |      1 |  21808 |00:13:07.66 |    3333K|  
|  10 |           NESTED LOOPS OUTER                               |                              |      1 |      1 |  21808 |00:13:06.92 |    3258K|  
|  11 |            NESTED LOOPS OUTER                              |                              |      1 |      1 |  21808 |00:13:06.08 |    3183K|  
|  12 |             NESTED LOOPS                                   |                              |      1 |      1 |  21808 |00:13:04.69 |    3090K|  
|  13 |              NESTED LOOPS                                  |                              |      1 |      1 |  21808 |00:13:05.75 |    3026K|  
|  14 |               NESTED LOOPS                                 |                              |      1 |      1 |  21808 |00:13:03.30 |    2961K|  
|  15 |                NESTED LOOPS                                |                              |      1 |      1 |  33459 |00:00:04.33 |    1123K|  
|  16 |                 NESTED LOOPS                               |                              |      1 |    351 |  33459 |00:00:03.67 |    1025K|  
|  17 |                  NESTED LOOPS                              |                              |      1 |    351 |  33459 |00:00:03.06 |     926K|  
|  18 |                   NESTED LOOPS                             |                              |      1 |    351 |  33459 |00:00:02.47 |     827K|  
|* 19 |                    HASH JOIN                               |                              |      1 |    351 |  33459 |00:00:01.90 |     730K|  
|  20 |                     TABLE ACCESS FULL                      | GL_LEDGERS                   |      1 |     38 |     39 |00:00:00.01 |      15 |  
|  21 |                     NESTED LOOPS                           |                              |      1 |        |  33459 |00:00:01.75 |     730K|  
|  22 |                      NESTED LOOPS                          |                              |      1 |    351 |  33459 |00:00:01.44 |     696K|  
|  23 |                       NESTED LOOPS                         |                              |      1 |    351 |  33459 |00:00:01.11 |     646K|  
|* 24 |                        HASH JOIN                           |                              |      1 |    385 |  33459 |00:00:00.40 |     526K|  
|* 25 |                         TABLE ACCESS BY INDEX ROWID BATCHED| GL_CODE_COMBINATIONS         |      1 |     35 |      1 |00:00:00.01 |     108 |  
|* 26 |                          INDEX RANGE SCAN                  | GL_CODE_COMBINATIONS_N2      |      1 |    499 |     77 |00:00:00.01 |       3 |  
|* 27 |                         TABLE ACCESS BY INDEX ROWID BATCHED| RA_CUST_TRX_LINE_GL_DIST_ALL |      1 |    651K|   1458K|00:00:02.22 |     526K|  
|* 28 |                          INDEX RANGE SCAN                  | RA_CUST_TRX_LINE_GL_DIST_N2  |      1 |    728K|   1820K|00:00:01.60 |   11147 |  
|* 29 |                        TABLE ACCESS BY INDEX ROWID         | RA_CUSTOMER_TRX_LINES_ALL    |  33459 |      1 |  33459 |00:00:00.53 |     119K|  
|* 30 |                         INDEX UNIQUE SCAN                  | RA_CUSTOMER_TRX_LINES_U1     |  33459 |      1 |  33459 |00:00:00.31 |   86364 |  
|* 31 |                       INDEX UNIQUE SCAN                    | RA_CUSTOMER_TRX_U1           |  33459 |      1 |  33459 |00:00:00.21 |   49850 |  
|  32 |                      TABLE ACCESS BY INDEX ROWID           | RA_CUSTOMER_TRX_ALL          |  33459 |      1 |  33459 |00:00:00.20 |   33459 |  
|  33 |                    TABLE ACCESS BY INDEX ROWID             | HZ_CUST_ACCOUNTS             |  33459 |      1 |  33459 |00:00:00.42 |   97887 |  
|* 34 |                     INDEX UNIQUE SCAN                      | HZ_CUST_ACCOUNTS_U1          |  33459 |      1 |  33459 |00:00:00.24 |   64428 |  
|  35 |                   TABLE ACCESS BY INDEX ROWID              | HZ_PARTIES                   |  33459 |      1 |  33459 |00:00:00.44 |   98783 |  
|* 36 |                    INDEX UNIQUE SCAN                       | HZ_PARTIES_U1                |  33459 |      1 |  33459 |00:00:00.26 |   65175 |  
|  37 |                  TABLE ACCESS BY INDEX ROWID               | HZ_CUST_SITE_USES_ALL        |  33459 |      1 |  33459 |00:00:00.46 |   98374 |  
|* 38 |                   INDEX UNIQUE SCAN                        | HZ_CUST_SITE_USES_U1         |  33459 |      1 |  33459 |00:00:00.28 |   64915 |  
|* 39 |                 TABLE ACCESS BY INDEX ROWID                | HZ_CUST_ACCT_SITES_ALL       |  33459 |      1 |  33459 |00:00:00.45 |   98195 |  
|* 40 |                  INDEX UNIQUE SCAN                         | HZ_CUST_ACCT_SITES_U1        |  33459 |      1 |  33459 |00:00:00.26 |   64736 |  
|  41 |                TABLE ACCESS BY INDEX ROWID BATCHED         | GL_DAILY_RATES               |  33459 |      1 |  21808 |00:12:44.59 |    1838K|  
|* 42 |                 INDEX RANGE SCAN                           | GL_DAILY_RATES_U1            |  33459 |      1 |  21808 |00:13:08.16 |    1837K|  
|  43 |               TABLE ACCESS BY INDEX ROWID                  | HZ_PARTY_SITES               |  21808 |      1 |  21808 |00:00:00.35 |   64339 |  
|* 44 |                INDEX UNIQUE SCAN                           | HZ_PARTY_SITES_U1            |  21808 |      1 |  21808 |00:00:00.23 |   42531 |  
|  45 |              TABLE ACCESS BY INDEX ROWID                   | HZ_LOCATIONS                 |  21808 |      1 |  21808 |00:00:00.33 |   64353 |  
|* 46 |               INDEX UNIQUE SCAN                            | HZ_LOCATIONS_U1              |  21808 |      1 |  21808 |00:00:00.18 |   42545 |  
|  47 |             VIEW PUSHED PREDICATE                          | VW_SSQ_1                     |  21808 |      1 |  21808 |00:00:01.17 |   93476 |  
|  48 |              SORT GROUP BY                                 |                              |  21808 |      1 |  21808 |00:00:01.06 |   93476 |  
|  49 |               TABLE ACCESS BY INDEX ROWID BATCHED          | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |     16 |    145K|00:00:00.84 |   93476 |  
|* 50 |                INDEX RANGE SCAN                            | XXC_CUSTOMER_GETPAID         |  21808 |     16 |    145K|00:00:00.36 |   59938 |  
|  51 |            VIEW PUSHED PREDICATE                           | VW_SSQ_2                     |  21808 |      1 |  21808 |00:00:00.69 |   74433 |  
|  52 |             SORT GROUP BY                                  |                              |  21808 |      1 |  21808 |00:00:00.59 |   74433 |  
|  53 |              TABLE ACCESS BY INDEX ROWID BATCHED           | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  92201 |00:00:00.49 |   74433 |  
|* 54 |               INDEX RANGE SCAN                             | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  92201 |00:00:00.24 |   59903 |  
|  55 |           VIEW PUSHED PREDICATE                            | VW_SSQ_3                     |  21808 |      1 |  21808 |00:00:00.61 |   74852 |  
|  56 |            SORT GROUP BY                                   |                              |  21808 |      1 |  21808 |00:00:00.51 |   74852 |  
|  57 |             TABLE ACCESS BY INDEX ROWID BATCHED            | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  53060 |00:00:00.38 |   74852 |  
|* 58 |              INDEX RANGE SCAN                              | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  53060 |00:00:00.19 |   59148 |  
|  59 |          VIEW PUSHED PREDICATE                             | VW_SSQ_4                     |  21808 |      1 |  21808 |00:00:00.70 |   93490 |  
|  60 |           SORT GROUP BY                                    |                              |  21808 |      1 |  21808 |00:00:00.61 |   93490 |  
|  61 |            TABLE ACCESS BY INDEX ROWID BATCHED             | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |     16 |    145K|00:00:00.63 |   93490 |  
|* 62 |             INDEX RANGE SCAN                               | XXC_CUSTOMER_GETPAID         |  21808 |     16 |    145K|00:00:00.25 |   59950 |  
|  63 |         VIEW PUSHED PREDICATE                              | VW_SSQ_5                     |  21808 |      1 |  21808 |00:00:00.63 |   74427 |  
|  64 |          SORT GROUP BY                                     |                              |  21808 |      1 |  21808 |00:00:00.54 |   74427 |  
|  65 |           TABLE ACCESS BY INDEX ROWID BATCHED              | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  92201 |00:00:00.44 |   74427 |  
|* 66 |            INDEX RANGE SCAN                                | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  92201 |00:00:00.21 |   59900 |  
|  67 |        VIEW PUSHED PREDICATE                               | VW_SSQ_6                     |  21808 |      1 |  21808 |00:00:00.59 |   74846 |  
|  68 |         SORT GROUP BY                                      |                              |  21808 |      1 |  21808 |00:00:00.50 |   74846 |  
|  69 |          TABLE ACCESS BY INDEX ROWID BATCHED               | RA_CUSTOMER_TRX_LINES_ALL    |  21808 |      8 |  53060 |00:00:00.35 |   74846 |  
|* 70 |           INDEX RANGE SCAN                                 | XXC_CUSTOMER_GETPAID         |  21808 |     12 |  53060 |00:00:00.17 |   59144 |  
|* 71 |       INDEX RANGE SCAN                                     | HZ_GEOGRAPHIES_N11           |  21808 |   5812 |  21808 |00:00:00.13 |    2684 |  
|  72 |      TABLE ACCESS BY INDEX ROWID                           | HZ_GEOGRAPHIES               |  21808 |    168 |  21808 |00:00:00.07 |     620 |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  

Let’s start by raising some concerns about the quality of information available.

First, the OP says it takes 14 minutes to return 30,000 rows: but the top line of the plan says it has taken 13 minutes and 20 seconds to return the first 501 rows, and if we look a little further down the plan operation 3 (Hash Group By) reports 00:13:20.15 to aggregate down to 19,827 rows. So this half of the plan cannot return more than 19,827 rows, and the half I have discarded (for the moment) must be returning the other 10,000+ rows. The information we have is incomplete.

Of course you may think that whatever the rest of the plan does is fairly irrelevant – it’s only going to be responsible for at most another 40 seconds of processing – except my previous experience of rowsource execution statistics tells me that when you do a large number of small operations the times reported can be subject to fairly large rounding errors, and that enabling the measurement can increase the execution time by a factor of three or four. It’s perfectly feasible that this half of the query is actually the faster half under normal run-time circumstances but runs much more slowly (with a much higher level of CPU utilisation) when rowsource execution stats is enabled. So let’s not get too confident.

With that warning in mind, what can we see in this half of the plan?

Big picture: the inline scalar subqueries have disappeared. In 12c the optimizer can unnest scalar subqueries in the select list and turn them into outer joins, and we can see that there are 6 “Nested Loop Outer” operations, corresponding to 6 “View Pushed Predicate” operations against views labelled VW_SSQ_1 through to VW_SSQ_6 (SSQ = Scalar Sub Query ?). This goes back to my earlier comment – a person could probably rewrite the 6 scalar subqueries as a single aggregate view in the from clause: the optimizer isn’t quite clever enough to manage that in this case, but in simpler cases it might be able to do exactly that.
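
To make that idea concrete, here’s a minimal, hedged sketch of the kind of rewrite meant here. The table and column names (orders, order_lines, order_id, line_type, amount) are purely illustrative and not taken from the OP’s query:

rem     Original shape: several scalar subqueries in the select list ...

select
        ord.*,
        (select sum(l.amount)
           from order_lines l
          where l.order_id  = ord.order_id
            and l.line_type = 'LINE')   line_total,
        (select sum(l.amount)
           from order_lines l
          where l.order_id  = ord.order_id
            and l.line_type = 'TAX')    tax_total
from    orders ord
;

rem     ... rewritten as a single aggregate view, outer joined:

select
        ord.*, agg.line_total, agg.tax_total
from    orders ord
left join (
        select  order_id,
                sum(case line_type when 'LINE' then amount end) line_total,
                sum(case line_type when 'TAX'  then amount end) tax_total
        from    order_lines
        group by
                order_id
        ) agg
on      agg.order_id = ord.order_id
;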

Big picture 2: most of the 13 minutes 20 seconds appears at operation 14 as it processes the 33,459 rows supplied to it from the 4.33 seconds of work done by operation 15 and its descendants. Reducing this part of the execution plan to the smallest relevant section we get the following:

-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|  14 |               NESTED LOOPS                                 |                              |      1 |      1 |  21808 |00:13:03.30 |    2961K|  
|  15 |                NESTED LOOPS                                |                              |      1 |      1 |  33459 |00:00:04.33 |    1123K|  
|  41 |                TABLE ACCESS BY INDEX ROWID BATCHED         | GL_DAILY_RATES               |  33459 |      1 |  21808 |00:12:44.59 |    1838K|  
|* 42 |                 INDEX RANGE SCAN                           | GL_DAILY_RATES_U1            |  33459 |      1 |  21808 |00:13:08.16 |    1837K|  
-----------------------------------------------------------------------------------------------------------------------------------------------------  

For each row supplied by operation 15 Oracle calls operation 41, which calls operation 42 to do an index range scan to supply a set of rowids so that operation 41 can access a table and return rows. Apparently the total time spent by operation 41 waiting for operation 42 to return rowids and then doing its own work is 12 minutes 44 seconds, while the range scans alone (all 33,459 of them) take 13 minutes and 8 seconds. Remember, though, that “lots of small operations = scope of rounding errors” when you look at these timings. Despite the inconsistency between the timings for operations 41 and 42 it’s reasonable to conclude that between them that’s where most of the execution time went.

Two questions – (a) can we refine our analysis of how the time is split between the two operations, and (b) why do these lines take so much time?

Check the Starts and the A-rows: (reminder: for comparison, we expect A-rows to be approximately E-rows * Starts) for both operations we see 33,459 starts and 21,808 rows. The index range scans return (on average) a single rowid about two-thirds of the time, and every time a range scan returns a rowid the corresponding row is returned from the table (If you check the Id column there’s no asterisk on operation 41 so no extra predicate is applied as Oracle accesses the table row – but even if there were an extra predicate we’d still be happy to infer that if 21,808 rowids returned from operation 42 turned into 21,808 rows returned from the table then there are no wasted accesses to the table).

Now look at the Buffers for the index range scan – 1.837M: that’s roughly 56 buffers per range scan – that’s a lot of index to range through to find one rowid, which is a good clue that perhaps we do a lot of work with each Start and really do use up a lot of CPU on this operation. Let’s see what the Predicate Section of the plan tells us about this range scan:


Predicate Information (identified by operation id):  
---------------------------------------------------  
  42 - access("GDR"."FROM_CURRENCY"="RCT"."INVOICE_CURRENCY_CODE" AND "GDR"."TO_CURRENCY"="GLL"."CURRENCY_CODE" AND   
              "GDR"."CONVERSION_TYPE"='Corporate')  
       filter(("GDR"."CONVERSION_TYPE"='Corporate' AND TO_DATE(INTERNAL_FUNCTION("GDR"."CONVERSION_DATE"))=TO_DATE(INTERNAL_FUNCTION("RCTD"."  
              GL_DATE"))))  

We have access predicates (things which narrow down the number of leaf blocks that we walk through) and filter predicates (things we do to test every key entry we access). Notably the gdr.conversion_type is a filter predicate as well as an access predicate – and that suggests that our set of predicates has “skipped over” a column in the index: from_currency and to_currency might be the first two columns in the index, but conversion_type is then NOT the third.

More significantly, though, there’s a column called conversion_date in the index (maybe that’s column 3 in the index – it feels like it ought to be); but for every index entry we’ve selected from the 56 blocks we walk through we do some sort of internal conversion (or un-translated transformation) to the column then convert the result to a date to compare it with another date (similarly processed from an earlier operation). What is that “internal function” – let’s check the query:


AND             gdr.from_currency = rct.invoice_currency_code
AND             gdr.to_currency = gll.currency_code
AND             gdr.conversion_type = 'Corporate'
AND             to_date(gdr.conversion_date) = to_date(rctd.gl_date)
AND             rctd.gl_date BETWEEN To_date ('01-JAN-2018', 'DD-MON-YYYY') AND  To_date ('31-JAN-2018', 'DD-MON-YYYY')

(I’ve swapped the order of a couple of lines to highlight a detail).

The filter predicate is comparing gdr.conversion_date with rctd.gl_date – and we can probably assume that both columns really are dates because (a) the word “date” is in their names and (b) the rctd.gl_date is being compared with genuine date values in the next predicate down (and – though I haven’t shown it – the way the plan reports the next predicate proves that the column really is a date datatype).

So the predicate in the SQL applies the to_date() function to two columns that are dates – which means the optimizer has to convert the date columns to some default character format and then convert them back to dates. The “internal function” is a to_char() call. Conversions between date and character formats are CPU-intensive, and we’re doing a double conversion (to_date(to_char(column_value))) to every data value in roughly 56 blocks of an index each time we call that line of the plan. It’s not surprising we spend a lot of time in that line.

Initial strategy:

Check the column types for those two columns; if they are both date types, decide whether or not the predicate could be modified to a simple gdr.conversion_date = rctd.gl_date (though it’s possible that something slightly more sophisticated should be used) – but whatever you do, avoid the redundant conversion through character format.
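
As a first step, here’s a hedged sketch of that check (assuming rctd is the alias for ra_cust_trx_line_gl_dist_all, and leaving out the owner predicate you would want in a multi-schema environment):

select  table_name, column_name, data_type
from    all_tab_columns
where   (table_name = 'GL_DAILY_RATES'               and column_name = 'CONVERSION_DATE')
or      (table_name = 'RA_CUST_TRX_LINE_GL_DIST_ALL' and column_name = 'GL_DATE')
order by
        table_name, column_name
;

rem     If both report DATE then a rewrite along the lines of
rem             and     trunc(gdr.conversion_date) = trunc(rctd.gl_date)
rem     should preserve the intent of the original comparison (subject to the
rem     nls_date_format warning below), and if both columns are known to hold
rem     only midnight values the predicate can reduce to a simple
rem             and     gdr.conversion_date = rctd.gl_date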

Ideally, of course, if we can avoid this conversion we may find that Oracle can be more accurate in its range scan through the index, but we may still find that we do a large range scan even if we do manage to do it a little more efficiently, in which case we may want to see if there is an alternative index which will allow us to pick the one rowid we need from the index without visiting so many leaf blocks in the index.

Warning

Simply eliminating the to_date() calls may change the results. Here’s a demonstration of how nasty things happen when you apply to_date() to a date:


SQL> desc t1
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 D1                                     DATE
 D2                                     DATE

SQL> insert into t1 values(sysdate, sysdate + 10/86400);

1 row created.

SQL> select * from t1 where d1 = d2;

no rows selected

SQL> select * from t1 where to_date(d1) = to_date(d2);

D1        D2
--------- ---------
30-SEP-18 30-SEP-18

1 row selected.

SQL> alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select * from t1 where to_date(d1) = to_date(d2);

no rows selected

Different users could get different results because they have different settings for their nls_date_format.

Reminder

I started my analysis with two comments about the quality of information – first, we don’t really know whether or not this half of the union all would be responsible for most of the time if rowsource execution statistics were not enabled; secondly, a large number of small operations can lead to a lot of rounding errors in timing. There are six occurrences of unnested scalar subqueries which are all called 21,808 times – and the recorded time for all 6 of them is remarkably small given the number of executions, even when allowing for the precision with which they operate; it’s possible that these subqueries take a larger fraction of the total time than the plan indicates, in which case it might become necessary (rather than just nice) to do a manual unnesting and reduce the number of inline views to 3 (one for each line_type), 2 (one with, one without, conversion_rate) or just one.

Footnote

Once again I’ve spent a couple of hours writing notes to explain the thoughts that went through my mind in roughly 10 minutes of reading the original posting. It’s a measure really of how many bits of information you can pull together, and possibly discard, very quickly once you understand how many things the optimizer is capable of doing and how the execution plan tries to show you how a statement was (or will be) handled.

Update (5th Oct 2018)

Another way of looking for the best strategy for tuning this statement, given the available information, is this:

Where, in the sequence of events, does the data volume we’re processing drop to the right scale for the output? If we don’t drop to the right scale very early in the plan execution then we may need to re-arrange the order in which we visit tables; if we are operating at the right volume almost immediately then there’s a good chance that we’ve started the right way. Take a look at the first few lines of this plan (remembering that the query was interrupted before returning the whole result set):

-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                                           |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   1 |  UNION-ALL                                                 |                              |      1 |        |    501 |00:13:20.17 |    3579K|  
|   2 |   HASH UNIQUE                                              |                              |      1 |      1 |    501 |00:13:20.17 |    3579K|  
|   3 |    HASH GROUP BY                                           |                              |      1 |      1 |  19827 |00:13:20.15 |    3579K|  
|   4 |     NESTED LOOPS                                           |                              |      1 |        |  21808 |00:13:10.26 |    3579K|
-----------------------------------------------------------------------------------------------------------------------------------------------------    

At line 4 we generate 21,808 rows which we aggregate down to 19,827, which we then hash down to distinct values – the original user told us that the query returns 30,000 rows so we shouldn’t assume that the uniqueness requirement has reduced 19,827 rows to the 501 reported so far: there may be more to come. What we can say about these numbers, particularly lines 3 and 4, is that prior to the aggregation we need to find about 22,000 rows and carry them through the rest of the plan.

Now look at lines 24 – 28 where the heavy duty action starts (the first physical operation is actually at lines 19/20 where (thanks to swapping join inputs) Oracle scans the gl_ledger table and hashes it into memory in anticipation of incoming probe data – but that’s a tiny blip on the way to the big join):

-----------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                                                  | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  
-----------------------------------------------------------------------------------------------------------------------------------------------------  
|* 24 |                        HASH JOIN                           |                              |      1 |    385 |  33459 |00:00:00.40 |     526K|  
|* 25 |                         TABLE ACCESS BY INDEX ROWID BATCHED| GL_CODE_COMBINATIONS         |      1 |     35 |      1 |00:00:00.01 |     108 |  
|* 26 |                          INDEX RANGE SCAN                  | GL_CODE_COMBINATIONS_N2      |      1 |    499 |     77 |00:00:00.01 |       3 |  
|* 27 |                         TABLE ACCESS BY INDEX ROWID BATCHED| RA_CUST_TRX_LINE_GL_DIST_ALL |      1 |    651K|   1458K|00:00:02.22 |     526K|  
|* 28 |                          INDEX RANGE SCAN                  | RA_CUST_TRX_LINE_GL_DIST_N2  |      1 |    728K|   1820K|00:00:01.60 |   11147 | 
-----------------------------------------------------------------------------------------------------------------------------------------------------    

The important thing we see here is that the very first hash join identifies 33,459 rows: we’re immediately into the right ball-park for the final output. The timings are a bit suspect – I really don’t like seeing the time for hash join (0.4 seconds) being smaller than one of its direct child operations (the 2.22 seconds) – but this bit of the work seems to get to the right scale very quickly: this looks as if it’s likely to be a good way to start the final join order.

We might question whether the optimizer has been wise to use an index range scan to identify 1.45 million rows in a table (probing it 1.82 million times). Maybe that was quick because all the data had previously been buffered and perhaps this range scan will be extremely slow on a busy production system; maybe a tablescan would be better, maybe there’s a way of getting to this big table through a different join order that means we only visit it roughly 33,459 times through an index that identifies exactly the rows we really need. Without good knowledge of what the data looks like (and without understanding what the query is supposed to achieve and how often it runs) we can only look at the supplied execution plan and work out where the time went and whether that suggests the plan is doing roughly the right thing or doing something that is clearly silly. This plan looks like a reasonable starting point with one minor (we hope) glitch around line 42 that we identified earlier on.

 

August 24, 2018

Error Logging

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 11:19 am GMT Aug 24,2018

Error logging is a topic that I’ve mentioned a couple of times in the past, most recently as a follow-up in a discussion of the choices for copying a large volume of data from one table to another, but originally in an addendum about a little surprise you may get when you use extended strings (max_string_size = EXTENDED).

If you use the default call to dbms_errlog.create_error_log() to create an error logging table then Oracle will create a table with a few columns of its own plus every column (name) that you have in your original table – but it will create your columns as varchar2(4000), or nvarchar2(2000), or raw(2000) – unless you’ve set the max_string_size to extended. Here’s a simple demo script with results from two different systems, one with the default setting, the other with the extended setting (note that there’s a little inconsistency in handling raw() columns).


rem
rem     Script:         log_errors_min.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2018
rem     Purpose:
rem

create table t1 (
        v1      varchar2(10),
        n1      number(2,0),
        d1      date,
        nv1     nvarchar2(10),
        r1      raw(10)
);


execute dbms_errlog.create_error_log('t1')

desc err$_t1


max_string_size = STANDARD
--------------------------
 Name			       Null?	Type
 ----------------------------- -------- --------------------
 ORA_ERR_NUMBER$			NUMBER
 ORA_ERR_MESG$				VARCHAR2(2000)
 ORA_ERR_ROWID$ 			ROWID
 ORA_ERR_OPTYP$ 			VARCHAR2(2)
 ORA_ERR_TAG$				VARCHAR2(2000)
 V1					VARCHAR2(4000)
 N1					VARCHAR2(4000)
 D1					VARCHAR2(4000)
 NV1					NVARCHAR2(2000)
 R1					RAW(2000)


max_string_size = EXTENDED
--------------------------
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ORA_ERR_NUMBER$                        NUMBER
 ORA_ERR_MESG$                          VARCHAR2(2000)
 ORA_ERR_ROWID$                         ROWID
 ORA_ERR_OPTYP$                         VARCHAR2(2)
 ORA_ERR_TAG$                           VARCHAR2(2000)
 V1                                     VARCHAR2(32767)
 N1                                     VARCHAR2(32767)
 D1                                     VARCHAR2(32767)
 NV1                                    NVARCHAR2(16383)
 R1                                     RAW(32767)

Every single “original” column that appears in the extended version of this table will be a LOB, with an inline LOB locator of 30 or more bytes. (At least, that’s the 12.1.0.2 implementation, I haven’t checked for 12.2 or 18.3).
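
A quick way to confirm this on your own system is to look for the underlying LOB segments; this is just a sketch, assuming the error table is in your own schema:

select  column_name, segment_name, in_row
from    user_lobs
where   table_name = 'ERR$_T1'
order by
        column_name
;

rem     With max_string_size = EXTENDED you would expect to see one row for each
rem     of the "original" columns; with STANDARD the query should return no rows.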

If this is going to be a problem (e.g. you have a table defined with 500 columns but only use 120 of them) you can create a minimalist error logging table. Provided you create it with the ora_err% columns suitably defined you can add only those columns you’re really interested in (or feel threatened by), and you don’t have to declare them at extreme lengths. e.g.


create table err$_special (
        ora_err_number$         number,
        ora_err_mesg$           varchar2(2000),
        ora_err_rowid$          rowid,
        ora_err_optyp$          varchar2(2),
        ora_err_tag$            varchar2(2000),
        n1                      varchar2(128)
)
;

insert into t1 values(1,'abc','02-jan-1984',sys_op_c2c('abc'),hextoraw('0xFF')) 
log errors into err$_special
reject limit unlimited
;

execute print_table('select * from err$_special')


ORA_ERR_NUMBER$               : 1722
ORA_ERR_MESG$                 : ORA-01722: invalid number

ORA_ERR_ROWID$                :
ORA_ERR_OPTYP$                : I
ORA_ERR_TAG$                  :
N1                            : abc


If you try to create an error logging table that doesn’t include the 5 critical columns you’ll see Oracle error ORA-38900: missing mandatory column “ORA_ERR_{something}” of error log table “{your logging table name}” when you try to log errors into it. The 5 critical columns also have to be the first 5 columns (in any order) in the table, or you’ll get Oracle error ORA-38901 (complaining about the “ORA_ERR_{something}$” column of your logging table) when you try to log errors into it.
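
Here’s a hedged sketch of the first case, reusing the earlier demo objects; the table name err$_missing is invented and the exact error text may vary slightly with version:

create table err$_missing (
        ora_err_number$         number,
        ora_err_mesg$           varchar2(2000),
        ora_err_rowid$          rowid,
        ora_err_optyp$          varchar2(2),
        -- ora_err_tag$ deliberately omitted
        n1                      varchar2(128)
)
;

insert into t1 values(1,'abc','02-jan-1984',sys_op_c2c('abc'),hextoraw('0xFF'))
log errors into err$_missing
reject limit unlimited
;

rem     Expected to fail with something like:
rem     ORA-38900: missing mandatory column "ORA_ERR_TAG$" of error log table "ERR$_MISSING"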

August 20, 2018

Masterclass – 1

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 10:42 am GMT Aug 20,2018

A recent thread on the Oracle developer community database forum raised a fairly typical question with a little twist. The basic question is “why is this (very simple) query slow on one system when it’s much faster on another?” The little twist was that the original posting told us that “Streams Replication” was in place to replicate the data between the two systems.

To make life easy for remote trouble-shooters the poster had supplied (for each system) the output from SQL Monitor when running the query, the autotrace output (which shows the predicate section that SQL Monitor doesn’t report), and the session statistics for the query run, plus some statistics about the single table in the query, the index used in the plan, and the column on which that index was based.

Here, with a little cosmetic editing (and a query that has clearly been camouflaged by the OP), is the information supplied for the faster database, where the query took about 30 seconds to complete.


SELECT c1, c2, c3, c4, c5, c6, c7, c8..  
FROM TAB1  
WHERE STS IN ( 'A', 'B')  
AND cnt < '4'  
AND dt < sysdate  
and rownum <=1;  
  
Sql_monitor and stats from DB1  
******************************  
  
Global Information  
------------------------------  
 STS              :  DONE (ALL ROWS)             
 Instance ID         :  1                           
 Execution Started   :  08/17/2018 08:31:22         
 First Refresh Time  :  08/17/2018 08:31:22         
 Last Refresh Time   :  08/17/2018 08:31:53         
 Duration            :  31s                         
 Program             :  sqlplus.exe                 
 Fetch Calls         :  1                           
  
Global Stats  
===============================================================================  
| Elapsed |   Cpu   |    IO    | Concurrency | Fetch | Buffer | Read  | Read  |  
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Calls |  Gets  | Reqs  | Bytes |  
===============================================================================  
|      33 |    3.00 |       30 |        0.08 |     1 |   102K | 38571 | 301MB |  
===============================================================================  
  
SQL Plan Monitoring Details (Plan Hash Value=715774357)  
======================================================================================================================================================================================  
| Id |            Operation            |          Name           |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read  | Read  | Activity |       Activity Detail        |  
|    |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs  | Bytes |   (%)    |         (# samples)          |  
======================================================================================================================================================================================  
|  0 | SELECT STATEMENT                |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  1 |   COUNT STOPKEY                 |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  2 |    INLIST ITERATOR              |                         |         |       |         1 |    +31 |     1 |        1 |       |       |          |                              |  
|  3 |     TABLE ACCESS BY INDEX ROWID | TAB1                    |    114K | 33399 |        32 |     +0 |     2 |        1 | 38377 | 300MB |    96.77 | Cpu (1)                      |  
|    |                                 |                         |         |       |           |        |       |          |       |       |          | db file sequential read (16) |  
|    |                                 |                         |         |       |           |        |       |          |       |       |          | read by other session (13)   |  
|  4 |      INDEX RANGE SCAN           | TAB1_STS_IDX            |    115K |   723 |        30 |     +2 |     2 |     118K |   194 |   2MB |     3.23 | read by other session (1)    |  
======================================================================================================================================================================================  
  
---------------------------------------------------------------------------------------------------------  
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 33399   (1)| 00:03:14 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   114K|   201M| 33399   (1)| 00:03:14 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   114K|       |   723   (1)| 00:00:05 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"<SYSDATE@!)  
   4 - access("STS"='A' OR "STS"='B')  
  
  
Table stats 
-----------
table_name    num_rows        blocks  empty_blocks  chain_cnt  avg_row_len             
TAB1        79,654,925    22,416,917             0          0        1,847          
  
column_stats(STS)
------------------
table_name  column_name  num_distinct  num_nulls    density  avg_col_len   
TAB1        STS                     5          0  6.2049E-9            2         
  
Index_stats(on STS)
-------------------
index_name    leaf_blocks  distinct_keys  avg_leaf_blocks_per_key  avg_data_blocks_per_key  clustering_factor    num_rows
TAB1_STS_IDX      487,939              5                   97,587                4,458,874         22,294,372  78,308,939   
 
Session stats
-------------
process last non-idle time              1,534,508,966
session connect time                    1,534,508,966
logical read bytes from cache             839,663,616
cell physical IO interconnect bytes       316,055,552
physical read bytes                       316,055,552
physical read total bytes                 316,055,552
file io wait time                          17,044,083
session pga memory                          8,643,880
session pga memory max                      8,643,880
temp space allocated (bytes)                4,194,304
session uga memory                          1,755,696
session uga memory max                      1,755,696
buffer is pinned count                        135,743
table fetch by rowid                          117,519
non-idle wait count                           107,301
session logical reads                         102,500
consistent gets                               102,450
consistent gets from cache                    102,448
no work - consistent read gets                102,368
buffer is not pinned count                    101,741
free buffer inspected                          43,458
free buffer requested                          38,592
physical read total IO requests                38,581
physical read IO requests                      38,581
physical reads                                 38,581
physical reads cache                           38,579
hot buffers moved to head of LRU               37,258
bytes sent via SQL*Net to client                7,370
bytes received via SQL*Net from client          6,869
redo size                                       5,536
undo change vector size                         4,432
DB time                                         3,166
non-idle wait time                              2,962
user I/O wait time                              2,954
table fetch continued row                       2,423


And here’s the equivalent information from the slower database where the query took more than 9 times as long (4 minutes 42 seconds) to complete.


Global Information  
------------------------------  
 STS              :  DONE (ALL ROWS)           
 Instance ID         :  1                         
 Execution Started   :  08/17/2018 08:21:47       
 First Refresh Time  :  08/17/2018 08:21:47       
 Last Refresh Time   :  08/17/2018 08:26:29       
 Duration            :  282s                      
 Module/Action       :  SQL*Plus/-                
 Program             :  sqlplus.exe               
 Fetch Calls         :  1                         
  
Global Stats  
================================================================  
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  |  
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |  
================================================================  
|     287 |    8.76 |      278 |     1 |   110K | 110K | 858MB |  
================================================================  
  
SQL Plan Monitoring Details (Plan Hash Value=715774357)  
======================================================================================================================================================================================  
| Id |            Operation            |          Name           |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |        Activity Detail        |  
|    |                                 |                         | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |          (# samples)          |  
======================================================================================================================================================================================  
|  0 | SELECT STATEMENT                |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  1 |   COUNT STOPKEY                 |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  2 |    INLIST ITERATOR              |                         |         |       |         1 |   +282 |     1 |        1 |      |       |          |                               |  
|  3 |     TABLE ACCESS BY INDEX ROWID | TAB1                    |    142K | 40211 |       282 |     +1 |     2 |        1 | 109K | 854MB |   100.00 | db file sequential read (277) |  
|  4 |      INDEX RANGE SCAN           | TAB1_STS_IDX            |    142K |   892 |       280 |     +3 |     2 |     118K |  491 |   4MB |          |                               |  
======================================================================================================================================================================================  
  
Execution Plan (autotrace) 
---------------------------------------------------------------------------------------------------------  
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                         |     1 |  1847 | 40211   (1)| 00:08:03 |  
|*  1 |  COUNT STOPKEY                |                         |       |       |            |          |  
|   2 |   INLIST ITERATOR             |                         |       |       |            |          |  
|*  3 |    TABLE ACCESS BY INDEX ROWID| TAB1                    |   141K|   249M| 40211   (1)| 00:08:03 |  
|*  4 |     INDEX RANGE SCAN          | TAB1_STS_IDX            |   141K|       |   892   (1)| 00:00:11 |  
---------------------------------------------------------------------------------------------------------  
  
Predicate Information (identified by operation id):  
---------------------------------------------------  
   1 - filter(ROWNUM<=1)  
   3 - filter("cnt"<'4' AND "dt"<SYSDATE@!)  
   4 - access("STS"='A' OR "STS"='B')  
            
Table stats
-----------
table_name    num_rows        blocks  empty_blocks  chain_cnt  avg_row_len             
TAB1        79,447,350   22,318,667            710    537,597        1,847  
  
column_stats(STS)
-----------------
table_name  column_name  num_distinct  num_nulls    density  avg_col_len  
TAB1        STS                     5          0  6.1789E-9            2  
  
Index_stats(on STS)
-------------------
index_name    leaf_blocks  distinct_keys  avg_leaf_blocks_per_key  avg_data_blocks_per_key  clustering_factor    num_rows
TAB1_STS_IDX      493,152              5                   98,630                4,382,625         21,913,127  79,106,263   


Session Stats
-------------
process last non-idle time              1,534,508,200
session connect time                    1,534,508,200
logical read bytes from cache             903,790,592
physical read total bytes                 899,629,056
cell physical IO interconnect bytes       899,629,056
physical read bytes                       899,629,056
file io wait time                         277,881,742
session pga memory                          8,586,744
session pga memory max                      8,586,744
temp space allocated (bytes)                4,194,304
session uga memory max                      1,690,184
session uga memory                          1,690,184
buffer is pinned count                        129,148
table fetch by rowid                          117,521
session logical reads                         110,326
consistent gets                               110,276
consistent gets from cache                    110,276
no work - consistent read gets                110,071
non-idle wait count                           109,879
free buffer requested                         109,830
physical read IO requests                     109,818
physical reads cache                          109,818
physical reads                                109,818
physical read total IO requests               109,818
buffer is not pinned count                    109,577
free buffer inspected                          70,740
hot buffers moved to head of LRU               31,910
DB time                                        28,203
non-idle wait time                             27,788
user I/O wait time                             27,788
dirty buffers inspected                        19,067
bytes sent via SQL*Net to client               14,927
bytes received via SQL*Net from client         10,607
redo size                                       5,440
undo change vector size                         4,432
table fetch continued row                       3,660

There are all sorts of noteworthy details in these two sets of information – some of the “how to see what’s in front of you” type, some of the “be careful, Oracle can deceive you” type. So I’m going to walk though the output picking up a number of background thoughts before commenting on the answer to the basic question.

We’ll start with the object statistics, then we’ll look at the SQL Monitor plan to see if we can determine where the extra time was spent, then we’ll try to work out what else the plan might be telling us about the code and data, then we’ll summarise my observations to make a claim about the difference in behaviour.

Object statistics

The table has 79M rows with average length of 1,847 bytes, using 22M blocks. With an 8KB block size and that average row size we would expect to see about 3 rows per block, and that’s fairly consistent with the value of rows / blocks. We don’t know what the sample size was for this stats collection, but it might have been a “small” sample size rather than the 100% you would get from using auto_sample_size, so that might also explain some discrepancy between the two different views on the figures.
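
As a hedged sketch of that cross-check (the 0.9 factor is an assumption of the default pctfree 10, and block overhead is ignored):

select  round(num_rows / blocks, 2)             rows_per_block,
        round((8192 * 0.9) / avg_row_len, 2)    rows_predicted
from    all_tables
where   table_name = 'TAB1'
;

rem     With the figures supplied: 79,654,925 / 22,416,917 gives about 3.5 rows
rem     per block, against a crude prediction of 3 to 4 rows per block.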

We note that the secondary system reports a chain_cnt in excess of 500,000 rows. The only (unhacked) way that this figure could be set would be through a call to analyze statistics, and once the figure is there it won’t go away unless you use the analyze command again to delete statistics.  We don’t know the history of how and when the figure got there so it doesn’t guarantee that there are any chained or migrated rows, nor does the zero in the table stats on the primary system guarantee that it doesn’t have any chained or migrated rows – all it tells us is that at some time someone used the wrong command to gather stats and there were some (less than 1%) migrated or chained rows in the table at the time. (The optimizer will use this figure in its arithmetic if it is set, by the way, so it may affect some of the cost calculations – but not by a huge amount.)

The column sts reports 5 distinct values, no nulls, and a density of 6.2e-9 which is roughly half of 1/79M: so we have a frequency histogram on the column (in the absence of a histogram the density would be 1/5, and it’s reasonable to assume that the number of buckets was either the default or set to something larger than 5).  We were told that the system was running 11.2.0.4 – so we have to be a little suspicious about the accuracy of this histogram since it will have been sampled with a very small sample if the stats collection had used auto_sample_size. (12c will use a specially optimized 100% sample for frequency and top-N histograms when using auto_sample_size)
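
If you could get at the system, a couple of hedged dictionary queries (owner predicates again omitted) would settle the question of what sort of histogram is present and how it was sampled:

select  column_name, num_distinct, density, num_buckets, histogram, sample_size
from    all_tab_col_statistics
where   table_name  = 'TAB1'
and     column_name = 'STS'
;

select  endpoint_value, endpoint_actual_value, endpoint_number
from    all_tab_histograms
where   table_name  = 'TAB1'
and     column_name = 'STS'
order by
        endpoint_number
;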

The index on sts has a clustering_factor of around 22M which is similar to the number of blocks in the table – and that’s not too surprising if there are only a very small number of distinct values in the column – especially when the presence of the histogram suggests that there’s a skew in the data distribution. (There’s more to come on that point.) The number of leaf blocks is about 500,000 (being lazy about arithmetic) – just as a side note this suggests the index is running in a fairly inefficient state (and probably hasn’t been created with the compress keyword).

Doing a rough estimate of the index arithmetic: the avg_col_len for sts is 2, so the space required for each index entry will be 13 bytes (2 for the column, 7 for the rowid content, 2 for the row header, 2 for the row directory entry). Take off the block overhead and assume the index is running at a “typical” 70% space usage per leaf block, and you might expect about 5,600 bytes used per leaf block for current index data, which works out to about 430 index entries per leaf block. With 79M rows in the table that should lead to 79M/430 leaf blocks – i.e. roughly 184,000 leaf blocks, not 493,000 leaf blocks. However it’s not unusual to see an index with extremely repetitive values operating at something like 50% utilisation, which would bring our estimate to about 310 rows per leaf block and 255,000 leaf blocks – which is still off by a factor of nearly 2 compared to what we’ve actually got. Again, of course, we have to be a little bit cautious about these statistics – we don’t know the sample size, and Oracle uses a surprisingly small number of blocks to sample the stats for an index.
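
If it were possible to take the hit on a quiet system, a hedged way of getting the real figures (bear in mind that analyze … validate structure blocks concurrent DML on the table while it runs, and that index_stats holds just one row for the current session) would be:

analyze index TAB1_STS_IDX validate structure;

select  lf_rows, lf_blks,
        round(lf_rows / lf_blks)        rows_per_leaf_block,
        pct_used, del_lf_rows
from    index_stats
;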

Where’s the time?

The SQL Monitor gives us a very clear report of where most of the time went – almost all of it was spent in I/O waits, and almost all of the wait time was in the “table access by index rowid” operation in both cases; but the primary system did 38,377 read requests while the secondary did 109,000 read requests in that line of the plan. It is significant, though, that quite a lot (40%) of the ASH samples for that operation on the primary system were for “read by other session” rather than “db file sequential read”:  in other words some other session(s) were doing a lot of work to pull the data we wanted into the buffer cache at the same time. Apart from the fact that a wait for “read by other session” often means we spend less time waiting than if we’d had to do the read ourselves, the presence of this wait suggests that other sessions may be “pre-caching” data for us so that we end up having to read far fewer blocks than would otherwise be the case.

It’s important to note at the same time that the difference in Buffer Gets for the two systems was small – 102K vs. 110K – and the “Rows (actual)” was the same in both cases – 118K entries returned by the index range scan.  Both systems did similar amounts of “logical” work, to process similar amounts of data; the difference was the fraction of the work that required a buffer get to turn into a disc read or a “wait for other read”.

We might want to pick up a few more numbers to corroborate the view that the only significant difference was in the volume of data cached and not some more esoteric reason.  Some of the session statistics should help.


DB1:  table fetch by rowid                          117,519
DB2:  table fetch by rowid                          117,521

DB1:  undo change vector size                         4,432
DB2:  undo change vector size                         4,432

DB1:  redo size                                       5,536
DB2:  redo size                                       5,440

DB1:  session logical reads                         102,500
DB2:  session logical reads                         110,326

DB1:  no work - consistent read gets                102,368
DB2:  no work - consistent read gets                110,071

DB1:  table fetch continued row                       2,423
DB2:  table fetch continued row                       3,660

The number of rows fetched by rowid is virtually identical and we have done (virtually) no work that generates undo or redo – such as delayed block cleanout; there are no statistics shown for “%undo record applied” so we probably haven’t done very much work to get a read consistent view of the data, though we can’t be sure that the OP didn’t simply fail to copy that stat into the list supplied (but the similarity of “session logical reads” to “no work – consistent read gets” supports the hypothesis that we didn’t do any significant work visiting undo blocks).

We do see a few percent increase in the number of buffer gets (“session logical reads”) – but this may reflect the fact that the actual pattern of data in one table is slightly different from the pattern in the other – thanks to ASSM the process id of the process that inserts a row into a table can affect (within a small range, usually) the block into which the row is inserted; but at 102,000 / 110,000 buffer gets to visit 117,500 rows in the table we can see that there must be some table blocks that hold two (or more) rows that are identified as consecutive in the index – leading to some row visits being achieved through a buffer pin and without a fresh buffer get. You’ll note that this argument is consistent with the small variation in clustering_factor (not that we entirely trust those figures) for the two indexes – the system with the lower clustering_factor for the index has done fewer buffer gets to acquire the same number of rows from the table – by definition that means (assuming default setup) that there are more cases where “the next table row” is in the same block as the current row.

The final figure I’ve shown is the “table fetch continued row”: according to the table stats (which we don’t necessarily trust completely) 500K out of 79M rows are chained/migrated which is roughly 0.6%. We know that we’re visiting about 117K table rows so might expect (on average) roughly the same percentage migrated/chained viz: 0.6% of 117K = 743, so there’s a little anomaly there (or an error in our assumption about “average” behaviour). It’s worth noting, though, that a “continued fetch” would have to do an extra buffer visit (and maybe an extra physical read). You might wonder, of course, how there could be any chained or migrated rows when the average row length is 1,847 bytes but in a follow-up post the OP did say there were 3 BLOB columns in the table, which can cause havoc with interpreting stats for all sorts of reasons. We don’t have any information about the table structure – particularly whether the columns in the query appear before or after the BLOB columns in the table definition – and we don’t know what processing takes place (for example, maybe the 3rd BLOB is only updated after the sts column has been changed to a value other than A or B, which would help to explain why we shouldn’t be using the 0.6% calculation above as a table-wide average), so we’re not in a position to say why any of the continued fetches appear, but there are several guesses we could make and they’re all easy to check – one such check is sketched below.
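
One of those easy checks is simply to see where the BLOB columns sit relative to the columns used in the query; a hedged sketch (owner predicate omitted):

select  column_id, column_name, data_type
from    all_tab_columns
where   table_name = 'TAB1'
order by
        column_id
;

rem     The point of interest is whether sts, cnt, dt and the other selected
rem     columns appear before or after the three BLOB columns in the definition.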

Plan observations

If we examine the row estimates we see 114K for the faster plan and 141K for the slower plan (with a closely corresponding variation in cost). The difference in estimates simply tells us that the histogram gathering was probably done with a small sample size and subject to a lot of variation. The scale of the estimates tells us that the A and B rows are probably rare – call it 125K out of 79M rows, about 0.16% of the total rows in the table, so it would not be surprising to see consecutive samples for the histogram producing significant variations in estimates.

The other interesting thing we can note in the SQL Monitor plan is that the Starts column for the index range scan / table access operations in both plans shows the value 2: this means that there are no “A” rows that match the other predicates: Oracle has run the “A” iteration to completion then started the “B” iteration and found a row on the second iteration. Is this a coincidence, or should it always happen, or is it only fairly likely to happen? Is it possible to find times when there are no suitable “B” rows but plenty of suitable “A” rows? The final predicate in the query is “rownum <= 1” – so the query is picking one row with no explicit strategy for choosing a specific row when there are multiple appropriate rows; does this mean that we could optimize the query by rewriting it as a “union all” that searched for B rows first and A rows second? We just don’t know enough about the processing.

In passing, we can’t get Oracle to search the B rows first by changing the order of the in-list.  If you have a predicate like “where sts in ({list of literal values})” the optimizer will sort the list to eliminate duplicates before rewriting the predicate as a list of disjuncts, and then (if the path uses an iterator) iterate through the list in the resulting order.
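
Here’s a hedged sketch of the kind of model you could build to see this for yourself; the object names and data volumes are invented and have nothing to do with the OP’s system:

create table t_demo
as
select
        chr(65 + mod(rownum,3))         sts,            -- values A, B, C
        sysdate - mod(rownum,100)       dt,
        rpad('x',100)                   padding
from    dual
connect by
        level <= 10000
;

create index t_demo_i1 on t_demo(sts);

explain plan for
select  /*+ index(t_demo) */
        *
from    t_demo
where   sts in ('B','A')
and     rownum <= 1
;

select * from table(dbms_xplan.display(null,null,'basic +predicate'));

rem     Whichever order you write the literals, expect the predicate section to
rem     report: access("STS"='A' OR "STS"='B') - i.e. the list has been sorted.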

In the absence of information about the way in which the data is processed we can only say that we need to avoid visiting the table so frequently. To do this we will need to add one or both of the columns from the other predicates to the index – this might double the size of the index, but eliminate 95% of the potential I/O. For example if we discover that A and B rows are initially created “into the future” and this query is looking for a row whose “time has come” so that it can be processed and changed to an X row (say) then there may only ever be a tiny number of rows where sts = ‘A’ and dt < sysdate, and an index on (sts, dt) would be a perfect solution (especially if it were compressed on at least the first column).

The OP has declared a reluctance to add an index to the table – but there are two points to go with this indexing strategy. Since we know there’s a frequency histogram and the A and B rows appear to be rare values, what benefit is there in having an index that covers the other values (unless 2 of the remaining 3 are also rare)? How about creating a function-based index that represents only the rare values and modifying this code to use that index – e.g.

create index t1_id on t1 (
        case sts when 'A' then sts when 'B' then sts end,
        case sts when 'A' then dt  when 'B' then dt  end
) compress 1
;

select  *
from    t1
where   case sts when 'A' then sts when 'B' then sts end in ('A','B')
and     case sts when 'A' then dt  when 'B' then dt  end < sysdate
and     cnt < '4'
and     rownum <= 1
/


You might be able to replace a huge index (79M rows worth) with this small one (120K rows worth) unless there’s too much other code in the system that has to be adjusted or the sts column is actually the target of a referential integrity constraint; at worst you could add this index knowing that it’s generally not going to consume much in the way of extra space or processing resources and is going to save you a lot of work for this query.

Summary

The execution plan from SQL Monitor points very strongly to the fast system benefiting from having a lot of the relevant data cached and constantly being reloaded into the cache by other sessions, while the slow system has to acquire almost all of its data by real physical reads. Most of the reads address the table so engineering an index that is low-cost and (fairly) high precision is the most significant strategy for reducing the workload and time on the slow system.

The fact that all the potential A rows fail to match the full predicate set suggests that there MAY be some aspect of the processing that means it would be more efficient to check for B rows before checking for A rows.

Given the massive skew in the data distribution a function-based index that hides all the non-popular values (or even all the values that are not of interest to this query) may be the most cost-effective way of adding a very effective index to the table with minimal resource requirements.

And finally

It’s taken me more than 4 hours to write this note after spending about 10 minutes reading through the information supplied by the OP and identifying and cross-checking details. A small fraction of the 4+ hours was spent creating a little model to check something I had to say about in-lists, the rest of it was trying to write up a coherent description covering all the details.

That’s it for today, but I may have missed a couple of points that I noticed as I read the OP’s posting; and I will want to do a little cosmetic work on this article and check grammar and spelling over the next couple of days.

Update (already)

Shortly after I posted this blog note the owner of the question reported the following as the distribution of values for the sts column:

 STS   COUNT(*)
---- ----------
   A          6
   E        126
   D        866
   C   80212368
   B     117631

Two things stand out about these figures – first it’s an ideal example of a case where it would be nice to avoid having index entries for the 80 million ‘C’ rows. Depending on the coding and testing costs, the supportability of the application and the possible side effects, this could be done with a function-based index, or by introducing a virtual column that hides the ‘C’s behind a NULL, or by changing the code to use NULL instead of ‘C’.
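
For example, a hedged sketch of the virtual column option (the column and index names are invented, and any change like this to a production table needs proper testing):

alter table tab1 add (
        sts_rare generated always as (
                case sts when 'C' then null else sts end
        ) virtual
)
;

create index tab1_sts_rare_ix on tab1(sts_rare);

rem     Rows with sts = 'C' produce an entirely null key so they create no index
rem     entries - the index only has to cover the 120,000 or so non-'C' rows -
rem     but the code has to be changed to reference sts_rare (for example
rem     "where sts_rare in ('A','B')") before the optimizer can use the index.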

Secondly – I made a comment about rewriting the code to query the B’s before the A’s. But we saw that Oracle worked through about 117,000 rows before returning a result: so the figures above tell us that it must have worked through almost all the B’s, and the handful of A’s was just a tiny little blip before it got to the B iteration – so there’s no point in making that change.

My suggestion for the function-based index above could be modified in two ways, of course – add two more “when”s to each “case” to capture the D and E rows, or take the opposite viewpoint and create an index on expressions like: “case sts when ‘C’ then to_char(null) else sts end”. The benefit of the latter approach is that you don’t have to modify the index definition (and rebuild the index) if a new legal value for sts appears.
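
Spelling out that second variant as a hedged sketch (the index name is invented, and the query would have to repeat the two case expressions verbatim for the index to be usable):

create index tab1_not_c_ix on tab1(
        case sts when 'C' then to_char(null) else sts end,
        case sts when 'C' then to_date(null) else dt  end
) compress 1
;

rem     As with the earlier function-based index the 'C' rows generate completely
rem     null keys and don't appear in the index, but any new legal value of sts
rem     is picked up automatically without changing the index definition.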

July 12, 2018

Cardinality Puzzle

Filed under: Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 12:57 pm GMT Jul 12,2018

One of the difficulties of being a DBA and being required to solve performance problems is that you probably never have enough time to think about how you got to a solution and why the solution works; and if you don’t learn about the process itself, you just don’t get better at it. That’s why I try (at least some of the time) to write articles and books (as I did with CBO Fundamentals) that

  1. explain simple details that can be used as background facts
  2. walk through the steps of solving a problem

So here’s an example from a question on the ODC database forum asking about the cause and workaround for a bad cardinality estimate that is producing a poorly performing execution plan. It’s actually a type of problem that comes up quite frequently on large data sets and explains why a simple “gather stats” is almost guaranteed to leave you with a few headaches (regardless of whether or not you choose to include histograms as part of the process). I’m not going to offer “the answer” – I’m just going to talk about the inferences we can make from the facts supplied and where we have to go from there.

The DBA has a table holding 80,000,000,000 rows. It is list/hash partitioned with 2 partitions and 1,024 sub-partitions (per partition) but neither of the partitioning key columns appears in the query. The query runs parallel and the optimizer (presumably thanks to the specific settings of various parameters related to parallel execution) uses dynamic sampling at level 3.

There is an inline view defined in the query and the DBA has isolated this as a key component of the problem and supplied a query and plan (from “explain plan”) against that view.


select * from TAB2 T
WHERE T.DT = to_date(:b1,'MM/DD/YYYY HH24:MI:SS');
 
------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                          |   479M|    76G|  1756K (14)| 05:51:14 |       |       |        |      |            |
|   1 |  PX COORDINATOR              |                          |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000                 |   479M|    76G|  1756K (14)| 05:51:14 |       |       |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX PARTITION HASH ALL     |                          |   479M|    76G|  1756K (14)| 05:51:14 |     1 |  1024 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS STORAGE FULL| TAB1                     |   479M|    76G|  1756K (14)| 05:51:14 |     1 |  2048 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - storage(COALESCE("TB1"."DFG",'N')='N' AND TRUNC(INTERNAL_FUNCTION("TB1"."DT_TM"))=TO_DATE(:B1,'MM/DD/YYYY
              HH24:MI:SS'))
       filter(COALESCE("TB1"."DFG",'N')='N' AND TRUNC(INTERNAL_FUNCTION("TB1"."DT_TM"))=TO_DATE(:B1,'MM/DD/YYYY
              HH24:MI:SS'))

Note
-----
   - dynamic sampling used for this statement (level=3)

The DBA’s problem is that if the estimated cardinality of this extract goes over roughly 500M the optimizer chooses a bad plan for the overall query – and on occasion this extract has given an estimate of 5 billion rows. Moreover, the actual number of rows returned by this extract is typically in the order of 40M, so the estimate is a long way off even when it’s “good enough”.

So where do we start looking to work out what’s going wrong? You’ll note, of course, that after text expansion the user’s single predicate has changed, and an extra predicate (previously hidden inside the view) has appeared; instead of just T.DT = to_date(:b1,’MM/DD/YYYY HH24:MI:SS’) we now have (cosmetically adjusted):

        COALESCE(DFG,'N')='N' 
AND     TRUNC(DT_TM)=TO_DATE(:B1,'MM/DD/YYYY HH24:MI:SS')

There are two immediately obvious threats here – first that the combination of predicates means Oracle is likely to make a mistake because it will check the individual selectivities and multiply them together to get the combined selectivity, second that the appearance of predicates of the form “function(column) = constant” means that Oracle will guess 1% as the individual selectivities.

Without checking more details we might assume that a possible quick fix (that would require no changes to existing code) would be to create a couple of virtual columns (or extended stats) to represent the two expressions and gather stats on the resulting columns – though it is a restriction of extended stats that you can’t “double up” and create a column group on the two column expressions, so there’s still some scope for a cardinality estimate that is bad enough to cause problems even with this approach. We also note that if we can change the coalesce(DFG,’N’) that must have been hidden in the view to nvl(DFG,’N’) then Oracle would be able to “or expand” the nvl() and use a more appropriate selectivity for that part of the predicate.
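
As a sketch of what I mean – assuming a hypothetical table name of tab1 in the current schema – the extended stats (which are really just hidden virtual columns) could be created along the following lines, followed by a stats gathering call (with whatever sampling options are appropriate for a table this size) so that the new columns acquire statistics:

select  dbms_stats.create_extended_stats(user, 'TAB1', '(coalesce(dfg,''N''))')  from dual;
select  dbms_stats.create_extended_stats(user, 'TAB1', '(trunc(dt_tm))')         from dual;

begin
        dbms_stats.gather_table_stats(user, 'TAB1');
end;
/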

However, the points I’ve covered so far tend to produce estimates that are too small and often much too small. So maybe the key to the problem is in the Note section that tells us that Oracle has (successfully) used dynamic sampling for this statement. In other words, all the theory of how the optimizer calculates selectivity may be irrelevant – the estimate will be based on the luck of the sample.

So let’s take a look at the (slightly edited) table stats we’ve been given:

column_name data_type num_distinct low_value      high_value     density   num_null  histogram
DT_TM       DATE           6179571 78740B1E0A383C 7876020B01262B 1.6182E-7 0         NONE
DFG         VARCHAR2             1 4E             4E             1         0         NONE

Notice that DFG (apparently) has the value ‘N’ for every row in the table (low_value = high_value = 0x4E, num_nulls = 0). The date range for DT_TM is 30-Nov-2016 to 11-Feb-2018, with 6.18M distinct values for 80 billion rows. Neither column has a histogram.

A little arithmetic tells us that (on average) there ought to be about 182M (= 80B / 438 days) rows for any one day – and that’s worth thinking about for three separate reasons.

First, an estimate of 479M against an average of 182M isn’t too surprising if it’s based on a fairly small sample – it’s only out by a factor of 2.6. On the other hand, getting an estimate of 5 billion – which can happen on bad days – is extremely unlikely if the data is uniformly distributed across dates.

Secondly, the DBA supplied us with some data from the recent past with an aggregate query for “trunc(dt_tm)”, with the following results:

TRUNC(DT_TM)   COUNT(*)
------------ ----------
01-FEB-18    44,254,425
02-FEB-18    46,585,349
03-FEB-18    43,383,099
04-FEB-18    32,748,364
05-FEB-18    37,993,126
06-FEB-18    39,708,994
07-FEB-18    38,696,777
08-FEB-18    41,871,780
09-FEB-18    46,702,852
10-FEB-18    42,744,870
11-FEB-18    34,971,845
12-FEB-18    37,165,983
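
(For reference, a hypothetical reconstruction of the sort of query that could have produced this summary – the table name is made up:)

select  trunc(dt_tm), count(*)
from    tab1
where   dt_tm >= date '2018-02-01'
and     dt_tm <  date '2018-02-13'
group by
        trunc(dt_tm)
order by
        trunc(dt_tm)
;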

Recent data seems to follow an average of around 40M rows per day, so the estimate of 182M that we can derive from the stored statistics is a long way off: the present is behaving very differently from the past and that’s a relatively common problem with very large data sets – though it’s more usual for rolling averages to increase from the past to the present because the data is often representing the growth of a business over time. Can we create a hypothesis to explain the discrepancy, and could that hypothesis also account for the sample producing some very strange estimates?

Finally – slightly more subtle and only included for completeness – if this column is supposed to hold date and time to the nearest second (which is what you might expect from an Oracle date type) there are 38 million possible values (438 x 86,400) it could be holding, and that’s more than the actual number of distinct values by a factor of 6. We can also work out that 80 billion rows over 438 days is 2,000 rows per second (on average). Averages are often misleading, of course: many systems have a pattern where a working day shows most of the data created in a 12 – 16 hour window with a couple of hours of more intense activity. For reference, though: average rows per second for the recent data is roughly 40M/86,400 = 460, while the average we derive from the stored statistics is 80B / 6M distinct values = 13,000 rows per (recorded) second; this unlikely pattern needs a “non-uniform” explanation.

How do these three thoughts help us to understand – or, to be more accurate, to make a sensible guess about – why the optimizer can use dynamic sampling and get a wildly variable estimate which can be 1 or 2 orders of magnitude wrong (varying between 479M and 5,000M compared to the recent actual 40M)?

Here’s one simple idea: extrapolate the 40M rows per day over 80B rows: that’s 2,000 days (possibly rather more since businesses tend to grow). What if the dt_tm is the timestamp for the moment the row was loaded into the database, and a couple of years ago (maybe around “30th Nov 2016”) the data was restructured and the existing five years of data was loaded over a very short period of time – let’s say one week. This would leave you with 17B rows of “new” data with a dt_tm spread at 40M rows per day for most of 438 days, and 63B rows of “historic” data packed into 7 days (at 9B rows per day).

I don’t know how Oracle would have randomly selected its sample from an extremely large table with 2,048 physical data segments but it’s totally believable that a small, widely scattered sample could end up with an extremely unrepresentative subset of the data. A completely random sample of the data would produce an estimate of around 500M rows for the predicate; but it would only take a fairly small variation in the sample (taking a few too many “historic” rows) to produce a large enough change in the estimate to change the execution plan, and a rare, but not extreme variation could easily take the estimate up to 5B.

Next Steps

It would be at this point in a performance assignment that I’d be asking around to find out if my guess about a massive data load operation was correct – if I couldn’t get the answer by talking to people I’d run a query against the whole data set to check the hypothesis, because there’s clearly some sort of skew in the data that’s causing a problem. I’d also run the critical part of the query a couple of times with events 10046/level 4 and 10053 set (but only fetching the first few rows) to find out from the trace file how large a sample Oracle was using, and then run the sampling query a few times to see what the sampled results looked like. Depending on the results I’d either find a way to stop Oracle from sampling for this query or I might create a virtual column (or just extended stats since it’s 11g) on just the trunc(dt_tm), possibly with a histogram in place (maybe coded by hand) if that could isolate the special dates and leave Oracle with a better estimate of the typical date. I might find I had to change the coalesce() to an nvl() as well – or create a virtual  column – to stop the sampling.
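
For anyone who wants to reproduce that step, this is a sketch of the session-level setup I have in mind – the event syntax is standard, but treat the snippet as illustrative rather than prescriptive:

alter session set events '10046 trace name context forever, level 4';  -- SQL trace, including bind values
alter session set events '10053 trace name context forever, level 1';  -- optimizer trace

-- run the critical part of the query here, fetching only the first few rows

alter session set events '10053 trace name context off';
alter session set events '10046 trace name context off';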

Finally, it’s worth noting that in 11g it’s possible to create pending (table preference “PUBLISH” = FALSE) stats for testing purposes; it’s also worth noting that the default histogram on trunc(dt_tm) would be a height-balanced histogram while we could create a frequency histogram in 12c since 12c allows us to specify up to 2,048 buckets.
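
A minimal sketch of the pending stats mechanism, assuming a hypothetical table name of tab1 in the current schema:

begin
        dbms_stats.set_table_prefs(user, 'TAB1', 'PUBLISH', 'FALSE');
        dbms_stats.gather_table_stats(user, 'TAB1');
end;
/

-- pending stats are visible only to sessions that opt in:
alter session set optimizer_use_pending_statistics = true;

-- ... test the critical query, then publish or discard the pending stats:
-- execute dbms_stats.publish_pending_stats(user, 'TAB1')
-- execute dbms_stats.delete_pending_stats(user, 'TAB1')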

Footnote

If you check the ODC thread you’ll see that the OP has marked as correct a suggestion to change:

    TRUNC (TB1.DT_TM)  = to_date(:b1,'MM/DD/YYYY HH24:MI:SS');  

to

    dt_tm >= trunc(to_date(:b1,'MM/DD/YYYY HH24:MI:SS'))
and dt_tm <  trunc(to_date(:b1,'MM/DD/YYYY HH24:MI:SS'))+1

Note that that’s “greater than or equal to” at one end and “strictly less than” at the other when using “date + 1”.

This has the effect of giving the optimizer a chance of using the low/high values of the column to produce a better (though perhaps still overlarge) and consistent estimate of the rows in the date range; and it may also stop the optimizer from doing dynamic sampling at level 3 (the “I’m guessing, let’s check” level) though it’s possible that the sampling would be disabled only if the coalesce() were changed to an nvl() as well.

Of course, from the information supplied, this looks like the OP would have to change a view definition and the run-time code to achieve the result. But in an ideal world doing things that avoid confusing the optimizer is usually the sensible strategy provided it doesn’t take an extreme amount of coding and testing.

 

June 23, 2018

Cursor_sharing force

Filed under: CBO,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 12:05 pm GMT Jun 23,2018

Prompted by a recent ODC (OTN) question I’ve just written up an example of one case where setting the cursor_sharing parameter to force doesn’t work as you might expect. It’s a specific example of what I believe is a theme that can appear in several different circumstances: if your SQL mixes “genuine” bind variables with literals then the literals may not be substituted.

Here’s a simple data set to start with:


rem
rem     Script:         cursor_sharing_limit.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2018
rem     Purpose:
rem
rem     Last tested
rem             18.1.0.0
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1
as
select
        rownum            n1,
        rownum            n2,
        lpad(rownum,10)   small_vc,
        rpad('x',100,'x') padding
from dual
connect by
        level <= 1e4 -- > comment to avoid WordPress format issue
;

I haven’t bothered to gather stats – it’s not needed in trivial cases of CTAS.

Here’s one way to execute SQL that doesn’t get literal substitution when (apparently) it should:


alter system flush shared_pool;
alter session set cursor_sharing=force;

declare
        m_ct number;
        m_n1 number := 20;
begin
        execute immediate
                'select /*+ trace this */ count(*) from t1 where n2 = 15 and n1 = :b1'
                into m_ct using m_n1;
        dbms_output.put_line(m_ct);

        execute immediate
                'select /*+ trace this too */ count(*) from t1 where n1 = 15 and n2 = 15'
                into m_ct;
        dbms_output.put_line(m_ct);
end;
/

alter session set cursor_sharing=exact;

I’ve included a “hint” that allows me to find the SQL statements in v$sql very easily – and here they are, preceded by the query I used to find them:


select  sql_id, parse_calls, executions, rows_processed, sql_text 
from    v$sql
where   sql_text like 'select%trace this%'
and     sql_text not like '%v$sql%'
;


SQL_ID        PARSE_CALLS EXECUTIONS ROWS_PROCESSED
------------- ----------- ---------- --------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
cbu4s78h5pfj5           1          1              1
select /*+ trace this too */ count(*) from t1 where n1 = :"SYS_B_0" and n2 = :"SYS_B_1"

cru67sufptx8x           1          1              1
select /*+ trace this */ count(*) from t1 where n2 = 15 and n1 = :b1

Notice how one statement – which used only literals – underwent bind variable transformation but the other – with its mix of literals and bind variables – didn’t. On the other hand, if I execute the mixed format statement from a basic SQL*Plus session then pull the plan from memory, this is what I get:


SQL> select /*+ SQL*Plus session */ count(*) from t1 where n2 = 15 and n1 = :b1;

  COUNT(*)
----------
         0

1 row selected.

SQL> select  sql_id, parse_calls, executions, rows_processed, sql_text
  2  from    v$sql
  3  where   sql_text like 'select%Plus session%'
  4  and     sql_text not like '%v$sql%'
  5  ;

SQL_ID        PARSE_CALLS EXECUTIONS ROWS_PROCESSED
------------- ----------- ---------- --------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
gq2qy2a9yuta7           1          1              1
select /*+ SQL*Plus session */ count(*) from t1 where n2 = :"SYS_B_0" and n1 = :b1

When trying to take advantage of cursor_sharing = force, you have to think about how the SQL gets to the database engine – is it delivered directly, or is it sent through some layer of code that means the raw SQL statement is protected in some way from the substitution code?

Footnote:

I happen to have used an example that puts the bind variable into the where clause; you will see the same effect even if the bind variables are in the select list – for example if you’ve selected something like to_char(date_col, :formatting_variable).
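
Here’s a sketch of what I mean, re-using the t1 table from above – I’d expect (though you should check on your own version) that the literal 15 survives because the statement already contains a genuine bind variable:

declare
        m_txt   varchar2(30);
        m_fmt   varchar2(20) := 'fm999,999';
begin
        execute immediate
                'select /*+ select list bind */ to_char(n2, :fmt) from t1 where n1 = 15'
                into m_txt using m_fmt;
        dbms_output.put_line(m_txt);
end;
/

-- then check v$sql for 'select%select list bind%' to see whether the 15 was replaced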

April 6, 2018

SQL Monitor

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 7:50 am GMT Apr 6,2018

I’ve mentioned the SQL Monitor report from time to time as a very useful way of reviewing execution plans – the feature is automatically enabled by parallel execution and by queries that are expected to take more than a few seconds to complete, and the inherent overheads of monitoring are less than the impact of enabling the rowsource execution statistics that allow you to use the ‘allstats’ format of dbms_xplan.display_cursor() to get detailed execution information for a query.
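
For comparison, this is a sketch of the rowsource execution statistics approach I’m referring to – the query is just a placeholder, and either the session parameter or the hint on its own is sufficient:

alter session set statistics_level = all;

select /*+ gather_plan_statistics */ count(*) from dba_objects;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));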

The drawback to the SQL Monitor feature is that it doesn’t report predicate information. It’s also important to note that it falls under the performance and diagnostic licences: some of the available performance information comes from v$active_session_history, and the report is generated by a call to the dbms_sqltune package. [Update: in 12c the reports also became available through the dbms_sql_monitor package]

There are two basic calls – report_sql_monitor_list(), which appeared in 11.2, produces a summary of the statements and their individual executions (from the information that is still in memory, of course) and report_sql_monitor() shows detailed execution plans. Here’s a simple bit of SQL*Plus code showing basic use – it lists a summary of all the statements monitored in the last half hour, then (as it stands at present) the full monitoring details of the most recently completed monitored statement:


set long 250000
set longchunksize 65536

set linesize 254
set pagesize 100
set trimspool on

set heading off

column text_line format a254

spool report_sql_monitor

select 
        dbms_sqltune.report_sql_monitor_list(
                active_since_date       => sysdate - 30 / (24*60),
                type                    => 'TEXT'
        ) text_line 
from    dual
;

select 
        dbms_sqltune.report_sql_monitor(
--              sql_id                  => '&m_sql_id',
--              start_time_filter       => sysdate - 30/(24 * 60),
--              sql_exec_id             => &m_exec_id,
                type                    =>'TEXT'
        ) text_line 
from    dual
;

spool off




Here’s a variation that reports the details of the most recently completed execution of a query with the specified SQL_ID:

set linesize 255
set pagesize 200
set trimspool on
set long 200000

column text_line format a254
set heading off

define m_sql_id = 'fssk2xabr717j'

spool rep_mon

SELECT  dbms_sqltune.report_sql_monitor(
                sql_id=> v.sql_id,
                sql_exec_id => v.max_sql_exec_id
        ) text_line
from     (
        select
                sql_id,
                max(sql_exec_id)        max_sql_exec_id
        from
                v$sql_monitor
        where
                sql_id = '&m_sql_id'
        and     status like 'DONE%'
        group by
                sql_id
        )       v
;

spool off

set heading on
set linesize 132
set pagesize 60

And a sample of the text output, which is the result of monitoring the query “select * from dba_objects” (with an arraysize of 1,000 set in SQL*Plus):


SQL Monitoring Report

SQL Text
------------------------------
select /*+ monitor */ * from dba_objects

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  SYS (262:54671)
 SQL ID              :  7nqa1nnbav642
 SQL Execution ID    :  16777216
 Execution Started   :  04/05/2018 19:43:42
 First Refresh Time  :  04/05/2018 19:43:42
 Last Refresh Time   :  04/05/2018 19:45:04
 Duration            :  82s
 Module/Action       :  sqlplus@linux12 (TNS V1-V3)/-
 Service             :  SYS$USERS
 Program             :  sqlplus@linux12 (TNS V1-V3)
 Fetch Calls         :  93

Global Stats
===========================================================================
| Elapsed |   Cpu   |    IO    |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes |
===========================================================================
|    0.31 |    0.29 |     0.00 |     0.02 |    93 |   6802 |   18 |   9MB |
===========================================================================

SQL Plan Monitoring Details (Plan Hash Value=2733869014)
=================================================================================================================================================================================
| Id |                Operation                 |       Name       |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Mem  | Activity | Activity Detail |
|    |                                          |                  | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | (Max) |   (%)    |   (# samples)   |
=================================================================================================================================================================================
|  0 | SELECT STATEMENT                         |                  |         |      |        83 |     +0 |     1 |    91314 |      |       |       |          |                 |
|  1 |   VIEW                                   | DBA_OBJECTS      |   91084 | 2743 |        83 |     +0 |     1 |    91314 |      |       |       |          |                 |
|  2 |    UNION-ALL                             |                  |         |      |        83 |     +0 |     1 |    91314 |      |       |       |          |                 |
|  3 |     TABLE ACCESS BY INDEX ROWID          | SUM$             |       1 |      |           |        |       |          |      |       |       |          |                 |
|  4 |      INDEX UNIQUE SCAN                   | I_SUM$_1         |       1 |      |           |        |       |          |      |       |       |          |                 |
|  5 |     TABLE ACCESS FULL                    | USER_EDITIONING$ |       1 |    2 |         1 |     +0 |   872 |        1 |      |       |       |          |                 |
|  6 |      TABLE ACCESS BY INDEX ROWID BATCHED | OBJ$             |       1 |    3 |           |        |       |          |      |       |       |          |                 |
|  7 |       INDEX RANGE SCAN                   | I_OBJ1           |       1 |    2 |           |        |       |          |      |       |       |          |                 |
|  8 |     FILTER                               |                  |         |      |        83 |     +0 |     1 |    91312 |      |       |       |          |                 |
|  9 |      HASH JOIN                           |                  |   91394 |  211 |        83 |     +0 |     1 |    91312 |      |       |    2M |          |                 |
| 10 |       TABLE ACCESS FULL                  | USER$            |     125 |    2 |         1 |     +0 |     1 |      125 |      |       |       |          |                 |
| 11 |       HASH JOIN                          |                  |   91394 |  207 |        83 |     +0 |     1 |    91312 |      |       |    1M |   100.00 | Cpu (1)         |
| 12 |        INDEX FULL SCAN                   | I_USER2          |     125 |    1 |         1 |     +0 |     1 |      125 |      |       |       |          |                 |
| 13 |        TABLE ACCESS FULL                 | OBJ$             |   91394 |  204 |        83 |     +0 |     1 |    91312 |   13 |   9MB |       |          |                 |
| 14 |      TABLE ACCESS FULL                   | USER_EDITIONING$ |       1 |    2 |         1 |     +0 |   872 |        1 |    2 | 16384 |       |          |                 |
| 15 |      NESTED LOOPS SEMI                   |                  |       1 |    2 |           |        |       |          |      |       |       |          |                 |
| 16 |       INDEX SKIP SCAN                    | I_USER2          |       1 |    1 |           |        |       |          |      |       |       |          |                 |
| 17 |       INDEX RANGE SCAN                   | I_OBJ4           |       1 |    1 |           |        |       |          |      |       |       |          |                 |
| 18 |      TABLE ACCESS FULL                   | USER_EDITIONING$ |       1 |    2 |           |        |       |          |      |       |       |          |                 |
| 19 |     HASH JOIN                            |                  |       2 |    4 |         1 |    +82 |     1 |        1 |      |       |       |          |                 |
| 20 |      NESTED LOOPS                        |                  |       2 |    4 |         1 |    +82 |     1 |        2 |      |       |       |          |                 |
| 21 |       STATISTICS COLLECTOR               |                  |         |      |         1 |    +82 |     1 |        2 |      |       |       |          |                 |
| 22 |        TABLE ACCESS FULL                 | LINK$            |       2 |    2 |         1 |    +82 |     1 |        2 |    2 | 16384 |       |          |                 |
| 23 |       TABLE ACCESS CLUSTER               | USER$            |       1 |    1 |         1 |    +82 |     2 |        2 |      |       |       |          |                 |
| 24 |        INDEX UNIQUE SCAN                 | I_USER#          |       1 |      |         1 |    +82 |     2 |        2 |    1 |  8192 |       |          |                 |
| 25 |      TABLE ACCESS FULL                   | USER$            |       1 |    1 |           |        |       |          |      |       |       |          |                 |
=================================================================================================================================================================================


1 row selected.


In a future note I’ll show an example of using one of these reports to identify the critical performance issue with an SQL statement that was raised recently on the ODC (OTN) database forum, but I’ll just point out one detail from this report. The “Time active (s)” says the query ran for about 83 seconds, but the Global Stats section tells us the elapsed time was 0.31 seconds. In this case the difference between these two is the time spent passing the data to the client.

Footnote

It is possible to force monitoring for an SQL statement with the /*+ monitor */ hint. Do be careful with this in production systems; each time the statement is executed the session will try to get the “Real-time descriptor latch” which is a latch with no latch children so if you monitor a lightweight statement that is called many times from many sessions you may find you lose a lot of time to latch contention and the attendant CPU spinning.

 

March 13, 2018

Deferred Invalidation

Filed under: 12c,CBO,Infrastructure,Oracle,Troubleshooting,Upgrades — Jonathan Lewis @ 6:30 pm GMT Mar 13,2018

I was going to write an article on the way 12.2 has introduced the option for “deferred invalidation” for a number of DDL operations, but I did a quick google search before I started writing and found that both Franck Pachot and Richard Foote (yes, rebuild index is one of the operations) had got there long ago so here are a couple of links – as much for my own benefit as anything else:

Richard Foote:

Franck Pachot:

Franck’s 2nd example may be particularly relevant to some clients of mine who were having problems with SQL queries that were crashing (slowly and randomly) instead of running very efficiently because they were running queries against one subpartition of a table while another subpartition of the same table was subject to exchange. With a little bad luck in the timing an exchange that took place between a parse and an execute would cause a query to have its cursor invalidated and re-parsed in a way that failed to do (sub-)partition elimination the way it should have because the local indexes were in an indeterminate state.
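
For reference, here’s a minimal sketch of the 12.2 syntax as I understand it (object names are made up – see the linked articles for proper worked examples):

alter index t1_i1 rebuild online deferred invalidation;

alter table pt1 exchange partition p_2018 with table pt1_stage deferred invalidation;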

 

March 12, 2018

Comparing Plans

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 8:01 am GMT Mar 12,2018

It can be difficult to find the critical differences between two execution plans when you want to find out why the optimizer has changed its choice of plan and what may have happened to cause the change; even the various diff_plan_xxx() functions in dbms_xplan don’t help very much, so I thought I’d write up an example that appeared recently on the ODC database forum to give people some ideas about how to approach the problem. There is, however, no simple algorithm that you can apply to narrow your focus down to the most probable cause of change; there are simply a few methods that have to be applied with a little flair and imagination.

We start with a query that has a “bad” plan and a “good” plan, with the implied questions: “What changed?” and “How do we get the good plan back?” If you’ve managed to capture the good plan the instant answer to “getting it back” is to create an SQL Plan Baseline for the query – but that’s really not a desirable thing to do every time a plan changes, and it’s not necessarily possible if the query keeps changing slightly over time. Ideally you should try identify why the plan can vary and work out how to manage that variation.
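
If you do decide to capture the good plan while it’s still in the cursor cache, a sketch of loading it as a baseline might look like this (the sql_id and plan_hash_value substitution variables are placeholders for the values of the good plan):

declare
        n       pls_integer;
begin
        n := dbms_spm.load_plans_from_cursor_cache(
                sql_id          => '&m_sql_id',
                plan_hash_value => &m_plan_hash_value
        );
        dbms_output.put_line('Plans loaded: ' || n);
end;
/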

In theory you could collect all the statistics for all the tables and indexes involved in the query, making sure you’ve got both the “before” and “after” statistics; then you need to check for the actual values used for any bind variables just in case some specific values have a particular impact on the optimizer’s calculations; but gathering all that data is a lot of work, and having all the raw data that lets you (in theory) deduce why the plan has changed will probably not help very much unless you’re very lucky.

If you have to go through an exhaustive analysis to solve the problem it’s a lot of tedious work. (This, in part, is why tools like SQLd360 and SQLTXPLAIN are so helpful – as a first step they make it easy to collect a lot of the information you might need.) But with a simple text editor and a purely visual approach it’s often quite easy to narrow the focus down to the most relevant part of the plan – and that’s what I’m going to demonstrate with this example.

This posting is going to get quite long because the good and bad plans are 108 and 110 lines respectively, so I’m going to use the “hide/reveal” html codes inline as we work through a few steps of analysis that eliminate sections of the plan and make it easier to read and understand. So here’s the bad plan:

------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                               |       |       |       |   244K(100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID               | ACCOUNT                       |     1 |    25 |       |     5  (20)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS              |                               |       |       |       |            |          |
|   3 |    BITMAP AND                              |                               |       |       |       |            |          |
|   4 |     BITMAP INDEX SINGLE VALUE              | IDX_ACCOUNT_OBJ_SOURCE        |       |       |       |            |          |
|   5 |     BITMAP INDEX SINGLE VALUE              | IDX_ACCOUNT_OBJ_TYPE          |       |       |       |            |          |
|   6 |     BITMAP CONVERSION FROM ROWIDS          |                               |       |       |       |            |          |
|   7 |      SORT ORDER BY                         |                               |       |       |       |            |          |
|   8 |       INDEX RANGE SCAN                     | UQ_ACCOUNT                    |     5 |       |       |     2   (0)| 00:00:01 |
|   9 |   TABLE ACCESS BY INDEX ROWID              | ACCOUNT                       |     1 |    25 |       |     5  (20)| 00:00:01 |
|  10 |    BITMAP CONVERSION TO ROWIDS             |                               |       |       |       |            |          |
|  11 |     BITMAP AND                             |                               |       |       |       |            |          |
|  12 |      BITMAP INDEX SINGLE VALUE             | IDX_ACCOUNT_OBJ_SOURCE        |       |       |       |            |          |
|  13 |      BITMAP INDEX SINGLE VALUE             | IDX_ACCOUNT_OBJ_TYPE          |       |       |       |            |          |
|  14 |      BITMAP CONVERSION FROM ROWIDS         |                               |       |       |       |            |          |
|  15 |       SORT ORDER BY                        |                               |       |       |       |            |          |
|  16 |        INDEX RANGE SCAN                    | UQ_ACCOUNT                    |     5 |       |       |     2   (0)| 00:00:01 |
|  17 |  HASH JOIN RIGHT OUTER                     |                               | 93634 |    28M|       |   244K  (1)| 00:49:00 |
|  18 |   VIEW                                     | X_SECURITY_ACCOUNTS_V         |   434 | 12152 |       |   368   (2)| 00:00:05 |
|  19 |    HASH GROUP BY                           |                               |   434 |  7812 |       |   368   (2)| 00:00:05 |
|  20 |     VIEW                                   |                               |   434 |  7812 |       |   367   (2)| 00:00:05 |
|  21 |      SORT UNIQUE                           |                               |   434 |   101K|       |   367   (2)| 00:00:05 |
|  22 |       UNION-ALL                            |                               |       |       |       |            |          |
|  23 |        CONNECT BY WITH FILTERING (UNIQUE)  |                               |       |       |       |            |          |
|  24 |         CONCATENATION                      |                               |       |       |       |            |          |
|  25 |          INLIST ITERATOR                   |                               |       |       |       |            |          |
|  26 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                       |     1 |    69 |       |     7   (0)| 00:00:01 |
|  27 |            INDEX RANGE SCAN                | UQ_ACCOUNT                    |     1 |       |       |     6   (0)| 00:00:01 |
|  28 |          INLIST ITERATOR                   |                               |       |       |       |            |          |
|  29 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                       |     1 |    69 |       |     7   (0)| 00:00:01 |
|  30 |            INDEX RANGE SCAN                | UQ_ACCOUNT                    |     1 |       |       |     6   (0)| 00:00:01 |
|  31 |         NESTED LOOPS                       |                               |    36 |  4752 |       |    34   (0)| 00:00:01 |
|  32 |          CONNECT BY PUMP                   |                               |       |       |       |            |          |
|  33 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                       |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  34 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT            |    18 |       |       |     2   (0)| 00:00:01 |
|  35 |        CONNECT BY WITH FILTERING (UNIQUE)  |                               |       |       |       |            |          |
|  36 |         CONCATENATION                      |                               |       |       |       |            |          |
|  37 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                       |     1 |    69 |       |     4   (0)| 00:00:01 |
|  38 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT            |     1 |       |       |     3   (0)| 00:00:01 |
|  39 |          INLIST ITERATOR                   |                               |       |       |       |            |          |
|  40 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                       |    20 |  1380 |       |    47   (0)| 00:00:01 |
|  41 |            INDEX RANGE SCAN                | IDX_ACCOUNT_PARENT            |    20 |       |       |    39   (0)| 00:00:01 |
|  42 |         NESTED LOOPS                       |                               |   374 | 49368 |       |   261   (0)| 00:00:04 |
|  43 |          CONNECT BY PUMP                   |                               |       |       |       |            |          |
|  44 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                       |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  45 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT            |    18 |       |       |     2   (0)| 00:00:01 |
|  46 |        TABLE ACCESS BY INDEX ROWID         | ACCOUNT                       |     1 |    26 |       |     4   (0)| 00:00:01 |
|  47 |         INDEX RANGE SCAN                   | IDX_ACCOUNT_PARENT            |     1 |       |       |     3   (0)| 00:00:01 |
|  48 |   HASH JOIN RIGHT OUTER                    |                               | 52211 |    14M|       |   244K  (1)| 00:48:55 |
|  49 |    VIEW                                    | X_SECURITY_OBJECTS_V          |  1976 | 55328 |       |  3303   (1)| 00:00:40 |
|  50 |     HASH GROUP BY                          |                               |  1976 | 35568 |       |  3303   (1)| 00:00:40 |
|  51 |      VIEW                                  |                               |  1976 | 35568 |       |  3302   (1)| 00:00:40 |
|  52 |       SORT UNIQUE                          |                               |  1976 |   482K|       |  3302   (1)| 00:00:40 |
|  53 |        UNION-ALL                           |                               |       |       |       |            |          |
|  54 |         CONNECT BY WITH FILTERING (UNIQUE) |                               |       |       |       |            |          |
|  55 |          TABLE ACCESS FULL                 | ORGANIZATION                  |  1965 |   132K|       |   551   (2)| 00:00:07 |
|  56 |          HASH JOIN                         |                               |     5 |   675 |       |  1099   (1)| 00:00:14 |
|  57 |           CONNECT BY PUMP                  |                               |       |       |       |            |          |
|  58 |           TABLE ACCESS FULL                | ORGANIZATION                  | 76990 |  5939K|       |   547   (1)| 00:00:07 |
|  59 |         FILTER                             |                               |       |       |       |            |          |
|  60 |          CONNECT BY WITH FILTERING (UNIQUE)|                               |       |       |       |            |          |
|  61 |           TABLE ACCESS FULL                | ORGANIZATION                  |     4 |   272 |       |   548   (1)| 00:00:07 |
|  62 |           HASH JOIN                        |                               |     1 |   124 |       |  1096   (1)| 00:00:14 |
|  63 |            CONNECT BY PUMP                 |                               |       |       |       |            |          |
|  64 |            TABLE ACCESS FULL               | ORGANIZATION                  | 76990 |  5112K|       |   548   (1)| 00:00:07 |
|  65 |         INLIST ITERATOR                    |                               |       |       |       |            |          |
|  66 |          TABLE ACCESS BY INDEX ROWID       | ORGANIZATION                  |     1 |    37 |       |     5   (0)| 00:00:01 |
|  67 |           INDEX RANGE SCAN                 | UQ_ORGANIZATION_NATURAL       |     1 |       |       |     4   (0)| 00:00:01 |
|  68 |    HASH JOIN                               |                               | 52211 |    12M|       |   241K  (1)| 00:48:16 |
|  69 |     TABLE ACCESS FULL                      | CURRENCY                      |    40 |  2560 |       |     3   (0)| 00:00:01 |
|  70 |     HASH JOIN                              |                               | 52211 |  9993K|  3072K|   241K  (1)| 00:48:16 |
|  71 |      TABLE ACCESS FULL                     | ACCOUNT                       | 45570 |  2536K|       |   274   (1)| 00:00:04 |
|  72 |      HASH JOIN                             |                               | 52211 |  7087K|       |   240K  (1)| 00:48:06 |
|  73 |       TABLE ACCESS FULL                    | COMPANY                       |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  74 |       HASH JOIN RIGHT OUTER                |                               | 52211 |  6628K|       |   240K  (1)| 00:48:06 |
|  75 |        VIEW                                |                               |    19 |   247 |       |    18  (12)| 00:00:01 |
|  76 |         FILTER                             |                               |       |       |       |            |          |
|  77 |          CONNECT BY WITH FILTERING         |                               |       |       |       |            |          |
|  78 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                       |     1 |    69 |       |     3   (0)| 00:00:01 |
|  79 |            INDEX RANGE SCAN                | UQ_ACCOUNT                    |     1 |       |       |     2   (0)| 00:00:01 |
|  80 |           NESTED LOOPS                     |                               |    18 |  2376 |       |    13   (0)| 00:00:01 |
|  81 |            CONNECT BY PUMP                 |                               |       |       |       |            |          |
|  82 |            TABLE ACCESS BY INDEX ROWID     | ACCOUNT                       |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  83 |             INDEX RANGE SCAN               | IDX_ACCOUNT_PARENT            |    18 |       |       |     2   (0)| 00:00:01 |
|  84 |        HASH JOIN RIGHT OUTER               |                               | 52211 |  5965K|       |   240K  (1)| 00:48:06 |
|  85 |         VIEW                               |                               |     2 |    26 |       |    30   (7)| 00:00:01 |
|  86 |          FILTER                            |                               |       |       |       |            |          |
|  87 |           CONNECT BY WITH FILTERING        |                               |       |       |       |            |          |
|  88 |            TABLE ACCESS BY INDEX ROWID     | ORGANIZATION                  |     1 |    69 |       |    12   (0)| 00:00:01 |
|  89 |             BITMAP CONVERSION TO ROWIDS    |                               |       |       |       |            |          |
|  90 |              BITMAP AND                    |                               |       |       |       |            |          |
|  91 |               BITMAP INDEX SINGLE VALUE    | IDX_ORGANIZATION_OBJ_TYPE     |       |       |       |            |          |
|  92 |               BITMAP OR                    |                               |       |       |       |            |          |
|  93 |                BITMAP INDEX SINGLE VALUE   | IDX_ORGANIZATION_OBJ_COMPANY  |       |       |       |            |          |
|  94 |                BITMAP INDEX SINGLE VALUE   | IDX_ORGANIZATION_OBJ_COMPANY  |       |       |       |            |          |
|  95 |                BITMAP INDEX SINGLE VALUE   | IDX_ORGANIZATION_OBJ_COMPANY  |       |       |       |            |          |
|  96 |                BITMAP INDEX SINGLE VALUE   | IDX_ORGANIZATION_OBJ_COMPANY  |       |       |       |            |          |
|  97 |                BITMAP INDEX SINGLE VALUE   | IDX_ORGANIZATION_OBJ_COMPANY  |       |       |       |            |          |
|  98 |            NESTED LOOPS                    |                               |     1 |   135 |       |    15   (0)| 00:00:01 |
|  99 |             CONNECT BY PUMP                |                               |       |       |       |            |          |
| 100 |             TABLE ACCESS BY INDEX ROWID    | ORGANIZATION                  |     1 |    79 |       |     3   (0)| 00:00:01 |
| 101 |              INDEX RANGE SCAN              | IDX_ORGANIZATION_PARENT       |     1 |       |       |     2   (0)| 00:00:01 |
| 102 |         HASH JOIN                          |                               | 52211 |  5302K|       |   240K  (1)| 00:48:06 |
| 103 |          TABLE ACCESS FULL                 | TIME                          |     9 |   171 |       |     3   (0)| 00:00:01 |
| 104 |          NESTED LOOPS                      |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 105 |           NESTED LOOPS                     |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 106 |            NESTED LOOPS                    |                               |  5756 |   185K|       |   552   (1)| 00:00:07 |
| 107 |             FAST DUAL                      |                               |     1 |       |       |     2   (0)| 00:00:01 |
| 108 |             TABLE ACCESS FULL              | ORGANIZATION                  |  5756 |   185K|       |   550   (1)| 00:00:07 |
| 109 |            INDEX RANGE SCAN                | IDX_FACT_PLAN_SUPP_K_COMP_ORG |    65 |       |       |    19   (0)| 00:00:01 |
| 110 |           TABLE ACCESS BY INDEX ROWID      | FACT_PLAN_SUPP                |   110 |  5720 |       |    82   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------------------

and the good plan

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                              |       |       |       |   545K(100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID               | ACCOUNT                      |     1 |    25 |       |     5  (20)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS              |                              |       |       |       |            |          |
|   3 |    BITMAP AND                              |                              |       |       |       |            |          |
|   4 |     BITMAP INDEX SINGLE VALUE              | IDX_ACCOUNT_OBJ_SOURCE       |       |       |       |            |          |
|   5 |     BITMAP INDEX SINGLE VALUE              | IDX_ACCOUNT_OBJ_TYPE         |       |       |       |            |          |
|   6 |     BITMAP CONVERSION FROM ROWIDS          |                              |       |       |       |            |          |
|   7 |      SORT ORDER BY                         |                              |       |       |       |            |          |
|   8 |       INDEX RANGE SCAN                     | UQ_ACCOUNT                   |     5 |       |       |     2   (0)| 00:00:01 |
|   9 |   TABLE ACCESS BY INDEX ROWID              | ACCOUNT                      |     1 |    25 |       |     5  (20)| 00:00:01 |
|  10 |    BITMAP CONVERSION TO ROWIDS             |                              |       |       |       |            |          |
|  11 |     BITMAP AND                             |                              |       |       |       |            |          |
|  12 |      BITMAP INDEX SINGLE VALUE             | IDX_ACCOUNT_OBJ_SOURCE       |       |       |       |            |          |
|  13 |      BITMAP INDEX SINGLE VALUE             | IDX_ACCOUNT_OBJ_TYPE         |       |       |       |            |          |
|  14 |      BITMAP CONVERSION FROM ROWIDS         |                              |       |       |       |            |          |
|  15 |       SORT ORDER BY                        |                              |       |       |       |            |          |
|  16 |        INDEX RANGE SCAN                    | UQ_ACCOUNT                   |     5 |       |       |     2   (0)| 00:00:01 |
|  17 |  HASH JOIN RIGHT OUTER                     |                              |   539K|   162M|       |   545K  (2)| 01:49:03 |
|  18 |   VIEW                                     | X_SECURITY_ACCOUNTS_V        |   434 | 12152 |       |   368   (2)| 00:00:05 |
|  19 |    HASH GROUP BY                           |                              |   434 |  7812 |       |   368   (2)| 00:00:05 |
|  20 |     VIEW                                   |                              |   434 |  7812 |       |   367   (1)| 00:00:05 |
|  21 |      SORT UNIQUE                           |                              |   434 |   101K|       |   367   (1)| 00:00:05 |
|  22 |       UNION-ALL                            |                              |       |       |       |            |          |
|  23 |        CONNECT BY WITH FILTERING (UNIQUE)  |                              |       |       |       |            |          |
|  24 |         CONCATENATION                      |                              |       |       |       |            |          |
|  25 |          INLIST ITERATOR                   |                              |       |       |       |            |          |
|  26 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                      |     1 |    69 |       |     7   (0)| 00:00:01 |
|  27 |            INDEX RANGE SCAN                | UQ_ACCOUNT                   |     1 |       |       |     6   (0)| 00:00:01 |
|  28 |          INLIST ITERATOR                   |                              |       |       |       |            |          |
|  29 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                      |     1 |    69 |       |     7   (0)| 00:00:01 |
|  30 |            INDEX RANGE SCAN                | UQ_ACCOUNT                   |     1 |       |       |     6   (0)| 00:00:01 |
|  31 |         NESTED LOOPS                       |                              |    36 |  4752 |       |    34   (0)| 00:00:01 |
|  32 |          CONNECT BY PUMP                   |                              |       |       |       |            |          |
|  33 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                      |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  34 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT           |    18 |       |       |     2   (0)| 00:00:01 |
|  35 |        CONNECT BY WITH FILTERING (UNIQUE)  |                              |       |       |       |            |          |
|  36 |         CONCATENATION                      |                              |       |       |       |            |          |
|  37 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                      |     1 |    69 |       |     4   (0)| 00:00:01 |
|  38 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT           |     1 |       |       |     3   (0)| 00:00:01 |
|  39 |          INLIST ITERATOR                   |                              |       |       |       |            |          |
|  40 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                      |    20 |  1380 |       |    47   (0)| 00:00:01 |
|  41 |            INDEX RANGE SCAN                | IDX_ACCOUNT_PARENT           |    20 |       |       |    39   (0)| 00:00:01 |
|  42 |         NESTED LOOPS                       |                              |   374 | 49368 |       |   261   (0)| 00:00:04 |
|  43 |          CONNECT BY PUMP                   |                              |       |       |       |            |          |
|  44 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                      |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  45 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT           |    18 |       |       |     2   (0)| 00:00:01 |
|  46 |        TABLE ACCESS BY INDEX ROWID         | ACCOUNT                      |     1 |    26 |       |     4   (0)| 00:00:01 |
|  47 |         INDEX RANGE SCAN                   | IDX_ACCOUNT_PARENT           |     1 |       |       |     3   (0)| 00:00:01 |
|  48 |   HASH JOIN RIGHT OUTER                    |                              |   300K|    82M|       |   544K  (2)| 01:48:58 |
|  49 |    VIEW                                    | X_SECURITY_OBJECTS_V         |  1920 | 53760 |       |  3303   (1)| 00:00:40 |
|  50 |     HASH GROUP BY                          |                              |  1920 | 34560 |       |  3303   (1)| 00:00:40 |
|  51 |      VIEW                                  |                              |  1920 | 34560 |       |  3302   (1)| 00:00:40 |
|  52 |       SORT UNIQUE                          |                              |  1920 |   468K|       |  3302   (1)| 00:00:40 |
|  53 |        UNION-ALL                           |                              |       |       |       |            |          |
|  54 |         CONNECT BY WITH FILTERING (UNIQUE) |                              |       |       |       |            |          |
|  55 |          TABLE ACCESS FULL                 | ORGANIZATION                 |  1909 |   128K|       |   551   (2)| 00:00:07 |
|  56 |          HASH JOIN                         |                              |     5 |   675 |       |  1099   (1)| 00:00:14 |
|  57 |           CONNECT BY PUMP                  |                              |       |       |       |            |          |
|  58 |           TABLE ACCESS FULL                | ORGANIZATION                 | 76836 |  5927K|       |   547   (1)| 00:00:07 |
|  59 |         FILTER                             |                              |       |       |       |            |          |
|  60 |          CONNECT BY WITH FILTERING (UNIQUE)|                              |       |       |       |            |          |
|  61 |           TABLE ACCESS FULL                | ORGANIZATION                 |     4 |   272 |       |   548   (1)| 00:00:07 |
|  62 |           HASH JOIN                        |                              |     1 |   124 |       |  1096   (1)| 00:00:14 |
|  63 |            CONNECT BY PUMP                 |                              |       |       |       |            |          |
|  64 |            TABLE ACCESS FULL               | ORGANIZATION                 | 76836 |  5102K|       |   548   (1)| 00:00:07 |
|  65 |         INLIST ITERATOR                    |                              |       |       |       |            |          |
|  66 |          TABLE ACCESS BY INDEX ROWID       | ORGANIZATION                 |     1 |    37 |       |     5   (0)| 00:00:01 |
|  67 |           INDEX RANGE SCAN                 | UQ_ORGANIZATION_NATURAL      |     1 |       |       |     4   (0)| 00:00:01 |
|  68 |    HASH JOIN                               |                              |   300K|    74M|       |   541K  (2)| 01:48:18 |
|  69 |     TABLE ACCESS FULL                      | CURRENCY                     |    40 |  2560 |       |     3   (0)| 00:00:01 |
|  70 |     HASH JOIN RIGHT OUTER                  |                              |   300K|    56M|       |   541K  (2)| 01:48:18 |
|  71 |      VIEW                                  |                              |    19 |   247 |       |    18  (12)| 00:00:01 |
|  72 |       FILTER                               |                              |       |       |       |            |          |
|  73 |        CONNECT BY WITH FILTERING           |                              |       |       |       |            |          |
|  74 |         TABLE ACCESS BY INDEX ROWID        | ACCOUNT                      |     1 |    69 |       |     3   (0)| 00:00:01 |
|  75 |          INDEX RANGE SCAN                  | UQ_ACCOUNT                   |     1 |       |       |     2   (0)| 00:00:01 |
|  76 |         NESTED LOOPS                       |                              |    18 |  2376 |       |    13   (0)| 00:00:01 |
|  77 |          CONNECT BY PUMP                   |                              |       |       |       |            |          |
|  78 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                      |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  79 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT           |    18 |       |       |     2   (0)| 00:00:01 |
|  80 |      HASH JOIN                             |                              |   300K|    52M|  3072K|   541K  (2)| 01:48:18 |
|  81 |       TABLE ACCESS FULL                    | ACCOUNT                      | 45564 |  2536K|       |   274   (1)| 00:00:04 |
|  82 |       HASH JOIN                            |                              |   300K|    36M|  2576K|   539K  (2)| 01:47:50 |
|  83 |        TABLE ACCESS FULL                   | ORGANIZATION                 | 58577 |  1887K|       |   550   (1)| 00:00:07 |
|  84 |        HASH JOIN                           |                              |  6984K|   619M|       |   503K  (2)| 01:40:44 |
|  85 |         TABLE ACCESS FULL                  | COMPANY                      |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  86 |         HASH JOIN RIGHT OUTER              |                              |  6984K|   559M|       |   503K  (2)| 01:40:44 |
|  87 |          VIEW                              |                              |     2 |    26 |       |    29   (7)| 00:00:01 |
|  88 |           FILTER                           |                              |       |       |       |            |          |
|  89 |            CONNECT BY WITH FILTERING       |                              |       |       |       |            |          |
|  90 |             TABLE ACCESS BY INDEX ROWID    | ORGANIZATION                 |     1 |    69 |       |    12   (0)| 00:00:01 |
|  91 |              BITMAP CONVERSION TO ROWIDS   |                              |       |       |       |            |          |
|  92 |               BITMAP AND                   |                              |       |       |       |            |          |
|  93 |                BITMAP INDEX SINGLE VALUE   | IDX_ORGANIZATION_OBJ_TYPE    |       |       |       |            |          |
|  94 |                BITMAP OR                   |                              |       |       |       |            |          |
|  95 |                 BITMAP INDEX SINGLE VALUE  | IDX_ORGANIZATION_OBJ_COMPANY |       |       |       |            |          |
|  96 |                 BITMAP INDEX SINGLE VALUE  | IDX_ORGANIZATION_OBJ_COMPANY |       |       |       |            |          |
|  97 |                 BITMAP INDEX SINGLE VALUE  | IDX_ORGANIZATION_OBJ_COMPANY |       |       |       |            |          |
|  98 |                 BITMAP INDEX SINGLE VALUE  | IDX_ORGANIZATION_OBJ_COMPANY |       |       |       |            |          |
|  99 |                 BITMAP INDEX SINGLE VALUE  | IDX_ORGANIZATION_OBJ_COMPANY |       |       |       |            |          |
| 100 |             NESTED LOOPS                   |                              |     1 |   135 |       |    15   (0)| 00:00:01 |
| 101 |              CONNECT BY PUMP               |                              |       |       |       |            |          |
| 102 |              TABLE ACCESS BY INDEX ROWID   | ORGANIZATION                 |     1 |    79 |       |     3   (0)| 00:00:01 |
| 103 |               INDEX RANGE SCAN             | IDX_ORGANIZATION_PARENT      |     1 |       |       |     2   (0)| 00:00:01 |
| 104 |          HASH JOIN                         |                              |  6984K|   472M|       |   503K  (2)| 01:40:43 |
| 105 |           NESTED LOOPS                     |                              |     9 |   171 |       |     5   (0)| 00:00:01 |
| 106 |            FAST DUAL                       |                              |     1 |       |       |     2   (0)| 00:00:01 |
| 107 |            TABLE ACCESS FULL               | TIME                         |     9 |   171 |       |     3   (0)| 00:00:01 |
| 108 |           TABLE ACCESS FULL                | FACT_PLAN_SUPP               |    84M|  4207M|       |   503K  (2)| 01:40:37 |
-----------------------------------------------------------------------------------------------------------------------------------

You might want to cut and paste these into two separate windows in a text editor – which is how I worked on the plans (using vi on a linux box).

I’ve highlighted two lines from the bad plan – operations 1 and 17, which are both indented one place; the level of indent tells me that operation 17 is the start of the plan for the main query and operation 1 is the start of the plan for an inline scalar subquery, so my first check is that operations 1 to 16 (highlighted in the good plan) are the same in both plans: they are, so they’re probably not the source of the problem and can be eliminated from the plans. In passing, you’ll notice that operation 9 is a table access which is indented one more place – depending on the exact version of Oracle (and which patches have fixed which bugs) this might mean there are two scalar subqueries in the select list, it might mean there’s one scalar subquery with a “pushed” filter subquery, or (most likely on recent versions of Oracle) it might mean that there’s a case or decode() operator with two scalar subqueries invoked by the operator.
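
Just to illustrate that last possibility, here’s a sketch (with purely hypothetical table and column names – nothing to do with the original query) of the kind of statement where a case operator invokes two scalar subqueries; each subquery contributes its own little sub-plan to the scalar subquery section at the top of the execution plan:

select
        d.dept_name,
        case d.region
                when 'UK'
                then (select sum(e1.salary) from emp_uk   e1 where e1.dept_id = d.dept_id)
                else (select sum(e2.salary) from emp_rest e2 where e2.dept_id = d.dept_id)
        end     as total_salary
from
        dept    d
;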

I’ve also highlighted two other chunks in both plans. Looking at the bad plan, operation 75 is an inline view derived from operations 76 – 83; this corresponds to the view at operation 71 in the good plan, derived from operations 72 – 79: a quick check shows that these two sets of lines are identical, so they can be eliminated and replaced with just a single VIEW operation which I’ll identify by adding in an object Name of VIEW_1.

Similarly I’ve highlighted operations 86 – 101 in the bad plan and 88 – 103 in the good plan – a visual check shows that they are identical (with just a tiny difference in the final VIEW cost) so I’ve eliminated those lines and given the controlling VIEW operation an object Name of VIEW_2.

With the three chunks removed, this is what the plans look like:

------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                               |       |       |       |   244K(100)|          |
|  17 |  HASH JOIN RIGHT OUTER                     |                               | 93634 |    28M|       |   244K  (1)| 00:49:00 |
|  18 |   VIEW                                     | X_SECURITY_ACCOUNTS_V         |   434 | 12152 |       |   368   (2)| 00:00:05 |
|  19 |    HASH GROUP BY                           |                               |   434 |  7812 |       |   368   (2)| 00:00:05 |
|  20 |     VIEW                                   |                               |   434 |  7812 |       |   367   (2)| 00:00:05 |
|  21 |      SORT UNIQUE                           |                               |   434 |   101K|       |   367   (2)| 00:00:05 |
|  22 |       UNION-ALL                            |                               |       |       |       |            |          |
|  23 |        CONNECT BY WITH FILTERING (UNIQUE)  |                               |       |       |       |            |          |
|  24 |         CONCATENATION                      |                               |       |       |       |            |          |
|  25 |          INLIST ITERATOR                   |                               |       |       |       |            |          |
|  26 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                       |     1 |    69 |       |     7   (0)| 00:00:01 |
|  27 |            INDEX RANGE SCAN                | UQ_ACCOUNT                    |     1 |       |       |     6   (0)| 00:00:01 |
|  28 |          INLIST ITERATOR                   |                               |       |       |       |            |          |
|  29 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                       |     1 |    69 |       |     7   (0)| 00:00:01 |
|  30 |            INDEX RANGE SCAN                | UQ_ACCOUNT                    |     1 |       |       |     6   (0)| 00:00:01 |
|  31 |         NESTED LOOPS                       |                               |    36 |  4752 |       |    34   (0)| 00:00:01 |
|  32 |          CONNECT BY PUMP                   |                               |       |       |       |            |          |
|  33 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                       |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  34 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT            |    18 |       |       |     2   (0)| 00:00:01 |
|  35 |        CONNECT BY WITH FILTERING (UNIQUE)  |                               |       |       |       |            |          |
|  36 |         CONCATENATION                      |                               |       |       |       |            |          |
|  37 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                       |     1 |    69 |       |     4   (0)| 00:00:01 |
|  38 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT            |     1 |       |       |     3   (0)| 00:00:01 |
|  39 |          INLIST ITERATOR                   |                               |       |       |       |            |          |
|  40 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                       |    20 |  1380 |       |    47   (0)| 00:00:01 |
|  41 |            INDEX RANGE SCAN                | IDX_ACCOUNT_PARENT            |    20 |       |       |    39   (0)| 00:00:01 |
|  42 |         NESTED LOOPS                       |                               |   374 | 49368 |       |   261   (0)| 00:00:04 |
|  43 |          CONNECT BY PUMP                   |                               |       |       |       |            |          |
|  44 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                       |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  45 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT            |    18 |       |       |     2   (0)| 00:00:01 |
|  46 |        TABLE ACCESS BY INDEX ROWID         | ACCOUNT                       |     1 |    26 |       |     4   (0)| 00:00:01 |
|  47 |         INDEX RANGE SCAN                   | IDX_ACCOUNT_PARENT            |     1 |       |       |     3   (0)| 00:00:01 |
|  48 |   HASH JOIN RIGHT OUTER                    |                               | 52211 |    14M|       |   244K  (1)| 00:48:55 |
|  49 |    VIEW                                    | X_SECURITY_OBJECTS_V          |  1976 | 55328 |       |  3303   (1)| 00:00:40 |
|  50 |     HASH GROUP BY                          |                               |  1976 | 35568 |       |  3303   (1)| 00:00:40 |
|  51 |      VIEW                                  |                               |  1976 | 35568 |       |  3302   (1)| 00:00:40 |
|  52 |       SORT UNIQUE                          |                               |  1976 |   482K|       |  3302   (1)| 00:00:40 |
|  53 |        UNION-ALL                           |                               |       |       |       |            |          |
|  54 |         CONNECT BY WITH FILTERING (UNIQUE) |                               |       |       |       |            |          |
|  55 |          TABLE ACCESS FULL                 | ORGANIZATION                  |  1965 |   132K|       |   551   (2)| 00:00:07 |
|  56 |          HASH JOIN                         |                               |     5 |   675 |       |  1099   (1)| 00:00:14 |
|  57 |           CONNECT BY PUMP                  |                               |       |       |       |            |          |
|  58 |           TABLE ACCESS FULL                | ORGANIZATION                  | 76990 |  5939K|       |   547   (1)| 00:00:07 |
|  59 |         FILTER                             |                               |       |       |       |            |          |
|  60 |          CONNECT BY WITH FILTERING (UNIQUE)|                               |       |       |       |            |          |
|  61 |           TABLE ACCESS FULL                | ORGANIZATION                  |     4 |   272 |       |   548   (1)| 00:00:07 |
|  62 |           HASH JOIN                        |                               |     1 |   124 |       |  1096   (1)| 00:00:14 |
|  63 |            CONNECT BY PUMP                 |                               |       |       |       |            |          |
|  64 |            TABLE ACCESS FULL               | ORGANIZATION                  | 76990 |  5112K|       |   548   (1)| 00:00:07 |
|  65 |         INLIST ITERATOR                    |                               |       |       |       |            |          |
|  66 |          TABLE ACCESS BY INDEX ROWID       | ORGANIZATION                  |     1 |    37 |       |     5   (0)| 00:00:01 |
|  67 |           INDEX RANGE SCAN                 | UQ_ORGANIZATION_NATURAL       |     1 |       |       |     4   (0)| 00:00:01 |
|  68 |    HASH JOIN                               |                               | 52211 |    12M|       |   241K  (1)| 00:48:16 |
|  69 |     TABLE ACCESS FULL                      | CURRENCY                      |    40 |  2560 |       |     3   (0)| 00:00:01 |
|  70 |     HASH JOIN                              |                               | 52211 |  9993K|  3072K|   241K  (1)| 00:48:16 |
|  71 |      TABLE ACCESS FULL                     | ACCOUNT                       | 45570 |  2536K|       |   274   (1)| 00:00:04 |
|  72 |      HASH JOIN                             |                               | 52211 |  7087K|       |   240K  (1)| 00:48:06 |
|  73 |       TABLE ACCESS FULL                    | COMPANY                       |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  74 |       HASH JOIN RIGHT OUTER                |                               | 52211 |  6628K|       |   240K  (1)| 00:48:06 |
|  75 |        VIEW                                | --- VIEW_1                    |    19 |   247 |       |    18  (12)| 00:00:01 |
|  84 |        HASH JOIN RIGHT OUTER               |                               | 52211 |  5965K|       |   240K  (1)| 00:48:06 |
|  85 |         VIEW                               | --- VIEW_2                    |     2 |    26 |       |    30   (7)| 00:00:01 |
| 102 |         HASH JOIN                          |                               | 52211 |  5302K|       |   240K  (1)| 00:48:06 |
| 103 |          TABLE ACCESS FULL                 | TIME                          |     9 |   171 |       |     3   (0)| 00:00:01 |
| 104 |          NESTED LOOPS                      |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 105 |           NESTED LOOPS                     |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 106 |            NESTED LOOPS                    |                               |  5756 |   185K|       |   552   (1)| 00:00:07 |
| 107 |             FAST DUAL                      |                               |     1 |       |       |     2   (0)| 00:00:01 |
| 108 |             TABLE ACCESS FULL              | ORGANIZATION                  |  5756 |   185K|       |   550   (1)| 00:00:07 |
| 109 |            INDEX RANGE SCAN                | IDX_FACT_PLAN_SUPP_K_COMP_ORG |    65 |       |       |    19   (0)| 00:00:01 |
| 110 |           TABLE ACCESS BY INDEX ROWID      | FACT_PLAN_SUPP                |   110 |  5720 |       |    82   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                              |       |       |       |   545K(100)|          |
|  17 |  HASH JOIN RIGHT OUTER                     |                              |   539K|   162M|       |   545K  (2)| 01:49:03 |
|  18 |   VIEW                                     | X_SECURITY_ACCOUNTS_V        |   434 | 12152 |       |   368   (2)| 00:00:05 |
|  19 |    HASH GROUP BY                           |                              |   434 |  7812 |       |   368   (2)| 00:00:05 |
|  20 |     VIEW                                   |                              |   434 |  7812 |       |   367   (1)| 00:00:05 |
|  21 |      SORT UNIQUE                           |                              |   434 |   101K|       |   367   (1)| 00:00:05 |
|  22 |       UNION-ALL                            |                              |       |       |       |            |          |
|  23 |        CONNECT BY WITH FILTERING (UNIQUE)  |                              |       |       |       |            |          |
|  24 |         CONCATENATION                      |                              |       |       |       |            |          |
|  25 |          INLIST ITERATOR                   |                              |       |       |       |            |          |
|  26 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                      |     1 |    69 |       |     7   (0)| 00:00:01 |
|  27 |            INDEX RANGE SCAN                | UQ_ACCOUNT                   |     1 |       |       |     6   (0)| 00:00:01 |
|  28 |          INLIST ITERATOR                   |                              |       |       |       |            |          |
|  29 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                      |     1 |    69 |       |     7   (0)| 00:00:01 |
|  30 |            INDEX RANGE SCAN                | UQ_ACCOUNT                   |     1 |       |       |     6   (0)| 00:00:01 |
|  31 |         NESTED LOOPS                       |                              |    36 |  4752 |       |    34   (0)| 00:00:01 |
|  32 |          CONNECT BY PUMP                   |                              |       |       |       |            |          |
|  33 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                      |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  34 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT           |    18 |       |       |     2   (0)| 00:00:01 |
|  35 |        CONNECT BY WITH FILTERING (UNIQUE)  |                              |       |       |       |            |          |
|  36 |         CONCATENATION                      |                              |       |       |       |            |          |
|  37 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                      |     1 |    69 |       |     4   (0)| 00:00:01 |
|  38 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT           |     1 |       |       |     3   (0)| 00:00:01 |
|  39 |          INLIST ITERATOR                   |                              |       |       |       |            |          |
|  40 |           TABLE ACCESS BY INDEX ROWID      | ACCOUNT                      |    20 |  1380 |       |    47   (0)| 00:00:01 |
|  41 |            INDEX RANGE SCAN                | IDX_ACCOUNT_PARENT           |    20 |       |       |    39   (0)| 00:00:01 |
|  42 |         NESTED LOOPS                       |                              |   374 | 49368 |       |   261   (0)| 00:00:04 |
|  43 |          CONNECT BY PUMP                   |                              |       |       |       |            |          |
|  44 |          TABLE ACCESS BY INDEX ROWID       | ACCOUNT                      |    18 |  1368 |       |    10   (0)| 00:00:01 |
|  45 |           INDEX RANGE SCAN                 | IDX_ACCOUNT_PARENT           |    18 |       |       |     2   (0)| 00:00:01 |
|  46 |        TABLE ACCESS BY INDEX ROWID         | ACCOUNT                      |     1 |    26 |       |     4   (0)| 00:00:01 |
|  47 |         INDEX RANGE SCAN                   | IDX_ACCOUNT_PARENT           |     1 |       |       |     3   (0)| 00:00:01 |
|  48 |   HASH JOIN RIGHT OUTER                    |                              |   300K|    82M|       |   544K  (2)| 01:48:58 |
|  49 |    VIEW                                    | X_SECURITY_OBJECTS_V         |  1920 | 53760 |       |  3303   (1)| 00:00:40 |
|  50 |     HASH GROUP BY                          |                              |  1920 | 34560 |       |  3303   (1)| 00:00:40 |
|  51 |      VIEW                                  |                              |  1920 | 34560 |       |  3302   (1)| 00:00:40 |
|  52 |       SORT UNIQUE                          |                              |  1920 |   468K|       |  3302   (1)| 00:00:40 |
|  53 |        UNION-ALL                           |                              |       |       |       |            |          |
|  54 |         CONNECT BY WITH FILTERING (UNIQUE) |                              |       |       |       |            |          |
|  55 |          TABLE ACCESS FULL                 | ORGANIZATION                 |  1909 |   128K|       |   551   (2)| 00:00:07 |
|  56 |          HASH JOIN                         |                              |     5 |   675 |       |  1099   (1)| 00:00:14 |
|  57 |           CONNECT BY PUMP                  |                              |       |       |       |            |          |
|  58 |           TABLE ACCESS FULL                | ORGANIZATION                 | 76836 |  5927K|       |   547   (1)| 00:00:07 |
|  59 |         FILTER                             |                              |       |       |       |            |          |
|  60 |          CONNECT BY WITH FILTERING (UNIQUE)|                              |       |       |       |            |          |
|  61 |           TABLE ACCESS FULL                | ORGANIZATION                 |     4 |   272 |       |   548   (1)| 00:00:07 |
|  62 |           HASH JOIN                        |                              |     1 |   124 |       |  1096   (1)| 00:00:14 |
|  63 |            CONNECT BY PUMP                 |                              |       |       |       |            |          |
|  64 |            TABLE ACCESS FULL               | ORGANIZATION                 | 76836 |  5102K|       |   548   (1)| 00:00:07 |
|  65 |         INLIST ITERATOR                    |                              |       |       |       |            |          |
|  66 |          TABLE ACCESS BY INDEX ROWID       | ORGANIZATION                 |     1 |    37 |       |     5   (0)| 00:00:01 |
|  67 |           INDEX RANGE SCAN                 | UQ_ORGANIZATION_NATURAL      |     1 |       |       |     4   (0)| 00:00:01 |
|  68 |    HASH JOIN                               |                              |   300K|    74M|       |   541K  (2)| 01:48:18 |
|  69 |     TABLE ACCESS FULL                      | CURRENCY                     |    40 |  2560 |       |     3   (0)| 00:00:01 |
|  70 |     HASH JOIN RIGHT OUTER                  |                              |   300K|    56M|       |   541K  (2)| 01:48:18 |
|  71 |      VIEW                                  | --- VIEW_1                   |    19 |   247 |       |    18  (12)| 00:00:01 |
|  80 |      HASH JOIN                             |                              |   300K|    52M|  3072K|   541K  (2)| 01:48:18 |
|  81 |       TABLE ACCESS FULL                    | ACCOUNT                      | 45564 |  2536K|       |   274   (1)| 00:00:04 |
|  82 |       HASH JOIN                            |                              |   300K|    36M|  2576K|   539K  (2)| 01:47:50 |
|  83 |        TABLE ACCESS FULL                   | ORGANIZATION                 | 58577 |  1887K|       |   550   (1)| 00:00:07 |
|  84 |        HASH JOIN                           |                              |  6984K|   619M|       |   503K  (2)| 01:40:44 |
|  85 |         TABLE ACCESS FULL                  | COMPANY                      |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  86 |         HASH JOIN RIGHT OUTER              |                              |  6984K|   559M|       |   503K  (2)| 01:40:44 |
|  87 |          VIEW                              | --- VIEW_2                   |     2 |    26 |       |    29   (7)| 00:00:01 |
| 104 |          HASH JOIN                         |                              |  6984K|   472M|       |   503K  (2)| 01:40:43 |
| 105 |           NESTED LOOPS                     |                              |     9 |   171 |       |     5   (0)| 00:00:01 |
| 106 |            FAST DUAL                       |                              |     1 |       |       |     2   (0)| 00:00:01 |
| 107 |            TABLE ACCESS FULL               | TIME                         |     9 |   171 |       |     3   (0)| 00:00:01 |
| 108 |           TABLE ACCESS FULL                | FACT_PLAN_SUPP               |    84M|  4207M|       |   503K  (2)| 01:40:37 |
-----------------------------------------------------------------------------------------------------------------------------------

At this point I go into “astronomer” mode with a blink comparator. I’ll open two text files, one for each plan, in separate windows, position the windows one behind the other, align the texts and then “alt-tab” between the windows watching for the “blink” that highlights where the texts differ.

In the case of these two plans the numeric parts slip sideways by one character on the blink – fortunately I find that’s a small enough step that I don’t worry about doing a manual edit to re-align them. In the first stage I’m only looking for changes in the operations and objects, with a secondary check on the numbers. For these two plans I can see very quickly that they match up to operation 69 (highlighted in both plans above).

I’ve highlighted a few other lines in the first 69 operations of the plans:

  • Operation 17 has significantly different rows/costs in the two plans 93,600/244K vs. 539K/545K
  • Operation 48 also has significant differences: 52,000/244K vs. 300K/545K – and, as the second child of the hash join at operation 17, explains the previous differences
  • Operations 49 and 68 are the two child operations of the hash join at operation 48, with operation 68 carrying the differences
  • Operation 70 (where our plan shapes start to differ) is the second child of the hash join at operation 68 and introduces the differences

Although there are a few other differences in cardinality (rows) and cost estimates in the first 70 lines of the plans, they are small and (probably) ignorable by comparison.

The upshot of this is that we can assume (reasonably confidently) that the critical differences occur in the section of the plan that is the hash join [right outer] from operation 70 and its descendants, so our plans reduce to the following (which emulates in text the type of appearance you would get from OEM, or TOAD, or SQL*Developer by collapsing parts of plans):

------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                               |       |       |       |   244K(100)|          |
...
|  17 |  HASH JOIN RIGHT OUTER                     |                               | 93634 |    28M|       |   244K  (1)| 00:49:00 |
|  18 |   VIEW                                     | X_SECURITY_ACCOUNTS_V         |   434 | 12152 |       |   368   (2)| 00:00:05 |
...
|  48 |   HASH JOIN RIGHT OUTER                    |                               | 52211 |    14M|       |   244K  (1)| 00:48:55 |
|  49 |    VIEW                                    | X_SECURITY_OBJECTS_V          |  1976 | 55328 |       |  3303   (1)| 00:00:40 |
...
|  68 |    HASH JOIN                               |                               | 52211 |    12M|       |   241K  (1)| 00:48:16 |
|  69 |     TABLE ACCESS FULL                      | CURRENCY                      |    40 |  2560 |       |     3   (0)| 00:00:01 |
|  70 |     HASH JOIN                              |                               | 52211 |  9993K|  3072K|   241K  (1)| 00:48:16 |
|  71 |      TABLE ACCESS FULL                     | ACCOUNT                       | 45570 |  2536K|       |   274   (1)| 00:00:04 |
|  72 |      HASH JOIN                             |                               | 52211 |  7087K|       |   240K  (1)| 00:48:06 |
|  73 |       TABLE ACCESS FULL                    | COMPANY                       |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  74 |       HASH JOIN RIGHT OUTER                |                               | 52211 |  6628K|       |   240K  (1)| 00:48:06 |
|  75 |        VIEW                                | --- VIEW_1                    |    19 |   247 |       |    18  (12)| 00:00:01 |
|  84 |        HASH JOIN RIGHT OUTER               |                               | 52211 |  5965K|       |   240K  (1)| 00:48:06 |
|  85 |         VIEW                               | --- VIEW_2                    |     2 |    26 |       |    30   (7)| 00:00:01 |
| 102 |         HASH JOIN                          |                               | 52211 |  5302K|       |   240K  (1)| 00:48:06 |
| 103 |          TABLE ACCESS FULL                 | TIME                          |     9 |   171 |       |     3   (0)| 00:00:01 |
| 104 |          NESTED LOOPS                      |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 105 |           NESTED LOOPS                     |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 106 |            NESTED LOOPS                    |                               |  5756 |   185K|       |   552   (1)| 00:00:07 |
| 107 |             FAST DUAL                      |                               |     1 |       |       |     2   (0)| 00:00:01 |
| 108 |             TABLE ACCESS FULL              | ORGANIZATION                  |  5756 |   185K|       |   550   (1)| 00:00:07 |
| 109 |            INDEX RANGE SCAN                | IDX_FACT_PLAN_SUPP_K_COMP_ORG |    65 |       |       |    19   (0)| 00:00:01 |
| 110 |           TABLE ACCESS BY INDEX ROWID      | FACT_PLAN_SUPP                |   110 |  5720 |       |    82   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                              |       |       |       |   545K(100)|          |
...
|  17 |  HASH JOIN RIGHT OUTER                     |                              |   539K|   162M|       |   545K  (2)| 01:49:03 |
|  18 |   VIEW                                     | X_SECURITY_ACCOUNTS_V        |   434 | 12152 |       |   368   (2)| 00:00:05 |
...
|  48 |   HASH JOIN RIGHT OUTER                    |                              |   300K|    82M|       |   544K  (2)| 01:48:58 |
|  49 |    VIEW                                    | X_SECURITY_OBJECTS_V         |  1920 | 53760 |       |  3303   (1)| 00:00:40 |
...
|  68 |    HASH JOIN                               |                              |   300K|    74M|       |   541K  (2)| 01:48:18 |
|  69 |     TABLE ACCESS FULL                      | CURRENCY                     |    40 |  2560 |       |     3   (0)| 00:00:01 |
|  70 |     HASH JOIN RIGHT OUTER                  |                              |   300K|    56M|       |   541K  (2)| 01:48:18 |
|  71 |      VIEW                                  | --- VIEW_1                   |    19 |   247 |       |    18  (12)| 00:00:01 |
|  80 |      HASH JOIN                             |                              |   300K|    52M|  3072K|   541K  (2)| 01:48:18 |
|  81 |       TABLE ACCESS FULL                    | ACCOUNT                      | 45564 |  2536K|       |   274   (1)| 00:00:04 |
|  82 |       HASH JOIN                            |                              |   300K|    36M|  2576K|   539K  (2)| 01:47:50 |
|  83 |        TABLE ACCESS FULL                   | ORGANIZATION                 | 58577 |  1887K|       |   550   (1)| 00:00:07 |
|  84 |        HASH JOIN                           |                              |  6984K|   619M|       |   503K  (2)| 01:40:44 |
|  85 |         TABLE ACCESS FULL                  | COMPANY                      |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  86 |         HASH JOIN RIGHT OUTER              |                              |  6984K|   559M|       |   503K  (2)| 01:40:44 |
|  87 |          VIEW                              | --- VIEW_2                   |     2 |    26 |       |    29   (7)| 00:00:01 |
| 104 |          HASH JOIN                         |                              |  6984K|   472M|       |   503K  (2)| 01:40:43 |
| 105 |           NESTED LOOPS                     |                              |     9 |   171 |       |     5   (0)| 00:00:01 |
| 106 |            FAST DUAL                       |                              |     1 |       |       |     2   (0)| 00:00:01 |
| 107 |            TABLE ACCESS FULL               | TIME                         |     9 |   171 |       |     3   (0)| 00:00:01 |
| 108 |           TABLE ACCESS FULL                | FACT_PLAN_SUPP               |    84M|  4207M|       |   503K  (2)| 01:40:37 |
-----------------------------------------------------------------------------------------------------------------------------------

The image we get by stripping out (or collapsing) all the bits which look as if they aren’t significant is so much simpler to grasp than the original, and gives us a couple of extra ideas to follow if we were simply doing a generic “optimise this query” exercise starting from a position of ignorance. In fact all we’re after in this particular exercise is a pointer to the most likely place where “something changed”, as this may tell us why the plan changed and what we can do to get back the old plan and stabilise it for the future.
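
As an aside – if the only aim were to get the old plan back and stabilise it, and the “good” plan were still in the cursor cache – one option would be to load that plan into a SQL plan baseline. This is just a sketch, and the two substitution variables are placeholders for the statement’s sql_id and the plan_hash_value of the good plan:

declare
        l_loaded        pls_integer;
begin
        l_loaded := dbms_spm.load_plans_from_cursor_cache(
                sql_id          => '&m_sql_id',             -- sql_id of the statement
                plan_hash_value => &m_plan_hash_value,      -- plan_hash_value of the "good" plan
                fixed           => 'NO',
                enabled         => 'YES'
        );
        dbms_output.put_line('Plans loaded: ' || l_loaded);
end;
/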

Now that we’ve seen the condensed plans I’m going to reduce them one more time – this time dropping the collapsed sections entirely and hiding everything above operation 70 – because that’s a good focal point for investigating the change:

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|  70 |     HASH JOIN                              |                               | 52211 |  9993K|  3072K|   241K  (1)| 00:48:16 |
|  71 |      TABLE ACCESS FULL                     | ACCOUNT                       | 45570 |  2536K|       |   274   (1)| 00:00:04 |
|  72 |      HASH JOIN                             |                               | 52211 |  7087K|       |   240K  (1)| 00:48:06 |
|  73 |       TABLE ACCESS FULL                    | COMPANY                       |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  74 |       HASH JOIN RIGHT OUTER                |                               | 52211 |  6628K|       |   240K  (1)| 00:48:06 |
|  75 |        VIEW                                | --- VIEW_1                    |    19 |   247 |       |    18  (12)| 00:00:01 |
|  84 |        HASH JOIN RIGHT OUTER               |                               | 52211 |  5965K|       |   240K  (1)| 00:48:06 |
|  85 |         VIEW                               | --- VIEW_2                    |     2 |    26 |       |    30   (7)| 00:00:01 |
| 102 |         HASH JOIN                          |                               | 52211 |  5302K|       |   240K  (1)| 00:48:06 |
| 103 |          TABLE ACCESS FULL                 | TIME                          |     9 |   171 |       |     3   (0)| 00:00:01 |
| 104 |          NESTED LOOPS                      |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 105 |           NESTED LOOPS                     |                               |   634K|    51M|       |   240K  (1)| 00:48:06 |
| 106 |            NESTED LOOPS                    |                               |  5756 |   185K|       |   552   (1)| 00:00:07 |
| 107 |             FAST DUAL                      |                               |     1 |       |       |     2   (0)| 00:00:01 |
| 108 |             TABLE ACCESS FULL              | ORGANIZATION                  |  5756 |   185K|       |   550   (1)| 00:00:07 |
| 109 |            INDEX RANGE SCAN                | IDX_FACT_PLAN_SUPP_K_COMP_ORG |    65 |       |       |    19   (0)| 00:00:01 |
| 110 |           TABLE ACCESS BY INDEX ROWID      | FACT_PLAN_SUPP                |   110 |  5720 |       |    82   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------------------
|  70 |     HASH JOIN RIGHT OUTER                  |                              |   300K|    56M|       |   541K  (2)| 01:48:18 |
|  71 |      VIEW                                  | --- VIEW_1                   |    19 |   247 |       |    18  (12)| 00:00:01 |
|  80 |      HASH JOIN                             |                              |   300K|    52M|  3072K|   541K  (2)| 01:48:18 |
|  81 |       TABLE ACCESS FULL                    | ACCOUNT                      | 45564 |  2536K|       |   274   (1)| 00:00:04 |
|  82 |       HASH JOIN                            |                              |   300K|    36M|  2576K|   539K  (2)| 01:47:50 |
|  83 |        TABLE ACCESS FULL                   | ORGANIZATION                 | 58577 |  1887K|       |   550   (1)| 00:00:07 |
|  84 |        HASH JOIN                           |                              |  6984K|   619M|       |   503K  (2)| 01:40:44 |
|  85 |         TABLE ACCESS FULL                  | COMPANY                      |   112 |  1008 |       |     3   (0)| 00:00:01 |
|  86 |         HASH JOIN RIGHT OUTER              |                              |  6984K|   559M|       |   503K  (2)| 01:40:44 |
|  87 |          VIEW                              | --- VIEW_2                   |     2 |    26 |       |    29   (7)| 00:00:01 |
| 104 |          HASH JOIN                         |                              |  6984K|   472M|       |   503K  (2)| 01:40:43 |
| 105 |           NESTED LOOPS                     |                              |     9 |   171 |       |     5   (0)| 00:00:01 |
| 106 |            FAST DUAL                       |                              |     1 |       |       |     2   (0)| 00:00:01 |
| 107 |            TABLE ACCESS FULL               | TIME                         |     9 |   171 |       |     3   (0)| 00:00:01 |
| 108 |           TABLE ACCESS FULL                | FACT_PLAN_SUPP               |    84M|  4207M|       |   503K  (2)| 01:40:37 |
-----------------------------------------------------------------------------------------------------------------------------------

Comparing the two fragments we can see that they both have a funny little nested loop driven by a select from dual: perhaps that’s just there to inject a value like sysdate-N, or a single value from a PL/SQL function, or a call to sys_context() that can act as a “variable constant”. It probably doesn’t matter where that tablescan of dual goes, really, so long as it happens early enough; it’s guaranteed to be only one row so the fact that it drives a nested loop is only a formality. (I’d guess that it’s probably the leading table in the join order, but appears very late in the plan because of “swap_join_inputs” introduced for all the hash joins.)
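
For example – and this is only a sketch, the inline view and the column plan_date are pure invention and not taken from the original query – the general pattern is something like this:

select
        f.*
from
        (select trunc(sysdate) - 7 as start_date from dual)    params,
        fact_plan_supp                                          f
where
        f.plan_date >= params.start_date
;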

Apart from that one common nested loop (with different tables as the inner table) almost everything in both plans is a hash join with full tablescans – with one exception: in the bad plan the join to fact_plan_supp is an index-access nested loop driven by a tablescan of organization (technically it’s driven by the result of the join between dual and the result of the tablescan of organization, of course); in the good plan we see a full tablescan of fact_plan_supp – which is millions of rows at a huge fraction (503K/541K) of the total cost of the query.

It seems slightly surprising that such a massive and expensive tablescan should produce the good plan, but it does, and we have to ask why. Since the bad plan has the nested loop, and the nested loop is driven by table organization, we need to look at that table in both plans. In the bad plan the prediction is 5,756 rows; in the good plan the prediction is 58,577 rows. The dramatic change in that prediction is probably the cause of the change in plan. Imagine scaling up the number of rows in the bad plan by the factor of 10 implied by those figures: the cost of the nested loop (operations 104 and 105) would go up from roughly 240K to roughly 2.4M – vastly more than the cost (503K) of the tablescan chosen for the good plan. If the prediction for the good plan is in the right ball park then the error in the cardinality estimate the optimizer has made for the organization table is almost certain to be the cause of the performance problem seen in the bad plan.

The next task, of course, is to find out why the prediction changed – which means looking at the statistics on the organization table and finding out the values used for any predicates against that one table. (Unfortunately these plans report no predicate section – and the absence of any asterisks (“*”) against the operation lines suggests the plans came from the AWR, which doesn’t capture the predicate information.) There are a couple of common possibilities suggested by the scale of the difference:

  • A column that should have a histogram has lost its histogram
  • A column that didn’t have a histogram has acquired one
  • Some column data is skewed and has a histogram, and an unlucky choice of bind variable on the first call produced a very low estimate of cardinality
  • Stats collection for histograms is sampled in 11g (and earlier), and is still sampled for hybrid histograms in 12c – an unlucky sample produced a very misleading histogram
  • A bind variable has gone out of range of the low/high values with a significant pro-rated scale-down on the row estimate
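
Any of those possibilities can be checked fairly quickly once you can get at the data dictionary – as a minimal sketch (run as the owner of the table) something like the following will show which columns of organization currently have histograms, when the stats were last gathered, and how big the sample was:

select
        column_name, num_distinct, num_buckets, histogram,
        sample_size, last_analyzed
from
        user_tab_col_statistics
where
        table_name = 'ORGANIZATION'
order by
        column_name
;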

After doing the analysis above, I highlighted the organization table to the originator of the question – who came back with the information that a column had recently been added to the table by the process of dropping and recreating the table then gathering stats. If you do that, and have the stats collection method_opt left at its default, you won’t get any histograms initially: the default method_opt is “for all columns size auto” – which lets Oracle decide which columns should be tested for histograms based on historic use and data skew. If it’s a “new” table there will be no historic use, so no attempt to generate histograms. The first option on my list may be the right one.
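
If the missing histogram does turn out to be the problem then one short-term fix is to gather stats naming the relevant column explicitly, so that Oracle doesn’t have to wait for workload history before it decides to create the histogram. This is only a sketch – org_type is a hypothetical column name standing in for whichever column the critical predicate references – and note that with this method_opt the call re-gathers column stats only for the named column:

begin
        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         => 'ORGANIZATION',
                method_opt      => 'for columns org_type size 254',    -- org_type is hypothetical
                cascade         => true
        );
end;
/

A longer-term alternative would be to set a table-level method_opt preference with dbms_stats.set_table_prefs() so that the regular stats collection job carries on maintaining the histogram.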

Footnote:

It took about 15 minutes to copy the plans and do the analysis – it’s taken about 3 hours (plus a few more minutes this morning for a last spelling and grammar check) to explain what I did … and even then the explanation could do with more work.

 
