Oracle Scratchpad

October 11, 2019

v$session

Filed under: Execution plans,Infrastructure,Oracle,Performance — Jonathan Lewis @ 12:29 pm BST Oct 11,2019

Here’s an odd, and unpleasant, detail about querying v$session in the “most obvious” way. (And if you were wondering what made me resurrect and complete a draft on “my session id” a couple of days ago, this posting is the reason). Specifically if you want to select some information for your own session from v$session the query you’re likely to use in any recent version of Oracle will probably be of the form:


select {list for columns} from v$session where sid = to_number(sys_context('userenv','sid'));

Unfortunately that one little statement hides two anomalies – which you can see in the execution plan. Here’s a demonstration cut from an SQL*Plus session running under 19.3.0.0:


SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  gcfrzq9knynj3, child number 0
-------------------------------------
select program from V$session where sid = sys_context('userenv','sid')

Plan hash value: 2422122865

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  MERGE JOIN CARTESIAN     |                 |     1 |    33 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     1 |    12 |     0   (0)|
|*  3 |    FIXED TABLE FULL       | X$KSLWT         |     1 |     8 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSLED (ind:2) |     1 |     4 |     0   (0)|
|   5 |   BUFFER SORT             |                 |     1 |    21 |     0   (0)|
|*  6 |    FIXED TABLE FULL       | X$KSUSE         |     1 |    21 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("W"."KSLWTSID"=TO_NUMBER(SYS_CONTEXT('userenv','sid')))
   4 - filter("W"."KSLWTEVT"="E"."INDX")
   6 - filter((BITAND("S"."KSUSEFLG",1)<>0 AND BITAND("S"."KSSPAFLG",1)<>0 AND 
              "S"."INDX"=TO_NUMBER(SYS_CONTEXT('userenv','sid'))
              AND INTERNAL_FUNCTION("S"."CON_ID") AND "S"."INST_ID"=USERENV('INSTANCE')))

As you can see, v$session is a join of 3 separate structures – x$kslwt (v$session_wait), x$ksled (v$event_name), and x$ksuse (the original v$session as it was some time around 8i), and the plan shows two “full tablescans” and a Cartesian merge join. Tablescans and Cartesian merge joins are not necessarily bad – especially where small tables and tiny numbers of rows are concerned – but they do merit at least a brief glance.

x$ksuse is a C structure in the fixed SGA and that structure is a segmented array (which seems to be chunks of 126 entries in 19.3, and chunks of 209 entries in 12.2 – but that’s fairly irrelevant). The SID is simply the index into the array counting from 1, so if you have a query with a predicate like ‘SID = 99’ Oracle can work out the address of the 99th entry in the array and access it very quickly – which is why the SID column is reported as a “fixed index” column in the view v$indexed_fixed_column.
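If you want to see which columns of which x$ structures can be treated this way you can query that view directly – a quick sketch (run from a suitably privileged account):

select
        table_name, index_number, column_name, column_position
from
        v$indexed_fixed_column
where
        table_name in ('X$KSUSE', 'X$KSLWT', 'X$KSLED')
order by
        table_name, index_number, column_position
;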

But we have two problems immediately visible:

  1. the optimizer is not using the “index” to access x$ksuse despite the fact that we’re giving it exactly the value we want to use (and we can see a suitable predicate at operation 6 in the plan)
  2. the optimizer has decided to start executing the query at the x$kslwt table

Before looking at why things have gone wrong, let’s check the execution plan to see what would have happened if we’d copied the value from the sys_context() call into a bind variable and queried using the bind variable – which we’ll keep as a character type to make it a fair comparison:
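Here’s a sketch of how that test might be set up in SQL*Plus (the variable name is mine):

variable v1 varchar2(10)

begin
        :v1 := sys_context('userenv','sid');
end;
/

select program from v$session where sid = to_number(:v1);
select * from table(dbms_xplan.display_cursor);

And this is the resulting plan: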

SQL_ID  cm3ub1tctpdyt, child number 0
-------------------------------------
select program from v$session where sid = to_number(:v1)

Plan hash value: 1627146547

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  MERGE JOIN CARTESIAN     |                 |     1 |    32 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     1 |    12 |     0   (0)|
|*  3 |    FIXED TABLE FIXED INDEX| X$KSLWT (ind:1) |     1 |     8 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSLED (ind:2) |     1 |     4 |     0   (0)|
|   5 |   BUFFER SORT             |                 |     1 |    20 |     0   (0)|
|*  6 |    FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) |     1 |    20 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("W"."KSLWTSID"=TO_NUMBER(:V1))
   4 - filter("W"."KSLWTEVT"="E"."INDX")
   6 - filter(("S"."INDX"=TO_NUMBER(:V1) AND BITAND("S"."KSSPAFLG",1)<>0
              AND BITAND("S"."KSUSEFLG",1)<>0 AND INTERNAL_FUNCTION("S"."CON_ID") AND
              "S"."INST_ID"=USERENV('INSTANCE')))


When we have a (character) bind variable instead of a sys_context() value the optimizer manages to use the “fixed indexes”, but it still starts executing at x$kslwt, and it’s still doing a Cartesian merge join. The plan would be the same if the bind variable were a numeric type, and we’d still get the same plan if we replaced the bind variable with a literal number.

So problem number 1 is that Oracle only seems able to use the fixed index path for literal values and simple bind variables (plus a few “simple” functions). It doesn’t seem to use the fixed indexes for most functions (even deterministic ones) returning a value and the sys_context() function is a particular example of this.

Transitivity

Problem number 2 comes from a side-effect of something that I first described about 13 years ago – transitive closure. Take a look at the predicates in both the execution plans above. Where’s the join condition between x$ksuse and x$kslwt? There should be one, because the underlying SQL defining [g]v$session has the following joins:

from
      x$ksuse s,
      x$ksled e,
      x$kslwt w
where
      bitand(s.ksspaflg,1)!=0
and   bitand(s.ksuseflg,1)!=0
and   s.indx=w.kslwtsid       -- this is the SID column for v$session and v$session_wait
and   w.kslwtevt=e.indx
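
If you want to check the full view text for yourself you can pull it from v$fixed_view_definition – a quick sketch, remembering to set long to something large first:

set long 20000

select
        view_definition
from
        v$fixed_view_definition
where
        view_name = 'GV$SESSION'
;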

What’s happened here is that the optimizer has used transitive closure – “if a = b and b = c then a = c” – to clone the predicate “s.indx = to_number(sys_context(…))” into “w.kslwtsid = to_number(sys_context(…))”. But at the same time the optimizer has eliminated the predicate “s.indx = w.kslwtsid”, which it shouldn’t do because ever since 10g (and right up to the 19.3.0.0 used for this demonstration) we’ve had the parameter _optimizer_transitivity_retain = true — but SYS is ignoring the parameter setting.

So we no longer have a join condition between x$ksuse and x$kslwt – which means there has to be a Cartesian merge join between them, and the only question is whether it should take place before or after the join between x$kslwt and x$ksled. In fact the order doesn’t really matter, because there will be only one row identified in x$kslwt and one row in x$ksuse, and the join to x$ksled is simply a lookup (by undeclarable unique key) to translate an id into a name, and it will take place only once whatever we do about the other two structures.

But there is a catch – especially if your sessions parameter is 25,000 (which it shouldn’t be) and the number of connected sessions is currently 20,000 (which it shouldn’t be) – the predicate against x$ksuse does a huge amount of work as it walks the entire array testing every row (and it doesn’t even do the indx test first – it does a couple of bitand() operations). Even then this wouldn’t be a disaster – we’re only talking a couple of hundredths of a second of CPU – until you find the applications that run this query a huge number of times.

We would prefer to avoid two full tablescans since the arrays could be quite large, and of the two it’s the tablescan of x$ksuse that is going to be the greater threat; so is there a way to bypass the threat?  Once we’ve identified the optimizer anomaly we’ve got a pointer to a solution. Transitivity is going wrong, so let’s attack the transitivity. Checking the hidden parameters we can find a parameter: _optimizer_generate_transitive_pred which defaults to true, so let’s set it to false for the query and check the plan:

select
        /*+ opt_param('_optimizer_generate_transitive_pred','FALSE') */
        program
from    v$session
where   sid = sys_context('userenv','sid')

Plan hash value: 3425234845

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  NESTED LOOPS             |                 |     1 |    32 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     1 |    28 |     0   (0)|
|   3 |    FIXED TABLE FULL       | X$KSLWT         |    47 |   376 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) |     1 |    20 |     0   (0)|
|*  5 |   FIXED TABLE FIXED INDEX | X$KSLED (ind:2) |     1 |     4 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(("S"."INDX"="W"."KSLWTSID" AND BITAND("S"."KSSPAFLG",1)<>0
              AND BITAND("S"."KSUSEFLG",1)<>0 AND "S"."INDX"=TO_NUMBER(SYS_CONTEXT('user
              env','sid')) AND INTERNAL_FUNCTION("S"."CON_ID") AND
              "S"."INST_ID"=USERENV('INSTANCE')))
   5 - filter("W"."KSLWTEVT"="E"."INDX")


Although it’s not nice to use hidden parameters to manipulate the optimizer, we do have a result. We don’t have any filtering on x$kslwt – fortunately this seems to be limited in size (but see footnote) to the number of current sessions (unlike x$ksuse, which has an array size defined by the sessions parameter or derived from the processes parameter). For each row in x$kslwt we do an access into x$ksuse using the “index” (note that we don’t see access predicates for the fixed tables – we just have to note that the operation says FIXED INDEX and spot the “index-related” predicate in the filter predicate list), so this strategy has reduced the number of times we check the complex predicate against x$ksuse rows.

It’s still far from ideal, though. What we’d really like to do is access x$kslwt by index using the known value from sys_context(‘userenv’,’sid’). As it stands, the path we get by setting a hidden parameter (one that isn’t even listed as legal for the opt_param() hint) is the same plan we would get from an unhinted query that searched for audsid = sys_context(‘userenv’,’sessionid’):

SQL_ID  7f3f9b9f32u7z, child number 0
-------------------------------------
select  program from  v$session where  audsid =
sys_context('userenv','sessionid')

Plan hash value: 3425234845

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  NESTED LOOPS             |                 |     2 |    70 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     2 |    62 |     0   (0)|
|   3 |    FIXED TABLE FULL       | X$KSLWT         |    47 |   376 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) |     1 |    23 |     0   (0)|
|*  5 |   FIXED TABLE FIXED INDEX | X$KSLED (ind:2) |     1 |     4 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(("S"."INDX"="W"."KSLWTSID" AND BITAND("S"."KSSPAFLG",1)<>0
              AND BITAND("S"."KSUSEFLG",1)<>0 AND "S"."KSUUDSES"=TO_NUMBER(SYS_CONTEXT('
              userenv','sessionid')) AND INTERNAL_FUNCTION("S"."CON_ID") AND
              "S"."INST_ID"=USERENV('INSTANCE')))
   5 - filter("W"."KSLWTEVT"="E"."INDX")


The bottom line, then, seems to be that if you need a query by SID against v$session to be as efficient as possible then your best bet is to load a numeric variable with the value of sys_context(‘userenv’,’sid’) and then select where “sid = :bindvariable”. Otherwise query by audsid, or use a hidden parameter to affect the optimizer.
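Here’s a minimal sketch of that strategy in PL/SQL (names and formatting are mine):

set serveroutput on

declare
        m_sid           number;
        m_program       v$session.program%type;
begin
        m_sid := to_number(sys_context('userenv','sid'));

        select  program
        into    m_program
        from    v$session
        where   sid = m_sid;            -- simple numeric bind: fixed index access

        dbms_output.put_line(m_program);
end;
/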

Until the SYS schema follows the _optimizer_transitivity_retain parameter or treats sys_context() the same way it treats a bind variable, there is always going to be some unnecessary work when querying v$session, and that excess will grow with either the number of connected sessions (if you optimize the query) or with the value of the sessions parameter.

Footnote

In (much) older versions of Oracle v$session_wait sat on top of x$ksusecst, which was part of the same C structure as x$ksuse. In newer versions of Oracle x$kslwt is a structure that is created on demand in the PGA/UGA – I hope that there’s a short cut that allows Oracle to find the waiting elements in x$ksuse[cst] efficiently, rather than requiring a walk through the whole thing, otherwise a tablescan of the (nominally smaller) x$kslwt structure will be at least as expensive as a tablescan of the x$ksuse structure.
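If you want a quick sanity check of the relative sizes on your own system, a sketch (run as SYS) might be:

select
        (select count(*) from x$kslwt)  kslwt_rows,
        (select count(*) from x$ksuse)  ksuse_rows
from
        dual
;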

Update (just a few minutes after posting)

Bob Bryla has pointed out in a tweet that there are many “bugs” not fixed until 19.1 for which the workaround is to set “_optimizer_transitivity_retain” to false. So maybe this isn’t an example of SYS doing something particularly strange – it may be part of a general reworking of the mechanism that still has a couple of undesirable side effects.

Bob’s comment prompted me to clone the x$ tables into real tables in a non-SYS schema and model the fixed indexes with primary keys, and I found that the resulting plan (though very efficient) still discarded the join predicate. So we may be seeing the side effects of a code enhancement relating to generating predicates that produce unique key access paths. (A “contrary” test, the one in the 2013 article I linked to, still retains the join predicate for the query that has non-unique indexes.)


September 28, 2019

Negative Offload

Filed under: Exadata,Execution plans,HCC,Oracle,Troubleshooting — Jonathan Lewis @ 5:38 pm BST Sep 28,2019

At the Trivadis Performance Days 2019 I did a presentation on using execution plans to understand what a query was doing. One of the examples I showed was a plan from an Exadata system (using 11.2.0.4) that needed to go faster. The plan was from the SQL Monitor report and all I want to show you is one line that’s reporting a tablescan. To fit the screen comfortably I’ve removed a number of columns from the output.

The report had been generated while the statement was still running (hence the “->” at the left hand edge) and the query had scanned 166 segments (with no partition elimination) of a table with 4,500 data segments (450 range partitions and 10 hash sub-partitions – note the design error, by the way: hash partitioning in Oracle should always use a power of 2 for the number of hash (sub)partitions).


SQL Plan Monitoring Details (Plan Hash Value=3764612084)  
============================================================================================================================================
| Id   |           Operation            | Name  | Read  | Read  | Write | Write |   Cell   | Mem  | Activity |       Activity Detail       |  
|      |                                |       | Reqs  | Bytes | Reqs  | Bytes | Offload  |      |   (%)    |         (# samples)         |   
============================================================================================================================================
| -> 5 |      TABLE ACCESS STORAGE FULL | TXN   |  972K | 235GB |       |       | -203.03% |   7M |    63.43 | Cpu (1303)                  | 
|      |                                |       |       |       |       |       |          |      |          | cell smart table scan (175) | 
============================================================================================================================================

In the presentation I pointed out that for a “cell smart table scan” (note the Activity Detail column) this line was using a surprisingly large amount of CPU.

We had been told that the table was using hybrid columnar compression (HCC) and had been given some figures that showed the compression factor was slightly better than 4. I had also pointed out that the typical size of a read request was 256KB (compare Read Reqs with Read Bytes).

To explain the excessive CPU I claimed that we were seeing “double decompression” – the cell was decompressing (uncompressing) compression units (CUs), finding that the resulting decompressed data was larger than the 1MB unit that Exadata allows and sending the original compressed CU to the database server where it was decompressed again – and the server side decompression was burning up the CPU.

This claim is (almost certainly) true – but the justification I gave for the claim was at best incomplete (though, to be brutally honest, I have to admit that I’d made a mistake): I pointed out that the Cell Offload was negative 200% and that this was what told us about the double decompression. While double decompression was probably happening, the implication I had made was that a negative offload automatically indicated double decompression – and that was an incorrect assumption on my part. Fortunately Maurice Müller caught up with me after the session was over, pointed out the error, then emailed me a link to a relevant article by Ahmed Aangour.

The Cell Offload is a measure of the difference between the volume of data read and the volume of data returned to the server. If the cell reads 256KB from disc, but the column and row selection means the cell returns 128KB, the Cell Offload would be 50%; if the cell returns 64KB the Cell Offload would be 75% (100 * (1 – 64KB/256KB)). But what if you select all the rows and columns from a compressed table? The volume of data after decompression would be larger than the compressed volume the cell had read from disc – and in this case we knew that we were reading 256KB at a time and the compression factor was slightly greater than 4, so the uncompressed data would probably be around 1MB, giving us a Cell Offload of 100 * (1 – 1024KB / 256KB) = negative 300%.

Key Point: Any time that decompression, combined with the row and column selection, produces more data than the volume of data read from disc the Cell Offload will go negative. A negative Cell Offload is not inherently a problem (though it might hint at a suboptimal use of compression).

Follow-up Analysis

Despite the error in my initial understanding, the claim that we were seeing double decompression was still (almost certainly) true – but we need to be a little more sophisticated in the analysis. The clue is in the arithmetic a few lines further up the page. We can see that we are basically reading 256KB chunks of the table, and we know that 256KB will expand to roughly 1MB, so we ought to see a Cell Offload of about -300%; but the Cell Offload is -200%. This suggests fairly strongly that on some of the reads the decompressed data is slightly less than 1MB, which allows the cell to return the decompressed data to the database server, while some of the time the decompressed data is greater than 1MB, forcing the cell to send the original (compressed) CU to the database server.

We may even be able to work the arithmetic backwards to estimate the number of times double decompression appeared. Assume that two-thirds of the time the cell decompressed the data and successfully sent (just less than) 1MB back to the database server, and one-third of the time the cell decompressed the data, found that the result was too large, and sent 256KB of compressed data back to the server. Let’s work with the 972,000 read requests reported and see what drops out of the arithmetic (there’s a SQL sanity check after the list):

  • Total data read: 972,000 * 256KB = 243,000 MB
  • Data sent to db server:  648,000 * 1MB + 324,000 * 256KB = 729,000 MB
  • Cell Offload = 100 * (1 – 729/243) = -200%   Q.E.D.
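
The same arithmetic can be packaged as a quick SQL sanity check (a sketch – the 2/3 : 1/3 split is the guess, and everything is in MB):

select
        round(972000 * 2/3)             mb_sent_uncompressed,   -- 1MB per successful CU
        round(972000 * 1/3 * 0.25)      mb_sent_compressed,     -- 256KB per failed CU
        round(
                100 * (1 -
                        (972000 * 2/3 + 972000 * 1/3 * 0.25) /
                        (972000 * 0.25)
                )
        )                               cell_offload_pct        -- comes out at -200
from
        dual
;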

Of course it would be nice to avoid guessing – and if we were able to check the session activity stats (v$sesstat) while the query was running (or after it had completed) we could pick up several numbers that confirmed our suspicion. For 11.2.0.4, for example, we would keep an eye on:

	cell CUs sent uncompressed
	cell CUs processed for uncompressed
	EHCC {class} CUs Decompressed

Differences between these stats allow you to work out the number of compression units that failed the 1MB test on the cell server and were sent to the database server to be decompressed. There is actually another statistic named “cell CUs sent compressed” which would make life easy for us, but I’ve not seen it populated in my tests – so maybe it doesn’t mean what it seems to say.
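A sketch of a suitable query for your own session (statistic names may vary slightly with version):

select
        sn.name, st.value
from
        v$statname      sn,
        v$sesstat       st
where
        st.statistic# = sn.statistic#
and     st.sid        = to_number(sys_context('userenv','sid'))
and     (
            sn.name like 'cell CUs%'
         or sn.name like 'EHCC%Decompressed'
        )
order by
        sn.name
;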

Here’s an example from an 11.2.0.4 system that I presented a few years ago showing some sample numbers.

cell CUs sent uncompressed              5,601
cell CUs processed for uncompressed     5,601

EHCC CUs Decompressed                  17,903
EHCC Query High CUs Decompressed       12,302 

This reveals an annoying feature of 11g (continued in 12.1) that results in double counting of the statistics, confusing the issue when you’re trying to analyze what’s going on. In this case the table consisted of 12,302 compression units, and the query was engineered to cause the performance problem to appear. The first two statistics show us how many CUs were decompressed successfully (we’ll see a change appearing there in 12.1). We then see that all 12,302 of the table’s “query high” compression units were decompressed – but the “total” of all CUs decompressed was 17,903.

It’s not a coincidence that 12,302 + 5,601 = 17,903; there’s some double counting going on. I don’t know how many of the statistics are affected in this way, but Oracle has counted the CUs that passed decompression once as they were processed at the cell server and again as they arrived at the database server. In this example we can infer that 12,302 – 5,601 = 6,701 compression units failed decompression at the cell server and were sent to the database server in compressed form to be decompressed again.

Here’s a couple of sets of figures from some similar tests run on 12.1.0.2 – one with a table compressed to query high, the other compressed to query low. There is one critical difference from the 11g figures, but the same double-counting seems to have happened. In both cases the “EHCC Query [Low|High] CUs Decompressed” figures show the correct number of CUs in each table. Note, though, that the “cell CUs processed for uncompressed” in 12.1 appears to report the number of attempted decompressions rather than 11g’s number of successful decompressions.


=========================

cell CUs sent uncompressed                     19,561	-- successful decompressions at cell server
cell CUs processed for uncompressed            19,564	=> 3 failures

EHCC CUs Decompressed                          39,125	=  2 * 19,561 successes + 3 db server decompressions
EHCC Query High CUs Decompressed               19,564

=========================

cell CUs sent uncompressed                     80,037	-- successful decompressions at cell server
cell CUs processed for uncompressed            82,178	=> 2,141 failures

EHCC CUs Decompressed                         162,215	=  2 * 80,037 successes + 2,141 db server decompressions
EHCC Query Low CUs Decompressed                82,178

=========================

I’ve annotated the figures to explain the arithmetic.

There has been some significant renaming and separation of statistics in 12.2, as described in this post by Roger MacNicol, and the problems of double-counting should have disappeared. I haven’t yet tested my old models in the latest versions of Oracle, though, so can’t show you any figures to demonstrate the change.

Takeaways

There are 4 key points to note in this posting.

  • Hash (sub)partitioning should be based on powers of 2, otherwise some partitions will be twice the size of others.
  • There is a 1MB limit on the “data packet” sent between the cell server and database server in Exadata.
  • If you select a large fraction of the rows and columns from an HCC compressed table you may end up decompressing a lot of your data twice if the decompressed data for a read request is larger than the 1MB unit (and the cost will be highly visible at the database server as CPU usage).
  • The Cell Offload figure for a tablescan (in particular) will go negative if the volume of data sent from the cell server to the database server is larger than the volume of data read from disk – even if double decompression hasn’t been happening.

A little corollary to the third point: if you are writing to a staging table with the expectation of doing an unfiltered tablescan (or a select *), then you probably don’t want to use hybrid columnar compression on the table as you will probably end up using a lot of CPU at the database server to compress it, then do double-decompression using even more CPU on the database server.  It’s only if you really need to minimise disk usage and have lots of CPU capacity to spare that you have a case for using hybrid columnar compression for the table (and Oracle In-Memory features may also change the degree of desirability).
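If you want to check whether an existing staging table has HCC enabled, the data dictionary will tell you – a quick sketch (the table name is hypothetical):

select
        table_name, compression, compress_for
from
        user_tables
where
        table_name = 'STAGING_TABLE'    -- hypothetical name
;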

Footnote

I haven’t said anything about accessing table data by index when the table is subject to HCC compression. I haven’t tested the mechanism in recent versions of Oracle but it used to be the case that the cell server would supply the whole compression unit (CU) to the database server which would decompress it to construct the relevant row. One side effect of this was that the same CU could be decompressed (with a high CPU load) many times in the course of a single query.


August 23, 2019

Optimizer Tricks 1

Filed under: CBO,Execution plans,Indexing,Oracle — Jonathan Lewis @ 12:39 pm BST Aug 23,2019

I’ve got a number of examples of clever little tricks the optimizer can do to transform your SQL before starting in on the arithmetic of optimisation. I was prompted to publish this one by a recent thread on ODC. It’s worth taking note of these tricks when you spot one as a background knowledge of what’s possible makes it much easier to interpret and trouble-shoot from execution plans. I’ve labelled this one “#1” since I may publish a few more examples in the future, and then I’ll have to catalogue them – but I’m not making any promises about that.

Here’s a table definition, and a query that’s hinted to use an index on that table.


rem
rem     Script:         optimizer_tricks_01.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2019
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             11.2.0.4
rem

create table t1 (
        v1      varchar2(10),
        v2      varchar2(10),
        v3      varchar2(10),
        padding varchar2(100)
);

create index t1_i1 on t1(v1, v2, v3);


explain plan for
select
        /*+ index(t1 (v1, v2, v3)) */
        padding 
from 
        t1
where
        v1 = 'ABC'
and     nvl(v3,'ORA$BASE') = 'SET2'
;

select * from table(dbms_xplan.display);

The query uses the first and third columns of the index, but wraps the 3rd column in an nvl() function. Because of the hint the optimizer will generate a plan with an index range scan, but the question is – what will the Predicate Information tell us about Oracle’s use of my two predicates:


Plan hash value: 3320414027

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |     1 |    66 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    66 |     0   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"='ABC')
       filter(NVL("V3",'ORA$BASE')='SET2')

The nvl() test is used during the index range scan (from memory I think much older versions of Oracle would have postponed the predicate test until they had accessed the table itself). This means Oracle will do a range scan over the whole section of the index where v1 = ‘ABC’, testing every index entry it finds against the nvl() predicate.

But what happens if we modify column v3 to be NOT NULL? (“alter table t1 modify v3 not null;”) Here’s the new plan:


Plan hash value: 3320414027

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |     1 |    66 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    66 |     0   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"='ABC' AND "V3"='SET2')
       filter("V3"='SET2')


The optimizer will decide that with the NOT NULL status of the column the nvl() function can be eliminated and the predicate can be replaced with a simple column comparison. At this point the v3 predicate can now be used to reduce the number of index entries that need to be examined by using a type of skip-scan/iterator approach, but Oracle still has to test the predicate against the index entries it walks through – so the predicate still appears as a filter predicate as well.
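In effect the optimizer has behaved as if the query had been written with a simple equality – a sketch of the logical transformation (not necessarily the exact internal rewrite):

select
        /*+ index(t1 (v1, v2, v3)) */
        padding
from
        t1
where
        v1 = 'ABC'
and     v3 = 'SET2'             -- nvl() eliminated because v3 is now NOT NULL
;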

You might notice, by the way, that the Plan hash value does not change as the predicate use changes – even though the change in use of predicates could make a huge difference to the performance. (As indicated in the comments at the top of the script, I’ve run this model against 11.2.0.4 – which is the version used in the ODC thread – and 19.3.0.0: the behaviour is the same in both versions, and the Plan hash value doesn’t change from version to version.)

Footnote

The reason why I decided to publish this note is that the original thread on the ODC forums reported the following contradictory details – an index definition and the optimizer’s use of that index as shown in the predicate section of the plan:


Index column name      Column position
---------------------- ----------------
FLEX_VALUE_SET_ID      1
PARENT_FLEX_VALUE      2
RANGE_ATTRIBUTE        3
CHILD_FLEX_VALUE_LOW   4
CHILD_FLEX_VALUE_HIGH  5
ZD_EDITION_NAME        6

---------------------------------------------------------------------------
|* 17 |      INDEX RANGE SCAN             | FND_FLEX_VALUE_NORM_HIER_U1   |
---------------------------------------------------------------------------
  17 - access("FLEX_VALUE_SET_ID"=:B1 AND NVL("ZD_EDITION_NAME",'ORA$BASE')='SET2')  
       filter((NVL("ZD_EDITION_NAME",'ORA$BASE')='SET2'  ..... lots more bits of filter predicate.

Since the expression nvl(zd_edition_name, ‘ORA$BASE’) = ‘SET2’ appears as an access predicate and a filter predicate it must surely be a column in the index. So either this isn’t the definition of the index being used or, somehow, there’s a trick that allows zd_edition_name to appear as a column name in the index when it really means nvl(zd_edition_name,’ORA$BASE’) at run-time. (And if there is I want to know what it is – edition-based redefinition and tricks with virtual columns spring to mind, but I avoid thinking about complicated explanations when a simpler one might be available.)


August 20, 2019

Join View

Filed under: constraints,Infrastructure,Joins,Oracle — Jonathan Lewis @ 12:39 pm BST Aug 20,2019

It’s strange how one thing leads to another when you’re trying to check some silly little detail. This morning I wanted to find a note I’d written about the merge command and “stable sets”, and got to a draft about updatable join views that I’d started in 2016 in response to a question on OTN (as it was at the time), which in turn led me back to a model I’d written in 2008 showing that the manuals were wrong.

Since the manual – even the 19c manual – is still wrong regarding the “Delete Rule” for updatable (modifiable) join views I thought I’d quickly finish off the draft and post the 2008 script. Here’s what the manual says about deleting from join views (my emphasis on “exactly”):

Rows from a join view can be deleted as long as there is exactly one key-preserved table in the join. The key preserved table can be repeated in the FROM clause. If the view is defined with the WITH CHECK OPTION clause and the key preserved table is repeated, then the rows cannot be deleted from the view.

But here’s a simple piece of code to model a delete from a join view that breaks the rule:


rem
rem     Script:         delete_join.sql 
rem     Dated:          Dec 2008
rem     Author:         J P Lewis
rem

create table source
as
select level n1
from dual
connect by level <= 10
/ 
 
create table search
as
select level n1
from dual
connect by level <= 10
/ 

alter table source modify n1 not null;
alter table search modify n1 not null;

create unique index search_idx on search(n1);
-- create unique index source_idx on source(n1)

I’ve set up a “source” and a “search” table with 10 rows each and the option for creating unique indexes on each table for a column that’s declared non-null. Initially, though, I’ve only created the index on search to see what happens when I run a couple of “join view” deletes using “ANSI” syntax.

prompt  ===============================
prompt  Source referenced first in ANSI
prompt  ===============================

delete from (select * from source s join search s1 on s.n1 = s1.n1);
select count(1) source_count from source;
select count(1) search_count from search;
rollback;
 
prompt  ===============================
prompt  Search referenced first in ANSI
prompt  ===============================

delete from (select * from search s join source s1 on s.n1 = s1.n1);
select count(1) source_count from source;
select count(1) search_count from search;
rollback;

With just one of the two unique indexes in place the order of the tables in the inline view makes no difference to the outcome. Thanks to the unique index on search, any row in the inline view corresponds to exactly one row in the source table, while a single row in the search table could end up appearing in many rows in the view – so the delete implicitly has to operate as “delete from source”. So both deletes will result in the source_count being zero, and the search_count remaining at 10.

If we now repeat the experiment but create BOTH unique indexes, both source and search will be key-preserved in the join. According to the manual the delete should produce some sort of error. In fact the delete works in both cases – but the order that the tables appear makes a difference. When source is the first table in the in-line view the source_count drops to zero and the search_count stays at 10; when search is the first table in the in-line view the search_count drops to zero and the source_count stays at 10.

I wouldn’t call this totally unreasonable – but it’s something you need to know if you’re going to use the method, and something you need to document very carefully in case someone editing your code at a later date (or deciding that they could add a unique index) doesn’t realise the significance of the table order.
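One way of making the situation visible is to create the join view as a stored view and ask the data dictionary which columns Oracle considers modifiable – a sketch (the view name is mine):

create or replace view source_search_v as
select
        s.n1    source_n1,
        s1.n1   search_n1
from
        source s
join
        search s1
on      s.n1 = s1.n1
;

select
        column_name, updatable, insertable, deletable
from
        user_updatable_columns
where
        table_name = 'SOURCE_SEARCH_V'
;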

This does lead on to another important test – is it the order in which the tables appear in the from clause that matters, or the join order that Oracle chooses when optimising the query? (We hope – and expect – that it’s the order as written, not the join order as optimised, otherwise the effect of the delete could change from day to day as the optimizer chose different execution plans!) To confirm my expectation I switched to traditional Oracle syntax with hints (still with unique indexes on both tables), writing a query with search as the first table in the from clause, but hinting the inline view to vary the optimised join order.


prompt  ============================================
prompt  Source hinted as leading table in join order 
prompt  ============================================

delete from (
        select 
                /*+ leading(s1, s) */
                * 
        from 
                search s,
                source s1 
        where
                s.n1 = s1.n1
        )
;

select count(1) source_count from source; 
select count(1) search_count from search;
rollback;

prompt  ============================================
prompt  Search hinted as leading table in join order 
prompt  ============================================

delete from (
        select 
                /*+ leading(s, s1) */
                * 
        from 
                search s,
                source s1 
        where
                s.n1 = s1.n1
        )
;

select count(1) source_count from source; 
select count(1) search_count from search;
rollback;

In both cases the rows were deleted from search (the first table in from clause). And, to answer the question you should be asking, I did check the execution plans to make sure that the hints had been effective:


============================================
Source hinted as leading table in join order
============================================

------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
------------------------------------------------------------------
|   0 | DELETE STATEMENT    |            |    10 |    60 |     1 |
|   1 |  DELETE             | SEARCH     |       |       |       |
|   2 |   NESTED LOOPS      |            |    10 |    60 |     1 |
|   3 |    INDEX FULL SCAN  | SOURCE_IDX |    10 |    30 |     1 |
|*  4 |    INDEX UNIQUE SCAN| SEARCH_IDX |     1 |     3 |       |
------------------------------------------------------------------

============================================
Search hinted as leading table in join order
============================================

------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
------------------------------------------------------------------
|   0 | DELETE STATEMENT    |            |    10 |    60 |     1 |
|   1 |  DELETE             | SEARCH     |       |       |       |
|   2 |   NESTED LOOPS      |            |    10 |    60 |     1 |
|   3 |    INDEX FULL SCAN  | SEARCH_IDX |    10 |    30 |     1 |
|*  4 |    INDEX UNIQUE SCAN| SOURCE_IDX |     1 |     3 |       |
------------------------------------------------------------------

Summary

Using updatable join views to handle deletes can be very efficient but the manual’s statement of the “Delete Rule” is incorrect. It is possible to have several key-preserved tables in the view that you’re using, and if that’s the case you need to play safe and ensure that the table you want to delete from is the first table in the from clause. This means taking steps to eliminate the risk of someone editing some code at a later date without realising the importance of the table order.
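If you don’t want the result to depend on table order at all, a safer (if more verbose) alternative is to name the target table explicitly – a sketch:

delete from search s
where   exists (
                select  null
                from    source s1
                where   s1.n1 = s.n1
        )
;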

Update (very shortly after publication)

Iduith Mentzel has pointed out in comment #1 below that the SQL Language Reference Guide and the DBA Administration Guide are not consistent in their descriptions of deleting from a join view, and that the SQL Language Reference Guide correctly states that the delete will be applied to the first mentioned key-preserved table.


August 8, 2019

Free Space

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 1:27 pm BST Aug 8,2019

Several years ago I wrote a note about reporting dba_free_space and dba_extents to produce a map of the space usage in a tablespace in anticipation of messing about with moving or rebuilding objects to try and reduce the size of the files in the tablespace.  In the related page where I published the script I pointed out that a query against dba_extents would be expensive because it makes use of structure x$ktfbue which generates the information dynamically by reading segment header blocks. I also pointed out in a footnote to the original article that if you’ve enabled the recyclebin and have “dropped” some objects then there will be some space that is reported as free but is not quite free since the extents will still be allocated. This brings me to the topic for today’s blog.

While visiting a client site recently I came across an instance that was running a regular report to monitor available space in the database. Basically this was a query against view dba_free_space. Surprisingly it was taking a rather long time to complete – and the reason for this came in two parts: first, the recyclebin was enabled and had some objects in it; secondly, there were no stats on the fixed object x$ktfbue.

In the case of the client the particular query produced a plan that included the following lines:


Id  Operation             Name              Rows    Bytes  Cost (%CPU)  Time
--  --------------------- ----------------  ----   ------  -----------  --------
63  HASH JOIN                               2785     212K     46  (85)  00:00:01
64    TABLE ACCESS FULL   RECYCLEBIN$       1589    20657      7   (0)  00:00:01
65    FIXED TABLE FULL    X$KTFBUE          100K    6347K     38 (100)  00:00:01 

This is part of the view where Oracle calculates the size of all the extents of objects in the recyclebin so that they can be reported as free space. Notice that in this plan (which is dependent on version, system stats, object_stats and various optimizer parameters) the optimizer has chosen to do a hash join between the recyclebin (recyclebin$) and the x$ structure – and that has resulted in a “full tablescan” of x$ktfbue, which means Oracle reads the segment header block of every single segment in the entire database. (I don’t know where the row stats came from as there were no stats on x$ktfbue, and this plan was pulled from the AWR history tables so the query had been optimised and captured some time in the past.)

If there had been nothing in the recyclebin the hash join and two tablescans wouldn’t have mattered; unfortunately the recyclebin had been enabled and there were a few rows in recyclebin$, so the “tablescan” happened. Here’s a cut-n-paste from a much simpler query run against a fairly new (no 3rd party app) database running 12.1.0.2 to give you some idea of the impact:


SQL> execute snap_events.start_snap

PL/SQL procedure successfully completed.

SQL> select count(*) from x$ktfbue;

  COUNT(*)
----------
      8774

1 row selected.

SQL> execute snap_events.end_snap
---------------------------------------------------------
Session Events - 01-Aug 21:28:13
---------------------------------------------------------
Event                                             Waits   Time_outs        Csec    Avg Csec    Max Csec
-----                                             -----   ---------        ----    --------    --------
Disk file operations I/O                              7           0           0        .018           1
db file sequential read                           5,239           0          14        .003           6
SQL*Net message to client                             7           0           0        .000           0
SQL*Net message from client                           7           0       1,243     177.562         572
events in waitclass Other                             3           1           0        .002           0

PL/SQL procedure successfully completed.

On my little laptop, with nothing else going on, I’ve managed to get away with “only” 5,239 single block reads, and squeezed them all into just 14 centiseconds (local SSD helps). The client wasn’t so lucky – they were seeing tens of thousands of real physical reads.

The ideal solution, of course, was to purge the recyclebin and disable the feature – it shouldn’t be necessary to enable it on a production system – but that’s something that ought to require at least some paperwork. In the short term gathering stats on the fixed table helped, because the plan changed from a hash join with “tablescan” of x$ktfbue to a nested loop with an “indexed” access path, looking more like the following (from a query against just recyclebin$ and x$ktfbue):

---------------------------------------------------------------------------------------------
| Id  | Operation                | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                  |       |       |     4 (100)|          |
|   1 |  NESTED LOOPS            |                  |     7 |   182 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL      | RECYCLEBIN$      |     6 |    66 |     4   (0)| 00:00:01 |
|*  3 |   FIXED TABLE FIXED INDEX| X$KTFBUE (ind:1) |     1 |    15 |     0   (0)|          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("BUE"."KTFBUESEGBNO"="BLOCK#" AND "BUE"."KTFBUESEGTSN"="TS#" AND
              "BUE"."KTFBUESEGFNO"="FILE#"))

This was still fairly resource-intensive for the client, but was something of an improvement – they had a lot more than 6 items in their recyclebin.

Part of the problem, of course, is that x$ktfbue is one of the objects that Oracle skips when you gather “fixed object” stats – gathering stats on it can be a bit expensive for exactly the reason that querying it is expensive: all those single block segment header reads.

If you want to check the stats and gather them (as a one-off, probably) here’s some suitable SQL:


select
        table_name, num_rows, avg_row_len, sample_size, last_analyzed
from
        dba_tab_statistics
where
        owner = 'SYS'
and     table_name = 'X$KTFBUE'
;

begin
        dbms_stats.gather_table_stats('SYS','X$KTFBUE');
end;
/

Summary

You probably shouldn’t have the recyclebin enabled in a production system; but if you do, and if you also run a regular report on free space (as many sites seem to do) make sure (a) you have a regular routine to minimise the number of objects that it accumulates and (b) gather statistics (occasionally) on x$ktfbue to minimise the overhead of the necessary join between recyclebin$ and x$ktfbue.
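
For reference, a sketch of the housekeeping commands (to be run by a suitably privileged account – and only after the necessary paperwork):

purge dba_recyclebin;

alter system set recyclebin = off deferred;     -- DEFERRED is required; affects subsequent sessions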

July 12, 2019

opt_estimate 5

Filed under: CBO,Execution plans,Hints,Oracle,Statistics — Jonathan Lewis @ 10:28 am BST Jul 12,2019

If you’ve been wondering why I resurrected my drafts on the opt_estimate() hint: a few weeks ago I received an email containing an example of a query where a couple of opt_estimate() hints were simply not working. The critical feature of the example was that the basic structure of the query was of a type that I had not previously examined. That’s actually a common type of problem when trying to investigate any Oracle feature from cold – you can spend days thinking about all the possible scenarios you should model, then the first time you need to apply your knowledge to a production system the requirement falls outside every model you’ve examined.

Before you go any further reading this note, though, I should warn you that it ends in frustration because I didn’t find a solution to the problem I wanted to fix – possibly because there just isn’t a solution, possibly because I didn’t look hard enough.

So here’s a simplified version of the problem – it involves pushing a predicate into a union all view. First some data and a baseline query:

rem
rem     Script:         opt_estimate_3a.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2019
rem

create table t1
as
select
        rownum                          id,
        100 * trunc(rownum/100)-1       id2,
        mod(rownum,1e3)                 n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        dual
connect by
        rownum <= 1e4   -- > comment to avoid WordPress format issue
;

create table t2a pctfree 75 as select * from t1;
create table t2b pctfree 75 as select * from t1;

create index t2ai on t2a(id);
create index t2bi on t2b(id);

explain plan for
select
        t1.v1,
        t2.flag,
        t2.v1
from
        t1,
        (select 'a' flag, t2a.* from t2a
         union all
         select 'b', t2b.* from t2b
        )       t2
where
        t2.id = t1.n1
and     t1.id = 99
/

select * from table(dbms_xplan.display(null,null,'outline alias'))
/


There is one row with t1.id = 99, and I would like the optimizer to use an indexed access path to select the one matching row from each of the two tables in the union all view. The smart execution plan would be a nested loop using a “pushed join predicate” – and that’s exactly what we get by default with this data set:


-----------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |     2 |    96 |    30   (4)| 00:00:01 |
|   1 |  NESTED LOOPS                          |      |     2 |    96 |    30   (4)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                    | T1   |     1 |    19 |    26   (4)| 00:00:01 |
|   3 |   VIEW                                 |      |     1 |    29 |     4   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE          |      |       |       |            |          |
|   5 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2A  |     1 |    15 |     2   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN                  | T2AI |     1 |       |     1   (0)| 00:00:01 |
|   7 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2B  |     1 |    15 |     2   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN                  | T2BI |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SEL$1        / T1@SEL$1
   3 - SET$5715CE2E / T2@SEL$1
   4 - SET$5715CE2E
   5 - SEL$639F1A6F / T2A@SEL$2
   6 - SEL$639F1A6F / T2A@SEL$2
   7 - SEL$B01C6807 / T2B@SEL$3
   8 - SEL$B01C6807 / T2B@SEL$3

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$639F1A6F" "T2A"@"SEL$2")
      INDEX_RS_ASC(@"SEL$639F1A6F" "T2A"@"SEL$2" ("T2A"."ID"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B01C6807" "T2B"@"SEL$3")
      INDEX_RS_ASC(@"SEL$B01C6807" "T2B"@"SEL$3" ("T2B"."ID"))
      USE_NL(@"SEL$1" "T2"@"SEL$1")
      LEADING(@"SEL$1" "T1"@"SEL$1" "T2"@"SEL$1")
      NO_ACCESS(@"SEL$1" "T2"@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SET$1")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SEL$2")
      OUTLINE_LEAF(@"SEL$1")
      PUSH_PRED(@"SEL$1" "T2"@"SEL$1" 1)
      OUTLINE_LEAF(@"SET$5715CE2E")
      OUTLINE_LEAF(@"SEL$B01C6807")
      OUTLINE_LEAF(@"SEL$639F1A6F")
      ALL_ROWS
      OPT_PARAM('_nlj_batching_enabled' 0)
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."ID"=99)
   6 - access("T2A"."ID"="T1"."N1")
   8 - access("T2B"."ID"="T1"."N1")

So that worked well – operation 2 predicts one row for the tablescan of t1, with a nested loop join and union all pushed predicate where an index range scan of t2ai and t2bi gives us one row from each table. The “Predicate Information” tells us that the t1.n1 join predicate has been pushed inside the view to both subqueries, so we see “t2a.id = t1.n1” and “t2b.id = t1.n1”.

So what if I want to tell Oracle that it will actually find 5 rows in the t2a range scan and table access and 7 rows in the t2b range scan and table access (perhaps in a more complex query that would persuade Oracle to use two different indexes to get into the view and change the join order and access method for the next few tables it accessed)? Since I’ve recently written about the nlj_index_scan option for opt_estimate() you might think that this is the one we need to use – perhaps something like:


opt_estimate(@sel$639f1a6f nlj_index_scan, t2a@sel$2 (t1), t2ai, scale_rows=5)
opt_estimate(@sel$b01c6807 nlj_index_scan, t2b@sel$3 (t1), t2bi, scale_rows=7)

You’ll notice I’ve been very careful to find the fully qualified aliases for t2a and t2b by looking at the “Query Block Name / Object Alias” section of the plan (if the view appeared as a result of Oracle using Concatenation or OR-Expansion you would find that you got two query block names that looked similar but had suffixes of “_1” and “_2”). But it wasn’t worth the effort: it didn’t work. Fiddling around with all the possible variations I could think of didn’t help (maybe I should have used set$5715ce2e as the query block target for both the hints – no; what if I …)

Of course if we look at the “Outline Data” we’d notice that the use_nl() hint in the outline says: “USE_NL(@SEL$1 T2@SEL$1)”, so we don’t have a nested loop into t2a and t2b; we have a nested loop into the view t2. So I decided to forget the nested loop idea and just go for the indexes (and the tables, when I got to them) with the following hints (you’ll notice that during the course of my experiments I added my own query block names to the initial query blocks – so the generated query block names have changed):



explain plan for
select
        /*+
                qb_name(main)
                opt_estimate(@sel$f2bf1101, index_scan, t2a@subq_a, t2ai, scale_rows=5)
                opt_estimate(@sel$f2bf1101, table,      t2a@subq_a,       scale_rows=5)
                opt_estimate(@sel$f4e7a233, index_scan, t2b@subq_b, t2bi, scale_rows=7)
                opt_estimate(@sel$f4e7a233, table,      t2b@subq_b,       scale_rows=7)
        */
        t1.v1,
        t2.flag,
        t2.v1
from
        t1,
        (select /*+ qb_name(subq_a) */ 'a' flag, t2a.* from t2a
         union all
         select /*+ qb_name(subq_b) */ 'b', t2b.* from t2b
        )       t2
where
        t2.id = t1.n1
and     t1.id = 99
;

select * from table(dbms_xplan.display(null,null,'outline alias'));


-----------------------------------------------------------------------------------------------
| Id  | Operation                              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |      |     2 |    96 |    30   (4)| 00:00:01 |
|   1 |  NESTED LOOPS                          |      |     2 |    96 |    30   (4)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                    | T1   |     1 |    19 |    26   (4)| 00:00:01 |
|   3 |   VIEW                                 |      |     1 |    29 |     4   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE          |      |       |       |            |          |
|   5 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2A  |     5 |    75 |     2   (0)| 00:00:01 |
|*  6 |      INDEX RANGE SCAN                  | T2AI |     5 |       |     1   (0)| 00:00:01 |
|   7 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2B  |     7 |   105 |     2   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN                  | T2BI |     7 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------


Excellent – we get the cardinalities we want to see for the tables – except the view operator doesn't hold the sum of the table cardinalities, and the join doesn't multiply up the estimates either. I couldn't find a way of getting the view to show 12 rows (not even with a guessed – but presumably unimplemented – opt_estimate(view …) hint!). However, during the course of my experiments I tried the hint: "opt_estimate(@main, table, t2@main, scale_rows=15)". This didn't have any visible effect in the plan but while searching through the 10053 trace file I found the following lines:

Table Stats::
  Table: from$_subquery$_002  Alias: T2  (NOT ANALYZED)
  #Rows: 20000  SSZ: 0  LGR: 0  #Blks:  37  AvgRowLen:  15.00  NEB: 0  ChainCnt:  0.00  ScanRate:  0.00  SPC: 0  RFL: 0  RNF: 0  CBK: 0  CHR: 0  KQDFLG: 9

Access path analysis for from$_subquery$_002
    >> Single Tab Card adjusted from 20000.000000 to 300000.000000 due to opt_estimate hint

Access path analysis for from$_subquery$_002
    >> Single Tab Card adjusted from 12.000000 to 180.000000 due to opt_estimate hint

So at some point in the code path the optimizer is aware that 5 + 7 = 12, and that 12 * 15 = 180. But this doesn’t show up in the final execution plan. You might notice, by the way, that the scale_rows=15 has been applied NOT ONLY to the place where I was aiming – it’s also been applied to scale up the 20,000 rows that are estimated to be in the union all to 300,000 as the estimate for a tablescan of the two tables.

Possibly if I spent more time working through the 10053 trace file (which, as I've said before, I try to avoid doing) I might have found exactly which code path Oracle followed to get to the plan it produced and managed to tweak some hints to get the numbers I wanted to see. Possibly the optimizer was already following the code path that actually produced the numbers I wanted, then "forgot" to use them. One day, perhaps, I'll take another look at the problem – but since I wasn't trying to solve a problem for a client (and given that there was an alternative workaround) I closed the 10053 trace file and put the model aside for a rainy day.

Footnote

One thought did cross my mind as a way of finding out if there was a real solution – and I offer this for anyone who wants to play: create a second data set that genuinely produces the 5 and 7 I want to see (and check that the view reports the sum of the two components); then run the original query against the original data so that you've got the execution plan in memory, then overwrite the original data with the new data set (without changing the statistics on the original). Then use the SQL Tuning Advisor to see if it produces a SQL profile for the captured SQL_ID that reproduces the correct plan for the second data set, and check what opt_estimate() hints it uses.  (Warning – this may turn into a frustrating waste of time.)
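
For reference, the SQL Tuning Advisor can be driven from PL/SQL in a handful of calls – a minimal sketch (the sql_id and task names below are placeholders, not values from my test):

declare
        m_task  varchar2(64);
begin
        m_task := dbms_sqltune.create_tuning_task(
                sql_id    => '0wxyzabc1def2',           -- placeholder sql_id
                task_name => 'opt_est_task'
        );
        dbms_sqltune.execute_tuning_task(m_task);
end;
/

-- report the findings, including any suggested profile and its opt_estimate() hints
select dbms_sqltune.report_tuning_task('opt_est_task') from dual;

-- and, if the recommended profile looks sensible:
-- execute dbms_sqltune.accept_sql_profile(task_name => 'opt_est_task', name => 'opt_est_profile')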

Update Oct 2019

I’ve been saying for years that I don’t like the trick of pulling the Outline Information from an execution plan in memory and storing it in the database as an SQL Profile because that’s effectively storing an SQL Plan Baseline as an SQL Profile and there might be subtle and (potentially) misleading side effects of abusing the two mechanisms. Behind the argument I’ve also made the observation that while both mechamisms store hints, the hints for an SQL Profile are about statistics and the hints for an SQL Plan Baseline are about transformations, joins, and other mechanis.

However …

I’ve now down the test I described in the foot note above – created a table with data in it that made Oracle choose full tablescans for the t2a and t2b tables, then changed the data (without changing the object statistic) and run the SQL Tuning tool to see if the optimizer would suggest the plan I wanted and offer a profile to produce it.

I was successful – Oracle offered the profile, and when I looked at it (before accepting it) it looked like this:


         1 OPT_ESTIMATE(@"SEL$1", TABLE, "T2"@"SEL$1", SCALE_ROWS=200)
         1 OPT_ESTIMATE(@"SEL$1", JOIN, ("T2"@"SEL$1", "T1"@"SEL$1"), SCALE_ROWS=15)
         1 OPTIMIZER_FEATURES_ENABLE(default)
         1 IGNORE_OPTIM_EMBEDDED_HINTS

But when I accepted it and looked at it again it looked like this:


        BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$639F1A6F" "T2A"@"SEL$2")
        IGNORE_OPTIM_EMBEDDED_HINTS
        BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B01C6807" "T2B"@"SEL$3")
        INDEX_RS_ASC(@"SEL$B01C6807" "T2B"@"SEL$3" ("T2B"."ID"))
        USE_NL(@"SEL$1" "T2"@"SEL$1")
        LEADING(@"SEL$1" "T1"@"SEL$1" "T2"@"SEL$1")
        NO_ACCESS(@"SEL$1" "T2"@"SEL$1")
        FULL(@"SEL$1" "T1"@"SEL$1")
        OUTLINE(@"SET$1")
        OUTLINE(@"SEL$3")
        OUTLINE(@"SEL$2")
        OUTLINE_LEAF(@"SEL$1")
        PUSH_PRED(@"SEL$1" "T2"@"SEL$1" 1)
        OUTLINE_LEAF(@"SET$5715CE2E")
        OUTLINE_LEAF(@"SEL$B01C6807")
        OUTLINE_LEAF(@"SEL$639F1A6F")
        ALL_ROWS
        DB_VERSION('19.1.0')
        OPTIMIZER_FEATURES_ENABLE('19.1.0')
        INDEX_RS_ASC(@"SEL$639F1A6F" "T2A"@"SEL$2" ("T2A"."ID"))

In other words, Oracle has recorded something that looks like an SQL Plan Baseline and called it an SQL Profile.
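
If you want to pull the stored hints for an accepted profile yourself, the hint text ends up in sys.sqlobj$data. Here's a sketch of the sort of query that has worked for me on recent versions – the profile name is a placeholder (check dba_sql_profiles for the real one), and you should treat the join conditions as a sketch rather than a documented interface:

select
        extractvalue(value(h),'.')      hint
from
        sys.sqlobj$             so,
        sys.sqlobj$data         od,
        table(xmlsequence(
                extract(xmltype(od.comp_data),'/outline_data/hint')
        ))                      h
where
        so.name      = 'SYS_SQLPROF_placeholder'
and     od.signature = so.signature
and     od.category  = so.category
and     od.obj_type  = so.obj_type
and     od.plan_id   = so.plan_id
;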

July 1, 2019

opt_estimate 4

Filed under: CBO,Execution plans,Hints,Oracle,Statistics — Jonathan Lewis @ 1:18 pm BST Jul 1,2019

In the previous article in this series on the opt_estimate() hint I mentioned the "query_block" option for the hint. If you can identify a specific query block that becomes an "outline_leaf" in an execution plan (perhaps because you've deliberately given a query block name to an inline subquery and applied the no_merge() hint to it) then you can use the opt_estimate() hint to tell the optimizer how many rows will be produced by that query block (each time it starts). The syntax of the hint is very simple:


opt_estimate(@{query block name}  query_block  rows={number of rows})

As with other options for the hint, you can use scale_rows=, min=, max= as alternatives (the last seems to be used in the code generated by Oracle for materialized view refreshes) but the simple “rows=N” is likely to be the most popular. In effect it does the same as the “non-specific” version of the cardinality() hint – which I’ve suggested from time to time as a way of telling the optimizer the size of a data set in a materialized CTE (“with” subquery), e.g.


set serveroutput off

with demo as (
        select  /*+
                        qb_name(mat_cte)
                        materialize
                        cardinality(@mat_cte 11)
--                      opt_estimate(@mat_cte query_block rows=11)
                */
                distinct trunc(created)    date_list
        from    all_objects
)
select  * from demo
;

select * from table(dbms_xplan.display_cursor);
    

Regardless of whether you use the opt_estimate() or cardinality() hint above, the materialized temporary table will be reported with 11 rows. (Note that in this case, where the hint appears inside the query block it applies to, the "@mat_cte" isn't strictly necessary.)

In the previous article I generated some data with a script called opt_est_gby.sql to show you the effects of the group_by and having options of the opt_estimate() hint, and pointed out that there were cases where you might also want to include the query_block option as well. Here's a final example query showing the effect of all three options (using the scale_rows form throughout) after creating a table t2 as a copy of t1 – but with pctfree 75 to make a tablescan more expensive – and creating an index on t2(id):


create table t2 pctfree 75 as select * from t1;
create index t2_i1 on t2(id);

select
        t2.n1, t1ct
from
        t2,
        (
        select  /*+
                        qb_name(main)
                        opt_estimate(@main group_by scale_rows=4)
                        opt_estimate(@main having scale_rows=0.4)
                        opt_estimate(@main query_block scale_rows=0.5)
                */
                mod(n1,10), count(*) t1ct
        from    t1
        group by
                mod(n1,10)
        having
                count(*) > 100
        ) v1
where
        t2.id = v1.t1ct
;

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     8 |   168 |    27   (8)| 00:00:01 |
|   1 |  NESTED LOOPS                |       |     8 |   168 |    27   (8)| 00:00:01 |
|   2 |   NESTED LOOPS               |       |     8 |   168 |    27   (8)| 00:00:01 |
|   3 |    VIEW                      |       |     8 |   104 |    10  (10)| 00:00:01 |
|*  4 |     FILTER                   |       |       |       |            |          |
|   5 |      HASH GROUP BY           |       |     8 |    32 |    10  (10)| 00:00:01 |
|   6 |       TABLE ACCESS FULL      | T1    |  3000 | 12000 |     9   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN          | T2_I1 |     1 |       |     1   (0)| 00:00:01 |
|   8 |   TABLE ACCESS BY INDEX ROWID| T2    |     1 |     8 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(COUNT(*)>100)
   7 - access("T2"."ID"="V1"."T1CT")


I’ve inlined the last query (with the two opt_estimate() hints) that I used in the previous article, and added a third opt_estimate() hint to that inline view. In this case I didn’t have to add a no_merge() hint because the numbers worked in my favour but to be safe in a production environment that’s a hint that I should have included.

You may recall that the hash group by on its own resulted in a prediction of 200 rows, and with the having clause the prediction dropped to 10 rows (standard 5%). With my three opt_estimate() hints in place I should see the effects of the following arithmetic:


group by      200       * 4   = 800
having        5% of 800 * 0.4 =  16
query block   16        * 0.5 =   8

As you can see, the cardinality prediction for the VIEW operation is, indeed, 8 – so the combination of hints has worked. It’s just a shame that we can’t see the three individual steps in the arithmetic as we walk the plan.

A Warning

As always I can only repeat – hinting is not easy; and "not easy" usually translates to "not stable / not safe" (and, thanks to a Freudian slip while typing, "not sage"). You probably don't know how to do it properly, except in the very simplest cases, and we don't really know how Oracle is interpreting the hints (particularly the undocumented ones). Here's an example of how puzzling even the opt_estimate(query_block) hint can be – as usual starting with some data:

rem
rem     Script:         opt_estimate_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2017
rem

create table t1
as
select * from all_objects;

create table t2
as
select * from all_objects;

As you can see, I’ve been a bit lazy with this example (which I wrote a couple of years ago) and it uses all_objects as a convenient source of data. Unfortunately this means you won’t necessarily be able to reproduce exactly the results I’m about to show you, which I did on a small instance of 12.2.0.1. I’m going to examine four versions of a simple query which

  • restricts the rows from t1,
  • finds the unique set of object_types in that subset of t1
  • then joins to t2 by object_type

select
        /*+ 
                qb_name(main)
        */
        t2.object_id, t2.object_name, created
from    (
        select  /*+ qb_name(inline) */
                distinct object_type
        from    t1 
        where 
                created >= date'2017-03-01' 
        )       v1,
        t2
where
        t2.object_type = v1.object_type
;


select
        /*+ 
                qb_name(main)
                merge(@inline)
        */
        t2.object_id, t2.object_name, created
from    (
        select  /*+ qb_name(inline) */
                distinct object_type
        from    t1 
        where 
                created >= date'2017-03-01' 
        )       v1,
        t2
where
        t2.object_type = v1.object_type
;


select
        /*+ 
                qb_name(main)
                opt_estimate(@inline query_block rows=14)
        */
        t2.object_id, t2.object_name, created
from    (
        select  /*+ qb_name(inline) */
                distinct object_type
        from    t1 
        where 
                created >= date'2017-03-01' 
        )       v1,
        t2
where
        t2.object_type = v1.object_type
;


select
        /*+ 
                qb_name(main)
                merge(@inline)
                opt_estimate(@inline query_block rows=14)
        */
        t2.object_id, t2.object_name, created
from    (
        select  /*+ qb_name(inline) */
                distinct object_type
        from    t1 
        where 
                created >= date'2017-03-01' 
        )       v1,
        t2
where
        t2.object_type = v1.object_type
;

The first version is my unhinted baseline (where, in my case, Oracle doesn't use complex view merging), the second forces complex view merging of the inline aggregate view, then queries 3 and 4 repeat queries 1 and 2 but tell the optimizer that the number of distinct object_type values is 14 (roughly half the actual number in my case). But there is an oddity in the last query – I've told the optimizer how many rows it should estimate for the inline view but I've also told it to get rid of the inline view and merge it into the outer query block; so what effect is that going to have? My hope would be that the hint would have to be ignored because it applies to a query block that doesn't exist in the final plan, which makes it irrelevant and unusable. Here are the four execution plans:


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      | 61776 |  4464K|   338   (7)| 00:00:01 |
|*  1 |  HASH JOIN           |      | 61776 |  4464K|   338   (7)| 00:00:01 |
|   2 |   VIEW               |      |    27 |   351 |   173   (9)| 00:00:01 |
|   3 |    HASH UNIQUE       |      |    27 |   486 |   173   (9)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| T1   | 59458 |  1045K|   164   (4)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | T2   | 61776 |  3680K|   163   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."OBJECT_TYPE"="V1"."OBJECT_TYPE")
   4 - filter("CREATED">=TO_DATE(' 2017-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


--------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           | 61776 |  5308K|       |  1492   (2)| 00:00:01 |
|   1 |  VIEW                  | VM_NWVW_1 | 61776 |  5308K|       |  1492   (2)| 00:00:01 |
|   2 |   HASH UNIQUE          |           | 61776 |  5489K|  6112K|  1492   (2)| 00:00:01 |
|*  3 |    HASH JOIN RIGHT SEMI|           | 61776 |  5489K|       |   330   (5)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | T1        | 59458 |  1045K|       |   164   (4)| 00:00:01 |
|   5 |     TABLE ACCESS FULL  | T2        | 61776 |  4403K|       |   163   (4)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T2"."OBJECT_TYPE"="OBJECT_TYPE")
   4 - filter("CREATED">=TO_DATE(' 2017-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      | 32032 |  2314K|   338   (7)| 00:00:01 |
|*  1 |  HASH JOIN           |      | 32032 |  2314K|   338   (7)| 00:00:01 |
|   2 |   VIEW               |      |    14 |   182 |   173   (9)| 00:00:01 |
|   3 |    HASH UNIQUE       |      |    14 |   252 |   173   (9)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| T1   | 59458 |  1045K|   164   (4)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | T2   | 61776 |  3680K|   163   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."OBJECT_TYPE"="V1"."OBJECT_TYPE")
   4 - filter("CREATED">=TO_DATE(' 2017-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


--------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |    14 |  1232 |       |  1492   (2)| 00:00:01 |
|   1 |  VIEW                  | VM_NWVW_1 |    14 |  1232 |       |  1492   (2)| 00:00:01 |
|   2 |   HASH UNIQUE          |           |    14 |  1274 |  6112K|  1492   (2)| 00:00:01 |
|*  3 |    HASH JOIN RIGHT SEMI|           | 61776 |  5489K|       |   330   (5)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | T1        | 59458 |  1045K|       |   164   (4)| 00:00:01 |
|   5 |     TABLE ACCESS FULL  | T2        | 61776 |  4403K|       |   163   (4)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T2"."OBJECT_TYPE"="OBJECT_TYPE")
   4 - filter("CREATED">=TO_DATE(' 2017-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The first plan tells us that most of the rows in t1 have created > 1st March 2017 and there are (estimated) 27 distinct values for object_type; and there are 61,776 rows in t2 (which is basically the same as t1), and none of them are eliminated by the join on object_type from the inline view.

The second plan (with the forced complex view merging) shows Oracle changing the view with "distinct" into a (right) semi-join between t2 and t1 with the internal view name of VM_NWVW_1 – and the cardinality is correct.

The third plan shows that my hint telling the optimizer to assume the original inline view produces 14 rows has been accepted and, not surprisingly, when we claim that we have roughly half the number of object_type values the final estimate of rows in the join is roughly halved.

So what happens in the fourth plan when our hint applies to a view that no longer exists? I think the optimizer should have discarded the hint as irrelevant the moment it merged the view. Unfortunately it seems to have carried the hint up into the merged view and used it to produce a wildly inaccurate estimate for the final cardinality. If this had been a three-table join this is the sort of error that could make a sensible hash join into a third table become an unbelievably stupid nested loop join. If you had thought you were doing something incredibly clever with (just) the one opt_estimate() hint, the day might come when a small change in the statistics resulted in the optimizer using a view merge strategy you’d never seen before and producing a catastrophic execution plan in (say) an overnight batch that then ran “forever”.

Hinting is hard: you really have to be extremely thorough in your hints and make sure you cover all the options that might appear. And then you might still run into something that looks (as this does) like a bug.

Footnote

Here’s a closing thought: even if you manage to tell the optimizer exactly how many rows will come out of a query block to be joined to the next table in the query, you may still get a very bad plan unless you can also tell the optimizer how many distinct values of the join column(s) there are in that data set. Which means you may also have to learn all about the (even more undocumented) column_stats() hint.

 

June 28, 2019

opt_estimate 3

Filed under: CBO,Execution plans,Hints,Oracle,Statistics — Jonathan Lewis @ 1:12 pm BST Jun 28,2019

This is just a quick note to throw out a couple of the lesser-known options for the opt_estimate() hint – and they may be the variants that turn out to be most useful since they address a problem where the optimizer can produce consistently bad cardinality estimates. The first is the "group by" option – a hint that I once would have called a "strategic" hint but which more properly ought to be called a "query block" hint. Here's the simplest possible example (tested under 12.2, 18.3 and 19.2):


rem
rem     Script:         opt_est_gby.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2019
rem 

create table t1
as
select
        rownum                  id,
        mod(rownum,200)         n1,
        lpad(rownum,10,'0')     v1,
        rpad('x',100)           padding
from
        dual
connect by
        level <= 3000
;

set autotrace on explain

prompt  =============================
prompt  Baseline cardinality estimate
prompt  (correct cardinality is 10)
prompt  Estimate will be 200
prompt  =============================

select  /*+
                qb_name(main)
        */
        mod(n1,10), count(*) 
from    t1 
group by 
        mod(n1,10)
;

I’ve generated a table of 3,000 rows with a column n1 holding 15 rows each of 200 distinct values. The query then aggregates on mod(n1,10) so it has to return 10 rows, but the optimizer doesn’t have a mechanism for inferring this and produces the following plan – the Rows value from the HASH GROUP BY at operation 1 is the only thing we’re really interested in here:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   200 |   800 |    10  (10)| 00:00:01 |
|   1 |  HASH GROUP BY     |      |   200 |   800 |    10  (10)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |  3000 | 12000 |     9   (0)| 00:00:01 |
---------------------------------------------------------------------------

It looks as if the optimizer’s default position is to use num_distinct from the underlying column as the estimate for the aggregate. We can work around this in the usual two ways with an opt_estimate() hint. First, let’s tell the optimizer that it’s going to over-estimate the cardinality by a factor of 10:


select  /*+
                qb_name(main)
                opt_estimate(@main group_by, scale_rows = 0.1)
        */
        mod(n1,10), count(*) 
from    t1 
group by 
        mod(n1,10)
;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    20 |    80 |    10  (10)| 00:00:01 |
|   1 |  HASH GROUP BY     |      |    20 |    80 |    10  (10)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |  3000 | 12000 |     9   (0)| 00:00:01 |
---------------------------------------------------------------------------

The hint uses group_by as the critical option parameter, and then I’ve used the standard scale_rows=nnn to set a scaling factor that should be used to adjust the result of the default calculation. At 10% (0.1) this gives us an estimate of 20 rows.

Alternatively, we could simply tell the optimizer how many rows we want it to believe will be generated for the aggregate – let’s just tell it that the result will be 10 rows.

select  /*+
                qb_name(main)
                opt_estimate(@main group_by, rows = 10)
        */
        mod(n1,10), count(*) 
from    t1 
group by 
        mod(n1,10)
;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    10 |    40 |    10  (10)| 00:00:01 |
|   1 |  HASH GROUP BY     |      |    10 |    40 |    10  (10)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |  3000 | 12000 |     9   (0)| 00:00:01 |
---------------------------------------------------------------------------

We use the same group_by as the critical parameter, with rows=nnn.

Next steps

After an aggregation there's often a "having" clause, so you might consider using the group_by option to fix up the cardinality of the having clause if you know what the normal effect of the having clause should be. For example: "having count(*) > NNN" will use the optimizer's standard 5% "guess" and "having count(*) = NNN" will use the standard 1% guess. However, having seen the group_by option I took a guess that there might be a having option to the opt_estimate() hint as well, so I tried it – with autotrace enabled here are three queries: first the unhinted baseline (which uses the standard 5% on my having clause), then a couple of others with hints to tweak the cardinality:

select  /*+
                qb_name(main)
        */
        mod(n1,10), count(*)
from    t1
group by
        mod(n1,10)
having
        count(*) > 100
;

select  /*+
                qb_name(main)
                opt_estimate(@main having scale_rows=0.4)
        */
        mod(n1,10), count(*)
from    t1
group by
        mod(n1,10)
having
        count(*) > 100
;

select  /*+
                qb_name(main)
                opt_estimate(@main group_by scale_rows=2)
                opt_estimate(@main having scale_rows=0.3)
        */
        mod(n1,10), count(*)
from    t1
group by
        mod(n1,10)
having
        count(*) > 100
;

The first query gives us the baseline cardinality of 10 (5% of 200). The second query scales the having cardinality down by a factor of 0.4 (which means an estimate of 4). The final query first doubles the group by cardinality (to 400), then scales the having cardinality (which would have become 20) down by a factor of 0.3 with the net effect of producing a cardinality of 6. Here are the plans.

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    10 |    40 |    10  (10)| 00:00:01 |
|*  1 |  FILTER             |      |       |       |            |          |   --  10
|   2 |   HASH GROUP BY     |      |    10 |    40 |    10  (10)| 00:00:01 |   -- 200
|   3 |    TABLE ACCESS FULL| T1   |  3000 | 12000 |     9   (0)| 00:00:01 |
----------------------------------------------------------------------------

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     4 |    16 |    10  (10)| 00:00:01 |
|*  1 |  FILTER             |      |       |       |            |          |    --   4
|   2 |   HASH GROUP BY     |      |     4 |    16 |    10  (10)| 00:00:01 |    -- 200
|   3 |    TABLE ACCESS FULL| T1   |  3000 | 12000 |     9   (0)| 00:00:01 |
----------------------------------------------------------------------------

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     6 |    24 |    10  (10)| 00:00:01 |
|*  1 |  FILTER             |      |       |       |            |          |    --   6
|   2 |   HASH GROUP BY     |      |     6 |    24 |    10  (10)| 00:00:01 |    -- 400
|   3 |    TABLE ACCESS FULL| T1   |  3000 | 12000 |     9   (0)| 00:00:01 |
----------------------------------------------------------------------------

It’s a little sad that the FILTER operation shows no estimate while the HASH GROUP BY operation shows the estimate after the application of the having clause. It would be nice to see the plan reporting the figures which I’ve added at the end of line for operations 1 and 2.

You may wonder why one would want to increase the estimate for the group by then reduce it for the having. While I’m not going to go to the trouble of creating a worked example it shouldn’t be too hard to appreciate the idea that the optimizer might use complex view merging to postpone a group by until after a join – so increasing the estimate for a group by might be necessary to ensure that that particular transformation doesn’t happen, while following this up with a reduction to the having might then ensure that the next join is a nested loop rather than a hash join. Of course, if you don’t need to be this subtle you might simply take advantage of yet another option to the opt_estimate() hint, the query_block option – but that will (probably) appear in the next article in this series.
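
In outline – and this is only a sketch of the idea, not a tested example – the combination might look like this, with the scaling factors picked purely for effect:

select  /*+
                qb_name(main)
                opt_estimate(@main group_by scale_rows=10)
                opt_estimate(@main having   scale_rows=0.01)
        */
        mod(n1,10), count(*)
from    t1
group by
        mod(n1,10)
having
        count(*) > 100
;

The group_by scaling makes the aggregate look expensive enough to discourage the transformation, and the having scaling then makes the surviving row count small enough to favour a nested loop into the next table.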

 

June 26, 2019

Glitches

Filed under: 12c,Bugs,Execution plans,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 5:11 pm BST Jun 26,2019

Here’s a question just in from Oracle-L that demonstrates the pain of assuming things work consistently when sometimes Oracle development hasn’t quite finished a bug fix or enhancement. Here’s the problem – which starts from the “scott.emp” table (which I’m not going to create in the code below):

rem
rem     Script:         fbi_fetch_first_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2019
rem 

-- create and populate EMP table from SCOTT demo schema

create index e_sort1 on emp (job, hiredate);
create index e_low_sort1 on emp (lower(job), hiredate);

set serveroutput off
alter session set statistics_level = all;
set linesize 156
set pagesize 60

select * from emp where job='CLERK'         order by hiredate fetch first 2 rows only; 
select * from table(dbms_xplan.display_cursor(null,null,'cost allstats last outline alias'));

select * from emp where lower(job)='clerk' order by hiredate fetch first 2 rows only; 
select * from table(dbms_xplan.display_cursor(null,null,'cost allstats last outline alias'));

Both queries use the 12c “fetch first” feature to select two rows from the table. We have an index on (job, hiredate) and a similar index on (lower(job), hiredate), and given the similarity of the queries and the respective indexes (get the first two rows by hiredate where job/lower(job) is ‘CLERK’/’clerk’) we might expect to see the same execution plan in both cases with the only change being the choice of index used. But here are the plans:


select * from emp where job='CLERK'         order by hiredate fetch
first 2 rows only

Plan hash value: 92281638

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |      1 |        |     2 (100)|      2 |00:00:00.01 |       4 |
|*  1 |  VIEW                         |         |      1 |      2 |     2   (0)|      2 |00:00:00.01 |       4 |
|*  2 |   WINDOW NOSORT STOPKEY       |         |      1 |      3 |     2   (0)|      2 |00:00:00.01 |       4 |
|   3 |    TABLE ACCESS BY INDEX ROWID| EMP     |      1 |      3 |     2   (0)|      3 |00:00:00.01 |       4 |
|*  4 |     INDEX RANGE SCAN          | E_SORT1 |      1 |      3 |     1   (0)|      3 |00:00:00.01 |       2 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "EMP"."HIREDATE")<=2)
   4 - access("JOB"='CLERK')


select * from emp where lower(job)='clerk' order by hiredate fetch
first 2 rows only

Plan hash value: 4254915479

-------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name        | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |             |      1 |        |     1 (100)|      2 |00:00:00.01 |       2 |       |       |          |
|*  1 |  VIEW                                 |             |      1 |      2 |     1   (0)|      2 |00:00:00.01 |       2 |       |       |          |
|*  2 |   WINDOW SORT PUSHED RANK             |             |      1 |      1 |     1   (0)|      2 |00:00:00.01 |       2 |  2048 |  2048 | 2048  (0)|
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| EMP         |      1 |      1 |     1   (0)|      4 |00:00:00.01 |       2 |       |       |          |
|*  4 |     INDEX RANGE SCAN                  | E_LOW_SORT1 |      1 |      1 |     1   (0)|      4 |00:00:00.01 |       1 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "EMP"."HIREDATE")<=2)
   4 - access("EMP"."SYS_NC00009$"='clerk')


As you can see, with the “normal” index Oracle is able to walk the index “knowing” that the data is appearing in order, and stopping as soon as possible (almost) – reporting the WINDOW operation as “WINDOW NOSORT STOPKEY”. On the other hand with the function-based index Oracle retrieves all the data by index, sorts it, then applies the ranking requirement – reporting the WINDOW operation as “WINDOW SORT PUSHED RANK”.

Clearly it’s not going to make a lot of difference to performance in this tiny case, but there is a threat that the whole data set for ‘clerk’ will be accessed – and that’s the first performance threat, with the additional threat that the optimizer might decide that a full tablescan would be more efficient than the index range scan.

Can we fix it ?

Yes, Bob, we can. The problem harks back to a limitation that probably got fixed some time between 10g and 11g – here are two simpler queries against the emp table and the two new indexes, each with the resulting execution plan when run under Oracle 10.2.0.5:


select ename from emp where       job  = 'CLERK' order by hiredate;
select ename from emp where lower(job) = 'clerk' order by hiredate;

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     3 |    66 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP     |     3 |    66 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | E_SORT1 |     3 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("JOB"='CLERK')


--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |     3 |    66 |     3  (34)| 00:00:01 |
|   1 |  SORT ORDER BY               |             |     3 |    66 |     3  (34)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| EMP         |     3 |    66 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | E_LOW_SORT1 |     3 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access(LOWER("JOB")='clerk')

The redundant SORT ORDER BY is present in 10g even for a simple index range scan. By 11.2.0.4 the optimizer was able to get rid of the redundant step, but clearly there’s a little gap in the code relating to the over() clause that hasn’t acquired the correction – even in 18.3.0.0 (or 19.2 according to a test on https://livesql.oracle.com).

To fix the 10g problem you just had to include the first column of the index in the order by clause: the result doesn't change, of course, because you're simply prefixing the required columns with a column that holds the single value you were probing the index for – but suddenly the optimizer realises that it can do a NOSORT operation. So the "obvious" guess was to do the same for this "fetch first" example:

select * from emp where lower(job)='clerk' order by lower(job), hiredate fetch first 2 rows only;

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |      1 |        |     3 (100)|      2 |00:00:00.01 |       4 |
|*  1 |  VIEW                         |             |      1 |      2 |     3  (34)|      2 |00:00:00.01 |       4 |
|*  2 |   WINDOW NOSORT STOPKEY       |             |      1 |      1 |     3  (34)|      2 |00:00:00.01 |       4 |
|   3 |    TABLE ACCESS BY INDEX ROWID| EMP         |      1 |      1 |     2   (0)|      3 |00:00:00.01 |       4 |
|*  4 |     INDEX RANGE SCAN          | E_LOW_SORT1 |      1 |      1 |     1   (0)|      3 |00:00:00.01 |       2 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "EMP"."SYS_NC00009$","EMP"."HIREDATE")<=2)
   4 - access("EMP"."SYS_NC00009$"='clerk')

It’s just one of those silly little details where you can waste a HUGE amount of time (in a complex case) because it never crossed your mind that something that clearly ought to work might need testing for a specific use case – and I’ve lost count of the number of times I’ve been caught out by this type of “not quite finished” anomaly.

Footnote

If you follow the URL to the Oracle-L thread you’ll see that Tanel Poder has supplied a couple of MoS Document Ids discussing the issue and warning of other bugs with virtual column / FBI translation, and has shown an alternative workaround that takes advantage of a hidden parameter.

 

June 25, 2019

opt_estimate 2

Filed under: CBO,Execution plans,Hints,Oracle,Statistics — Jonathan Lewis @ 8:22 pm BST Jun 25,2019

This is a note that was supposed to be a follow-up to an initial example of using the opt_estimate() hint to manipulate the optimizer’s statistical understanding of how much data it would access and (implicitly) how much difference that would make to the resource usage. Instead, two years later, here’s part two – on using opt_estimate() with nested loop joins. As usual I’ll start with a little data set:


rem
rem     Script:         opt_est_nlj.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2017
rem

create table t1
as
select 
        trunc((rownum-1)/15)    n1,
        trunc((rownum-1)/15)    n2,
        rpad(rownum,180)        v1
from    dual
connect by
        level <= 3000 --> hint to avoid wordpress format issue
;

create table t2
pctfree 75
as
select 
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from    dual
connect by
        level <= 3000 --> hint to avoid wordpress format issue
;

create index t1_i1 on t1(n1);
create index t2_i1 on t2(n1);

There are 3,000 rows in each table, with 200 distinct values for each of columns n1 and n2. There is an important difference between the tables, though, as the rows for a given value are well clustered in t1 and widely scattered in t2. I’m going to execute a join query between the two tables, ultimately forcing a very bad access path so that I can show some opt_estimate() hints making a difference to cost and cardinality calculations. Here’s my starting query, with execution plan, unhinted (apart from the query block name hint):

select
        /*+ qb_name(main) */
        t1.v1, t2.v1
from    t1, t2
where
        t1.n1 = 15
and     t2.n1 = t1.n2
;

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |   225 | 83700 |    44   (3)| 00:00:01 |
|*  1 |  HASH JOIN                           |       |   225 | 83700 |    44   (3)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    15 |  2805 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS FULL                  | T2    |  3000 |   541K|    42   (3)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N1"="T1"."N2")
   3 - access("T1"."N1"=15)

You’ll notice the tablescan and hash join with t2 as the probe (2nd) table and a total cost of 44, which largely due to the tablescan cost of t2 (which I had deliberately defined with pctfree 75 to make the tablescan a little expensive). Let’s hint the query to do a nested loop from t1 to t2 to see why the hash join is preferred over the nested loop:


alter session set "_nlj_batching_enabled"=0;

select
        /*+
                qb_name(main)
                leading(t1 t2)
                use_nl(t2)
                index(t2)
                no_nlj_prefetch(t2)
        */
        t1.v1, t2.v1
from    t1, t2
where
        t1.n1 = 15
and     t2.n1 = t1.n2
;

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |   225 | 83700 |   242   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                        |       |   225 | 83700 |   242   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    15 |  2805 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |    15 |  2775 |    16   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN                  | T2_I1 |    15 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N1"=15)
   5 - access("T2"."N1"="T1"."N2")

I’ve done two slightly odd things here – I’ve set a hidden parameter to disable nlj batching and I’ve used a hint to block nlj prefetching. This doesn’t change the arithmetic the optimizer uses, but it does mean the presentation of the nested loop goes back to the original pre-9i form which makes it a little easier to see costs and cardinalities adding and multiplying their way through the plan. I do not do this in production systems.

As you can see, the total cost is 242 with this plan and most of the cost is due to the indexed access into t2. The optimizer has correctly estimated that each probe of t2 will acquire 15 rows and that those 15 rows will be scattered across 15 blocks, so the join cardinality comes to 15 * 15 = 225 and the cost comes to: 2 (t1 cost) + (15 (t1 rows) * 16 (t2 unit cost)) = 242.

So let’s tell the optimizer that its estimated cardinality for the index range scan is wrong.


select
        /*+
                qb_name(main)
                leading(t1 t2)
                use_nl(t2)
                index(t2)
                no_nlj_prefetch(t2)
                opt_estimate(@main nlj_index_scan, t2@main (t1), t2_i1, scale_rows=0.06)
        */
        t1.v1, t2.v1
from    t1, t2
where
        t1.n1 = 15
and     t2.n1 = t1.n2
;

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |   225 | 83700 |    32   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                        |       |   225 | 83700 |    32   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    15 |  2805 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |    15 |  2775 |     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN                  | T2_I1 |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N1"=15)
   5 - access("T2"."N1"="T1"."N2")

I’ve used the hint opt_estimate(@main nlj_index_scan, t2@main (t1), t2_i1, scale_rows=0.06).

The form is: (@qb_name   nlj_index_scan,   target_table_alias   (list of possible driving tables),   target_index,   numeric_adjustment).

The numeric_adjustment could be rows=nnn or, as I have here, scale_rows=nnn; the target_index has to be specified by name rather than by a list of columns, and the list of possible driving tables should be a comma-separated list of fully-qualified table aliases. There's a similar nlj_index_filter option which I can't demonstrate in this post because it probably needs an index of at least two columns before it can be used.
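
If I had to guess – and it is only an untested guess – the nlj_index_filter variant takes the same shape, adjusting the estimate for the filter predicate applied to the index rather than the access predicate, along the lines of:

opt_estimate(@main nlj_index_filter, t2@main (t1), t2_i1, scale_rows=0.5)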

The things to note in this plan are: the index range scan at operation 5 now has a cardinality (Rows) estimate of 1 (that's 0.06 * the original 15). This hasn't changed the cost of the range scan (because that cost was already one before we applied the opt_estimate() hint) but, because the cost of the table access is dependent on the index selectivity, the cost of the table access is down to 2 (from 16). On the other hand the table cardinality hasn't dropped, so now it's not consistent with the number of rowids predicted by the index range scan. The total cost of the query has dropped to 32, though, which is: 2 (t1 cost) + (15 (t1 rows) * 2 (t2 unit cost)).

Let’s try to adjust the prediction that the optimizer makes about the number of rows we fetch from the table. Rather than going all the way to being consistent with the index range scan I’ll dictate a scaling factor that will make it easy to see the effect – let’s tell the optimizer that we will get one-fifth of the originally expected rows (i.e. 3).


select
        /*+
                qb_name(main)
                leading(t1 t2)
                use_nl(t2)
                index(t2)
                no_nlj_prefetch(t2)
                opt_estimate(@main nlj_index_scan, t2@main (t1), t2_i1, scale_rows=0.06)
                opt_estimate(@main table         , t2@main     ,        scale_rows=0.20)
        */
        t1.v1, t2.v1
from    t1, t2
where
        t1.n1 = 15
and     t2.n1 = t1.n2
;

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |    47 | 17484 |    32   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                        |       |    47 | 17484 |    32   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    15 |  2805 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |     3 |   555 |     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN                  | T2_I1 |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N1"=15)
   5 - access("T2"."N1"="T1"."N2")

By adding the hint opt_estimate(@main table, t2@main, scale_rows=0.20) we've told the optimizer that it should scale the estimated row count down by a factor of 5 from whatever it calculates. Bear in mind that in a more complex query the optimizer might decide not to follow the path we expected – but that factor of 0.2 will still be applied whenever t2 is accessed. Notice in this plan that the join cardinality at operation 1 has also dropped from 225 to 47 – if the optimizer is told that its cardinality (or selectivity) calculation is wrong for the table the numbers involved in the selectivity will carry on through the plan, producing a different "adjusted NDV" for the join cardinality calculation.

Notice, though, that the total cost of the query has not changed. The cost was dictated by the optimizer’s estimate of the number of table blocks to be visited after the index range scan. The estimated number of table blocks hasn’t changed, it’s just the number of rows we will find there that we’re now hacking.

Just for completion, let’s make one final change (again, something that might be necessary in a more complex query), let’s fix the join cardinality:


select
        /*+
                qb_name(main)
                leading(t1 t2)
                use_nl(t2)
                index(t2)
                no_nlj_prefetch(t2)
                opt_estimate(@main nlj_index_scan, t2@main (t1), t2_i1, scale_rows=0.06)
                opt_estimate(@main table         , t2@main     ,        scale_rows=0.20)
                opt_estimate(@main join(t2 t1)   ,                      scale_rows=0.5)
        */
        t1.v1, t2.v1
from    t1, t2
where
        t1.n1 = 15
and     t2.n1 = t1.n2
;

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |    23 |  8556 |    32   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                        |       |    23 |  8556 |    32   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    15 |  2805 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |     2 |   370 |     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN                  | T2_I1 |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N1"=15)
   5 - access("T2"."N1"="T1"."N2")

I’ve used the hint opt_estimate(@main join(t2 t1), scale_rows=0.5) to tell the optimizer to halve its estimate of the join cardinality between t1 and t2 (whatever order they appear in). With the previous hints in place the estimate had dropped to 47 (which must have been 46 and a large bit), with this final hint it has now dropped to 23. Interestingly the cardinality estimate for the table access to t2 has dropped at the same time (almost as if the optimizer has “rationalised” the join cardinality by adjusting the selectivity of the second table in the join – that’s something I may play around with in the future, but it may require reading a 10053 trace, which I tend to avoid doing).

Side note: If you have access to MoS you'll find that Doc ID: 2402821.1 "How To Use Optimizer Hints To Specify Cardinality For Join Operation" seems to suggest that the cardinality() hint is something to use for single table cardinalities, and implies that the opt_estimate(join) option is for two-table joins. In fact both hints can be used to set the cardinality of multi-table joins (where "multi" can be greater than 2).
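
To make that concrete, here's a sketch (not a tested example) of the two forms dictating the cardinality of a three-table join – I'm assuming a query block named main in which all three simple aliases appear:

cardinality(t1 t2 t3, 100)
opt_estimate(@main join(t1 t2 t3), rows=100)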

Finally, then, let’s eliminate the hints that force the join order and join method and see what happens to our query plan if all we include is the opt_estimate() hints (and the qb_name() and no_nlj_prefetch hints and remember we’vs disabled “nlj batching“).

select
        /*+
                qb_name(main)
                no_nlj_prefetch(t2)
                opt_estimate(@main nlj_index_scan, t2@main (t1), t2_i1, scale_rows=0.06)
                opt_estimate(@main table         , t2@main     ,        scale_rows=0.20)
                opt_estimate(@main join(t2 t1)   ,                      scale_rows=0.5)
        */
        t1.v1, t2.v1
from    t1, t2
where
        t1.n1 = 15
and     t2.n1 = t1.n2
;

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |    23 |  8556 |    32   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                        |       |    23 |  8556 |    32   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    15 |  2805 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 |    15 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |     2 |   370 |     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN                  | T2_I1 |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N1"=15)
   5 - access("T2"."N1"="T1"."N2")

Note
-----
   - this is an adaptive plan

With a little engineering on the optimizer estimates we've managed to con Oracle into using a different path from the default choice. Do notice, though, the closing Note section (which didn't appear in all the other examples): I've left Oracle with the option of checking the actual stats as the query runs, so if I run the query twice Oracle might spot that the arithmetic is all wrong and throw in some SQL Plan Directives – which are just another load of opt_estimate() hints.
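
If you want to see whether any directives have appeared you can query the directive views – a quick sketch (directives are written to disk asynchronously, so you may need to flush them first):

execute dbms_spd.flush_sql_plan_directive

select
        d.directive_id, d.type, d.state, d.reason,
        o.object_name, o.object_type
from
        dba_sql_plan_directives         d,
        dba_sql_plan_dir_objects        o
where
        o.directive_id = d.directive_id
and     o.owner        = user
and     o.object_name in ('T1','T2')
order by
        d.directive_id
;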

In fact, in this example, the plan we wanted became desirable as soon as we applied the nlj_index_scan fix-up, as this made the estimated cost of the index probe into t2 sufficiently low (even though it left an inconsistent cardinality figure for the table rows) that Oracle would have switched from the default hash join to the nested loop on that basis alone.

Closing Comment

As I pointed out in the previous article, this is just scratching the surface of how the opt_estimate() hint works, and even with very simple queries it can be hard to tell whether any behaviour we’ve seen is actually doing what we think it’s doing. In a third article I’ll be looking at something prompted by the most recent email I’ve had about opt_estimate() – how it might (or might not) behave in the presence of inline views and transformations like merging or pushing predicates. I’ll try not to take 2 years to publish it.

 

June 17, 2019

Can’t Unnest

Filed under: Conditional SQL,Execution plans,Oracle,subqueries — Jonathan Lewis @ 3:35 pm BST Jun 17,2019

In an echo of a very old “conditional SQL” posting, a recent posting on the ODC general database discussion forum ran into a few classic errors of trouble-shooting. By a lucky coincidence this allowed me to rediscover and publish an old example of parallel execution gone wild before moving on to talk about the fundamental problem exhibited in the latest query.

The ODC thread started with a question along the lines of “why isn’t Oracle using the index I hinted”, with the minor variation that it said “When I hint my SQL with an index hint it runs quickly so I’ve created a profile that applies the hint, but the hint doesn’t get used in production.”

The query was a bit messy and, as is often the case with ODC, the formatting wasn’t particularly readable, so I’ve extracted the where clause from the SQL that was used to generate the profile and reformatted it below. See if you can spot the clue that tells you why there might be a big problem using this SQL to generate a profile for the production environment:


WHERE   
        MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE' 
AND     MSG.MSG_CAPTR_STG_CD = 'PRE_BCS' 
AND     MSG.SRCH_4_FLD_VAL = '123456'   
AND     (
            (    'INVOICENUMBER' = 'INVOICENUMBER' 
             AND MSG.MSG_ID IN (
                        SELECT  *   
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST('123456') AS TABLE_OF_VARCHAR)))
            ) 
         OR (    'INVOICENUMBER' = 'SIEBELORDERID' 
             AND MSG.SRCH_3_FLD_VAL IN (
                        SELECT  *   
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST('') AS TABLE_OF_VARCHAR)))
            )
        ) 
AND     MSG.MSG_ID = TRK.INV_NUM(+) 
AND     (   TRK.RESEND_DT IS NULL 
         OR TRK.RESEND_DT = (
                        SELECT  MAX(TRK1.RESEND_DT)   
                        FROM    FNM.BCS_INV_RESEND_TRK TRK1   
                        WHERE   TRK1.INV_NUM = TRK.INV_NUM
                )
        )

If the SQL by itself doesn’t give you an important clue, compare it with the Predicate Information from the “good” execution plan that it produced:


Predicate Information (identified by operation id):  
---------------------------------------------------  
   2 - filter(("TRK"."RESEND_DT" IS NULL OR "TRK"."RESEND_DT"=))  
   8 - filter(("MSG"."SRCH_4_FLD_VAL"='123456' AND "MSG"."MSG_CAPTR_STG_CD"='PRE_BCS'))  
   9 - access("MSG"."MSG_ID"="COLUMN_VALUE" AND "MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
       filter("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
  10 - access("MSG"."MSG_ID"="TRK"."INV_NUM")  
  13 - access("TRK1"."INV_NUM"=:B1)  

Have you spotted the thing that isn’t there in the predicate information ?

What happened to the ‘INVOICENUMBER’ = ‘INVOICENUMBER’ predicate and the ‘INVOICENUMBER’ = ‘SIEBELORDERID’ predicate? They’ve disappeared because the optimizer knows that the first predicate is always true and doesn’t need to be tested at run-time, while the second one is always false and doesn’t need to be tested at run-time. Moreover each predicate is part of a conjunct (AND) – so in the second case the entire two-part predicate can be eliminated – and the original where clause can immediately be reduced to:


WHERE   
        MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE' 
AND     MSG.MSG_CAPTR_STG_CD = 'PRE_BCS' 
AND     MSG.SRCH_4_FLD_VAL = '123456'   
AND     (
                 MSG.MSG_ID IN (
                        SELECT  *   
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST('123456') AS TABLE_OF_VARCHAR)))
        ) 
AND     MSG.MSG_ID = TRK.INV_NUM(+) 
AND     (   TRK.RESEND_DT IS NULL 
         OR TRK.RESEND_DT = (
                        SELECT  MAX(TRK1.RESEND_DT)   
                        FROM    FNM.BCS_INV_RESEND_TRK TRK1   
                        WHERE   TRK1.INV_NUM = TRK.INV_NUM
                )
        )
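
As an aside, you can see the “always false” behaviour in isolation with a one-line test – in my experience the plan shows a FILTER operation whose predicate is NULL IS NOT NULL, and the table is never visited:

explain plan for select * from dual where 1 = 2;
select * from table(dbms_xplan.display);
-- an always-true predicate (e.g. where 1 = 1) simply disappears from the plan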

Looking at this reduced predicate you may note that the IN subquery referencing the fnm_gn_in_string_list() collection could now be unnested and used to drive the final execution plan, and the optimizer will even recognize that it’s a rowsource with at most one row. So here’s the “good” execution plan:


---------------------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                               | Name                  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |  
---------------------------------------------------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                        |                       |      1 |        |      2 |00:00:00.08 |      12 |      7 |       |       |          |  
|   1 |  SORT ORDER BY                          |                       |      1 |      1 |      2 |00:00:00.08 |      12 |      7 |  2048 |  2048 | 2048  (0)|  
|*  2 |   FILTER                                |                       |      1 |        |      2 |00:00:00.08 |      12 |      7 |       |       |          |  
|   3 |    NESTED LOOPS OUTER                   |                       |      1 |      1 |      2 |00:00:00.08 |      10 |      7 |       |       |          |  
|   4 |     NESTED LOOPS                        |                       |      1 |      1 |      2 |00:00:00.06 |       6 |      5 |       |       |          |  
|   5 |      VIEW                               | VW_NSO_1              |      1 |      1 |      1 |00:00:00.01 |       0 |      0 |       |       |          |  
|   6 |       HASH UNIQUE                       |                       |      1 |      1 |      1 |00:00:00.01 |       0 |      0 |  1697K|  1697K|  487K (0)|  
|   7 |        COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST |      1 |      1 |      1 |00:00:00.01 |       0 |      0 |       |       |          |  
|*  8 |      TABLE ACCESS BY INDEX ROWID        | FNM_VSBL_MSG          |      1 |      1 |      2 |00:00:00.06 |       6 |      5 |       |       |          |  
|*  9 |       INDEX RANGE SCAN                  | XIE2FNM_VSBL_MSG      |      1 |      4 |      4 |00:00:00.04 |       4 |      3 |       |       |          |  
|* 10 |     INDEX RANGE SCAN                    | XPKBCS_INV_RESEND_TRK |      2 |      1 |      2 |00:00:00.01 |       4 |      2 |       |       |          |  
|  11 |    SORT AGGREGATE                       |                       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |  
|  12 |     FIRST ROW                           |                       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |  
|* 13 |      INDEX RANGE SCAN (MIN/MAX)         | XPKBCS_INV_RESEND_TRK |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |  
---------------------------------------------------------------------------------------------------------------------------------------------------------------  

The plan looks great – Oracle predicts a single row driver (operation 5) which can use a very good index (XIE2FNM_VSBL_MSG) in a nested loop, followed by a second nested loop, followed by a filter subquery and a sort of a tiny amount of data. Predictions match actuals all the way down the plan, and the workload is tiny. So what goes wrong in production?

You’ve probably guessed the flaw in this test. Why would anyone include a predicate like ‘INVOICENUMBER’ = ‘INVOICENUMBER’ in production code – or, even worse, ‘INVOICENUMBER’ = ‘SIEBELORDERID’? The OP had taken a query using bind variables, picked up the actual values that were peeked when the query was executed, and substituted them into the test as literals. This allowed the optimizer to discard two simple predicates and one subquery, when the production query would need a plan that catered for the possibility that the second subquery would be the one that had to be executed and the first one bypassed. Here’s the corrected where clause using SQL*Plus variables (not the substitution type, the proper bind type) for the original bind variables:


WHERE
        MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE'
AND     MSG.MSG_CAPTR_STG_CD = 'PRE_BCS'
AND     MSG.SRCH_4_FLD_VAL = :BindInvoiceTo
AND     (
            (    :BindSearchBy = 'INVOICENUMBER' 
             AND MSG.MSG_ID IN (
                        SELECT  *
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST(:BindInvoiceList) AS TABLE_OF_VARCHAR)))
            )
         OR (    :BindSearchBy = 'SIEBELORDERID' 
             AND MSG.SRCH_3_FLD_VAL IN (
                        SELECT  *
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST(:BindSeibelIDList) AS TABLE_OF_VARCHAR)))
            )
        )
AND     MSG.MSG_ID = TRK.INV_NUM(+)
AND     (   TRK.RESEND_DT IS NULL
         OR TRK.RESEND_DT = (
                        SELECT  MAX(TRK1.RESEND_DT)
                        FROM    FNM.BCS_INV_RESEND_TRK TRK1
                        WHERE   TRK1.INV_NUM = TRK.INV_NUM
                )
        )

And this, with the “once good” hint in place to force the use of the XIE2FNM_VSBL_MSG index, is the resulting execution plan:


---------------------------------------------------------------------------------------------------------  
| Id  | Operation                           | Name                  | E-Rows |  OMem |  1Mem | Used-Mem |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                    |                       |        |       |       |          |  
|   1 |  SORT ORDER BY                      |                       |      1 | 73728 | 73728 |          |  
|*  2 |   FILTER                            |                       |        |       |       |          |  
|   3 |    NESTED LOOPS OUTER               |                       |      1 |       |       |          |  
|*  4 |     TABLE ACCESS BY INDEX ROWID     | FNM_VSBL_MSG          |      1 |       |       |          |  
|*  5 |      INDEX FULL SCAN                | XIE2FNM_VSBL_MSG      |   4975K|       |       |          |  
|*  6 |     INDEX RANGE SCAN                | XPKBCS_INV_RESEND_TRK |      1 |       |       |          |  
|*  7 |    COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST |      1 |       |       |          |  
|*  8 |    COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST |      1 |       |       |          |  
|   9 |    SORT AGGREGATE                   |                       |      1 |       |       |          |  
|  10 |     FIRST ROW                       |                       |      1 |       |       |          |  
|* 11 |      INDEX RANGE SCAN (MIN/MAX)     | XPKBCS_INV_RESEND_TRK |      1 |       |       |          |  
---------------------------------------------------------------------------------------------------------  
 
Predicate Information (identified by operation id):  
---------------------------------------------------  
   2 - filter((((:BINDSEARCHBY='INVOICENUMBER' AND  IS NOT NULL) OR  
              (:BINDSEARCHBY='SIEBELORDERID' AND  IS NOT NULL)) AND ("TRK"."RESEND_DT" IS NULL OR  
              "TRK"."RESEND_DT"=)))  
   4 - filter(("MSG"."SRCH_4_FLD_VAL"=:BINDINVOICETO AND "MSG"."MSG_CAPTR_STG_CD"='PRE_BCS'))  
   5 - access("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
       filter("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
   6 - access("MSG"."MSG_ID"="TRK"."INV_NUM")  
   7 - filter(VALUE(KOKBF$)=:B1)  
   8 - filter(VALUE(KOKBF$)=:B1)  
  11 - access("TRK1"."INV_NUM"=:B1)  

The “unnested driving subquery” approach can no longer be used – we now start with the fnm_vsbl_msg table (accessing it by a most inefficient execution path because that’s what the hint demands, and the hint can now be obeyed), and for each row check which of the two subqueries we need to execute. There is, in fact, no way we can hint this query to operate efficiently [at least, that’s my opinion; I may be wrong].

The story so far

If you’re going to try to use SQL*Plus (or similar) to test a production query with bind variables you can’t just use a sample of literal values in place of the bind variables (though you may get lucky sometimes, of course); you should set up some SQL*Plus bind variables and assign values to them.
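
As a sketch (the variable names are mine, and dual stands in for the real table): the define command makes SQL*Plus substitute a literal into the statement text before Oracle ever sees it – exactly the trap described above – while the variable command creates a genuine bind variable for the optimizer to deal with:

define m_search_by = 'INVOICENUMBER'
select * from dual where '&m_search_by' = 'INVOICENUMBER';      -- Oracle sees a literal

variable SearchBy varchar2(20)
exec :SearchBy := 'INVOICENUMBER'
select * from dual where :SearchBy = 'INVOICENUMBER';           -- Oracle sees a bind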

Though I haven’t said it previously in this article, this is an example where a decision that really should have been made by the front-end code has been embedded in the SQL and passed to the database as SQL which cannot be run efficiently. The front-end code should have recognised the choice between invoice numbers and Siebel order ids and sent the appropriate query to the database.

Next Steps

Without making a significant change to the front-end mechanism, is it possible to change the SQL into something the optimizer can handle efficiently? Sometimes the answer is yes; so I’ve created a simpler model to demonstrate the basic problem and supply a solution for cases like this one. The key issue is finding a way of working around the OR clauses that are trying to allow the optimizer to choose between two subqueries but make it impossible for either to be unnested into a small driving data set.

First, some tables:


rem
rem     Script:         or_in_twice.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2019
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,371)                 n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

alter table t1 add constraint t1_pk primary key(id);

create table t2
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,372)                 n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

create table t3
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,373)                 n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;


Now a query – first setting up a variable in SQL*Plus to allow us to emulate a production query with bind variables. Since I’m only going to use Explain Plan the variable won’t be peekable, so there would still be some scope for this plan not matching a production plan, but it’s adequate to demonstrate the structural problem:


variable v1 varchar2(10)
exec :v1 := 'INVOICE'

explain plan for
select
        t1.v1 
from
        t1
where
        (
            :v1 = 'INVOICE' 
        and t1.id in (select id from t2 where n1 = 0)
        )
or      (
            :v1 = 'ORDERID' 
        and t1.id in (select id from t3 where n1 = 0)
        )
;

select * from table(dbms_xplan.display);

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    10 |   150 |    26   (4)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   146K|    26   (4)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |     8 |    26   (4)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL| T3   |     1 |     8 |    26   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:V1='INVOICE' AND  EXISTS (SELECT 0 FROM "T2" "T2" WHERE
              "ID"=:B1 AND "N1"=0) OR :V1='ORDERID' AND  EXISTS (SELECT 0 FROM "T3"
              "T3" WHERE "ID"=:B2 AND "N1"=0))
   3 - filter("ID"=:B1 AND "N1"=0)
   4 - filter("ID"=:B1 AND "N1"=0)

As you can see, thanks to the OR that effectively gives Oracle the choice between running the subquery against t3 or the one against t2, Oracle is unable to do any unnesting. (In fact different versions of Oracle allow different levels of sophistication with disjuncts (OR) of subqueries, so this is the kind of example that’s always useful to keep for tests against future versions.)

Since we know that we are going to use one of the data sets supplied in one of the subqueries and have no risk of double-counting or eliminating required duplicates, one strategy we could adopt for this query is to rewrite the two subqueries as a single subquery with a union all – because we know the optimizer can usually handle a single IN subquery very nicely. So let’s try the following:


explain plan for
select
        t1.v1
from
        t1
where
        t1.id in (
                select  id 
                from    t2 
                where   n1 = 0
                and     :v1 = 'INVOICE'
                union all
                select  id 
                from    t3 
                where   n1 = 0
                and     :v1 = 'ORDERID'
        )
;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |    54 |  1512 |    77   (3)| 00:00:01 |
|*  1 |  HASH JOIN             |          |    54 |  1512 |    77   (3)| 00:00:01 |
|   2 |   VIEW                 | VW_NSO_1 |    54 |   702 |    51   (2)| 00:00:01 |
|   3 |    HASH UNIQUE         |          |    54 |   432 |    51   (2)| 00:00:01 |
|   4 |     UNION-ALL          |          |       |       |            |          |
|*  5 |      FILTER            |          |       |       |            |          |
|*  6 |       TABLE ACCESS FULL| T2       |    27 |   216 |    26   (4)| 00:00:01 |
|*  7 |      FILTER            |          |       |       |            |          |
|*  8 |       TABLE ACCESS FULL| T3       |    27 |   216 |    26   (4)| 00:00:01 |
|   9 |   TABLE ACCESS FULL    | T1       | 10000 |   146K|    26   (4)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."ID"="ID")
   5 - filter(:V1='INVOICE')
   6 - filter("N1"=0)
   7 - filter(:V1='ORDERID')
   8 - filter("N1"=0)


Thanks to the FILTERs at operations 5 and 7 this plan will pick the data from just one of the two subqueries, reduce it to a unique list and then use that as the build table to a hash join. Of course, with different data (or suitable hints) that hash join could become a nested loop using a high precision index.
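
For example, here’s a sketch of the hints that might push this plan to a nested loop – it assumes the optimizer keeps the unnested view as the driving rowsource and uses the t1_pk index created in the model, and there’s no guarantee the optimizer will cooperate:

explain plan for
select
        /*+ use_nl(t1) index(t1 t1_pk) */
        t1.v1
from
        t1
where
        t1.id in (
                select  id
                from    t2
                where   n1 = 0
                and     :v1 = 'INVOICE'
                union all
                select  id
                from    t3
                where   n1 = 0
                and     :v1 = 'ORDERID'
        )
;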

But there’s an alternative. We manually rewrote the two subqueries as a single union all subquery and as we did so we moved the bind variable comparisons inside their respective subqueries; maybe we don’t need to introduce the union all. What would happen if we simply take the original query and move the “constant” predicates inside their subqueries?


explain plan for
select
        t1.v1
from
        t1
where
        t1.id in (select id from t2 where n1 = 0 and :v1 = 'INVOICE')
or      t1.id in (select id from t3 where n1 = 0 and :v1 = 'ORDERID')
;

select * from table(dbms_xplan.display);


-----------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |    54 |  1512 |    77   (3)| 00:00:01 |
|*  1 |  HASH JOIN             |          |    54 |  1512 |    77   (3)| 00:00:01 |
|   2 |   VIEW                 | VW_NSO_1 |    54 |   702 |    51   (2)| 00:00:01 |
|   3 |    HASH UNIQUE         |          |    54 |   432 |    51   (2)| 00:00:01 |
|   4 |     UNION-ALL          |          |       |       |            |          |
|*  5 |      FILTER            |          |       |       |            |          |
|*  6 |       TABLE ACCESS FULL| T3       |    27 |   216 |    26   (4)| 00:00:01 |
|*  7 |      FILTER            |          |       |       |            |          |
|*  8 |       TABLE ACCESS FULL| T2       |    27 |   216 |    26   (4)| 00:00:01 |
|   9 |   TABLE ACCESS FULL    | T1       | 10000 |   146K|    26   (4)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."ID"="ID")
   5 - filter(:V1='ORDERID')
   6 - filter("N1"=0)
   7 - filter(:V1='INVOICE')
   8 - filter("N1"=0)

In 12.2.0.1 and 18.3.0.0 we get the same plan as we did with our “single subquery” rewrite – the optimizer is able to construct the union all single subquery (although the ordering of the subqueries has been reversed) and unnest it without any other manual intervention. (You may find that earlier versions of Oracle don’t manage to do this – though you might have to go all the way back to 10g to find one that doesn’t.)
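
If you want to check older behaviour without installing an older version, a quick (and imperfect) approximation is to wind the optimizer back with optimizer_features_enable – it changes the optimizer’s rules but not the run-time code, so treat any results as indicative only:

alter session set optimizer_features_enable = '10.2.0.5';
-- re-run the explain plan here, then restore the default, e.g.:
alter session set optimizer_features_enable = '18.1.0';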

Conclusion

Oracle doesn’t like disjuncts (OR) and finds conjuncts (AND) much easier to cope with. Mixing OR and subqueries is a good way to create inefficient execution plans, especially when you try to force the optimizer to handle a decision that should have been taken in the front-end code. The optimizer, however, gets increasingly skilled at handling the mixture as you move through the newer versions; but you may have to find ways to give it a little help if you see it running subqueries as filter subqueries when you’re expecting it to unnest a subquery to produce a small driving data set.

 

June 13, 2019

Parallel Fun – 2

Filed under: Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 6:24 pm BST Jun 13,2019

I started writing this note in March 2015 with the following introductory comment:

A little while ago I wrote a few notes about a very resource-intensive parallel query. One of the points I made about it was that it was easy to model, and then interesting to run on later versions of Oracle. So today I’m going to treat you to a few of the observations and notes I made after modelling the problem; and here’s the SQL to create the underlying objects:

Unfortunately I failed to do anything more with the model I had created until a few days ago (June 2019 – in case I stall again) when a related question came up on the ODC database forum. This time I’m ready to go a little further – so I’ll start with a bait-and-switch approach. Here are the first few lines (omitting the SQL) of an SQL Monitor report from an instance of 18.3 – is this a power-crazed machine or what ?


Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  TEST_USER (169:11324)
 SQL ID              :  73y5quma4jnw4
 SQL Execution ID    :  16777216
 Execution Started   :  06/13/2019 22:06:32
 First Refresh Time  :  06/13/2019 22:06:32
 Last Refresh Time   :  06/13/2019 22:07:03
 Duration            :  31s
 Module/Action       :  MyModule/MyAction
 Service             :  SYS$USERS
 Program             :  sqlplus@linux183.localdomain (TNS V1-V3)
 Fetch Calls         :  591

Global Stats
=========================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs | Bytes |
=========================================================================================
|      14 |    3.18 |     0.00 |        0.05 |       11 |   591 |  25978 |   62 |  13MB |
=========================================================================================

Parallel Execution Details (DOP=3 , Servers Allocated=6730)
==========================================================================================

It didn’t take long to run the query – only about 31 seconds. But the thing to notice in the report is that while the DOP is reported as 3, the number of “Servers Allocated” is a massive 6,730. So the big question – before I show you more of the report, explain what’s happening, and supply the code to build the model – is: how many PX processes did I actually start?

Here’s a little more of the output:


Parallel Execution Details (DOP=3 , Servers Allocated=6730)
==========================================================================================================================================================
|      Name      | Type  | Group# | Server# | Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Buffer | Read | Read  |        Wait Events         |
|                |       |        |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes |         (sample #)         |
==========================================================================================================================================================
| PX Coordinator | QC    |        |         |      14 |    3.13 |          |        0.05 |       11 |  23727 |      |     . | PX Deq: Join ACK (5)       |
|                |       |        |         |         |         |          |             |          |        |      |       | PX Deq: Signal ACK EXT (2) |
|                |       |        |         |         |         |          |             |          |        |      |       | sql_id: 6405a2hc50bt4 (1)  |
| p004           | Set 1 |      1 |       1 |    0.00 |    0.00 |          |             |          |    180 |      |     . | library cache: mutex X (1) |
|                |       |        |         |         |         |          |             |          |        |      |       |                            |
| p005           | Set 1 |      1 |       2 |    0.00 |    0.00 |          |             |          |    100 |      |     . |                            |
| p006           | Set 1 |      1 |       3 |    0.00 |    0.00 |          |             |          |     90 |      |     . |                            |
| p000           | Set 1 |      2 |       1 |    0.01 |    0.01 |          |             |          |        |      |     . |                            |
| p001           | Set 1 |      2 |       2 |    0.02 |    0.02 |          |             |          |        |      |     . |                            |
| p002           | Set 2 |      2 |       1 |    0.01 |    0.01 |     0.00 |             |          |    944 |   32 |   7MB |                            |
| p003           | Set 2 |      2 |       2 |    0.01 |    0.01 |     0.00 |             |          |    937 |   30 |   7MB |                            |
==========================================================================================================================================================

Despite “allocating” 6,730 servers Oracle is only admitting to having used 7 of them – so let’s take a closer look at how they’re used. There are two groups, and we have one set of 3 slaves in group 1, and two sets of two slaves in group 2. (It looks to me as if the Group# and Type columns should be the other way around given the hierarchy of group / type / server#). We can understand a little more of what these numbers mean if we look at the execution plan – particularly the special columns relating to Data Flow Operations (DFOs) and “DFO trees”.


SQL Plan Monitoring Details (Plan Hash Value=3398913290)
========================================================================================================================================================================
| Id |          Operation           |   Name   |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Mem  | Activity |      Activity Detail       |
|    |                              |          | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | (Max) |   (%)    |        (# samples)         |
========================================================================================================================================================================
|  0 | SELECT STATEMENT             |          |         |      |        32 |     +0 |     1 |     8846 |      |       |     . |     2.70 | Cpu (1)                    |
|  1 |   FILTER                     |          |         |      |        32 |     +0 |     1 |     8846 |      |       |     . |     5.41 | PX Deq: Signal ACK EXT (2) |
|  2 |    PX COORDINATOR            |          |         |      |        32 |     +0 |     5 |     8846 |      |       |     . |          |                            |
|  3 |     PX SEND QC (RANDOM)      | :TQ20002 |    9146 |  128 |        29 |     +2 |     2 |     8846 |      |       |     . |          |                            |
|  4 |      HASH JOIN BUFFERED      |          |    9146 |  128 |        29 |     +2 |     2 |     8846 |      |       |   9MB |          |                            |
|  5 |       PX RECEIVE             |          |    8846 |   11 |        14 |     +2 |     2 |     8846 |      |       |     . |          |                            |
|  6 |        PX SEND HYBRID HASH   | :TQ20000 |    8846 |   11 |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  7 |         STATISTICS COLLECTOR |          |         |      |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  8 |          PX BLOCK ITERATOR   |          |    8846 |   11 |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  9 |           TABLE ACCESS FULL  | T2       |    8846 |   11 |         1 |     +0 |    23 |     8846 |   24 |   1MB |     . |          |                            |
| 10 |       PX RECEIVE             |          |   50000 |  116 |        14 |     +2 |     2 |     2509 |      |       |     . |          |                            |
| 11 |        PX SEND HYBRID HASH   | :TQ20001 |   50000 |  116 |         1 |     +0 |     2 |     2509 |      |       |     . |          |                            |
| 12 |         PX BLOCK ITERATOR    |          |   50000 |  116 |         1 |     +0 |     2 |     2509 |      |       |     . |          |                            |
| 13 |          TABLE ACCESS FULL   | T1       |   50000 |  116 |         1 |     +0 |    26 |     2509 |   38 |  12MB |     . |          |                            |
| 14 |    PX COORDINATOR            |          |         |      |        31 |     +1 |  8978 |     2252 |      |       |     . |    13.51 | PX Deq: Join ACK (5)       |
| 15 |     PX SEND QC (RANDOM)      | :TQ10000 |       1 |   77 |        32 |     +0 |  6667 |     3692 |      |       |     . |          |                            |
| 16 |      PX BLOCK ITERATOR       |          |       1 |   77 |        32 |     +0 |  6667 |    92478 |      |       |     . |     2.70 | Cpu (1)                    |
| 17 |       TABLE ACCESS FULL      | T3       |       1 |   77 |        32 |     +0 | 53118 |    92478 |   32 |   8MB |     . |    67.57 | Cpu (25)                   |
========================================================================================================================================================================

The “Name” column shows us that we have two DFO trees (:TQ2nnnn, and :TQ1nnnn) – this is why we see two “groups” in the PX server detail, and why those groups can have different degrees of parallelism.

Looking at the general shape of the plan you can see that operation 1 is a FILTER operation with two child operations, one at operation 2, the other at operation 14. So we probably have a filter subquery operated as DFO tree #1 while the main query is operated as DFO tree #2. This means the main query is running at DOP = 2 (it’s a hash join with hash distribution, so it needs two sets of slave processes – and all the details agree with what we’ve learned about Group# 2 above); and the subquery is operating at DOP = 3, using only one set of slave processes.

There is a little anomaly in the number of Execs of operation 14 – at some point I will examine it more closely, but it might simply be a reporting error that has added the number of Execs of its child operations to its own Execs, it might be something to do with counting in Exec calls by its parent, or it might be a side effect of scalar subquery caching. I’ll worry about it when I have a good reason to do so. What I want to look at is the Execs of operations 15/16, the PX Send QC / PX Block Iterator. There are 6,667 reports of a PX slave executing, and that matches up quite nicely with the 6,730 reported “Servers Allocated” – so it would appear that Oracle says it’s allocating a server whenever it uses a server. But does it really “allocate” (and, presumably, de-allocate)?

Here’s how you find out – you run the query again, taking various snapshots and looking for cross-references. I’ve got some results from v$pq_tqstat and v$pq_slave for the run that produced the SQL Monitor report above, and some of the QC session stats and enqueue stats for a subsequent run. This is what we see:


select  process, count(*) 
from    v$pq_tqstat 
group by 
        process 
order by 
        process
;


PROCESS                    COUNT(*)
------------------------ ----------
P000                              3
P001                              3
P002                              2
P003                              2
P004                           2225
P005                           2214
P006                           2218
QC                             2243


SQL> select slave_name, sessions from V$pq_slave order by slave_name;

SLAV   SESSIONS
---- ----------
P000          1
P001          1
P002          1
P003          1
P004       2242
P005       2242
P006       2242

Key Session Stats
=================
Name                                                                         Value                                                                          
----                                                                         -----                                                                          
opened cursors cumulative                                                    6,955                                                                          
user calls                                                                  20,631                                                                          
recursive calls                                                             20,895                                                                          
enqueue requests                                                            22,699                                                                          
enqueue conversions                                                         13,610                                                                          
enqueue releases                                                            15,894                                                                          
parse count (total)                                                          6,857                                                                          
execute count                                                                6,966                                                                          
DFO trees parallelized                                                           2
Parallel operations not downgraded                                           2,268

Key Enqueue Stats
=================
Type Short name                   Requests       Waits     Success      Failed    Wait m/s                                                                  
---- ----------                   --------       -----     -------      ------    --------                                                                  
DA   Slave Process Array             2,272          13       2,272           0          43                                                                  
PS   contention                     27,160       1,586      27,080           7         415                                                                  
SE   contention                      6,784           0       6,785           0           0                                                                  

TYPE                 DESCRIPTION
-------------------- ------------------------------------------------------------------------
PS                   Parallel Execution Server Process reservation and synchronization
DA                   Slave Process Spawn reservation and synchronization
SE                   Lock used by transparent session migration

Oracle really did start and stop something like 6,700 PX sessions (constantly re-using the same small set of PX slave processes) in the course of executing the filter subquery over and over again. This is definitely a performance threat – we keep acquiring and releasing PX slaves, we keep creating new sessions (yes, really), and we keep searching for cursors in the library cache. All these activities are highly contentious. If you start running multiple queries that do this sort of thing you find that you see increasing amounts of time being spent on latch contention, PX slave allocation, mutex waits, and all the other problems you get with sessions that log on, do virtually nothing, then log off in rapid succession.
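
If you want to watch this happening you can sample a couple of the standard dynamic performance views while the query is running – a sketch, with the column lists trimmed to the interesting ones; in particular the “Servers Started” figure in v$px_process_sysstat climbs very quickly when this type of churn is going on:

select  server_name, status, pid, sid, serial#
from    v$px_process
order by
        server_name
;

select  statistic, value
from    v$px_process_sysstat
where   statistic like 'Servers%'
;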

So how do you write SQL that does this type of thing? Here’s my data model (you may want to limit the number of rows in the tables):


create table t1 as
select * from all_source;

create table t2 as
select * from all_source where mod(line,20) = 1;

create table t3 as
select * from all_source;

And here’s all you have to do to start creating problems. I’ve added explicit hints to force parallelism (particularly for the subquery); in a real system it’s more likely that parallelism would have been introduced accidentally by table or index definitions, or by an “alter session” to “force parallel”:


set feedback only

select
        /*+ 
                parallel(t1 2) 
                parallel(t2 2)
                leading(t1 t2)
                use_hash(t2)
                swap_join_inputs(t2)
                pq_distribute(t2 hash hash)
                cardinality(t1,50000)
        */
        t1.owner,
        t1.name,
        t1.type
from
        t1
join
        t2
on      t2.owner = t1.owner
and     t2.name = t1.name
and     t2.type = t1.type
where
        t1.line = 1
and     (
           mod(length(t1.text), 10) = 0
        or exists (
                select --+ parallel(t3 3) 
                        null
                from    t3
                where   t3.owner = t1.owner
                and     t3.name = t1.name
                and     t3.type = t1.type
                and     t3.line >= t1.line
                )
        )
;

set feedback on

I’ve written notes in the past about SQL that forces the optimizer to run subqueries as filter subqueries instead of unnesting them – this is just an example of that type of query, pushed into parallelism. It’s not the only way (see comment #1 from Dmitry Remizov below) to end up with scalar subqueries being executed many times as separate DFO trees, even though Oracle has enhanced the optimizer several times over the years in ways that bypass the threat – but the problem can still appear and it’s important to notice in development that you’ve got a query that Oracle can’t work around.
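
If you can’t rewrite the query to allow unnesting, one damage-limitation sketch (it trades the session churn for serial executions of the subquery, so test before trusting it) is simply to stop the subquery running parallel:

        or exists (
                select  /*+ no_parallel(t3) */
                        null
                from    t3
                where   t3.owner = t1.owner
                and     t3.name  = t1.name
                and     t3.type  = t1.type
                and     t3.line >= t1.line
        )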

 

June 6, 2019

Scalar Subquery Costing

Filed under: CBO,Execution plans,Oracle — Jonathan Lewis @ 7:54 pm BST Jun 6,2019

A question came up on Oracle-l list-server a few days ago about how Oracle calculates costs for a scalar subquery in the select list. The question included an example to explain the point of the question. I’ve reproduced the test below, with the output from an 18.3 test system. The numbers don’t match the numbers produced in the original posting but they are consistent with the general appearance.

rem
rem     Script:         ssq_costing.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2019
rem     Purpose:        
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem

create table t_1k ( n1 integer ) ;
create table t_100k ( n1 integer ) ;

insert into t_1k
  select
         level
    from dual
    connect by level <= 1e3;

insert into t_100k
  select level
    from dual
    connect by level <= 1e5;

commit ;

begin
  dbms_stats.gather_table_stats ( null, 'T_1K') ;
  dbms_stats.gather_table_stats ( null, 'T_100K') ;
end ;
/

explain plan for
select 
        /*+ qb_name(QB_MAIN) */
        (
        select /*+ qb_name(QB_SUBQ) */ count(*)
        from t_1k
        where t_1k.n1 = t_100k.n1
        )
from t_100k
;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |   100K|   488K|  1533K  (2)| 00:01:00 |
|   1 |  SORT AGGREGATE    |        |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T_1K   |     1 |     4 |    17   (0)| 00:00:01 |
|   3 |  TABLE ACCESS FULL | T_100K |   100K|   488K|    36   (9)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T_1K"."N1"=:B1)

The key point to note is this – the scalar subquery has to execute 100,000 times because that’s the number of rows in the driving table. The cost for executing the scalar subquery once is 17 – so the total cost of the query should be 1,700,036 – not 1,533K (and for execution plans the K means x1000, not x1024). There’s always room for rounding errors, of course, but a check of the 10053 (CBO trace) file shows the numbers to be 17.216612 for the t_1k tablescan, 36.356072 for the t_100K tablescan, and 1533646.216412 for the whole query. So how is Oracle managing to get a cost that looks lower than it ought to be?

There’s plenty of scope for experimenting to see how the numbers change – and my first thought was simply to see what happens as you change the number of distinct values in the t_100k.n1 column. It would be rather tedious to go through the process of modifying the data a few hundred times to see what happens, so I took advantage of the get_column_stats() and set_column_stats() procedures in the dbms_stats package to create a PL/SQL loop that faked a number of different scenarios by lying about the actual table data.


delete from plan_table;
commit;

declare

        srec                    dbms_stats.statrec;
        n_array                 dbms_stats.numarray;

        m_distcnt               number;
        m_density               number;
        m_nullcnt               number;
        m_avgclen               number;


begin

        dbms_stats.get_column_stats(
                ownname         => user,
                tabname         => 't_100k',
                colname         => 'n1', 
                distcnt         => m_distcnt,
                density         => m_density,
                nullcnt         => m_nullcnt,
                srec            => srec,
                avgclen         => m_avgclen
        ); 

        for i in 1 .. 20 loop

                m_distcnt := 1000 * i;
                m_density := 1/m_distcnt;

                dbms_stats.set_column_stats(
                        ownname         => user,
                        tabname         => 't_100k',
                        colname         => 'n1', 
                        distcnt         => m_distcnt,
                        density         => m_density,
                        nullcnt         => m_nullcnt,
                        srec            => srec,
                        avgclen         => m_avgclen
                ); 


        execute immediate
        '
                explain plan set statement_id = ''' || m_distcnt || 
        '''
                for
                select
                        /*+ qb_name(QB_MAIN) */
                        (
                        select /*+ qb_name(QB_SUBQ) */ count(*)
                        from t_1k
                        where t_1k.n1 = t_100k.n1
                        )
                from t_100k
        ';
        
        end loop;       

end;
/

The code is straightforward. I’ve declared a few variables to hold the column stats from the t_100k.n1 column, called get_column_stats(), then looped 20 times through a process that changes the number of distinct values (and corresponding density) recorded in the column stats, then used execute immediate to call “explain plan” for the original query.

You’ll notice I’ve given each plan a separate statement_id that corresponds to the num_distinct that generated the plan. In the code above I’ve changed the num_distinct from 1,000 to 20,000 in steps of 1,000.

Once the PL/SQL block ends I’ll have a plan table with 20 execution plans stored in it and, rather than reporting those plans with calls to dbms_xplan.display(), I’m going to be selective about which rows and columns I report.

select
        statement_id, 
        io_cost,
        io_cost - lag(io_cost,1) over (order by to_number(statement_id)) io_diff,
        cpu_cost,
        cpu_cost - lag(cpu_cost,1) over (order by to_number(statement_id)) cpu_diff,
        cost
from 
        plan_table
where 
        id = 0
order by 
        to_number(statement_id)
;

I’ve picked id = 0 (the top line of the plan) for each statement_id and I’ve reported the cost column, which is made up of the io_cost column plus a scaled down value of the cpu_cost column. I’ve also used the analytic lag() function to calculate how much the io_cost and cpu_cost changed from the previous statement_id. Here are my results from 18c:


STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
1000                                17033            1099838920                 17253
2000                                34033      17000 2182897480 1083058560      34470
3000                                51033      17000 3265956040 1083058560      51686
4000                                68033      17000 4349014600 1083058560      68903
5000                                85033      17000 5432073160 1083058560      86119
6000                               102033      17000 6515131720 1083058560     103336
7000                               119033      17000 7598190280 1083058560     120553
8000                               136033      17000 8681248840 1083058560     137769
9000                               153033      17000 9764307400 1083058560     154986
10000                              170033      17000 1.0847E+10 1083058560     172202
11000                              197670      27637 1.2608E+10 1760725019     200191
12000                              338341     140671 2.1570E+10 8962036084     342655
13000                              457370     119029 2.9153E+10 7583261303     463200
14000                              559395     102025 3.5653E+10 6499938259     566525
15000                              647816      88421 4.1287E+10 5633279824     656073
16000                              725185      77369 4.6216E+10 4929119846     734428
17000                              793452      68267 5.0565E+10 4349223394     803565
18000                              854133      60681 5.4431E+10 3865976350     865019
19000                              908427      54294 5.7890E+10 3459031472     920005
20000                              957292      48865 6.1003E+10 3113128324     969492

The first pattern that hits the eye is the constant change of 17,000 in the io_cost in the first few lines of the output. For “small” numbers of distinct values the (IO) cost of the query is (33 + 17 * num_distinct) – in other words, the arithmetic seems to assume that it will execute the subquery once for each distinct value and then cache the results so that repeated executions for any given value will not be needed. This looks as if the optimizer is trying to match its arithmetic to the “scalar subquery caching” mechanism.
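
As a quick check against the output above: with 5,000 distinct values the formula gives 33 + 17 * 5,000 = 85,033, which is exactly the io_cost reported for statement_id 5000.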

But things change somewhere between 10,000 and 11,000 distinct values. The point comes where adding one more distinct value causes a much bigger jump in cost than 17, and that’s because Oracle assumes it’s reached a point where there’s a value that it won’t have room for in the cache and will have to re-run the subquery multiple times for that value as it scans the rest of the table. Let’s find the exact break point where that happens.

Changing my PL/SQL loop so that we calculate m_distcnt as “10910 + i”, this is the output from the final query:


-- m_distcnt := 10910 + i;

STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
10911                              185520            1.1834E+10                187887
10912                              185537         17 1.1835E+10    1083059     187904
10913                              185554         17 1.1836E+10    1083058     187921
10914                              185571         17 1.1837E+10    1083059     187938
10915                              185588         17 1.1838E+10    1083058     187956
10916                              185605         17 1.1839E+10    1083059     187973
10917                              185622         17 1.1841E+10    1083059     187990
10918                              185639         17 1.1842E+10    1083058     188007
10919                              185656         17 1.1843E+10    1083059     188025
10920                              185673         17 1.1844E+10    1083058     188042
10921                              185690         17 1.1845E+10    1083059     188059
10922                              185707         17 1.1846E+10    1083058     188076
10923                              185770         63 1.1850E+10    4027171     188140
10924                              185926        156 1.1860E+10    9914184     188298
10925                              186081        155 1.1870E+10    9912370     188455
10926                              186237        156 1.1880E+10    9910555     188613
10927                              186393        156 1.1890E+10    9908741     188770
10928                              186548        155 1.1900E+10    9906928     188928
10929                              186703        155 1.1909E+10    9905114     189085
10930                              186859        156 1.1919E+10    9903302     189243

If we have 10,922 distinct values in the column the optimizer calculates as if it will be able to cache them all; but if we have 10,923 distinct values the optimizer thinks that there’s going to be one value where it can’t cache the result and will have to run the subquery more than once.

Before looking at this in more detail let’s go to the other interesting point – when does the cost stop changing? We can see the cost increasing as the number of distinct values grows, but we saw at the start that the cost didn’t get as large as we expected, so there must be a point where it stops increasing before it “ought” to.

I’ll jump straight to the answer: here’s the output from the test when I start num_distinct off at slightly less than half the number of rows in the table:


 -- m_distcnt := (50000 - 10) + i;

STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
49991                             1514281            9.6488E+10               1533579
49992                             1514288          7 9.6489E+10     473357    1533586
49993                             1514296          8 9.6489E+10     473337    1533594
49994                             1514303          7 9.6490E+10     473319    1533601
49995                             1514311          8 9.6490E+10     473299    1533609
49996                             1514318          7 9.6491E+10     473281    1533616
49997                             1514325          7 9.6491E+10     473262    1533624
49998                             1514333          8 9.6492E+10     473243    1533631
49999                             1514340          7 9.6492E+10     473224    1533639
50000                             1514348          8 9.6493E+10     473205    1533646
50001                             1514348          0 9.6493E+10          0    1533646
50002                             1514348          0 9.6493E+10          0    1533646
50003                             1514348          0 9.6493E+10          0    1533646
50004                             1514348          0 9.6493E+10          0    1533646
50005                             1514348          0 9.6493E+10          0    1533646
50006                             1514348          0 9.6493E+10          0    1533646
50007                             1514348          0 9.6493E+10          0    1533646
50008                             1514348          0 9.6493E+10          0    1533646
50009                             1514348          0 9.6493E+10          0    1533646
50010                             1514348          0 9.6493E+10          0    1533646

The cost just stops changing when num_distinct = half the rows in the table.

Formulae

During the course of these experiments I had been exchanging email messages with Nenad Noveljic via the Oracle-L list-server (full monthly archive here) and he came up with the suggestion of a three-part formula that assumed a cache size and gave a cost of:

  • “tablescan cost + num_distinct * subquery unit cost” for values of num_distinct up to the cache size;
  • then, for values of num_distinct greater than the cache size and up to half the number of rows in the table, it added a marginal cost representing the probability that some values would not be cached;
  • then for values of num_distinct greater than half the number of rows in the table reported the cost associated with num_distinct = half the number of rows in the table.

Hence:

  • for 1 <= num_distinct <= 10,922, cost = 33 + num_distinct * 17
  • for 10,923 <= num_distinct <= 50,000, cost = (33 + 10,922 * 17) + (1 – 10,922/num_distinct) * 100,000 * 17
  • for 50,000 <= num_distinct <= 100,000, cost = cost(50,000).

The middle line needs a little explanation: (1 - 10,922/num_distinct) is the probability that a value will not be in the cache; this has to be multiplied by 100,000 (the number of rows in the table) to give the expected number of rows that will not be cached, and then multiplied by 17 as the cost of running the subquery for those rows.

The middle line can be re-arranged as 33 + 17 * (10,922 + (1 – 10,922/num_distinct) * 100,000)
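
Restating the three cases as a purely illustrative PL/SQL function – this is nothing more than the formulae above with the constants of this particular test hard-coded, certainly not Oracle's actual code:


create or replace function ssq_cost(num_distinct number) return number is
        scan_cost       constant number := 33;          -- cost of the driving tablescan
        unit_cost       constant number := 17;          -- cost of one execution of the subquery
        cache_entries   constant number := 10922;       -- assumed number of cacheable values
        num_rows        constant number := 100000;      -- rows in the driving table
begin
        if num_distinct <= cache_entries then
                -- every distinct value can be cached: one execution per distinct value
                return scan_cost + num_distinct * unit_cost;
        elsif num_distinct <= num_rows / 2 then
                -- some values won't be cached: add the expected number of extra executions
                return scan_cost + unit_cost *
                        (cache_entries + (1 - cache_entries / num_distinct) * num_rows);
        else
                -- the cost stops changing at half the number of rows in the table
                return ssq_cost(num_rows / 2);
        end if;
end;
/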

Tweaking

At this point I could modify my code loop to report the calculated value for the cost and compare it with the actual cost to show you that the two values didn't quite match. Instead I'll jump forward a little to a correction that needs to be made to the formula above. It revolves around how Oracle determines the cache size. There's a hidden parameter (which I mentioned in CBO Fundamentals) that controls scalar subquery caching. In the book I think I only referenced it in the context of subqueries in the "where" clause. The parameter is "_query_execution_cache_max_size" and it has a default value of 131072 (power(2,17)) – so when I found that the initial formula didn't quite work I made the following observation:

  • 131072 / 10922 = 12.00073
  • 131072 / 12 = 10922.666…

So I put 10,922.66667 into the formula to see if that would improve things.

For the code change I added a variable m_cost to the PL/SQL block, and set it inside the loop as follows:

m_cost := round(33 + 17 * (10922.66667 + 100000 * (1 - (10922.66667 / m_distcnt))));

Then in the “execute immediate” I changed the “explain plan” line to read:

explain plan set statement_id = ''' || lpad(m_distcnt,7) || ' - ' || lpad(m_cost,8) ||

This allowed me to show the formula's prediction of the (IO) cost in the final output, and here's what I got for values of num_distinct in the region of 10,922:


STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
  10911 -   183901                 185520            1.1834E+10                187887
  10912 -   184057                 185537         17 1.1835E+10    1083059     187904
  10913 -   184212                 185554         17 1.1836E+10    1083058     187921
  10914 -   184368                 185571         17 1.1837E+10    1083059     187938
  10915 -   184524                 185588         17 1.1838E+10    1083058     187956
  10916 -   184680                 185605         17 1.1839E+10    1083059     187973
  10917 -   184836                 185622         17 1.1841E+10    1083059     187990
  10918 -   184992                 185639         17 1.1842E+10    1083058     188007
  10919 -   185147                 185656         17 1.1843E+10    1083059     188025
  10920 -   185303                 185673         17 1.1844E+10    1083058     188042
  10921 -   185459                 185690         17 1.1845E+10    1083059     188059
  10922 -   185615                 185707         17 1.1846E+10    1083058     188076
  10923 -   185770                 185770         63 1.1850E+10    4027171     188140
  10924 -   185926                 185926        156 1.1860E+10    9914184     188298
  10925 -   186081                 186081        155 1.1870E+10    9912370     188455
  10926 -   186237                 186237        156 1.1880E+10    9910555     188613
  10927 -   186393                 186393        156 1.1890E+10    9908741     188770
  10928 -   186548                 186548        155 1.1900E+10    9906928     188928
  10929 -   186703                 186703        155 1.1909E+10    9905114     189085
  10930 -   186859                 186859        156 1.1919E+10    9903302     189243

The formula is only supposed to work in the range 10,923 – 50,000, so the first few results don't match; but in the range 10,923 to 10,930 the match is exact. Then, in the region of 50,000 we get:


STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
  49991 -  1514281                1514281            9.6488E+10               1533579
  49992 -  1514288                1514288          7 9.6489E+10     473357    1533586
  49993 -  1514296                1514296          8 9.6489E+10     473337    1533594
  49994 -  1514303                1514303          7 9.6490E+10     473319    1533601
  49995 -  1514311                1514311          8 9.6490E+10     473299    1533609
  49996 -  1514318                1514318          7 9.6491E+10     473281    1533616
  49997 -  1514325                1514325          7 9.6491E+10     473262    1533624
  49998 -  1514333                1514333          8 9.6492E+10     473243    1533631
  49999 -  1514340                1514340          7 9.6492E+10     473224    1533639
  50000 -  1514348                1514348          8 9.6493E+10     473205    1533646
  50001 -  1514355                1514348          0 9.6493E+10          0    1533646
  50002 -  1514363                1514348          0 9.6493E+10          0    1533646
  50003 -  1514370                1514348          0 9.6493E+10          0    1533646
  50004 -  1514377                1514348          0 9.6493E+10          0    1533646
  50005 -  1514385                1514348          0 9.6493E+10          0    1533646
  50006 -  1514392                1514348          0 9.6493E+10          0    1533646
  50007 -  1514400                1514348          0 9.6493E+10          0    1533646
  50008 -  1514407                1514348          0 9.6493E+10          0    1533646
  50009 -  1514415                1514348          0 9.6493E+10          0    1533646
  50010 -  1514422                1514348          0 9.6493E+10          0    1533646

Again, the formula applies only in the range up to 50,000 (half the rows in the table) – and the match is perfect in that range.

Next steps

The work so far gives us some idea of the algorithm that the optimizer is using to derive a cost, but this is just one scenario and there are plenty of extra questions we might ask. The most pressing one: what is the significance of the number 12 in the calculation 131,072/12? From previous experience I guessed that it was related to the lengths of the input and output values of the scalar subquery – as in "value X for n1 returns value Y for count(*)".

To pursue this idea I recreated the data sets using varchar2(10) as the definition of n1 and lpad(rownum,10) as the value – the "breakpoint" dropped from 10,922 down to 5,461. Checking the arithmetic: 131,072 / 5,461 = 24.001456, then 131,072 / 24 = 5461.333… And that's the number that made the formula work perfectly for the modified data set.
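
I won't repeat the full test setup here, but the change was of this general shape (a sketch – the table name and spare column are placeholders; the point is the varchar2(10) declaration and the lpad()):


create table t_100k (
        id      number,
        n1      varchar2(10)            -- previously a number
);

insert into t_100k
select
        rownum,
        lpad(rownum,10)                 -- left-padded with spaces: avg_col_len = 11
from
        dual
connect by
        rownum <= 100000
;

execute dbms_stats.gather_table_stats(null,'t_100k')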

Then I used dbms_stats.set_column_stats() to hack the avg_col_len of t_100K.n1 to 15 and the breakpoint dropped to 4,096. Again we do the two arithmetic steps: 131,072 / 4,096 = 32 (but this time we don't need to do the reverse step since the first result is integral).
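
The hack itself is a one-liner with dbms_stats.set_column_stats() – something like this, where avgclen is the parameter that supplies the new avg_col_len:


begin
        dbms_stats.set_column_stats(
                ownname => user,
                tabname => 'T_100K',
                colname => 'N1',
                avgclen => 15
        );
end;
/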

Checking the original data set – where n1 was numeric – the avg_col_len was 5, so we have three reference points:

  • avg_col_len = 5: "cache unit size" = 12
  • avg_col_len = 11: cache unit size = 24 (don't forget the avg_col_len includes the length byte, so our padded varchar2(10) has a length of 11)
  • avg_col_len = 15: cache unit size = 32

There's an obvious pattern here: "cache unit size" = (2 * avg_col_len + 2). Since I hadn't been changing the t_1k.n1 column at the same time, that really does look like a deliberate factor of 2 (I'd thought initially that maybe the 12 was affected by the lengths of both columns in the predicate – but that doesn't seem to be the case).

The scientific method says I should now make a prediction based on my hypothesis – so I set the avg_col_len for t_100K.n1 to 23 and guessed that the break point would be at 2,730 – and it was: 131,072 / (2 * 23 + 2) = 2730.6666…
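
A quick query generates the arithmetic for all four reference points in one go (sys.odcinumberlist is just a convenient collection type for the demonstration):


select
        avg_col_len,
        2 * avg_col_len + 2                             cache_unit_size,
        round(131072 / (2 * avg_col_len + 2), 4)        break_point
from    (
        select  column_value    avg_col_len
        from    table(sys.odcinumberlist(5, 11, 15, 23))
        )
;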

The next question, of course, is: where does the "spare 2" come from? Trying to minimize the change in the code I modified my subquery to select sum(to_number(n1)) rather than count(*), then avg(to_number(n1)) – remember I had changed n1 to a varchar2(10) that looked like a number left-padded with spaces. In every variant of the tests I'd done so far, all I had to do to get an exact match between the basic formula and the optimizer's cost calculation was to use "2 * avg_col_len + 22" as the cache unit size – and 22 is the nominal maximum length of an internally stored numeric column.

Bottom line: the cache unit size seems to be related to the input and output values, but I don't know why there's a factor of 2 applied to the input column length, and I don't know why the length of count(*) is deemed to be 2 when other derived numeric outputs have the more intuitive 22 for their length.

tl;dr

The total cost calculation for a scalar subquery in the select list is largely affected by:

  • a fixed cache size (131,072 bytes) possibly set by hidden parameter _query_execution_cache_max_size
  • the avg_col_len of the input (correlating) column(s) from the driving table
  • the nominal length of the output (select list) of the subquery

There is an unexplained factor of 2 used with the avg_col_len of the input, and a slightly surprising value of 2 if the output is simply count(*).

If the number N of distinct values for the driving column(s) is less than the number of possible cache entries, the effect of the scalar subquery is to add N * (estimated cost of executing the subquery once). As the number of distinct values for the driving column(s) goes above that limit, the incremental effect of the subquery is based on the expected number of times an input value will not be cached. When the number of distinct values in the driving column(s) exceeds half the number of rows in the driving table the cost stops increasing – there is no obvious reason why the algorithm does this.
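
To summarise the hypothesis, here's the whole model as one (completely unofficial) function – every name, parameter, and the shape of the calculation is my inference from the experiments above, not documented Oracle behaviour:


create or replace function ssq_extra_cost (
        num_distinct    number,
        num_rows        number,         -- rows in the driving table
        avg_col_len     number,         -- avg_col_len of the correlating column(s)
        output_len      number,         -- 2 for count(*), 22 for other numeric outputs
        unit_cost       number          -- cost of one execution of the subquery
) return number is
        cache_bytes     constant number := 131072;      -- _query_execution_cache_max_size
        cache_entries   number := cache_bytes / (2 * avg_col_len + output_len);
        effective_nd    number := least(num_distinct, num_rows / 2);
begin
        if effective_nd <= cache_entries then
                return effective_nd * unit_cost;
        else
                return unit_cost *
                        (cache_entries + (1 - cache_entries / effective_nd) * num_rows);
        end if;
end;
/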

There are many more cases that I could investigate at this point – but I think this model is enough as an indication of the general method. If you come across a variation where you actually need to work out how the optimizer derived a cost then this framework will probably be enough to get you started in the right direction.


May 21, 2019

Misleading Execution Plan

Filed under: Execution plans,Oracle,subqueries — Jonathan Lewis @ 11:48 am BST May 21,2019

A couple of weeks ago I published a note about an execution plan which showed the details of a scalar subquery in the wrong place (as far as the typical strategies for interpreting execution plans are concerned). In a footnote to the article I commented that Andy Sayer had produced a simple reproducible example of the anomaly based around the key features of the query supplied in the original posting and had emailed it to me.  With his permission (and with some minor modifications) I’ve reproduced it below:


rem
rem     Script:         misplaced_subq_plan.sql
rem     Author:         Andrew Sayer
rem     Dated:          May 2019
rem

drop table recursive_table;
drop table lookup_t;
drop table join_t;

@@setup

set linesize 180
set pagesize 60

create table recursive_table (
        my_id           number constraint rt_pk primary key,
        parent_id       number,
        fk_col          number
);

insert into recursive_table 
select 
        rownum, 
        nullif(rownum-1,0)      parent_id, 
        mod(rownum,10) 
from 
        dual 
connect by 
        rownum <=100
;

prompt  ==================================================
prompt  Note that fk_col will be zero for 1/10 of the rows
prompt  ==================================================

create table lookup_t(
        pk_col number  constraint lt_pk primary key,
        value varchar2(30 char)
)
;

insert into lookup_t 
select 
        rownum, 
        rpad('x',30,'x') 
from 
        dual 
connect by 
        rownum <=100
;

create table join_t(
        pk_col number primary key,
        col_10 number,
        value varchar2(100 char)
);

insert into join_t 
select 
        rownum, mod(rownum,10), rpad('x',30,'x') 
from 
        dual 
connect by 
        rownum <=1000 --> comment to avoid WordPress format problem.
;

execute dbms_stats.gather_table_stats(null,'recursive_table')
execute dbms_stats.gather_table_stats(null,'lookup_t')
execute dbms_stats.gather_table_stats(null,'join_t')

prompt	================================
prompt	note that pk_col will never be 0
prompt	================================

set serverout off
alter session set statistics_level=all;

var x number
exec :x := 1

spool misplaced_subq_plan

select  /* distinct */ 
        r.my_id, j.value, r.ssq_value
from    (
	select 
		my_id, parent_id, fk_col, 
		(select l.value from lookup_t l where l.pk_col = r.parent_id) ssq_value 
        from 
		recursive_table r 
	connect by 
		prior my_id = parent_id 
	start with 
		my_id = :x
	) r
join    join_t j
on	r.fk_col = j.pk_col
/

select * from table(dbms_xplan.display_cursor(format=>'allstats projection'));

set serveroutput on

spool off

The code generates, populates, and queries three tables:

  • recursive_table is used in a “connect by” query to generate some data.
  • lookup_t is used in a scalar subquery in the select list of the “connect by” query.
  • join_t is then joined to the result of the “connect by” query to eliminate some rows.

The construction allows us to see a difference between the number of rows returned and the number of times the scalar subquery is executed, and makes it easy to detect an anomaly in the presentation of the execution plan. And here is the execution plan from an 18.3 instance:

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                 |      1 |        |     90 |00:00:00.01 |     170 |       |       |          |
|   1 |  TABLE ACCESS BY INDEX ROWID              | LOOKUP_T        |    100 |      1 |     99 |00:00:00.01 |     102 |       |       |          |
|*  2 |   INDEX UNIQUE SCAN                       | LT_PK           |    100 |      1 |     99 |00:00:00.01 |       3 |       |       |          |
|*  3 |  HASH JOIN                                |                 |      1 |      2 |     90 |00:00:00.01 |     170 |  1123K|  1123K|     1/0/0|
|   4 |   VIEW                                    |                 |      1 |      2 |    100 |00:00:00.01 |     125 |       |       |          |
|*  5 |    CONNECT BY NO FILTERING WITH START-WITH|                 |      1 |        |    100 |00:00:00.01 |      23 |  9216 |  9216 |     2/0/0|
|   6 |     TABLE ACCESS FULL                     | RECURSIVE_TABLE |      1 |    197 |    100 |00:00:00.01 |      23 |       |       |          |
|   7 |   TABLE ACCESS FULL                       | JOIN_T          |      1 |      1 |   1000 |00:00:00.01 |      45 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("L"."PK_COL"=:B1)
   3 - access("R"."FK_COL"="J"."PK_COL")
   5 - access("PARENT_ID"=PRIOR NULL)
       filter("MY_ID"=:X)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - "L"."VALUE"[VARCHAR2,120]
   2 - "L".ROWID[ROWID,10]
   3 - (#keys=1) "R"."MY_ID"[NUMBER,22], "J"."VALUE"[VARCHAR2,400], "R"."SSQ_VALUE"[VARCHAR2,120], "J"."VALUE"[VARCHAR2,400]
   4 - "R"."MY_ID"[NUMBER,22], "R"."FK_COL"[NUMBER,22], "R"."SSQ_VALUE"[VARCHAR2,120]
   5 - "PARENT_ID"[NUMBER,22], "MY_ID"[NUMBER,22], "FK_COL"[NUMBER,22], "R"."PARENT_ID"[NUMBER,22], PRIOR NULL[22], LEVEL[4]
   6 - "MY_ID"[NUMBER,22], "PARENT_ID"[NUMBER,22], "FK_COL"[NUMBER,22]
   7 - "J"."PK_COL"[NUMBER,22], "J"."VALUE"[VARCHAR2,400]

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan


In a typical execution plan with scalar subqueries in the select list, the sub-plans for the scalar subqueries appear in the plan before the main query – and in this plan you can see the scalar subquery here at operations 1 and 2.

But the scalar subquery is in the select list of a non-mergeable view (operations 4, 5, 6). We can see that this view generates 100 rows (A-rows of operation 4) and the scalar subquery starts 100 times (Starts of operation 1) – so we can infer that the subquery ran for each row generated by the view.

The problem, though, is that the result set from the view is joined to another table, eliminating some rows and reducing the size of the result set; so if we don’t look carefully at all the details of the plan we appear to have a driving query that produces a result set of 90 rows (at operation 3), but manages to execute the scalar subquery just above it in the plan more times than there are rows in the result set.

It's easy to unpick what's really happening in this very simple query with a very short plan – but much harder to do so in the original case, where the scalar subquery appeared "outside" the hash join while it actually executed inside a complex subplan that generated the second input (the probe table) for the hash join.

As a further little note – if you look at the Column Projection Information you'll see that operation 4 is where Oracle first projects 'r.ssq_value[varchar2,120]', which is the column created by the execution of the sub-plan at operation 1.

Arguably the execution plan should have looked more like this:


Plan hash value: 2557600799

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                 |      1 |        |     90 |00:00:00.01 |     170 |       |       |          |
|*  1 |  HASH JOIN                                |                 |      1 |      2 |     90 |00:00:00.01 |     170 |  1123K|  1123K|     1/0/0|
|   2 |   VIEW                                    |                 |      1 |      2 |    100 |00:00:00.01 |     125 |       |       |          |
|   3 |    TABLE ACCESS BY INDEX ROWID            | LOOKUP_T        |    100 |      1 |     99 |00:00:00.01 |     102 |       |       |          |
|*  4 |     INDEX UNIQUE SCAN                     | LT_PK           |    100 |      1 |     99 |00:00:00.01 |       3 |       |       |          |
|*  5 |    CONNECT BY NO FILTERING WITH START-WITH|                 |      1 |        |    100 |00:00:00.01 |      23 |  9216 |  9216 |     2/0/0|
|   6 |     TABLE ACCESS FULL                     | RECURSIVE_TABLE |      1 |    100 |    100 |00:00:00.01 |      23 |       |       |          |
|   7 |   TABLE ACCESS FULL                       | JOIN_T          |      1 |      1 |   1000 |00:00:00.01 |      45 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------

Inevitably, there are cases where the sub-plan for a scalar subquery appears much closer to its point of operation rather than being moved to the top of the execution plan. So any time you have scalar subqueries in select lists inside in-line views, keep a careful lookout for where they appear and how many times they run in the execution plan. And don't forget that giving every query block a name will help you track down your migrating subqueries.
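
For example – a sketch based on Andy's query with (my) qb_name hints added:


select  /*+ qb_name(main) */
        r.my_id, j.value, r.ssq_value
from    (
        select  /*+ qb_name(inline) */
                my_id, parent_id, fk_col,
                (select /*+ qb_name(ssq) */ l.value from lookup_t l where l.pk_col = r.parent_id) ssq_value
        from
                recursive_table r
        connect by
                prior my_id = parent_id
        start with
                my_id = :x
        ) r
join    join_t j
on      r.fk_col = j.pk_col
/

select * from table(dbms_xplan.display_cursor(format=>'allstats alias'));

With the "alias" format option dbms_xplan adds a "Query Block Name / Object Alias" section to its output, so the plan lines belonging to query block SSQ are easy to identify wherever they happen to appear.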

Footnote

If you're wondering why the Column Projection Information reports r.ssq_value as varchar2(120) when I've declared the column as varchar2(30): my declaration is 30 CHAR, and the database (by default) is running with a multi-byte character set that allows a maximum of 4 bytes per character.
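
You can check the effect from the data dictionary – a sketch, assuming an AL32UTF8 database, where data_length for a varchar2(30 char) column comes out as 120 bytes:


create table t_char_demo (v30 varchar2(30 char));

select  column_name, data_length, char_length, char_used
from    user_tab_columns
where   table_name = 'T_CHAR_DEMO'
;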

Update (22nd May 2019)

Following the comment from Iudith Mentzel below about clever optimisations, primary keys, and related inferences, I thought it worth pointing out that it is possible to modify the demonstration query to get the same plan (shape) with different Start counts. Instead of putting the scalar subquery inside the inline view we can get the same result by passing the parent_id up to the outer query block and running the scalar subquery there:

select  /* distinct */ 
        r.my_id, j.value,
        (select l.value from lookup_t l where l.pk_col = r.parent_id) ssq_value 
from    (
        select 
                my_id, parent_id, fk_col
        from 
                recursive_table r 
        connect by 
                prior my_id = parent_id 
        start with 
                my_id = :x
        ) r
join    join_t j
on      r.fk_col = j.pk_col
/

This gives us the following execution plan (with rowsource execution statistics):


Plan hash value: 2557600799

--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                 |      1 |        |     90 |00:00:00.03 |      29 |       |       |          |
|   1 |  TABLE ACCESS BY INDEX ROWID              | LOOKUP_T        |     90 |      1 |     89 |00:00:00.01 |      97 |       |       |          |
|*  2 |   INDEX UNIQUE SCAN                       | LT_PK           |     90 |      1 |     89 |00:00:00.01 |       8 |       |       |          |
|*  3 |  HASH JOIN                                |                 |      1 |      2 |     90 |00:00:00.03 |      29 |  1695K|  1695K|     1/0/0|
|   4 |   VIEW                                    |                 |      1 |      2 |    100 |00:00:00.01 |       7 |       |       |          |
|*  5 |    CONNECT BY NO FILTERING WITH START-WITH|                 |      1 |        |    100 |00:00:00.01 |       7 |  6144 |  6144 |     2/0/0|
|   6 |     TABLE ACCESS FULL                     | RECURSIVE_TABLE |      1 |    100 |    100 |00:00:00.01 |       7 |       |       |          |
|   7 |   TABLE ACCESS FULL                       | JOIN_T          |      1 |      1 |   1000 |00:00:00.01 |      22 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------

Note that the plan hash values are the same even though (mechanically) the real order of activity is dramatically different. But now we can see that the scalar subquery (operations 1 and 2) starts 90 times – once for each row returned by the hash join at operation 3, and we have done slightly fewer buffer visits (97 compared to 102) for that part of the plan.

It's a pity, though, that when you start poking at a plan and looking too closely there are always new buggy bits to see. With the scalar subquery now at its optimal position (and maybe it will eventually get there without a manual rewrite) the arithmetic of "summing up the plan" has gone wrong for (at least) the Buffers column. In the new plan the 97 buffer visits attributed to operation 1 (and its descendants) should have been added to the 29 buffer visits attributed to the hash join (and its descendants) at operation 3 to give a total of 126; instead the 97 have simply disappeared from the query total.

By comparison, reading the operations in the original plan in a suitable order, we see the view at operation 4 reporting 109 buffers, which comes from 7 for its "obvious" descendants plus the 102 from operation 1 that actually happen inside the view. Then the hash join at operation 3 reports 131 buffers, which is the 109 from the view plus the 22 from the tablescan at operation 7, and that 131 is the final figure for the query.

So, for this particular example, it doesn’t matter what you do, the execution plan and its stats try to confuse you.

May 7, 2019

Execution Plan Puzzle

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 12:45 pm BST May 7,2019

Here’s an execution plan that’s just been published on the ODC database forum. The plan comes from a call to dbms_xplan.display_cursor() with rowsource execution statistics enabled.

There's something unusual about the execution statistics that I don't think I've seen before – can anyone else see anything really odd, or (better still) anything which they would expect others to find odd but which they can easily explain?

A couple of hints:

  • It’s nothing to do with the fact that E-Rows and A-Rows don’t match – that’s never a surprise.
  • It's not really about the fact that huge amounts of time seem to appear out of "nowhere" in the A-Time column.
  • It is something to do with the relationship between A-Rows and Starts.

I’ve inserted a few separator lines to break the plan into smaller pieces that can be examined in isolation. There are two “Load as Select” sections (presumably from “with” subqueries) and the main body of the query.

We don’t, as at time of writing, have the SQL or the Oracle version number that produced this plan. [Update: version now reported as 12.1.0.2]


---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                       | Name                         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                |                              |      1 |        |     50 |00:00:18.00 |     367K|     55 |     55 |       |       |          |
|   1 |  TEMP TABLE TRANSFORMATION                      |                              |      1 |        |     50 |00:00:18.00 |     367K|     55 |     55 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   2 |   LOAD AS SELECT                                |                              |      1 |        |      0 |00:00:00.55 |   56743 |      0 |     53 |  1040K|  1040K|          |
|   3 |    TABLE ACCESS BY INDEX ROWID                  | OBJECTS                      |   7785 |      1 |   7785 |00:00:00.03 |    8150 |      0 |      0 |       |       |          |
|*  4 |     INDEX UNIQUE SCAN                           | PK_OBJECTS                   |   7785 |      1 |   7785 |00:00:00.01 |     360 |      0 |      0 |       |       |          |
|   5 |    TABLE ACCESS BY INDEX ROWID BATCHED          | ATTRIBUTES                   |   7785 |      1 |   5507 |00:00:00.05 |   12182 |      0 |      0 |       |       |          |
|*  6 |     INDEX RANGE SCAN                            | UK_ATTR                      |   7785 |      1 |   5507 |00:00:00.03 |    9621 |      0 |      0 |       |       |          |
|*  7 |      TABLE ACCESS FULL                          | ATTRIBUTE_TYPES              |      1 |      1 |      1 |00:00:00.01 |      38 |      0 |      0 |       |       |          |
|   8 |    TABLE ACCESS BY INDEX ROWID BATCHED          | ATTRIBUTES                   |   7785 |      1 |   5507 |00:00:00.03 |   12182 |      0 |      0 |       |       |          |
|*  9 |     INDEX RANGE SCAN                            | UK_ATTR                      |   7785 |      1 |   5507 |00:00:00.02 |    9621 |      0 |      0 |       |       |          |
|* 10 |      TABLE ACCESS FULL                          | ATTRIBUTE_TYPES              |      1 |      1 |      1 |00:00:00.01 |      38 |      0 |      0 |       |       |          |
|  11 |    TABLE ACCESS BY INDEX ROWID BATCHED          | ATTRIBUTES                   |   1366 |      1 |   1366 |00:00:00.02 |    4592 |      0 |      0 |       |       |          |
|* 12 |     INDEX RANGE SCAN                            | IDX_ATTR_NDC_OBJECT_VALUE    |   1366 |      1 |   1366 |00:00:00.01 |    3227 |      0 |      0 |       |       |          |
|* 13 |      INDEX RANGE SCAN                           | NCI_NODES_COVERING_IDX       |   1366 |      1 |   1366 |00:00:00.01 |     595 |      0 |      0 |       |       |          |
|* 14 |    VIEW                                         |                              |      1 |     12 |   7785 |00:00:00.41 |   24174 |      0 |      0 |       |       |          |
|* 15 |     FILTER                                      |                              |      1 |        |   7891 |00:00:00.39 |   19582 |      0 |      0 |       |       |          |
|* 16 |      CONNECT BY WITH FILTERING                  |                              |      1 |        |  66134 |00:00:00.37 |   19144 |      0 |      0 |  7069K|  1062K| 6283K (0)|
|  17 |       TABLE ACCESS BY INDEX ROWID               | NODES                        |      1 |      1 |      1 |00:00:00.01 |       4 |      0 |      0 |       |       |          |
|* 18 |        INDEX UNIQUE SCAN                        | PK_NODES                     |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|  19 |       NESTED LOOPS                              |                              |      9 |     11 |  66133 |00:00:00.19 |   19137 |      0 |      0 |       |       |          |
|  20 |        CONNECT BY PUMP                          |                              |      9 |        |  66134 |00:00:00.01 |       0 |      0 |      0 |       |       |          |
|* 21 |        TABLE ACCESS BY INDEX ROWID BATCHED      | NODES                        |  66134 |     11 |  66133 |00:00:00.15 |   19137 |      0 |      0 |       |       |          |
|* 22 |         INDEX RANGE SCAN                        | NCI_NODES_PARENT_NODE_ID     |  66134 |     11 |  67807 |00:00:00.08 |   12139 |      0 |      0 |       |       |          |
|  23 |         TABLE ACCESS BY INDEX ROWID             | OBJECT_TYPES                 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|* 24 |          INDEX UNIQUE SCAN                      | UK_IDX_OBJECT_TYPE_NDC       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |      0 |       |       |          |
|  25 |       TABLE ACCESS BY INDEX ROWID               | OBJECT_TYPES                 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|* 26 |        INDEX UNIQUE SCAN                        | UK_IDX_OBJECT_TYPE_NDC       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |      0 |       |       |          |
|* 27 |      TABLE ACCESS BY INDEX ROWID                | OBJECT_TYPES                 |    219 |      1 |      1 |00:00:00.01 |     438 |      0 |      0 |       |       |          |
|* 28 |       INDEX UNIQUE SCAN                         | PK_OBJECT_TYPES              |    219 |      1 |    219 |00:00:00.01 |     219 |      0 |      0 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|  29 |   LOAD AS SELECT                                |                              |      1 |        |      0 |00:00:02.86 |   37654 |     53 |      2 |  1040K|  1040K|          |
|  30 |    TABLE ACCESS BY INDEX ROWID                  | OBJECTS                      |    316 |      1 |    316 |00:00:00.01 |     603 |      0 |      0 |       |       |          |
|* 31 |     INDEX UNIQUE SCAN                           | PK_OBJECTS                   |    316 |      1 |    316 |00:00:00.01 |     287 |      0 |      0 |       |       |          |
|  32 |    TABLE ACCESS BY INDEX ROWID BATCHED          | ATTRIBUTES                   |    316 |      1 |    316 |00:00:00.01 |     950 |      0 |      0 |       |       |          |
|* 33 |     INDEX RANGE SCAN                            | UK_ATTR                      |    316 |      1 |    316 |00:00:00.01 |     666 |      0 |      0 |       |       |          |
|* 34 |      TABLE ACCESS FULL                          | ATTRIBUTE_TYPES              |      1 |      1 |      1 |00:00:00.01 |      38 |      0 |      0 |       |       |          |
|  35 |    HASH UNIQUE                                  |                              |      1 |    148 |    316 |00:00:02.86 |   37650 |     53 |      0 |  1041K|  1041K| 1371K (0)|
|* 36 |     FILTER                                      |                              |      1 |        |   5500 |00:00:02.85 |   36097 |     53 |      0 |       |       |          |
|  37 |      MERGE JOIN CARTESIAN                       |                              |      1 |    148 |   5114K|00:00:02.23 |   34073 |     53 |      0 |       |       |          |
|* 38 |       HASH JOIN                                 |                              |      1 |     12 |    657 |00:00:01.05 |   34016 |      0 |      0 |  1003K|  1003K|  728K (0)|
|  39 |        NESTED LOOPS                             |                              |      1 |     69 |    969 |00:00:00.36 |   20145 |      0 |      0 |       |       |          |
|  40 |         NESTED LOOPS                            |                              |      1 |    132 |    970 |00:00:00.36 |   19975 |      0 |      0 |       |       |          |
|  41 |          VIEW                                   |                              |      1 |     12 |    312 |00:00:00.35 |   19582 |      0 |      0 |       |       |          |
|* 42 |           FILTER                                |                              |      1 |        |    312 |00:00:00.35 |   19582 |      0 |      0 |       |       |          |
|* 43 |            CONNECT BY WITH FILTERING            |                              |      1 |        |  66134 |00:00:00.34 |   19144 |      0 |      0 |  6219K|  1010K| 5527K (0)|
|  44 |             TABLE ACCESS BY INDEX ROWID         | NODES                        |      1 |      1 |      1 |00:00:00.01 |       4 |      0 |      0 |       |       |          |
|* 45 |              INDEX UNIQUE SCAN                  | PK_NODES                     |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|  46 |             NESTED LOOPS                        |                              |      9 |     11 |  66133 |00:00:00.18 |   19137 |      0 |      0 |       |       |          |
|  47 |              CONNECT BY PUMP                    |                              |      9 |        |  66134 |00:00:00.01 |       0 |      0 |      0 |       |       |          |
|* 48 |              TABLE ACCESS BY INDEX ROWID BATCHED| NODES                        |  66134 |     11 |  66133 |00:00:00.15 |   19137 |      0 |      0 |       |       |          |
|* 49 |               INDEX RANGE SCAN                  | NCI_NODES_PARENT_NODE_ID     |  66134 |     11 |  67807 |00:00:00.08 |   12139 |      0 |      0 |       |       |          |
|  50 |               TABLE ACCESS BY INDEX ROWID       | OBJECT_TYPES                 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|* 51 |                INDEX UNIQUE SCAN                | UK_IDX_OBJECT_TYPE_NDC       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |      0 |       |       |          |
|  52 |             TABLE ACCESS BY INDEX ROWID         | OBJECT_TYPES                 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|* 53 |              INDEX UNIQUE SCAN                  | UK_IDX_OBJECT_TYPE_NDC       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |      0 |       |       |          |
|* 54 |            TABLE ACCESS BY INDEX ROWID          | OBJECT_TYPES                 |    219 |      1 |      1 |00:00:00.01 |     438 |      0 |      0 |       |       |          |
|* 55 |             INDEX UNIQUE SCAN                   | PK_OBJECT_TYPES              |    219 |      1 |    219 |00:00:00.01 |     219 |      0 |      0 |       |       |          |
|* 56 |          INDEX RANGE SCAN                       | NCI_NODES_PARENT_NODE_ID     |    312 |     11 |    970 |00:00:00.01 |     393 |      0 |      0 |       |       |          |
|* 57 |         TABLE ACCESS BY INDEX ROWID             | NODES                        |    970 |      6 |    969 |00:00:00.01 |     170 |      0 |      0 |       |       |          |
|* 58 |        VIEW                                     | index$_join$_065             |      1 |     42 |      4 |00:00:00.69 |   13871 |      0 |      0 |       |       |          |
|* 59 |         HASH JOIN                               |                              |      1 |        |    434 |00:00:00.01 |      12 |      0 |      0 |  1519K|  1519K| 1491K (0)|
|  60 |          INDEX FAST FULL SCAN                   | PK_OBJECT_TYPES              |      1 |     42 |    434 |00:00:00.01 |       4 |      0 |      0 |       |       |          |
|  61 |          INDEX FAST FULL SCAN                   | UK_IDX_OBJECT_TYPE_NDC       |      1 |     42 |    434 |00:00:00.01 |       8 |      0 |      0 |       |       |          |
|  62 |       BUFFER SORT                               |                              |    657 |     12 |   5114K|00:00:00.63 |      57 |     53 |      0 |   372K|   372K|  330K (0)|
|  63 |        VIEW                                     |                              |      1 |     12 |   7785 |00:00:00.02 |      57 |     53 |      0 |       |       |          |
|  64 |         TABLE ACCESS FULL                       | SYS_TEMP_0FD9D761B_1445481D  |      1 |     12 |   7785 |00:00:00.02 |      57 |     53 |      0 |       |       |          |
|  65 |      TABLE ACCESS BY INDEX ROWID                | OBJECTS                      |    657 |      1 |    657 |00:00:00.01 |    1068 |      0 |      0 |       |       |          |
|* 66 |       INDEX UNIQUE SCAN                         | PK_OBJECTS                   |    657 |      1 |    657 |00:00:00.01 |     410 |      0 |      0 |       |       |          |
|  67 |      TABLE ACCESS BY INDEX ROWID BATCHED        | ATTRIBUTES                   |    318 |      1 |    318 |00:00:00.01 |     956 |      0 |      0 |       |       |          |
|* 68 |       INDEX RANGE SCAN                          | UK_ATTR                      |    318 |      1 |    318 |00:00:00.01 |     670 |      0 |      0 |       |       |          |
|* 69 |        TABLE ACCESS FULL                        | ATTRIBUTE_TYPES              |      1 |      1 |      1 |00:00:00.01 |      38 |      0 |      0 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|  70 |   SORT GROUP BY                                 |                              |      1 |      1 |     50 |00:00:14.59 |     273K|      2 |      0 |   619K|   471K|  550K (0)|
|  71 |    VIEW                                         |                              |      1 |      1 |   4375 |00:00:13.31 |     273K|      2 |      0 |       |       |          |
|  72 |     HASH UNIQUE                                 |                              |      1 |      1 |   4375 |00:00:13.31 |     273K|      2 |      0 |  1186K|  1186K| 1400K (0)|
|  73 |      TABLE ACCESS BY INDEX ROWID                | OBJECTS                      |   4606 |      1 |   4606 |00:00:05.59 |   37088 |      0 |      0 |       |       |          |
|* 74 |       INDEX UNIQUE SCAN                         | PK_OBJECTS                   |   4606 |      1 |   4606 |00:00:05.56 |   32472 |      0 |      0 |       |       |          |
|* 75 |      HASH JOIN                                  |                              |      1 |      1 |   4375 |00:00:13.29 |     273K|      2 |      0 |  1410K|  1075K| 1423K (0)|
|  76 |       NESTED LOOPS                              |                              |      1 |      1 |   4375 |00:00:00.07 |   12952 |      2 |      0 |       |       |          |
|  77 |        NESTED LOOPS                             |                              |      1 |      2 |   4375 |00:00:00.06 |   12593 |      2 |      0 |       |       |          |
|  78 |         NESTED LOOPS                            |                              |      1 |      1 |   4375 |00:00:00.05 |   11761 |      2 |      0 |       |       |          |
|* 79 |          HASH JOIN                              |                              |      1 |      1 |   5500 |00:00:00.01 |      60 |      2 |      0 |  1321K|  1321K| 1775K (0)|
|  80 |           VIEW                                  |                              |      1 |     12 |   7785 |00:00:00.01 |      54 |      0 |      0 |       |       |          |
|  81 |            TABLE ACCESS FULL                    | SYS_TEMP_0FD9D761B_1445481D  |      1 |     12 |   7785 |00:00:00.01 |      54 |      0 |      0 |       |       |          |
|  82 |           VIEW                                  |                              |      1 |    148 |    316 |00:00:00.01 |       6 |      2 |      0 |       |       |          |
|  83 |            TABLE ACCESS FULL                    | SYS_TEMP_0FD9D761C_1445481D  |      1 |    148 |    316 |00:00:00.01 |       6 |      2 |      0 |       |       |          |
|  84 |          TABLE ACCESS BY INDEX ROWID BATCHED    | ATTRIBUTES                   |   5500 |      1 |   4375 |00:00:00.04 |   11701 |      0 |      0 |       |       |          |
|* 85 |           INDEX RANGE SCAN                      | IDX_ATTR_NDC_OBJECT_VALUE    |   5500 |      1 |   4375 |00:00:00.02 |    7353 |      0 |      0 |       |       |          |
|* 86 |         INDEX RANGE SCAN                        | NCI_ATTRIBUTE_VALUES_ATTR_ID |   4375 |      2 |   4375 |00:00:00.01 |     832 |      0 |      0 |       |       |          |
|  87 |        TABLE ACCESS BY INDEX ROWID              | ATTRIBUTE_VALUES             |   4375 |      2 |   4375 |00:00:00.01 |     359 |      0 |      0 |       |       |          |
|  88 |       VIEW                                      |                              |      1 |   1730 |   4606 |00:00:13.21 |     260K|      0 |      0 |       |       |          |
|* 89 |        FILTER                                   |                              |      1 |        |   4606 |00:00:00.06 |    2094 |      0 |      0 |       |       |          |
|* 90 |         CONNECT BY WITH FILTERING               |                              |      1 |        |   4922 |00:00:00.05 |    2037 |      0 |      0 |   478K|   448K|  424K (0)|
|  91 |          NESTED LOOPS                           |                              |      1 |    148 |    316 |00:00:00.01 |     953 |      0 |      0 |       |       |          |
|  92 |           NESTED LOOPS                          |                              |      1 |    148 |    316 |00:00:00.01 |     637 |      0 |      0 |       |       |          |
|  93 |            VIEW                                 | VW_NSO_1                     |      1 |    148 |    316 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|  94 |             HASH UNIQUE                         |                              |      1 |    148 |    316 |00:00:00.01 |       3 |      0 |      0 |  2170K|  2170K| 2517K (0)|
|  95 |              VIEW                               |                              |      1 |    148 |    316 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|  96 |               TABLE ACCESS FULL                 | SYS_TEMP_0FD9D761C_1445481D  |      1 |    148 |    316 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|* 97 |            INDEX UNIQUE SCAN                    | PK_NODES                     |    316 |      1 |    316 |00:00:00.01 |     634 |      0 |      0 |       |       |          |
|  98 |           TABLE ACCESS BY INDEX ROWID           | NODES                        |    316 |      1 |    316 |00:00:00.01 |     316 |      0 |      0 |       |       |          |
|  99 |          NESTED LOOPS                           |                              |      2 |   1582 |   4606 |00:00:00.01 |    1081 |      0 |      0 |       |       |          |
| 100 |           CONNECT BY PUMP                       |                              |      2 |        |   4922 |00:00:00.01 |       0 |      0 |      0 |       |       |          |
|*101 |           TABLE ACCESS BY INDEX ROWID BATCHED   | NODES                        |   4922 |     11 |   4606 |00:00:00.01 |    1081 |      0 |      0 |       |       |          |
|*102 |            INDEX RANGE SCAN                     | NCI_NODES_PARENT_NODE_ID     |   4922 |     11 |   4608 |00:00:00.01 |     950 |      0 |      0 |       |       |          |
| 103 |            TABLE ACCESS BY INDEX ROWID          | OBJECT_TYPES                 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|*104 |             INDEX UNIQUE SCAN                   | UK_IDX_OBJECT_TYPE_NDC       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |      0 |       |       |          |
| 105 |          TABLE ACCESS BY INDEX ROWID            | OBJECT_TYPES                 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
|*106 |           INDEX UNIQUE SCAN                     | UK_IDX_OBJECT_TYPE_NDC       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |      0 |       |       |          |
|*107 |         TABLE ACCESS BY INDEX ROWID             | OBJECT_TYPES                 |      3 |      1 |      1 |00:00:00.01 |      57 |      0 |      0 |       |       |          |
|*108 |          INDEX UNIQUE SCAN                      | PK_OBJECT_TYPES              |      3 |      1 |      3 |00:00:00.01 |       3 |      0 |      0 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("O"."OBJECT_ID"=:B1)
   6 - access("A"."OBJECT_ID"=:B1 AND "A"."ATTRIBUTE_TYPE_ID"=)
   7 - filter("NAME_DISPLAY_CODE"='ATTRIBUTE_TYPE.MDR_CODELIST')
   9 - access("A"."OBJECT_ID"=:B1 AND "A"."ATTRIBUTE_TYPE_ID"=)
  10 - filter("NAME_DISPLAY_CODE"='ATTRIBUTE_TYPE.MDR_CODELIST')
  12 - access("A"."NAME_DISPLAY_CODE"='ATTRIBUTE_TYPE.MDR_ACTIVE_STATUS' AND "A"."OBJECT_ID"=)
  13 - access("NOD"."NODE_ID"=:B1)
  14 - filter("GROUP_ACTIVE_STATUS"='LOOKUP_VALUE.ACTIVE_STATUS_A')
  15 - filter(("DELETION_DATE"='01-Jan-1900' AND  IS NOT NULL AND "OBJECT_ID" IS NOT NULL))
  16 - access("N"."PARENT_NODE_ID"=PRIOR NULL)
  18 - access("N"."NODE_ID"=TO_NUMBER(:I_NODE_ID))
  21 - filter("N"."OBJECT_TYPE_ID"<>)
  22 - access("connect$_by$_pump$_029"."PRIOR n.node_id"="N"."PARENT_NODE_ID")
  24 - access("NAME_DISPLAY_CODE"='OBJECT_TYPE.ARCHIVE_CONTAINER')
  26 - access("NAME_DISPLAY_CODE"='OBJECT_TYPE.ARCHIVE_CONTAINER')
  27 - filter("NAME"=:I_SEARCH_OBJ_TYPE)
  28 - access("OBJECT_TYPE_ID"=:B1)
  31 - access("O"."OBJECT_ID"=:B1)
  33 - access("A"."OBJECT_ID"=:B1 AND "A"."ATTRIBUTE_TYPE_ID"=)
  34 - filter("AT"."NAME_DISPLAY_CODE"='ATTRIBUTE_TYPE.MDR_MASTER_UNIQUE_ITEM_ID')
  36 - filter(("CN"."CODELIST"= AND "CN"."CODELIST_MUI"=))
  38 - access("N"."OBJECT_TYPE_ID"="OBJECT_TYPE_ID")
  42 - filter(("DELETION_DATE"='01-Jan-1900' AND  IS NOT NULL))
  43 - access("PARENT_NODE_ID"=PRIOR NULL)
  45 - access("NODE_ID"=TO_NUMBER(:I_NODE_ID))
  48 - filter("OBJECT_TYPE_ID"<>)
  49 - access("connect$_by$_pump$_049"."PRIOR node_id "="PARENT_NODE_ID")
  51 - access("NAME_DISPLAY_CODE"='OBJECT_TYPE.ARCHIVE_CONTAINER')
  53 - access("NAME_DISPLAY_CODE"='OBJECT_TYPE.ARCHIVE_CONTAINER')
  54 - filter(("NAME_DISPLAY_CODE"='OBJECT_TYPE.MDR_CL_CONTAINER' OR "NAME_DISPLAY_CODE"='OBJECT_TYPE.MDR_STUDY_CL_PARENT_CONTAINER'))
  55 - access("OBJECT_TYPE_ID"=:B1)
  56 - access("N"."PARENT_NODE_ID"="NODE_ID")
  57 - filter("N"."DELETION_DATE"='01-Jan-1900')
  58 - filter(("NAME_DISPLAY_CODE"MEMBER OF"PKG_MDR_COMP_RPT_MGR"."F_GET_LOV_MAPPED_OBJTYPES"("VP40"."VARCHAR_TBL"('OBJECT_TYPE.MDR_CL_CODELIST')) OR
              "NAME_DISPLAY_CODE"MEMBER OF"PKG_MDR_COMP_RPT_MGR"."F_GET_LOV_MAPPED_OBJTYPES"("VP40"."VARCHAR_TBL"('OBJECT_TYPE.MDR_CL_SUBSET'))))
  59 - access(ROWID=ROWID)
  66 - access("O"."OBJECT_ID"=:B1)
  68 - access("A"."OBJECT_ID"=:B1 AND "A"."ATTRIBUTE_TYPE_ID"=)
  69 - filter("AT"."NAME_DISPLAY_CODE"='ATTRIBUTE_TYPE.MDR_MASTER_UNIQUE_ITEM_ID')
  74 - access("O"."OBJECT_ID"=NVL(:B1,"PKG_MDR_UTIL"."F_GET_REF_NODE_OBJECT_ID"(:B2,"REQ_INFO_TYPE"("USER_CREDENTIALS_TYPE"(1,NULL,'en-US'),
              "AUDIT_INFO_TYPE"(NULL,NULL,NULL),NULL,NULL,NULL))))
  75 - access("COD"."CL_NODE_ID"="CL"."NODE_ID" AND "AV"."ATTRIBUTE_VALUE"="COD"."OBJ_NAME")
  79 - access("CN"."CODELIST_MUI"="CL"."MUI_VALUE")
  85 - access("A"."NAME_DISPLAY_CODE"='ATTRIBUTE_TYPE.MDR_CODELIST_VALUE' AND "CN"."OBJECT_ID"="A"."OBJECT_ID")
  86 - access("A"."ATTRIBUTE_ID"="AV"."ATTRIBUTE_ID")
  89 - filter(("DELETION_DATE"='01-Jan-1900' AND  IS NOT NULL))
  90 - access("N"."PARENT_NODE_ID"=PRIOR NULL)
  97 - access("N"."NODE_ID"="NODE_ID")
 101 - filter("N"."OBJECT_TYPE_ID"<>)
 102 - access("connect$_by$_pump$_082"."PRIOR n.node_id "="N"."PARENT_NODE_ID")
 104 - access("NAME_DISPLAY_CODE"='OBJECT_TYPE.ARCHIVE_CONTAINER')
 106 - access("NAME_DISPLAY_CODE"='OBJECT_TYPE.ARCHIVE_CONTAINER')
 107 - filter("NAME_DISPLAY_CODE"MEMBER OF"PKG_MDR_COMP_RPT_MGR"."F_GET_LOV_MAPPED_OBJTYPES"("VP40"."VARCHAR_TBL"('OBJECT_TYPE.MDR_CL_CODE')))
 108 - access("OBJECT_TYPE_ID"=:B1)

Any observations welcome. I’m not expecting many people to see the anomaly I see (and there may be further anomalies I haven’t even looked for that others do see straight away), but it’s possible that the pattern is one that some people frequently see and find totally unsurprising.

Update – where’s the anomaly

The anomaly is the presence of operations 73 and 74.

There are two different observations that make these lines stand out. First, operation 72 is a hash unique which is a “single child” operation that calls its child to supply a rowsource and then reduces that rowsource to a distinct set using a hashing mechanism. But in this plan we can see that operation 72 appears to have two child operations – numbers 73 and 75 – so clearly the plan isn’t following the pure “standard” pattern.

Secondly, notice that operations 73 and 74 both report 4,606 Starts. An operation that reports "N" starts has to have a parent operation calling it N times, which means the parent operation must have reported (at least) N rows under the A-Rows heading. But we know that the hash unique operation will call its child operation exactly once – and we can see that the hash unique here has only been called once. So something else must be causing the 4,606 Starts.

Fortunately we remember that “scalar subqueries in the select list” will report their execution plans above the part of the plan that describes the main body of the query. In fact we can see this several times in the two “load as select” parts of this plan; operations (3,4), (5,6,7), (8,9,10), (11,12,13) describe 4 scalar subqueries that must be embedded in the select list of the first “with” subquery that is described by operations 14 – 28.
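
If you haven't come across this pattern before, a trivial example (my own sketch, nothing to do with the OP's system) shows the basic shape:


create table t1 as
select * from all_objects where rownum <= 1000;

select
        t.object_name,
        (select count(*) from t1 x where x.owner = t.owner)     owner_ct
from
        t1 t
where
        rownum <= 10
;

In recent versions the plan for this query typically reports the SORT AGGREGATE and tablescan of the scalar subquery as operations 1 and 2, above the COUNT STOPKEY and tablescan of the main query body.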

So we could assume, for the moment, that operations 73 and 74 are in some way an inline scalar subquery in a select list – and that leads to the next step in the problem. A scalar subquery will operate at most once for each row returned in the main rowsource – though the number of starts might be reduced by the effects of scalar subquery caching. Operations 73 and 74 start 4,606 times, but the rowsource we feel they ought to be associated with is the hash join immediately below them (operation 75), which returns only 4,375 rows; moreover the first child of the hash join also returns 4,375 rows – so we're not seeing enough rows returned to justify our second attempt at interpreting the plan.

So where can we find something that returns 4,606 (or more) rows that would allow us to say "maybe that's where the scalar subquery goes"?

Look further down the plan – operation 88 (the view operation that constitutes the second child of the hash join) reports A-Rows = 4,606. Operations 73 and 74 really ought to be reported between operations 88 and 89. There's a filter at operation 89 that reduces the 4,922 rows produced by operation 90 to 4,606, and it's after that that the scalar subquery runs to add a column to the rowsource before passing it upwards. (We might be able to see this if we had the projection information for the plan.)

Corroborating this claim we can look at the A-Time for operation 88: it's 13.21 seconds and there's nothing below it that accounts for that time; but if we insert operations 73 and 74 just below operation 88 we suddenly have 4,606 subquery calls which report 5.59 seconds – a step in the right direction for identifying the 13.21 seconds that appeared "from nowhere" – especially when you notice that the predicate for operation 74 (or 88a) calls a PL/SQL packaged function whose arguments are either calls to three more PL/SQL functions or constructors for three user-defined types, probably using a fair amount of unrecorded time.

Conclusion

Scalar subqueries in select lists can dump their execution plans in places you don't expect. We know that the plan for a scalar subquery in the select list of a simple query will report itself above the main body of the query plan. Here we have an example of a scalar subquery that reports itself one step further out than we would intuitively expect and, quite possibly, if we hadn't had the rowsource execution statistics to guide us, we wouldn't have been able to work out what the plan was really trying to tell us.

Footnote

Since Andy Sayer had been commenting on the same ODC thread I emailed him a brief version of the above notes last night, and he created, and emailed to me, a very simple example to reproduce this behaviour which I’ve also tested on 11.2.0.4.
