This is an example from the Oracle Developer Community of using the output of SQL Monitor to detect a problem with object statistics that resulted in an extremely poor choice of execution plan.
A short time after posting the original statement of the problem, the OP identified where he thought the problem was and the general principle of why he thought he had a problem – so I didn’t have to read the entire execution plan to work out a strategy that would be (at least) a step in the right direction of solving the performance problem.
This note, then, is just a summary of the five minutes I spent confirming the OP’s hypothesis and explaining how to work around the problem he had identified. It does, however, give a little lead-in to the comments I made to the OP in order to give a more rounded picture of what his execution plan was telling us.
So here’s the top half of the plan (which is the first subquery of a “concatenation”) with the first few predicates:
===============================================================================================================================================================================================================================
| Id | Operation | Name | Rows (Estim) | Cost | Time Active(s) | Start Active | Execs | Rows (Actual) | Read Reqs | Read Bytes | Cell Offload | Mem (Max) | Activity (%) | Activity Detail (# samples) |
===============================================================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 12086 | +5 | 1 | 864K | | | | | 0.01 | Cpu (1) |
| 1 | CONCATENATION | | | | 12086 | +5 | 1 | 864K | | | | | | |
| 2 | FILTER | | | | 12191 | +4 | 1 | 864K | | | | | 0.03 | Cpu (4) |
| 3 | FILTER | | | | 12191 | +4 | 1 | 26M | | | | | 0.01 | Cpu (1) |
| 4 | NESTED LOOPS | | 241 | 251K | 12191 | +4 | 1 | 26M | | | | | 0.02 | Cpu (3) |
| 5 | NESTED LOOPS | | 241 | 251K | 12191 | +4 | 1 | 26M | | | | | 0.07 | Cpu (8) |
| 6 | NESTED LOOPS | | 241 | 251K | 12232 | +4 | 1 | 26M | | | | | 0.05 | Cpu (6) |
| 7 | NESTED LOOPS | | 5407 | 233K | 12242 | +4 | 1 | 86M | | | | | | |
| 8 | MERGE JOIN CARTESIAN | | 1 | 35 | 12242 | +4 | 1 | 1000 | | | | | | |
| 9 | TABLE ACCESS BY INDEX ROWID | REF1 | 1 | 3 | 1 | +4 | 1 | 1 | | | | | | |
| 10 | INDEX RANGE SCAN | REF1_PK | 1 | 2 | 12242 | +4 | 1 | 1 | | | | | | |
| 11 | BUFFER SORT | | 84 | 32 | 12242 | +4 | 1 | 1000 | | | | 104K | | |
| 12 | TABLE ACCESS BY INDEX ROWID | STAGE | 84 | 32 | 1 | +4 | 1 | 1000 | | | | | | |
| 13 | INDEX RANGE SCAN | STAGE_IDX1 | 84 | 4 | 1 | +4 | 1 | 1000 | | | | | | |
| 14 | PARTITION RANGE ITERATOR | | 8292 | 232K | 12232 | +4 | 1000 | 86M | | | | | | |
| 15 | TABLE ACCESS STORAGE FULL | TAB1 | 8292 | 232K | 12245 | +1 | 1000 | 86M | 103M | 521GB | 1.96% | 7M | 51.81 | gc buffer busy acquire (1), latch: cache buffers chains (1), Cpu (1196), gcs drm freeze in enter server mode (2), reliable message (5), cell single block physical read (2827), cell smart table scan (1977), read by other session (304) |
| 16 | PARTITION RANGE ITERATOR | | 1 | 12 | 12191 | +4 | 86M | 26M | | | | | 0.42 | Cpu (51) |
| 17 | TABLE ACCESS BY LOCAL INDEX ROWID | TAB2 | 1 | 12 | 12191 | +4 | 86M | 26M | 4M | 28GB | | | 32.14 | gc cr grant 2-way (20), gc cr request (2), gc remaster (6), Cpu (319), gcs drm freeze in enter server mode (4), latch: gc element (2), cell single block physical read (3563) |
| 18 | INDEX RANGE SCAN | TAB2_IX1 | 166 | 3 | 12210 | +2 | 86M | 26M | 1M | 11GB | | | 15.17 | Cpu (292), cell single block physical read (1557) |
| 19 | INDEX UNIQUE SCAN | MTD_PK | 1 | 1 | 12242 | +4 | 26M | 26M | 292 | 2MB | | | 0.17 | Cpu (20), cell single block physical read (1) |
| 20 | TABLE ACCESS BY INDEX ROWID | REF2 | 1 | 2 | 12191 | +4 | 26M | 26M | 7 | 57344 | | | 0.11 | Cpu (13) |
| 21 | TABLE ACCESS BY INDEX ROWID | CONTROLTAB | 1 | 1 | 1 | +4 | 1 | 1 | | | | | | |
| 22 | INDEX UNIQUE SCAN | CONTROLTAB_PK | 1 | | 1 | +4 | 1 | 1 | | | | | | |
| 23 | MINUS | | | | 102 | +4 | 25 | 3 | | | | | | |
| 24 | TABLE ACCESS BY INDEX ROWID | CUST_ORG_PK | 1 | 3 | 942 | +4 | 25 | 10 | | | | | | |
| 25 | INDEX UNIQUE SCAN | MC_PK | 1 | 2 | 942 | +4 | 25 | 25 | | | | | | |
| 26 | SORT UNIQUE NOSORT | | 1 | 4 | 8 | +4 | 25 | 9 | | | | | | |
| 27 | TABLE ACCESS BY INDEX ROWID | REF1 | 1 | 3 | 8 | +4 | 25 | 9 | | | | | | |
| 28 | INDEX RANGE SCAN | REF1_PK | 1 | 2 | 8 | +4 | 25 | 9 | | | | | | |

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( EXISTS ( (SELECT /*+ INDEX_RS_ASC ("CUST_ORG_PK" "MC_PK") */ "CUST_ID" FROM "USER1"."CUST_ORG_PK" "CUST_ORG_PK"
       WHERE "CUST_ID"=:B1 AND "CUST_ORG_PK"."INDICATR"='Y')
       MINUS
       (SELECT /*+ INDEX_RS_ASC ("REF1" "REF1_PK") */ TO_NUMBER("VAL") FROM "USER1"."REF1" "REF1"
       WHERE "PUSER"='ZZZ' AND "EDATE" .ge. TRUNC(SYSDATE@!) AND TO_NUMBER("VAL")=:B2 AND "SDATE" .le. TRUNC(SYSDATE@!))))
   3 - filter( EXISTS (SELECT /*+ INDEX_RS_ASC ("CONTROLTAB" "CONTROLTAB_PK") */ 0 FROM "USER2"."CONTROLTAB" "CONTROLTAB"
       WHERE "CONTROLTAB"."CNTRLID"=9999 AND NVL("CONTROLTAB"."STATUS",'F')='S'))
   9 - filter("REF1"."EDATE" .ge. TRUNC(SYSDATE@!))
  10 - access("REF1"."PUSER"='XXX' AND "REF1"."NAME"='CODE' AND "REF1"."SDATE" .le. TRUNC(SYSDATE@!))
  13 - access("STAGE"."NAME"='XXX' AND "STAGE"."STAT"='I')

Note: various inequality symbols changed to .le. / .ge. to avoid a WordPress formatting issue.
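For reference, a report in this format comes from SQL Monitor, so – assuming you’re appropriately licensed (the Diagnostics and Tuning packs) – you can generate one for a long-running statement with a call to dbms_sqltune.report_sql_monitor along the following lines; the substitution variable is just a placeholder for the sql_id of the statement you’re interested in:

set long 1000000 longchunksize 1000000 linesize 250 pagesize 0 trimspool on

select
        dbms_sqltune.report_sql_monitor(
                sql_id       => '&m_sql_id',    -- placeholder for the target statement's sql_id
                type         => 'TEXT',
                report_level => 'ALL'
        )
from    dual
/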
The first thing to note is that the “Time Active(s)” figure reported for the top line is about 12,000 seconds – so it’s a long-running query. Your next observation – before you look at the shape of the plan – might be to note that operations 15, 17 and 18 between them record thousands of seconds of time, mostly I/O time but including 1,200 seconds of CPU time. This might draw your eye to the part of the plan that tells you what you are doing at these operations and why you are doing it.
Looking at the detail – operation 15 is a full tablescan that is the child of a partition range iterator (operation 14), and that iterator is the second child of a nested loop join (operation 7). Why would the optimizer think that a tablescan of (possibly) multiple partitions of a partitioned table is a good candidate for the second child of a nested loop?! The answer comes from the first child – the Merge Join Cartesian at operation 8 has been given a cardinality estimate of just one row. When the cardinality estimate is one for the first table in a join it doesn’t matter whether Oracle uses a nested loop join or a hash join; whatever happens next is only supposed to happen once.
Unfortunately when we compare “Rows (Estim)” and “Rows (Actual)” for the operation we see that the Merge Join Cartesian produced 1,000 rows – so the partition tablescan was going to happen 1,000 times (which we can confirm from the “Execs” column of operation 14). As a first step, then, we need to ensure that the optimizer gets a better estimate of the number of rows produced by the Cartesian join. So let’s look at its child operations.
- Operation 9 (Table Access of REF1) is predicted to return one row – with “Rows (actual)” = 1.
- Operation 11 (Buffer Sort of data from STAGE) is predicted to return 84 rows – with “Rows (actual)” = 1,000
Since the number of rows produced by a Cartesian join should be the product of the number of rows of its two inputs, this tells us that the optimizer’s estimate of the number of rows from REF1 has been rounded up to 1 from a very small fraction (less than 1/84). If we can correct this then we may get Oracle to change the awful nested loop to an acceptable hash join. Even if we can’t fix this mis-estimate we may be able to do something that improves the estimate for STAGE to something sufficiently large that it will trigger the switch to a hash join. So let’s look at the predicates for these two tables.
REF1 Predicates
 9 - filter("REF1"."EDATE">=TRUNC(SYSDATE@!))
10 - access("REF1"."PUSER"='XXX' AND "REF1"."NAME"='CODE' AND "REF1"."SDATE"<=TRUNC(SYSDATE@!))
The index range scan is based on an access predicate (with no filter predicate), so it looks like there’s a nearly perfect three-column index on REF1, but the optimizer can’t use the number of distinct keys in the index to get a good estimate of cardinality because one of the predicates is range-based. So the arithmetic will look at the three predicates separately and multiply up their selectivities. (It’s possible, of course, that these are just the first three columns of a four-column, or longer, index.)
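If you want to see the numbers that feed that arithmetic you could pull the column and index definitions from the data dictionary. The following is just a sketch – it assumes you can log on as the owner of REF1, and it uses the table, index, and column names that appear in the plan and predicates:

select
        column_name, num_distinct, density, num_nulls, histogram, num_buckets
from    user_tab_col_statistics
where   table_name = 'REF1'
and     column_name in ('PUSER', 'NAME', 'SDATE', 'EDATE')
order by
        column_name
/

select
        ind.index_name, col.column_name, col.column_position, ind.distinct_keys, ind.num_rows
from    user_indexes ind, user_ind_columns col
where   ind.table_name = 'REF1'
and     col.index_name = ind.index_name
order by
        ind.index_name, col.column_position
/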
It’s a reasonable guess that the number of distinct combinations of (puser, name) will be much smaller than num_distinct(puser) * num_distinct(name) – so one strategy that might help increase the table’s cardinality estimate is to create extended statistics on the column group (puser, name).
Another reasonable guess is that the number of distinct values for the two columns is (relatively) small, with some skew to the distribution (name = ‘CODE’ looks particularly susceptible to being a commonly occurring value) – so perhaps we need a histogram on one or both of the columns (which would then require a histogram to be created on the column group as well if we wanted the optimizer to use the column group). We’d also have to make sure that the queried values didn’t fall outside the known low/high values for the columns if we wanted the column group to be used.
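As a minimal sketch of the idea – and no more than a sketch, since the owner, the 254 bucket count, and the decision to leave the other column stats untouched are all assumptions to be reviewed – naming the column group in method_opt both creates the extension (if it doesn’t already exist) and gathers a histogram on it in a single call:

begin
        dbms_stats.gather_table_stats(
                ownname       => user,          -- assumption: run as the owner of REF1
                tabname       => 'REF1',
                method_opt    => 'for columns (puser, name) size 254',
                no_invalidate => false
        );
end;
/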
STAGE Predicates
13 - access("STAGE"."NAME"='XXX' AND "STAGE"."STAT"='I')
This is the access(-only) predicate for the index stage_idx1, and there are no filter predicates when we reach the table. So stage_idx1 might be a two-column index that we are using in its entirety, or it may be an index with more columns of which we are using only the first two. We can see that the cardinality estimate is out by a factor of about 12 (84 predicted, 1,000 actual), so if this is the complete index (which would allow Oracle to use the distinct_keys value to estimate cardinality) there must be an uneven data distribution in the values; but if this is just the first two columns of a longer index then we might benefit from extended stats (viz: another column group) on this pair of columns.
Again, even if we create a column group, or take automatic advantage of the distinct_keys figure, the predicate STAT=’I’ (is that state, or status?) looks a little dangerous – status columns tend to have a small number of distinct values with a significant skew to the distribution of values – so we may want to create a histogram on the STAT column, which would then require us to create a histogram on the column group if we also wanted the column group stats to have an effect.
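Again, purely as a hedged sketch (with the same caveats about owner and bucket counts as before), you could create the column group explicitly and then gather a histogram on stat and on the group:

select  dbms_stats.create_extended_stats(user, 'STAGE', '(NAME, STAT)')
from    dual
/

begin
        dbms_stats.gather_table_stats(
                ownname    => user,             -- assumption: run as the owner of STAGE
                tabname    => 'STAGE',
                method_opt => 'for columns stat size 254, (name, stat) size 254'
        );
end;
/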
What Happened Next?
I made the suggestions about column groups and histograms to the OP – without the level of detail that I’ve given in the explanations above – and got the following response:
You are spot on.
There does exists frequency histogram on column NAME(having 14 distinct value) and STAT(having 7 distinct values) of table STAGE. Also there already exists a frequency histogram on column PUSER and height balanced histogram on column NAME of table REF1. But still they were not helping the existing query.
By creating a column group on both on ref1 (puser, name) and stage(name, stat) with histogram for both i.e. ‘FOR COLUMNS SIZE 254’. The merge Cartesian removed automatically from the path by optimizer and its doing a hash join now with TAB1 finishing the query in quick time.
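As a quick sanity check after a fix like that – and this is just a sketch of the sort of query you might run – you can confirm that the extensions exist and that the system-generated virtual columns behind them have picked up histograms:

select  table_name, extension_name, extension
from    user_stat_extensions
where   table_name in ('REF1', 'STAGE')
/

select  table_name, column_name, num_distinct, histogram, num_buckets
from    user_tab_col_statistics
where   table_name in ('REF1', 'STAGE')
and     column_name like 'SYS_STU%'     -- system-generated names of the column-group virtual columns
/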
Summary
When the cardinality (rows) estimate for an operation drops to one (which might mean much less than one and rounded up) then all hell can break loose and the choice of parent operation – and its cardinality estimate – might be insanely stupid, leading to a catastrophically poor execution plan.
Very low cardinality estimates are often the result of multiplying individual column selectivities to produce an estimated selectivity that is unrealistic (much too small) when compared with the actual number of distinct combinations that exist in the table. In such cases creating a column group (possibly with a histogram) may be all you need to do to get a better cardinality estimate and a vastly improved execution plan.