Oracle Scratchpad

August 25, 2021

qbregistry 2

Filed under: CBO,dbms_xplan,Oracle,Transformations — Jonathan Lewis @ 1:45 pm BST Aug 25,2021

Following a question (very similar to one I had been asking myself) that appeared on twitter in response to my original posting on the new qbregistry format option in the dbms_xplan package, I’ve drafted a note on how I interpreted the execution plan so that I could see more clearly how my visualisation of the transformation maps (or fails to map) to the Query Block Registry.

I can’t guarantee the correctness of the description I’ve given here, but it’s probably fairly accurate.

Original Query (hiding the unnest and no_semijoin hints)

select  
        /* sel$1 */
        * 
from    t1 
where   t1.owner = 'OUTLN' 
and     object_name in (
                select  /* sel$2 */
                        distinct t2.object_name 
                from   t2 
                where  t2.object_type = 'TABLE'
        )
;

Transformation 1: unnest the subquery

This is visible in the Outline Information of the execution plan as the hint UNNEST(@"SEL$2" UNNEST_INNERJ_DISTINCT_VIEW) and produces two new query blocks: the query block that "is" the unnested subquery and the query block that joins t1 to the unnested subquery vw_nso_1.

select  
        /* SEL$5DA710D3 */
        t1.* 
from
        t1,
        (
        select  /* SEL$683B0107 */
                distinct
                t2.object_name 
        from    t2 
        where   t2.object_type = 'TABLE'
        )       vw_nso_1
where
        t1.owner = 'OUTLN' 
and     vw_nso_1.object_name = t1.object_name



Transformation 2: view merge (join then aggregate)

This is visible in the Outline Information of the execution plan in the hint MERGE(@"SEL$683B0107" >"SEL$5DA710D3"). I think this produces three new query blocks: the block that "is" the merged view, a block that selects (projects) from the merged view, and the query block that the main query now becomes.

We will pretend that t1 has only 4 columns, owner, object_name, object_type, object_id.

select
        /* SEL$B186933D */
        vm_nwvw_2.owner,
        vm_nwvw_2.object_name,
        vm_nwvw_2.object_type,
        vm_nwvw_2.object_id
from    (
        select  /* SEL$2F1334C4 */
                -- no distinct, and t2.object_name and t1.rowid eliminated
                t1.owner,
                t1.object_name,
                t1.object_type,
                t1.object_id
        from    (
                select  /* SEL$88A77D12 */
                        distinct
                        t1.rowid,        -- ensures we don't duplicate t1 rows
                        t1.owner,
                        t1.object_name,
                        t2.object_name,  -- seems redundant, but is in the trace file
                        t1.object_type,
                        t1.object_id
                from
                        t1,
                        t2
                where
                        t2.object_type = 'TABLE'
                and     t1.owner = 'OUTLN'
                and     t1.object_name = t2.object_name
                ) 
        ) vm_nwvw_2
;

Transformation 3: aggregate into partial join

I realised only as I was writing this note that I had completely forgotten that the plan reported a semi join even though the subquery had been hinted with a no_semijoin hint, and that the reported semi join was actually a partial join.

However, the query block registry is identical with or without a partial join (controlled by the hint [no]partial_join) so there doesn’t seem to be a transformation stage corresponding to the choice of strategy. Maybe the apparently redundant query block layer allows the variation to appear if required.
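
If you want to check that for yourself, one quick test (a sketch, reusing the query from footnote 3 of the original post below and the query block syntax from its Outline Data) is to re-run the query with the partial join explicitly disabled and compare the resulting plan and qbregistry output with the original:

select
        /*+ no_partial_join(@"SEL$B186933D" "T2"@"SEL$2") */
        *
from    t1
where   owner = 'OUTLN'
and     object_name in (
                select  /*+ unnest no_semijoin */
                        distinct object_name
                from   t2
                where  object_type = 'TABLE'
        )
;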

It’s Difficult

A problem I have with the query block registry is deciding what it’s telling us – and maybe the trace file and the execution plan are not trying to tell us exactly the same thing. I think it’s quite difficult, anyway, to find a good way of presenting the information that is completely informative but clear and uncluttered.

Something that may help, when you can check the trace file and the final execution plan, is the order in which query blocks are registered. Some of them may be discarded, of course, as the optimizer works through options, some of them may be marked as COPY, but if you ignore those you may be able to see from what’s left the evolution of the plan. Here, for example, is the extract of the lines where each query block is registered, taken from the CBO trace for this query, with numbering:

32:Registered qb: SEL$1 0xc26e28e8 (PARSER)
36:  signature (): qb_name=SEL$1 nbfros=1 flg=0
37:    fro(0): flg=4 objn=76167 hint_alias="T1"@"SEL$1"

39:Registered qb: SEL$2 0xc26e0d58 (PARSER)
43:  signature (): qb_name=SEL$2 nbfros=1 flg=0
44:    fro(0): flg=4 objn=76168 hint_alias="T2"@"SEL$2"

966:Registered qb: SEL$683B0107 0xbcc187c8 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
970:  signature (): qb_name=SEL$683B0107 nbfros=1 flg=0
971:    fro(0): flg=0 objn=76168 hint_alias="T2"@"SEL$2"

973:Registered qb: SEL$5DA710D3 0xbb1fcb10 (SUBQUERY UNNEST SEL$1; SEL$2;)
977:  signature (): qb_name=SEL$5DA710D3 nbfros=2 flg=0
978:    fro(0): flg=0 objn=76167 hint_alias="T1"@"SEL$1"
979:    fro(1): flg=5 objn=0 hint_alias="VW_NSO_1"@"SEL$5DA710D3"

1471:Registered qb: SEL$2F1334C4 0xbcf4b210 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3)
1475:  signature (): qb_name=SEL$2F1334C4 nbfros=1 flg=0
1476:    fro(0): flg=5 objn=0 hint_alias="VM_NWVW_2"@"SEL$2F1334C4"

1478:Registered qb: SEL$88A77D12 0xbcda5540 (PROJECTION VIEW FOR CVM SEL$683B0107)
1482:  signature (): qb_name=SEL$88A77D12 nbfros=2 flg=0
1483:    fro(0): flg=0 objn=76167 hint_alias="T1"@"SEL$1"
1484:    fro(1): flg=1 objn=0 hint_alias="VW_NSO_1"@"SEL$5DA710D3"

1489:Registered qb: SEL$B186933D 0xbcda5540 (VIEW MERGE SEL$88A77D12; SEL$683B0107; SEL$5DA710D3)
1493:  signature (): qb_name=SEL$B186933D nbfros=2 flg=0
1494:    fro(0): flg=0 objn=76167 hint_alias="T1"@"SEL$1"
1495:    fro(1): flg=0 objn=76168 hint_alias="T2"@"SEL$2"

Because it’s a very simple query you can almost see the “thinking” in the clumping of the line numbers.

  • The first two registrations are the original query blocks.
  • After a break the next pair is the t2 subquery being unnested and the query which is the join between t1 and the unnested t2.
  • After another break we see, in rapid succession, the view using the merged join view, the projection view using that merged join view, then the query block selecting from that projection.

The thing I find difficult to keep clear in my mind (when trying to describe what the trace/registry is saying, that is, not when just reading the plan) is the “doubling” effect where transformation steps often seem to produce two query blocks, one for the inline view containing the transformed object and one for the query that is now using the transformed object; and a further source of confusion appears when a query block seems to be able to peek into an inner query block to reference the objects in it. I just keep losing track of the layers!

It’s probably as safe as it’s going to be to read this note (unless someone points out an error). I don’t think there’s any more that I could find to say about the example.

August 24, 2021

qbregistry

Filed under: CBO,dbms_xplan,Oracle,Transformations — Jonathan Lewis @ 11:54 am BST Aug 24,2021

If you look at the “Outline Information” from an execution plan it shows you a list of hints that will (in theory, at least) recreate the execution plan and it’s this information that gets stored as the “injection” part of an SQL Plan Baseline. Unfortunately the hints won’t necessarily allow you to infer what transformations the optimizer has used to get to the final execution plan.

If you’re prepared to generate the CBO trace file you could examine the Query Block Registry that appears near the end of the trace file to get some clues – here’s an example from 19.11.0.0 for a simple query involving a single table plus an IN subquery:

Query Block Registry:
SEL$2 0x6d47cde8 (PARSER)
  SEL$5DA710D3 0x6d480e60 (SUBQUERY UNNEST SEL$1; SEL$2;)
    SEL$2F1334C4 0x6d480e60 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3) [FINAL]
  SEL$683B0107 0x6d47cde8 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
    SEL$B186933D 0x6d48e3a8 (VIEW MERGE SEL$88A77D12; SEL$683B0107; SEL$5DA710D3) [FINAL]
    SEL$88A77D12 0x6d48e3a8 (PROJECTION VIEW FOR CVM SEL$683B0107)
      SEL$B186933D 0x6d48e3a8 (VIEW MERGE SEL$88A77D12; SEL$683B0107; SEL$5DA710D3) [FINAL]
SEL$1 0x6d480e60 (PARSER)
  SEL$5DA710D3 0x6d480e60 (SUBQUERY UNNEST SEL$1; SEL$2;)
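
If you want to generate a trace file like this for yourself, here’s a minimal sketch of the sort of thing I do (the traditional 10053 event – the statement has to be optimised while the event is set, so use "explain plan" or make sure the statement isn’t simply picked up from the library cache):

alter session set tracefile_identifier = 'qb_registry';
alter session set events '10053 trace name context forever, level 1';

explain plan for
select * from dual;             -- replace with the query of interest

alter session set events '10053 trace name context off';

select value from v$diag_info where name = 'Default Trace File';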

I’m not going to say anything about interpreting this extract because I want to highlight a recent feature of the dbms_xplan package (brought to my attention by Franck Pachot some time ago). One of the format options for displaying execution plans will report the query block registry. Here’s the output from display_cursor(format=>'qbregistry') in 21.3.0.0 for the query that produced the CBO trace extract above:

Query Block Registry:
---------------------
  SEL$1 (PARSER)
    SEL$5DA710D3 (SUBQUERY UNNEST SEL$1 ; SEL$2)
      SEL$2F1334C4 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3) [FINAL]
  SEL$2 (PARSER)
    SEL$683B0107 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
      SEL$88A77D12 (PROJECTION VIEW FOR CVM SEL$683B0107)
        SEL$B186933D (VIEW MERGE SEL$88A77D12 ; SEL$683B0107) [FINAL]
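
For reference, the call that produced this output was something like the following (a sketch – supply the sql_id and child number of the statement you’re interested in, or omit them and the report will cover the last statement executed in the session):

set linesize 156
set pagesize 90

select  *
from    table(
                dbms_xplan.display_cursor(
                        sql_id          => '&m_sql_id',
                        cursor_child_no => 0,
                        format          => 'qbregistry'
                )
        )
/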

Two things to notice here – first, that the output has reduced the 9 lines to 7 lines (which can only be helpful); second, that the redundant memory addresses which appear in the trace file don’t get copied into the report.

I’m still not going to say anything about interpreting the output because I want to show you the display_cursor() output for the same query when executed in 19.11.0.0. It looks like this:

Query Block Registry:
---------------------

  <q o="13"><n><![CDATA[SEL$88A77D12]]></n><p><![CDATA[SEL$683B0107]]></p><
        f><h><t><![CDATA[T1]]></t><s><![CDATA[SEL$1]]></s></h><h><t><![CDATA[VW_N
        SO_1]]></t><s><![CDATA[SEL$5DA710D3]]></s></h></f></q>
  <q o="12"><n><![CDATA[SEL$683B0107]]></n><p><![CDATA[SEL$2]]></p><f><h><t
        ><![CDATA[T2]]></t><s><![CDATA[SEL$2]]></s></h></f></q>
  <q o="2"><n><![CDATA[SEL$1]]></n><f><h><t><![CDATA[T1]]></t><s><![CDATA[S
        EL$1]]></s></h></f></q>
  <q o="2"><n><![CDATA[SEL$2]]></n><f><h><t><![CDATA[T2]]></t><s><![CDATA[S
        EL$2]]></s></h></f></q>
  <q o="18" f="y" h="y"><n><![CDATA[SEL$B186933D]]></n><p><![CDATA[SEL$88A7
        7D12]]></p><i><o><t>VW</t><v><![CDATA[SEL$683B0107]]></v></o></i><f><h><t
        ><![CDATA[T1]]></t><s><![CDATA[SEL$1]]></s></h><h><t><![CDATA[T2]]></t><s
        ><![CDATA[SEL$2]]></s></h></f></q>
  <q o="19" h="y"><n><![CDATA[SEL$5DA710D3]]></n><p><![CDATA[SEL$1]]></p><i
        ><o><t>SQ</t><v><![CDATA[SEL$2]]></v></o></i><f><h><t><![CDATA[T1]]></t><
        s><![CDATA[SEL$1]]></s></h><h><t><![CDATA[VW_NSO_1]]></t><s><![CDATA[SEL$
        5DA710D3]]></s></h></f></q>
  <q o="15" f="y"><n><![CDATA[SEL$2F1334C4]]></n><p><![CDATA[SEL$5DA710D3]]
        ></p><f><h><t><![CDATA[VM_NWVW_2]]></t><s><![CDATA[SEL$2F1334C4]]></s></h
        ></f></q>

Yes, it’s naked XML (extracted from the v$sql_plan.other_xml column for operation 1).

I had been living in hope that someone else would write a messy bit of SQL to translate this into something readable – but the last time I searched the Internet for “other_xml qbregistry” I got the magical result of a Googlewhack (i.e. only one hit), which was in Russian, and largely a short description of all the options for the format command.

Since I’ve just installed 21.3 on a VM I decided to bite the bullet, but I’ve taken a short-cut to writing the code. I’ve run a trace on a call to dbms_xplan.display_cursor() and extracted the critical query from the resulting trace file. Then I spent 30 minutes making it readable, hacking it to make it almost workable on 19c, then finding out why it can’t work without a little extra effort. Here’s the resulting hack:

rem
rem     Script:         qbregistry_query.sql
rem     Author:         Oracle Corp / Jonathan Lewis
rem     Dated:          Aug 2021
rem
rem     Last tested 
rem             19.11.0.0
rem

define m_sql_id='232sya6twg7sq'
define m_origin = 2

with 
xml as (
        select  other_xml
        from    V$sql_plan 
        where   sql_id = '&m_sql_id' 
        and     id = 1
        and     other_xml is not null
),
allqbs as ( 
        select 
                extractvalue(d.column_value, '/q/n') qbname, 
                extractvalue(d.column_value, '/q/@f') final, 
                extractvalue(d.column_value, '/q/p') prev, 
                extractvalue(d.column_value, '/q/@o') origin 
        from 
                table(xmlsequence(extract(xmltype ((select other_xml from xml)), '/other_xml/qb_registry/q'))) d 
), 
inpqbs as ( 
        select 
                xml.qbname qbname, 
                listagg(xml.depqbs, ',') within group (
                        order by xml.depqbs) depqbs 
        from 
                xmltable('/other_xml/qb_registry/q/i/o' passing xmltype((select other_xml from xml)) 
                        columns depqbs varchar2(256) path 'v', 
                        qbname varchar2(256) path './../../n'
                ) xml 
        where     xml.depqbs in ( select qbname from allqbs) 
        group by xml.qbname
), 
recqb   (src, origin, dest, final, lvl, inpobjs) as ( 
        select 
                qbname src, origin origin, null dest, final final, 1 lvl, null inpobjs 
        from 
                allqbs
        where 
--              origin = &m_origin
                origin in (2,3)
        union all 
        select 
                a.qbname src, a.origin origin, a.prev dest, a.final final, lvl+1, 
                (select depqbs from inpqbs i where i.qbname = a.qbname) inpobjs 
        from
                allqbs a, 
                recqb r 
        where a.prev = r.src
)
search depth first by src asc set ordseq, 
finalans as ( 
        select 
                src,
/* 
                (
                select 
                        name 
                from    v$query_block_origin 
                where 
                        origin_id=origin
                )       origin, 
*/
                origin,
                dest, final, lvl, inpobjs 
        from recqb order by ordseq
) 
select
        /*+ opt_param('parallel_execution_enabled', 'false') */ 
        g.qbreg 
from (
        select 
                rpad(' ', 2*(lvl-1)) || 
                src || ' (' || origin || 
                        case when length(dest)>0 
                                then ' ' || dest 
                                else '' 
                        end || 
                        case when length(inpobjs)>0 
                                then ' ; ' || inpobjs 
                                else ' ' 
                        end ||
                        ')' || 
                        case when final='y' 
                                then ' [final]' 
                                else '' 
                        end 
                qbreg 
        from 
                finalans
        ) g
/

The two define commands at the top of the script set up a couple of substitution variables (m_sql_id and m_origin) that appear further on in the script. One is the SQL_ID for the query you’re interested in, the other is a fixed (probably) symbolic constant used by Oracle.

The first factored subquery (xml) is one that I’ve prepended to Oracle’s internal code to create a single-row, single-column result set holding the other_xml value of the query of interest. You’ll notice that I’ve been fairly casual about this bit since I haven’t catered for the fact that a single sql_id may have several child cursors and might even be obsolete.
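
If you want to be a little more careful, a minimal adjustment (my addition, not part of Oracle’s internal query) is to pin down the child cursor as well – in the full script you would simply replace the xml subquery with something like this and leave everything else unchanged:

define m_child_no = 0

with
xml as (
        select  other_xml
        from    v$sql_plan
        where   sql_id       = '&m_sql_id'
        and     child_number = &m_child_no
        and     id           = 1
        and     other_xml is not null
)
select  count(*) other_xml_rows         -- sanity check: should return exactly 1
from    xml
/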

The allqbs and inpqbs subqueries are where I’ve used my "with" subquery to supply the other_xml value that would have appeared as a bind variable (:B1) in the trace file.

The predicate "origin = &m_origin" in the anchor branch of the recursive subquery (commented out for the reason described in footnote 1) uses the m_origin variable to identify a row in the dynamic performance view v$query_block_origin – the view referenced in the inline scalar subquery near the end of the statement – and that’s where we have a problem with Oracle 19c: the view doesn’t exist, nor does the x$qbname structure that the view is based on (although it’s easy to find a table of the values in the oracle executable – albeit that several items in the 21c list don’t appear in the 19.11 executable).

In the code above I’ve actually commented out the whole of the inline scalar subquery that translates an origin number into an origin name and reported the actual value of origin. Originally I did this to check whether it was worth spending any more time working on the code – and this is the result I got from the initial test:

SEL$1 (2 )
  SEL$5DA710D3 (19 SEL$1 ; SEL$2)
    SEL$2F1334C4 (15 SEL$5DA710D3 ) [final]
SEL$2 (2 )
  SEL$683B0107 (12 SEL$2 )
    SEL$88A77D12 (13 SEL$683B0107 )
      SEL$B186933D (18 SEL$88A77D12 ; SEL$683B0107) [final]

A quick check by eye shows that it’s got the same pattern and set of query block names that the 21c output produced so it’s clearly a step in the right direction. Now all I need is a way to translate the origin numbers into names.

I could have tried searching x$ksmfsv to see if I could spot a pointer to the relevant structure and fake my way through the whole process of creating a "nearly dynamic" performance view, but I decided the quick and dirty workaround was to dump a CSV file listing the view contents in 21c, then read the file back as an external table to copy the data into a local IOT (index organized table) called my_query_block_origin. With the inline view back in play – and the name suitably changed – the 19c and 21c queries produced the same result (which is slightly surprising as the "SUBQUERY UNNEST" and "VIEW MERGE" options don’t seem to exist in the 19.11 list I found in the oracle executable).
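
For anyone who wants to copy the workaround, this is roughly what I mean by the local IOT – the DDL and column lengths are my guesses rather than anything supplied by Oracle, and the full data load comes from the 21c listing in footnote 1:

create table my_query_block_origin (
        origin_id       number          not null,
        name            varchar2(64),
        hint_token      varchar2(32),
        constraint mqbo_pk primary key (origin_id)
)
organization index
/

-- a couple of sample rows by way of illustration
insert into my_query_block_origin values ( 2, 'PARSER', null);
insert into my_query_block_origin values (19, 'SUBQUERY UNNEST', 'UNNEST');
commit;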

Footnote 1

Here’s a query to show the content of that 21c view (which is fairly interesting in its own right):

set linesize 144
set pagesize 100
set trimspool on
set tab off

column  name format a60
column  hint_token format a32

spool query_block_origin.lst

select
        origin_id,
        name,
        hint_token
from
        v$query_block_origin
/

 ORIGIN_ID NAME                                                         HINT_TOKEN
---------- ------------------------------------------------------------ --------------------------------
         0 NOT NAMED
         1 ALLOCATE
         2 PARSER
         3 HINT
         4 COPY
         5 SAVE
         6 MV REWRITE                                                   REWRITE
         7 PUSHED PREDICATE                                             PUSH_PRED
         8 STAR TRANSFORM SUBQUERY
         9 COMPLEX VIEW MERGE
        10 COMPLEX SUBQUERY UNNEST
        11 OR EXPANSION                                                 USE_CONCAT
        12 SUBQ INTO VIEW FOR COMPLEX UNNEST
        13 PROJECTION VIEW FOR CVM
        14 GROUPING SET TO UNION
        15 SPLIT/MERGE QUERY BLOCKS
        16 COPY PARTITION VIEW
        17 RESTORE
        18 VIEW MERGE                                                   MERGE
        19 SUBQUERY UNNEST                                              UNNEST
        20 STAR TRANSFORM                                               STAR_TRANSFORMATION
        21 INDEX JOIN
        22 STAR TRANSFORM TEMP TABLE
        23 MAP QUERY BLOCK
        24 VIEW ADDED
        25 SET QUERY BLOCK
        26 QUERY BLOCK TABLES CHANGED
        27 QUERY BLOCK SIGNATURE CHANGED
        28 MV UNION QUERY BLOCK
        29 SPLIT QUERY BLOCK FOR GSET-TO-UNION                          EXPAND_GSET_TO_UNION
        30 PREDICATES REMOVED FROM QUERY BLOCK                          PULL_PRED
        31 PREDICATES ADDED TO QUERY BLOCK
        32 OLD PUSHED PREDICATE                                         OLD_PUSH_PRED
        33 ORDER BY REMOVED FROM QUERY BLOCK                            ELIMINATE_OBY
        34 JOIN REMOVED FROM QUERY BLOCK                                ELIMINATE_JOIN
        35 OUTER-JOIN REMOVED FROM QUERY BLOCK                          OUTER_JOIN_TO_INNER
        36 STAR TRANSFORMATION JOINBACK ELIMINATION                     ELIMINATE_JOIN
        37 BITMAP JOIN INDEX JOINBACK ELIMINATION                       ELIMINATE_JOIN
        38 CONNECT BY COST BASED TRANSFORMATION                         CONNECT_BY_COST_BASED
        39 CONNECT BY WITH FILTERING                                    CONNECT_BY_FILTERING
        40 CONNECT BY WITH NO FILTERING                                 NO_CONNECT_BY_FILTERING
        41 CONNECT BY START WITH QUERY BLOCK
        42 CONNECT BY FULL SCAN QUERY BLOCK
        43 PLACE GROUP BY                                               PLACE_GROUP_BY
        44 CONNECT BY NO FILTERING COMBINE                              NO_CONNECT_BY_FILTERING
        45 VIEW ON SELECT DISTINCT
        46 COALESCED SUBQUERY                                           COALESCE_SQ
        47 QUERY HAS COALESCED SUBQUERIES                               COALESCE_SQ
        48 SPLIT QUERY BLOCK FOR DISTINCT AGG OPTIM                     TRANSFORM_DISTINCT_AGG
        49 CONNECT BY ELIMINATE DUPLICATES FROM INPUT                   CONNECT_BY_ELIM_DUPS
        50 CONNECT BY COST BASED TRANSFORMATION FOR WHR ONLY            CONNECT_BY_CB_WHR_ONLY
        51 TABLE EXPANSION                                              EXPAND_TABLE
        52 TABLE EXPANSION BRANCH
        53 JOIN FACTORIZATION SET QUERY BLOCK                           FACTORIZE_JOIN
        54 DISTINCT PLACEMENT                                           PLACE_DISTINCT
        55 JOIN FACTORIZATION BRANCH QUERY BLOCK
        56 TABLE LOOKUP BY NESTED LOOP QUERY BLOCK                      TABLE_LOOKUP_BY_NL
        57 FULL OUTER JOIN TRANSFORMED TO OUTER                         FULL_OUTER_JOIN_TO_OUTER
        58 LEFT OUTER JOIN TRANSFORMED TO ANTI                          OUTER_JOIN_TO_ANTI
        59 VIEW DECORRELATED                                            DECORRELATE
        60 QUERY VIEW DECORRELATED                                      DECORRELATE
        61 NOT EXISTS SQ ADDED
        62 BRANCH WITH OUTER JOIN
        63 BRANCH WITH ANTI JOIN
        64 UNION ALL FOR FULL OUTER JOIN
        65 VECTOR TRANSFORMATION                                        VECTOR_TRANSFORM
        66 VECTOR TRANSFORMATION TEMP TABLE
        67 QUERY ANSI REARCHiTECTURE                                    ANSI_REARCH
        68 VIEW ANSI REARCHiTECTURE                                     ANSI_REARCH
        69 ELIMINATION OF GROUP BY                                      ELIM_GROUPBY
        70 UAL BRANCH OF UNNESTED SUBQUERY
        71 QUERY BLOCK HAS BUSHY JOIN                                   BUSHY_JOIN
        72 SUBQUERY ELIMINATE                                           ELIMINATE_SQ
        73 OR EXPANSION UNION ALL BRANCH
        74 OR EXPANSION UNION ALL VIEW                                  OR_EXPAND
        75 DIST AGG GROUPING SETS UNION ALL TRANSFORMATION              USE_DAGG_UNION_ALL_GSETS
        76 MATERIALIZED WITH CLAUSE
        77 STATISTCS BASED TRANSFORMED QB
        78 PQ TABLE EXPANSION
        79 LEFT OUTER JOIN TRANSFORMED TO BOTH INNER AND ANTI
        80 SHARD TEMP TABLE
        81 BRANCH OF COMPLEX UNNESTED SET QUERY BLOCK
        82 DIST AGG GROUPING SETS OPTIMIZATION                          DAGG_OPTIM_GSETS


You’ll notice that origin_id 2 has the name PARSER – that’s the (first) significant value when reporting the query block registry – but take note, also, of origin_id 3, which has the name HINT. This is where the code built into 21c goes wrong. If you use the qb_name hint to name all your query blocks then their origin_id will be 3, and Oracle’s code won’t find them.

When I added the hint /*+ qb_name(main) */ to the query this is what I got from my registry query:

Query Block Registry:
---------------------

  SEL$1 (PARSER)
    SEL$7D4DB4AA (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$1)
      SEL$EFD91A2C (PROJECTION VIEW FOR CVM SEL$7D4DB4AA)
        SEL$7086F02E (VIEW MERGE SEL$EFD91A2C ; SEL$7D4DB4AA) [FINAL]

And when I also added the hint /*+ qb_name(subq) */ to the subquery the result was this:

Query Block Registry:
---------------------

An uncaught error happened in display_cursor : ORA-06502: PL/SQL: numeric or value error
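
For reference, the statement that produced this error was essentially the original query from footnote 3 with both query blocks named – something like this:

select  /*+ qb_name(main) */
        *
from    t1
where   owner = 'OUTLN'
and     object_name in (
                select  /*+ qb_name(subq) unnest no_semijoin */
                        distinct object_name
                from   t2
                where  object_type = 'TABLE'
        )
;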

I’ve said for a long time: “always name all your query blocks”. I think 21c (temporarily) demonstrates why you have two options: name ALL of them or name NONE of them. If you name just some of them you might not notice that parts of your plan don’t appear in the registry report, and I’d say it’s better to see an error than to be fooled into thinking you’ve got complete information.

Footnote 2

There’s another new option for the format parameter in 21c which is qbregistry_graph. I haven’t considered playing about with the trace file to see if I can extract and hack the SQL that generates the appropriate output (but that might change if I pick up a tool to turn the textual description into a graphic). For the registry listing above this is what the “graph” output looks like:

Query Block Registry Graph (dot format):
---------------------
digraph g{
  rankdir = TB
  "SEL$88A77D12"
  "SEL$683B0107"
  "SEL$1"
  "SEL$2"
  "SEL$B186933D" [peripheries=2]
  "SEL$5DA710D3"
  "SEL$2F1334C4" [peripheries=2]
  "SEL$683B0107" -> "SEL$88A77D12" [label="PROJECTION VIEW FOR CVM"]
  "SEL$2" -> "SEL$683B0107" [label="SUBQ INTO VIEW FOR COMPLEX UNNEST"]
  "SEL$88A77D12" -> "SEL$B186933D" [label="VIEW MERGE"]
  "SEL$1" -> "SEL$5DA710D3" [label="SUBQUERY UNNEST"]
  "SEL$5DA710D3" -> "SEL$2F1334C4" [label="SPLIT/MERGE QUERY BLOCKS"]
  "SEL$683B0107" -> "SEL$B186933D" [style=dotted]
  "SEL$2" -> "SEL$5DA710D3" [style=dotted]
  { rank = same }
  {
    rank="sink";
    rankdir = LR;
    item1 [style=invis];
    item2 [shape="plaintext" label="Participating query blocks"];
    item3 [label="&nbsp;" peripheries=2];
    item4 [shape="plaintext" label="Final query blocks"];
    item1 -> item2 [style=dotted];
    { rank=same item3 item4; }
  }
}

Footnote 2.1 (a few days later)

A couple of days after publishing this note I received an email pointing out that the qbregistry_graph output is in the Graphviz DOT language (see http://www.graphviz.org/) and there are even websites where it can easily be rendered into graphic form. This is the result I got by pasting the output into the website.

I haven’t tried to think through a generalised pattern for drawing these pictures, but I think I’d prefer to see a diagram which showed that the “final” query block sel$2f1334c4 was used by query block sel$b186933d. After all, the word “final” in this context means only that the query block was one for which the optimizer produced an independent (sub-)plan.

Footnote 3

For completeness – here’s the original SQL and plan for the statement that produced this qbregistry example:

select
        *
from    t1
where   owner = 'OUTLN'
and     object_name in (
                select  /*+
                                unnest
                                no_semijoin
                        */
                        distinct object_name
                from   t2
                where  object_type = 'TABLE'
        )
;

----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |           |       |       |     5 (100)|          |
|   1 |  VIEW                                  | VM_NWVW_2 |     1 |   483 |     5  (20)| 00:00:01 |
|   2 |   HASH UNIQUE                          |           |     1 |   155 |     5  (20)| 00:00:01 |
|   3 |    NESTED LOOPS SEMI                   |           |     1 |   155 |     4   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1        |     1 |   128 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | T1_I1     |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2        |     1 |    27 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | T2_I2     |    48 |       |     0   (0)|          |
----------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$B186933D / "VM_NWVW_2"@"SEL$2F1334C4"
   2 - SEL$B186933D
   4 - SEL$B186933D / "T1"@"SEL$1"
   5 - SEL$B186933D / "T1"@"SEL$1"
   6 - SEL$B186933D / "T2"@"SEL$2"
   7 - SEL$B186933D / "T2"@"SEL$2"

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('21.1.0')
      DB_VERSION('21.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$B186933D")
      MERGE(@"SEL$683B0107" >"SEL$5DA710D3")
      OUTLINE_LEAF(@"SEL$2F1334C4")
      OUTLINE(@"SEL$88A77D12")
      OUTLINE(@"SEL$683B0107")
      OUTLINE(@"SEL$5DA710D3")
      UNNEST(@"SEL$2" UNNEST_INNERJ_DISTINCT_VIEW)
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      NO_ACCESS(@"SEL$2F1334C4" "VM_NWVW_2"@"SEL$2F1334C4")
      INDEX_RS_ASC(@"SEL$B186933D" "T1"@"SEL$1" ("T1"."OWNER"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B186933D" "T1"@"SEL$1")
      INDEX_RS_ASC(@"SEL$B186933D" "T2"@"SEL$2" ("T2"."OBJECT_TYPE"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B186933D" "T2"@"SEL$2")
      LEADING(@"SEL$B186933D" "T1"@"SEL$1" "T2"@"SEL$2")
      USE_NL(@"SEL$B186933D" "T2"@"SEL$2")
      USE_HASH_AGGREGATION(@"SEL$B186933D" UNIQUE)
      PARTIAL_JOIN(@"SEL$B186933D" "T2"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("OWNER"='OUTLN')
   6 - filter("OBJECT_NAME"="OBJECT_NAME")
   7 - access("OBJECT_TYPE"='TABLE')

Query Block Registry:
---------------------
  SEL$1 (PARSER)
    SEL$5DA710D3 (SUBQUERY UNNEST SEL$1 ; SEL$2)
      SEL$2F1334C4 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3) [FINAL]
  SEL$2 (PARSER)
    SEL$683B0107 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
      SEL$88A77D12 (PROJECTION VIEW FOR CVM SEL$683B0107)
        SEL$B186933D (VIEW MERGE SEL$88A77D12 ; SEL$683B0107) [FINAL]

Tables t1 and t2 are copies of the same data set, which is a subset of 100 rows from all_objects. You won’t necessarily see this plan on your systems because (even with the hints) the plan can vary depending on the number of rows with owner = ‘OUTLN’ (which is likely to be zero) or with object_type = ‘TABLE’ (which might be all of them). The script I started with was one I had used in a note I wrote about “distinct” appearing in the select list of subqueries, but the data it produced in the newer versions of Oracle was sufficiently different that I had to be a little more careful in constructing a data set that produced stable plans.

If you cross-check the Query Block Registry with the Outline Information you’ll see that the lines labelled FINAL start with the query block names that are shown as “outline_leaf” entries, and the other 5 query block names appear as “outline” entries.

Reading down the tree I then find myself struggling to interpret the QBR. I think I know what has happened, but I can’t quite manage to see exactly how the QBR is telling me that it happened.

QBR – tentative interpretation

Part of the difficulty is that the QBR seems to have a section for every initial query block in the query, so there’s (a) likely to be some overlap between sections and (b) some sequencing that means you can’t get the full picture just by reading straight from top to bottom. In this case we have two initial query blocks (the main query, implicitly named sel$1, and the subquery implicitly named sel$2), and I think the interpretation is as follows:

Starting with the sel$2 section we can see that its second line tells us that the subquery was unnested and the resulting aggregate inline view is the sole content of a query block called sel$683B0107.

Jumping backwards to the sel$1 section, its second line tells us that sel$5DA710D3 is a query block consisting of a join between t1 and the inline aggregate view.

Sticking with the sel$1 tree, we then see a query block that tells us that the optimizer has transformed an "aggregate then join" into a "join then aggregate". sel$2F1334C4 is the query block holding nothing but a select from the view VM_NWVW_2.

Returning to the sel$2 tree, sel$88A77D12 is the resulting query block when the inline aggregate view is merged using complex view merging. This is where I get a bit stuck, because this seems to be repeating a step that we’ve handled in the sel$1 section by a different route.

The final step of the sel$2 tree is sel$B186933D, the query block where we select from the non-mergeable inline view VM_NWVW_2 that seems to have come from one of two different places.

Bottom line on this one: even though it’s an extremely simple query and I believe I understand what the execution plan is telling us about the transformations that took place, the query block registry is still something of a mystery to me.

August 5, 2021

Finding SQL

Filed under: dbms_xplan,Oracle,Problem Solving,Troubleshooting — Jonathan Lewis @ 6:24 pm BST Aug 5,2021

There are some questions about performance issues for which there is no easy answer, and sometimes the best you can do is try to generate an approximate answer then examine the results to eliminate the innocent.

Three such problems – essentially similar – appeared recently on the Oracle Developer Forum, and in this note I’ll supply a mechanism that may be a good enough step in the right direction for at least two of them. The questions were:

  • How do I find who’s been using up all the temporary space recently?
  • How do I find the execution plans for all the SQL that is called by a procedure?
  • What SQL is responsible for generating most redo?

A basic (but incomplete) strategy for attacking these questions is to think of a way to identify the sql_id and child_number for statements that might be contributing to the problem. If you can think of a suitable attack then you can get all the execution plans (or SQL Monitor reports) for those statements and examine them further.

The strategy is incomplete on two counts – first, that you won’t find the perfect set of statements: you’ll get more than you need but still miss some that are relevant; second, that you’ll probably have to run secondary queries to get extra details about statements that look like good culprits for the problem you’re trying to solve.

Who’s been using the temporary space

The way this question was actually posed was more like:

I’ve got a query that started crashing with Oracle error "ORA-01652: unable to extend temp segment by 256 in tablespace TEMP", but when I check the contents of the temporary tablespace (TEMP) there’s plenty of space available. How do I fix this problem?

This often means your query has changed execution plan and picked a very bad join order with some hash (or merge) joins that have dumped huge amounts of data to disc; but it may mean that the space was being taken up by some other activity that doesn’t usually happen when you’re running your query.
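
If you’re licensed for the diagnostic pack, a quick way to check whether the plan really has changed recently is to look at the plan hash values recorded in the AWR – a sketch (the substitution variable is mine):

select
        snap_id,
        plan_hash_value,
        executions_delta,
        elapsed_time_delta
from
        dba_hist_sqlstat
where
        sql_id = '&m_sql_id'
order by
        snap_id
/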

So if you find that the SQL Monitor report (or a simple call to dbms_xplan.display_cursor) makes you think that your query wasn’t doing anything differently, a step that may help is to find all the SQL in the library cache that might have been using a lot of temporary space.

For temporary space we can always check v$sql_workarea to identify operations from plans that ran as one-pass or multi-pass operations, and we can check their most recent or maximum "tempseg size". Hence:

rem
rem     Script:         check_full_2a.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2021
rem
rem     Columns of interest for temporary space:
rem     onepass_executions, multipasses_executions, last_tempseg_size, max_tempseg_size

select  
        distinct sql_id, child_number
from    v$sql_workarea 
where   onepass_executions != 0
or      multipasses_executions != 0
-- or      max_tempseg_size > 1e7
/


select
        distinct
        sql.sql_id, sql.child_number
from
        v$sql           sql,
        v$sql_workarea  war
where
        (   war.onepass_executions != 0
         or war.multipasses_executions != 0
        )
and     sql.sql_id = war.sql_id
and     sql.child_number = war.child_number
and     sql.last_active_time > sysdate - 15/1440
;

The first query here simply scans through the v$sql_workarea structure (which means it will actually thrash its way through the library cache) looking for operations that spilled to disk or (in the commented-out predicate) used at least some specified amount of temporary space.

The second variation joins v$sql_workarea to v$sql so that it can restrict the chosen statements to those that were active some time in the last 15 minutes.

Obviously you will be able to think of other ways of tweaking these statements, and once you’ve got a statement expressing a suitable set of criteria you can embed it into a query that calls dbms_xplan.display_cursor() – as I demonstrated about 10 years ago – or dbms_sql_monitor.report_sql_monitor() if you’re suitably licensed.

set linesize 230
set trimspool on
set pagesize 90
set tab off

set long 20000


select
        plan_table_output
from    (
        select  
                distinct sql_id, child_number
        from    v$sql_workarea 
        where   onepass_executions != 0
        or      multipasses_executions != 0
        ) v,
        table(dbms_xplan.display_cursor(v.sql_id, v.child_number, 'memstats'))
;


select
        dbms_sql_monitor.report_sql_monitor(
                sql_id             => v.sql_id,
                start_time_filter  => sysdate - 15/1440,
                type               =>'TEXT'
        ) text_line
from    (
        select
                distinct sql.sql_id
        from
                v$sql           sql,
                v$sql_workarea  war
        where
                (   war.onepass_executions != 0
                or war.multipasses_executions != 0
                )
        and     sql.sql_id = war.sql_id
        and     sql.child_number = war.child_number
        and     sql.last_active_time > sysdate - 15/1440
        ) v
/



A couple of points to note. I’ve included the MEMSTATS format option in the call to dbms_xplan.display_cursor() so that it shows some summary information from v$sql_workarea. However, this does have a defect: it doesn’t show space used in temp by materialized "with" subqueries (CTEs) – which is where the call to dbms_sql_monitor.report_sql_monitor() helps, because if the execution was captured it will show writes to disc in the "LOAD AS SELECT" operation under the TEMPORARY TABLE TRANSFORMATION operation.

Here’s a sample of output I got from the two queries after forcing a nasty plan to do a big hash join that ultimately produced a small result set.

First the output from the query using package dbms_xplan:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------
SQL_ID  1cwabt12zq6zb, child number 0
-------------------------------------
with ttemp as (  select /*+ materialize */ * from t1 ) select  /*+
no_partial_join(@sel$2 t1b) no_place_group_by(@sel$2) */
t1a.object_type,  max(t1a.object_name) from  ttemp t1a, ttemp t1b where
 t1a.object_id = t1b.object_id group by  t1a.object_type order by
t1a.object_type

Plan hash value: 1682228242

-----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                      | Starts | E-Rows | A-Rows |   A-Time   |  OMem |  1Mem |  O/1/M   | Max-Tmp |
-----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                           |      1 |        |     45 |00:00:16.24 |       |       |          |         |
|   1 |  TEMP TABLE TRANSFORMATION               |                           |      1 |        |     45 |00:00:16.24 |       |       |          |         |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6645_5FF4FE |      1 |        |      0 |00:00:02.53 |  2070K|  2070K|          |         |
|   3 |    TABLE ACCESS FULL                     | T1                        |      1 |   1154K|   1154K|00:00:00.55 |       |       |          |         |
|   4 |   SORT GROUP BY                          |                           |      1 |     45 |     45 |00:00:13.72 | 11264 | 11264 |     1/0/0|         |
|*  5 |    HASH JOIN                             |                           |      1 |     18M|     18M|00:00:08.63 |    58M|    12M|          |      71M|
|   6 |     VIEW                                 |                           |      1 |   1154K|   1154K|00:00:01.03 |       |       |          |         |
|   7 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9D6645_5FF4FE |      1 |   1154K|   1154K|00:00:01.03 |       |       |          |         |
|   8 |     VIEW                                 |                           |      1 |   1154K|   1154K|00:00:00.49 |       |       |          |         |
|   9 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9D6645_5FF4FE |      1 |   1154K|   1154K|00:00:00.43 |       |       |          |         |
-----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T1A"."OBJECT_ID"="T1B"."OBJECT_ID")


30 rows selected.



Note the absence of any numbers in the Max-Tmp column for operation 2 (the LOAD AS SELECT (CURSOR DURATION MEMORY) line).

Then compare with the results below from the query using package dbms_sql_monitor:

TEXT_LINE
-----------------------------------------------------------------------
SQL Monitoring Report

SQL Text
------------------------------
with ttemp as ( select /*+ materialize */ * from t1 ) 
select /*+ no_partial_join(@sel$2 t1b) no_place_group_by(@sel$2) */ t1a.object_type, max(t1a.object_name) 
from ttemp t1a, ttemp t1b where t1a.object_id = t1b.object_id 
group by t1a.object_type order by t1a.object_type

Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  SYS (10:47180)
 SQL ID              :  1cwabt12zq6zb
 SQL Execution ID    :  16777216
 Execution Started   :  08/05/2021 17:24:56
 First Refresh Time  :  08/05/2021 17:25:00
 Last Refresh Time   :  08/05/2021 17:25:12
 Duration            :  16s
 Module/Action       :  MyModule/MyAction
 Service             :  orclpdb
 Program             :  sqlplus@linux183.localdomain (TNS V1-V3)
 Fetch Calls         :  4

Global Stats
================================================================================
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes | Reqs  | Bytes |
================================================================================
|      16 |      14 |     2.23 |     4 |  84677 |  986 | 253MB |   866 | 219MB |
================================================================================

SQL Plan Monitoring Details (Plan Hash Value=1682228242)
=====================================================================================================================================================================================================================
| Id |                 Operation                  |           Name            |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Write | Write |  Mem  | Temp  | Activity | Activity Detail |
|    |                                            |                           | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes | Reqs  | Bytes | (Max) | (Max) |   (%)    |   (# samples)   |
=====================================================================================================================================================================================================================
|  0 | SELECT STATEMENT                           |                           |         |       |         1 |    +16 |     1 |       45 |      |       |       |       |     . |     . |          |                 |
|  1 |   TEMP TABLE TRANSFORMATION                |                           |         |       |         1 |    +16 |     1 |       45 |      |       |       |       |     . |     . |          |                 |
|  2 |    LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6645_5FF4FE |         |       |         5 |     +0 |     1 |        2 |      |       |   351 | 176MB |     . |     . |          |                 |
|  3 |     TABLE ACCESS FULL                      | T1                        |      1M |  3166 |         1 |     +4 |     1 |       1M |      |       |       |       |     . |     . |          |                 |
|  4 |    SORT GROUP BY                           |                           |      45 | 17727 |        11 |     +6 |     1 |       45 |      |       |       |       | 10240 |     . |          |                 |
|  5 |     HASH JOIN                              |                           |     18M |  9804 |        13 |     +4 |     1 |      18M |  561 |  66MB |   561 |  66MB |  59MB |  71MB |          |                 |
|  6 |      VIEW                                  |                           |      1M |  2966 |         1 |     +4 |     1 |       1M |      |       |       |       |     . |     . |          |                 |
|  7 |       TABLE ACCESS FULL                    | SYS_TEMP_0FD9D6645_5FF4FE |      1M |  2966 |         2 |     +3 |     1 |       1M |  351 | 176MB |       |       |     . |     . |          |                 |
|  8 |      VIEW                                  |                           |      1M |  2966 |         3 |     +6 |     1 |       1M |      |       |       |       |     . |     . |          |                 |
|  9 |       TABLE ACCESS FULL                    | SYS_TEMP_0FD9D6645_5FF4FE |      1M |  2966 |         5 |     +4 |     1 |       1M |   74 |  12MB |       |       |     . |     . |          |                 |
=====================================================================================================================================================================================================================


1 row selected.

In this report you can see that the materialization resulted in 176MB of data being written to temp at operation 2 (the LOAD AS SELECT line), which is actually more than was written to temp as the hash join spilled over from memory.

There are a number of defects in this call to dbms_sql_monitor – including the need for an extra cost license. Most particularly (a) the plan is captured only if the query runs for more than 6 seconds or has a parallel component, and (b) we’re only passing in the SQL_ID with a time limit, so we could get several executions reported. We could refine the inputs, though, by including the plan_hash_value in our query against v$sql, and we might include an end-time filter.
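
Here’s a sketch of the sort of refinement I mean – the relevant column in v$sql is plan_hash_value, and the substitution variables (and the 30 minute window) are just for illustration:

select
        distinct sql.sql_id, sql.child_number
from
        v$sql           sql,
        v$sql_workarea  war
where
        (   war.onepass_executions     != 0
         or war.multipasses_executions != 0
        )
and     sql.sql_id          = war.sql_id
and     sql.child_number    = war.child_number
and     sql.plan_hash_value = &m_plan_hash_value
and     sql.last_active_time between sysdate - 30/1440 and sysdate - 15/1440
;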

Whatever we do to minimise the number of plans reported, the point will almost certainly come where we have to eyeball the data to see if we can identify the queries which were almost certainly running and using the temp space we needed.

How do I find the execution plans for all the SQL that is called by a procedure?

To be written

What SQL is responsible for generating most redo?

To be written

February 16, 2021

Adaptive error

Filed under: CBO,dbms_xplan,Oracle,Statistics — Jonathan Lewis @ 5:41 pm GMT Feb 16,2021

There’s a thread on the Oracle Database Forum at present where someone has supplied a script to create some data that’s guaranteed to reproduce wrong results (provided your system stats and optimizer parameters are at their default values). They’ve even supplied a link to the script on LiveSQL (opens in new window) – which is running 19.8 – to demonstrate the problem.

I’ve tested on 12.2.0.1 and 19.3.0.0 and the problem occurs in both versions – though with my setup the initial plan that returned the wrong results didn’t re-optimize to a plan with the correct results in 12.2.0.1.

I’ve included a script at the end of the note to create the data set but I’ll describe some of the objects as we go along – starting with a query that gives the correct result, apparently because it’s been hinted to do so:

execute dbms_stats.delete_system_stats

set linesize 255
set pagesize  60
set trimspool on

alter session set statistics_level='all';
set serveroutput off

select 
        /*+ use_hash(dwf) */ 
        count(*) count_hash 
from 
        test_dwf_sapfi  dwf
where
         exists (
                select  1 
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        );

select * from table(dbms_xplan.display_cursor(format=>'cost outline allstats last partition hint_report adaptive'));


test_dwf_sapfi is a table with a single numeric column datumucetnipom_code, the table is list partitioned by that column with 61 partitions. Each partition is defined to hold a single value. The number is designed to look like a date in the format YYYYMMDD.

test_sapfi_coicar_at5dat11 is a table with two columns (datuct, datumzprac). The first column is a date column with data covering a range of 60 dates, the second column is a numeric column and the table is list partitioned on that column. All the data in the table is in one partition of that table and the column holds the same value for every row (again it’s a number that looks like a date).

There are 15,197 rows in each table, and the test_dwf_sapfi data has been created as a copy (with a suitable to_number(to_char()) formatting change) from the test_sapfi_coicar_at5dat11 table.
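
As a rough sketch of the structures involved (the script at the end of the note has the full detail; the partition names and exact lists of values below are illustrative only, not copied from that script):

create table test_dwf_sapfi (
        datumucetnipom_code     number
)
partition by list (datumucetnipom_code) (
        partition p20200401 values (20200401),
        partition p20200402 values (20200402)
        -- ... and so on: 61 partitions, each defined to hold a single value
);

create table test_sapfi_coicar_at5dat11 (
        datuct          date,
        datumzprac      number
)
partition by list (datumzprac) (
        partition p_unused  values (19000101),
        partition p20200414 values (20200414)
);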

Here’s the execution plan from 19c:

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                            |      1 |        |   328 (100)|       |       |      1 |00:00:00.02 |     155 |     69 |       |       |          |
|   1 |  SORT AGGREGATE              |                            |      1 |      1 |            |       |       |      1 |00:00:00.02 |     155 |     69 |       |       |          |
|*  2 |   HASH JOIN RIGHT SEMI       |                            |      1 |    253 |   328   (1)|       |       |  15197 |00:00:00.02 |     155 |     69 |  2352K|  2352K| 2110K (0)|
|   3 |    PART JOIN FILTER CREATE   | :BF0000                    |      1 |    152 |    13   (0)|       |       |  15197 |00:00:00.01 |      25 |      0 |       |       |          |
|   4 |     PARTITION LIST SINGLE    |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |      0 |       |       |          |
|   5 |      TABLE ACCESS FULL       | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |      0 |       |       |          |
|   6 |    PARTITION LIST JOIN-FILTER|                            |      1 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     130 |     69 |       |       |          |
|   7 |     TABLE ACCESS FULL        | TEST_DWF_SAPFI             |     60 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     130 |     69 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      FULL(@"SEL$5DA710D3" "COI"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "DWF"@"SEL$1" "COI"@"SEL$2")
      USE_HASH(@"SEL$5DA710D3" "COI"@"SEL$2")
      SWAP_JOIN_INPUTS(@"SEL$5DA710D3" "COI"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))


Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (N - Unresolved (1))
---------------------------------------------------------------------------
   7 -  SEL$5DA710D3 / DWF@SEL$1
         U -  use_hash(dwf)



You’ll notice there’s no "adaptive" information in the report, and there’s no "Note" section saying it’s an adaptive plan. You might also note that the plan looks as if it’s doing a hash join into "dwf" but the "Hint Report" tells us that the hint has not been used and the "Outline Information" tells us that the plan has actually arrived as the result of the combination /*+ use_hash(coi) swap_join_inputs(coi) */. In fact this is the default plan (on my system) that would have appeared in the complete absence of hints.

The result of the count(*) should be 15,197 – and you can see that this plan has produced the right answer when you check the A-Rows value for operation 2 (the hash join right semi that generates the rowsource for the sort aggregate).

The adaptive anomaly

So now we try again but with a hint to generate a nested loop join and it gives us the wrong result (8) and an oddity in the plan. I’ve reported the body of the plan twice, the first version includes the adaptive information the second is the tidier plan we get by omitting the ‘adaptive’ format option:

select 
        count(*) count_nl 
from 
        test_dwf_sapfi  dwf
where 
        exists (
                select
                        /*+
                                use_nl (coi)
                        */
                        1
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        )
;

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                    | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT             |                            |      1 |        |   329 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|     1 |  SORT AGGREGATE              |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|- *  2 |   HASH JOIN                  |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |  3667K|  1779K|          |
|     3 |    PART JOIN FILTER CREATE   | :BF0000                    |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|     4 |     NESTED LOOPS             |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|-    5 |      STATISTICS COLLECTOR    |                            |      1 |        |            |       |       |     60 |00:00:00.01 |      25 |       |       |          |
|     6 |       SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|     7 |        PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|     8 |         TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|     9 |      PARTITION LIST ITERATOR |                            |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
|  * 10 |       TABLE ACCESS FULL      | TEST_DWF_SAPFI             |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
|-   11 |    PARTITION LIST JOIN-FILTER|                            |      0 |  15197 |   314   (1)|:BF0000|:BF0000|      0 |00:00:00.01 |       0 |       |       |          |
|-   12 |     TABLE ACCESS FULL        | TEST_DWF_SAPFI             |      0 |  15197 |   314   (1)|:BF0000|:BF0000|      0 |00:00:00.01 |       0 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |      1 |        |   329 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   1 |  SORT AGGREGATE            |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   2 |   PART JOIN FILTER CREATE  | :BF0000                    |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|   3 |    NESTED LOOPS            |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|   4 |     SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   5 |      PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |       TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   7 |     PARTITION LIST ITERATOR|                            |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
|*  8 |      TABLE ACCESS FULL     | TEST_DWF_SAPFI             |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      USE_NL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "COI"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "COI"@"SEL$2" "DWF"@"SEL$1")
      SEMI_TO_INNER(@"SEL$5DA710D3" "COI"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))
  10 - filter("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   8 -  SEL$5DA710D3 / COI@SEL$2
         U -  use_nl (coi)

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

Points to note here:

  • The most important item to note is that at operation 3 (of the tidy plan) we can see that the nested loop reports A-Rows as 8 – which is the wrong result.
  • Then there’s the oddity that operation 2 is a “part join filter create”, which shouldn’t be there for a nested loop: that’s a hash join feature that allows the Pstart/Pstop columns to report partition pruning by Bloom filter (“:BFnnnn”), but we’re running a nested loop join, which can pass in the partition key, so we see KEY/KEY as the Pstart/Pstop.
  • The third thing we can pick up is that the 8 rows in our nested loop rowsource are echoed in the A-Rows for the 60 executions of the partition table scans of test_dwf_sapfi at operations 7 and 8 in the reduced plan – it’s probably not a complete coincidence that the nested loop join is passing the partition keys in partition key order (sort unique at operation 4) and there are 8 rows in the last populated partition of test_dwf_sapfi (see the quick check sketched after this list).
  • Finally we note from the Hint Report that the hint, as supplied, was not used, and the outline shows us that the path was actually “leading(coi dwf) use_nl(dwf)”.
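If you want to check that claim about the last populated partition, the naming convention in the creation script below means a quick count against the relevant partition should do it – a sketch:

select  count(*)
from    test_dwf_sapfi partition (p20200414)
;

With the data as defined this should report exactly 8 rows.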

The really fascinating thing about this execution plan is that it contains a hint that was not used – but the plan changed from the default plan to a slightly more expensive plan.

If at first you don’t succeed

There’s just one more surprise to reveal – we had an adaptive plan, which tends to mean the optimizer plays towards a nested loop join but hedges its bets to be able to swing to a hash join in mid-plan. This suggests that the real-time stats collector thought there wasn’t much data and a nested loop was good – but what happens when I run exactly the same query again? In my 12c system the answer was nothing changed, but in my 19c system a new plan appeared:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                            |      1 |        |   331 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   1 |  SORT AGGREGATE              |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|*  2 |   HASH JOIN                  |                            |      1 |    120K|   331   (2)|       |       |  15197 |00:00:00.01 |     154 |  2171K|  2171K| 1636K (0)|
|   3 |    PART JOIN FILTER CREATE   | :BF0000                    |      1 |  15197 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |       |       |          |
|   4 |     SORT UNIQUE              |                            |      1 |  15197 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   5 |      PARTITION LIST SINGLE   |                            |      1 |  15197 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |       TABLE ACCESS FULL      | TEST_SAPFI_COICAR_AT5DAT11 |      1 |  15197 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   7 |    PARTITION LIST JOIN-FILTER|                            |      1 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     129 |       |       |          |
|   8 |     TABLE ACCESS FULL        | TEST_DWF_SAPFI             |     60 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     129 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "COI"@"SEL$2")
      FULL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      LEADING(@"SEL$5DA710D3" "COI"@"SEL$2" "DWF"@"SEL$1")
      USE_HASH(@"SEL$5DA710D3" "DWF"@"SEL$1")
      SEMI_TO_INNER(@"SEL$5DA710D3" "COI"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   6 -  SEL$5DA710D3 / COI@SEL$2
         U -  use_nl (coi)

Note
-----
   - statistics feedback used for this statement

This is the output with the ‘adaptive’ format in place – but the plan isn’t adaptive: the optimizer has used statistics feedback (formerly known as cardinality feedback) to work out a better plan. The hint is still unused, of course, but when we check the plan we can see that

  • it has got the right answer – the hash join at operation 2 reports 15,197 rows
  • the “partition join” Bloom filter created at operation 3 has been used for the Pstart/Pstop at operations 7 and 8
  • even though the hint has not been used the plan is (again) not the same as the default plan: we’ve got a hash join with a Bloom filter, while the default plan had a hash join right semi after a sort unique of the test_sapfi_coicar_at5dat11 data, with an overall lower cost.
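If you want to see the second plan coming before you re-execute, the cursor is flagged for reoptimisation after the first run – a sketch, querying v$sql for the statement’s sql_id (substitute your own):

select  sql_id, child_number, is_reoptimizable
from    v$sql
where   sql_id = '&m_sql_id'
order by
        child_number
;

The first child cursor should show is_reoptimizable = ‘Y’, which is the clue that statistics feedback will be applied at the next optimisation of the statement.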

What Happened?

Clearly there is a bug. It’s a slightly sensitive bug, and all I had to do to eliminate it was to gather stats on the underlying tables. (You’ll find in the table creation script at the end of this note that there are basically no object stats on the “big” partitioned table, which is presumably why the adaptive stuff came into play and allowed the bug to surface, and why 19c statistics feedback produced a new plan on the second execution.)
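For reference, the stats collection that made the problem disappear for me is the call that appears (commented out) near the end of the test script below:

execute dbms_stats.gather_table_stats(user,'test_dwf_sapfi',granularity=>'global')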

It may be rather difficult for an outsider to pin down what’s going wrong and bypass the bug. One of the first ideas that appeared on the forum was that the Bloom filter pruning was breaking something – but when I added the hint /*+ opt_param(‘_bloom_pruning_enabled’,’false’) */ to the query all I got was basically the same nested loop plan without the Bloom filter creation and still ended up with the wrong result.
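For clarity, a sketch of how that hint fitted into the query – the hint report shown with the second plan below has the opt_param() hint resolving at statement level while the use_nl() hint stays in the subquery:

select
        /*+ opt_param('_bloom_pruning_enabled','false') */
        count(*) count_nl
from
        test_dwf_sapfi  dwf
where
        exists (
                select  /*+ use_nl (coi) */
                        1
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        )
;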

Finally, here’s a plan I got when I hinted the query correctly to force the nested loop join with test_dwf_sapfi as the inner (second) table in the join (in other words I hinted the plan that had been giving me the wrong results):
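Judging from the Hint Report below, the hints resolved to the unnested query block; the query looked roughly like this (a sketch, using the query block and alias names reported by the optimizer):

select
        /*+
                leading(@sel$5da710d3 coi@sel$2 dwf@sel$1)
                use_nl(@sel$5da710d3 dwf@sel$1)
        */
        count(*) count_nl_hinted
from
        test_dwf_sapfi  dwf
where
        exists (
                select  1
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        )
;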

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                            |      1 |        |   405 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   1 |  SORT AGGREGATE           |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   2 |   NESTED LOOPS            |                            |      1 |  38491 |   405   (1)|       |       |  15197 |00:00:00.01 |     154 |       |       |          |
|   3 |    SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   4 |     PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   5 |      TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |    PARTITION LIST ITERATOR|                            |     60 |    253 |     5   (0)|   KEY |   KEY |  15197 |00:00:00.01 |     129 |       |       |          |
|*  7 |     TABLE ACCESS FULL     | TEST_DWF_SAPFI             |     60 |    253 |     5   (0)|   KEY |   KEY |  15197 |00:00:00.01 |     129 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - filter("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   1 -  SEL$5DA710D3
           -  leading(@sel$5da710d3 coi@sel$2 dwf@sel$1)

   7 -  SEL$5DA710D3 / DWF@SEL$1
           -  use_nl(@sel$5da710d3 dwf@sel$1)

Compare this with the plan I got by using the wrong hint, resulting in the adaptive plan, but with Bloom filter pruning disabled:

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                            |      1 |        |   329 (100)|       |       |      1 |00:00:00.05 |     154 |       |       |          |
|   1 |  SORT AGGREGATE           |                            |      1 |      1 |            |       |       |      1 |00:00:00.05 |     154 |       |       |          |
|   2 |   NESTED LOOPS            |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.05 |     154 |       |       |          |
|   3 |    SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   4 |     PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   5 |      TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |    PARTITION LIST ITERATOR|                            |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.05 |     129 |       |       |          |
|*  7 |     TABLE ACCESS FULL     | TEST_DWF_SAPFI             |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.05 |     129 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - filter("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (1))
---------------------------------------------------------------------------
   0 -  STATEMENT
           -  opt_param('_bloom_pruning_enabled','false')

   5 -  SEL$5DA710D3 / COI@SEL$2
         U -  use_nl (coi)

It’s the same plan (with the same plan hash value, though I haven’t shown that) – it has the same predicates and does the same amount of work. But when the optimizer gets to this plan through the adaptive pathway the run-time engine produces the wrong results (note A-Rows = 8 at operation 2), while if the plan is forced by a correct set of hints the run-time engine produces the right results.

As you might guess, another way to bypass the problem was to disable adaptive plans – but when I did that the only way to get the nested loop path was through correct hinting anyway.
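For completeness, a minimal sketch of taking the adaptive option away at the session level (this is the 12.2 / 19c parameter; in 12.1 the controlling parameter was optimizer_adaptive_features):

alter session set optimizer_adaptive_plans = false;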

Test it yourself

Here’s a script to create the test data:

rem
rem     Script:         bloom_bug_02.sql
rem     Author:         Michal Telensky / Jonathan Lewis
rem     Dated:          Feb 2021
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem
rem     See also:
rem     https://community.oracle.com/tech/developers/discussion/4480469/reproducible-testcase-for-wrong-results
rem     https://livesql.oracle.com/apex/livesql/s/jzc2uyw6ecf2z2ul35nyrxelv
rem

drop table test_dwf_sapfi;
drop table test_sapfi_coicar_at5dat11;
purge recyclebin;

--
-- Don't do this unless it's a private system
-- Many sites seem to have the defaults anyway
--

execute dbms_stats.delete_system_stats

create table test_sapfi_coicar_at5dat11(
        datuct date,
        datumzprac number(8,0)
 ) 
row store compress advanced 
partition by list (datumzprac) (
        partition p20000101 values (20000101)
)
;

alter table test_sapfi_coicar_at5dat11 add partition p20200414 values (20200414);

insert /*+ append */ into test_sapfi_coicar_at5dat11
select date'2019-11-20' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2019-12-20' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2019-12-29' datuct, 20200414 datumzprac from dual connect by level <   4 union all
select date'2020-01-01' datuct, 20200414 datumzprac from dual connect by level <  55 union all
select date'2020-01-08' datuct, 20200414 datumzprac from dual connect by level <   3 union all
select date'2020-01-13' datuct, 20200414 datumzprac from dual connect by level <   8 union all
select date'2020-01-14' datuct, 20200414 datumzprac from dual connect by level <  117 union all
select date'2020-01-15' datuct, 20200414 datumzprac from dual connect by level <  65 union all
select date'2020-01-30' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2020-01-31' datuct, 20200414 datumzprac from dual connect by level <  12 union all
select date'2020-02-01' datuct, 20200414 datumzprac from dual connect by level <  20 union all
select date'2020-02-05' datuct, 20200414 datumzprac from dual connect by level <   4 union all
select date'2020-02-10' datuct, 20200414 datumzprac from dual connect by level <   5 union all
select date'2020-02-12' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2020-02-17' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2020-02-21' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-02-29' datuct, 20200414 datumzprac from dual connect by level <   37 union all
select date'2020-03-01' datuct, 20200414 datumzprac from dual connect by level < 1851 union all
select date'2020-03-02' datuct, 20200414 datumzprac from dual connect by level <  227 union all
select date'2020-03-03' datuct, 20200414 datumzprac from dual connect by level <   75 union all
select date'2020-03-04' datuct, 20200414 datumzprac from dual connect by level <   19 union all
select date'2020-03-05' datuct, 20200414 datumzprac from dual connect by level <  107 union all
select date'2020-03-06' datuct, 20200414 datumzprac from dual connect by level <  163 union all
select date'2020-03-07' datuct, 20200414 datumzprac from dual connect by level <   72 union all
select date'2020-03-08' datuct, 20200414 datumzprac from dual connect by level <   78 union all
select date'2020-03-09' datuct, 20200414 datumzprac from dual connect by level <  187 union all
select date'2020-03-10' datuct, 20200414 datumzprac from dual connect by level <  124 union all
select date'2020-03-11' datuct, 20200414 datumzprac from dual connect by level <   92 union all
select date'2020-03-12' datuct, 20200414 datumzprac from dual connect by level <  137 union all
select date'2020-03-13' datuct, 20200414 datumzprac from dual connect by level <  397 union all
select date'2020-03-14' datuct, 20200414 datumzprac from dual connect by level <   52 union all
select date'2020-03-15' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-03-16' datuct, 20200414 datumzprac from dual connect by level <  622 union all
select date'2020-03-17' datuct, 20200414 datumzprac from dual connect by level <  215 union all
select date'2020-03-18' datuct, 20200414 datumzprac from dual connect by level <  299 union all
select date'2020-03-19' datuct, 20200414 datumzprac from dual connect by level <  265 union all
select date'2020-03-20' datuct, 20200414 datumzprac from dual connect by level <  627 union all
select date'2020-03-21' datuct, 20200414 datumzprac from dual connect by level <   52 union all
select date'2020-03-22' datuct, 20200414 datumzprac from dual connect by level <   60 union all
select date'2020-03-23' datuct, 20200414 datumzprac from dual connect by level <  168 union all
select date'2020-03-24' datuct, 20200414 datumzprac from dual connect by level <  255 union all
select date'2020-03-25' datuct, 20200414 datumzprac from dual connect by level <  185 union all
select date'2020-03-26' datuct, 20200414 datumzprac from dual connect by level <  240 union all
select date'2020-03-27' datuct, 20200414 datumzprac from dual connect by level <  663 union all
select date'2020-03-28' datuct, 20200414 datumzprac from dual connect by level <   88 union all
select date'2020-03-29' datuct, 20200414 datumzprac from dual connect by level <  771 union all
select date'2020-03-30' datuct, 20200414 datumzprac from dual connect by level <  328 union all
select date'2020-03-31' datuct, 20200414 datumzprac from dual connect by level < 1675 union all
select date'2020-04-01' datuct, 20200414 datumzprac from dual connect by level <  641 union all
select date'2020-04-02' datuct, 20200414 datumzprac from dual connect by level <  251 union all
select date'2020-04-03' datuct, 20200414 datumzprac from dual connect by level <   84 union all
select date'2020-04-06' datuct, 20200414 datumzprac from dual connect by level <  325 union all
select date'2020-04-07' datuct, 20200414 datumzprac from dual connect by level <  366 union all
select date'2020-04-08' datuct, 20200414 datumzprac from dual connect by level <  459 union all
select date'2020-04-09' datuct, 20200414 datumzprac from dual connect by level < 2470 union all
select date'2020-04-10' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-04-11' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-04-12' datuct, 20200414 datumzprac from dual connect by level <   24 union all
select date'2020-04-13' datuct, 20200414 datumzprac from dual connect by level <  130 union all
select date'2020-04-14' datuct, 20200414 datumzprac from dual connect by level <    9  -- > change this value and the final (wrong) result changes in synch
/

commit
/

--
-- There are no indexes, so this method_opt collects fewer stats than expected
-- No column stats on the partition(s), only partition row and block stats
-- It does get basic column stats at the table level.
--

declare
        schema_name varchar2(128);
begin
        select sys_context('userenv', 'current_schema') into schema_name from dual;

        dbms_stats.gather_table_stats(
                ownname          => schema_name,
                tabname          => 'test_sapfi_coicar_at5dat11',
                partname         => 'p20200414',
                estimate_percent => dbms_stats.auto_sample_size,
                method_opt       => 'for all indexed columns size auto'
        );
end;
/

create table test_dwf_sapfi (
        datumucetnipom_code number(8,0) not null enable
) 
row store compress advanced 
partition by list (datumucetnipom_code) (
        partition p20000101 values (20000101) 
)
/

begin
        for i in (
                select  distinct to_char(datuct, 'yyyymmdd') datumucetnipom_code 
                from    test_sapfi_coicar_at5dat11 
                order by 
                        1
        ) loop
                execute immediate 
                        'alter table test_dwf_sapfi add partition p' || 
                                i.datumucetnipom_code || 
                                ' values (' || i.datumucetnipom_code || ')'
                ;
        end loop;
end;
/


insert  /*+ append */ into test_dwf_sapfi 
select  to_number(to_char(datuct, 'yyyymmdd')) 
from    test_sapfi_coicar_at5dat11 
where   datumzprac = 20200414
;

commit;

--
--  The problems (seem to) go away if you collect stats
--

-- execute dbms_stats.gather_table_stats(user,'test_dwf_sapfi',granularity=>'global')


set serveroutput off
set linesize 255
set pagesize 60
set trimspool on

alter session set statistics_level='all';

prompt  ===================================
prompt  plan with incorrect use_hash() hint
prompt  ===================================

select 
        /*+ use_hash(dwf) */ 
        count(*) count_hash 
from 
        test_dwf_sapfi  dwf
where
         exists (
                select  1 
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        );

select * from table(dbms_xplan.display_cursor(format=>'cost outline allstats last partition hint_report adaptive'));

set serveroutput on
spool off

Update (Sept 2021)

The wrong results still appear in 21.3.0.0

January 26, 2021

Index Hints

Filed under: CBO,dbms_xplan,Execution plans,Hints,Ignoring Hints,Indexing,Oracle — Jonathan Lewis @ 4:28 pm GMT Jan 26,2021

At the end of the previous post on index hints I mentioned that I had been prompted to complete a draft from a few years back because I’d been sent an email by Kaley Crum showing the optimizer ignoring an index_rs_asc() hint in a very simple query. Here, with some cosmetic changes, is the example he sent me.

rem
rem     Script: index_rs_kaley.sql
rem     Dated:  Dec 2020
rem     Author: Kaley Crum
rem
rem     Last tested
rem             19.3.0.0
rem

create table range_scan_me(
        one,
        letter 
)
compress
nologging
as
with rowgen_cte as (
        select  null
        from    dual
        connect by level <=  11315
)
select
        1 one,
        case 
                when rownum <=  64e5     then 'A'
                when rownum  =  64e5 + 1 then 'B'
                when rownum <= 128e5     then 'C' 
        end     letter
from 
        rowgen_cte a
cross join 
        rowgen_cte b 
where 
        rownum <= 128e5
;

create index one_letter_idx on range_scan_me(one, letter) nologging;

The table has 12.8 million rows. Of the two columns the first always holds the value 1, the second has one row holding the value ‘B’, and 6.4M rows each holding ‘A’ and ‘C’. On my laptop it took about 20 seconds to create the table and 26 seconds to create the index; using a total of roughly 376 MB (29,000 blocks for the index, 18,500 blocks for the (compressed) table).
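A quick sanity check of the data distribution (a sketch – the counts follow directly from the case expression above):

select  letter, count(*)
from    range_scan_me
group by
        letter
order by
        letter
;

This should show roughly 6.4M rows each for ‘A’ and ‘C’, and a single row for ‘B’.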

Since this is running on 19.3 Oracle will have created basic statistics on the table and index as it created them. Significantly, though, the statistics created during data loading do not include histograms so the optimizer will not know that ‘B’ is a special case, all it knows is that there are three possible values for the letter column.
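You can confirm the absence of histograms with a quick look at the data dictionary – a sketch:

select  column_name, num_distinct, histogram
from    user_tab_cols
where   table_name = 'RANGE_SCAN_ME'
order by
        column_id
;

At this point both columns should report histogram = ‘NONE’.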

Time now to query the data:

set serveroutput off
alter session set statistics_level=all;

select 
        /*+ index_rs_asc(t1 (one, letter)) */ 
        letter, one
from 
        range_scan_me t1
where   one >= 1
and     letter = 'B'
/

select * from table(dbms_xplan.display_cursor(format=>'hint_report allstats last'));

I’ve told the optimizer to use an index range scan, using the “description” method to specify the index I want it to use. The hint is definitely valid, and the index can definitely be used in this way to get the correct result. But here’s the execution plan:

------------------------------------------------------------------------------------------------------
| Id  | Operation        | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                |      1 |        |      1 |00:00:00.01 |       8 |      4 |
|*  1 |  INDEX SKIP SCAN | ONE_LETTER_IDX |      1 |   4266K|      1 |00:00:00.01 |       8 |      4 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ONE">=1 AND "LETTER"='B' AND "ONE" IS NOT NULL)
       filter("LETTER"='B')

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
         U -  index_rs_asc(t1 (one, letter))

The plan gives us two surprises: first it ignores (and reports that it is ignoring) a perfectly valid hint. Secondly it claims to be using an index skip scan even though the common understanding of a skip scan is that it will be used when “the first column of the index doesn’t appear in the where clause”.

We can infer that the plan is truthful because it has taken only 8 buffer visits to get the result – that’s probably a probe down to the (1,’B’) index entry, then another probe to see if the last index leaf block has any entries in it where column one is greater than 1.
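A sketch of the check that supports that arithmetic – each probe visits roughly blevel + 1 blocks, so two probes account for most (if not all) of the 8 buffer gets:

select  blevel, leaf_blocks, distinct_keys
from    user_indexes
where   index_name = 'ONE_LETTER_IDX'
;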

But there are a couple of little oddities about this “ignoring the index” line. First, if we hadn’t hinted the query at all it would have done a tablescan, so the “index” bit of the hint is being obeyed even if the “rs” bit isn’t. Then there’s this:

select 
        /*+ index_rs_desc(t1 (one, letter)) */ 
        letter, one
from 
        range_scan_me t1
where   one >= 1
and     letter = 'B'
/

-------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                |      1 |        |      1 |00:00:00.01 |       8 |
|*  1 |  INDEX SKIP SCAN DESCENDING| ONE_LETTER_IDX |      1 |   4266K|      1 |00:00:00.01 |       8 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ONE">=1 AND "LETTER"='B' AND "ONE" IS NOT NULL)
       filter("LETTER"='B')

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
         U -  index_rs_desc(t1 (one, letter))

If we change the index_rs_asc() to index_rs_desc(), the optimizer still ignores the “range scan” bit of the hint, but honours the “descending” bit – we get an index skip scan descending.

Of course this example is a very extreme case – nevertheless it is a valid example of the optimizer behaving in a way that doesn’t seem very user-friendly. If we add ‘outline’ to the format options for the call to dbms_xplan.display_cursor() we’ll find that the index_ss_asc() and index_ss_desc() hints have been substituted for our attempted index_rs_asc() and index_rs_desc().

So, if we really are confident that an index range scan would work a lot better than an index skip scan what could we do? We could try telling it to use an index (possibly even an index range scan ascending), but not to do an index skip scan. Let’s test that and include the Outline Information in the execution plan:

select 
        /*+ index(t1) no_index_ss(t1) */
        letter, one
from 
        range_scan_me t1
where   one >= 1
and     letter = 'B'
;


select * from table(dbms_xplan.display_cursor(format=>'hint_report allstats last outline'));


---------------------------------------------------------------------------------------------
| Id  | Operation        | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                |      1 |        |      1 |00:00:00.78 |   14290 |
|*  1 |  INDEX RANGE SCAN| ONE_LETTER_IDX |      1 |   4266K|      1 |00:00:00.78 |   14290 |
---------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX(@"SEL$1" "T1"@"SEL$1" ("RANGE_SCAN_ME"."ONE" "RANGE_SCAN_ME"."LETTER"))
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ONE">=1 AND "LETTER"='B' AND "ONE" IS NOT NULL)
       filter("LETTER"='B')

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index(t1)
           -  no_index_ss(t1)

It worked – we can see the index range scan, and we can see in the Buffers column of the plan why an index range scan was a bad idea – it’s taken 14,290 buffer visits to get the right result. If you check the index size I mentioned further up the page, and think about how I defined the data, you’ll realise that Oracle has started an index range scan at the leaf block holding (1,’B’) – which is half way along the index – and then walked every leaf block from there to the end of the index in an attempt to find any index entries with column one greater than 1.

The other thing to notice here is that the hint in the Outline Information is given as:

INDEX(@"SEL$1" "T1"@"SEL$1" ("RANGE_SCAN_ME"."ONE" "RANGE_SCAN_ME"."LETTER"))

This was the hint that appeared in the outline whether I used the index() hint or the index_rs_asc() hint in the query. Similarly, when I tried index_desc() or index_rs_desc() as the hint the outline reported index_desc() in both cases.

If I try adding just this hint to the query the plan goes back to a skip scan. It’s another case where the hints in the Outline Information (hence, possibly, an SQL Plan Baseline) don’t reproduce the plan that the outline claims to be describing.

Summary

Does Oracle ignore hints?

It looks as if the answer is still no, except it seems to think that a skip scan is just a special case of a range scan (and, from the previous article, a range scan is just a special case of a skip scan). So if you want to ensure that Oracle uses your preferred index strategy you may have to think about including various “no_index” hints to block the indexes you don’t want Oracle to use, and then no_index_ss() and no_index_ffs() to make sure it doesn’t use the wrong method for the index you do want to use. Even then you may find you don’t have quite enough options to block every index option that you’d like to block.
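As a sketch of what that belt-and-braces approach might look like for the example in this note (still worth checking the hint report to make sure none of the hints end up in conflict):

select
        /*+
                index(t1 (one, letter))
                no_index_ss(t1)
                no_index_ffs(t1)
        */
        letter, one
from
        range_scan_me t1
where   one >= 1
and     letter = 'B'
;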

January 25, 2021

Index Hints

Filed under: CBO,dbms_xplan,Hints,Ignoring Hints,Index skip scan,Indexing,Oracle — Jonathan Lewis @ 4:59 pm GMT Jan 25,2021

I’ve lost count of the number of times I’ve reminded people that hinting (correctly) is hard. Even the humble /*+ index() */ hint and its close relatives are open to misunderstanding and accidental misuse, leading to complaints that “Oracle is ignoring my hint”.

Strange though it may seem, I’m still not 100% certain of what some of the basic index hints are supposed to do, and even the “hint report” in the most recent versions of dbms_xplan.display_xxx() hasn’t told me everything I’d like to know. So if you think you know all about hints and indexing this blog note is for you.

I’ll start with a brief, and approximate, timeline for the basic index hints – starting from 8.0

Version   Hint
8.0       index
8.1       index_asc, index_desc, index_ffs, no_index
9.?       index_ss, index_ss_asc, index_ss_desc
10.1      no_index_ffs, no_index_ss
11.1      index_rs_asc, index_rs_desc

Saving these for later:
          index_combine(8.0), index_join(9.0), use_nl_with_index, use_invisible_indexes,
          parallel_index, local_indexes, index_stats, num_index_keys,
          change_dupkey_error_index, ignore_row_on_dupkey_index,
          domain_index_filter, domain_index_no_sort, domain_index_sort,
          xmlindex_rewrite, xmlindex_rewrite_in_select, xmlindex_sel_idx_tbl

For completeness I’ve included the more exotic index-related hints in the list (without a version), and I’ve even highlighted the rarely seen use_nl_with_index() hint to remind myself to raise a rhetorical question about it at the end of this piece.

In this list you’ll notice that the only hint originally available directed the optimizer to access a table by index, but in 8.1 that changed so that we could

  1. tell the optimizer about indexes it should not use
  2. specify whether the index access should use the index in ascending or descending order
  3. use an index fast full scan.

In 9i Oracle then introduced the index skip scan, with the option to specify whether the skip scan should be in ascending or descending order. The index_ss hint seems to be no more than a synonym for the index_ss_asc hint (or should that be the other way round); as far as I can tell the index_ss() hint will not produce a descending skip scan.

You’ll note that there’s no hint to block an index skip scan until the hint no_index_ss() appears in 10g along with the no_index_ffs() hint to block the index fast full scan. Since 10g Oracle has become far more consistent about introducing both a “positive” and a “negative” version of any hints it introduces for new optimizer mechanisms.

Finally we get to 11g and if you search MOS you may still be able to find the bug note (4323868.8) that introduced the index_rs_asc() and index_rs_desc() hints for index range scan ascending and descending.

From MOS Doc 4323868.8: “This fix adds new hints to enforce that an index is selected only if a start/stop keys (predicates) are used: INDEX_RS_ASC INDEX_RS_DESC”

This was necessary because by this time the index() hint allowed the optimizer to decide for itself how to use an index and it was quite difficult to force it to use the strategy you really wanted.

It’s still a source of puzzlement to me that an explicit index() hint will sometimes be turned into an index_rs_asc() when you check the Outline Information from a call to dbms_xplan.display_xxx(), while at other times an explicit index_rs_asc() hint will be turned into a basic index() hint (which might not reproduce the original plan)!

The Warm-up

Here’s a little surprise that could only reveal itself in the 19c hint report – unless you were willing to read your way carefully through a 10053 (CBO) trace file in earlier versions of Oracle. It comes from a little investigation of the index_ffs() hint that I’ve kept repeating over the last 20 years.

rem
rem     Script:         c_indffs.sql
rem     Dated:          March 2001
rem     Author:         Jonathan Lewis
rem

create table t1
nologging
as
select 
        rownum                  id,
        rpad(mod(rownum,50),10) small_vc,
        rpad('x',50)            padding
from
        all_objects
where
        rownum <= 3000
;

alter table t1 modify id not null;

create index t_i1 on t1(id);
create index t_i2 on t1(small_vc,id);

set autotrace traceonly explain

select 
        count(small_vc)
from    t1
where
        id > 2750
;

select 
        /*+ index(t1) */
        count(small_vc)
from    t1
where
        id > 2750
;

select 
        /*+ index_ffs(t1) */
        count(small_vc)
from    t1
where
        id > 2750
;

select 
        /*+ index_ffs(t1) no_index(t1) */
        count(small_vc)
from    t1
where
        id > 2750
;

set autotrace off

I’ve created a table with two indexes, and then enabled autotrace to get the execution plans for 4 queries that vary only in their hinting. Here’s the plan (on 19.3, with my settings for system statistics) for the first query:

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    15 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |    15 |            |          |
|*  2 |   INDEX FAST FULL SCAN| T_I2 |   250 |  3750 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">2750)

The plan shows an index fast full scan on the t_i2 (two-column) index. If I add an index() hint to this query do you think that would allow Oracle to continue using the index fast full scan, or would it force Oracle into some other path?

Here’s the plan for the query hinted with index(t1):

---------------------------------------------------------------------------------------------
| Id  | Operation                            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |     1 |    15 |     5   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |      |     1 |    15 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1   |   250 |  3750 |     5   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T_I1 |   250 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("ID">2750)

The optimizer has chosen an index range scan on the (single-column) t1 index. Since this path costs more than the index fast full scan it would appear that the index() hint does not allow the optimizer to consider an index fast full scan.

So we might decide that an index_ffs() hint is appropriate to secure the plan we want – and here’s the plan we get with that hint:

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    15 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |    15 |            |          |
|*  2 |   INDEX FAST FULL SCAN| T_I2 |   250 |  3750 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">2750)

As expected we get the index fast full scan we wanted. But we might want to add belts and braces – let’s include a no_index() hint to make sure that the optimizer doesn’t consider any other strategy for using an index. Since we’ve seen that the index() hint isn’t associated with the index fast full scan path it seems reasonable to assume that the no_index() hint is also not associated with the index fast full scan path.

Here’s the plan we get from the final variant of my query with index_ffs(t1) no_index(t1):

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    15 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |    15 |            |          |
|*  2 |   INDEX FAST FULL SCAN| T_I2 |   250 |  3750 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">2750)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (2))
---------------------------------------------------------------------------
   2 -  SEL$1 / T1@SEL$1
         U -  index_ffs(t1) / hint conflicts with another in sibling query block
         U -  no_index(t1) / hint conflicts with another in sibling query block

The query has produced the execution plan we wanted – but only by “accident”. The hint report (which, by default, is the version that reports only the erroneous or unused hints) tells us that both hints have been ignored because they each conflict with some other hint in a “sibling” query block. In this case they’re conflicting with each other.

So the plan we got was our original unhinted plan – which made it look as if we’d used exactly the right hints to ensure that we’d made the plan completely reproducible. But our hints were wrong, and such (previously invisible) errors can easily lead to complaints about the optimizer ignoring hints.

The Main Event

The previous section was about an annoying little inconsistency in the way in which the “negative” version of a hint may not correspond exactly to the “positive” version. There’s a more worrying issue to address when you try to be more precise in your use of basic index hints.

We’ve seen that an index() hint could mean almost anything other than an index fast full scan, while a no_index() hint (probably) blocks all possible uses of an index, but would you expect an index_rs_asc() hint to produce a skip scan, or an index_ss_asc() hint to produce a range scan? Here’s another old script of mine to create some data and test some hints:

rem
rem     Script:         skip_scan_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2009
rem

create table t1
as
with generator as (
        select  --+ materialize
                rownum  id
        from    all_objects 
        where   rownum <= 3000  -- > hint to avoid wordpress format issue
)
select
        mod(rownum,300)                                 addr_id300,
        mod(rownum,200)                                 addr_id200,
        mod(rownum,100)                                 addr_id100,
        mod(rownum,50)                                  addr_id050,
        trunc(sysdate) + trunc(mod(rownum,2501)/3)      effective_date,
        lpad(rownum,10,'0')                             small_vc,
        rpad('x',050)                                   padding
--      rpad('x',100)                                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 250000   -- > hint to avoid wordpress format issue
;

create index t1_i1 on t1(effective_date);
create index t1_i300 on t1(addr_id300, effective_date);
create index t1_i200 on t1(addr_id200, effective_date);
create index t1_i100 on t1(addr_id100, effective_date);
create index t1_i050 on t1(addr_id050, effective_date);

I’ve created a table with rather more indexes than I’ll be using. The significant indexes are t1_i1(effective_date), and t1_i050(addr_id050, effective_date). The former will be available for range scans, the latter for skip scans, when I test queries with predicates only on effective_date.

Choice of execution path can be affected by the system stats, so I need to point out that I’ve set mine with the following code:

begin
        dbms_stats.set_system_stats('MBRC',16);
        dbms_stats.set_system_stats('MREADTIM',10);
        dbms_stats.set_system_stats('SREADTIM',5);
        dbms_stats.set_system_stats('CPUSPEED',500);
exception
        when others then null;  -- with apologies to Tom Kyte
end;
/

And I’ll start with a couple of “baseline” queries and execution plans:

explain plan for
select 
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

select * from table(dbms_xplan.display(format=>'hint_report'));

alter index t1_i1 invisible;

explain plan for
select 
        /*+ index(t1) */
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

You’ll notice that between the two queries I’ve made the t1_i1 index invisible, and it will stay that way for a couple more tests. Here are the first two execution plans:

Unhinted
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1500 | 28500 |   428   (9)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  1500 | 28500 |   428   (9)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00',
              'syyyy-mm-dd hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Hinted with index(t1)
-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |  1500 | 28500 |  1558   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1      |  1500 | 28500 |  1558   (1)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | T1_I050 |  1500 |       |    52   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
       filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index(t1)

For the unhinted query I’ve managed to rig the data and system stats so that the chosen path is a full tablescan; then, when I add the generic index(t1) hint, Oracle recognises and uses the hint in the best possible way, picking the lowest cost index skip scan.

A variation I won’t show here – if I change the hint to index_rs_asc(t1) the optimizer recognizes there is no (currently visible) index that could be used for an index range scan and does a full tablescan, reporting the hint as unused. It won’t try to substitute a skip scan for a range scan.

What happens if I now try the index_ss(t1) hint without specifying an index? Firstly with the t1_i1 index still invisible, then after making t1_i1 visible again:

explain plan for
select 
        /*+ index_ss(t1) */
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

select * from table(dbms_xplan.display(format=>'hint_report'));

Here are the two execution plans, first when t1_i1(effective_date) is still invisible:

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |  1500 | 28500 |  1558   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1      |  1500 | 28500 |  1558   (1)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | T1_I050 |  1500 |       |    52   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
       filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index_ss(t1)

As you might expect the optimizer has picked the t1_i050 index for a skip scan. (There are 3 other candidates for the skip scan, but since they have more distinct values for their leading column they all turn out to have a higher cost than t1_i050.)

So let’s make the t1_i1 index visible and see what the plan looks like:

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |  1500 | 28500 |   521   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |  1500 | 28500 |   521   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |  1500 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
         U -  index_ss(t1)

The optimizer picks an index range scan using the t1_i1 index, and reports the hint as unused! For years I told myself that an index skip scan was derived as a small collection of range scans, so an index range scan was technically a “degenerate” skip scan, i.e. one where the “small collection” consisted of exactly one element. Oracle 19c finally told me I was wrong – the optimizer is ignoring the hint.

The fact that it’s a sloppy hint and you could have been more precise is irrelevant – if the optimizer won’t do a skip scan when you specify a range scan (but watch out for the next “index hints” installment – see footnote), then it shouldn’t do a range scan when you specify a skip scan (but that’s just a personal opinion).

We should check, of course, that a precisely targeted skip scan hint works before complaining too loudly – would index_ss(t1 t1_i050), or index_ss(t1 t1_i300), work when there’s a competing index that could produce a lower cost range scan? The answer is yes.

explain plan for
select 
        /*+ index_ss(t1 t1_i050) */
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

select * from table(dbms_xplan.display(format=>'hint_report'));

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |  1500 | 28500 |  1558   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1      |  1500 | 28500 |  1558   (1)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | T1_I050 |  1500 |       |    52   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
       filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index_ss(t1 t1_i050)

If you specify a suitable index in the index_ss() hint then the optimizer will use it and won’t switch to the index range scan. You can, of course, specify the index by description rather than name, so the hint /*+ index_ss(t1 (addr_id050, effective_date)) */ or even a partial description like /*+ index_ss(t1 (addr_id050)) */ would have been equally valid and obeyed.
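
For example, a by-description version of the test would look like this (a sketch following the pattern above – I haven’t reproduced its output):

explain plan for
select 
        /*+ index_ss(t1 (addr_id050, effective_date)) */
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

select * from table(dbms_xplan.display(format=>'hint_report'));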

How much do you know?

I’ll finish off with a rhetorical question, which I’ll introduce with this description taken from the 19c SQL Tuning Guide section 9.2.1.6:

The related USE_NL_WITH_INDEX(table index) hint instructs the optimizer to join the specified table to another row source with a nested loops join using the specified table as the inner table. The index is optional. If no index is specified, then the nested loops join uses an index with at least one join predicate as the index key.

An intuitive response to this hint would be to assume that most people expect nested loops to use index unique scans or range scans into the second table. So what would your initial expectation be about the validity of use_nl_with_index() if the only way the index could be used was with an index skip scan, or a full scan, or a fast full scan? What if there were two join predicates and there’s a path which could do a nested loop if it used two indexes to do an index join (index_join()) or an index bitmap conversion (index_combine())? Come to that, how confident are you that the hint will work if the index specified is a bitmap index?
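
Just to fix the shape of the hint in your mind before you start experimenting, here’s a minimal sketch of how it might be written – the table, column and index names are purely hypothetical:

select
        /*+ leading(d) use_nl_with_index(e emp_dept_idx) */
        d.dept_name,
        e.emp_name
from
        departments     d,
        employees       e
where
        e.dept_no = d.dept_no
;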

Summary

It’s important to be as accurate and thorough as possible when using hints. Even when a hint is documented you may find that you can ask “what if” questions about the hint and find that the only way to get answers to your questions is to do several experiments.

If you’re going to put hints into production code, take at least a little time to say to yourself:

“I know what I want and expect this hint to do; are there any similar actions that it might also be allowed to trigger, and how could I check if I need to allow for them or block them?”

Footnote: This journey of rediscovery was prompted by an email from Kaley Crum who supplied me with an example of Oracle using an index skip scan when it had been hinted to do an index range scan.

January 20, 2021

Hint Errors

Filed under: 19c,dbms_xplan,Execution plans,Hints,Ignoring Hints,Oracle — Jonathan Lewis @ 11:06 am GMT Jan 20,2021

This is a list of possible explanations of errors that you might see in the Hint Report section of an execution plan. It’s just a list of the strings extracted from a chunk of the 19.3 executable around the area where I found something I knew could be reported, so it may have some errors and omissions – but there are plenty of things there that might give you some idea why (in earlier versions of Oracle) you might have seen Oracle “ignoring” a hint:

internally generated hint is being cleared
hint conflicts with another in sibling query block
hint overridden by another in parent query block
conflicting optimizer mode hints
duplicate hint
all join methods are excluded by hints
index specified in the hint doesn't exist
index specified in hint cannot be parallelized
incorrect number of indexes for AND_EQUAL
partition view set up
FULL hint is same as INDEX_FFS for IOT
access path is not supported for IOT
hint on view cannot be pushed into view
hint is discarded during view merging
duplicate tables in multi-table hint
conditions failed for array vector read
same QB_NAME hints for different query blocks
rejected by IGNORE_OPTIM_EMBEDDED_HINTS
specified number must be positive integer
specified number must be positive number
specified number must be >= 0 and <= 1
hint is only valid for serial SQL
hint is only valid for slave SQL
hint is only valid for dyn. samp. query
hint is only valid for update join ix qry
opt_estimate() without value list
opt_estimate() with conflicting values spec
hint overridden by NO_QUERY_TRANSFORMATION
hinted query block name is too long
hinted bitmap tree wasn't fully resolved
bitmap tree specified was invalid
Result cache feature is not enabled
Hint is valid only for select queries
Hint is not valid for this query block
Hint cannot be honored
Pred reorder hint has semantic error
WITH_PLSQL used in a nested query
ORDER_SUBQ with less than two subqueries
conflicting OPT_PARAM hints
conflicting optimizer_feature_enable hints
because of _optimizer_ignore_parallel_hints
conflicting JSON_LENGTH hints

Update August 2021 – New items in 21.3

Hint id larger than number of union groups
ORDER_KEY_VECTOR_USE with less than two IDs
ORDER_SUBQ referenced query block name, which cannot be found
Same table referenced in both lists

February 24, 2020

Fake Baselines – 2

Filed under: Bugs,dbms_xplan,Execution plans,Hints,Ignoring Hints,Oracle — Jonathan Lewis @ 3:38 pm GMT Feb 24,2020

Several years ago (2011) I wrote a note describing how you could attach the Outline Information from one query to the SQL_ID of another query using the official Oracle mechanism of calling dbms_spm.load_plans_from_cursor_cache(). Shortly after publishing that note I drafted a follow-up note with an example demonstrating that even when the alternative outline was technically relevant the optimizer might still fail to use the SQL Plan Baseline. Unfortunately I didn’t quite finish the draft – until today.

The example I started with nearly 10 years ago behaved correctly against 11.1.0.7, but failed to reproduce the plan when I tested it against 11.2.0.3, and [ed: Dec 2021] it still fails against 21.3.0.0. Here’s the test data and the query we’re going to attempt to manipulate:


rem
rem     Script:         fake_sql_baseline_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2010
rem

create table emp1 (
        dept_no         number /* not null */,
        sal             number,
        emp_no          number,
        padding         varchar2(200),
        constraint e1_pk primary key(emp_no)
)
;

create table emp2 (
        dept_no         number /* not null */,
        sal             number,
        emp_no          number,
        padding         varchar2(200),
        constraint e2_pk primary key(emp_no)
)
;

insert into emp1
select
        mod(rownum,6),
        rownum,
        rownum,
        rpad('x',200)
from
        all_objects
where
        rownum <= 20000 -- > comment to avoid wordpress format issue
;

insert into emp2
select
        mod(rownum,6),
        rownum,
        rownum,
        rpad('x',200)
from
        all_objects
where
        rownum <= 20000 -- > comment to avoid wordpress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname                 => user,
                tabname                 => 'EMP1',
                cascade                 => true,
                method_opt              =>'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname                 => user,
                tabname                 => 'EMP2',
                cascade                 => true,
                method_opt              =>'for all columns size 1'
        );
end;
/

select
        /*+ target_query */
        count(*)
from
        emp1
where
        emp1.dept_no not in (
                select  dept_no
                from    emp2
        )
;

select * from table(dbms_xplan.display_cursor(null, null, 'outline'));

I haven’t included the code I run on my testbed to delete all existing SQL Plan Baselines before running this test; I’ll post that at the end of the article.
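
For reference, the core of such a script is a loop along the following lines (a sketch, not necessarily the exact code I use):

declare
        n       pls_integer;
begin
        for r in (
                select  distinct sql_handle
                from    dba_sql_plan_baselines
        ) loop
                -- dropping by sql_handle alone drops every plan for that handle
                n := dbms_spm.drop_sql_plan_baseline(sql_handle => r.sql_handle);
                dbms_output.put_line('Plans dropped for ' || r.sql_handle || ': ' || n);
        end loop;
end;
/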

The query is very simple and will, of course, return no rows since emp1 and emp2 are identical and we’re looking for departments in emp1 that don’t appear in emp2. The “obvious” plan for the optimizer is to unnest the subquery into a distinct (i.e. aggregate) inline view then apply an anti-join. It’s possible that the optimizer will also decide to do complex view merging and postpone the aggregation. Here’s the execution plan from 19.3:


----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |   168 (100)|          |
|   1 |  SORT AGGREGATE     |      |     1 |     6 |            |          |
|*  2 |   HASH JOIN ANTI NA |      |  3333 | 19998 |   168   (5)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EMP1"."DEPT_NO"="DEPT_NO")

Hint Report (identified by operation id / Query Block Name / Object Alias):

Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  target_query

As expected the subquery has been unnested and we have the anti-join (in this case, since dept_no can be null, it’s a “Null Aware Anti Join”); and the optimizer has, indeed, decided to do the join before the aggregation.

Assume, now, that for reasons known only to me a merge (anti) join would be more effective than a hash (anti) join. To get the optimizer to do this I’m going to capture the query and associate it with a plan that uses a merge join. There are several minor variations on how we could do this, but I’m going to follow the steps I took in 2011 – but cut out a couple of the steps where I loaded redundant baselines into the SMB (SQL Management Base). As a starting point I’ll just record the sql_id and plan_hash_value for the query (and the child_number just in case I want to use dbms_xplan.display_cursor() to report the in-memory execution plan):

column  sql_id                  new_value       m_sql_id_1
column  plan_hash_value         new_value       m_plan_hash_value_1
column  child_number            new_value       m_child_number_1

select
        sql_id, plan_hash_value, child_number
from
        v$sql
where
        sql_text like '%target_query%'
and     sql_text not like '%v$sql%'
and     rownum = 1
;

Now I’ll hack the query to produce a plan that does the merge join. An easy first step is to look at the current outline and take advantage of the hints there. You’ll notice I included the ‘outline’ format in my call to dbms_xplan.display_cursor() above, even though I didn’t show you that part of the output – here it is now:

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "EMP1"@"SEL$1")
      FULL(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "EMP1"@"SEL$1" "EMP2"@"SEL$2")
      USE_HASH(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      END_OUTLINE_DATA
  */

So I’m going to take the useful-looking hints, get rid of the use_hash() hint and, for good measure, turn it into a no_use_hash() hint. Here’s the resulting query, with its execution plan:

select
        /*+
                unnest(@sel$2)
                leading(@sel$5da710d3 emp1@sel$1 emp2@sel$2)
                no_use_hash(@sel$5da710d3 emp2@sel$2)
                full(@sel$5da710d3 emp2@sel$2)
                full(@sel$5da710d3 emp1@sel$1)
                alternate_query
        */
        count(*)
from
        emp1
where
        emp1.dept_no not in (
                select  dept_no
                from    emp2
        )
;

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |   178 (100)|          |
|   1 |  SORT AGGREGATE      |      |     1 |     6 |            |          |
|   2 |   MERGE JOIN ANTI NA |      |  3333 | 19998 |   178  (11)| 00:00:01 |
|   3 |    SORT JOIN         |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|*  5 |    SORT UNIQUE       |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("EMP1"."DEPT_NO"="DEPT_NO")
       filter("EMP1"."DEPT_NO"="DEPT_NO")

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "EMP1"@"SEL$1")
      FULL(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "EMP1"@"SEL$1" "EMP2"@"SEL$2")
      USE_MERGE(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      END_OUTLINE_DATA
  */

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  alternate_query

Note that I’ve included the text “alternate_query” at the end of the hint list as something to use when I’m searching v$sql. Note also that the no_use_hash() hint has disappeared and has been replaced by a use_merge() hint.

The plan tells us that the optimizer is happy to use a “merge join anti NA”, so we can load this plan’s outline information into the SMB by combining the sql_id and plan_hash_value of this query with the text of the previous query (you have to supply the text in older versions of Oracle, though you can use the sql_id of the original statement in recent versions) so that we can store the new plan with the old text:


column  sql_id                  new_value       m_sql_id_2
column  plan_hash_value         new_value       m_plan_hash_value_2
column  child_number            new_value       m_child_number_2

select
        sql_id, plan_hash_value, child_number
from
        v$sql
where
        sql_text like '%alternate_query%'
and     sql_text not like '%v$sql%'
and     rownum = 1
;

declare
        m_clob  clob;
begin
        select
                sql_fulltext
        into
                m_clob
        from
                v$sql
        where
                sql_id = '&m_sql_id_1'
        and     child_number = &m_child_number_1
        ;

        dbms_output.put_line(m_clob);

        dbms_output.put_line(
                'Number of plans loaded: ' ||
                dbms_spm.load_plans_from_cursor_cache(
                        sql_id                  => '&m_sql_id_2',
                        plan_hash_value         => &m_plan_hash_value_2,
                        sql_text                => m_clob,
                        fixed                   => 'YES',
                        enabled                 => 'YES'
                )
        );

end;
/

At this point we have one SQL Plan Baseline in the SMB, and it says the old query should execute using the new plan. So let’s give it a go:

set serveroutput off
alter system flush shared_pool;

alter session set events '10053 trace name context forever';

select
        /*+ target_query */
        count(*)
from
        emp1
where
        emp1.dept_no not in (
                select  dept_no
                from    emp2
        )
/

alter session set events '10053 trace name context off';

select * from table(dbms_xplan.display_cursor(null, null, 'alias outline'));

I’ve enabled the 10053 (optimizer) trace so that I can report a critical few lines from it later on. Here’s the execution plan, omitting the Outline Information but including the Query Block / Alias information.

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |   168 (100)|          |
|   1 |  SORT AGGREGATE     |      |     1 |     6 |            |          |
|*  2 |   HASH JOIN ANTI NA |      |  3333 | 19998 |   168   (5)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$5DA710D3
   3 - SEL$5DA710D3 / EMP1@SEL$1
   4 - SEL$5DA710D3 / EMP2@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EMP1"."DEPT_NO"="DEPT_NO")

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  target_query

Note
-----
   - Failed to use SQL plan baseline for this statement

We haven’t used the SQL Plan Baseline – and in 19.3 we even have a note that the optimizer knew there was at least one baseline available that it failed to use! So what went wrong?

I have two diagnostics – the first is the content of the baseline itself (warning – the SQL below will report ALL currently saved SQL Plan Baselines); I’ve just made sure that I have only one to report:

set linesize 90

select
        pln.*
from
        (select sql_handle, plan_name
         from   dba_sql_plan_baselines spb
         order by
                sql_handle, plan_name
        ) spb,
        table(dbms_xplan.display_sql_plan_baseline(spb.sql_handle, spb.plan_name)) pln
;


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
SQL handle: SQL_ce3099e9e3bdaf2f
SQL text: select         /*+ target_query */         count(*) from         emp1
          where         emp1.dept_no not in (                 select  dept_no
                        from    emp2         )
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_cwc4tx7jvvbtg02bb0c12         Plan id: 45812754
Enabled: YES     Fixed: YES     Accepted: YES     Origin: MANUAL-LOAD-FROM-CURSOR-CACHE
Plan rows: From dictionary
--------------------------------------------------------------------------------

Plan hash value: 1517539632

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |   178 (100)|          |
|   1 |  SORT AGGREGATE      |      |     1 |     6 |            |          |
|   2 |   MERGE JOIN ANTI NA |      |  3333 | 19998 |   178  (11)| 00:00:01 |
|   3 |    SORT JOIN         |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|*  5 |    SORT UNIQUE       |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("EMP1"."DEPT_NO"="DEPT_NO")
       filter("EMP1"."DEPT_NO"="DEPT_NO")

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  alternate_query

We have an SQL Plan baseline that is accepted, enabled, and fixed; and it’s supposed to produce a “merge join anti NA”, and it clearly “belongs” to our query. So it should have been used.

Then we have the 10053 trace file, in which we find the following:


SPM: planId in plan baseline = 45812754, planId of reproduced plan = 1410137244
------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : TEST_USER
  plan_baseline signature  : 14857544400522555183
  plan_baseline plan_id    : 45812754
  plan_baseline hintset    :
    hint num  1 len 27 text: IGNORE_OPTIM_EMBEDDED_HINTS
    hint num  2 len 35 text: OPTIMIZER_FEATURES_ENABLE('19.1.0')
    hint num  3 len 20 text: DB_VERSION('19.1.0')
    hint num  4 len  8 text: ALL_ROWS
    hint num  5 len 29 text: OUTLINE_LEAF(@"SEL$5DA710D3")
    hint num  6 len 16 text: UNNEST(@"SEL$2")
    hint num  7 len 17 text: OUTLINE(@"SEL$1")
    hint num  8 len 17 text: OUTLINE(@"SEL$2")
    hint num  9 len 36 text: FULL(@"SEL$5DA710D3" "EMP1"@"SEL$1")
    hint num 10 len 36 text: FULL(@"SEL$5DA710D3" "EMP2"@"SEL$2")
    hint num 11 len 54 text: LEADING(@"SEL$5DA710D3" "EMP1"@"SEL$1" "EMP2"@"SEL$2")
    hint num 12 len 41 text: USE_MERGE(@"SEL$5DA710D3" "EMP2"@"SEL$2")

During optimization the optimizer has found that SQL Plan Baseline. We can see that the hints in the baseline are exactly the hints from the plan that we wanted – but the optimizer says it can’t reproduce the plan we wanted. In fact if you try adding these hints (omitting the first three) to the query itself in old versions of Oracle you’ll find that the merge join will appear as expected; but it won’t appear in versions newer than 11.1.0.7 and Oracle will use a hash join. If you want a merge join you’ll have to use the /*+ no_use_hash() */ hint and, as you can see, that won’t get into the SQL Plan Baseline.

Conclusion

This is just a simple example of how the optimizer may be able to produce a plan if hinted in one way, but the Outline Information consists of a different set of hints that won’t reproduce the plan they seem to describe. My no_use_hash() hint has been discarded and replaced with a use_merge() hint that fails to reproduce the merge join in circumstances that make me think there’s a bug in the optimizer.

If you happen to be unlucky you may find that the plan you really need to see can’t be forced through a SQL Plan Baseline. In this example it may be necessary to use the SQL Patch mechanism to include the no_use_hash() hint in a set of hints you associate with the query.
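
For example, in recent versions (12.2 onwards) the call might look something like this – a sketch only: the patch name is invented, and in older versions you would use dbms_sqldiag_internal.i_create_patch() with a slightly different parameter list:

declare
        patch_name      varchar2(128);
begin
        patch_name := dbms_sqldiag.create_sql_patch(
                sql_id          => '&m_sql_id_1',
                hint_text       => 'no_use_hash(@sel$5da710d3 emp2@sel$2)',
                name            => 'fake_baseline_patch'       -- invented name
        );
        dbms_output.put_line('Patch created: ' || patch_name);
end;
/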

 

December 30, 2019

Scalar Subq Bug

Filed under: CBO,dbms_xplan,Execution plans,Oracle,subqueries — Jonathan Lewis @ 9:30 am GMT Dec 30,2019

This is an observation that came up on the Oracle Developer Forum a couple of days ago, starting life as the fairly common problem:

I have a “select” that runs quickly but when I use it in a “create as select” it runs very slowly.

In many cases this simply means that the query was a distributed (or remote) query and the plan changed because the driving site changed from the remote server to the local server. There are a couple of other reasons why the plan used for a query changes when it is used in a CTAS, but distributed DML is the one most commonly seen.
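
As a reminder of the shape of that problem (the table and database link names here are invented): a query like the following may be optimised to drive from the remote site, but as soon as it becomes the select part of a CTAS the statement has to be driven from the local database, so the plan can change dramatically:

select  /*+ driving_site(r) */
        l.order_id,
        l.order_date,
        r.status_name
from    local_orders            l,
        remote_statuses@dblnk   r
where
        r.status_id = l.status_id
;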

In this example, though, the query was not a distributed query, it was a fully local query. There were three features to the query that were possibly suspect, though:

  • “ANSI” syntax
  • scalar subqueries in the select list
  • redundant “order by” clauses in inline views

The OP had supplied the (horrible) SQL in a text format along with images from the Enterprise Manager SQL Monitor screen showing the two execution plans – and two things were obvious from the plans: first, that the simple select had eliminated the scalar subqueries (which were redundant) while the CTAS had kept them in the plan, and second, that most of the elapsed time for the CTAS was spent in lots of executions of the scalar subqueries.

My first thought was that the problem was probably a quirk of how the optimizer translates “ANSI” SQL to Oracle-standard SQL, so I created a model that captured the key features of the problem – starting with 3 tables:

rem
rem     Script:         ctas_scalar_subq.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2019
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem             11.2.0.4
rem

create table t1 as
select * from all_objects
where rownum <= 10000 -- > comment to avoid wordpress format issue
;

alter table t1 add constraint t1_pk primary key(object_id);

create table t2 as
select * from t1
;

alter table t2 add constraint t2_pk primary key(object_id);

create table t3 as
select * from all_objects
where rownum <= 500 -- > comment to avoid wordpress format issue
;

alter table t3 add constraint t3_pk primary key(object_id);

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T2',
                method_opt  => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T3',
                method_opt  => 'for all columns size 1'
        );
end;
/

I’m going to use the small t3 table as the target for a simple scalar subquery in the select list of a query that selects some columns from t2; then I’m going to use that query as an inline view in a join to t1 and select some columns from the result. Here’s the starting query that’s going to become an inline view:


select 
        t2.*,
        (
        select  t3.object_type 
        from    t3 
        where   t3.object_id = t2.object_id
        )       t3_type
from
        t2
order by
        t2.object_id
;

And here’s how I join the result to t1:


explain plan for
        select
                v2.*
        from    (
                select
                        t1.object_id,
                        t1.object_name  t1_name,
                        v1.object_name  t2_name,
                        t1.object_type  t1_type,
                        v1.object_type  t2_type
                from
                        t1
                join (
                        select 
                                t2.*,
                                (
                                select  t3.object_type 
                                from    t3 
                                where   t3.object_id = t2.object_id
                                )       t3_type
                        from
                                t2
                        order by
                                t2.object_id
                )       v1
                on
                        v1.object_id = t1.object_id
                and     v1.object_type = 'TABLE'
                )       v2
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

The initial t2 query becomes an inline view called v1, and that becomes the second table in a join with t1. I’ve got the table and view in this order because initially the OP had an outer (left) join preserving t1 and I thought that that might be significant, but it turned out that it wasn’t.

Having joined t1 and v1 I’ve selected a small number of columns from the t1 and t2 tables and ignored the column that was generated by the inline scalar subquery. (This may seem a little stupid – but the same problem appears when the inline view is replaced with a stored view, which is a more realistic possibility.) Here’s the resulting execution plan (taken from 11.2.0.4 in this case):


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   476 | 31416 |    45  (12)| 00:00:01 |
|*  1 |  HASH JOIN           |      |   476 | 31416 |    45  (12)| 00:00:01 |
|   2 |   VIEW               |      |   476 | 15708 |    23  (14)| 00:00:01 |
|   3 |    SORT ORDER BY     |      |   476 | 41888 |    23  (14)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| T2   |   476 | 41888 |    22  (10)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | T1   | 10000 |   322K|    21   (5)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   4 - filter("T2"."OBJECT_TYPE"='TABLE')

I was a little surprised by this plan as I had expected the optimizer to eliminate the in-line “order by” in view v1 – but even when I changed the code to traditional Oracle join syntax the redundant and wasteful sort order by at operation 3 still took place. (You might note that the data will be reported in an order dictated by the order of the data arriving from the t1 tablescan thanks to the mechanics of a hash join, so the sort is a total waste of effort.)
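
For reference, here’s a traditional-syntax version of the query (a sketch of the sort of rewrite I mean – the plan it produced still included the sort order by):

explain plan for
        select
                v2.*
        from    (
                select
                        t1.object_id,
                        t1.object_name  t1_name,
                        v1.object_name  t2_name,
                        t1.object_type  t1_type,
                        v1.object_type  t2_type
                from
                        t1,
                        (
                        select 
                                t2.*,
                                (
                                select  t3.object_type 
                                from    t3 
                                where   t3.object_id = t2.object_id
                                )       t3_type
                        from
                                t2
                        order by
                                t2.object_id
                        )       v1
                where
                        v1.object_id   = t1.object_id
                and     v1.object_type = 'TABLE'
                )       v2
;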

The plus point, of course, is that the optimizer had been smart enough to eliminate the scalar subquery referencing t3. The value returned from t3 is not needed anywhere in the course of the execution, so it simply disappears.

Now we change from a simple select to a Create as Select (CTAS) which I’ve run, with rowsource execution stats enabled, using Oracle 19.3 for this output:

set serveroutput off
set linesize 156
set trimspool on
set pagesize 60

alter session set statistics_level = all;

create table t4 as
        select  
                v2.*
        from    (
                select
                        t1.object_id,
                        t1.object_name  t1_name,
                        v1.object_name  t2_name,
                        t1.object_type  t1_type,
                        v1.object_type  t2_type
                from
                        t1
                join (
                        select 
                                t2.*,
                                (
                                select  t3.object_type 
                                from    t3 
                                where   t3.object_id = t2.object_id
                                )       t3_type
                        from
                                t2
                        order by 
                                t2.object_id
                )       v1
                on
                        v1.object_id = t1.object_id
                and     v1.object_type = 'TABLE'
                )       v2
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter session set statistics_level = typical;

And here’s the run-time execution plan – showing the critical error and statistics to prove that it really happened:

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Writes |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT           |       |      1 |        |      0 |00:00:00.01 |     471 |      3 |       |       |          |
|   1 |  LOAD AS SELECT                  | T4    |      1 |        |      0 |00:00:00.01 |     471 |      3 |  1042K|  1042K| 1042K (0)|
|   2 |   OPTIMIZER STATISTICS GATHERING |       |      1 |    435 |    294 |00:00:00.01 |     414 |      0 |   256K|   256K|  640K (0)|
|*  3 |    HASH JOIN                     |       |      1 |    435 |    294 |00:00:00.01 |     414 |      0 |  1265K|  1265K| 1375K (0)|
|   4 |     VIEW                         |       |      1 |    435 |    294 |00:00:00.01 |     234 |      0 |       |       |          |
|   5 |      TABLE ACCESS BY INDEX ROWID | T3    |    294 |      1 |     50 |00:00:00.01 |      54 |      0 |       |       |          |
|*  6 |       INDEX UNIQUE SCAN          | T3_PK |    294 |      1 |     50 |00:00:00.01 |       4 |      0 |       |       |          |
|   7 |      SORT ORDER BY               |       |      1 |    435 |    294 |00:00:00.01 |     234 |      0 | 80896 | 80896 |71680  (0)|
|*  8 |       TABLE ACCESS FULL          | T2    |      1 |    435 |    294 |00:00:00.01 |     180 |      0 |       |       |          |
|   9 |     TABLE ACCESS FULL            | T1    |      1 |  10000 |  10000 |00:00:00.01 |     180 |      0 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   6 - access("T3"."OBJECT_ID"=:B1)
   8 - filter("T2"."OBJECT_TYPE"='TABLE')

You’ll notice that the VIEW at operation 4 reports the inline scalar subquery as operations 5 and 6, and the Starts column shows that the scalar subquery executes 294 times – which is the number of rows returned by the scan of table t2. Although my first thought was that this was an artefact of the transformation from ANSI to Oracle syntax it turned out that when I modified the two statements to use traditional Oracle syntax the same difference appeared. Finally I re-ran the CTAS after removing the order by clause in the in-line view and the redundant subquery disappeared from the execution plan.

Tiny Geek bit

It’s not immediately obvious why there should be such a difference between the select and the CTAS in this case, but the 10053 trace files do give a couple of tiny clues: the CTAS trace file includes the lines:

ORE: bypassed - Top query block of a DML.
TE: Bypassed: Top query block of a DML.
SQT:    SQT bypassed: in a transaction.

The first two suggest that we should expect some cases where DML statements optimise differently from simple queries. The last one is a further indication that differences may appear. (SQT – might this be “subquery transformation”? It doesn’t appear in the list of abbreviations in the trace file.)

Unfortunately the SELECT trace file also included the line:


SQT:     SQT bypassed: Disabled by parameter.

So “SQT” – whatever that is – being in or out of a transaction may not have anything to do with the difference.

Summary

There are cases where optimising a select statement is not sufficient as a strategy for optimising a CTAS statement. In this case it looks as if an inline view which was non-mergable (thanks to a redundant order by clause) produced the unexpected side-effect that a completely redundant scalar subquery in the select list of the inline view was executed during the CTAS even though it was transformed out of existence for the simple select.

There are some unexpected performance threats in “cut-and-paste” coding and in re-using stored views if you haven’t checked carefully what they do and how they’re supposed to be used.

Update (Feb 2020)

I’ve just run the test on 19.8 (available here on LiveSQL) and the redundant scalar subquery is still in the plan.

 

 

January 17, 2019

Hint Reports

Filed under: dbms_xplan,Execution plans,Hints,Oracle — Jonathan Lewis @ 9:59 am GMT Jan 17,2019

Nigel Bayliss has posted a note about a frequently requested feature that has now appeared in Oracle 19c – a mechanism to help people understand what has happened to their hints.  It’s very easy to use, it’s just another format option to the “display_xxx()” calls in dbms_xplan; so I thought I’d run up a little demonstration (using an example I first generated 18 years and 11 versions ago) to make three points: first, to show the sort of report you get, second to show you that the report may tell you what has happened, but that doesn’t necessarily tell you why it has happened, and third to remind you that you should have stopped using the /*+ ordered */ hint 18 years ago.

I’ve run the following code on livesql:


rem
rem     Script:         c_ignorehint.sql
rem     Author:         Jonathan Lewis
rem     Dated:          March 2001
rem


drop table ignore_1;
drop table ignore_2;

create table ignore_1
nologging
as
select
        rownum          id,
        rownum          val,
        rpad('x',500)   padding
from    all_objects
where   rownum <= 3000
;

create table ignore_2
nologging
as
select
        rownum          id,
        rownum          val,
        rpad('x',500)   padding
from    all_objects
where   rownum <= 500
;

alter table ignore_2
add constraint ig2_pk primary key (id);


explain plan for
update
        (
                select
                        /*+
                                ordered
                                use_nl(i2)
                                index(i2,ig2_pk)
                        */
                        i1.val  val1,
                        i2.val  val2
                from
                        ignore_1        i1,
                        ignore_2        i2
                where
                        i2.id = i1.id
                and     i1.val <= 10
        )
set     val1 = val2
;

select * from table(dbms_xplan.display(null,null,'hint_report'));

explain plan for
update
        (
                select
                        /*+
                                use_nl(i2)
                                index(i2,ig2_pk)
                        */
                        i1.val  val1,
                        i2.val  val2
                from
                        ignore_1        i1,
                        ignore_2        i2
                where
                        i2.id = i1.id
                and     i1.val <= 10
        )
set     val1 = val2
;

select * from table(dbms_xplan.display(null,null,'hint_report'));

As you can see I’ve simply added the format option “hint_report” to the call to dbms_xplan.display(). Before showing you the output I’ll just say a few words about the plans we might expect from the two versions of the update statement.

Given the /*+ ordered */ hint in the first statement we might expect Oracle to do a full tablescan of ignore_1 then do a nested loop into ignore_2 (obeying the use_nl() hint) using the (hinted) ig2_pk index. In the second version of the statement, and in the absence of the ordered hint, it’s possible that the optimizer will still use the same path but, in principle, it might find some other path.

So what do we get? Here, in order, are the two execution plans:


Plan hash value: 3679612214
 
--------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                      |          |    10 |   160 |   111   (0)| 00:00:01 |
|   1 |  UPDATE                               | IGNORE_1 |       |       |            |          |
|*  2 |   HASH JOIN                           |          |    10 |   160 |   111   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| IGNORE_2 |   500 |  4000 |    37   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN                   | IG2_PK   |   500 |       |     1   (0)| 00:00:01 |
|*  5 |    TABLE ACCESS STORAGE FULL          | IGNORE_1 |    10 |    80 |    74   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("I2"."ID"="I1"."ID")
   5 - storage("I1"."VAL"<=10)
       filter("I1"."VAL"<=10)
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$DA9F4B51
           -  ordered
 
   3 -  SEL$DA9F4B51 / I2@SEL$1
         U -  use_nl(i2)
           -  index(i2,ig2_pk)




Plan hash value: 1232653668
 
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT              |          |    10 |   160 |    76   (0)| 00:00:01 |
|   1 |  UPDATE                       | IGNORE_1 |       |       |            |          |
|   2 |   NESTED LOOPS                |          |    10 |   160 |    76   (0)| 00:00:01 |
|   3 |    NESTED LOOPS               |          |    10 |   160 |    76   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL | IGNORE_1 |    10 |    80 |    74   (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN         | IG2_PK   |     1 |       |     0   (0)| 00:00:01 |
|   6 |    TABLE ACCESS BY INDEX ROWID| IGNORE_2 |     1 |     8 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - storage("I1"."VAL"<=10)
       filter("I1"."VAL"<=10)
   5 - access("I2"."ID"="I1"."ID")
 
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   5 -  SEL$DA9F4B51 / I2@SEL$1
           -  index(i2,ig2_pk)
           -  use_nl(i2)

As you can see, the “Hint Report” shows us how many hints have been seen in the SQL text, then the body of the report shows us which query block, operation and table (where relevant) each hint has been associated with, and whether it has been used or not.

The second query has followed exactly the plan I predicted for the first query and the report has shown us that Oracle noted, and used, the use_nl() and index() hints to access table ignore_2, deciding for itself to visit the tables in the order ignore_1 -> ignore_2, and doing a full tablescan on ignore_1.

The first query reports three hints, but flags the use_nl() hint as unused. (There is (at least) one other flag that could appear against a hint – “E” for error (probably syntax error), so we can assume that this hint is not being ignored because there’s something wrong with it.) Strangely the report tells us that the optimizer has used the ordered hint but we can see from the plan that the tables appear to be in the opposite order to the order we specified in the from clause, and the chosen order has forced the optimizer into using an index full scan on ig2_pk because it had to obey our index() hint.  Bottom line – the optimizer has managed to find a more costly plan by “using but apparently ignoring” a hint that described the cheaper plan that we would have got if we hadn’t used the hint.

Explanation

Query transformation can really mess things up and you shouldn’t be using the ordered hint.

I’ve explained many times over the years that the optimizer evaluates the cost of an update statement by calculating the cost of selecting the rowids of the rows to be updated. In this case, which uses an updatable join view, the steps taken to follow this mechanism are slightly more complex. Here are two small but critical extracts from the 10053 trace file (taken from an 18c instance):


CVM:   Merging SPJ view SEL$1 (#0) into UPD$1 (#0)
Registered qb: SEL$DA9F4B51 0x9c9966e8 (VIEW MERGE UPD$1; SEL$1; UPD$1)

...

SQE: Trying SQ elimination.
Query after View Removal
******* UNPARSED QUERY IS *******
SELECT
        /*+ ORDERED INDEX ("I2" "IG2_PK") USE_NL ("I2") */
        0
FROM    "TEST_USER"."IGNORE_2" "I2",
        "TEST_USER"."IGNORE_1" "I1"
WHERE   "I2"."ID"="I1"."ID"
AND     "I1"."VAL"<=10


The optimizer has merged the UPDATE query block with the SELECT query block to produce a select statement that will produce the necessary plan (I had thought that i1.rowid would appear in the select list, but the ‘0’ will do for costing purposes). Notice that the hints have been preserved as the update and select were merged but, unfortunately, the merge mechanism has reversed the order of the tables in the from clause. So the optimizer has messed up our select statement, then obeyed the original ordered hint!

Bottom line – the hint report is likely to be very helpful in most cases but you will still have to think about what it is telling you, and you may still have to look at the occasional 10053 to understand why the report is showing you puzzling results. You should also stop using a hint that was replaced by a far superior hint more than 18 years ago – the ordered hint in my example should have been changed to /*+ leading(i1 i2) */ in Oracle 9i.
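
For reference, here’s the first statement with that substitution made (a sketch – I haven’t reproduced its plan here):

explain plan for
update
        (
                select
                        /*+
                                leading(i1 i2)
                                use_nl(i2)
                                index(i2,ig2_pk)
                        */
                        i1.val  val1,
                        i2.val  val2
                from
                        ignore_1        i1,
                        ignore_2        i2
                where
                        i2.id = i1.id
                and     i1.val <= 10
        )
set     val1 = val2
;

select * from table(dbms_xplan.display(null,null,'hint_report'));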

Footnote: There are three variants on the hint report format: ‘hint_report’, ‘hint_report_used’, ‘hint_report_unused’: the first reports the fate of all the hints you included in the SQL, the last two report only the used or unused (respectively) hints. Unused hints may also show the error that made a hint unusable. (See also Franck Pachot’s blog note on the hint report)

 

November 26, 2018

Shrink Space

Filed under: dbms_xplan,Execution plans,Oracle,Performance,subqueries — Jonathan Lewis @ 4:37 pm GMT Nov 26,2018

I have never been keen on the option to “shrink space” for a table because of the negative impact it can have on performance.

I don’t seem to have written about it in the blog but I think there’s something in one of my books pointing out that the command moves data from the “end” of the table (high extent ids) to the “start” of the table (low extent ids) by scanning the table backwards to find data that can be moved and scanning forwards to find space to put it. This strategy can have the effect of increasing the scattering of the data that you’re interested in querying if most of your queries are about “recent” data, and you have a pattern of slowly deleting aging data. (You may end up doing a range scan through a couple of hundred table blocks for data at the start of the table that was once packed into a few blocks near the end of the table.)

In a discussion with a member of the audience at the recent DOAG conference (we were talking about execution plans for queries that included filter subqueries) I suddenly thought of another reason why (for an unlucky person) the shrink space command could be a disaster – here’s a little fragment of code and output to demonstrate the point.


rem
rem     Script:         shrink_scalar_subq.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2018
rem     Purpose:
rem
rem     Versions tested
rem             12.2.0.1
rem

select
        /*+ gather_plan_statistics pre-shrink */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from
                emp outer
        where
                outer.sal > (
                        select  /*+ no_unnest */
                                avg(inner.sal)
                        from
                                emp inner
                        where
                                inner.dept_no = outer.dept_no
                )
        )
;

alter table emp enable row movement;
alter table emp shrink space compact;

select
        /*+ gather_plan_statistics post-shrink  */
        count(*)
from    (
        select  /*+ no_merge */
                outer.*
        from emp outer
        where outer.sal >
                (
                        select /*+ no_unnest */ avg(inner.sal)
                        from emp inner
                        where inner.dept_no = outer.dept_no
                )
        )
;

The two queries are the same and the execution plans are the same (the shrink command doesn’t change the object statistics, after all), but the execution time jumped from 0.05 seconds to 9.43 seconds – and the difference in timing wasn’t about delayed block cleanout or other exotic side effects.


  COUNT(*)
----------
      9498

Elapsed: 00:00:00.05


  COUNT(*)
----------
      9498

Elapsed: 00:00:09.43

The query is engineered to have a problem, of course, and enabling rowsource execution statistics exaggerates the anomaly – but the threat is genuine. You may have seen my posting (now 12 years old) about the effects of scalar subquery caching – this is another example of the wrong item of data appearing in the wrong place making us lose the caching benefit. The emp table I’ve used here is (nearly) the same emp table I used in the 2006 posting, but the difference between this case and the previous case is that I updated a carefully selected row to an unlucky value in 2006, but here in 2018 the side effects of a call to shrink space moved a row from the end of the table (where it was doing no harm) to the start of the table (where it had a disastrous impact).

Here are the two execution plans – before and after the shrink space – showing the rowsource execution stats. Note particularly the number of times the filter subquery ran – jumping from 7 to 3172 – the impact this has on the buffer gets, and the change in time recorded:

----------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |      1 |00:00:00.03 |    1880 |
|   1 |  SORT AGGREGATE       |      |      1 |      1 |      1 |00:00:00.03 |    1880 |
|   2 |   VIEW                |      |      1 |    136 |   9498 |00:00:00.03 |    1880 |
|*  3 |    FILTER             |      |      1 |        |   9498 |00:00:00.03 |    1880 |
|   4 |     TABLE ACCESS FULL | EMP  |      1 |  19001 |  19001 |00:00:00.01 |     235 |
|   5 |     SORT AGGREGATE    |      |      7 |      1 |      7 |00:00:00.02 |    1645 |
|*  6 |      TABLE ACCESS FULL| EMP  |      7 |   2714 |  19001 |00:00:00.02 |    1645 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("OUTER"."SAL">)
   6 - filter("INNER"."DEPT_NO"=:B1)


----------------------------------------------------------------------------------------
| Id  | Operation             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |      1 |        |      1 |00:00:09.42 |     745K|
|   1 |  SORT AGGREGATE       |      |      1 |      1 |      1 |00:00:09.42 |     745K|
|   2 |   VIEW                |      |      1 |    136 |   9498 |00:00:11.71 |     745K|
|*  3 |    FILTER             |      |      1 |        |   9498 |00:00:11.70 |     745K|
|   4 |     TABLE ACCESS FULL | EMP  |      1 |  19001 |  19001 |00:00:00.01 |     235 |
|   5 |     SORT AGGREGATE    |      |   3172 |      1 |   3172 |00:00:09.40 |     745K|
|*  6 |      TABLE ACCESS FULL| EMP  |   3172 |   2714 |     10M|00:00:04.33 |     745K|
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("OUTER"."SAL">)
   6 - filter("INNER"."DEPT_NO"=:B1)


Footnote:

For completeness, here’s the code to generate the emp table. It’s sitting in a tablespace using system managed extents and automatic segment space management.


create table emp(
        dept_no         not null,
        sal,
        emp_no          not null,
        padding,
        constraint e_pk primary key(emp_no)
)
as
with generator as (
        select  null
        from    dual
        connect by
                level <= 1e4 -- > comment to avoid wordpress format issue
)
select
        mod(rownum,6),
        rownum,
        rownum,
        rpad('x',60)
from
        generator       v1,
        generator       v2
where
        rownum <= 2e4 -- > comment to avoid wordpress format issue
;


insert into emp values(432, 20001, 20001, rpad('x',60));
delete /*+ full(emp) */ from emp where emp_no <= 1000;      -- > comment to avoid wordpress format issue
commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'EMP',
                method_opt       => 'for all columns size 1'
        );
end;
/



 

May 11, 2016

dbms_xplan

Filed under: dbms_xplan,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 12:22 pm BST May 11,2016

My favourite format options for dbms_xplan.display_cursor().

This is another of those posts where I tell you about something that I’ve frequently mentioned but never documented explicitly as a good (or, at least, convenient) idea. It’s also another example of how easy it is to tell half the story most of the time when someone asks a “simple” question.

You’re probably familiar with the idea of “tuning by cardinality feedback” – comparing the predicted data volumes with the actual data volumes from an execution plan – and I wrote a short note about how to make that comparison last week; and you’re probably familiar with making a call to dbms_xplan.display_cursor() after enabling the capture of rowsource execution statistics (in one of three ways) for the execution of the query, and the format parameter usually suggested for the call is ‘allstats last’ to get the execution stats for the most recent execution of the query. I actually like to see the Cost column of the execution plan as well, so I usually add that to the format, so (with all three strategies shown for an SQL*Plus environment):

set linesize 180
set trimspool on
set pagesize 60
set serveroutput off

alter session set "_rowsource_execution_statistics"=true;
alter session set statistics_level=all;

select /*+ gather_plan_statistics */ * from user_tablespaces;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

So what do we often forget to mention:

  • For SQL*Plus it is important to ensure that serveroutput is off
  • The /*+ gather_plan_statistics */ option uses sampling, so may be a bit inaccurate
  • The two accurate strategies may add a significant, sometimes catastrophic, amount of CPU overhead
  • This isn’t appropriate if the query runs parallel

For a parallel query the “last” execution of a query is typically carried out by the query co-ordinator, so the rowsource execution stats of many (or all) of the parallel execution slaves are likely to disappear from the output. If you’re testing with parallel queries you need to add some “tag” text to the query to make it unique and omit the ‘last’ option from the format string.
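
For example, a minimal sketch of that approach (the tag comment, the parallel degree, and the table name t1 are just placeholders): the new literal text forces a brand new cursor, so the cumulative statistics you get by dropping ‘last’ belong to this one test run.

select  /*+ parallel(2) */ /* px_test_tag_001 */
        count(*)
from    t1
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats cost'));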

Now, a common suggestion is that you need to add the ‘all’ format option instead – but this doesn’t mean “all executions”, it means (though doesn’t actually deliver) all the data that’s available about the plan. So here’s an execution plan produced after running a parallel query and using ‘allstats all’ as the format option (t1 is a copy of all_objects, and this demo is running on 12.1.0.2).

SQL_ID  51u5j42rvnnfg, child number 1
-------------------------------------
select  /*+   parallel(2)  */  object_type,  sum(object_id) from t1
group by object_type order by object_type

Plan hash value: 2919148568

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |       |   113 (100)|          |        |      |            |     30 |00:00:00.04 |       5 |      0 |       |       |          |
|   1 |  PX COORDINATOR          |          |      1 |        |       |            |          |        |      |            |     30 |00:00:00.04 |       5 |      0 |       |       |          |
|   2 |   PX SEND QC (ORDER)     | :TQ10001 |      0 |     30 |   420 |   113   (9)| 00:00:01 |  Q1,01 | P->S | QC (ORDER) |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   3 |    SORT GROUP BY         |          |      2 |     30 |   420 |   113   (9)| 00:00:01 |  Q1,01 | PCWP |            |     30 |00:00:00.01 |       0 |      0 |  2048 |  2048 |     2/0/0|
|   4 |     PX RECEIVE           |          |      2 |     30 |   420 |   113   (9)| 00:00:01 |  Q1,01 | PCWP |            |     50 |00:00:00.01 |       0 |      0 |       |       |          |
|   5 |      PX SEND RANGE       | :TQ10000 |      0 |     30 |   420 |   113   (9)| 00:00:01 |  Q1,00 | P->P | RANGE      |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   6 |       HASH GROUP BY      |          |      2 |     30 |   420 |   113   (9)| 00:00:01 |  Q1,00 | PCWP |            |     50 |00:00:00.05 |    1492 |   1440 |  1048K|  1048K|     2/0/0|
|   7 |        PX BLOCK ITERATOR |          |      2 |  85330 |  1166K|   105   (2)| 00:00:01 |  Q1,00 | PCWC |            |  85330 |00:00:00.03 |    1492 |   1440 |       |       |          |
|*  8 |         TABLE ACCESS FULL| T1       |     26 |  85330 |  1166K|   105   (2)| 00:00:01 |  Q1,00 | PCWP |            |  85330 |00:00:00.01 |    1492 |   1440 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   8 - SEL$1 / T1@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   8 - access(:Z>=:Z AND :Z<=:Z)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "OBJECT_TYPE"[VARCHAR2,23], SUM()[22]
   2 - (#keys=0) "OBJECT_TYPE"[VARCHAR2,23], SUM()[22]
   3 - (#keys=1; rowset=200) "OBJECT_TYPE"[VARCHAR2,23], SUM()[22]
   4 - (rowset=200) "OBJECT_TYPE"[VARCHAR2,23], SYS_OP_MSR()[25]
   5 - (#keys=1) "OBJECT_TYPE"[VARCHAR2,23], SYS_OP_MSR()[25]
   6 - (rowset=200) "OBJECT_TYPE"[VARCHAR2,23], SYS_OP_MSR()[25]
   7 - (rowset=200) "OBJECT_ID"[NUMBER,22], "OBJECT_TYPE"[VARCHAR2,23]
   8 - (rowset=200) "OBJECT_ID"[NUMBER,22], "OBJECT_TYPE"[VARCHAR2,23]

Note
-----
   - Degree of Parallelism is 2 because of hint


48 rows selected.

You’ll notice we’ve reported the “alias” and “projection” information – those are two of the format options that you can use with a + or – to include or exclude if you want. We’ve also got the E-Bytes and E-Time columns in the body of the plan. In other words (at least in my opinion) we’ve got extra information that makes the output longer and wider, and therefore harder to read.
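
If you want to stay closer to the full output you can also start from ‘allstats all’ and subtract the sections you don’t need – a sketch only, since the exact set of keywords accepted varies a little with version:

select * from table(dbms_xplan.display_cursor(null,null,'allstats all -alias -projection -bytes'));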

The format string I tend to use for parallel query is ‘allstats parallel cost’ – which (typically) gives something like the following:

SQL_ID  51u5j42rvnnfg, child number 1
-------------------------------------
select  /*+   parallel(2)  */  object_type,  sum(object_id) from t1
group by object_type order by object_type

Plan hash value: 2919148568

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows | Cost (%CPU)|    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |   113 (100)|        |      |            |     30 |00:00:00.04 |       5 |      0 |       |       |          |
|   1 |  PX COORDINATOR          |          |      1 |        |            |        |      |            |     30 |00:00:00.04 |       5 |      0 |       |       |          |
|   2 |   PX SEND QC (ORDER)     | :TQ10001 |      0 |     30 |   113   (9)|  Q1,01 | P->S | QC (ORDER) |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   3 |    SORT GROUP BY         |          |      2 |     30 |   113   (9)|  Q1,01 | PCWP |            |     30 |00:00:00.01 |       0 |      0 |  2048 |  2048 |     2/0/0|
|   4 |     PX RECEIVE           |          |      2 |     30 |   113   (9)|  Q1,01 | PCWP |            |     50 |00:00:00.01 |       0 |      0 |       |       |          |
|   5 |      PX SEND RANGE       | :TQ10000 |      0 |     30 |   113   (9)|  Q1,00 | P->P | RANGE      |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   6 |       HASH GROUP BY      |          |      2 |     30 |   113   (9)|  Q1,00 | PCWP |            |     50 |00:00:00.05 |    1492 |   1440 |  1048K|  1048K|     2/0/0|
|   7 |        PX BLOCK ITERATOR |          |      2 |  85330 |   105   (2)|  Q1,00 | PCWC |            |  85330 |00:00:00.03 |    1492 |   1440 |       |       |          |
|*  8 |         TABLE ACCESS FULL| T1       |     26 |  85330 |   105   (2)|  Q1,00 | PCWP |            |  85330 |00:00:00.01 |    1492 |   1440 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   8 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 2 because of hint


30 rows selected.

Of course you may prefer ‘allstats all’ – and sometimes I do actually want to see the alias or projection information – but I think there’s so much information available on the execution plan output that anything that makes it a little shorter, cleaner and tidier is a good thing.

You might have noticed, by the way, that the Buffers, Reads, and A-Time columns have still managed to lose information on the way up from operation 6; information that should have been summed up the plan has simply disappeared. Make sure you do a sanity check for disappearing numbers when you’re looking at more complex plans.

Update (Dec 2019)

Things we forget to mention (and which I did forget to mention above):

  • When comparing actuals with estimates (A-rows vs. E-rows) the basic guideline is “A-rows = E-rows * Starts” – see the worked example after this list
  • This is not always true – partitioning and parallel query need extra thought.
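
For example (made-up numbers): an operation reporting Starts = 50, E-Rows = 200 and A-Rows = 10,000 is a perfect estimate, not an error of two orders of magnitude, since 50 * 200 = 10,000.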

 

 

November 26, 2014

Lunchtime quiz

Filed under: CBO,dbms_xplan,Oracle — Jonathan Lewis @ 12:41 pm GMT Nov 26,2014

There was a question on OTN a few days ago asking the following question:

Here’s a query that ran okay on 11g, but crashed with Oracle error “ORA-01843: not a valid month” after upgrade to 12c; why ?

The generically correct answer, of course, is that the OP had been lucky (or unlucky, depending on your point of view) on 11g – and I’ll explain that answer in another blog posting.

That isn’t the point of this posting, though. This posting is a test of observation and deduction. One of the respondents in the thread had conveniently supplied a little bit of SQL that I copied and fiddled about with to demonstrate a point regarding CPU costing, but as I did so I thought I’d show you the following and ask a simple question.


drop table T;

Create Table T
As
with
periods as (
                  Select 'January' period, 1 cal  From Dual
        union all Select 'February' period , 2 cal From Dual
        union all Select 'March' period , 3 cal From Dual
        union all Select 'April' period , 4 cal From Dual
        union all Select 'May'  period, 5 cal From Dual
        union all Select 'June' period, 6 cal From Dual
        union all Select 'July' period, 7 cal From Dual
        union all Select 'August' period, 8 cal From Dual
        union all Select 'September' period, 9 cal  From Dual
        union all Select 'October' period, 10 cal From Dual
        union all Select 'November' period, 11 cal From Dual
        Union All Select 'December' Period, 12 Cal From Dual
        Union All Select '13 Series' Period, Null Cal  From Dual
)
Select  Period,Cal
from periods;

prompt  ==================================
prompt  When we invoke below SQL it works.
prompt  ==================================

set autotrace on explain

select *
from    (
        select
                Period,
                Cal,
                to_date(Period || ', ' || 2014,'Month, YYYY') col1 ,
                to_date('November, 2014','Month, YYYY') col2
        From  T
        Where  Cal > 0
        )
;

prompt  ================================================
prompt  But when we add comparison operations , it fails
prompt  ================================================

select *
from    (
        select
                Period,
                Cal,
                to_date(Period || ', ' || 2014,'Month, YYYY')   col1,
                to_date('November, 2014','Month, YYYY')         col2
        From  T
        Where  Cal > 0
        )
where
        col1 >= col2
;

set autotrace off



All I’ve done is create a table then run and generate the execution plans for two queries – with a comment that if you try to run one query it will succeed but if you try to run the other it will fail (and raise ORA-01843). As far as the original supplier was concerned, both queries succeeded in 11g and the failure of the second one appeared only in 12c. In fact, for reasons that I won’t discuss here, it is POSSIBLE for the failure to appear in 11g as well, though not necessarily with this exact data set.

Here’s the COMPLETE output I got from running the code above on an 11.2.0.4 instance:



Table dropped.


Table created.

==================================
When we invoke below SQL it works.
==================================

PERIOD           CAL COL1      COL2
--------- ---------- --------- ---------
January            1 01-JAN-14 01-NOV-14
February           2 01-FEB-14 01-NOV-14
March              3 01-MAR-14 01-NOV-14
April              4 01-APR-14 01-NOV-14
May                5 01-MAY-14 01-NOV-14
June               6 01-JUN-14 01-NOV-14
July               7 01-JUL-14 01-NOV-14
August             8 01-AUG-14 01-NOV-14
September          9 01-SEP-14 01-NOV-14
October           10 01-OCT-14 01-NOV-14
November          11 01-NOV-14 01-NOV-14
December          12 01-DEC-14 01-NOV-14

12 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    12 |   228 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |    12 |   228 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CAL">0)

Note
-----
   - dynamic sampling used for this statement (level=2)

================================================
But when we add comparison operations , it fails
================================================

PERIOD           CAL COL1      COL2
--------- ---------- --------- ---------
November          11 01-NOV-14 01-NOV-14
December          12 01-DEC-14 01-NOV-14

2 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    19 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    19 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CAL">0 AND TO_DATE("PERIOD"||', '||'2014','Month,
              YYYY')>=TO_DATE(' 2014-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


So this is the question. What’s the anomaly in this output ?

Bonus question: What’s the explanation for the anomaly ?

Answers:

If I had asked why the query might, or might not, crash – the answer would be about the order of predicate evaluation, and simply collecting stats (or not) might have made a difference. Ever since “system stats” and “CPU costing” appeared the optimizer has been able to change the order in which it applies filter predicates to a table (there’s a pdf of an article of mine from Oracle magazine in the 9i / 10g timeline linked at this URL). In this case, applying the “cal > 0” predicate first luckily eliminates the rows that would fail the second predicate. Since the effect is driven by the optimizer’s stats, this type of failure could occur ANY TIME you have a predicate that requires coercion between types to take place – which is one reason why you see the injunctions to use the correct data types; and why, if you need coercion to work around incorrect data types, you should consider writing your own functions to trap and resolve the errors raised by Oracle’s implicit conversion mechanisms.
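
As a sketch of that last point (the function and its name are purely hypothetical, not something from the original thread) – a wrapper that does the conversion explicitly and returns NULL when Oracle raises a conversion error, so the comparison simply rejects the bad rows rather than crashing the query:

create or replace function safe_month_to_date(p_txt in varchar2)
return date
is
begin
        -- do explicitly what the implicit coercion would have done
        return to_date(p_txt, 'Month, YYYY');
exception
        when others then
                -- e.g. ORA-01843 for '13 Series, 2014' - treat it as "no date"
                return null;
end;
/

With that in place a predicate like safe_month_to_date(period || ', ' || 2014) >= col2 evaluates to NULL for the ‘13 Series’ row and the row is quietly filtered out, whichever order the optimizer chooses for the predicates.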

For a quick sketch of the optimizer strategy, the arithmetic is roughly: predicate A costs c1 per row and predicate B costs c2 per row; if I apply predicate A to every row I have to apply predicate B only to the N surviving rows; if I apply predicate B to every row I have to apply predicate A only to the M surviving rows; so which is smaller – (input_rows * c1 + N * c2) or (input_rows * c2 + M * c1)?
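
To put some invented numbers on that: say predicate A costs 1 unit per row, predicate B costs 20 units per row, there are 10,000 input rows, 100 rows survive A, and 9,000 rows survive B. Applying A first costs 10,000 * 1 + 100 * 20 = 12,000 units; applying B first costs 10,000 * 20 + 9,000 * 1 = 209,000 units – so the optimizer would apply A first.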

The answer to the question I actually asked is this, though: I stressed the fact that this was the COMPLETE output because, as Narenda highlighted in comment 7 below, the first query shows a note about dynamic sampling and the second query does not. This is a little surprising: we don’t have stats on the table and the two queries are different, so the optimizer had to optimize both of them and, presumably, should have considered dynamic sampling for both. In 12c, of course, it’s possible that the optimizer may have done something clever with statistics feedback (formerly cardinality feedback) and created an SQL plan directive – but even then we should have seen a note about that.

For the bonus question: given that the second output doesn’t report dynamic sampling we should be curious why not – did the optimizer simply decide not to try, did it try and then decide not to use the results for some reason, or is there some other explanation? The obvious next step is to look at the 10053 (optimizer) trace – where you find that the optimizer DID do dynamic sampling; or rather, it tried to do dynamic sampling, but the query generated to take the sample failed with Oracle error ORA-01843, as suggested by Chinar Aliyev in comment 9 and expanded by Mohamed Houri in comment 11.
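
If you want to check this for yourself, the traditional way of getting the optimizer trace for a single test is something like the following (the tracefile_identifier value is just a convenience to make the trace file easy to find):

alter session set tracefile_identifier = 'sampling_test';
alter session set events '10053 trace name context forever, level 1';

-- run the problem query here

alter session set events '10053 trace name context off';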

The irony of the sampling problem (hinted at by Chinar Aliyev in comment 10) is that you could have a large table from which Oracle picks a small sample that happens to miss all the problem rows, and so returns results that persuade the optimizer to pick an execution plan that is bound to hit a problem row; alternatively the SQL used to generate the sample might apply the predicates in an order that manages to eliminate the problem rows, while the final plan derived after sampling applies the predicates in the order B, A.

 

September 4, 2014

Group By Bug

Filed under: 12c,Bugs,dbms_xplan,Execution plans,Oracle — Jonathan Lewis @ 5:11 pm BST Sep 4,2014

This just in from OTN Database Forum – a surprising little bug with “group by elimination” exclusive to 12.1.0.2.

rem
rem     Script:         12c_group_by_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sept 2014
rem

alter session set nls_date_format='dd-Mon-yyyy hh24:mi:ss';

select
       /* optimizer_features_enable('12.1.0.1')*/
       trunc (ts,'DD') ts1, sum(fieldb) fieldb
from (
  select
        ts, max(fieldb) fieldb
  from (
  select trunc(sysdate) - 1/24 ts, 1 fieldb from dual
  union all
  select trunc(sysdate) - 2/24 ts, 2 fieldb from dual
  union all
  select trunc(sysdate) - 3/24 ts, 3 fieldb from dual
  union all
  select trunc(sysdate) - 4/24 ts, 4 fieldb from dual
  union all
  select trunc(sysdate) - 5/24 ts, 5 fieldb from dual
  )
  group by ts
)
group by trunc (ts,'DD')
/

You might expect to get one row as the answer – but this is the result I got, with the execution plan pulled from memory:


TS1                      FIELDB
-------------------- ----------
03-Sep-2014 00:00:00          1
03-Sep-2014 00:00:00          5
03-Sep-2014 00:00:00          4
03-Sep-2014 00:00:00          2
03-Sep-2014 00:00:00          3

-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    11 (100)|          |
|   1 |  HASH GROUP BY   |      |     5 |    60 |    11  (10)| 00:00:01 |
|   2 |   VIEW           |      |     5 |    60 |    10   (0)| 00:00:01 |
|   3 |    UNION-ALL     |      |       |       |            |          |
|   4 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   5 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   6 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |     FAST DUAL    |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

You’ll notice that I’ve got an “optimizer_features_enable()” comment in the code: if I change it into a hint I get the following (correct) result and plan:


TS1                      FIELDB
-------------------- ----------
03-Sep-2014 00:00:00         15

-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    12 (100)|          |
|   1 |  HASH GROUP BY   |      |     5 |    60 |    12  (17)| 00:00:01 |
|   2 |   VIEW           |      |     5 |    60 |    11  (10)| 00:00:01 |
|   3 |    HASH GROUP BY |      |     5 |    60 |    11  (10)| 00:00:01 |
|   4 |     VIEW         |      |     5 |    60 |    10   (0)| 00:00:01 |
|   5 |      UNION-ALL   |      |       |       |            |          |
|   6 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   9 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  10 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

Somehow 12.1.0.2 has managed to get confused by the combination of “group by ts” and “group by trunc(ts,’DD’)” and has performed “group-by elimination” when it shouldn’t have. If you use the ‘outline’ option for dbms_xplan.display_cursor() you’ll find that the bad result reports the hint elim_groupby(@sel$1), which suggests an alternative to hinting the optimizer_features_enable level. Start the code like this:


select
       /*+ qb_name(main) no_elim_groupby(@main) */
       trunc (ts,'DD') ts1, sum(fieldb) fieldb
from  ...

The (no_)elim_groupby hint appeared in v$sql_hint only in 12.1.0.2.
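
A quick way to check what any given version knows about the hint is a simple query against v$sql_hint, e.g.:

select  name, inverse, version
from    v$sql_hint
where   name like '%ELIM_GROUPBY%'
;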

October 16, 2013

Hash Clusters – 3

Filed under: 12c,Bugs,dbms_xplan,Oracle,Upgrades — Jonathan Lewis @ 1:03 pm BST Oct 16,2013

This note is a quick summary of a costing oddity that came to light after a twitter conversation with Christian Antognini yesterday. First a little test script to get things going:

(more…)
