Oracle Scratchpad

February 24, 2020

Fake Baselines – 2

Filed under: Bugs,dbms_xplan,Execution plans,Hints,Ignoring Hints,Oracle — Jonathan Lewis @ 3:38 pm GMT Feb 24,2020

Many years ago (2011) I wrote a note describing how you could attach the Outline Information from one query to the SQL_ID of another query using the official Oracle mechanism of calling dbms_spm.load_plans_from_cursor_cache(). Shortly after publishing that note I drafted a follow-up note with an example demonstrating that even when the alternative outline was technically relevant the optimizer might still fail to use the SQL Plan Baseline. Unfortunately I didn’t quite finish the draft – until today.

The example I started with nearly 10 years ago behaved correctly against 11.1.0.7, but failed to reproduce the plan when I tested it against 11.2.0.3, and it still fails against 19.3.0.0. Here’s the test data and the query we’re going to attempt to manipulate:


rem
rem     Script:         fake_sql_baseline_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2010
rem

create table emp1 (
        dept_no         number /* not null */,
        sal             number,
        emp_no          number,
        padding         varchar2(200),
        constraint e1_pk primary key(emp_no)
)
;

create table emp2 (
        dept_no         number /* not null */,
        sal             number,
        emp_no          number,
        padding         varchar2(200),
        constraint e2_pk primary key(emp_no)
)
;

insert into emp1
select
        mod(rownum,6),
        rownum,
        rownum,
        rpad('x',200)
from
        all_objects
where
        rownum <= 20000 -- > comment to avoid wordpress format issue
;

insert into emp2
select
        mod(rownum,6),
        rownum,
        rownum,
        rpad('x',200)
from
        all_objects
where
        rownum <= 20000 -- > comment to avoid wordpress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname                 => user,
                tabname                 => 'EMP1',
                cascade                 => true,
                method_opt              =>'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname                 => user,
                tabname                 => 'EMP2',
                cascade                 => true,
                method_opt              =>'for all columns size 1'
        );
end;
/

select
        /*+ target_query */
        count(*)
from
        emp1
where
        emp1.dept_no not in (
                select  dept_no
                from    emp2
        )
;

select * from table(dbms_xplan.display_cursor(null, null, 'outline'));

I haven’t included the code I run on my testbed to delete all existing SQL Plan Baselines before running this test, I’ll post that at the end of the article.

The query is very simple and will, of course, return no rows since emp1 and emp2 are identical and we’re looking for departments in emp1 that don’t appear in emp2. The “obvious” plan for the optimizer is to unnest the subquery into a distinct (i.e. aggregate) inline view then apply an anti-join. It’s possible that the optimizer will also decide to do complex view merging and postpone the aggregation. Here’s the execution plan from 19.3:


----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |   168 (100)|          |
|   1 |  SORT AGGREGATE     |      |     1 |     6 |            |          |
|*  2 |   HASH JOIN ANTI NA |      |  3333 | 19998 |   168   (5)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EMP1"."DEPT_NO"="DEPT_NO")

Hint Report (identified by operation id / Query Block Name / Object Alias):

Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  target_query

As expected the subquery unnested, we have the anti-join (in this case, since dept_no can be null, it’s a “Null-Aware” antijoin); and the optimizer has, indeed, decided to do the join before the aggregation.

Assume, now, that for reasons known only to me a merge (anti-)join would be more effective than a hash join. To get the optimizer to do this I’m going to capture the query and connect it to a plan that uses a merge join. There are several minor variations on how we could do this, but I’m going to follow the steps I took in 2011 – but cut out a couple of the steps where I loaded redundant baselines into the SMB (SQLPlan Management Base). As a starting point I’ll just record the sql_id and plan_hash_value for the query (and the child_number just in case I want to use dbms_xplan.display_cursor() to report the in-memory execution plan):

column  sql_id                  new_value       m_sql_id_1
column  plan_hash_value         new_value       m_plan_hash_value_1
column  child_number            new_value       m_child_number_1

select
        sql_id, plan_hash_value, child_number
from
        v$sql
where
        sql_text like '%target_query%'
and     sql_text not like '%v$sql%'
and     rownum = 1
;

Now I’ll hack the query to produce a plan that does the merge join. An easy first step is to look at the current outline and take advantage of the hints there. You’ll notice I included the ‘outline’ format in my call to dbms_xplan.display_cursor() above, even though I didn’t show you that part of the output – here it is now:

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "EMP1"@"SEL$1")
      FULL(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "EMP1"@"SEL$1" "EMP2"@"SEL$2")
      USE_HASH(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      END_OUTLINE_DATA
  */

So I’m going to take the useful-looking hints, get rid of the use_hash() hint and, for good measure, turn it into a no_use_hash() hint. Here’s the resulting query, with its execution plan:

select
        /*+
                unnest(@sel$2)
                leading(@sel$5da710d3 emp1@sel$1 emp2@sel$2)
                no_use_hash(@sel$5da710d3 emp2@sel$2)
                full(@sel$5da710d3 emp2@sel$2)
                full(@sel$5da710d3 emp1@sel$1)
                alternate_query
        */
        count(*)
from
        emp1
where
        emp1.dept_no not in (
                select  dept_no
                from    emp2
        )
;

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |   178 (100)|          |
|   1 |  SORT AGGREGATE      |      |     1 |     6 |            |          |
|   2 |   MERGE JOIN ANTI NA |      |  3333 | 19998 |   178  (11)| 00:00:01 |
|   3 |    SORT JOIN         |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|*  5 |    SORT UNIQUE       |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("EMP1"."DEPT_NO"="DEPT_NO")
       filter("EMP1"."DEPT_NO"="DEPT_NO")

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "EMP1"@"SEL$1")
      FULL(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "EMP1"@"SEL$1" "EMP2"@"SEL$2")
      USE_MERGE(@"SEL$5DA710D3" "EMP2"@"SEL$2")
      END_OUTLINE_DATA
  */

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  alternate_query

Note that I’ve included the text “alternative_query” at the end of the hint list as something to use when I’m searaching v$sql. Note also, that the “no_use_hash()” hint has disappeared and been replaced by “use_merge()” hint.

The plan tells us that the optimizer is happy to use a “merge join anti NA”, so we can load this plan’s outline into the SMB by combining the sql_id and plan_hash_value for this query with (for older versions of Oracle, though you can now use the sql_id in recent versions) the text of the previous query so that we can store the old text with the new plan.


column  sql_id                  new_value       m_sql_id_2
column  plan_hash_value         new_value       m_plan_hash_value_2
column  child_number            new_value       m_child_number_2

select
        sql_id, plan_hash_value, child_number
from
        v$sql
where
        sql_text like '%alternate_query%'
and     sql_text not like '%v$sql%'
and     rownum = 1
;

declare
        m_clob  clob;
begin
        select
                sql_fulltext
        into
                m_clob
        from
                v$sql
        where
                sql_id = '&m_sql_id_1'
        and     child_number = &m_child_number_1
        ;

        dbms_output.put_line(m_clob);

        dbms_output.put_line(
                'Number of plans loaded: ' ||
                dbms_spm.load_plans_from_cursor_cache(
                        sql_id                  => '&m_sql_id_2',
                        plan_hash_value         => &m_plan_hash_value_2,
                        sql_text                => m_clob,
                        fixed                   => 'YES',
                        enabled                 => 'YES'
                )
        );

end;
/

At this point we have one SQL Plan Baseline in the SMB, and it says the old query should execute usng the new plan. So let’s give it a go:

set serveroutput off
alter system flush shared_pool;

alter session set events '10053 trace name context forever';

select
        /*+ target_query */
        count(*)
from
        emp1
where
        emp1.dept_no not in (
                select  dept_no
                from    emp2
        )
/

alter session set events '10053 trace name context off';

select * from table(dbms_xplan.display_cursor(null, null, 'alias outline'));

I’ve enabled the 10053 (optimizer) trace so that I can report a critical few lines from it later on. Here’s the execution plan, omitting the outline but including the alias information.

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |   168 (100)|          |
|   1 |  SORT AGGREGATE     |      |     1 |     6 |            |          |
|*  2 |   HASH JOIN ANTI NA |      |  3333 | 19998 |   168   (5)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$5DA710D3
   3 - SEL$5DA710D3 / EMP1@SEL$1
   4 - SEL$5DA710D3 / EMP2@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EMP1"."DEPT_NO"="DEPT_NO")

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  target_query

Note
-----
   - Failed to use SQL plan baseline for this statement

We haven’t used the SQL Plan Baseline – and in 19.3 we even have a note that the optimizer knew there was at least one baseline available that it failed to use! So what went wrong?

I have two diagnostics – first is the content of the baseline itself (warning – the SQL below will report ALL currently saved SQL Plan Baselines); I’ve just made sure that I have only one to report:

set linesize 90

select
        pln.*
from
        (select sql_handle, plan_name
         from   dba_sql_plan_baselines spb
         order by
                sql_handle, plan_name
        ) spb,
        table(dbms_xplan.display_sql_plan_baseline(spb.sql_handle, spb.plan_name)) pln
;


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
SQL handle: SQL_ce3099e9e3bdaf2f
SQL text: select         /*+ target_query */         count(*) from         emp1
          where         emp1.dept_no not in (                 select  dept_no
                        from    emp2         )
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_cwc4tx7jvvbtg02bb0c12         Plan id: 45812754
Enabled: YES     Fixed: YES     Accepted: YES     Origin: MANUAL-LOAD-FROM-CURSOR-CACHE
Plan rows: From dictionary
--------------------------------------------------------------------------------

Plan hash value: 1517539632

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |       |       |   178 (100)|          |
|   1 |  SORT AGGREGATE      |      |     1 |     6 |            |          |
|   2 |   MERGE JOIN ANTI NA |      |  3333 | 19998 |   178  (11)| 00:00:01 |
|   3 |    SORT JOIN         |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMP1 | 20000 | 60000 |    83   (4)| 00:00:01 |
|*  5 |    SORT UNIQUE       |      | 20000 | 60000 |    89  (11)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| EMP2 | 20000 | 60000 |    83   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("EMP1"."DEPT_NO"="DEPT_NO")
       filter("EMP1"."DEPT_NO"="DEPT_NO")

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  alternate_query

We have an SQL Plan baseline that is accepted, enabled, and fixed; and it’s supposed to produce a “merge join anti NA”, and it clearly “belongs” to our query. So it should have been used.

Then we have the 10053 trace file, in which we find the following:


SPM: planId in plan baseline = 45812754, planId of reproduced plan = 1410137244
------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : TEST_USER
  plan_baseline signature  : 14857544400522555183
  plan_baseline plan_id    : 45812754
  plan_baseline hintset    :
    hint num  1 len 27 text: IGNORE_OPTIM_EMBEDDED_HINTS
    hint num  2 len 35 text: OPTIMIZER_FEATURES_ENABLE('19.1.0')
    hint num  3 len 20 text: DB_VERSION('19.1.0')
    hint num  4 len  8 text: ALL_ROWS
    hint num  5 len 29 text: OUTLINE_LEAF(@"SEL$5DA710D3")
    hint num  6 len 16 text: UNNEST(@"SEL$2")
    hint num  7 len 17 text: OUTLINE(@"SEL$1")
    hint num  8 len 17 text: OUTLINE(@"SEL$2")
    hint num  9 len 36 text: FULL(@"SEL$5DA710D3" "EMP1"@"SEL$1")
    hint num 10 len 36 text: FULL(@"SEL$5DA710D3" "EMP2"@"SEL$2")
    hint num 11 len 54 text: LEADING(@"SEL$5DA710D3" "EMP1"@"SEL$1" "EMP2"@"SEL$2")
    hint num 12 len 41 text: USE_MERGE(@"SEL$5DA710D3" "EMP2"@"SEL$2")

During optimization the optimizer has found that SQL Plan Baseline. We can see that the hints in the baseline are exactly the hints from the plan that we wanted – but the optimizer says it can’t reproduce the plan we wanted. In fact if you try adding exactly these hints to the query itself you’ll still find that the merge join won’t appear and Oracle will use a hash join.

Conclusion

This is just a simple example of how the optimizer may be able to produce a plan if hinted in one way, but the outline consists of a different set of hints that won’t reproduce the plan they describe. My no_use_hash() has turned into a use_merge() but that hint fails to reproduce the merge join in circumstances that makes me think there’s a bug in the optimizer.

If you happen to be unlucky you may find that the plan you really need to see can’t be forced through a SQL Plan Baseline. In this example it may be necessary to use the SQL Patch mechanism to include the no_use_hash() hint in a set of hints that I associate with the query.

 

1 Comment »

  1. Thanks for the article Jonathan. Very timely.
    Earlier this week I was trying SQL Baselines for the first time and thought I was doing something wrong since the FIXED baseline was not being used. A 10053 trace verified that.
    My understanding was the purpose of a baseline was to coerce the optimizer to choose among approved plans. But the parse phase still seems like it has the option of acting like the baseline is not there.
    Very confusing and frustrating.

    Comment by Andrew Markiewicz — February 28, 2020 @ 4:10 pm GMT Feb 28,2020 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.