Subquery Factoring (8)

May 24, 2012

Subquery Factoring (8)

Filed under: Execution plans,Hints,Oracle,Subquery Factoring,Tuning — Jonathan Lewis @ 6:37 pm BST May 24,2012

I have a growing collection of postings where I’ve described anomalies or limitations in subquery factoring (the “with subquery”, or Common Table Expression (CTE) to give it the ANSI name). Here’s another example of Oracle’s code not behaving consistently. You may recognise the basic query from yesterday’s example of logical tuning – so I won’t reprint the code to generate the data. The examples in this note were created on 11.2.0.2 – we start with a simple query and its execution plan:

rem
rem     Script:         subq_factor_3.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2012
rem 

select
        *
from
        t2
where
        not exists (
                select  null
                from    t3
                where   n3a = n2a
                and     n3b = n2b
                and     id3 = id2
        )
and     not exists (
                select  null
                from    t3
                where   n3a = n2a
                and     n3b = n2b
                and     id3 = id2 + 1000
        )
;

-------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |    50 |  7200 |    16 |
|   1 |  NESTED LOOPS ANTI  |       |    50 |  7200 |    16 |
|   2 |   NESTED LOOPS ANTI |       |  4999 |   644K|    16 |
|   3 |    TABLE ACCESS FULL| T2    |  5000 |   585K|    16 |
|*  4 |    INDEX UNIQUE SCAN| T3_PK |     1 |    12 |       |
|*  5 |   INDEX UNIQUE SCAN | T3_PK |  5000 | 60000 |       |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2"+1000)
   5 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2")

Now what happens if I take this piece of SQL (which is really the core of a much more complex query), put it into a CTE, and reference it later?

In one of the queries in the client’s application I wanted to materialize a piece of code like this because the result set was used multiple times in the body of the query. You’ll notice that I got two anti-joins when my test case ran as a “naked” SQL statement, but look what happens when I try to use the CTE mechanism in the simplest possible way:

with    cte as (
select
        /*+ materialize */
        *
from
        t2
where
        not exists (
                select
                        null
                from    t3
                where   n3a = n2a
                and     n3b = n2b
                and     id3 = id2
        )
and     not exists (
                select
                        null
                from    t3
                where   n3a = n2a
                and     n3b = n2b
                and     id3 = id2 + 1000
        )
)
select
        *
from
        cte
;

------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Rows  | Bytes | Cost  |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |  5000 |   468K|  5029 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |       |       |       |
|   2 |   LOAD AS SELECT           | SYS_TEMP_0FD9D6641_40101879 |       |       |       |
|*  3 |    FILTER                  |                             |       |       |       |
|   4 |     TABLE ACCESS FULL      | T2                          |  5000 |   585K|    15 |
|*  5 |     INDEX UNIQUE SCAN      | T3_PK                       |     1 |    12 |     1 |
|*  6 |     INDEX UNIQUE SCAN      | T3_PK                       |     1 |    12 |     1 |
|   7 |   VIEW                     |                             |  5000 |   468K|    14 |
|   8 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6641_40101879 |  5000 |   585K|    14 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter( NOT EXISTS (SELECT 0 FROM "T3" "T3" WHERE "ID3"=:B1 AND "N3B"=:B2
              AND "N3A"=:B3) AND  NOT EXISTS (SELECT 0 FROM "T3" "T3" WHERE "ID3"=:B4+1000 AND
              "N3B"=:B5 AND "N3A"=:B6))
   5 - access("N3A"=:B1 AND "N3B"=:B2 AND "ID3"=:B3)
   6 - access("N3A"=:B1 AND "N3B"=:B2 AND "ID3"=:B3+1000)

When I try to materialize the view Oracle reverts to a pair of filter subqueries.

If I take away the /*+ materialize */ hint the CTE goes inline and the query optimizes with anti-joins – just as it did in the original form – so the behaviour isn’t as inconsistent as some of the earlier cases I’ve documented.

I can get the behaviour I want by adding /*+ unnest */ hints to the two subqueries so the problem isn’t a show-stopper, but it’s just a little irritating to have to do this.

Here’s what I found in the 10053 trace file with the attempt to materialize (without adding the extra unnest hints):

SU:   Checking validity of unnesting subquery SEL$2 (#0)
SU:     SU bypassed: invalidated.
SU:   Validity checks failed.
SU:   Checking validity of unnesting subquery SEL$3 (#0)
SU:     SU bypassed: invalidated.
SU:   Validity checks failed.

That’s just a little strange, considering that the equivalent section of the trace for the version where the CTE goes inline reads as follows:

SU:   Checking validity of unnesting subquery SEL$2 (#3)
SU:   Passed validity checks.
SU:   Unnesting subquery query block SEL$2 (#3)SU: Transform ALL/NOTEXISTS subquery into a regular antijoin.
SU:   Checking validity of unnesting subquery SEL$3 (#2)
SU:   Passed validity checks.
SU:   Unnesting subquery query block SEL$3 (#2)SU: Transform ALL/NOTEXISTS subquery into a regular antijoin.

Thank goodness we can still hint – and then generate SQL Baselines.

Update 7th Aug 2013

In 12c the execution plan for both versions of the code produce the same plan (using the anti-join method) without the need to add /*+ unnest */ hints.

Comments (10)

10 Comments »

Reblogged this on lava kafle kathmandu nepal.

Comment by lkafle — May 25, 2012 @ 3:57 am BST May 25,2012 | Reply
[…] Jonathan Lewis blogs about subquery factoring. […]

Pingback by Log Buffer #273, A Carnival of the Vanities for DBAs | The Pythian Blog — May 25, 2012 @ 9:31 am BST May 25,2012 | Reply

Jonathan sir,

Is it possible to ‘push’ the effects of materialize hint in a query plan using SQL profile?

I am trying to attach materialize hint to a query using SQL profile, but the plan generated through the SQL profile doesn’t match with the one generated through direct hint. Version is 10.2.0.5 EE on RHEL.


create or replace function f_test
return number
as 
begin
  for r in (select min(snap_id) snap_id from dba_hist_snapshot)
	loop
		  return r.snap_id;
	end loop;
end;
/

explain plan for
with snp as (select f_test() snap_id from dual)
select count(*) 
from dba_hist_snapshot a, snp b
where a.snap_id= b.snap_id;

Plan hash value: 1583531653

select * from table(dbms_xplan.display);

-------------------------------------------------------------------------------------
| Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |               |     1 |     7 |   113   (8)| 00:00:01 |
|   1 |  SORT AGGREGATE     |               |     1 |     7 |            |          |
|   2 |   NESTED LOOPS      |               |     4 |    28 |   113   (8)| 00:00:01 |
|   3 |    FAST DUAL        |               |     1 |       |     2   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| WRM$_SNAPSHOT |     4 |    28 |   111   (9)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("SNAP_ID"="F_TEST"() AND "STATUS"=0)

This is not efficient, because the function will execute once per very row in the wrm$_snapshot table. An easy way to prevent this is to add a materialize hint to the CTE.


explain plan for
with snp as (select /*+ materialize */ f_test() snap_id from dual)
select count(*) 
from dba_hist_snapshot a, snp b
where a.snap_id= b.snap_id;

select * from table(dbms_xplan.display);

----------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Rows  | Bytes | Cost (%CPU)| Time  |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |     1 |    20 |   108   (2)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |       |       |            |       |
|   2 |   LOAD AS SELECT           |                             |       |       |            |       |
|   3 |    FAST DUAL               |                             |     1 |       |     2   (0)| 00:00:01 |
|   4 |   SORT AGGREGATE           |                             |     1 |    20 |            |       |
|*  5 |    HASH JOIN               |                             |     4 |    80 |   106   (2)| 00:00:01 |
|   6 |     VIEW                   |                             |     1 |    13 |     2   (0)| 00:00:01 |
|   7 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D7979_73A6CAFD |     1 |    13 |     2   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL      | WRM$_SNAPSHOT               | 16314 |   111K|   103   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("SNAP_ID"="B"."SNAP_ID")
   8 - filter("STATUS"=0)

With materialize hint, the function will be executed only once. The problem is – I don’t have access to the source code. So the only option for me is to use SQL profile.

I took the outline data resulting from the query with materialize hint, created a SQL profile. However, the plan generated by SQL profile looks different from the one generated with materialize hint. The code to generate SQL profile is too big, so giving below a ‘shortcut’ by using outline data as a hint.


select * from table(dbms_xplan.display(null, null, 'advanced'));

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$D67CB2D2" "T1"@"SEL$D67CB2D2")
      USE_HASH(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
      LEADING(@"SEL$335DD26A" "B"@"SEL$2" "WRM$_SNAPSHOT"@"SEL$3")
      FULL(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
      NO_ACCESS(@"SEL$335DD26A" "B"@"SEL$2")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      MERGE(@"SEL$3")
      OUTLINE_LEAF(@"SEL$335DD26A")
      OUTLINE_LEAF(@"SEL$D67CB2D2")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      OPT_PARAM('_gby_hash_aggregation_enabled' 'false')
      OPT_PARAM('optimizer_dynamic_sampling' 4)
      OPTIMIZER_FEATURES_ENABLE('10.2.0.5')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

explain plan for
with snp as (select f_test() snap_id from dual)
select
  /*+ FULL(@"SEL$D67CB2D2" "T1"@"SEL$D67CB2D2")
      USE_HASH(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
      LEADING(@"SEL$335DD26A" "B"@"SEL$2" "WRM$_SNAPSHOT"@"SEL$3")
      FULL(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
      NO_ACCESS(@"SEL$335DD26A" "B"@"SEL$2")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      MERGE(@"SEL$3")
      OUTLINE_LEAF(@"SEL$335DD26A")
      OUTLINE_LEAF(@"SEL$D67CB2D2")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      OPT_PARAM('_gby_hash_aggregation_enabled' 'false')
      OPT_PARAM('optimizer_dynamic_sampling' 4)
      OPTIMIZER_FEATURES_ENABLE('10.2.0.5')
      IGNORE_OPTIM_EMBEDDED_HINTS  */
count(*) 
from dba_hist_snapshot a, snp b
where a.snap_id= b.snap_id;	

select * from table(dbms_xplan.display);

------------------------------------------------------------------------------------------
| Id  | Operation            | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                   |     1 |    26 |   106   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE      |                   |     1 |    26 |            |          |
|*  2 |   HASH JOIN          |                   |     4 |   104 |   106   (2)| 00:00:01 |
|   3 |    VIEW              |                   |     1 |    13 |     2   (0)| 00:00:01 |
|   4 |     FAST DUAL        |                   |     1 |       |     2   (0)| 00:00:01 |
|   5 |    VIEW              | DBA_HIST_SNAPSHOT | 16314 |   207K|   103   (1)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL| WRM$_SNAPSHOT     | 16314 |   111K|   103   (1)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."SNAP_ID"="B"."SNAP_ID")
   6 - filter("STATUS"=0)

The plan generated by SQL profile is not much different from above:

-------------------------------
SQL_ID  8vncgw5ygpx1b, child number 1
-------------------------------------
with snp as (select f_test() snap_id from dual) select count(*) from
dba_hist_snapshot a, snp b where a.snap_id= b.snap_id

Plan hash value: 1587178679

-------------------------------------------------------------------------------------
| Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |               |       |       |   106 (100)|          |
|   1 |  SORT AGGREGATE     |               |     1 |    20 |            |          |
|*  2 |   HASH JOIN         |               |     4 |    80 |   106   (2)| 00:00:01 |
|   3 |    VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
|   4 |     FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
|*  5 |    TABLE ACCESS FULL| WRM$_SNAPSHOT | 16314 |   111K|   103   (1)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SNAP_ID"="B"."SNAP_ID")
   5 - filter("STATUS"=0)

Note
-----
   - SQL profile "Test_Prof" used for this statement

In the latter two plan, it is using a HASH JOIN instead of NL, so I believe it will execute the function only once. But why does it not create a temp table, as it does with materialize hint? Is there any way to force this?

Many thanks.

Comment by CJ — May 28, 2012 @ 3:31 pm BST May 28,2012 | Reply

I see that the ‘sourcecode’ tags haven’t been applied. A preview functionality here would be really helpful.

Comment by CJ — May 28, 2012 @ 3:33 pm BST May 28,2012 | Reply
- And maybe a facility to delete comments.. because my earlier comment is now redundant.
  
  Comment by CJ — May 28, 2012 @ 4:39 pm BST May 28,2012 | Reply
- CJ,
  
  I’d like a preview option as well – but it’s not available.
  
  The sourcecode tags (which you correctly put in square brackets) have an open and close form. To close you need a leading /, as in “/sourcecode” (but in square brackets, not quote marks).
  
  If you look in view v$sql_hints you will find that the inline and materialize hints do not have an “outline version” which (I believe) means you can’t inject them through a stored outline, profile, or SQL Baseline. In your case it would be sufficient to inject a no_merge() hint into the subquery factoring clause (CTE). Since the query block in the clause is sel$1 you need a minimum of /*+ no_merge(@sel$1) */ in your outline/profile to get the plan:
```
-------------------------------------------------------------------------------------
| Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |               |     1 |    19 |     5  (20)| 00:00:01 |
|   1 |  SORT AGGREGATE     |               |     1 |    19 |            |          |
|*  2 |   HASH JOIN         |               |     1 |    19 |     5  (20)| 00:00:01 |
|   3 |    VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
|   4 |     FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
|*  5 |    TABLE ACCESS FULL| WRM$_SNAPSHOT |    54 |   324 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
```
  Technically an alternative hint would be /*+ outline_leaf(@sel$1) */, but the outline_leaf() hint is not documented, so should not be used.
  
  Comment by Jonathan Lewis — May 28, 2012 @ 4:57 pm BST May 28,2012 | Reply

Looks like I have found my answer. The NO_MERGE hint needs to be added not to the CTE, but to every query wherein the CTE referred.


explain plan for
with snp as (select f_test() snap_id from dual)
select /*+ no_merge(b) */ count(*)
from dba_hist_snapshot a, snp b
where (a.snap_id= b.snap_id) 
union all
select /*+ no_merge(b) */ count(*)
from dba_hist_snapshot a, snp b
where (a.snap_id &lt; b.snap_id) ;

--------------------------------------------------------------------------------------
| Id  | Operation            | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |               |     2 |    40 |   211  (51)| 00:00:01 |
|   1 |  UNION-ALL           |               |       |       |            |          |
|   2 |   SORT AGGREGATE     |               |     1 |    20 |            |          |
|*  3 |    HASH JOIN         |               |     4 |    80 |   106   (2)| 00:00:01 |
|   4 |     VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
|   5 |      FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL| WRM$_SNAPSHOT | 16314 |   111K|   103   (1)| 00:00:01 |
|   7 |   SORT AGGREGATE     |               |     1 |    20 |            |          |
|   8 |    NESTED LOOPS      |               |   816 | 16320 |   105   (1)| 00:00:01 |
|   9 |     VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
|  10 |      FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
|* 11 |     TABLE ACCESS FULL| WRM$_SNAPSHOT |   816 |  5712 |   103   (1)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(&quot;SNAP_ID&quot;=&quot;B&quot;.&quot;SNAP_ID&quot;)
   6 - filter(&quot;STATUS&quot;=0)
  11 - filter(&quot;STATUS&quot;=0 AND &quot;SNAP_ID&quot;&lt;&quot;B&quot;.&quot;SNAP_ID&quot;)

This is contrary to my earlier perception that NO_MERGE hint had to be added to the CTE and not anywhere else. Which one is correct?

Many thanks.

Comment by CJ — May 30, 2012 @ 3:37 pm BST May 30,2012 | Reply

Jonathan sir,

A rather curious behaviour with NO_MERGE and UNION. NO_MERGE works with the query appearing last in UNION ALL, but not the query appearing before.



explain plan for
with snp as (select /*+ no_merge */ f_test() snap_id from dual)
select count(*)
from dba_hist_snapshot a, snp b
where (a.snap_id= b.snap_id) 
union all
select count(*)
from dba_hist_snapshot a, snp b
where (a.snap_id < b.snap_id) 
;


--------------------------------------------------------------------------------------
| Id  | Operation            | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |               |     2 |    27 |   218  (53)| 00:00:01 |
|   1 |  UNION-ALL           |               |       |       |            |          |
|   2 |   SORT AGGREGATE     |               |     1 |     7 |            |          |
|   3 |    NESTED LOOPS      |               |     4 |    28 |   113   (8)| 00:00:01 |
|   4 |     FAST DUAL        |               |     1 |       |     2   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| WRM$_SNAPSHOT |     4 |    28 |   111   (9)| 00:00:01 |
|   6 |   SORT AGGREGATE     |               |     1 |    20 |            |          |
|   7 |    NESTED LOOPS      |               |   816 | 16320 |   105   (1)| 00:00:01 |
|   8 |     VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
|   9 |      FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
|* 10 |     TABLE ACCESS FULL| WRM$_SNAPSHOT |   816 |  5712 |   103   (1)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("SNAP_ID"="F_TEST"() AND "STATUS"=0)
  10 - filter("STATUS"=0 AND "SNAP_ID"<"B"."SNAP_ID")

From the filter conditions you can see that the function will execute once for every row in the first part of UNION, whereas it will execute only once for the second part. I traced using DBMS_PROFILER to confirm this. What can be the reason? (BTW, this is why I was trying to push in the MATERIALIZE hint).

Many thanks.

Comment by CJ — May 30, 2012 @ 3:08 pm BST May 30,2012 | Reply

[…] Disappearance of semi-join transformation – fixed […]

Pingback by 12c subquery factoring | Oracle Scratchpad — August 7, 2013 @ 6:15 pm BST Aug 7,2013 | Reply
[…] Plan variations for materialized CTE involving “not exists” subqueries […]

Pingback by CTE Catalogue | Oracle Scratchpad — June 10, 2020 @ 6:46 pm BST Jun 10,2020 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome. Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Oracle Scratchpad

May 24, 2012