Oracle Scratchpad

May 24, 2012

Subquery Factoring

Filed under: Execution plans,Hints,Oracle,Subquery Factoring,Tuning — Jonathan Lewis @ 6:37 pm BST May 24,2012

I have a small collection of postings where I’ve described anomalies or limitations in subquery factoring (the “with subquery”, or Common Table Expression (CTE) to give it the official ANSI name). Here’s another example of Oracle’s code not behaving consistently. You may recognise the basic query from yesterday’s example of logical tuning – so I won’t reprint the code to generate the data sets. This examples in this note were created on 11.2.0.2 – we start with a simple query and its execution plan:

select
	*
from
	t2
where
	not exists (
		select	null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2
	)
and	not exists (
		select	null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2 + 1000
	)
;

-------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |    50 |  7200 |    16 |
|   1 |  NESTED LOOPS ANTI  |       |    50 |  7200 |    16 |
|   2 |   NESTED LOOPS ANTI |       |  4999 |   644K|    16 |
|   3 |    TABLE ACCESS FULL| T2    |  5000 |   585K|    16 |
|*  4 |    INDEX UNIQUE SCAN| T3_PK |     1 |    12 |       |
|*  5 |   INDEX UNIQUE SCAN | T3_PK |  5000 | 60000 |       |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2"+1000)
   5 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2")

Now what happens if I take this piece of SQL (which is really the core of a much more complex query), put it into a CTE, and reference it later ? In one of the queries in the client’s application I wanted to materialize a piece of code like this because the result set was used multiple times in the body of the query. You’ll notice that I got two anti-joins when my test case ran as a “naked” SQL statement, but look what happens when I try to use the CTE mechanism in the simplest possible way:

with	cte as (
select
	/*+ materialize */
	*
from
	t2
where
	not exists (
		select
			null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2
	)
and	not exists (
		select
			null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2 + 1000
	)
)
select
	*
from
	cte
;

------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Rows  | Bytes | Cost  |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |  5000 |   468K|  5029 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |       |       |       |
|   2 |   LOAD AS SELECT           | SYS_TEMP_0FD9D6641_40101879 |       |       |       |
|*  3 |    FILTER                  |                             |       |       |       |
|   4 |     TABLE ACCESS FULL      | T2                          |  5000 |   585K|    15 |
|*  5 |     INDEX UNIQUE SCAN      | T3_PK                       |     1 |    12 |     1 |
|*  6 |     INDEX UNIQUE SCAN      | T3_PK                       |     1 |    12 |     1 |
|   7 |   VIEW                     |                             |  5000 |   468K|    14 |
|   8 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6641_40101879 |  5000 |   585K|    14 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter( NOT EXISTS (SELECT 0 FROM "T3" "T3" WHERE "ID3"=:B1 AND "N3B"=:B2
              AND "N3A"=:B3) AND  NOT EXISTS (SELECT 0 FROM "T3" "T3" WHERE "ID3"=:B4+1000 AND
              "N3B"=:B5 AND "N3A"=:B6))
   5 - access("N3A"=:B1 AND "N3B"=:B2 AND "ID3"=:B3)
   6 - access("N3A"=:B1 AND "N3B"=:B2 AND "ID3"=:B3+1000)

When I try to materialize the view Oracle reverts to a pair of filter subqueries. If I take away the /*+ materialize */ hint the CTE goes inline and the query optimizes with anti-joins – just as it did in the original form – so the behaviour isn’t as inconsistent as some of the earlier cases I’ve documented. I can get the behaviour I want by adding /*+ unnest */ hints to the two subqueries so the problem isn’t a show-stopper, but it’s just a little irritating to have to do this.

Here’s what I found in the 10053 trace file with the attempt to materialize:

SU:   Checking validity of unnesting subquery SEL$2 (#0)
SU:     SU bypassed: invalidated.
SU:   Validity checks failed.
SU:   Checking validity of unnesting subquery SEL$3 (#0)
SU:     SU bypassed: invalidated.
SU:   Validity checks failed.

That’s just a little strange, considering that the equivalent section of the trace for the version where the CTE goes inline reads as folllows:

SU:   Checking validity of unnesting subquery SEL$2 (#3)
SU:   Passed validity checks.
SU:   Unnesting subquery query block SEL$2 (#3)SU: Transform ALL/NOTEXISTS subquery into a regular antijoin.
SU:   Checking validity of unnesting subquery SEL$3 (#2)
SU:   Passed validity checks.
SU:   Unnesting subquery query block SEL$3 (#2)SU: Transform ALL/NOTEXISTS subquery into a regular antijoin.

Thank goodness we can still hint – and then generate SQL Baselines.

[Further reading on "subquery factoring"]

Update 7th Aug 2013

In 12c the execution plan for both versions of the code is the same (using the anti-join method).

9 Comments »

  1. Reblogged this on lava kafle kathmandu nepal.

    Comment by lkafle — May 25, 2012 @ 3:57 am BST May 25,2012 | Reply

  2. [...] Jonathan Lewis blogs about subquery factoring. [...]

    Pingback by Log Buffer #273, A Carnival of the Vanities for DBAs | The Pythian Blog — May 25, 2012 @ 9:31 am BST May 25,2012 | Reply

  3. Jonathan sir,

    Is it possible to ‘push’ the effects of materialize hint in a query plan using SQL profile?

    I am trying to attach materialize hint to a query using SQL profile, but the plan generated through the SQL profile doesn’t match with the one generated through direct hint. Version is 10.2.0.5 EE on RHEL.

    
    create or replace function f_test
    return number
    as 
    begin
      for r in (select min(snap_id) snap_id from dba_hist_snapshot)
    	loop
    		  return r.snap_id;
    	end loop;
    end;
    /
    
    explain plan for
    with snp as (select f_test() snap_id from dual)
    select count(*) 
    from dba_hist_snapshot a, snp b
    where a.snap_id= b.snap_id;
    
    Plan hash value: 1583531653
    
    select * from table(dbms_xplan.display);
    
    -------------------------------------------------------------------------------------
    | Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT    |               |     1 |     7 |   113   (8)| 00:00:01 |
    |   1 |  SORT AGGREGATE     |               |     1 |     7 |            |          |
    |   2 |   NESTED LOOPS      |               |     4 |    28 |   113   (8)| 00:00:01 |
    |   3 |    FAST DUAL        |               |     1 |       |     2   (0)| 00:00:01 |
    |*  4 |    TABLE ACCESS FULL| WRM$_SNAPSHOT |     4 |    28 |   111   (9)| 00:00:01 |
    -------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       4 - filter("SNAP_ID"="F_TEST"() AND "STATUS"=0)
    
    

    This is not efficient, because the function will execute once per very row in the wrm$_snapshot table. An easy way to prevent this is to add a materialize hint to the CTE.

    
    explain plan for
    with snp as (select /*+ materialize */ f_test() snap_id from dual)
    select count(*) 
    from dba_hist_snapshot a, snp b
    where a.snap_id= b.snap_id;
    
    select * from table(dbms_xplan.display);
    
    ----------------------------------------------------------------------------------------------------------
    | Id  | Operation                  | Name                        | Rows  | Bytes | Cost (%CPU)| Time  |
    ----------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT           |                             |     1 |    20 |   108   (2)| 00:00:01 |
    |   1 |  TEMP TABLE TRANSFORMATION |                             |       |       |            |       |
    |   2 |   LOAD AS SELECT           |                             |       |       |            |       |
    |   3 |    FAST DUAL               |                             |     1 |       |     2   (0)| 00:00:01 |
    |   4 |   SORT AGGREGATE           |                             |     1 |    20 |            |       |
    |*  5 |    HASH JOIN               |                             |     4 |    80 |   106   (2)| 00:00:01 |
    |   6 |     VIEW                   |                             |     1 |    13 |     2   (0)| 00:00:01 |
    |   7 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D7979_73A6CAFD |     1 |    13 |     2   (0)| 00:00:01 |
    |*  8 |     TABLE ACCESS FULL      | WRM$_SNAPSHOT               | 16314 |   111K|   103   (1)| 00:00:01 |
    ----------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       5 - access("SNAP_ID"="B"."SNAP_ID")
       8 - filter("STATUS"=0)
    
    

    With materialize hint, the function will be executed only once. The problem is – I don’t have access to the source code. So the only option for me is to use SQL profile.

    I took the outline data resulting from the query with materialize hint, created a SQL profile. However, the plan generated by SQL profile looks different from the one generated with materialize hint. The code to generate SQL profile is too big, so giving below a ‘shortcut’ by using outline data as a hint.

    
    select * from table(dbms_xplan.display(null, null, 'advanced'));
    
    Outline Data
    -------------
    
      /*+
          BEGIN_OUTLINE_DATA
          FULL(@"SEL$D67CB2D2" "T1"@"SEL$D67CB2D2")
          USE_HASH(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
          LEADING(@"SEL$335DD26A" "B"@"SEL$2" "WRM$_SNAPSHOT"@"SEL$3")
          FULL(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
          NO_ACCESS(@"SEL$335DD26A" "B"@"SEL$2")
          OUTLINE(@"SEL$3")
          OUTLINE(@"SEL$2")
          OUTLINE(@"SEL$1")
          MERGE(@"SEL$3")
          OUTLINE_LEAF(@"SEL$335DD26A")
          OUTLINE_LEAF(@"SEL$D67CB2D2")
          OUTLINE_LEAF(@"SEL$1")
          ALL_ROWS
          OPT_PARAM('_gby_hash_aggregation_enabled' 'false')
          OPT_PARAM('optimizer_dynamic_sampling' 4)
          OPTIMIZER_FEATURES_ENABLE('10.2.0.5')
          IGNORE_OPTIM_EMBEDDED_HINTS
          END_OUTLINE_DATA
      */
    
    explain plan for
    with snp as (select f_test() snap_id from dual)
    select
      /*+ FULL(@"SEL$D67CB2D2" "T1"@"SEL$D67CB2D2")
          USE_HASH(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
          LEADING(@"SEL$335DD26A" "B"@"SEL$2" "WRM$_SNAPSHOT"@"SEL$3")
          FULL(@"SEL$335DD26A" "WRM$_SNAPSHOT"@"SEL$3")
          NO_ACCESS(@"SEL$335DD26A" "B"@"SEL$2")
          OUTLINE(@"SEL$3")
          OUTLINE(@"SEL$2")
          OUTLINE(@"SEL$1")
          MERGE(@"SEL$3")
          OUTLINE_LEAF(@"SEL$335DD26A")
          OUTLINE_LEAF(@"SEL$D67CB2D2")
          OUTLINE_LEAF(@"SEL$1")
          ALL_ROWS
          OPT_PARAM('_gby_hash_aggregation_enabled' 'false')
          OPT_PARAM('optimizer_dynamic_sampling' 4)
          OPTIMIZER_FEATURES_ENABLE('10.2.0.5')
          IGNORE_OPTIM_EMBEDDED_HINTS  */
    count(*) 
    from dba_hist_snapshot a, snp b
    where a.snap_id= b.snap_id;	
    
    select * from table(dbms_xplan.display);
    
    ------------------------------------------------------------------------------------------
    | Id  | Operation            | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT     |                   |     1 |    26 |   106   (2)| 00:00:01 |
    |   1 |  SORT AGGREGATE      |                   |     1 |    26 |            |          |
    |*  2 |   HASH JOIN          |                   |     4 |   104 |   106   (2)| 00:00:01 |
    |   3 |    VIEW              |                   |     1 |    13 |     2   (0)| 00:00:01 |
    |   4 |     FAST DUAL        |                   |     1 |       |     2   (0)| 00:00:01 |
    |   5 |    VIEW              | DBA_HIST_SNAPSHOT | 16314 |   207K|   103   (1)| 00:00:01 |
    |*  6 |     TABLE ACCESS FULL| WRM$_SNAPSHOT     | 16314 |   111K|   103   (1)| 00:00:01 |
    ------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("A"."SNAP_ID"="B"."SNAP_ID")
       6 - filter("STATUS"=0)
    
    

    The plan generated by SQL profile is not much different from above:

    -------------------------------
    SQL_ID  8vncgw5ygpx1b, child number 1
    -------------------------------------
    with snp as (select f_test() snap_id from dual) select count(*) from
    dba_hist_snapshot a, snp b where a.snap_id= b.snap_id
    
    Plan hash value: 1587178679
    
    -------------------------------------------------------------------------------------
    | Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT    |               |       |       |   106 (100)|          |
    |   1 |  SORT AGGREGATE     |               |     1 |    20 |            |          |
    |*  2 |   HASH JOIN         |               |     4 |    80 |   106   (2)| 00:00:01 |
    |   3 |    VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
    |   4 |     FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
    |*  5 |    TABLE ACCESS FULL| WRM$_SNAPSHOT | 16314 |   111K|   103   (1)| 00:00:01 |
    -------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("SNAP_ID"="B"."SNAP_ID")
       5 - filter("STATUS"=0)
    
    Note
    -----
       - SQL profile "Test_Prof" used for this statement
    

    In the latter two plan, it is using a HASH JOIN instead of NL, so I believe it will execute the function only once. But why does it not create a temp table, as it does with materialize hint? Is there any way to force this?

    Many thanks.

    Comment by CJ — May 28, 2012 @ 3:31 pm BST May 28,2012 | Reply

  4. I see that the ‘sourcecode’ tags haven’t been applied. A preview functionality here would be really helpful.

    Comment by CJ — May 28, 2012 @ 3:33 pm BST May 28,2012 | Reply

    • And maybe a facility to delete comments.. because my earlier comment is now redundant.

      Comment by CJ — May 28, 2012 @ 4:39 pm BST May 28,2012 | Reply

    • CJ,

      I’d like a preview option as well – but it’s not available.

      The sourcecode tags (which you correctly put in square brackets) have an open and close form. To close you need a leading /, as in “/sourcecode” (but in square brackets, not quote marks).

      If you look in view v$sql_hints you will find that the inline and materialize hints do not have an “outline version” which (I believe) means you can’t inject them through a stored outline, profile, or SQL Baseline. In your case it would be sufficient to inject a no_merge() hint into the subquery factoring clause (CTE). Since the query block in the clause is sel$1 you need a minimum of /*+ no_merge(@sel$1) */ in your outline/profile to get the plan:

      -------------------------------------------------------------------------------------
      | Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
      -------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT    |               |     1 |    19 |     5  (20)| 00:00:01 |
      |   1 |  SORT AGGREGATE     |               |     1 |    19 |            |          |
      |*  2 |   HASH JOIN         |               |     1 |    19 |     5  (20)| 00:00:01 |
      |   3 |    VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
      |   4 |     FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
      |*  5 |    TABLE ACCESS FULL| WRM$_SNAPSHOT |    54 |   324 |     2   (0)| 00:00:01 |
      -------------------------------------------------------------------------------------
      
      

      Technically an alternative hint would be /*+ outline_leaf(@sel$1) */, but the outline_leaf() hint is not documented, so should not be used.

      Comment by Jonathan Lewis — May 28, 2012 @ 4:57 pm BST May 28,2012 | Reply

  5. Looks like I have found my answer. The NO_MERGE hint needs to be added not to the CTE, but to every query wherein the CTE referred.

    
    explain plan for
    with snp as (select f_test() snap_id from dual)
    select /*+ no_merge(b) */ count(*)
    from dba_hist_snapshot a, snp b
    where (a.snap_id= b.snap_id) 
    union all
    select /*+ no_merge(b) */ count(*)
    from dba_hist_snapshot a, snp b
    where (a.snap_id < b.snap_id) ;
    
    --------------------------------------------------------------------------------------
    | Id  | Operation            | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT     |               |     2 |    40 |   211  (51)| 00:00:01 |
    |   1 |  UNION-ALL           |               |       |       |            |          |
    |   2 |   SORT AGGREGATE     |               |     1 |    20 |            |          |
    |*  3 |    HASH JOIN         |               |     4 |    80 |   106   (2)| 00:00:01 |
    |   4 |     VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
    |   5 |      FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
    |*  6 |     TABLE ACCESS FULL| WRM$_SNAPSHOT | 16314 |   111K|   103   (1)| 00:00:01 |
    |   7 |   SORT AGGREGATE     |               |     1 |    20 |            |          |
    |   8 |    NESTED LOOPS      |               |   816 | 16320 |   105   (1)| 00:00:01 |
    |   9 |     VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
    |  10 |      FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
    |* 11 |     TABLE ACCESS FULL| WRM$_SNAPSHOT |   816 |  5712 |   103   (1)| 00:00:01 |
    --------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("SNAP_ID"="B"."SNAP_ID")
       6 - filter("STATUS"=0)
      11 - filter("STATUS"=0 AND "SNAP_ID"<"B"."SNAP_ID")
    
    

    This is contrary to my earlier perception that NO_MERGE hint had to be added to the CTE and not anywhere else. Which one is correct?

    Many thanks.

    Comment by CJ — May 30, 2012 @ 3:37 pm BST May 30,2012 | Reply

  6. Jonathan sir,

    A rather curious behaviour with NO_MERGE and UNION. NO_MERGE works with the query appearing last in UNION ALL, but not the query appearing before.

    
    
    explain plan for
    with snp as (select /*+ no_merge */ f_test() snap_id from dual)
    select count(*)
    from dba_hist_snapshot a, snp b
    where (a.snap_id= b.snap_id) 
    union all
    select count(*)
    from dba_hist_snapshot a, snp b
    where (a.snap_id < b.snap_id) 
    ;
    
    
    --------------------------------------------------------------------------------------
    | Id  | Operation            | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT     |               |     2 |    27 |   218  (53)| 00:00:01 |
    |   1 |  UNION-ALL           |               |       |       |            |          |
    |   2 |   SORT AGGREGATE     |               |     1 |     7 |            |          |
    |   3 |    NESTED LOOPS      |               |     4 |    28 |   113   (8)| 00:00:01 |
    |   4 |     FAST DUAL        |               |     1 |       |     2   (0)| 00:00:01 |
    |*  5 |     TABLE ACCESS FULL| WRM$_SNAPSHOT |     4 |    28 |   111   (9)| 00:00:01 |
    |   6 |   SORT AGGREGATE     |               |     1 |    20 |            |          |
    |   7 |    NESTED LOOPS      |               |   816 | 16320 |   105   (1)| 00:00:01 |
    |   8 |     VIEW             |               |     1 |    13 |     2   (0)| 00:00:01 |
    |   9 |      FAST DUAL       |               |     1 |       |     2   (0)| 00:00:01 |
    |* 10 |     TABLE ACCESS FULL| WRM$_SNAPSHOT |   816 |  5712 |   103   (1)| 00:00:01 |
    --------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       5 - filter("SNAP_ID"="F_TEST"() AND "STATUS"=0)
      10 - filter("STATUS"=0 AND "SNAP_ID"<"B"."SNAP_ID")
    
    

    From the filter conditions you can see that the function will execute once for every row in the first part of UNION, whereas it will execute only once for the second part. I traced using DBMS_PROFILER to confirm this. What can be the reason? (BTW, this is why I was trying to push in the MATERIALIZE hint).

    Many thanks.

    Comment by CJ — May 30, 2012 @ 3:08 pm BST May 30,2012 | Reply

  7. […] Disappearance of semi-join transformation – fixed […]

    Pingback by 12c subquery factoring | Oracle Scratchpad — August 7, 2013 @ 6:15 pm BST Aug 7,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,266 other followers