Subquery Factoring (3)

June 29, 2010

Subquery Factoring (3)

Filed under: CBO,Execution plans,Oracle,Performance,Subquery Factoring,Troubleshooting — Jonathan Lewis @ 6:28 pm BST Jun 29,2010

From time to time I’ve warned people that subquery factoring should be used with a little care if all you’re trying to do is make a query more readable by extracting parts of the SQL into “factored subqueries” (or Common Table Expressions – CTEs – if you want to use the ANSI term for them). In principle, for example, the following two queries should produce the same execution plan:

select
        t1.*
from    t1,
        (
         select
                distinct n30
         from   t2
         where
                n30 < 5  -- > comment to avoid wordpress format issue
        ) subq
where
        t1.n30 = subq.n30
;

with subq as (
        select
                distinct n30
        from   t2
        where
                n30 < 5  -- > comment to avoid wordpress format issue
)
select
        t1.*
from
        t1,
        subq
where
        t1.n30 = subq.n30
;

All I’ve done in the second query is changed the inline view to a common table expression at the top of the statement in the hope of making the query text look simpler. When I do this, I expect Oracle to copy the CTE back in line and then optimize, coming up with exactly the same plan in both cases. Unfortunately this is what I find:

For the query with the inline view, I see that the optimizer has applied complex view merging, joining early and aggregating late:

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   161 | 14007 |   302   (2)| 00:00:04 |
|   1 |  VIEW                |      |   161 | 14007 |   302   (2)| 00:00:04 |
|   2 |   HASH UNIQUE        |      |   161 | 21252 |   302   (2)| 00:00:04 |
|*  3 |    HASH JOIN         |      |   161 | 21252 |   301   (1)| 00:00:04 |
|*  4 |     TABLE ACCESS FULL| T2   |   150 |  1800 |   150   (1)| 00:00:02 |
|*  5 |     TABLE ACCESS FULL| T1   |   150 | 18000 |   150   (1)| 00:00:02 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."N30"="N30")
   4 - filter("N30"<5)
   5 - filter("T1"."N30"<5)

For the query with the CTE, the optimizer has not used complex view merging, it’s aggregated early and joined late.

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   150 | 20250 |   302   (2)| 00:00:04 |
|*  1 |  HASH JOIN           |      |   150 | 20250 |   302   (2)| 00:00:04 |
|   2 |   VIEW               |      |     5 |    75 |   151   (2)| 00:00:02 |
|   3 |    HASH UNIQUE       |      |     5 |    40 |   151   (2)| 00:00:02 |
|*  4 |     TABLE ACCESS FULL| T2   |   150 |  1200 |   150   (1)| 00:00:02 |
|   5 |   TABLE ACCESS FULL  | T1   | 30000 |  3515K|   150   (1)| 00:00:02 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."N30"="SUBQ"."N30")
   4 - filter("N30"<5)

This particular test case behaves the same way in 10.2.0.3 and 11.1.0.6. Since I’m running with CPU costing (system statistics), it’s possible that you won’t be able to reproduce the same result without some fiddling with data sizes or database parameters, but here’s the code to generate the two tables:


rem
rem     Script:         subq_difference.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2010
rem
 
create table t1
as
with generator as (
        select  --+ materialize
                rownum  id
        from    all_objects
        where   rownum <= 3000  -- > comment to avoid wordpress format issue
)
select
        rownum                  id,
        mod(rownum,1000)        n30,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 30000
;

create table t2
as
with generator as (
        select  --+ materialize
                rownum  id
        from    all_objects
        where   rownum <= 3000  -- > comment to avoid wordpress format issue
)
select
        rownum                  id,
        mod(rownum,1000)        n30,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 30000  -- > comment to avoid wordpress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                estimate_percent => null,
                block_sample     => true,
                method_opt       => 'for all columns size 1',
                cascade          => true
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                estimate_percent => null,
                block_sample     => true,
                method_opt       => 'for all columns size 1',
                cascade          => true
        );
end;
/

I also set the system statistics with the following PL/SQL:

begin
	dbms_stats.set_system_stats('MBRC',8);
	dbms_stats.set_system_stats('MREADTIM',26);
	dbms_stats.set_system_stats('SREADTIM',12);
	dbms_stats.set_system_stats('CPUSPEED',800);
end;

You might want to fiddle around with newer releases and see if this result is reproducible – it’s always possible that it’s fixed in 10.2.0.4 and 11.1.0.7, but there are other bugs relating to CTEs and the CBO choosing different optimization strategies because of changes in the path through the code when it sees a CTE.

Update 7th Aug 2013

Fixed by 11.2.0.3 – view merging took place in both cases on the system I ran it.

On 12c I got the same plan for both versions of the query – but on that system it was the non-merged view that appeared in both cases.

Comments (21)

21 Comments »

Interesting. Had been wondering if the issues would arise. Seems inevitable. Tried on 10.2.0.1 on windows including system stats settings and couldn’t reproduce. Nice to have a n easy to set up test case. Will try tweaking it a bit.
Of course Common Table Expressions are great for cleaning up code as well as replacing repeating sub-queries, but I’m also curious about using CTEs to encourage/force early execution filter evaluations. Encouraging filters to execute early with inline views was a typical approach I used to try and help the optimizer. I’m thinking the same approach would make sense with CTEs but I have yet to experiment much with the approach.

Comment by Kyle Hailey — June 29, 2010 @ 10:01 pm BST Jun 29,2010 | Reply
- I also use “inline view/subquery factoring” for early filtering; in addition I use the hint no_merge. Am I correct? Are there any other strategies?
  
  Comment by Roberto — June 30, 2010 @ 7:32 am BST Jun 30,2010 | Reply
  - Roberto,
    
    The question is a little too open-ended for me to give a definite reply – there are always exceptions (and classes of exception) that matter. However, as a general principle using inline views with /*+ no_merge */ or CTEs with /*+ materialize */ are robust strategies for controlling the way that the optimizer progresses through a complex query and I don’t think there are any other good strategies for applying this type of control. (You can do massive amounts of detailed hinting, of course – but that’s not a good strategy).
    
    Comment by Jonathan Lewis — June 30, 2010 @ 9:17 am BST Jun 30,2010 | Reply
    - /*+ materialize */ implies PIOs on temporary tablespaces. Really I often use /*+ no_merge */ with CTEs (not inline views), for early filtering with complex joins, and this improves performances and stabilizes execution plans.
      
      Comment by Roberto — June 30, 2010 @ 9:37 am BST Jun 30,2010
- Kyle
  
  Which way round did it misbehave. I’d guess that neither query did complex view merging (one because of the cost, the other because it wasn’t considered). For added detail: I was using LMTs with uniform 1MB extents and freelist management, and my db_file_mulitblock_read_count was not set.
  
  Additional thought – did your plans show “hash unique” or “sort unique” ?
  
  Comment by Jonathan Lewis — June 30, 2010 @ 9:05 am BST Jun 30,2010 | Reply
  - Reproduced the case as is on 10.2.0.1
    (Again, i was going too fast to check all my facts correctly the first time around)
    Best
    Kyle
    here is a side by side comparison with Extended Row Source Stats:
    
    Comment by Kyle Hailey — June 30, 2010 @ 11:14 pm BST Jun 30,2010 | Reply
I have observed that sometimes CTE’s cause increased use of temp space. Although I love this approach since it makes query readable.

Comment by Jigar Doshi — June 30, 2010 @ 6:16 am BST Jun 30,2010 | Reply
- Jigar Doshi,
  
  One reason for increased use of temp space is that CTEs that materialize WILL do direct writes to write their content to the temporary tablespace, and then do “db file scattered read”s to read them back on demand.
  
  Comment by Jonathan Lewis — June 30, 2010 @ 9:10 am BST Jun 30,2010 | Reply
  - I don’t have Oracle database at the moment to test, but I’m pretty sure that wait events on reading of the materialized subquery will be “direct path read temp” in 10g and above.
    
    Comment by Timur Akhmadeev — June 30, 2010 @ 3:31 pm BST Jun 30,2010 | Reply

I was able to reproduce the same execution plans for the queries against Oracle 10gR2 (10.2.0.4) running on Windows XP.

I presonally like using CTEs as they make the query more readable.

Here are the results:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for 32-bit Windows: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL>
SQL> set autotrace traceonly exp
SQL>
SQL> @c:\q1

Execution Plan
----------------------------------------------------------
Plan hash value: 2507266765

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   161 | 14007 |   302   (2)| 00:00:04 |
|   1 |  VIEW                |      |   161 | 14007 |   302   (2)| 00:00:04 |
|   2 |   HASH UNIQUE        |      |   161 | 21252 |   302   (2)| 00:00:04 |
|*  3 |    HASH JOIN         |      |   161 | 21252 |   301   (1)| 00:00:04 |
|*  4 |     TABLE ACCESS FULL| T2   |   150 |  1800 |   150   (1)| 00:00:02 |
|*  5 |     TABLE ACCESS FULL| T1   |   150 | 18000 |   150   (1)| 00:00:02 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."N30"="N30")
   4 - filter("N30"<5)
   5 - filter("T1"."N30"
SQL>
SQL>
SQL> @c:\q2

Execution Plan
----------------------------------------------------------
Plan hash value: 3120856933

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   150 | 20250 |   303   (2)| 00:00:04 |
|*  1 |  HASH JOIN           |      |   150 | 20250 |   303   (2)| 00:00:04 |
|   2 |   VIEW               |      |     5 |    75 |   151   (2)| 00:00:02 |
|   3 |    HASH UNIQUE       |      |     5 |    40 |   151   (2)| 00:00:02 |
|*  4 |     TABLE ACCESS FULL| T2   |   150 |  1200 |   150   (1)| 00:00:02 |
|   5 |   TABLE ACCESS FULL  | T1   | 30000 |  3515K|   150   (1)| 00:00:02 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."N30"="SUBQ"."N30")
   4 - filter("N30"
SQL> set autotrace off
SQL>

Comment by Asif Momen — June 30, 2010 @ 7:26 am BST Jun 30,2010 | Reply

Jonathan, what did Oracle support say when you spoke to them about this?

BTW, can you please tell me where you obtained the MATERIALIZE hint from? It is not in any of the manuals (at least I can’t find it; I’ve looked in 10.2 and 11.2), and as such it is not an officially recognised hint, so it shouldn’t be used in production code !

Comment by Martin Rose — June 30, 2010 @ 8:02 am BST Jun 30,2010 | Reply
- Martin,
  
  I don’t raise SRs with Oracle for every anomaly I find. If I find something that’s important to the client then I let them pursue it, and try to create a demonstration case to support their SR; but sometimes the client will prefer to use an alternative mechanism rather than pursue an SR.
  
  In the case of the CVM failing, there is at least one open bug on Metalink at present about the way in which the CTE heuristics cause some CBQT mechanism to be bypassed – unfortunately I’m keep getting “search timeout” errors when I try to find it.
  
  I can’t remember why I know about the materialize and inline hints. I never take Metalink as a flat-out justification for applying a given strategy or method, but I think if you search metalink for references to the materialize hint you will find that there are examples of Oracle support advising end-users to use it.
  
  Update: Doing a quick search for the materialize hint, I came across bug 7597354 – “Wrong results from WITH clause which is put INLINE [ID 7597354.8]” Fixed in 10.2.0.5 and 11.2.0.1. One of the supplied workarounds was to use the materialize hint (although it was given as “materialized” in the bug note), and a linked bug suggested the use of the inline hint as a workaround to that problem,
  
  Comment by Jonathan Lewis — June 30, 2010 @ 9:36 am BST Jun 30,2010 | Reply
A little side-note: according to bug note 5502215 there is a new parameter in 10.2.0.5 and 11.1.0.6 which changes the way that the optimizer handles CTEs (subquery factoring). The description is as follows:

The choice of which method to use is based on a set of heuristics. This fix allows a preference to be specified via the new “_with_subquery” parameter which can be set to:

INLINE – always inline the subquery
MATERIALIZE – always try to materialize the subquery (if materialize is allowed)
not set or OPTIMIZER – use existing heuristics.

Comment by Jonathan Lewis — June 30, 2010 @ 2:26 pm BST Jun 30,2010 | Reply
Personally I hate subquery factoring in the production code. It makes SQL looks like not SQL. What I mean is having WITH clause makes your “I want that data” request presented as “First of all I want this thing, after this I want that one, and finally I would like to build the results from several intermediate result sets” – you see, it’s more like procedural logic. And I don’t like procedural logic :)
Another thing to consider is there are many bugs associated with subquery factoring which places it on the same level as ANSI joins – i.e. of course you can use it, but don’t be surprised with unexpected consequences.
Just my 2c.

Comment by Timur Akhmadeev — June 30, 2010 @ 3:43 pm BST Jun 30,2010 | Reply
- > it’s more like procedural logic
  
  I couldn’t disagree more.
  I find it an invaluable tool for thinking clearly – and getting others to think clearly – about the sets of data that you wish to join together.
  
  Comment by Dominic Brooks — June 30, 2010 @ 8:20 pm BST Jun 30,2010 | Reply
I just tested at my home laptop, Windows 7 and Oracle 11.2.0.1. I get the same results as Jonathan by using his scripts. Oracle is at the default settings, out-of-the-box.

So I think not much has changed.

Comment by htendam — June 30, 2010 @ 7:25 pm BST Jun 30,2010 | Reply
I use subquery factoring to improve query performance when using the CONNECT BY clause. If I create a factored subquery containing only the set of columns I need from a table before applying the CONNECT BY, it gets much faster. It seems that the CONNECT BY clause is faster when run on a smaller set of data. This might have changed in Oracle 11G as shown in this post: http://raptorreports.blogspot.com/2010/03/hierarchical-queries-in-oracle-10g-11g.html

Comment by Donat Callens — July 14, 2010 @ 7:31 am BST Jul 14,2010 | Reply
[…] — Jonathan Lewis @ 6:38 pm UTC Sep 13,2010 I’ve written before about the effects of subquery factoring (common table expressions – or CTEs) on the optimizer, and the way that the optimizer can […]

Pingback by Subquery Factoring (4) « Oracle Scratchpad — September 13, 2010 @ 6:40 pm BST Sep 13,2010 | Reply
[…] the last thre or four years I’ve made several commentsabout how subquery factoring could result in changes in execution plans even if the “factored […]

Pingback by Subquery Factoring « Oracle Scratchpad — February 1, 2012 @ 5:53 pm GMT Feb 1,2012 | Reply
[…] Consistency in view merging – fixed (in 11.2.0.3) […]

Pingback by 12c subquery factoring | Oracle Scratchpad — August 7, 2013 @ 6:17 pm BST Aug 7,2013 | Reply
[…] Rewriting to use subquery factoring can cause changes to execution plans […]

Pingback by CTE Catalogue | Oracle Scratchpad — June 10, 2020 @ 6:46 pm BST Jun 10,2020 | Reply

RSS feed for comments on this post. TrackBack URI

Oracle Scratchpad

June 29, 2010