Oracle Scratchpad

February 14, 2012

Subquery Factoring

Filed under: Bugs,CBO,Execution plans,Oracle,Subquery Factoring,Tuning — Jonathan Lewis @ 5:59 pm BST Feb 14,2012

Here’s an interesting little conundrum about subquery factoring that hasn’t changed in the recent (11.2.0.3) patch for subquery factoring. It came to me from Jared Still (a fellow member of Oak Table Network) shortly after I’d made some comments about the patch. It’s an example based on the scott/tiger schema – which I’ve extracted from the script $ORACLE_HOME/rdbms/admin/utlsampl.sql (though the relevant scripts may be demobld.sql or scott.sql, depending on version).

As usual I’ve used an 8KB block size, LMT with uniform 1MB extents, and no ASSM to hold the data. I won’t reproduce the code to generate the schema, just the versions of the query with, then without, subquery factoring:

with e as (
	select
		/*+ gather_plan_statistics */
		deptno
	from	emp
	order by
		deptno
)
select
	deptno, count(*)
from
	e
group by deptno
;

select   deptno, count(*)
from (
	select
		/*+ gather_plan_statistics */
		deptno
	from	emp
	order by
		deptno
) e
group by deptno
;

-- reference script: with_subq_anomaly.sql

I have to say that I’m not entirely sure what Oracle should do with these queries – but I do think it should end up with the same result in both cases, but here are the two sets of results and their execution plans (in the same order that I listed the queries  above):

    DEPTNO   COUNT(*)
---------- ----------
        10          3
        20          5
        30          6

3 rows selected.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  fk4xrmz1wpcb9, child number 0
-------------------------------------
with e as (  select   /*+ gather_plan_statistics */   deptno  from emp
order by   deptno ) select  deptno, count(*) from  e group by deptno

Plan hash value: 1536940522

-------------------------------------
| Id  | Operation            | Name |
-------------------------------------
|   0 | SELECT STATEMENT     |      |
|   1 |  SORT GROUP BY NOSORT|      |
|   2 |   VIEW               |      |
|   3 |    SORT ORDER BY     |      |
|   4 |     TABLE ACCESS FULL| EMP  |
-------------------------------------

    DEPTNO   COUNT(*)
---------- ----------
        30          6
        20          5
        10          3

3 rows selected.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  17m8h7wgmxvr0, child number 0
-------------------------------------
select   deptno, count(*) from (  select   /*+ gather_plan_statistics
*/   deptno  from emp  order by   deptno ) e group by deptno

Plan hash value: 4067220884

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|   1 |  HASH GROUP BY     |      |
|   2 |   TABLE ACCESS FULL| EMP  |
-----------------------------------

As you can see, with the inline view Oracle has managed to eliminate the order by clause; but when we use subquery factoring Oracle inlines the subquery as a non-mergeable view and preserves the order by, then recognises that it can achieve the aggregation through a “sort group by” on data that doesn’t need any more sorting.

Personally I think that the original authors of this SQL have got lucky – so far. Neither version of the statement, as written, can guarantee any order to the data because the final operator is not an order by. It is simply a limitation in the current optimizer code that the order by clause hasn’t been eliminated from the factored subquery. At the moment, then, this limitation means that this third party application is getting the expected results by accident – and the next upgrade might make the accident stop happening.

There is a fairly simple workaround to this type of problem – if you write the query with the inline view but add the hint /*+ no_eliminate_oby */ to the inline view, you can capture the baseline for the view and attach it to the version of the query with the inline view but without the hint. (See Fake Baselines for one possible method of doing this capture.)

[Further reading on "subquery factoring"]

Update 7th Aug 2013

This difference between the inline approach and the subquery factoring that chooses to move the subquery inline is still present in 12c (12.1.0.1). This suggests that you may find other cases in 12c where rewriting a query with factored subqueries to reduce the apparent complexity may still produce a change of execution plan.

9 Comments »

  1. what is the advantage of putting the order by in the “inner” query?

    i would always have written this query as

    select   deptno, count(*)
    from (
        select
            /*+ gather_plan_statistics */
            deptno
        from    emp
    ) e
    group by deptno
    order by deptno
    ;
    
    

    Comment by David Pyke Le Brun — February 14, 2012 @ 10:14 pm BST Feb 14,2012 | Reply

    • David,

      “what is the advantage of putting the order by in the “inner” query?”

      Don’t ask me, I didn’t write the application: “… this third party application is … “. The sample above is just demonstrating the issue, perhaps if we saw an example of the real application code we might some rationale for doing things this way.

      Comment by Jonathan Lewis — February 15, 2012 @ 12:19 am BST Feb 15,2012 | Reply

      • What i meant was, without the order by on the outside of the query, I would never expect to have deterministic row ordering.

        in any case… Is the same true if you use a full view with an order by clause.. ?

        ie

        create or replace view e
        as
          select
                /*+ gather_plan_statistics */
                deptno
          from    emp
          order by deptno
        ;
        
        select   deptno, count(*)
        from  e
        group by deptno
        ;
        
        

        Comment by David Pyke Le Brun — February 15, 2012 @ 2:07 am BST Feb 15,2012 | Reply

        • David,

          You’re absolutely right. The application code is wrong if it’s expecting an ordered result without supplying an “order by” clause – and that’s true under any circumstances. It just happens that sometimes we put in an “order by” clause and the execution plan doesn’t end with a “sort order by” operation because the optimizer has determined that previous operations will guarantee that the data will be appearing in the right order as a guaranteed side effect of the earlier operatins.

          The only point I was making with the note was that there are still cases even with the latest patching where the code with the inline view doesn’t produce the same execution plan as the code with a CTE that has been inlined.

          Comment by Jonathan Lewis — February 15, 2012 @ 2:24 am BST Feb 15,2012

  2. I think the “advantage” is demonstrating select deptno from emp order by deptno has different results in different queries.

    Comment by Barfo Rama — February 14, 2012 @ 11:12 pm BST Feb 14,2012 | Reply

  3. The first query behaves just like the second one if you do:

    alter table emp parallel 4;

    Comment by A reader — February 15, 2012 @ 2:16 pm BST Feb 15,2012 | Reply

    • A reader,
      Not on a couple of versions I tested. I still see one query doing the sort order by (although parallel) and the other not, viz:

      ---------------------------------------------------------------------------------------------------
      | Id  | Operation                    | Name     | Rows  | Bytes | Cost  |    TQ  |IN-OUT| PQ Distri
      ---------------------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT             |          |     3 |     9 |     6 |        |      |
      |   1 |  PX COORDINATOR              |          |       |       |       |        |      |
      |   2 |   PX SEND QC (RANDOM)        | :TQ10002 |     3 |     9 |     6 |  Q1,02 | P->S | QC (RAND)
      |   3 |    HASH GROUP BY             |          |     3 |     9 |     6 |  Q1,02 | PCWP |
      |   4 |     PX RECEIVE               |          |     3 |     9 |     6 |  Q1,02 | PCWP |
      |   5 |      PX SEND HASH            | :TQ10001 |     3 |     9 |     6 |  Q1,01 | P->P | HASH
      |   6 |       HASH GROUP BY          |          |     3 |     9 |     6 |  Q1,01 | PCWP |
      |   7 |        VIEW                  |          |    14 |    42 |     6 |  Q1,01 | PCWP |
      |   8 |         SORT ORDER BY        |          |    14 |    42 |     6 |  Q1,01 | PCWP |
      |   9 |          PX RECEIVE          |          |    14 |    42 |     2 |  Q1,01 | PCWP |
      |  10 |           PX SEND RANGE      | :TQ10000 |    14 |    42 |     2 |  Q1,00 | P->P | RANGE
      |  11 |            PX BLOCK ITERATOR |          |    14 |    42 |     2 |  Q1,00 | PCWC |
      |  12 |             TABLE ACCESS FULL| EMP      |    14 |    42 |     2 |  Q1,00 | PCWP |
      ---------------------------------------------------------------------------------------------------
      
      --------------------------------------------------------------------------------------------------
      | Id  | Operation                | Name     | Rows  | Bytes | Cost  |    TQ  |IN-OUT| PQ Distrib |
      --------------------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT         |          |     3 |     9 |    27 |        |      |            |
      |   1 |  PX COORDINATOR          |          |       |       |       |        |      |            |
      |   2 |   PX SEND QC (RANDOM)    | :TQ10001 |     3 |     9 |    27 |  Q1,01 | P->S | QC (RAND)  |
      |   3 |    HASH GROUP BY         |          |     3 |     9 |    27 |  Q1,01 | PCWP |            |
      |   4 |     PX RECEIVE           |          |     3 |     9 |    27 |  Q1,01 | PCWP |            |
      |   5 |      PX SEND HASH        | :TQ10000 |     3 |     9 |    27 |  Q1,00 | P->P | HASH       |
      |   6 |       HASH GROUP BY      |          |     3 |     9 |    27 |  Q1,00 | PCWP |            |
      |   7 |        PX BLOCK ITERATOR |          |    14 |    42 |     2 |  Q1,00 | PCWC |            |
      |   8 |         TABLE ACCESS FULL| EMP      |    14 |    42 |     2 |  Q1,00 | PCWP |            |
      --------------------------------------------------------------------------------------------------
      
      

      Comment by Jonathan Lewis — March 12, 2012 @ 1:21 pm BST Mar 12,2012 | Reply

  4. [...] Jonathan Lewis shares an interesting little conundrum about subquery factoring that hasn’t changed in the recent (11.2.0.3) patch for subquery factoring. [...]

    Pingback by Log Buffer #259, A Carnival of the Vanities for DBAs — March 4, 2013 @ 3:37 pm BST Mar 4,2013 | Reply

  5. […] Order by elimination disappeared – still broken […]

    Pingback by 12c subquery factoring | Oracle Scratchpad — August 7, 2013 @ 6:15 pm BST Aug 7,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,013 other followers