Here’s an interesting little conundrum about subquery factoring that hasn’t changed in the recent (11.2.0.3) patch for subquery factoring. It came to me from Jared Still (a fellow member of Oak Table Network) shortly after I’d made some comments about the patch. It’s an example based on the scott/tiger schema – which I’ve extracted from the script $ORACLE_HOME/rdbms/admin/utlsampl.sql (though the relevant scripts may be demobld.sql or scott.sql, depending on version).
As usual I’ve used an 8KB block size, LMT with uniform 1MB extents, and no ASSM to hold the data. I won’t reproduce the code to generate the schema, just the versions of the query with, then without, subquery factoring:
with e as ( select /*+ gather_plan_statistics */ deptno from emp order by deptno ) select deptno, count(*) from e group by deptno ; select deptno, count(*) from ( select /*+ gather_plan_statistics */ deptno from emp order by deptno ) e group by deptno ; -- reference script: with_subq_anomaly.sql
I have to say that I’m not entirely sure what Oracle should do with these queries – but I do think it should end up with the same result in both cases, but here are the two sets of results and their execution plans (in the same order that I listed the queries above):
DEPTNO COUNT(*)
---------- ----------
10 3
20 5
30 6
3 rows selected.
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID fk4xrmz1wpcb9, child number 0
-------------------------------------
with e as ( select /*+ gather_plan_statistics */ deptno from emp
order by deptno ) select deptno, count(*) from e group by deptno
Plan hash value: 1536940522
-------------------------------------
| Id | Operation | Name |
-------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT GROUP BY NOSORT| |
| 2 | VIEW | |
| 3 | SORT ORDER BY | |
| 4 | TABLE ACCESS FULL| EMP |
-------------------------------------
DEPTNO COUNT(*)
---------- ----------
30 6
20 5
10 3
3 rows selected.
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 17m8h7wgmxvr0, child number 0
-------------------------------------
select deptno, count(*) from ( select /*+ gather_plan_statistics
*/ deptno from emp order by deptno ) e group by deptno
Plan hash value: 4067220884
-----------------------------------
| Id | Operation | Name |
-----------------------------------
| 0 | SELECT STATEMENT | |
| 1 | HASH GROUP BY | |
| 2 | TABLE ACCESS FULL| EMP |
-----------------------------------
As you can see, with the inline view Oracle has managed to eliminate the order by clause; but when we use subquery factoring Oracle inlines the subquery as a non-mergeable view and preserves the order by, then recognises that it can achieve the aggregation through a “sort group by” on data that doesn’t need any more sorting.
Personally I think that the original authors of this SQL have got lucky – so far. Neither version of the statement, as written, can guarantee any order to the data because the final operator is not an order by. It is simply a limitation in the current optimizer code that the order by clause hasn’t been eliminated from the factored subquery. At the moment, then, this limitation means that this third party application is getting the expected results by accident – and the next upgrade might make the accident stop happening.
There is a fairly simple workaround to this type of problem – if you write the query with the inline view but add the hint /*+ no_eliminate_oby */ to the inline view, you can capture the baseline for the view and attach it to the version of the query with the inline view but without the hint. (See Fake Baselines for one possible method of doing this capture.)

what is the advantage of putting the order by in the “inner” query?
i would always have written this query as
select deptno, count(*) from ( select /*+ gather_plan_statistics */ deptno from emp ) e group by deptno order by deptno ;Comment by David Pyke Le Brun — February 14, 2012 @ 10:14 pm UTC Feb 14,2012 |
David,
Comment by Jonathan Lewis — February 15, 2012 @ 12:19 am UTC Feb 15,2012 |
What i meant was, without the order by on the outside of the query, I would never expect to have deterministic row ordering.
in any case… Is the same true if you use a full view with an order by clause.. ?
ie
create or replace view e as select /*+ gather_plan_statistics */ deptno from emp order by deptno ; select deptno, count(*) from e group by deptno ;Comment by David Pyke Le Brun — February 15, 2012 @ 2:07 am UTC Feb 15,2012 |
David,
You’re absolutely right. The application code is wrong if it’s expecting an ordered result without supplying an “order by” clause – and that’s true under any circumstances. It just happens that sometimes we put in an “order by” clause and the execution plan doesn’t end with a “sort order by” operation because the optimizer has determined that previous operations will guarantee that the data will be appearing in the right order as a guaranteed side effect of the earlier operatins.
The only point I was making with the note was that there are still cases even with the latest patching where the code with the inline view doesn’t produce the same execution plan as the code with a CTE that has been inlined.
Comment by Jonathan Lewis — February 15, 2012 @ 2:24 am UTC Feb 15,2012
I think the “advantage” is demonstrating select deptno from emp order by deptno has different results in different queries.
Comment by Barfo Rama — February 14, 2012 @ 11:12 pm UTC Feb 14,2012 |
The first query behaves just like the second one if you do:
alter table emp parallel 4;
Comment by A reader — February 15, 2012 @ 2:16 pm UTC Feb 15,2012 |
A reader,
Not on a couple of versions I tested. I still see one query doing the sort order by (although parallel) and the other not, viz:
Comment by Jonathan Lewis — March 12, 2012 @ 1:21 pm UTC Mar 12,2012 |
[...] Jonathan Lewis shares an interesting little conundrum about subquery factoring that hasn’t changed in the recent (11.2.0.3) patch for subquery factoring. [...]
Pingback by Log Buffer #259, A Carnival of the Vanities for DBAs — March 4, 2013 @ 3:37 pm UTC Mar 4,2013 |