Here’s an interesting little conundrum about subquery factoring that hasn’t changed in the recent (11.2.0.3) patch for subquery factoring. It came to me from Jared Still (a fellow member of Oak Table Network) shortly after I’d made some comments about the patch. It’s an example based on the scott/tiger schema – which I’ve extracted from the script $ORACLE_HOME/rdbms/admin/utlsampl.sql (though the relevant scripts may be demobld.sql or scott.sql, depending on version).
As usual I’ve used an 8KB block size, LMT with uniform 1MB extents, and no ASSM to hold the data. I won’t reproduce the code to generate the schema, just the versions of the query with, then without, subquery factoring:
rem rem Script: with_subq_anomaly.sql rem Author: Jonathan Lewis rem Dated: Mar 2012 rem prompt ======================== prompt Subquery written in line prompt ======================== select deptno, count(*) from ( select /*+ gather_plan_statistics */ deptno from emp order by deptno ) e group by deptno ; prompt ============================ prompt Effect of subquery factoring prompt ============================ with e as ( select /*+ gather_plan_statistics */ deptno from emp order by deptno ) select deptno, count(*) from e group by deptno ;
I have to say that I’m not entirely sure what Oracle should do with these queries – but I do think it should end up with the same result in both cases, but here are the two sets of results and their execution plans (in the same order that I listed the queries above):
======================== Subquery written in line ======================== DEPTNO COUNT(*) ---------- ---------- 30 6 20 5 10 3 3 rows selected. PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID 17m8h7wgmxvr0, child number 0 ------------------------------------- select deptno, count(*) from ( select /*+ gather_plan_statistics */ deptno from emp order by deptno ) e group by deptno Plan hash value: 4067220884 ----------------------------------- | Id | Operation | Name | ----------------------------------- | 0 | SELECT STATEMENT | | | 1 | HASH GROUP BY | | | 2 | TABLE ACCESS FULL| EMP | ----------------------------------- ============================ Effect of subquery factoring ============================ DEPTNO COUNT(*) ---------- ---------- 10 3 20 5 30 6 3 rows selected. PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID fk4xrmz1wpcb9, child number 0 ------------------------------------- with e as ( select /*+ gather_plan_statistics */ deptno from emp order by deptno ) select deptno, count(*) from e group by deptno Plan hash value: 1536940522 ------------------------------------- | Id | Operation | Name | ------------------------------------- | 0 | SELECT STATEMENT | | | 1 | SORT GROUP BY NOSORT| | | 2 | VIEW | | | 3 | SORT ORDER BY | | | 4 | TABLE ACCESS FULL| EMP | -------------------------------------
As you can see, with the inline view Oracle has managed to eliminate the order by clause; but when we use subquery factoring Oracle inlines the subquery as a non-mergeable view and preserves the order by, then recognises that it can achieve the aggregation through a “sort group by” on data that doesn’t need any more sorting.
Personally I think that the original authors of this SQL have got lucky – so far. Neither version of the statement, as written, can guarantee any order to the data because the final operator is not an order by. It is simply a limitation in the current optimizer code that the order by clause hasn’t been eliminated from the factored subquery. At the moment, then, this limitation means that this third party application is getting the expected results by accident – and the next upgrade might make the accident stop happening.
There is a fairly simple workaround to this type of problem – if you write the query with the inline view but add the hint /*+ no_eliminate_oby */ to the inline view, you can capture the baseline for the view and attach it to the version of the query with the inline view but without the hint. (See Fake Baselines for one possible method of doing this capture.)
Update 7th Aug 2013
This difference between the inline approach and the subquery factoring that chooses to move the subquery inline is still present in 12c (12.1.0.2). This suggests that you may find other cases in 12c where rewriting a query with factored subqueries to reduce the apparent complexity may still produce a change of execution plan.
Update June 2020
I must have re-run the test a few years ago with 12.2.0.1 as the script header currently includes the comment: “fixed in 12.2, both queries report hash group by“. (In this context “fixed” means “identical mechanism”, but anyone previously seeing an ordered result set will now be getting an unordered result and call this broken.)
what is the advantage of putting the order by in the “inner” query?
i would always have written this query as
Comment by David Pyke Le Brun — February 14, 2012 @ 10:14 pm GMT Feb 14,2012 |
David,
Comment by Jonathan Lewis — February 15, 2012 @ 12:19 am GMT Feb 15,2012 |
What i meant was, without the order by on the outside of the query, I would never expect to have deterministic row ordering.
in any case… Is the same true if you use a full view with an order by clause.. ?
ie
Comment by David Pyke Le Brun — February 15, 2012 @ 2:07 am GMT Feb 15,2012 |
David,
You’re absolutely right. The application code is wrong if it’s expecting an ordered result without supplying an “order by” clause – and that’s true under any circumstances. It just happens that sometimes we put in an “order by” clause and the execution plan doesn’t end with a “sort order by” operation because the optimizer has determined that previous operations will guarantee that the data will be appearing in the right order as a guaranteed side effect of the earlier operatins.
The only point I was making with the note was that there are still cases even with the latest patching where the code with the inline view doesn’t produce the same execution plan as the code with a CTE that has been inlined.
Comment by Jonathan Lewis — February 15, 2012 @ 2:24 am GMT Feb 15,2012
I think the “advantage” is demonstrating select deptno from emp order by deptno has different results in different queries.
Comment by Barfo Rama — February 14, 2012 @ 11:12 pm GMT Feb 14,2012 |
The first query behaves just like the second one if you do:
alter table emp parallel 4;
Comment by A reader — February 15, 2012 @ 2:16 pm GMT Feb 15,2012 |
A reader,
Not on a couple of versions I tested. I still see one query doing the sort order by (although parallel) and the other not, viz:
Comment by Jonathan Lewis — March 12, 2012 @ 1:21 pm GMT Mar 12,2012 |
[…] Jonathan Lewis shares an interesting little conundrum about subquery factoring that hasn’t changed in the recent (11.2.0.3) patch for subquery factoring. […]
Pingback by Log Buffer #259, A Carnival of the Vanities for DBAs — March 4, 2013 @ 3:37 pm GMT Mar 4,2013 |
[…] Order by elimination disappeared – still broken […]
Pingback by 12c subquery factoring | Oracle Scratchpad — August 7, 2013 @ 6:15 pm BST Aug 7,2013 |
[…] Another plan inconsistency from subquery factoring – addressed in 12.2 […]
Pingback by CTE Catalogue | Oracle Scratchpad — June 10, 2020 @ 6:46 pm BST Jun 10,2020 |