Update Oct 2021 – to test on 21.3.0.0 and to modify the model to avoid the need for hints.
Imagine I have a simple SQL statement with a “where” clause that looks like this:
t2.id1(+) = t1.id1 and t2.id2(+) = t1.id2
Would you expect it to run several times faster (25 minutes instead of a few hours) when the only change you made was to swap the order of the join predicates to:
t2.id2(+) = t1.id2 and t2.id1(+) = t1.id1
You may recall that a couple of years ago I wrote about some bugs in the optimizer and pointed you to a blog article by Alberto Dell’Era that demonstrated an anomaly in cardinality calculations that made this type of thing possible. But here’s an example that has nothing to do with cardinality errors. We start with a suitable dataset – initially running on 11.1.0.6 (though the anomaly is still present in 19c):
rem
rem     Script:         merge_order_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2010
rem
rem     Last tested
rem             19.3.0.0        -- still sub-optimal
rem

create table t1
as
with generator as (
        select rownum id
        from dual
        connect by rownum <= 1e4 -- > comment to avoid wordpress format issue
)
select
        trunc(dbms_random.value(1,1000))        id1,
        trunc(dbms_random.value(1,1000))        id2,
        lpad(rownum,10,'0')                     small_vc,
        rpad('x',1000)                          padding
from
        generator       v1
;

create table t2
as
with generator as (
        select --+ materialize
                rownum id
        from dual
        connect by rownum <= 7 -- > comment to avoid wordpress format issue
)
select
        t1.id1,
        t1.id2,
        v1.id,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',70)            padding
from
        t1              t1,
        generator       v1
;

create index t1_i1 on t1(id1, id2);
create index t2_i1 on t2(id1, id2);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/
This data set models a problem – stripped to the bare essentials – that I came across at a client site some time ago. We have a “parent/child” relationship between the tables (although I haven’t declared referential integrity in this example) with roughly seven child rows per parent row. The parent rows are quite long, the child rows are quite short. Some parents may not have children (although in this data set they all do).
We now run a report that generates data for a number-crunching tool that extracts all the data from the tables, using an outer join so that parent rows don’t get lost. For various reasons the tool wanted the data sorted in a certain order so there’s also an order by clause in the query. I’m going to show you a model of the original query – first unhinted, then hinted to use a merge join:
prompt  ===========================================================================
prompt  Join predicate columns in opposite order to order by columns - default path
prompt  ===========================================================================

select
        t1.padding,
        t2.padding
from
        t1, t2
where
        t2.id1(+) = t1.id1
and     t2.id2(+) = t1.id2
order by
        t1.id2, t1.id1
;

---------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      | 70373 |    73M|       | 15064   (1)| 00:00:01 |
|   1 |  SORT ORDER BY         |      | 70373 |    73M|    78M| 15064   (1)| 00:00:01 |
|*  2 |   HASH JOIN RIGHT OUTER|      | 70373 |    73M|  6224K|   726   (4)| 00:00:01 |
|   3 |    TABLE ACCESS FULL   | T2   | 70000 |  5400K|       |   133   (7)| 00:00:01 |
|   4 |    TABLE ACCESS FULL   | T1   | 10000 |  9853K|       |   189   (3)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."ID1"(+)="T1"."ID1" AND "T2"."ID2"(+)="T1"."ID2")

prompt  =====================================================================
prompt  Join predicate columns in opposite order to order by columns - hinted
prompt  =====================================================================

select
        /*+ use_merge(t2) */
        t1.padding,
        t2.padding
from
        t1, t2
where
        t2.id1(+) = t1.id1
and     t2.id2(+) = t1.id2
order by
        t1.id2, t1.id1
;

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      | 70373 |    73M|       | 17721   (2)| 00:00:01 |
|   1 |  SORT ORDER BY       |      | 70373 |    73M|    78M| 17721   (2)| 00:00:01 |
|   2 |   MERGE JOIN OUTER   |      | 70373 |    73M|       |  3383   (2)| 00:00:01 |
|   3 |    SORT JOIN         |      | 10000 |  9853K|    19M|  2082   (2)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T1   | 10000 |  9853K|       |   189   (3)| 00:00:01 |
|*  5 |    SORT JOIN         |      | 70000 |  5400K|    12M|  1301   (3)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| T2   | 70000 |  5400K|       |   133   (7)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T2"."ID1"(+)="T1"."ID1" AND "T2"."ID2"(+)="T1"."ID2")
       filter("T2"."ID2"(+)="T1"."ID2" AND "T2"."ID1"(+)="T1"."ID1")
The hash join has a much lower cost than the merge join so it’s not surprising that the optimizer used it, but there’s something a little odd about the plan the optimizer has created for the merge join. Although our where clause references the join columns in the order id1, id2 our order by clause is on id2, id1 and the optimizer hasn’t taken advantage of the fact that it could do the sort join at operations 3 and 5 in the order (id2, id1) and avoid the final sort order by at operation 1.
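To see why the final sort ought to be avoidable, here's a minimal Python sketch of a sort-merge outer join (this is an illustrative toy, not Oracle's implementation): if both inputs are sorted on the order by columns (id2, id1), the join output comes out already in that order, so no separate "sort order by" step is needed.

```python
import random

def sort_merge_outer_join(parents, children, key):
    """Left outer sort-merge join: sort both inputs on key, then merge."""
    parents = sorted(parents, key=key)      # the "SORT JOIN" on t1
    children = sorted(children, key=key)    # the "SORT JOIN" on t2
    out, j = [], 0
    for p in parents:
        # skip child keys that sort strictly before this parent's key
        while j < len(children) and key(children[j]) < key(p):
            j += 1
        k, matched = j, False
        while k < len(children) and key(children[k]) == key(p):
            out.append((p, children[k]))
            matched = True
            k += 1
        if not matched:
            out.append((p, None))           # outer join: keep the parent row
    return out

random.seed(42)
parents = [(random.randrange(1, 30), random.randrange(1, 30)) for _ in range(100)]
children = [p for p in parents for _ in range(7)]   # ~7 child rows per parent

# Sorting on (id2, id1) - the ORDER BY columns - leaves the output ordered,
# so a final sort over the full (wide) result set would be redundant.
rows = sort_merge_outer_join(parents, children, key=lambda r: (r[1], r[0]))
out_keys = [(p[1], p[0]) for p, c in rows]
assert out_keys == sorted(out_keys)
```

The point of the sketch: the sort key used for the two SORT JOIN operations is a free choice for an equality join, so sorting on (id2, id1) instead of (id1, id2) costs nothing extra and makes operation 1 unnecessary.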
So let’s rewrite the query to make the predicate ordering in the where clause match the ordering of the columns in the order by clause and see what happens to the plan:
select
        t1.padding,
        t2.padding
from
        t1, t2
where
        t2.id2(+) = t1.id2
and     t2.id1(+) = t1.id1
order by
        t1.id2, t1.id1
;

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 70373 |    73M|       |  3383   (2)| 00:00:01 |
|   1 |  MERGE JOIN OUTER   |      | 70373 |    73M|       |  3383   (2)| 00:00:01 |
|   2 |   SORT JOIN         |      | 10000 |  9853K|    19M|  2082   (2)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   | 10000 |  9853K|       |   189   (3)| 00:00:01 |
|*  4 |   SORT JOIN         |      | 70000 |  5400K|    12M|  1301   (3)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| T2   | 70000 |  5400K|       |   133   (7)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."ID2"(+)="T1"."ID2" AND "T2"."ID1"(+)="T1"."ID1")
       filter("T2"."ID1"(+)="T1"."ID1" AND "T2"."ID2"(+)="T1"."ID2")
The plan no longer has a final sort order by operation – and the cost of the plan is much lower as a consequence. You’ll also notice that the Predicate Information section (always check the Predicate Information section) is a little different – the order of predicate evaluation has been reversed.
The client’s tables were much bigger and, when we checked the plans and costs for the same three variations of the code/hinting, the three plan costs were:
Hash join with sort:        5,455K
Merge join "badly ordered": 6,208K
Merge join "well ordered":  1,338K
This was a case where the cost was a fairly realistic indication of run time, and avoiding an operation that would have sorted roughly 35GB of generated data was a very good move.
tl;dr
With multi-column joins the order of predicates in the where clause can, contrary to all expectations, make a big difference to the way Oracle calculates the cost of a merge join, with the result that it may introduce a redundant sort operation into the merge join plan, or pick a hash join when a merge join would be far more efficient.
Update March 2018
The anomaly is still present in 12.1.0.2, 12.2.0.1 and 18.3.0.0
Update May 2019
A quick re-run of the test shows that the anomaly is still present in 19.2
Update Oct 2021
After testing 21.3.0.0 with the original test code (where the costs had changed, but not enough to result in a change of execution plan) I decided to tweak the model to see if I could get a change of plan without having to hint the code. This turned out to be very simple – all I had to do was take the original code and create indexes on the (id1, id2) combination so that Oracle would use the resulting pseudo-“column group” stats in its calculations of join cardinality.
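To give a feel for why the column-group effect matters, here's a back-of-envelope sketch of the join cardinality arithmetic using the textbook selectivity formula (1/max of the two NDVs per equality predicate). The numbers are taken from this model (10,000 parent rows, 70,000 child rows, 999 distinct values per column); the distinct-combination count of 9,950 for (id1, id2) in t1 is my assumption, not a figure traced from a 10053 trace file.

```python
# Join cardinality under two statistical models (illustrative arithmetic only).
t1_rows, t2_rows = 10_000, 70_000
ndv_id1 = ndv_id2 = 999   # each column generated as trunc(dbms_random.value(1,1000))
cg_ndv = 9_950            # assumed distinct (id1, id2) combinations in t1

# Independent-column model: the selectivities of the two join predicates multiply,
# which wildly underestimates the join when the columns are correlated.
indep_card = t1_rows * t2_rows / (ndv_id1 * ndv_id2)

# Column-group model: a single selectivity for the (id1, id2) pair.
cg_card = t1_rows * t2_rows / cg_ndv

print(round(indep_card), round(cg_card))   # roughly 701 vs 70,352
```

The column-group figure lands close to the 70,373 rows estimated in the plans above (and the true join size of 70,000), while treating the columns as independent would be out by a factor of about 100 – which is why making the optimizer pick up column-group statistics from the (id1, id2) indexes changes its arithmetic so much.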
I don’t have an 11g database at hand, but the behaviour reproduces on 10.2.0.3 as well. And it doesn’t need to be an outer join – an inner join behaves just the same.
Comment by Flado — December 16, 2010 @ 8:19 am GMT Dec 16,2010 |
Flado,
Thanks for that; and thanks for demonstrating one of the benefits of test cases – you can reproduce them with variations and different versions of Oracle to test their boundaries.
Anyone for 11.2 ?
Comment by Jonathan Lewis — December 16, 2010 @ 7:05 pm GMT Dec 16,2010 |
Reproduces on 11.2.0.2.
Comment by Vyacheslav Rasskazov — December 16, 2010 @ 11:14 pm GMT Dec 16,2010 |
Vyacheslav,
Thanks for running the test and passing on the result.
Comment by Jonathan Lewis — December 17, 2010 @ 8:52 am GMT Dec 17,2010
Jonathan,
Some cases where I have seen it make a difference are joins involving composite partitioned tables.
http://srivenukadiyala.wordpress.com/2010/08/09/plan_hash_value-limitations-pq_distribute-skew-in-data-distribution-across-parallel-slaves/
regards
srivenu
Comment by ksrivenu — December 20, 2010 @ 4:53 pm GMT Dec 20,2010 |
Srivenu,
Interesting, and worrying, case – thanks for posting it.
I’ll have to take a closer look at the implications. Your example is about composite partitioned tables and parallel execution, but I wonder if it would generalise to more cases of multi-column hash joins.
Comment by Jonathan Lewis — December 21, 2010 @ 8:09 am GMT Dec 21,2010 |
Reproduces in 10.2.0.5.
Comment by Pavel Ermakov — December 28, 2010 @ 10:12 am GMT Dec 28,2010 |
Pavel,
Thanks for checking.
Comment by Jonathan Lewis — December 28, 2010 @ 3:22 pm GMT Dec 28,2010 |
[…] Lewis was surprised by a join. If he was surprised, I’m sure everyone else will be too. Did you know that the order of join […]
Pingback by Log Buffer #210, A Carnival of the Vanities for DBAs — February 15, 2013 @ 1:20 pm GMT Feb 15,2013 |
Your passion for optimizer behaviour is greatly appreciated.
You put together very simple and understandable examples.
Thank you Sir
Comment by Amit — May 24, 2017 @ 5:22 am BST May 24,2017 |
[…] Source: Join Surprise […]
Pingback by Join Surprise – Steve Bamber — July 13, 2018 @ 10:41 pm BST Jul 13,2018 |