Oracle Scratchpad

February 11, 2010

Ignoring Hints – 2

Filed under: CBO,Execution plans,Hints,Ignoring Hints — Jonathan Lewis @ 7:28 pm UTC Feb 11,2010

Here’s a little puzzle that someone sent to me a couple of days ago – it’s a case where the optimizer seems to be ignoring a hint.

create table t1
as
select
	rownum id,
	object_name name
from
	all_objects
where
	rownum <= 1000
;

create table t2
as
select
	mod(rownum,20)+1 id
from
	all_objects
where
	rownum <= 1000
;

alter table t1 add constraint t1_pk primary key(id);
create index ind_t2 on t2(id);

-- gather stats at this point

set autotrace traceonly explain

select
	/*+ ordered use_hash(t2) */ *
from
	t1, t2
where
	t1.id = t2.id
and	t1.id = 1
;

select
	/*+ ordered use_hash(t2) */ *
from
	t1, t2
where
	t1.id = t2.id
and	t2.id = 1
;

set autotrace off

Notice that the only difference between these two queries is the filter predicate: t1.id = 1 in the first statement and t2.id = 1 in the second. Given the fact that the join predicate is then: t1.id = t2.id, these two queries are logically identical. So why does one of them “ignore” the use_hash() hint and do a nested loop ? Remember: the optimizer does not ignore hints (with the usual caveats about legality, spelling, syntax and bugs).

Here are the two execution plans:

------------------------------------------------+-----------------------------------+
| Id  | Operation                     | Name    | Rows  | Bytes | Cost  | Time      |
------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT              |         |       |       |     3 |           |
| 1   |  NESTED LOOPS                 |         |    50 |  1050 |     3 |  00:00:01 |
| 2   |   TABLE ACCESS BY INDEX ROWID | T1      |     1 |    18 |     2 |  00:00:01 |
| 3   |    INDEX UNIQUE SCAN          | T1_PK   |     1 |       |     1 |  00:00:01 |
| 4   |   INDEX RANGE SCAN            | IND_T2  |    50 |   150 |     1 |  00:00:01 |
------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
3 - access("T1"."ID"=1)
4 - access("T2"."ID"=1)

------------------------------------------------+-----------------------------------+
| Id  | Operation                     | Name    | Rows  | Bytes | Cost  | Time      |
------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT              |         |       |       |     4 |           |
| 1   |  HASH JOIN                    |         |    50 |  1050 |     4 |  00:00:01 |
| 2   |   TABLE ACCESS BY INDEX ROWID | T1      |     1 |    18 |     2 |  00:00:01 |
| 3   |    INDEX UNIQUE SCAN          | T1_PK   |     1 |       |     1 |  00:00:01 |
| 4   |   INDEX RANGE SCAN            | IND_T2  |    50 |   150 |     1 |  00:00:01 |
------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
1 - access("T1"."ID"="T2"."ID")
3 - access("T1"."ID"=1)
4 - access("T2"."ID"=1)

Have you spotted the critical clue ? It’s in the predicate sections.

Notice how the first plan has two predicates and the second plan has three – and the “missing” predicate is the join predicate. You’re looking at an example of transitive closure:  if t1.id1 = 1 and t2.id1 = t1.id1 then t2.id1 = 1 So Oracle has taken the predicate from one table and copied to the other table – but in just one of the cases it has also eliminated the join. Because there is no longer a join predicate it’s not possible for the optimizer to do a hash join (in earlier versions of Oracle you might have seen a cartesian merge join at this point) so the use_hash() hint has to be ignored – it’s out of context.

The question, though, is why are the two cases not symmetrical – the queries are logically equivalent (and to the human eye very obviously so) so they should have been transformed in the same way. My best guess is that the optimzer checks the significance of the existing predicate as it generates the  new predicate – and if it sees that it’s guaranteed to supply a single row it discards the join condition. But when it’s working the other way round (i.e. starting with the non-unique predicate on t2) it doesn’t check the significance of the predicates that it has generated.

It’s an odd little detail, but I can’t think of  a case where it will have any noticeable impact on performance – though someone may prove me wrong on that.

Footnote:

I’ve been reminded by a couple of the comments that I should have mentioned that I ran this example on 10.2.0.3. Others have now run it on other versions of Oracle. If you go back to 9i you’ll see that the join predicate disappears in both cases and 9i will do a cartesian merge join).

[Further reading on "ignoring hints"]

13 Comments »

  1. Oracle version 10?

    Another point for having the right version of the explain-plan-table.

    Comment by lascoltodelvenerdi — February 12, 2010 @ 10:11 am UTC Feb 12,2010 | Reply

  2. I tested this against 10.2.0.4 and found working.

    Comment by Asif Momen — February 12, 2010 @ 11:15 am UTC Feb 12,2010 | Reply

  3. Hi Nice findings,

    Why can’t you ask ASKTom regarding this, as he will be able to answer this. This is an interesting find as it can change the execution plan in some kind of cases where we chose which is the first filtering condition

    Thanks,
    Sarath

    Comment by Sarath — February 12, 2010 @ 1:31 pm UTC Feb 12,2010 | Reply

  4. I agree with your results for Oracle Versions 10.2.0.3 and 11.1.0.6.

    Comment by Tony Sleight — February 12, 2010 @ 1:46 pm UTC Feb 12,2010 | Reply

  5. I ran the test on Oracle Database 11.2.0.1 and received the same results as those posted above. I also generated a 10053 trace in the script, and the resulting trace file possibly shows what is happening.

    For the first query:

    ...
    **************************
    Predicate Move-Around (PM)
    **************************
    PM:     PM bypassed: Outer query contains no views.
    PM:     PM bypassed: Outer query contains no views.
    query block SEL$1 (#0) unchanged
    FPD: Considering simple filter push in query block SEL$1 (#0)
    "T1"."ID"="T2"."ID" AND "T1"."ID"=1
    try to generate transitive predicate from check constraints for query block SEL$1 (#0)
    finally: "T2"."ID"=1 AND "T1"."ID"=1
    ...
    

    For the second query:

    ...
    **************************
    Predicate Move-Around (PM)
    **************************
    PM:     PM bypassed: Outer query contains no views.
    PM:     PM bypassed: Outer query contains no views.
    query block SEL$1 (#0) unchanged
    FPD: Considering simple filter push in query block SEL$1 (#0)
    "T1"."ID"="T2"."ID" AND "T2"."ID"=1
    try to generate transitive predicate from check constraints for query block SEL$1 (#0)
    finally: "T1"."ID"="T2"."ID" AND "T2"."ID"=1 AND "T1"."ID"=1
     
    FPD:   transitive predicates are generated in query block SEL$1 (#0)
    "T1"."ID"="T2"."ID" AND "T2"."ID"=1 AND "T1"."ID"=1
    ...
    

    From the above, I would suggest that the optimizer checks for the presence of a unique key constraint on the predicates already in the WHERE clause to see if the join may be removed before the optimizer generates the “T2″.”ID”=1 predicate, although more testing might be needed.

    What is possibly interesting is that in both cases, the 10053 trace shows that both hints are actually used.

    For the first query:

    Dumping Hints
    =============
      atom_hint=(@=0x2b32e92e60f8 err=0 resol=1 used=1 token=807 org=1 lvl=2 txt=ORDERED )
      atom_hint=(@=0x2b32e92e5dc8 err=0 resol=1 used=1 token=922 org=1 lvl=3 txt=USE_HASH ("T2") )
    ====================== END SQL Statement Dump ======================
    

    For the second query:

    Dumping Hints
    =============
      atom_hint=(@=0x2b32e92e60f8 err=0 resol=1 used=1 token=807 org=1 lvl=2 txt=ORDERED )
      atom_hint=(@=0x2b32e92e5dc8 err=0 resol=1 used=1 token=922 org=1 lvl=3 txt=USE_HASH ("T2") )
    ====================== END SQL Statement Dump ======================
    

    Comment by Charles Hooper — February 12, 2010 @ 11:25 pm UTC Feb 12,2010 | Reply

  6. Interesting test case, incidentally I’ve done some research on the subject recently too.

    This transitivity knowledge “retention” is controlled by _optimizer_transitivity_retain parameter, which changed to true in 10.2.0.1.

    Even in past Oracle versions, when you see MERGE JOIN CARTESIAN, the join is really just a NESTED LOOP join as cartesiam merge join is internally implemented as nested loop join without filtering.

    I guess the human-readable execution plan text generation functions fail to recognize it sometimes…

    Comment by Tanel Poder — February 13, 2010 @ 10:07 am UTC Feb 13,2010 | Reply

    • Tanel,

      “the join is really just a NESTED LOOP join as cartesiam merge join is internally implemented as nested loop join without filtering”

      I’ve got a draft note pointing out that all three join mechanisms are basically nested loop joins – except the merge and hash joins have different strategies for moving data from the buffer cache to local memory (or temp) before starting the loop.

      Even so, there is an important difference between the (Cartesian) Merge and the Nested Loop – the second data set is always copied into a local buffer by the “buffer sort” operation before the “nested loop” bit begins.

      It seems that Oracle from 10g2 recognises that this copy is a waste of resources when there is only one row in the first data set – and so switches to a “non-copied” nested loop.

      There’s a link to a couple of articles on the buffer sort in the article I linked to above about Cartesian Merge joins.

      Comment by Jonathan Lewis — February 14, 2010 @ 6:28 pm UTC Feb 14,2010 | Reply

      • Cool!

        Where I was getting at is that the same rowsource function (qerjotFetch) is used for both driving a nested loop join and a cartesian merge join, while hash join has its of set of functions (starting with qerhj). Of course what happens in the child rowsources under the join branch in the exec plan tree also makes a difference (like the buffering you mentioned etc).

        Comment by Tanel Poder — February 15, 2010 @ 8:03 am UTC Feb 15,2010 | Reply

  7. Thank you, Jonathan & Tanel, both. You have, in 7 sentences, crushed my Oracle knowledge ego like an 18-wheel, fully-laden, interstate juggernaut driving over a chicken egg, with horn blaring and CB radio gobbledygook carried in your “We Brake for Nobody” wake.

    Comment by Nigel — February 15, 2010 @ 2:13 pm UTC Feb 15,2010 | Reply

  8. [...] Jonathan Lewis-Ignoring Hints-2 [...]

    Pingback by Blogroll Report 05/02/2009 – 12/02/2010 « Coskan’s Approach to Oracle — March 10, 2010 @ 2:28 am UTC Mar 10,2010 | Reply

  9. [...] a 10053 trace might help solve part of the mystery.  An interesting comment by Tanel Poder in this blog article suggests that a MERGE JOIN CARTESIAN is similar to a NESTED LOOP operation, just without filtering, [...]

    Pingback by Improving Performance by Using a Cartesian Join 2 « Charles Hooper's Oracle Notes — May 18, 2010 @ 12:11 pm UTC May 18,2010 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 1,394 other followers