Oracle Scratchpad

May 14, 2010

Quiz Night

Filed under: Oracle,Performance — Jonathan Lewis @ 6:55 pm BST May 14,2010

I have two queries designed to return the same result set in the same order. In outline they look like this (look carefully at the from clauses):

select  ...
from    tableA  t1,
        tableB  t2
where
        t1.filter = ...
and     t2.join = t1.join
and     t2.filter = ...
order by ...


select  ...
from    tableA  t1,
        tableB  t2,
        tableA  t3
where
        t1.filter = ...
and     t2.join = t1.join
and     t2.filter = ...
and     ...  -- to be continued
order by ...

How did I manage to take the first query and make it more efficient by turning it from a two-table join to a three-table join ?

11 Comments »

  1. How many conditions are in the “to be continued” part? If there are more than 1 than I think it’s easy to make such test case. If there is just one than I currently don’t know how to do that.

    Comment by Timur Akhmadeev — May 14, 2010 @ 7:40 pm BST May 14,2010 | Reply

    • Timur,

      The principle of turning a two-table into a more efficient three-table is the important bit. I hadn’t applied a great deal of thought to whether I would do it with just one more predicate in the framework about – but I think I can construct an example with just one more.

      Comment by Jonathan Lewis — May 14, 2010 @ 8:47 pm BST May 14,2010 | Reply

  2. TableA is joined twice here – so maybe this will lead to the use of a HASH JOIN instead of any other join implementation.

    Depending on the intermediate result set size this may be more efficient.

    Another option would be that by joining TableA again, an index might be used which will allow to spare additional sort operations.

    Any of those guesses in the right playfield?

    cheers,
    Jim.

    Comment by Jim — May 15, 2010 @ 7:43 am BST May 15,2010 | Reply

    • I’d need something more concrete to go on, but the comment “Depending on the intermediate result set size this may be more efficient.” highlights an important principle.

      Comment by Jonathan Lewis — May 15, 2010 @ 10:47 am BST May 15,2010 | Reply

  3. Interesting if adding an extra ‘version’ of tablea into the FROM clause doesn’t change the result. tablea must be joined on a primary or unique key (otherwise you’d have, at least, the potential for duplicates in the result set). If tablea had a unique key on ‘colx’ then joining tablea to itself on ‘colx’ would be an implicit ‘colx is not null’ condition. If 99% of tablea entries had a null value for ‘colx’ and an index existed on that column (which it would do for a unique key), then the optimizer could use the index to single out that remaining 1%, which might provide a better optimizer path.

    So I can see how an extra join may give a more efficient path. But if that was the situation then I’d expect you to have simply added a ‘colx is not null’ predicate.

    Comment by Gary — May 16, 2010 @ 11:21 pm BST May 16,2010 | Reply

  4. My *new improved* thought.
    I think the old plan involved a full table scan of tablea hash joined to a full scan from tableb.

    I think the new plan involves an index fast full scan on tablea (columns in the t1.join = t2.join) hash joined to a scan from tableb. The results from that join then use a straight ROWID join to get additional data from tablea (eg stuff required for the SELECT but not for the join/where predicates).

    The index being full scan would (in this case) be a lot smaller than the table, so the hash join would be more efficient.

    Not sure whether that plan could get chosen by the optimizer without the explicit join.

    Comment by Gary — May 16, 2010 @ 11:45 pm BST May 16,2010 | Reply

    • Gary,

      Well done – you’ve put your finger on the critical point.
      I haven’t seen Oracle do it yet (but it might be possible in 11.2) but it’s perfectly reasonable to write SQL that visits a table using an index-only access path, and then comes back later by rowid to pick up the table data. I published an example of this type of approach about a year ago, but I’ll post a much simpler (and less contentious) example as soon as I have a few minutes.

      Comment by Jonathan Lewis — May 17, 2010 @ 7:49 am BST May 17,2010 | Reply

  5. Something like this?

    select t3. ... -- what used to come from t1 now comes from t3
    from    tableA  t1,
            tableB  t2,
            tableA  t3
    where
            t1.filter = ...
    and     t2.join = t1.join
    and     t2.filter = ...
    and     t3.rowid = t1.rowid
    order by t3. ... -- ditto
    

    Or am I late to the party?

    Cheers!
    Flado

    Comment by Flado — May 17, 2010 @ 1:41 pm BST May 17,2010 | Reply

    • Flado,

      Your additional predicate is an exact match for the one that will be appearing on the blog item that I’ll be publishing later on today.

      Comment by Jonathan Lewis — May 18, 2010 @ 7:23 am BST May 18,2010 | Reply

      • Yay!

        But the predicate alone doesn’t do it: one must also SELECT and ORDER BY t3 fields instead of t1 fields.

        Saluti!
        Flado

        Comment by Flado — May 18, 2010 @ 7:38 am BST May 18,2010 | Reply

  6. [...] Execution plans,Performance,Tuning — Jonathan Lewis @ 7:06 pm UTC May 18,2010 In the latest Quiz Night, I asked how you could make a query more efficient by changing a two table join into a three table [...]

    Pingback by double trouble « Oracle Scratchpad — May 18, 2010 @ 7:07 pm BST May 18,2010 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,877 other followers