I have two queries designed to return the same result set in the same order. In outline they look like this (look carefully at the from clauses):
select ...
from tableA t1,
tableB t2
where
t1.filter = ...
and t2.join = t1.join
and t2.filter = ...
order by ...
select ...
from tableA t1,
tableB t2,
tableA t3
where
t1.filter = ...
and t2.join = t1.join
and t2.filter = ...
and ... -- to be continued
order by ...
How did I manage to take the first query and make it more efficient by turning it from a two-table join to a three-table join ?

How many conditions are in the “to be continued” part? If there are more than 1 than I think it’s easy to make such test case. If there is just one than I currently don’t know how to do that.
Comment by Timur Akhmadeev — May 14, 2010 @ 7:40 pm UTC May 14,2010 |
Timur,
The principle of turning a two-table into a more efficient three-table is the important bit. I hadn’t applied a great deal of thought to whether I would do it with just one more predicate in the framework about – but I think I can construct an example with just one more.
Comment by Jonathan Lewis — May 14, 2010 @ 8:47 pm UTC May 14,2010 |
TableA is joined twice here – so maybe this will lead to the use of a HASH JOIN instead of any other join implementation.
Depending on the intermediate result set size this may be more efficient.
Another option would be that by joining TableA again, an index might be used which will allow to spare additional sort operations.
Any of those guesses in the right playfield?
cheers,
Jim.
Comment by Jim — May 15, 2010 @ 7:43 am UTC May 15,2010 |
I’d need something more concrete to go on, but the comment “Depending on the intermediate result set size this may be more efficient.” highlights an important principle.
Comment by Jonathan Lewis — May 15, 2010 @ 10:47 am UTC May 15,2010 |
Interesting if adding an extra ‘version’ of tablea into the FROM clause doesn’t change the result. tablea must be joined on a primary or unique key (otherwise you’d have, at least, the potential for duplicates in the result set). If tablea had a unique key on ‘colx’ then joining tablea to itself on ‘colx’ would be an implicit ‘colx is not null’ condition. If 99% of tablea entries had a null value for ‘colx’ and an index existed on that column (which it would do for a unique key), then the optimizer could use the index to single out that remaining 1%, which might provide a better optimizer path.
So I can see how an extra join may give a more efficient path. But if that was the situation then I’d expect you to have simply added a ‘colx is not null’ predicate.
Comment by Gary — May 16, 2010 @ 11:21 pm UTC May 16,2010 |
My *new improved* thought.
I think the old plan involved a full table scan of tablea hash joined to a full scan from tableb.
I think the new plan involves an index fast full scan on tablea (columns in the t1.join = t2.join) hash joined to a scan from tableb. The results from that join then use a straight ROWID join to get additional data from tablea (eg stuff required for the SELECT but not for the join/where predicates).
The index being full scan would (in this case) be a lot smaller than the table, so the hash join would be more efficient.
Not sure whether that plan could get chosen by the optimizer without the explicit join.
Comment by Gary — May 16, 2010 @ 11:45 pm UTC May 16,2010 |
Gary,
Well done – you’ve put your finger on the critical point.
I haven’t seen Oracle do it yet (but it might be possible in 11.2) but it’s perfectly reasonable to write SQL that visits a table using an index-only access path, and then comes back later by rowid to pick up the table data. I published an example of this type of approach about a year ago, but I’ll post a much simpler (and less contentious) example as soon as I have a few minutes.
Comment by Jonathan Lewis — May 17, 2010 @ 7:49 am UTC May 17,2010 |
Something like this?
select t3. ... -- what used to come from t1 now comes from t3 from tableA t1, tableB t2, tableA t3 where t1.filter = ... and t2.join = t1.join and t2.filter = ... and t3.rowid = t1.rowid order by t3. ... -- dittoOr am I late to the party?
Cheers!
Flado
Comment by Flado — May 17, 2010 @ 1:41 pm UTC May 17,2010 |
Flado,
Your additional predicate is an exact match for the one that will be appearing on the blog item that I’ll be publishing later on today.
Comment by Jonathan Lewis — May 18, 2010 @ 7:23 am UTC May 18,2010 |
Yay!
But the predicate alone doesn’t do it: one must also SELECT and ORDER BY t3 fields instead of t1 fields.
Saluti!
Flado
Comment by Flado — May 18, 2010 @ 7:38 am UTC May 18,2010 |
[...] Execution plans,Performance,Tuning — Jonathan Lewis @ 7:06 pm UTC May 18,2010 In the latest Quiz Night, I asked how you could make a query more efficient by changing a two table join into a three table [...]
Pingback by double trouble « Oracle Scratchpad — May 18, 2010 @ 7:07 pm UTC May 18,2010 |