In the latest Quiz Night, I asked how you could make a query more efficient by changing a two table join into a three table join – with the clue that my third table was a repeat of the first table. Gary Myers, in comment 4, provided the type of answer I was looking for. Sometimes it is more efficient to get a small amount of data from a table on a first pass then go back and get the rest of the data on a second pass – especially if the first pass is an ‘index only’ operation.
I’ve created a little demonstration that gives you some idea of the approach:
rem rem Script: double_join_optimisation.sql rem Author: Jonathan Lewis rem Dated: May 2010 rem create table t1 as with generator as ( select --+ materialize rownum id from dual connect by rownum <= 10000 --> comment to avoid wordpress format issue ) select rownum id, mod(rownum,100) mod1, trunc(dbms_random.value(0,10000)) random1, lpad(rownum,10,'0') small_vc, rpad('x',60) padding from generator v1, generator v2 where rownum <= 100000 --> comment to avoid wordpress format issue ; create table t2 as with generator as ( select --+ materialize rownum id from dual connect by rownum <= 10000 --> comment to avoid wordpress format issue ) select rownum id, mod(rownum,100) mod2, trunc(dbms_random.value(0,10000)) random2, lpad(rownum,10,'0') small_vc, rpad('x',60) padding from generator v1, generator v2 where rownum <= 100000 --> comment to avoid wordpress format issue ; create index t1_i1 on t1(mod1, random1); create index t2_i1 on t2(mod2, random2);
This creates two tables of 100,000 (fairly short) rows. Note the mod columns which return 1,000 rows per value, and the random columns which return approximately 10 rows per value. When I give Oracle the following query, it overestimates the final result set and chooses what I know to be a relatively resource-intensive execution plan:
select t1.padding, t2.padding from t1, t2 where t1.mod1 = 50 and t2.random2 = t1.random1 and t2.mod2 = 50 ; ----------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ----------------------------------------------------------- | 0 | SELECT STATEMENT | | 1045 | 138K| 388 | |* 1 | HASH JOIN | | 1045 | 138K| 388 | |* 2 | TABLE ACCESS FULL| T1 | 1000 | 68000 | 193 | |* 3 | TABLE ACCESS FULL| T2 | 1000 | 68000 | 193 | ----------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T2"."RANDOM2"="T1"."RANDOM1") 2 - filter("T1"."MOD1"=50) 3 - filter("T2"."MOD2"=50)
This plan is dictated largely by the fact that I have to collect quite a lot of data from both tables then eliminate a large fraction of the data I have collected. This pattern is the driver for what I am about to do: I know that I want a small volume of data eventually but if I have to go to the table at every step of the plan then I will have to do a lot of redundant work and carry a lot of redundant data at some point. Remember – it’s often the case that “visiting the table” is the expensive part of any query.
select /*+ leading(t1 t2 t3) use_nl(t3) rowid(t3) */ t3.padding, t2.padding from t1, t2, t1 t3 where t1.mod1 = 50 and t2.random2 = t1.random1 and t2.mod2 = 50 and t3.rowid = t1.rowid ; --------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1045 | 163K| 1244 | | 1 | NESTED LOOPS | | 1045 | 163K| 1244 | |* 2 | HASH JOIN | | 1045 | 90915 | 199 | |* 3 | INDEX RANGE SCAN | T1_I1 | 1000 | 19000 | 4 | |* 4 | TABLE ACCESS FULL | T2 | 1000 | 68000 | 193 | | 5 | TABLE ACCESS BY USER ROWID| T1 | 1 | 73 | 1 | --------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$1 3 - SEL$1 / T1@SEL$1 4 - SEL$1 / T2@SEL$1 5 - SEL$1 / T3@SEL$1 Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("T2"."RANDOM2"="T1"."RANDOM1") 3 - access("T1"."MOD1"=50) 4 - filter("T2"."MOD2"=50)
I don’t think the optimizer can generate a plan like this at present – but I may be wrong. I’ve reduced my workload by taking advantage of an existing index on table t1 to do a range scan that picks up only the columns that I need to join to t2. In this case the t2 access path was still a full tablescan – but even so I have reduced the workload against t1, and by the time I revisit it by rowid I will only be visiting the (relatively) small number of rows I really need.
(Left as an exercise to the reader: I could have written the query as a four part join, visiting both table segments by rowid for just those rows that I really needed; have a go, and check that you’ve got it right. Don’t forget that any references to the “non-index” columns that appear in the query have to be changed to reference the second occurrence of the table – note how I’ve changed t1.padding in my original query to t3.padding in the rewrite.)
Footnote:
If you think this type of path is a little odd take a look at the typical stucture of a nested loop join that appears under “nlj_batching” in 11g (this isnt the same t1 and t2 as above, by the way):
select /*+ ordered use_nl(t1) index(t1(n1)) */ t2.n1, t1.n2 from t2,t1 where t2.n2 = 45 and t1.n1 = t2.n1; ------------------------------------------------------ | Id | Operation | Name | Rows | ------------------------------------------------------ | 0 | SELECT STATEMENT | | 225 | | 1 | NESTED LOOPS | | | | 2 | NESTED LOOPS | | 225 | |* 3 | TABLE ACCESS FULL | T2 | 15 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 5 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | ------------------------------------------------------
Notice how Oracle can present a single join as two nested loops – one into the index and a second into the table. This is why I think there may be options within the optimizer to do my little trick automatically – if not now, then soon.
Update June 2012
I’ve just had an exchange of email with an Oak Table member who has pointed me to US patent 8103658 (dated November 2009) – which looks like a remarkably good description of this technique. So maybe the method will become an automatic option for the optimizer some time in the next couple of years.
Jonathan,
thank you for this interesting note.
a. question
how do you define “more efficient” ?
b. note
I tried to reproduce your results on
11.2.0.1.0 – 64bit
Red Hat Enterprise Linux AS release 5.3
(system statistics on)
and I get
Comment by Sokrates — May 18, 2010 @ 7:33 pm BST May 18,2010 |
Sokrates,
(a): “more efficient” – generally I would interpret this as ‘gets the same result by doing less work’
(b): interesting – I wonder how clever 10g might get with the same type of strategy. It looks like 11.2 has manage to invoke “Join elimination”. I don’t have a copy of 11.2 at hand currently, but I’d check the 10053 trace for references to join elimination. If this is the case then you might want to check the effect of adding the NO_ELIMINATE_JOIN to the query (you may have to add a qb_name hint to the main query block (or make explicit reference to sel$1) so that you have a named query block in the no_eliminate_join hint.
Comment by Jonathan Lewis — May 18, 2010 @ 11:34 pm BST May 18,2010 |
Bingo !
10053 trace shows:
JE: Considering Join Elimination on query block SEL$1 (#0)
*************************
Join Elimination (JE)
*************************
SQL:******* UNPARSED QUERY IS *******
SELECT /*+ LEADING ("T1" "T2" "T3") ROWID ("T3") USE_NL ("T3") */ "T3"."PADDING" "PADDING","T2"."PADDING" "PADDING" FROM "SOKRATES"."T1" "T1","SOKRATES"."T2" "T2","SOKRATES"."T1" "T3" WHERE "T1"."MOD1"=50 AND "T2"."RANDOM2"="T1"."RANDOM1" AND "T2"."MOD2"=50 AND "T3".ROWID="T1".ROWID
JE: cfro: T1 objn:673336 col#:1001 dfro:T1 dcol#:1001
JE: cfro: T1 objn:673336 col#:1001 dfro:T1 dcol#:1001
Query block (0x2b6a8b3c6638) before join elimination:
SQL:******* UNPARSED QUERY IS *******
SELECT /*+ LEADING ("T1" "T2" "T3") ROWID ("T3") USE_NL ("T3") */ "T3"."PADDING" "PADDING","T2"."PADDING" "PADDING" FROM "SOKRATES"."T1" "T1","SOKRATES"."T2" "T2","SOKRATES"."T1" "T3" WHERE "T1"."MOD1"=50 AND "T2"."RANDOM2"="T1"."RANDOM1" AND "T2"."MOD2"=50 AND "T1".ROWID="T3".ROWID
JE: eliminate table: T1 (T3)
JE: Replaced column: T3.PADDING with column: T1.PADDING
Registered qb: SEL$E9892BDE 0x8b3c6638 (JOIN REMOVED FROM QUERY BLOCK SEL$1; SEL$1; "T3"@"SEL$1")
---------------------
QUERY BLOCK SIGNATURE
---------------------
signature (): qb_name=SEL$E9892BDE nbfros=2 flg=0
fro(0): flg=0 objn=673336 hint_alias="T1"@"SEL$1"
fro(1): flg=0 objn=673337 hint_alias="T2"@"SEL$1"
however, I don’t succeed in using NO_ELIMINATE_JOIN in order to let the Optimizer not eliminate the join.
Comment by Sokrates — May 19, 2010 @ 9:19 am BST May 19,2010 |
Sokrates,
Did you try no_eliminate_join(t3) ? I failed to point out that you have to reference the alias of the table whose join should not be eliminated.
Comment by Jonathan Lewis — May 19, 2010 @ 9:52 am BST May 19,2010
you’re using hints anyway, so wouldn’t the following do the trick:
Comment by Michael V — May 19, 2010 @ 8:57 am BST May 19,2010 |
Michael,
If the query uses the two indexes in the way you suggest it has to visit all the rows that match mod1 = 50 before discarding about 90% of them after probing the t2 index. The intention is to avoid visiting the table for that 90%.
Don’t worry about trying to optimise this specific statement, by the way, it’s too trivial and too small to worry about – I’ve just used it to demonstrate the mechanism. The suggested exercise of turning it into a 4-table join is also not a realistic one, since a simple index hint to t2 will (in this case) allow Oracle to visit nothing more than the rows that are really needed and we don’t need to do anything clever to limit the number of block visits to t2.
Comment by Jonathan Lewis — May 19, 2010 @ 9:34 am BST May 19,2010 |
NO_ELIMINATE_JOIN worked in this form:
Comment by Timur Akhmadeev — May 19, 2010 @ 10:17 am BST May 19,2010 |
Timur,
Thanks.
An important trap in a case like this is to following human thinking rather than machine “thinking”. My first thought was “Oracle is eliminating t3”, but as your suggestion shows – the problem has symmetry, if you stop Oracle from eliminating t3 it can rewrite the query to eliminate t1.
Comment by Jonathan Lewis — May 23, 2010 @ 1:16 pm BST May 23,2010 |
you wrote
“Note the mod columns which return 1,000 rows per value, and the random columns which return approximately 100 rows per value.”
correct is:
the random columns which return approximately 10 rows per value.
select avg(co) from (select random1, count(*) co from t1 group by random1);
AVG(CO)
----------
10
and there are 10.000 distinct random values.
I assume it was a typo, when you replace “trunc(dbms_random.value(0,10000))” by “trunc(dbms_random.value(0,1000))”, your example still works.
However, I still don’t understand your statement “[Oracle] overestimates the final result set”
How do you see it overestimates the final result set ?
Comment by Sokrates — May 20, 2010 @ 9:42 am BST May 20,2010 |
Sokrates,
Thanks for that – now corrected.
It was a typo in the original script, which I cloned for the second table. The rest of the posting is consistent with the script.
I saw it overestimating the result because I ran the query and got about 96 rows when the optimizer was predicting 1,000. (If you change the script, though, the actual result happens to come quite close to the prediction.)
Comment by Jonathan Lewis — May 23, 2010 @ 1:19 pm BST May 23,2010 |
This SQL is close to 4-table join, using “WITH” clause and materialize hint. I tested in a 11.2 database. In my testing, it did about 300 consistent gets, about 1400 less than 3-table join one. It only processed the rows that were in the final result set.
Comment by Bo — May 21, 2010 @ 12:03 am BST May 21,2010 |
Bo,
Thanks for the example.
I really like the way you’ve used subquery factoring to introduce clarity to what you’re doing here – and your solution is valid. (The hint use_nl(v3) is presumably a typo for use_nl(t2) but on my machine the nested loop to t2 still appeared with the logical I/O you reported.)
A couple of general points, though. I would avoid using /*+ materialize */ unless really necessary since this will write and read to the temporary tablespace – the direct I/O may have a higher price than the saving you generate. Secondly, as a matter of personal taste – I would not have introduced the third subquery: the first two subqueries add clarity to the solution but, to me, the third one doesn’t add value. For my reference set of scripts, I’ve taken your idea and written a query that ends up joining v1, v2, t1, t2.
Comment by Jonathan Lewis — May 23, 2010 @ 1:27 pm BST May 23,2010 |
Hi Jonathan,
reading your interesting post I thought what would happen if I include an order as follows
I'm expecting an improvement in the clustering factor of two indexes and that the data in the table are distributed in adjacent blocks.
I'm expecting that Oracle then uses the two indexes so efficient and that I should not recur to the use hints (A forcing more than a suggestion, which I do not like do unless you have a need), or use
this his valuable suggestions.
I'd like to have your confirmation on this consideration
thanks
Comment by Donatello Settembrino — May 21, 2010 @ 8:38 am BST May 21,2010 |
Donatello,
You are correct. If we have a system where the data we want is well placed, then we don’t have to play games to create “damage-limiting” access paths. But in the general case we usually find that if we physically arrange the data to suit one set of queries we have other queries that want it arranged differently.
Comment by Jonathan Lewis — May 23, 2010 @ 1:31 pm BST May 23,2010 |
Jonathan,
I missed the quiz night !.
Previously for these sql’s i used to use a subquery instead of adding the 3rd table.
This was the original query used by developers
After applying the predicates, i get 45093 rows from CUSTOMER table and 4214808 rows from ADDRESS table. After joining them i get 1018.
I created the following indexes
composite index on STATE, CITY & CMS_ACCT_NBR on ADDRESS table.
composite index on F_NAME, CMS_ACCT_NBR on CUSTOMER table.
To use hash join after a range scan on these indexes i used this subquery
I missed the idea, i could have used the rowid’s
thanks & regards
srivenu
Comment by srivenu — May 28, 2010 @ 1:04 pm BST May 28,2010 |
Srivenu,
I missed the idea, i could have used the rowid’s
The use of rowids is just a little bonus – you got the important bit, which is recognising that you can use an index as if it were a just another table.
Comment by Jonathan Lewis — May 29, 2010 @ 10:10 am BST May 29,2010 |
[…] Manual Optimisation Filed under: Execution plans,Hints,Indexing,Tuning — Jonathan Lewis @ 6:00 pm UTC Oct 8,2010 Here’s an example of “creative SQL” that I wrote in response to a question on OTN about combining data from two indexes to optimise access to a table. It demonstrates the principle that you can treat an index as a special case of a table – allowing you to make a query go faster by referencing the same table more times. […]
Pingback by Manual Optimisation « Oracle Scratchpad — October 8, 2010 @ 6:04 pm BST Oct 8,2010 |
[…] another example of referencing a table twice (or three times) in the query because multiple references allow you to define a better execution […]
Pingback by Index Join – 2 « Oracle Scratchpad — November 26, 2010 @ 6:39 pm GMT Nov 26,2010 |
[…] while ago I published a note explaining how it was possible to find queries which ran faster if you manually de-coupled the index and table accesses. Here’s a further example that came up in discussion on a client site recently. The query […]
Pingback by Star Transformation « Oracle Scratchpad — April 22, 2011 @ 6:16 pm BST Apr 22,2011 |
In the situation that you’ve described it this post it make sence to put the sorting before join by rowid.
I’ve changed distribution of the data (marked by — *) so that advantage will be particularly noticeable.
As I run the script on the 11.2.0.1 I will put “_optimizer_join_elimination_enabled” into the your variant of the query.
I have also provided the query with enhancement for both tables – the second query. I’ve copied the outline data into the hint.
And the third query is the same query but with ordering by rowid before the joining.
The results for the “consistent gets” are 2136 for first, 1365 for second, 34 for third.
Of course, the distribution is too specific, in the real situation the advantage may not be so large.
The idea is that Oracle have to jump between the few blocks because of after the HJ the rowids will be misordered.
Test script:
Results:
Comment by Valentin Nikotin — January 31, 2012 @ 7:21 pm GMT Jan 31,2012 |
[…] my old “two-step” approach to visiting tables and indexes. Get the rowids you really need, and visit the table later. In this case, though, I’ve sorted […]
Pingback by Compression Units – 5 « Oracle Scratchpad — August 19, 2012 @ 6:03 pm BST Aug 19,2012 |
[…] written several examples of how we can optimise SQL manually by rewriting it to introduce extra copies of some of the tables (typically in a fashion analogous to the optimizer’s mechanism for star […]
Pingback by Table Duplication | Oracle Scratchpad — January 3, 2015 @ 11:54 am GMT Jan 3,2015 |
[…] of course, is just applying two principles that I’ve discussed before: (a) be selective about using a table twice to reduce the workload, (b) aggregate early if you can reduce the scale of the […]
Pingback by Cartesian join | Oracle Scratchpad — April 15, 2015 @ 6:41 pm BST Apr 15,2015 |