In the latest Quiz Night, I asked how you could make a query more efficient by changing a two table join into a three table join – with the clue that my third table was a repeat of the first table. Gary Myers, in comment 4, provided the type of answer I was looking for. Sometimes it is more efficient to get a small amount of data from a table on a first pass then go back and get the rest of the data on a second pass – especially if the first pass is an ‘index only’ operation.
I’ve created a little demonstration that gives you some idea of the approach:
create table t1
as
with generator as (
select --+ materialize
rownum id
from dual
connect by
rownum <= 10000
)
select
rownum id,
mod(rownum,100) mod1,
trunc(dbms_random.value(0,10000)) random1,
lpad(rownum,10,'0') small_vc,
rpad('x',60) padding
from
generator v1,
generator v2
where
rownum <= 100000
;
create table t2
as
with generator as (
select --+ materialize
rownum id
from dual
connect by
rownum <= 10000
)
select
rownum id,
mod(rownum,100) mod2,
trunc(dbms_random.value(0,10000)) random2,
lpad(rownum,10,'0') small_vc,
rpad('x',60) padding
from
generator v1,
generator v2
where
rownum <= 100000
;
create index t1_i1 on t1(mod1, random1);
create index t2_i1 on t2(mod2, random2);
This creates two tables of 100,000 (fairly short) rows. Note the mod columns which return 1,000 rows per value, and the random columns which return approximately 10 rows per value. When I give Oracle the following query, it overestimates the final result set and chooses what I know to be a relatively resource-intensive execution plan:
select
t1.padding,
t2.padding
from
t1, t2
where
t1.mod1 = 50
and t2.random2 = t1.random1
and t2.mod2 = 50
;
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1045 | 138K| 388 |
|* 1 | HASH JOIN | | 1045 | 138K| 388 |
|* 2 | TABLE ACCESS FULL| T1 | 1000 | 68000 | 193 |
|* 3 | TABLE ACCESS FULL| T2 | 1000 | 68000 | 193 |
-----------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T2"."RANDOM2"="T1"."RANDOM1")
2 - filter("T1"."MOD1"=50)
3 - filter("T2"."MOD2"=50)
This plan is dictated largely by the fact that I have to collect quite a lot of data from both tables then eliminate a large fraction of the data I have collected. This pattern is the driver for what I am about to do: I know that I want a small volume of data eventually but if I have to go to the table at every step of the plan then I will have to do a lot of redundant work and carry a lot of redundant data at some point. Remember – it’s often the case that “visiting the table” is the expensive part of any query.
select
/*+
leading(t1 t2 t3)
use_nl(t3)
rowid(t3)
*/
t3.padding,
t2.padding
from
t1,
t2,
t1 t3
where
t1.mod1 = 50
and t2.random2 = t1.random1
and t2.mod2 = 50
and t3.rowid = t1.rowid
;
---------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1045 | 163K| 1244 |
| 1 | NESTED LOOPS | | 1045 | 163K| 1244 |
|* 2 | HASH JOIN | | 1045 | 90915 | 199 |
|* 3 | INDEX RANGE SCAN | T1_I1 | 1000 | 19000 | 4 |
|* 4 | TABLE ACCESS FULL | T2 | 1000 | 68000 | 193 |
| 5 | TABLE ACCESS BY USER ROWID| T1 | 1 | 73 | 1 |
---------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
3 - SEL$1 / T1@SEL$1
4 - SEL$1 / T2@SEL$1
5 - SEL$1 / T3@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T2"."RANDOM2"="T1"."RANDOM1")
3 - access("T1"."MOD1"=50)
4 - filter("T2"."MOD2"=50)
I don’t think the optimizer can generate a plan like this at present – but I may be wrong. I’ve reduced my workload by taking advantage of an existing index on table t1 to do a range scan that picks up only the columns that I need to join to t2. In this case the t2 access path was still a full tablescan – but even so I have reduced the workload against t1, and by the time I revisit it by rowid I will only be visiting the (relatively) small number of rows I really need.
(Left as an exercise to the reader: I could have written the query as a four part join, visiting both table segments by rowid for just those rows that I really needed; have a go, and check that you’ve got it right. Don’t forget that any references to the “non-index” columns that appear in the query have to be changed to reference the second occurrence of the table – note how I’ve changed t1.padding in my original query to t3.padding in the rewrite.)
Footnote:
If you think this type of path is a little odd take a look at the typical stucture of a nested loop join that appears under “nlj_batching” in 11g (this isnt the same t1 and t2 as above, by the way):
select
/*+ ordered use_nl(t1) index(t1(n1)) */
t2.n1, t1.n2
from t2,t1
where
t2.n2 = 45
and t1.n1 = t2.n1;
------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | 225 |
| 1 | NESTED LOOPS | | |
| 2 | NESTED LOOPS | | 225 |
|* 3 | TABLE ACCESS FULL | T2 | 15 |
|* 4 | INDEX RANGE SCAN | T1_I1 | 15 |
| 5 | TABLE ACCESS BY INDEX ROWID| T1 | 15 |
------------------------------------------------------
Notice how Oracle can present a single join as two nested loops – one into the index and a second into the table. This is why I think there may be options within the optimizer to do my little trick automatically – if not now, then soon.
Update June 2012
I’ve just had an exchange of email with an Oak Table member who has pointed me to US patent 8103658 (dated November 2009) – which looks like a remarkably good description of this technique. So maybe the method will become an automatic option for the optimizer some time in the next couple of years.

Jonathan,
thank you for this interesting note.
a. question
how do you define “more efficient” ?
b. note
I tried to reproduce your results on
11.2.0.1.0 – 64bit
Red Hat Enterprise Linux AS release 5.3
(system statistics on)
and I get
11.2.0.1.0 > set autotr traceonly 11.2.0.1.0 > select /*+ leading(t1 t2 t3) use_nl(t3) rowid(t3) */ t3.padding, t2.padding from t1, t2, t1 t3 where t1.mod1 = 50 and t2.random2 = t1.random1 and t2.mod2 = 50 and t3.rowid = t1.ro 2 wid ; 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 98 rows selected. Elapsed: 00:00:00.04 Execution Plan ---------------------------------------------------------- Plan hash value: 1838229974 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1045 | 151K| 259 (1)| 01:51:14 | |* 1 | HASH JOIN | | 1045 | 151K| 259 (1)| 01:51:14 | |* 2 | TABLE ACCESS FULL| T1 | 1000 | 80000 | 129 (0)| 00:55:31 | |* 3 | TABLE ACCESS FULL| T2 | 1000 | 68000 | 129 (0)| 00:55:31 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T2"."RANDOM2"="T1"."RANDOM1") 2 - filter("T1"."MOD1"=50) 3 - filter("T2"."MOD2"=50) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 2485 consistent gets 0 physical reads 0 redo size 2464 bytes sent via SQL*Net to client 590 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 98 rows processedComment by Sokrates — May 18, 2010 @ 7:33 pm UTC May 18,2010 |
Sokrates,
(a): “more efficient” – generally I would interpret this as ‘gets the same result by doing less work’
(b): interesting – I wonder how clever 10g might get with the same type of strategy. It looks like 11.2 has manage to invoke “Join elimination”. I don’t have a copy of 11.2 at hand currently, but I’d check the 10053 trace for references to join elimination. If this is the case then you might want to check the effect of adding the NO_ELIMINATE_JOIN to the query (you may have to add a qb_name hint to the main query block (or make explicit reference to sel$1) so that you have a named query block in the no_eliminate_join hint.
Comment by Jonathan Lewis — May 18, 2010 @ 11:34 pm UTC May 18,2010 |
Bingo !
10053 trace shows:
JE: Considering Join Elimination on query block SEL$1 (#0)
*************************
Join Elimination (JE)
*************************
SQL:******* UNPARSED QUERY IS *******
SELECT /*+ LEADING ("T1" "T2" "T3") ROWID ("T3") USE_NL ("T3") */ "T3"."PADDING" "PADDING","T2"."PADDING" "PADDING" FROM "SOKRATES"."T1" "T1","SOKRATES"."T2" "T2","SOKRATES"."T1" "T3" WHERE "T1"."MOD1"=50 AND "T2"."RANDOM2"="T1"."RANDOM1" AND "T2"."MOD2"=50 AND "T3".ROWID="T1".ROWID
JE: cfro: T1 objn:673336 col#:1001 dfro:T1 dcol#:1001
JE: cfro: T1 objn:673336 col#:1001 dfro:T1 dcol#:1001
Query block (0x2b6a8b3c6638) before join elimination:
SQL:******* UNPARSED QUERY IS *******
SELECT /*+ LEADING ("T1" "T2" "T3") ROWID ("T3") USE_NL ("T3") */ "T3"."PADDING" "PADDING","T2"."PADDING" "PADDING" FROM "SOKRATES"."T1" "T1","SOKRATES"."T2" "T2","SOKRATES"."T1" "T3" WHERE "T1"."MOD1"=50 AND "T2"."RANDOM2"="T1"."RANDOM1" AND "T2"."MOD2"=50 AND "T1".ROWID="T3".ROWID
JE: eliminate table: T1 (T3)
JE: Replaced column: T3.PADDING with column: T1.PADDING
Registered qb: SEL$E9892BDE 0x8b3c6638 (JOIN REMOVED FROM QUERY BLOCK SEL$1; SEL$1; "T3"@"SEL$1")
---------------------
QUERY BLOCK SIGNATURE
---------------------
signature (): qb_name=SEL$E9892BDE nbfros=2 flg=0
fro(0): flg=0 objn=673336 hint_alias="T1"@"SEL$1"
fro(1): flg=0 objn=673337 hint_alias="T2"@"SEL$1"
however, I don’t succeed in using NO_ELIMINATE_JOIN in order to let the Optimizer not eliminate the join.
Comment by Sokrates — May 19, 2010 @ 9:19 am UTC May 19,2010 |
Sokrates,
Did you try no_eliminate_join(t3) ? I failed to point out that you have to reference the alias of the table whose join should not be eliminated.
Comment by Jonathan Lewis — May 19, 2010 @ 9:52 am UTC May 19,2010
you’re using hints anyway, so wouldn’t the following do the trick:
Comment by Michael V — May 19, 2010 @ 8:57 am UTC May 19,2010 |
Michael,
If the query uses the two indexes in the way you suggest it has to visit all the rows that match mod1 = 50 before discarding about 90% of them after probing the t2 index. The intention is to avoid visiting the table for that 90%.
Don’t worry about trying to optimise this specific statement, by the way, it’s too trivial and too small to worry about – I’ve just used it to demonstrate the mechanism. The suggested exercise of turning it into a 4-table join is also not a realistic one, since a simple index hint to t2 will (in this case) allow Oracle to visit nothing more than the rows that are really needed and we don’t need to do anything clever to limit the number of block visits to t2.
Comment by Jonathan Lewis — May 19, 2010 @ 9:34 am UTC May 19,2010 |
NO_ELIMINATE_JOIN worked in this form:
Comment by Timur Akhmadeev — May 19, 2010 @ 10:17 am UTC May 19,2010 |
Timur,
Thanks.
An important trap in a case like this is to following human thinking rather than machine “thinking”. My first thought was “Oracle is eliminating t3″, but as your suggestion shows – the problem has symmetry, if you stop Oracle from eliminating t3 it can rewrite the query to eliminate t1.
Comment by Jonathan Lewis — May 23, 2010 @ 1:16 pm UTC May 23,2010 |
you wrote
“Note the mod columns which return 1,000 rows per value, and the random columns which return approximately 100 rows per value.”
correct is:
the random columns which return approximately 10 rows per value.
select avg(co) from (select random1, count(*) co from t1 group by random1);
AVG(CO)
----------
10
and there are 10.000 distinct random values.
I assume it was a typo, when you replace “trunc(dbms_random.value(0,10000))” by “trunc(dbms_random.value(0,1000))”, your example still works.
However, I still don’t understand your statement “[Oracle] overestimates the final result set”
How do you see it overestimates the final result set ?
Comment by Sokrates — May 20, 2010 @ 9:42 am UTC May 20,2010 |
Sokrates,
Thanks for that – now corrected.
It was a typo in the original script, which I cloned for the second table. The rest of the posting is consistent with the script.
I saw it overestimating the result because I ran the query and got about 96 rows when the optimizer was predicting 1,000. (If you change the script, though, the actual result happens to come quite close to the prediction.)
Comment by Jonathan Lewis — May 23, 2010 @ 1:19 pm UTC May 23,2010 |
This SQL is close to 4-table join, using “WITH” clause and materialize hint. I tested in a 11.2 database. In my testing, it did about 300 consistent gets, about 1400 less than 3-table join one. It only processed the rows that were in the final result set.
Comment by Bo — May 21, 2010 @ 12:03 am UTC May 21,2010 |
Bo,
Thanks for the example.
I really like the way you’ve used subquery factoring to introduce clarity to what you’re doing here – and your solution is valid. (The hint use_nl(v3) is presumably a typo for use_nl(t2) but on my machine the nested loop to t2 still appeared with the logical I/O you reported.)
A couple of general points, though. I would avoid using /*+ materialize */ unless really necessary since this will write and read to the temporary tablespace – the direct I/O may have a higher price than the saving you generate. Secondly, as a matter of personal taste – I would not have introduced the third subquery: the first two subqueries add clarity to the solution but, to me, the third one doesn’t add value. For my reference set of scripts, I’ve taken your idea and written a query that ends up joining v1, v2, t1, t2.
Comment by Jonathan Lewis — May 23, 2010 @ 1:27 pm UTC May 23,2010 |
Hi Jonathan,
reading your interesting post I thought what would happen if I include an order as follows
create table t1 as SELECT * FROM ( with generator as ( select --+ materialize rownum id from dual connect by rownum <= 10000 ) select rownum id, mod(rownum,100) mod1, trunc(dbms_random.value(0,10000)) random1, lpad(rownum,10,'0') small_vc, rpad('x',60) padding from generator v1, generator v2 where rownum <= 100000) ORDER BY MOD1, RANDOM1 ; create table t2 as SELECT * FROM ( with generator as ( select --+ materialize rownum id from dual connect by rownum <= 10000 ) select rownum id, mod(rownum,100) mod2, trunc(dbms_random.value(0,10000)) random2, lpad(rownum,10,'0') small_vc, rpad('x',60) padding from generator v1, generator v2 where rownum <= 100000 ) ORDER BY MOD2, RANDOM2 ;I'm expecting an improvement in the clustering factor of two indexes and that the data in the table are distributed in adjacent blocks.
I'm expecting that Oracle then uses the two indexes so efficient and that I should not recur to the use hints (A forcing more than a suggestion, which I do not like do unless you have a need), or use
this his valuable suggestions.
I'd like to have your confirmation on this consideration
thanks
Comment by Donatello Settembrino — May 21, 2010 @ 8:38 am UTC May 21,2010 |
Donatello,
You are correct. If we have a system where the data we want is well placed, then we don’t have to play games to create “damage-limiting” access paths. But in the general case we usually find that if we physically arrange the data to suit one set of queries we have other queries that want it arranged differently.
Comment by Jonathan Lewis — May 23, 2010 @ 1:31 pm UTC May 23,2010 |
Jonathan,
I missed the quiz night !.
Previously for these sql’s i used to use a subquery instead of adding the 3rd table.
This was the original query used by developers
After applying the predicates, i get 45093 rows from CUSTOMER table and 4214808 rows from ADDRESS table. After joining them i get 1018.
I created the following indexes
composite index on STATE, CITY & CMS_ACCT_NBR on ADDRESS table.
composite index on F_NAME, CMS_ACCT_NBR on CUSTOMER table.
To use hash join after a range scan on these indexes i used this subquery
SELECT a.address1, a.address2, a.address3, b.cust_day_phone, b.DOB FROM ( select a1.cms_acct_nbr from ADDRESS a1, CUSTOMER b1 WHERE a1.STATE = 'AP' AND a1.CITY = 'HYDERABAD' AND b1.F_NAME = 'VEERA' AND a1.cms_acct_nbr = b1.cms_acct_nbr ) iv, ADDRESS a, CUSTOMER b, CUST_CONTACT_REL c WHERE iv.cms_acct_nbr = a.cms_acct_nbr AND a.cms_acct_nbr = b.cms_acct_nbr AND b.cms_acct_nbr = c.cms_acct_nbr AND c.address_type = 'B'I missed the idea, i could have used the rowid’s
thanks & regards
srivenu
Comment by srivenu — May 28, 2010 @ 1:04 pm UTC May 28,2010 |
Srivenu,
I missed the idea, i could have used the rowid’s
The use of rowids is just a little bonus – you got the important bit, which is recognising that you can use an index as if it were a just another table.
Comment by Jonathan Lewis — May 29, 2010 @ 10:10 am UTC May 29,2010 |
[...] Manual Optimisation Filed under: Execution plans,Hints,Indexing,Tuning — Jonathan Lewis @ 6:00 pm UTC Oct 8,2010 Here’s an example of “creative SQL” that I wrote in response to a question on OTN about combining data from two indexes to optimise access to a table. It demonstrates the principle that you can treat an index as a special case of a table – allowing you to make a query go faster by referencing the same table more times. [...]
Pingback by Manual Optimisation « Oracle Scratchpad — October 8, 2010 @ 6:04 pm UTC Oct 8,2010 |
[...] another example of referencing a table twice (or three times) in the query because multiple references allow you to define a better execution [...]
Pingback by Index Join – 2 « Oracle Scratchpad — November 26, 2010 @ 6:39 pm UTC Nov 26,2010 |
[...] while ago I published a note explaining how it was possible to find queries which ran faster if you manually de-coupled the index and table accesses. Here’s a further example that came up in discussion on a client site recently. The query [...]
Pingback by Star Transformation « Oracle Scratchpad — April 22, 2011 @ 6:16 pm UTC Apr 22,2011 |
In the situation that you’ve described it this post it make sence to put the sorting before join by rowid.
I’ve changed distribution of the data (marked by — *) so that advantage will be particularly noticeable.
As I run the script on the 11.2.0.1 I will put “_optimizer_join_elimination_enabled” into the your variant of the query.
I have also provided the query with enhancement for both tables – the second query. I’ve copied the outline data into the hint.
And the third query is the same query but with ordering by rowid before the joining.
The results for the “consistent gets” are 2136 for first, 1365 for second, 34 for third.
Of course, the distribution is too specific, in the real situation the advantage may not be so large.
The idea is that Oracle have to jump between the few blocks because of after the HJ the rowids will be misordered.
Test script:
drop table t1 purge; drop table t2 purge; create table t1 as with generator as ( select --+ materialize rownum id from dual connect by rownum <= 10000 ) select rownum id, trunc(rownum / 1000) mod1, -- * trunc(dbms_random.value(0,1000)) random1, -- * lpad(rownum,10,'0') small_vc, rpad('x',60) padding from generator v1, generator v2 where rownum <= 100000 ; create table t2 as with generator as ( select --+ materialize rownum id from dual connect by rownum <= 10000 ) select rownum id, trunc(rownum / 1000) mod2, -- * trunc(dbms_random.value(0,1000)) random2, -- * lpad(rownum,10,'0') small_vc, rpad('x',60) padding from generator v1, generator v2 where rownum <= 100000 ; create index t1_i1 on t1(mod1, random1); create index t2_i1 on t2(mod2, random2); begin dbms_stats.gather_table_stats(user, 't1', estimate_percent => 100, method_opt => 'for all columns size 1'); dbms_stats.gather_table_stats(user, 't2', estimate_percent => 100, method_opt => 'for all columns size 1'); end; / alter session set "_optimizer_cost_model" = io; set autot on -- first select /*+ opt_param('_optimizer_join_elimination_enabled' 'false') leading(t1 t2 t3) use_nl(t3) rowid(t3) */ count(t3.padding) cnt1, count(t2.padding) cnt2 from t1 , t2 , t1 t3 where t1.mod1 = 50 and t2.random2 = t1.random1 and t2.mod2 = 50 and t3.rowid = t1.rowid; -- second select /*+ OUTLINE_LEAF(@"SEL$3") OUTLINE_LEAF(@"SEL$2") OUTLINE_LEAF(@"SEL$1") NO_ACCESS(@"SEL$1" "TT"@"SEL$1") LEADING(@"SEL$1" "TT"@"SEL$1" "T4"@"SEL$1") USE_NL(@"SEL$1" "T4"@"SEL$1") NO_ACCESS(@"SEL$2" "T"@"SEL$2") LEADING(@"SEL$2" "T"@"SEL$2" "T3"@"SEL$2") USE_NL(@"SEL$2" "T3"@"SEL$2") LEADING(@"SEL$3" "T1"@"SEL$3" "T2"@"SEL$3") */ count(tt.padding) cnt1, count(t4.padding) cnt2 from ( select t3.padding, t.rid2 from ( select t1.rowid rid1, t2.rowid rid2 from t1, t2 where t1.mod1 = 50 and t2.random2 = t1.random1 and t2.mod2 = 50 ) t , t1 t3 where t3.rowid = t.rid1 ) tt , t2 t4 where tt.rid2 = t4.rowid; -- third select /*+ OPT_PARAM('_optimizer_order_by_elimination_enabled' 'false') OUTLINE_LEAF(@"SEL$3") OUTLINE_LEAF(@"SEL$2") OUTLINE_LEAF(@"SEL$1") NO_ACCESS(@"SEL$1" "TT"@"SEL$1") ROWID(@"SEL$1" "T4"@"SEL$1") LEADING(@"SEL$1" "TT"@"SEL$1" "T4"@"SEL$1") USE_NL(@"SEL$1" "T4"@"SEL$1") NO_ACCESS(@"SEL$2" "T"@"SEL$2") ROWID(@"SEL$2" "T3"@"SEL$2") LEADING(@"SEL$2" "T"@"SEL$2" "T3"@"SEL$2") USE_NL(@"SEL$2" "T3"@"SEL$2") INDEX(@"SEL$3" "T1"@"SEL$3" ("T1"."MOD1" "T1"."RANDOM1")) INDEX(@"SEL$3" "T2"@"SEL$3" ("T2"."MOD2" "T2"."RANDOM2")) LEADING(@"SEL$3" "T1"@"SEL$3" "T2"@"SEL$3") USE_HASH(@"SEL$3" "T2"@"SEL$3") */ count(tt.padding) cnt1, count(t4.padding) cnt2 from ( select t3.padding, t.rid2 from ( select t1.rowid rid1, t2.rowid rid2 from t1, t2 where t1.mod1 = 50 and t2.random2 = t1.random1 and t2.mod2 = 50 order by rid1 ) t , t1 t3 where t3.rowid = t.rid1 order by rid2 ) tt , t2 t4 where tt.rid2 = t4.rowid;Results:
SQL*Plus: Release 11.2.0.1.0 Production on Tue Jan 31 21:08:42 2012 Copyright (c) 1982, 2010, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> drop table t1 purge; Table dropped. SQL> drop table t2 purge; Table dropped. SQL> SQL> create table t1 2 as 3 with generator as ( 4 select --+ materialize 5 rownum id 6 from dual 7 connect by 8 rownum <= 10000 9 ) 10 select 11 rownum id, 12 trunc(rownum / 1000) mod1, -- * 13 trunc(dbms_random.value(0,1000)) random1, -- * 14 lpad(rownum,10,'0') small_vc, 15 rpad('x',60) padding 16 from 17 generator v1, 18 generator v2 19 where 20 rownum <= 100000 21 ; Table created. SQL> SQL> create table t2 2 as 3 with generator as ( 4 select --+ materialize 5 rownum id 6 from dual 7 connect by 8 rownum <= 10000 9 ) 10 select 11 rownum id, 12 trunc(rownum / 1000) mod2, -- * 13 trunc(dbms_random.value(0,1000)) random2, -- * 14 lpad(rownum,10,'0') small_vc, 15 rpad('x',60) padding 16 from 17 generator v1, 18 generator v2 19 where 20 rownum <= 100000 21 ; Table created. SQL> SQL> create index t1_i1 on t1(mod1, random1); Index created. SQL> create index t2_i1 on t2(mod2, random2); Index created. SQL> SQL> begin 2 dbms_stats.gather_table_stats(user, 't1', estimate_percent => 100, method_opt => 'for all columns size 1'); 3 dbms_stats.gather_table_stats(user, 't2', estimate_percent => 100, method_opt => 'for all columns size 1'); 4 end; 5 / PL/SQL procedure successfully completed. SQL> SQL> alter session set "_optimizer_cost_model" = io; Session altered. SQL> SQL> set autot on SQL> SQL> -- first SQL> SQL> select /*+ 2 opt_param('_optimizer_join_elimination_enabled' 'false') 3 leading(t1 t2 t3) 4 use_nl(t3) 5 rowid(t3) 6 */ 7 count(t3.padding) cnt1, count(t2.padding) cnt2 8 from t1 9 , t2 10 , t1 t3 11 where t1.mod1 = 50 12 and t2.random2 = t1.random1 13 and t2.mod2 = 50 14 and t3.rowid = t1.rowid; CNT1 CNT2 ---------- ---------- 974 974 Execution Plan ---------------------------------------------------------- Plan hash value: 3542293915 ---------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ---------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 160 | 1753 | | 1 | SORT AGGREGATE | | 1 | 160 | | | 2 | NESTED LOOPS | | 1554 | 242K| 1753 | |* 3 | HASH JOIN | | 1554 | 132K| 199 | |* 4 | INDEX RANGE SCAN | T1_I1 | 990 | 18810 | 4 | |* 5 | TABLE ACCESS FULL | T2 | 990 | 67320 | 193 | | 6 | TABLE ACCESS BY USER ROWID| T1 | 1 | 73 | 1 | ---------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("T2"."RANDOM2"="T1"."RANDOM1") 4 - access("T1"."MOD1"=50) 5 - filter("T2"."MOD2"=50) Note ----- - cpu costing is off (consider enabling it) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 2136 consistent gets 0 physical reads 0 redo size 481 bytes sent via SQL*Net to client 416 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed SQL> SQL> -- second SQL> SQL> select /*+ 2 OUTLINE_LEAF(@"SEL$3") 3 OUTLINE_LEAF(@"SEL$2") 4 OUTLINE_LEAF(@"SEL$1") 5 NO_ACCESS(@"SEL$1" "TT"@"SEL$1") 6 LEADING(@"SEL$1" "TT"@"SEL$1" "T4"@"SEL$1") 7 USE_NL(@"SEL$1" "T4"@"SEL$1") 8 NO_ACCESS(@"SEL$2" "T"@"SEL$2") 9 LEADING(@"SEL$2" "T"@"SEL$2" "T3"@"SEL$2") 10 USE_NL(@"SEL$2" "T3"@"SEL$2") 11 LEADING(@"SEL$3" "T1"@"SEL$3" "T2"@"SEL$3") 12 */ 13 count(tt.padding) cnt1, count(t4.padding) cnt2 14 from 15 ( 16 select 17 t3.padding, 18 t.rid2 19 from 20 ( 21 select 22 t1.rowid rid1, 23 t2.rowid rid2 24 from t1, t2 25 where t1.mod1 = 50 26 and t2.random2 = t1.random1 27 and t2.mod2 = 50 28 ) t 29 , t1 t3 30 where t3.rowid = t.rid1 31 ) tt 32 , t2 t4 33 where tt.rid2 = t4.rowid; CNT1 CNT2 ---------- ---------- 974 974 Execution Plan ---------------------------------------------------------- Plan hash value: 1174882150 ------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 117 | 3118 | | 1 | SORT AGGREGATE | | 1 | 117 | | | 2 | NESTED LOOPS | | 1554 | 177K| 3118 | | 3 | VIEW | | 1554 | 68376 | 1564 | | 4 | NESTED LOOPS | | 1554 | 147K| 1564 | | 5 | VIEW | | 1554 | 37296 | 10 | |* 6 | HASH JOIN | | 1554 | 59052 | 10 | |* 7 | INDEX RANGE SCAN | T1_I1 | 990 | 18810 | 4 | |* 8 | INDEX RANGE SCAN | T2_I1 | 990 | 18810 | 4 | | 9 | TABLE ACCESS BY USER ROWID| T1 | 1 | 73 | 1 | | 10 | TABLE ACCESS BY USER ROWID | T2 | 1 | 73 | 1 | ------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 6 - access("T2"."RANDOM2"="T1"."RANDOM1") 7 - access("T1"."MOD1"=50) 8 - access("T2"."MOD2"=50) Note ----- - cpu costing is off (consider enabling it) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 1365 consistent gets 0 physical reads 0 redo size 481 bytes sent via SQL*Net to client 416 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed SQL> SQL> -- third SQL> SQL> select /*+ 2 OPT_PARAM('_optimizer_order_by_elimination_enabled' 'false') 3 OUTLINE_LEAF(@"SEL$3") 4 OUTLINE_LEAF(@"SEL$2") 5 OUTLINE_LEAF(@"SEL$1") 6 NO_ACCESS(@"SEL$1" "TT"@"SEL$1") 7 ROWID(@"SEL$1" "T4"@"SEL$1") 8 LEADING(@"SEL$1" "TT"@"SEL$1" "T4"@"SEL$1") 9 USE_NL(@"SEL$1" "T4"@"SEL$1") 10 NO_ACCESS(@"SEL$2" "T"@"SEL$2") 11 ROWID(@"SEL$2" "T3"@"SEL$2") 12 LEADING(@"SEL$2" "T"@"SEL$2" "T3"@"SEL$2") 13 USE_NL(@"SEL$2" "T3"@"SEL$2") 14 INDEX(@"SEL$3" "T1"@"SEL$3" ("T1"."MOD1" "T1"."RANDOM1")) 15 INDEX(@"SEL$3" "T2"@"SEL$3" ("T2"."MOD2" "T2"."RANDOM2")) 16 LEADING(@"SEL$3" "T1"@"SEL$3" "T2"@"SEL$3") 17 USE_HASH(@"SEL$3" "T2"@"SEL$3") 18 */ 19 count(tt.padding) cnt1, count(t4.padding) cnt2 20 from 21 ( 22 select 23 t3.padding, 24 t.rid2 25 from (select 26 t1.rowid rid1, 27 t2.rowid rid2 28 from t1, t2 29 where t1.mod1 = 50 30 and t2.random2 = t1.random1 31 and t2.mod2 = 50 32 order by rid1) t 33 , t1 t3 34 where t3.rowid = t.rid1 35 order by rid2 36 ) tt 37 , t2 t4 38 where tt.rid2 = t4.rowid; CNT1 CNT2 ---------- ---------- 974 974 Execution Plan ---------------------------------------------------------- Plan hash value: 2600095256 ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 117 | 3210 | | 1 | SORT AGGREGATE | | 1 | 117 | | | 2 | NESTED LOOPS | | 1554 | 177K| 3210 | | 3 | VIEW | | 1554 | 68376 | 1656 | | 4 | SORT ORDER BY | | 1554 | 147K| 1656 | | 5 | NESTED LOOPS | | 1554 | 147K| 1603 | | 6 | VIEW | | 1554 | 37296 | 49 | | 7 | SORT ORDER BY | | 1554 | 59052 | 49 | |* 8 | HASH JOIN | | 1554 | 59052 | 10 | |* 9 | INDEX RANGE SCAN | T1_I1 | 990 | 18810 | 4 | |* 10 | INDEX RANGE SCAN | T2_I1 | 990 | 18810 | 4 | | 11 | TABLE ACCESS BY USER ROWID| T1 | 1 | 73 | 1 | | 12 | TABLE ACCESS BY USER ROWID | T2 | 1 | 73 | 1 | ------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 8 - access("T2"."RANDOM2"="T1"."RANDOM1") 9 - access("T1"."MOD1"=50) 10 - access("T2"."MOD2"=50) Note ----- - cpu costing is off (consider enabling it) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 34 consistent gets 0 physical reads 0 redo size 481 bytes sent via SQL*Net to client 416 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 1 rows processed SQL>Comment by Valentin Nikotin — January 31, 2012 @ 7:21 pm UTC Jan 31,2012 |
[...] my old “two-step” approach to visiting tables and indexes. Get the rowids you really need, and visit the table later. In this case, though, I’ve sorted [...]
Pingback by Compression Units – 5 « Oracle Scratchpad — August 19, 2012 @ 6:03 pm UTC Aug 19,2012 |