Oracle Scratchpad

November 22, 2010

Index Join

Filed under: CBO,Execution plans,Index Joins,Indexing — Jonathan Lewis @ 6:40 pm BST Nov 22,2010

One of the less well known access paths available to the optimizer is the “index join” also known as the “index hash join” path. It’s an access path that can be used when the optimizer decides that it doesn’t need to visit a table to supply the select list because there are indexes on the table that, between them, hold all the required columns. A simple example might look something like the following:


create table indjoin
as
select
	rownum	id,
	rownum	val1,
	rownum	val2,
	rpad('x',500) padding
from
	all_objects
where
	rownum <= 3000
;

create unique index ij_v1 on indjoin(id, val1);
create unique index ij_v2 on indjoin(id, val2);

-- collect stats on the table and indexes

select
	ij.id
from
	indjoin		ij
where
	ij.val1 between 100 and 200
and	ij.val2 between 50 and 150
;

Note that the columns in the where clause appear in (some) indexes, and the column(s) in the select list exist in (at least) some indexes. Under these circumstances the optimizer can produce the following plan (the test script was one I wrote for 8i – but this plan comes from an 11.1 instance):


---------------------------------------------------------------------------
| Id  | Operation              | Name             | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |     3 |    36 |    24 |
|   1 |  VIEW                  | index$_join$_001 |     3 |    36 |    24 |
|*  2 |   HASH JOIN            |                  |       |       |       |
|*  3 |    INDEX FAST FULL SCAN| IJ_V1            |     3 |    36 |    11 |
|*  4 |    INDEX FAST FULL SCAN| IJ_V2            |     3 |    36 |    11 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(ROWID=ROWID)
   3 - filter("VAL1"<=200 AND "VAL1">=100)
   4 - filter("VAL2"<=150 AND "VAL2">=50)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - "ID"[NUMBER,22]
   2 - (#keys=1) "ID"[NUMBER,22]
   3 - ROWID[ROWID,10], "ID"[NUMBER,22]
   4 - ROWID[ROWID,10]

We do a fast full scan of the two indexes extracting the rowid and id from index ij_v1 and just the rowid from index ij_v2. We can then get the result we want by doing a hash join between these two result sets on the rowid values because any time the two rowsources have a rowid in common, it’s a rowid for a row where val1 is between 100 and 200, and val2 is between 50 and 150 and the first rowsource is carrying the id - which is the thing we need to report.

There are a couple of little observations that we can make about this example.

    First, although I’ve only used two indexes in this example Oracle is not limited to just two indexes. The number of indexes that could be used is effectively unlimited.
    Second, the index_join path is strictly limited to cases where the optimizer can see that every column in the query can be found in indexes on the table.
    Third, although my example uses index fast full scans that’s not a necessary feature of the plan. Just like any other hash join, Oracle could use an index range (or full) scan to get some of the data.
    Finally, there are clearly a couple of bugs in the code.

Bugs:

If you check the rows/bytes columns in the plan you’ll see that the predicted number of rows selected is the same for both indexes (lines 3 and 4) – but we extract the rowid and the id from the first index (projection detail for line 3), so the total data volume expected from line 3 is slightly larger than the total data volume from line 4 where we extract only the rowid; theoretically, therefore, the optimizer has used the tables (indexes) in the wrong order – the one supplying the smaller volume of data should have been used as the first (build) rowsource.

More significantly, though, a quick check of the code that generates the data tells you that each index will supply 101 rows to the hash join – and you can even show that for other query execution plans the optimizer will calculate this cardinality (nearly) correctly. In the case of the index join the optimizer seems to have lost the correct individual cardinalities and has decided to use the size of the final result set as the cardinality of the two driving index scans.

There’s more, of course – one of the strangest things about the index join is that if your select list includes the table’s rowid, the optimizer doesn’t consider that to be a column in the index. So even though the predicate section of the plan shows the rowids being projected in the hash join, Oracle won’t use an index join for a query returning the rowid !

Footnote: The reason I’ve written this brief introduction to the index join is because an interesting question came up at the first E2SN virtual conference.

“If you hint an index hash join, is there any way of telling Oracle the order in which it should use the indexes?”

The answer is no – but there are ways of creating code that will do what you want, and that will be the topic of my next blog.

Updated Feb 2012

I’m wrong about controlling the order of the index hash join – it suddenly occurred to me that the full specification for the hint includes a list of the indexes you want to join. Having run a few experiments using the full specification for a two-index join, I think I’ve convinced myself that the order you list the indexes is taken as the join order, with no swap.  Unfortunately there are various bugs that hide this fact. Furthermore,  I haven’t yet investigated more complex examples  (viz: examples which hash more indexes) to see whether Oracle also allows for swapping join inputs when there are more than two indexes.

[Further reading on Index Joins]

9 Comments »

  1. Jonathon,

    Do we have histograms on these columns, Also i would like to see the result with oracle 11g column statistics.

    Regards
    Amir Riaz

    Comment by Amir Riaz — November 23, 2010 @ 6:17 am BST Nov 23,2010 | Reply

  2. Amir,

    There are no histograms on the columns. It’s a question worth asking because it might (in theory) make some sort of difference. In fact all three columns are uniformly distributed (rownum) with no gaps in the range, so Oracle’s analysis of the data in its default “gather stats” calls shouldn’t produce histograms.

    Another point about stats, though, is that I’ve disabled CPU costing (system statistics) to make the results as consistent as possible across different versions of Oracle – and you might wonder whether CPU costing would be enough to switch the order of the join.

    Comment by Jonathan Lewis — November 23, 2010 @ 8:02 am BST Nov 23,2010 | Reply

  3. Thanks Jonathan for picking this up :).
    Can’t wait for second part .
    BTW It was my question .
    Regards.
    Greg

    Comment by Grzegorz Goryszewski — November 24, 2010 @ 8:11 am BST Nov 24,2010 | Reply

    • Greg,

      It might appear this evening while I’m waiting in Heathrow airport for a flight to Geneva, or it might appear on Friday evening while I’m waiting in Geneva airport for a flight home – but it will be here soon. (And then there are two more articles I could write on the topic.)

      As I said at the time – it was a good question.

      Comment by Jonathan Lewis — November 24, 2010 @ 10:22 am BST Nov 24,2010 | Reply

  4. Hi Jonathon,

    Nice article just a quick question why the rows column of the execution plan show 3 row in the final step (i.e. id=0)
    it should be 50 right (or at least close to 50 if there is some arithmetics error)?

    OR am i missing something?

    Many Thanks

    Comment by Henish — November 24, 2010 @ 2:58 pm BST Nov 24,2010 | Reply

  5. Henish,

    This is “just” the usual arithmetic problem that the optimizer has with dependent columns. You can see that the two columns are perfectly correlated and that the query is asking for a given overlap. The optimizer sees only that we have two columns that are each asking for 1/30th of the data so, ignoring some of the fiddly bits it’s arithmetic is “total_rows * 1/30 * 1/30″ (which isn’t 3, but it’s a lot closer than the 50 you’re expecting.

    Comment by Jonathan Lewis — November 24, 2010 @ 6:48 pm BST Nov 24,2010 | Reply

  6. Jonathan,

    I’m thinking about the sentence “…total data volume expected from line 3 is slightly larger than the total data volume from line 4 where we extract only the rowid…”

    If we look at the Projection Information you are right, but I think that in this scenario Oracle is saying something like “fast full scan of the index costs me 36 bytes, then of this 36 I take only the ROWID” that is: in a fast full scan Oracle do something like a full table scan so it reads all the index’s column(s) even if they are not needed and then selects only what really need (the Projection Information).
    So in this case Bytes (column from the plan) and the Projection Information can be a bit “unrelated”.

    What you think of my idea?

    Bye,
    Antonio

    Comment by lascoltodelvenerdi — November 25, 2010 @ 10:24 am BST Nov 25,2010 | Reply

    • Antonio,

      I think the bytes column is supposed to be the bytes produced in the rowsource that will be passed up to the parent, so it should include the bytes for the rowid.

      Comment by Jonathan Lewis — November 27, 2010 @ 7:07 pm BST Nov 27,2010 | Reply

  7. [...] very lucid, reproducible example about Index Joins has been put on display by Jonathan Lewis. It’s a compact introduction about Index Joins, which are also known as Index Hash [...]

    Pingback by Log Buffer #207, A Carnival of the Vanities for DBAs | The Pythian Blog — November 26, 2010 @ 3:40 pm BST Nov 26,2010 | Reply


RSS feed for comments on this post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,514 other followers