Browsing a little history recently I came across a note I’d written about the new-style index hint. In that note I claimed that:
… the index has to start with the columns (product_group, id) in that order – with preference given to an exact match, otherwise using the lowest cost index that starts the right way.
On reading this statement I suddenly realised that I hadn’t actually proved (to myself, even) that if I had the indexes (product_group, id) and (product_group, id, other_col) then a two-column hint forced Oracle to use the two column index in all (legal) circumstances.
So, tonight’s quiz – are there any edge cases, and what easy ways can you think of to prove (or disprove) the claim for the general case.
Footnote: you don’t have to demonstrate the method, just a brief outline of the idea will be sufficient.
Update Jan 19th
I’m sorry it’s taken so long to respond to this. It was a post that I pre-dated at the end of December, and I forgot that it would launch itself.
The most interesting comment, from my perspective, came from Valentin Nikotin – “what if the index you are hinting is in another schema?” Virtually every test I do is based on a single schema, I rarely use two, or more, schemas at the same time. So what if there’s a special case that somehow the hint has a precedence that assumes the indexes will be found by (in effect) querying user_ind_columns rather than all_ind_columns ? I wouldn’t expect this to be the case – but when you’re trying to do pre-emptive trouble-shooting it’s this type of case (i.e. “nobody does that sort of thing”) that you can easily overlook.
The other responents came up with the type of thing that I would call the correct strategy. We need to start with an SQL statement that uses the three-column index by default and uses the two-column index only when it’s hinted. Of course, if we can examine the 10053 trace and see the change, especially the restriction to the two-column index explicitly being labelled with something like “index demanded by user hint”, then we can be pretty confident that the interpretation is correct.
The question then is, how can we construct a suitable demonstration.
In general, if you have a three-column index that starts with the same columns in the same order as the two-column index then the three-column index will be bigger and have a higher clustering_factor so, from a purely arithmetic perspective, it will be less desirable than the two column index except for one special class of queries. Consider a query of the form:
select col3 from t1 where col1 = {constant} and col2 = {constant}
With a suitable three-column index this query can be answered from completely within the index, with only the first two columns Oracle has to visit the table. So if we ensure that there are several rows scattered around the table we can be confident (in the general case) that the optimizer will see that the three-column query has a lower cost than the two-column query.
I don’t need to set up the model for you – Charles Hooper has already done the work.
Footnote 2:
Following Valentin Nikotin’s comment, you might like to consider that any reports you currently run against the user_indexes and user_ind_columns views (and their partition-related equivalents) should be run against the all_indexes and all_ind_columns views with a restriction of the form: “table_owner = user”.
Index with the same name that is owned by the other user.
Comment by Valentin Nikotin — January 13, 2012 @ 9:53 pm GMT Jan 13,2012 |
The Optimizer transforms the hint to /*+ INDEX (“T” “IND”) */ while it performs.
After that it will consider all indexes having the same name.
Comment by Valentin Nikotin — January 15, 2012 @ 3:52 pm GMT Jan 15,2012 |
Valentin,
Although Oracle reports an index name in the final execution plan output, if you check the outline section of the plan – dbms_xplan.display(null,null,’outline’) you will find that it is still thinking of the index in terms of an ordered list of columns.
Comment by Jonathan Lewis — January 26, 2012 @ 9:48 pm GMT Jan 26,2012 |
The another example of the situation when Oracle doesn’t perform correctly new-style hint, while old-style works fine, is a hint for table through a db link. The hint in the query on the remote side will have ??? instead of the index name:
Comment by Valentin Nikotin — January 15, 2012 @ 8:57 pm GMT Jan 15,2012 |
Valentin,
That’s an interesting observation.
The behaviour probably varies with version of Oracle, so it’s worth having a couple of simple test cases so that you can check what’s going on in each upgrade and patch.
Comment by Jonathan Lewis — January 26, 2012 @ 9:42 pm GMT Jan 26,2012 |
Test with a SQL statement that doesn’t reference the columns in the two column index, eg select other_col from products order by other_col;
A CBO (10053) trace on the this SQL without the hint shows evaluations of access paths for a table scan or using the three column index only.
A CBO trace on the SQL with the hint shows “User hint to use this index” for the two column index and only shows access path options using the two column index.
Comment by Ari — January 14, 2012 @ 9:14 am GMT Jan 14,2012 |
If you select product_group, id, other_col, in totality or not (but at most only those columns) and you hint to use the two columns index (product_group, id) for a query involving in its where clause only a predicate on other_col then the CBO might(depending on your data and how it is scattered) do an index FFS on the 3 columns index instead of the hinted 2 columns index
Comment by Mohamed Houri — January 15, 2012 @ 9:17 am GMT Jan 15,2012 |
For what it is worth, I was not successful in disproving what you stated when testing Oracle Database 11.2.0.2 (although it is possible to make it appear as though what is stated is incorrect, simply by using an invalid index hint). The test table and indexes that I started with were defined as follows:
Unhinted, the following query accesses the index on columns C2, C1, and C3 to avoid accessing the table:
Let’s hint the optimizer to use the index on the columns C1 and C2:
In the above, the optimizer obeyed the hint, even though the calculated cost from the unhinted plan increased from 1,823 to 1,012,000.
Let’s reverse the order of the columns in the index hint:
In the above, note that the index on columns C2, C1, and C3 was used, but the cost is now calculated at 5,171 rather than 1,823 as it was in the unhinted plan. The INDEX FAST FULL SCAN operation is now shown as an INDEX FULL SCAN operation.
We have an index on columns C2, C1, and C3, but we also have an index on columns C3, c1, and C2. What happens when we specify the columns C3, C1, and C2 in the index hint in that order?
An index full scan was selected to performed on the IND_T1_C3_C1_C2 index at a cost 5,283, rather than using the IND_T1_C2_C1_C3 index that previously resulted in a cost of 5,171 – so the optimizer will not alter the order of the columns in the index hint to reduce the calculated cost.
If we add a WHERE clause that places a restriction on column C2 to be less than 10, the optimizer could use a couple of different access paths. Let’s specify the columns C1 and C2 in the index hint to see which index is selected:
In the above, a skip scan was selected because the index with columns C1 and C2 was specified in the hint.
Let’s try another example that possibly might be considered a case where the optimizer disobeys the hint or is free to change the order of the columns specified in the index hint (this might be incorrectly considered an edge case):
A quick peek at the above output might suggest that the optimizer could decide to locate an index with columns C1, C2, and C3 in any order – but I do not believe that this is the case. I believe that the optimizer considered the index hint specified in the SQL statement as being invalid (a check of the 10053 trace might confirm).
Let’s create another index and then repeat the above SQL statement.
The index with the columns that matched the order of the columns in the index hint was selected, even though the calculated cost would have been lower if the optimizer were permitted to select any index with the columns listed in the index hint.
What about a case where there is an exact match between an index definition and an index hint, and there is also another index with one additional column which would avoid the table access:
The index that exactly match the index hint was selected.
What if we only specify in the index hint a leading column, when there are two indexes with that leading column, one of which allows the optimizer to avoid the table access:
The optimizer selected the lowest cost access path from the two indexes that matched the hint.
What if we specify a column in the index hint that is not listed in the SELECT or WHERE clauses?
The above shows that the optimizer still obeyed the intention of the hint – it found an index that started with the specified column and selected to perform an INDEX SKIP SCAN even though column C2, specified in the WHERE clause, is the third column in the index definition.
What happens if we try something silly by modifying the WHERE clause in the previous SQL statement to specify that essentially every row from the table will be retrieved:
At least with B*tree indexes and simple SQL statements, there do not appear to be any edge cases, although specifying an invalid index hint might appear to be an edge case at first glance.
Comment by Charles Hooper — January 15, 2012 @ 3:53 pm GMT Jan 15,2012 |
Jonathan,
It might be the case that the original basis for your observation is found in the Oracle Database documentation (found in the 10.2 and 11.2 documentation library – a bit more information is provided in the 11.2 documentation libary (in section 19.2.4 Specifying Complex Index Hints), but the link to create the V view (link to Example 16-3) was removed).
http://docs.oracle.com/cd/B19306_01/server.102/b14211/hintsref.htm#PFGRF50105
The above quote agrees with your statement. However, the index hint example provided in the above quote does not appear to be valid because both an index name and a column list are specified. In light of the apparently invalid index hint found in the 10.2 and 11.2 documentation, a bit more experimentation might be required.
I find that it is interesting that the following index hint found in the documentation works:
Comment by Charles Hooper — January 15, 2012 @ 8:04 pm GMT Jan 15,2012 |
Charles,
Thanks for that. I’ve always found it harder to find things in the manuals since they went electronic, but I’m surprised I hadn’t noticed that before. (Maybe it wasn’t there when I first checked, or maybe I only checked in the 10.1 manual).
To return the favour – the /*+ index(v.e2.e3 emp_job_ix) */ format goes all the way back to Oracle 8, and is the form used to supply hints against tables inside view definitions. If it actually works it means that e2 is a view inside v, and e3 is a view inside e2.
Comment by Jonathan Lewis — January 26, 2012 @ 9:47 pm GMT Jan 26,2012 |
[…] my previous post, I made the comment: In general, if you have a three-column index that starts with the same […]
Pingback by Quiz Night « Oracle Scratchpad — January 19, 2012 @ 8:51 am GMT Jan 19,2012 |
[…] reminded that it was possible to create non-specific index hints that specify table columns when a recent quiz was posted that asked to find specific cases where the behavior is other than expected with the […]
Pingback by Non-Specific Index Hints « Charles Hooper's Oracle Notes — January 24, 2012 @ 8:39 pm GMT Jan 24,2012 |