Browsing a little history recently I came across a note I’d written about the new-style index hint. In that note I claimed that:
… the index has to start with the columns (product_group, id) in that order – with preference given to an exact match, otherwise using the lowest cost index that starts the right way.
On reading this statement I suddenly realised that I hadn’t actually proved (to myself, even) that if I had the indexes (product_group, id) and (product_group, id, other_col) then a two-column hint forced Oracle to use the two column index in all (legal) circumstances.
So, tonight’s quiz – are there any edge cases, and what easy ways can you think of to prove (or disprove) the claim for the general case.
Footnote: you don’t have to demonstrate the method, just a brief outline of the idea will be sufficient.
A new form of index hint appeared in 10g – and it’s becoming more common to see it in production code; instead of naming indexes in index hints, we describe them. Consider the following hint (expressed in two ways, first as it appeared in the outline section of an execution plan, then cosmetically adjusted to look more like the way you would write it in your SQL):
INDEX(@"SEL$1" "PRD"@"SEL$1" ("PRODUCTS"."PRODUCT_GROUP" "PRODUCTS"."ID"))
index(@sel$1 prd@sel$1(product_group id))
As I’ve often pointed out, this blog isn’t AskTom, or the OTN forum, so I don’t expect to have people asking me to solve their problems; neither do I answer email questions about specific problems. Occasionally, though, questions do appear that are worth a little public airing, and one of these came in by email a couple of weeks ago. The question is longer than the answer I sent, my contribution to the exchange doesn’t start until the heading: “My Reply”.
Here’s a quick tutorial in hinting, promped by a question on the OTN database forum.
The OP has a hash semi-join and Oracle appears to be ignoring a hint to use a nested loop:
Yes, finally, really ignoring hints – but it’s a sort of bug, of course.
Thanks to Timur Akhmadeev for telling us about bug 8432870 (you’ll need an account on MOS for the link) in his reply to Charles Hooper’s recent post.
In the upgrade from 9i to 10g there was a change in the “hint parser”. If you put a valid SQL keyword inside the hint delimiters (the note says /*+ */ but doesn’t mention the –+ alternative for specifying a hint, thought it’s probably still true there) when the keyword is not a valid hint – for example the word NOLOGGING which I have seen people use as if it were a hint – then Oracle will ignore all the hints.
Earlier versions of Oracle simply noticed that you had embedded something that wasn’t a valid hint, but that didn’t stop the parser from reading the rest of the hints correctly.
If the invalid hint is not a valid SQL keyword then there are no nasty side effects.
This might explain why I ran into an odd problem a little while ago when I added a comment to my hint list and found that the hints stopped working. I can’t remember the exact details any more but I think my comment was something along the lines of: “Do not … because …”, and this broke the hints until I changed it to “Don’t … because …”.
[Further reading on “ignoring hints”]
I have four simple (non-partitioned, non-clustered, not views, not object type – really I’m not trying to be cunning or devious here) heap tables, and write a query that joins them:
leading(t1 t2 t3 t4)
use_hash(t2) use_hash(t3) use_hash(t4)
t2.id2 = t1.id1
and t3.id3 = t2.id2
and t4.id4 = t3.id3
I’m not keen on ANSI standard SQL – even though it is, technically, the strategic option and even though you have to use it for full outer joins and partitioned outer joins.
One reason for disliking it is that it “separates join predicates from filter predicates” – a reason often given in praise of the syntax which, to my mind, claims a spurious distinction and introduces a mechanism that makes it harder to keep mental track of what’s going to happen as you walk through the join order. (I have to admit that I was temporarily fooled into thinking it was quite a nice idea – in an abstract sort of way.)
Here’s an extract from an execution plan I found on a client site recently. I’ve collapsed lines 5 to 42 into a single line representing the rowsource produced by a fairly messy execution plan, leaving just the last three stages of execution on view. Each of three operations joins the same collection variable (using the table() operator) to the row source – once through a hash join, then twice more (joining to two other columns) through nested loop outer joins:
The resulting estimates of row counts and costs are quite entertaining and, fortunately, not very accurate:
In an earlier article introducing the index join I raised a question that came up at the first ES2N virtual conference:
“If you hint an index hash join, is there any way of telling Oracle the order in which it should use the indexes?”
Consider the following example:
Here’s an example of “creative SQL” that I wrote in response to a question on OTN about combining data from two indexes to optimise access to a table. It demonstrates the principle that you can treat an index as a special case of a table – allowing you to make a query go faster by referencing the same table more times.
Unfortunately you shouldn’t use this particular example in a production system because it relies on the data appearing in the right order without having an “order by” clause. This type of thing makes me really keen to have a hint that says something like: /*+ qb_name(my_driver) assume_ordered(@my_driver) */ so that you could tell the optimizer that it can assume that the rowset from a given query block will appear in the order of the final “order by” clause as it does, for example, with sorted hash clusters.
I’ve previously published a couple of notes (hereand here) about the use of the driving_site() hint with distributed queries. The first note pointed out that the hint was deliberately ignored if you write a local CTAS or INSERT that did a distributed query. I’ve just found another case where the hint is ignored – this time in a simple SELECT statement.
Try running an ordinary distributed query from the SYS account, and then try using the driving_site() hint to make it run at the remote site. When I tried this a few days ago I ended up wasting half an hour translating some SQL from ANSI to Oracle dialect because I thought that the ANSI was making Oracle transform the query in a way that lost the hint – then I discovered that both versions of the code worked correctly if I logged in as a different user.
I was running my queries between two databases using 22.214.171.124 – I won’t guarantee you get the same results on other versions, but it looks like SYS doesn’t honour the driving_site() hint. I can’t think of a robust argument why this should be the case, but if I were forced to do some vague hand-waving I’d probably mumble something about potential security loopholes.
Footnote: I should, of course, have mentioned that there are all sorts of things that behave in unexpected ways if you are logged on as SYS, and that you shouldn’t be logged on as SYS – especially in a production system.
[Further reading on “ignoring hints”]
Here’s a simple piece of code demonstrating an irritating problem. I’ve created a table, a function-based index, collected stats (without histograms), and then run a query that should use that index – but doesn’t.
Everybody knows you shouldn’t be using the Rule-based optimizer (RBO) any more – everyone, that is, except some of the folks at Oracle Corp.
I had a conversation a few weeks ago with someone who was having a problem with their standby database on 10.2 because a query against v$archive_gap was taking a very long time to complete. Now that’s an easy to address (in principle) – collect stats on the underlying X$ objects using the call dbms_stats.gather_fixed_objects_stats() and the magic of cost-based optimisation takes over and solves everything.
If you read the manual pages about “dynamic sampling” it’s easy to get just a little lost in the detail; so this is a brief overview of the variations in the strategies used.
Here’s a little puzzle that someone sent to me a couple of days ago – it’s a case where the optimizer seems to be ignoring a hint.