Update 24th Nov 2010: See comment below for restriction.
There are various reasons why Oracle is sometimes unable to unnest a subquery. Here’s one example of a fairly common problem:
rem rem Script: exists_with_or_2.sql rem Dated: June 2006 rem Author: J P Lewis rem select count(small_vc) from t1 where t1.mod_15 > 10 and ( t1.modded = 0 or exists ( select null from t2 where t2.id = t1.id ) ) ;
Because of the “or” condition, the optimiser has to run the subquery as a filter subquery – in this case starting with an indexed access path to find all rows where column mod_15 > 10.
Having identified all rows where mod_15 > 10, Oracle will check whether modded is zero, if it is the row will be reported, if it isn’t then Oracle will (in principle – but check my comments about filter subqueries) run the subquery as the second check.
If this does more work than you think appropriate, or introduces the random fluctutations in performance I described in my note on filter subqueries, then you could always restructure the query manually – like this:
select count(small_vc) from ( select small_vc from t1 where t1.mod_15 > 10 and t1.modded = 0 union all select small_vc from t1 where t1.mod_15 > 10 and t1.modded != 0 and exists ( select null from t2 where t2.id = t1.id ) ) ;
In effect, this is a manual concatenation – the sort of thing the optimizer will sometimes do when it can generate a result by combining the results from two simpler, non-overlapping, queries.
In one of my queries I find all the data where modded is zero, and in the other I find all the data where modded is not zero but the subquery test succeeds. Of course, I would only do this if I thought that the two queries could both operate efficiently – which is probably not true in this specific case – but, because I’ve changed the “or” into an “and”, the optimizer does at least have the option to consider unnesting.
At first sight this looks like a useful strategy, but there is a problem still to be addressed – the ever-threatening null. If I have a row where mod_15 > 10 and modded is null then it will get reported by the original query – but not by my re-write.
It won’t appear in the top half of my union all because it won’t pass the test modded = 0, and it won’t get into the bottom half because it won’t pass the test modded != 0.
Unless column modded has been declared to be not null, the predicate I need for the bottom half of the union all should be a variant of: “nvl(t1.modded,-1) != 0”
Of course, it’s a bit of a pain having to re-engineer the original predicate into something with an appropriate nvl() function included like this. And there’s a way to avoid it. As an alternative (at least in 10g) you can take advantage of the lnnvl() function – a somewhat unusual function which takes a predicate as it’s single input parameter and returns a “proper” boolean (true/false) as its result. All we need to write is lnnvl(t1.modded = 0) as the extra predicate, and the job is done.
The function lnnvl(predicate) returns true if its parameter returns false or null. So it’s a clean, simple, way of making sure we have exactly the right code in place when we do our manual concatenation.
Footnote: the lnnvl() function has existed since at least 7.3, but was not documented until 10g.