Update 24th Nov 2010: See comment 14 below for restriction, and an indication of how the use of lnnvl() can become a little counter-intuitive in slightly more complex cases.
There are various reasons why Oracle is sometimes unable to unnest a subquery. Here’s one example of a fairly common problem:
rem rem Script: exists_with_or_2.sql rem Dated: June 2006 rem Author: J P Lewis rem select count(small_vc) from t1 where t1.mod_15 > 10 and ( t1.modded = 0 or exists ( select null from t2 where t2.id = t1.id ) ) ;
Because of the “or” condition, the optimiser has to run the subquery as a filter subquery – in this case starting with an indexed access path to find all rows where column mod_15 > 10.
Having identified all rows where mod_15 > 10, Oracle will check whether modded is zero, if it is the row will be reported, if it isn’t then Oracle will (in principle – but check my comments about filter subqueries) run the subquery as the second check.
If this does more work than you think appropriate, or introduces the random fluctutations in performance I described in my note on filter subqueries, then you could always restructure the query manually – like this:
select count(small_vc) from ( select small_vc from t1 where t1.mod_15 > 10 and t1.modded = 0 union all select small_vc from t1 where t1.mod_15 > 10 and t1.modded != 0 and exists ( select null from t2 where t2.id = t1.id ) ) ;
In effect, this is a manual concatenation – the sort of thing the optimizer will sometimes do when it can generate a result by combining the results from two simpler, non-overlapping, queries.
In one of my queries I find all the data where modded is zero, and in the other I find all the data where modded is not zero but the subquery test succeeds. Of course, I would only do this if I thought that the two queries could both operate efficiently – which is probably not true in this specific case – but, because I’ve changed the “or” into an “and”, the optimizer does at least have the option to consider unnesting.
However …
At first sight this looks like a useful strategy but there is a problem still to be addressed – the ever-threatening null. If I have a row where mod_15 > 10 and modded is null then it will get reported by the original query – but not by my re-write.
It won’t appear in the top half of my union all because it won’t pass the test modded = 0, and it won’t get into the bottom half because it won’t pass the test modded != 0.
Unless column modded has been declared to be not null, the predicate I need for the bottom half of the union all should be a variant of: “nvl(t1.modded,-1) != 0”
Of course, it’s a bit of a pain having to re-engineer the original predicate into something with an appropriate nvl() function included like this. And there’s a way to avoid it. As an alternative (at least in 10g) you can take advantage of the lnnvl() function – a somewhat unusual function which takes a predicate as its single input parameter and returns a “proper” boolean (true/false) as its result. All we need to write is lnnvl(t1.modded = 0) as the extra predicate and the job is done.
The function lnnvl(predicate) returns true if its parameter returns false or null. So it’s a clean, simple, way of making sure we have exactly the right code in place when we do our manual concatenation.
Footnote:
The lnnvl() function has existed since at least 7.3, but was not documented until 10g.
Update (Aug 2020)
Oracle introduced “cost-based OR-expansion” in 12.2.0.1 which produces exactly the effect we need to see in cases like this. However you may need to hint it with the or_expand() hint in both 12.2 and 18c to make the transformation appear. See this blog note for further details and an example but be a little careful about bugs in the earlier version
.
Be careful with lnnvl(), at least on pre-10g systems:
Connected to:
Oracle8i Release 8.1.7.0.0 – Production
JServer Release 8.1.7.0.0 – Production
SQL> select 1 from dual where not lnnvl(1=1);
select 1 from dual where not lnnvl(1=1)
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Unfortunately I don’t have access to a 10g system to test this there.
Comment by Milo van der Leij — February 27, 2007 @ 6:04 pm GMT Feb 27,2007 |
It works fine without the “not” on 9.2.0.6 and 10.2.0.2
With the “not” in 10.2.0.2.0 we get:
SQL> select 1 from dual where not lnnvl(1=1);
select 1 from dual where not lnnvl(1=1)
*
ERROR at line 1:
ORA-03002: operator not implemented
Fun little bug.
Comment by scott — February 27, 2007 @ 6:14 pm GMT Feb 27,2007 |
hi Jonathan, I was wondering if you can shed light on something of a pesky problem that I’m having, with a big Ora Apps scenario: we’re trying to pull data for any and all records in a query that have records that have changed in **any** table after a particular date such that:
select * from tableA A, tableB B, tableC C, TableD D
where a.col1=b.col1 and b1.col2=c.col2 and c.col3=d.col3 and
(A.last_update_date > trunc(sysdate) OR
B.last_update_date > trunc(sysdate) OR
C.last_update_date > trunc(sysdate) OR
D.last_update_date > trunc(sysdate))
so, basically, because of the “OR” clauses, there’s full table scans on just about every table — rightfully so, I believe;
When these tables are into the millions of records, this is an unreasonable/untunable query — it seems.
One cannot use “AND” because, let’s say, when retrieving data for a child table, a parent’s record might be outside of the date range (“previous dates”) sought, so hence the thought of using “OR’s”
any thoughts/suggestions? — this is just a small subset/example, however the real example has about 8 tables with the “OR’s” … which to me, the problem seems untunable… or the approach, at least…
thanks much for any feedback into this “OR” pesky issue ;-)
Cos
Comment by cos — March 2, 2007 @ 6:13 pm GMT Mar 2,2007 |
Hi Cos,
Is the process in your question a datawarehouse loading process (a source view query to be precise) ?
We have exactly the same situation where in we are trying to capture the changed data in a number of tables of Oracle APPS.
Jonathan, would be very helpful if you could shed some light on Cos’s question.
Thanks,
Mahad.
Comment by Mahad — March 3, 2007 @ 6:20 am GMT Mar 3,2007 |
Mahad,
that’s precisely it. ;-)
Comment by Cos — March 3, 2007 @ 3:55 pm GMT Mar 3,2007 |
Cos, based on the comments you and Mahad have made, I suspect the problem is not the tablescans, but the fact that you are effectively doing a Cartesian join across all these tables, and only discarding data at the very last step.
I’d guess that with the 8-table requirement, you are probably doing 7 consecutive hash joins – and it’s the joins which are probably far more labour-intensive than the tablescans themselves.
You could test the effect of an 8-way union all of 8 joins, putting the last_update_date predicate on just one of the tables in each subquery. This might give you 8 relatively small result sets that could then be sorted for uniqueness
Bear in mind that an extract process that tries to depend on a predicate like “last_update_date > timestamp” could “lose” updates because of read-consistency. There’s a few comments about the problem in Oracle Insights – Tales of the Oak Table”.
Comment by Jonathan Lewis — March 4, 2007 @ 10:03 pm GMT Mar 4,2007 |
hi Jonathan,
thanks much for the reply.
Unfortunately, the table scans was **also** the problem. Yes, I tried the multi-way UNION ALL as a workaround and yes, that would have been faster but had not entirely completed the test as I feared doing the unions N-times would have been significantly more expensive ;-( If I get the selection (pre-discarding) step fast, I think we’ll be fine as there won’t be that many records involved then. Thanks for the book pointer, I have it signed by none other than JL ;-) but I guess I have not read that entirely either
Back to the drawing board ;-)
Comment by Cos — March 5, 2007 @ 1:10 pm GMT Mar 5,2007 |
Cos, part of the assumption of the union all approach is that you have have to do something inefficient to get the driving rows from one table, but the joins to the other seven will then be along the PK/FK indexes that represent the referential integrity. If those joins are still going to be inefficient then there isn’t a lot you can do, other than some infrastructure changes.
Comment by Jonathan Lewis — March 5, 2007 @ 1:14 pm GMT Mar 5,2007 |
you’re correct Jonathan, yes, the purpose of the UNION ALL, “AND”-ing the date column one table at a time worked fine and was quite fast after “then-walking-along PK’s” … I was just concerned about the amount of code that was to be written as well (complexity, management etc), but I guess, in these instances, you can’t have your cake and eat it all as well… sometimes, you have to write a lot more code to achieve performance. Like some famous Oracle guru says :-) — LOL — sometimes you have gently nudge, you have to “coach” Oracle into “understanding” better what you’re trying to do … worst part is, many times, developers, moreso end users, don’t have a clue of what they’re trying to accomplish to begin with!!! … :-) ok… enough with the ranting.
Comment by cos — March 5, 2007 @ 3:08 pm GMT Mar 5,2007 |
Interesting. Looks like “lnnvl” can be nested. I ran this on Oracle 9.2.0.7.0:
select ‘Line 1’ as is_true
from dual
where lnnvl(1=null)
union
select ‘Line 2’ as is_true
from dual
where lnnvl(1=0)
union
select ‘Line 3’ as is_true
from dual
where lnnvl(1=1)
union
select ‘Line 4’ as is_true
from dual
where lnnvl(lnnvl(1=null))
union
select ‘Line 5’ as is_true
from dual
where lnnvl(lnnvl(1=0))
union
select ‘Line 6’ as is_true
from dual
where lnnvl(lnnvl(1=1))
Comment by Robert Shepard — March 5, 2007 @ 9:55 pm GMT Mar 5,2007 |
Looks like the bug with “NOT LNNVL()” is Bug 4211527. Originally filed with 10.1.0.3 over two years ago. No workaround posted. What a pain. Back to the “OR” statements for me.
Comment by Don Seiler — April 4, 2007 @ 7:33 pm BST Apr 4,2007 |
Don,
Thanks for letting us know about this bug.
Comment by Jonathan Lewis — April 8, 2007 @ 3:55 pm BST Apr 8,2007 |
I would say, a workaround for the “NOT LNNVL()” Bug 4211527
is using “lnnvl(lnnvl())” instead of “not lnnvl()”
did some tests and it looks great for me.
for example
select null from dual
where lnnvl(lnnvl(exists(select null from dual where lnnvl(null=null))));
do I miss something ?
why don’t they propose it as workaround ?
Comment by Sokrates — June 28, 2007 @ 2:22 pm BST Jun 28,2007 |
Update:
I suggested the use of lnnvl() in a thread on OTN recently – and the user found that they got Oracle error “ORA-013207: incorrect use of the [LNNVL] operator”. I hadn’t realised that there were any limitations on the predicates you could apply the function to, and their predicate was:
You could work around this by expanding the lnnvl() predicate to:
but this is a little counter-intuitive and possibly difficult to work out – and my original argument for using the lnnvl() operator was that it made the code easier to understand. So in this case I’d go for:
Comment by Jonathan Lewis — November 24, 2010 @ 12:30 pm GMT Nov 24,2010 |
Hi Jonathan
I have been investigating a performance issue regarding the use of OR. Changing the query to use UNION ALL improved performance(minutes to seconds) by changing the plans to use indexes instead of full scans which the OR version caused. Here’s the bit I’ve found strange, while trying to do various testing I wrap a create table as select around the slow query(with the “OR”) and the CBO decides to pick the fast explain plan i.e the one with the indexes. I’ve assumed some form of dynamic sampling maybe in play but can’t see anything in the 10046 trace that obviously suggests this or anything on the web to suggest DDL will change an explain plan. Have you seen this behavior before?
Comment by Gary — October 18, 2011 @ 8:15 pm BST Oct 18,2011 |
Gary,
The only thing I can think of is an example I saw recently in 11.2 (I think) where the cardinality of a select statement was 1 when the query used a UNION ALL in an inline view joined to some other table, but a better (ca. 80M) cardinality when the UNION ALL was changed to a UNION. There isn’t any information in your example that would make it reasonable to suggest that your example is linked – but I suppose it’s a possibility, and it’s possible that CTAS results in a limited range of execution paths being examined that bypass the code that hits the cardinality error.
Comment by Jonathan Lewis — November 13, 2011 @ 2:37 pm GMT Nov 13,2011 |
[…] when considering rewriting a SQL OR condition to a UNION/UNION ALL, as pointed out to me by Jonathan Lewis referencing this post on this recent OTN forum […]
Pingback by Manual Concatenation and LNVL « OraStory — July 23, 2012 @ 12:34 pm BST Jul 23,2012 |
[…] shoot the above web service performance issue by transforming a disjunctive subquery into an UNION ALL SQL statement so that I gave the CBO an opportunity to choose an optimal […]
Pingback by Disjunctive subquery | Mohamed Houri’s Oracle Notes — May 12, 2014 @ 2:29 pm BST May 12,2014 |
[…] by a pingback on this post, followed in very short order by a related question (with a most gratifying result) on […]
Pingback by Subquery with OR | Oracle Scratchpad — May 15, 2014 @ 6:23 pm BST May 15,2014 |
[…] is to do something inefficient – a problem I described some time ago in a note on “OR with subquery”. We recognise, of course, that in general we should either use the front end to choose one of two […]
Pingback by Quiz Night | Oracle Scratchpad — May 16, 2015 @ 12:14 pm BST May 16,2015 |
[…] you are a regular reader of Jonathan Lewis blog you will have probably came across this article in which the author explains why an “OR subquery” pre-empts the optimizer from unnesting the […]
Pingback by SUBQ INTO VIEW FOR COMPLEX UNNEST | Mohamed Houri’s Oracle Notes — June 6, 2015 @ 9:42 am BST Jun 6,2015 |
Hi,
I’ve rerun the test listed on http://nenadnoveljic.com/blog/disjunctive-subquery-optimization/ on Oracle 18.4.0.0.0
and still no improvement. (even with the unnest)
Comment by Anonymous — November 26, 2018 @ 2:23 pm GMT Nov 26,2018 |
Anonymous,
A late reply on this one – the query needs an OR_EXPAND hint for the new (12.2) cost-based or-expansion to take place. (it would be automatic, unhinted, in 19c).
Regards
Jonathan Lewis
Comment by Jonathan Lewis — August 26, 2020 @ 10:53 pm BST Aug 26,2020 |
In my test case mentioned by Anonymous abive, ORE isn’t done even in 19c. The reason is the following heuristic: the columns of the driving table referenced in WHERE must be indexed. (“ORE: Bypassed for disjunct chain: No Index or Partition driver found.”). The expansion is done after creating indexes on t_large.a and t_large.id.
Notice, that the FILTER cost is wrong (591 instead of ~3 million). If Oracle decides to remove the index heuristic and make this transformation entirely cost based, it should fix the FILTER costing problem first.
Best regards,
Nenad
Comment by Nenad Noveljic — August 27, 2020 @ 11:18 am BST Aug 27,2020 |
Nenad,
Thanks for the follow-up.
I have to admit I didn’t look closely at the case I just added my comment about the hint because 18c still needs the hint for some queries that clearly ought to expand automatically.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — August 27, 2020 @ 11:42 am BST Aug 27,2020
[…] thinking that it was displaying a run-time problem due to the extreme number of copies of the lnnvl() function the optimizer had produced. In fact it turned out to be a parse-time problem rather than a […]
Pingback by Parse Time | Oracle Scratchpad — April 19, 2019 @ 5:10 pm BST Apr 19,2019 |
[…] by preceding parts of the union all; this can mean lots of new predicates appearing that use the lnnvl() function. (Of course there are variants between these two extremes, but the general principle is […]
Pingback by ANSI Plans | Oracle Scratchpad — February 17, 2020 @ 9:22 am GMT Feb 17,2020 |
[…] you check the Predicate Information you will see that operation 8 has introduced two lnnvl() predicates. Since the optimizer has lost sight of the fact that :choice is the same variable in […]
Pingback by Danger – Hints | Oracle Scratchpad — February 17, 2020 @ 9:25 am GMT Feb 17,2020 |
[…] written a couple of notes in the past about the problems of optimising queries with predicates of the form “or […]
Pingback by Subquery with OR | Oracle Scratchpad — August 19, 2020 @ 2:03 pm BST Aug 19,2020 |