Oracle Scratchpad

February 26, 2007

Subquery with OR

Filed under: CBO, Execution plans, Tuning — Jonathan Lewis @ 10:35 pm UTC Feb 26,2007

There are various reasons why Oracle is sometimes unable to unnest a subquery. Here’s one example of a fairly common problem:


select
	count(small_vc)
from
	t1
where
	t1.mod_15 > 10
and	(    t1.modded = 0
	  or exists (
		select	null
		from	t2
		where	t2.id = t1.id
		)
	)
;

Because of the “or” condition, the optimiser has to run the subquery as a filter subquery – in this case starting with an indexed access path to find all rows where  column mod_15 > 10.

Having identified all rows where mod_15 > 10, Oracle will check whether modded is zero, if it is the row will be reported, if it isn’t then Oracle will (in principle – but check my comments about  filter subqueries) run the subquery as the second check.

If this does more work than you think appropriate, or introduces the random fluctutations in performance I described in my note on filter subqueries, then you could always restructure the query manually – like this:


select
	count(small_vc)
from
	(
	select
		small_vc
	from
		t1
	where
		t1.mod_15 > 10
	and	t1.modded = 0
	union all
	select
		small_vc
	from
		t1
	where
		t1.mod_15 > 10
	and	t1.modded != 0
	and	exists (
			select	null
			from	t2
			where	t2.id = t1.id
		)
	)
;

In effect, this is a manual concatenation – the sort of thing the optimizer will sometimes do when it can generate a result by combining the results from two simpler, non-overlapping, queries.

In one of my queries I find all the data where modded is zero, and in the other I find all the data where modded is not zero but the subquery test succeeds. Of course, I would only do this if I thought that the two queries could both operate efficiently – which is probably not true in this specific case – but, because I’ve changed the “or” into an “and”, the optimizer does at least have the option to consider unnesting.

At first sight this looks like a useful strategy, but there is a problem still to be addressed – the ever-threatening null. If I have a row where mod_15 > 10 and modded is null then it will get reported by the original query - but not by my re-write.

It won’t appear in the top half of my union all because it won’t pass the test modded = 0, and it won’t get into the bottom half because it won’t pass the test modded != 0.

Unless column modded  has been declared to be not null, the predicate I need for the bottom half of the union all should be a variant of: “nvl(t1.modded,-1) != 0″

Of course, it’s a bit of a pain having to re-engineer the original predicate into something with an appropriate nvl() function included like this. And there’s a way to avoid it. As an alternative (at least in 10g) you can take advantage of the lnnvl() function – a somewhat unusual function which takes a predicate as it’s single input parameter and returns a “proper” boolean (true/false) as its result. All we need to write is lnnvl(t1.modded = 0) as the extra predicate, and the job is done.

The function lnnvl(predicate) returns true if its parameter returns false or null. So it’s a clean, simple, way of making sure we have exactly the right code in place when we do our manual concatenation.

Footnote: the lnnvl() function has existed since at least 7.3, but was not documented until 10g.

20 Comments »

  1. Jonathan,
    I was interested by the innvl function but can’t quite get the syntax right (no documentation, even on metalink). Based on lnnvl(t1.modded = 0) in your example, I thought a test like

    select ‘x’
    from dual
    where innvl(1= 0)

    would work, but I get the error

    1 select ‘x’
    2 from dual
    3* where innvl(1= 0)
    SQL> /
    where innvl(1= 0)
    *
    ERROR at line 3:
    ORA-00907: missing right parenthesis

    On 10.2.0.3 and 9.2.0.8.

    It’s probably something glaringly obvious….

    Comment by Ady — February 27, 2007 @ 1:40 pm UTC Feb 27,2007 | Reply

  2. Ady, the function is Lnnvl, not innvl.

    Comment by Yas — February 27, 2007 @ 2:16 pm UTC Feb 27,2007 | Reply

  3. Hey Ady,

    it’s LNNVL, not INNVL

    Comment by Franco — February 27, 2007 @ 2:17 pm UTC Feb 27,2007 | Reply

  4. Ok Yas, you won ;-)

    Comment by Franco — February 27, 2007 @ 2:18 pm UTC Feb 27,2007 | Reply

  5. Thanks Yas & Franco. I knew it would be something obvious :”>

    Comment by Ady — February 27, 2007 @ 3:16 pm UTC Feb 27,2007 | Reply

  6. Be careful with lnnvl(), at least on pre-10g systems:

    Connected to:
    Oracle8i Release 8.1.7.0.0 – Production
    JServer Release 8.1.7.0.0 – Production

    SQL> select 1 from dual where not lnnvl(1=1);
    select 1 from dual where not lnnvl(1=1)
    *
    ERROR at line 1:
    ORA-03113: end-of-file on communication channel

    Unfortunately I don’t have access to a 10g system to test this there.

    Comment by Milo van der Leij — February 27, 2007 @ 6:04 pm UTC Feb 27,2007 | Reply

  7. It works fine without the “not” on 9.2.0.6 and 10.2.0.2

    With the “not” in 10.2.0.2.0 we get:
    SQL> select 1 from dual where not lnnvl(1=1);
    select 1 from dual where not lnnvl(1=1)
    *
    ERROR at line 1:
    ORA-03002: operator not implemented

    Fun little bug.

    Comment by scott — February 27, 2007 @ 6:14 pm UTC Feb 27,2007 | Reply

  8. “If I have a row where mod_15 > 0 and modded is null then it will get reported by the original query – but not by my re-write.”

    Original query is:

    select
    count(small_vc)
    from
    t1
    where
    t1.mod_15 > 10
    and ( t1.modded = 0
    or exists (
    select null
    from t2
    where t2.id = t1.modded
    )
    )
    ;

    If modded is null
    then ( t1.modded = 0
    or exists (
    select null
    from t2
    where t2.id = t1.modded
    )
    )

    will be false so the original query will not return the rows when t1.modded is null

    where am I going wrong ?

    Comment by amit poddar — February 27, 2007 @ 8:30 pm UTC Feb 27,2007 | Reply

  9. Amit, you aren’t going wrong, I made three mistakes in the article.

    First, the text should have had the same mod_15 > 10 as the SQL; secondly, the nvl() in the text should have read nvl(modded,-1) != 0; and finally, the correlating predicate in the subquery should have been t2.id = t1.id.

    Sorry about causing confusion and wasting your time. I’ve now corrected the errors.

    Comment by Jonathan Lewis — February 28, 2007 @ 6:02 am UTC Feb 28,2007 | Reply

  10. hi Jonathan, I was wondering if you can shed light on something of a pesky problem that I’m having, with a big Ora Apps scenario: we’re trying to pull data for any and all records in a query that have records that have changed in **any** table after a particular date such that:

    select * from tableA A, tableB B, tableC C, TableD D
    where a.col1=b.col1 and b1.col2=c.col2 and c.col3=d.col3 and
    (A.last_update_date > trunc(sysdate) OR
    B.last_update_date > trunc(sysdate) OR
    C.last_update_date > trunc(sysdate) OR
    D.last_update_date > trunc(sysdate))

    so, basically, because of the “OR” clauses, there’s full table scans on just about every table — rightfully so, I believe;

    When these tables are into the millions of records, this is an unreasonable/untunable query — it seems.

    One cannot use “AND” because, let’s say, when retrieving data for a child table, a parent’s record might be outside of the date range (“previous dates”) sought, so hence the thought of using “OR’s”

    any thoughts/suggestions? — this is just a small subset/example, however the real example has about 8 tables with the “OR’s” … which to me, the problem seems untunable… or the approach, at least…

    thanks much for any feedback into this “OR” pesky issue ;-)
    Cos

    Comment by cos — March 2, 2007 @ 6:13 pm UTC Mar 2,2007 | Reply

  11. Hi Cos,

    Is the process in your question a datawarehouse loading process (a source view query to be precise) ?

    We have exactly the same situation where in we are trying to capture the changed data in a number of tables of Oracle APPS.

    Jonathan, would be very helpful if you could shed some light on Cos’s question.

    Thanks,
    Mahad.

    Comment by Mahad — March 3, 2007 @ 6:20 am UTC Mar 3,2007 | Reply

  12. Mahad,
    that’s precisely it. ;-)

    Comment by Cos — March 3, 2007 @ 3:55 pm UTC Mar 3,2007 | Reply

  13. Cos, based on the comments you and Mahad have made, I suspect the problem is not the tablescans, but the fact that you are effectively doing a Cartesian join across all these tables, and only discarding data at the very last step.

    I’d guess that with the 8-table requirement, you are probably doing 7 consecutive hash joins – and it’s the joins which are probably far more labour-intensive than the tablescans themselves.

    You could test the effect of an 8-way union all of 8 joins, putting the last_update_date predicate on just one of the tables in each subquery. This might give you 8 relatively small result sets that could then be sorted for uniqueness

    Bear in mind that an extract process that tries to depend on a predicate like “last_update_date > timestamp” could “lose” updates because of read-consistency. There’s a few comments about the problem in Oracle Insights – Tales of the Oak Table”.

    Comment by Jonathan Lewis — March 4, 2007 @ 10:03 pm UTC Mar 4,2007 | Reply

  14. hi Jonathan,
    thanks much for the reply.
    Unfortunately, the table scans was **also** the problem. Yes, I tried the multi-way UNION ALL as a workaround and yes, that would have been faster but had not entirely completed the test as I feared doing the unions N-times would have been significantly more expensive ;-( If I get the selection (pre-discarding) step fast, I think we’ll be fine as there won’t be that many records involved then. Thanks for the book pointer, I have it signed by none other than JL ;-) but I guess I have not read that entirely either

    Back to the drawing board ;-)

    Comment by Cos — March 5, 2007 @ 1:10 pm UTC Mar 5,2007 | Reply

  15. Cos, part of the assumption of the union all approach is that you have have to do something inefficient to get the driving rows from one table, but the joins to the other seven will then be along the PK/FK indexes that represent the referential integrity. If those joins are still going to be inefficient then there isn’t a lot you can do, other than some infrastructure changes.

    Comment by Jonathan Lewis — March 5, 2007 @ 1:14 pm UTC Mar 5,2007 | Reply

  16. you’re correct Jonathan, yes, the purpose of the UNION ALL, “AND”-ing the date column one table at a time worked fine and was quite fast after “then-walking-along PK’s” … I was just concerned about the amount of code that was to be written as well (complexity, management etc), but I guess, in these instances, you can’t have your cake and eat it all as well… sometimes, you have to write a lot more code to achieve performance. Like some famous Oracle guru says :-) — LOL — sometimes you have gently nudge, you have to “coach” Oracle into “understanding” better what you’re trying to do … worst part is, many times, developers, moreso end users, don’t have a clue of what they’re trying to accomplish to begin with!!! … :-) ok… enough with the ranting.

    Comment by cos — March 5, 2007 @ 3:08 pm UTC Mar 5,2007 | Reply

  17. Interesting. Looks like “lnnvl” can be nested. I ran this on Oracle 9.2.0.7.0:

    select ‘Line 1′ as is_true
    from dual
    where lnnvl(1=null)
    union
    select ‘Line 2′ as is_true
    from dual
    where lnnvl(1=0)
    union
    select ‘Line 3′ as is_true
    from dual
    where lnnvl(1=1)
    union
    select ‘Line 4′ as is_true
    from dual
    where lnnvl(lnnvl(1=null))
    union
    select ‘Line 5′ as is_true
    from dual
    where lnnvl(lnnvl(1=0))
    union
    select ‘Line 6′ as is_true
    from dual
    where lnnvl(lnnvl(1=1))

    Comment by Robert Shepard — March 5, 2007 @ 9:55 pm UTC Mar 5,2007 | Reply

  18. Looks like the bug with “NOT LNNVL()” is Bug 4211527. Originally filed with 10.1.0.3 over two years ago. No workaround posted. What a pain. Back to the “OR” statements for me.

    Comment by Don Seiler — April 4, 2007 @ 7:33 pm UTC Apr 4,2007 | Reply

  19. Don,
    Thanks for letting us know about this bug.

    Comment by Jonathan Lewis — April 8, 2007 @ 3:55 pm UTC Apr 8,2007 | Reply

  20. I would say, a workaround for the “NOT LNNVL()” Bug 4211527
    is using “lnnvl(lnnvl())” instead of “not lnnvl()”

    did some tests and it looks great for me.
    for example

    select null from dual
    where lnnvl(lnnvl(exists(select null from dual where lnnvl(null=null))));

    do I miss something ?
    why don’t they propose it as workaround ?

    Comment by Sokrates — June 28, 2007 @ 2:22 pm UTC Jun 28,2007 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a comment

Blog at WordPress.com.