Oracle Scratchpad

January 9, 2007

Conditional SQL

Filed under: CBO,Conditional SQL,Execution plans — Jonathan Lewis @ 7:57 pm BST Jan 9,2007

Here’s a simplified example of the type of SQL that I often used to see coming from screen generators:

select 	*
from 	t1
where	t1.n1 = nvl(:n1,t1.n1)
;

This code is basically trying to say: “if the user supplies an input value for some on-screen field then return data for that value, otherwise return all the data unconditionally.”

Here’s its execution plan (from a 10.2.0.1 instance):


Execution Plan
----------------------------------------------------------
Plan hash value: 3794124282      

-----------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |  1001 |   121K|     6 |
|   1 |  CONCATENATION                |       |       |       |       |
|*  2 |   FILTER                      |       |       |       |       |
|*  3 |    TABLE ACCESS FULL          | T1    |  1000 |   121K|     4 |
|*  4 |   FILTER                      |       |       |       |       |
|   5 |    TABLE ACCESS BY INDEX ROWID| T1    |     1 |   124 |     2 |
|*  6 |     INDEX UNIQUE SCAN         | T1_PK |     1 |       |     1 |
-----------------------------------------------------------------------      

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:N1 IS NULL)
   3 - filter("T1"."N1" IS NOT NULL)
   4 - filter(:N1 IS NOT NULL)
   6 - access("T1"."N1"=:N1)      

Note how you can see two access paths embedded in a single plan: one is a full tablescan, the other is an index access path. Each sub-plan has a filter parent, and the results of the two filter operations are simply concatenated at operation 1 of the plan.

When we check the predicate information section of the report to see what’s going on in the two filter operations, we see that one filter operates the predicate “:N1 IS NULL”, and the other operates the predicate “:N1 IS NOT NULL”. Clearly only one of those two filters can be true. So what’s the optimizer trying to tell us with this plan?

It’s just another example of a “Non-execution plan”. At run-time, when Oracle gets to the filter at line 2 it decides whether or not to run the sub-plan at line 3; similarly when it gets to line 4 it will decide whether or not to run the sub-plan at line 5.

Effectively the optimizer has generated an 'if-then-else' execution plan: "if :n1 is null then do a tablescan else do an index unique scan". (Historically, the only safe option was always to do the tablescan.)

The same trick appears for predicates like: “n1 = decode(:n1, null, n1, :n1)”, although it doesn’t (currently) appear for the (apparently) equivalent “n1 = nvl2(:n1, :n1, n1)”.

There are all sorts of ramifications and interesting little side effects to this feature (which appeared somewhere in the middle of 8i, I think, when the parameter _or_expand_nvl_predicate was set to true), and I’ll talk about a couple of them in a later posting. In the meantime, I’ll leave you with just one thought.

I said at the top of the page that screen painters often generate code of this sort because they want to return either "the requested items" or "all the data". Did that statement seem to make sense when you read it? Look very carefully at line 3 of the predicate information section.

When you do the full tablescan, you check that n1 is not null – in other words you may lose some rows unless n1 has been declared as not null. This isn’t something new, by the way, but in the improved output, with the cunning plan, it’s suddenly very explicit. Maybe your screen painter has been giving you wrong results for years!
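The row-loss is easy to reproduce outside Oracle. Here is a minimal sketch using SQLite (via Python's sqlite3) as a stand-in, where ifnull() plays the role of Oracle's nvl():

```python
import sqlite3

# Minimal sketch of the row-loss described above. SQLite stands in for
# Oracle; ifnull() plays the role of nvl().
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (n1 integer)")
conn.executemany("insert into t1 values (?)", [(1,), (2,), (None,)])

# With a non-null bind value, only the matching row comes back.
matched = conn.execute(
    "select n1 from t1 where n1 = ifnull(?, n1)", (1,)
).fetchall()

# With a NULL bind value the predicate collapses to n1 = n1, which
# evaluates to NULL (i.e. not true) for the row where n1 is NULL --
# so that row is silently lost.
unmatched = conn.execute(
    "select n1 from t1 where n1 = ifnull(?, n1)", (None,)
).fetchall()

print(matched)    # [(1,)]
print(unmatched)  # only the rows 1 and 2 survive; the NULL row is gone
```

Unless n1 is declared NOT NULL, "return all the data" quietly becomes "return all the data where n1 is not null".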

[Further reading on Conditional SQL]

25 Comments »

  1. Jonathan, I wish it was just the screen painters. Many people code queries like this without knowing they can return wrong results.

    Comment by Yas — January 10, 2007 @ 9:07 am BST Jan 10,2007 | Reply

  2. Ahhh yes, those pesky null’s…
    Wish Mr. NULL was still here to witness this.

    Comment by Toon Koppelaars — January 10, 2007 @ 9:08 am BST Jan 10,2007 | Reply

  3. @Jonathan

    I don’t get the point when you say that we can lose rows…

    Are you suggesting that the correct filter session should be something like:

    Predicate Information (identified by operation id):
    ---------------------------------------------------
    2 - filter(:N1 IS NULL)
    3 - filter("T1"."N1" IS NULL)
    4 - filter(:N1 IS NOT NULL)
    6 - access("T1"."N1"=:N1)

    ???

    Comment by Antonio — January 10, 2007 @ 9:31 am BST Jan 10,2007 | Reply

  4. Antonio,

    you lose the rows with NULL in t1.n1 (in the case :n1 := NULL):

    "where t1.n1 = nvl(:n1,t1.n1)" => "where NULL = NULL" => not selected

    Jan

    Comment by Jan — January 10, 2007 @ 9:51 am BST Jan 10,2007 | Reply

  5. @Jan

    Yep!

    I missed it too! LOL :)

    Comment by Antonio — January 10, 2007 @ 10:03 am BST Jan 10,2007 | Reply

  6. Wish Mr. NULL was still here to witness this…

    Indeed, how E.F. Codd must be spinning, not to mention Chris Date & Fabian Pascal in the present…

    Comment by SeanMacGC — January 10, 2007 @ 11:50 am BST Jan 10,2007 | Reply

  7. Jonathan,

    As I can see there are two filter plans kept ready, and used by the optimizer at run time. Could you please explain the reasoning for the following behaviour:

    Case 1: Without using nvl:

    SQL> explain plan for
    2 select *
    3 from t1
    4 where t1.id = 3
    5 /

    Explained.

    SQL> select * from table(dbms_xplan.display)
    2 /

    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------------------------------
    Plan hash value: 770639542

    -------------------------------------------------------------------------------------
    | Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |       |     1 |   218 |     2   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   218 |     2   (0)| 00:00:01 |
    |*  2 |   INDEX UNIQUE SCAN         | PK_ID |     1 |       |     1   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

    2 - access("T1"."ID"=3)

    14 rows selected.

    Case 2: Using nvl function
    ===================

    SQL> explain plan for
    2 select *
    3 from t1
    4 where t1.id = nvl(3,t1.id)
    5 /

    Explained.

    SQL> select * from table(dbms_xplan.display)
    2 /

    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------------------------------
    Plan hash value: 3617692013

    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      |     1 |   218 |    77   (6)| 00:00:01 |
    |*  1 |  TABLE ACCESS FULL| T1   |     1 |   218 |    77   (6)| 00:00:01 |
    --------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

    1 - filter("T1"."ID"=NVL(3,"T1"."ID"))

    13 rows selected.

    In both cases I am using the literal value 3 instead of a bind variable, so in case 2, when it knows the value is not null, why isn't the optimizer inclined to use the index? Is it because the right-hand side is a function (uses nvl)? If so, how does the optimizer manage to use the index correctly in your original example? I think I need to read the chapter on the 10053 trace file; I haven't reached it yet.

    FYI, table/index created using:

    create table t1
    as
    select rownum as id,
           rpad('X', 200) as large_vc,
           rpad('Y', 10) as small_vc,
           trunc(dbms_random.value(1,50)) as another_id
    from dual
    connect by level <= ...        -- row limit lost when the "<=" was swallowed by the blog formatting
    /

    exec dbms_stats.gather_table_stats(ownname => 'scott', tabname => 't1', cascade => true)

    Thanks
    Vinay

    p.s. - Since this is not a forum, I'm not sure if it's right to ask questions here, so feel free to delete this comment if it doesn't belong.

    Comment by Vinay Pai — January 10, 2007 @ 1:50 pm BST Jan 10,2007 | Reply

  8. I don’t usually use NVL for this – I’m a little more explicit:
    SELECT *
    FROM T1
    WHERE (:N1 IS NULL OR N1=:N1)

    However, I’m guessing that it produces the same filtered execution plan. Is my syntax better? Is there a better way?

    Comment by John Flack — January 10, 2007 @ 2:07 pm BST Jan 10,2007 | Reply

  9. Vinay, this is where my standard answer comes in: I don’t know, I didn’t write the spec or the code. At a guess, maybe the programmer responsible for this trick didn’t think that anyone would be daft enough to do what you’ve just done ;-)

    John, funnily enough, your code doesn’t benefit from this path. It seems to have been engineered very precisely for the nvl() or its decode() equivalent.

    Comment by Jonathan Lewis — January 10, 2007 @ 7:25 pm BST Jan 10,2007 | Reply
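As an aside, the nvl() form and the explicit OR form in comment 8 are not even logically equivalent when n1 can be null. A small sketch (SQLite standing in for Oracle, ifnull() for nvl()) makes the difference visible:

```python
import sqlite3

# SQLite sketch of the semantic gap between the two predicates: with a
# NULL bind, the OR form returns the rows where n1 is NULL, which the
# nvl() form silently drops.
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (n1 integer)")
conn.executemany("insert into t1 values (?)", [(1,), (2,), (None,)])

nvl_rows = conn.execute(
    "select n1 from t1 where n1 = ifnull(?, n1)", (None,)
).fetchall()
or_rows = conn.execute(
    "select n1 from t1 where ? is null or n1 = ?", (None, None)
).fetchall()

print(nvl_rows)  # two rows: the n1 IS NULL row has been dropped
print(or_rows)   # all three rows, including the NULL
```

So the choice between the two forms is not just about which access path the optimizer can engineer; they can return different results.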

  10. Hello Mr. Lewis,
    I read this post two days ago, then went back to tune one of my statements involving a cartesian join. I also took a suggestion from one of your reviews on the freelists forum about allowing dynamic sampling on global temporary tables.

    I have a query which is a four-table join (I didn't post it here because it is very big), of which two tables are global temporary tables. I collected statistics for all the permanent objects through dbms_stats and found the temporary tables were dynamically sampled.

    It took about 571 seconds to complete, with a lot of LIO. Looking at the plan below, I was shocked. (Contd after the explain plan)
    call     count       cpu    elapsed       disk      query    current       rows
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    Parse        1      0.00       0.00          0          0          0          0
    Execute      1    585.46     571.70          0   47010070       2567      45573
    Fetch        0      0.00       0.00          0          0          0          0
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    total        2    585.46     571.70          0   47010070       2567      45573

    Misses in library cache during parse: 1
    Misses in library cache during execute: 1
    Optimizer mode: ALL_ROWS
    Parsing user id: 1659 (recursive depth: 1)


    Rows     Row Source Operation
    -------  ---------------------------------------------------
    45573   CONCATENATION (cr=47009355 time=571078341 us)
    44646     NESTED LOOPS (cr=20139778 time=243938885 us)
    6705797     HASH JOIN (cr=5265 time=27045119 us)
    12733         MERGE JOIN CARTESIAN (cr=14 time=102745 us)
    119             TABLE ACCESS FULL VP_REBATE_CONTROL_DTL (cr=7 time=574 us)
    12733           BUFFER SORT (cr=7 time=39840 us)
    107               TABLE ACCESS FULL GTT_REB_GRP_HDLE_RATE (cr=7 time=362 us)
    62671         TABLE ACCESS FULL GTT_WORKING_RC_DTL_LINE (cr=5251 time=542544 us)
    44646       TABLE ACCESS BY INDEX ROWID VP_RC_HDR (cr=20134513 time=181737957 us)
    6705797       INDEX UNIQUE SCAN VP_RC_HDR_PKEY (cr=13411596 time=87037255 us)
    927       NESTED LOOPS (cr=26869577 time=326887266 us)
    8947488     HASH JOIN (cr=5265 time=35934122 us)
    7854          MERGE JOIN CARTESIAN (cr=14 time=63688 us)
    66              TABLE ACCESS FULL GTT_REB_GRP_HDLE_RATE (cr=7 time=460 us)
    7854            BUFFER SORT (cr=7 time=24797 us)
    119               TABLE ACCESS FULL VP_REBATE_CONTROL_DTL (cr=7 time=391 us)
    135568        TABLE ACCESS FULL GTT_WORKING_RC_DTL_LINE (cr=5251 time=1160783 us)
    927         TABLE ACCESS BY INDEX ROWID VP_RC_HDR (cr=26864312 time=242131290 us)
    8947488       INDEX UNIQUE SCAN VP_RC_HDR_PKEY (cr=17894978 time=115944009 us)

    Now I gave the hint /*+ ordered */ to force the required join order, and it took about 5 seconds. Here is the plan. I am wondering how the optimizer avoided this plan (I have read one of your articles on the optimizer with four-table joins and the permutations it considers, but I seriously don't remember any of it now).

    call     count       cpu    elapsed       disk      query    current       rows
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    Parse        1      0.00       0.00          0          0          0          0
    Execute      1      5.95       5.84          0       8184       2568      45573
    Fetch        0      0.00       0.00          0          0          0          0
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    total        2      5.95       5.84          0       8184       2568      45573

    Misses in library cache during parse: 1
    Misses in library cache during execute: 1
    Optimizer mode: ALL_ROWS
    Parsing user id: 1659 (recursive depth: 1)

    Rows     Row Source Operation
    -------  ---------------------------------------------------
    45573    HASH JOIN (cr=7667 time=5349910 us)
    173        TABLE ACCESS FULL GTT_REB_GRP_HDLE_RATE (cr=7 time=593 us)
    85152      HASH JOIN (cr=7660 time=4299599 us)
    119          TABLE ACCESS FULL VP_REBATE_CONTROL_DTL (cr=7 time=522 us)
    85152        HASH JOIN (cr=7653 time=3121384 us)
    3095           TABLE ACCESS BY INDEX ROWID VP_RC_HDR (cr=2402 time=291100 us)
    15310            INDEX SKIP SCAN VP_RC_HDR_NDX3 (cr=168 time=115784 us)
    195946         TABLE ACCESS FULL GTT_WORKING_RC_DTL_LINE (cr=5251 time=878201 us)

    ********************************************************************************

    Thank you very much.
    CT

    Comment by CT — January 12, 2007 @ 5:20 pm BST Jan 12,2007 | Reply

  11. CT, I have tidied up the plans so that other people will be able to read them. For suggestions on working around the problems of putting code into comments, I have written a page linked to by “How to Comment”.

    For details on how to cross-check the actual run-time statistics against the predicted path, see: dbms_xplan in 10g

    For one possible reason for the appearance of cartesian merge joins see: Cartesian Merge Join

    Either execution plan could be perfectly reasonable; the fact that you are shocked by one of them shows that you know the data better than the optimizer does. It has done something that appeared (to it) to be sensible because it has (probably) worked out some bad estimates of cardinality somewhere down the line. Check the predicted cardinalities, and check what the actual predicates were, and you may understand why the plan was so bad.

    In passing, your example is a very nice demonstration of why not all tablescans are bad :-)

    Comment by Jonathan Lewis — January 12, 2007 @ 5:55 pm BST Jan 12,2007 | Reply

  12. Hello Mr. Lewis,
    Thank you very much for the response; I will follow through with dbms_xplan.
    I wanted to post this under the Cartesian Merge Join article but posted it here by mistake. I apologise for that.

    Next time I will make sure I post the code.
    Thank you
    CT

    Comment by CT — January 12, 2007 @ 6:28 pm BST Jan 12,2007 | Reply

  13. Hi,

    I am interested in knowing the solution to this.

    I am thinking of using something like this to get around this issue.

    select *
    from t1
    where nvl(t1.n1,'$') = nvl(:n1,nvl(t1.n1,'$'))
    ;

    PS: n1 is not indexed and is not date datatype column.

    Appreciate your feedback on this.

    Thanks
    KB

    Comment by Kavya — January 13, 2007 @ 8:41 pm BST Jan 13,2007 | Reply

  14. KB – given that you don’t have an index on your n1 column, take a look at note 8 above from John Flack.

    Comment by Jonathan Lewis — January 13, 2007 @ 10:03 pm BST Jan 13,2007 | Reply

  15. [...] plans, CBO — Jonathan Lewis @ 8:48 am UTC Feb 14,2007 Some time ago I wrote a note on conditional SQL, commenting on code with predicates [...]

    Pingback by Conditional SQL (2) « Oracle Scratchpad — February 14, 2007 @ 8:48 am BST Feb 14,2007 | Reply

  16. Jonathan,

    Just wanted to mention that while reading this article I remembered that I saw a similar piece couple of years ago back in the 8i/9i days, which was the reason to use NVL quite extensively in our code:

    http://www.oracle.com/technology/oramag/oracle/03-jul/o43asktom.html

    (scroll down to the “How to Write this Query”)

    Thanks,
    Boris Dali.

    Comment by Boris Dali — March 12, 2007 @ 7:58 pm BST Mar 12,2007 | Reply

  17. … there is a variation of this case, though, that we couldn’t figure out back then, and this article reminded me of that old problem. It was on a project with a highly normalized database model where the main theme was to eliminate redundancy. Everywhere. It probably went a bit too far, as many tables that would normally have a status column didn’t have one; status became a derived column instead. For many queries one of the business rules in effect was that “… only the ACTIVE items are to be displayed”. The definition of active was “any item with the end_date either null or set in the future”. An example of an item could be a new regulation with the expiration date not known at the time of the insert. As a result, about half of the queries on that system looked like:

    where … and ( end_date is null or end_date > sysdate)

    Since this clause had to be repeated many times, developers quickly figured out a typing shortcut:

    where … and nvl(end_date, sysdate+1/1400)>sysdate

    The problem was that at the beginning most of the items were open and consequently index access path based on the above predicate wouldn’t make much sense in many cases as it would need almost every single row from a given table. As the system grew however, many items became “expired” and the above predicate became fairly selective (no archiving of any sort was used for that system).

    Is there a good, generic-enough solution for this problem? Something along the lines of the conditional plan you demonstrated in this article, perhaps? Function-based indexes had just come around back then (beginning of 8i) and we hadn’t tried them out. Would they work in this case?

    Comment by Boris Dali — March 15, 2007 @ 8:10 pm BST Mar 15,2007 | Reply

  18. Boris, I don’t think there’s a generic solution that’s clean and tidy. There’s a general principle, perhaps, to move you in the right direction. You can always add a ‘virtual column’ to any suitable existing indexes – or create an index on just a virtual column – to deal with the null dates, for example:


    create index t1_fbi on t1(nvl(end_date,to_date('31-dec-4000','dd-mon-yyyy')))

    The drawback to this 'year 4000' hack, though, is that it distorts the optimizer's view of your data, so you will need to generate a histogram on the 'virtual column' so that the optimizer knows that the year 4000 data is a very odd outlier. There is a method_opt of "for all hidden columns" that you can use with the dbms_stats procedures to do this.
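A sketch of the same trick using SQLite's expression indexes (standing in for an Oracle function-based index); the sentinel date, table and column names here are illustrative, not from the original system:

```python
import sqlite3

# Sketch of the 'year 4000' trick: index an expression that maps NULL
# end dates to a far-future sentinel, then query with the identical
# expression so the "is null or in the future" test becomes a simple
# range condition on the indexed expression.
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (id integer, end_date text)")
conn.executemany("insert into t1 values (?, ?)",
                 [(1, "2005-01-01"), (2, None), (3, "2006-06-30")])

# Expression index: NULL end_date is replaced by the sentinel.
conn.execute("create index t1_fbi on t1(ifnull(end_date, '4000-12-31'))")

# The predicate must repeat the indexed expression exactly.
active = conn.execute(
    "select id from t1 where ifnull(end_date, '4000-12-31') > '2006-01-01'"
).fetchall()
print(sorted(active))  # [(2,), (3,)] -- the open-ended row and the future-dated row
```

The statistics caveat above still applies: the sentinel is a wild outlier in the indexed expression's value distribution, so the optimizer needs a histogram to cost it sensibly.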

    Comment by Jonathan Lewis — March 15, 2007 @ 9:30 pm BST Mar 15,2007 | Reply

  19. [...] type filter” is a reference to the type of filter I’ve mentioned in a note on Conditional SQL; but this isn’t relevant in this [...]

    Pingback by Odd Filter « Oracle Scratchpad — May 19, 2009 @ 6:40 pm BST May 19,2009 | Reply

  20. [...] Oracle choose to execute different parts of an execution plan – giving another reason for the “conditional plan” strategy I’ve described in the [...]

    Pingback by Connect By « Oracle Scratchpad — June 30, 2009 @ 9:22 am BST Jun 30,2009 | Reply

  21. [...] (the VIEW operator), and this calls the FILTER operation. The filter operation is an example of a "conditional" filter – i.e. if the test in line 5 is true then line 6 is called – and in this case the test is [...]

    Pingback by ANSI Outer « Oracle Scratchpad — January 31, 2011 @ 7:04 pm BST Jan 31,2011 | Reply

  22. […] I suppose it’s not really surprising given the way the optimizer can deal with the over-used predicate “(columnX = nvl(:bind, columnX)” […]

    Pingback by NVL() | Oracle Scratchpad — August 13, 2013 @ 7:14 am BST Aug 13,2013 | Reply

  23. […] In this case (small table, no indexes, requesting most of the table) the only critical detail is in predicate information. In the case of the OP the plan would probably have shown a concatenation operation with two child filter operations to select one of two different run-time paths. […]

    Pingback by Null Quiz | Oracle Scratchpad — December 13, 2013 @ 9:58 am BST Dec 13,2013 | Reply

  24. I’ve been working on a problematic conditional query that looks like this:

    SELECT .. FROM A, B, C WHERE (A.X=1 and A.Y = C.Y) OR (A.X=2 and A.Y = B.Y and B.Z = C.Z)

    As you can see, depending on the value of the column A.X, either A joins directly with C, or it joins indirectly via B.

    I’ve been trying to rewrite this using ANSI inner joins, but I don’t see how this can be done. Of course it can be done if the query is rewritten using UNION,
    but the query plan then requires two passes over the tables, as well as a sort.

    Comment by Jay Turner — February 20, 2014 @ 1:55 pm BST Feb 20,2014 | Reply

    • Jay,

      You’re the second person this week who confused this blog with a forum – I suggest you repeat your question on the OTN database forum.

      Your requirement may require you to drive through A twice. As you’ve written it you’re going to get a Cartesian merge join with B for the first part of the query, and even if you manage to rewrite the query as a join between A and C with a cunning existence or IN subquery against B I think the optimizer will (at best) have to rewrite it as a two-pass query (viz: your UNION) where the first part joins A with C and the second part joins A, B and C.
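One point worth making concrete (a sketch in SQLite, with made-up data following Jay's table names): because B sits in the FROM list but is unconstrained in the first branch of the OR, every branch-1 row is replicated once per row of B, which is the Cartesian join mentioned above. The two-pass UNION ALL rewrite does not have that problem:

```python
import sqlite3

# Demonstrate that the OR form and the UNION ALL rewrite are NOT
# equivalent: branch 1 of the OR form is multiplied by the rows of B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a (x integer, y integer);
create table b (y integer, z integer);
create table c (y integer, z integer);
insert into a values (1, 10), (2, 20);
insert into b values (20, 200), (30, 300);   -- two rows in B
insert into c values (10, 999), (0, 200);
""")

# Original form: B is unconstrained when a.x = 1.
or_rows = conn.execute("""
    select a.x, c.z from a, b, c
    where (a.x = 1 and a.y = c.y)
       or (a.x = 2 and a.y = b.y and b.z = c.z)
""").fetchall()

# Two-pass rewrite: branches are mutually exclusive on a.x, so
# UNION ALL is safe and B only appears where it is actually joined.
union_rows = conn.execute("""
    select a.x, c.z from a, c where a.x = 1 and a.y = c.y
    union all
    select a.x, c.z from a, b, c
    where a.x = 2 and a.y = b.y and b.z = c.z
""").fetchall()

print(sorted(or_rows))     # [(1, 999), (1, 999), (2, 200)] -- branch 1 duplicated per row of B
print(sorted(union_rows))  # [(1, 999), (2, 200)]
```

So the UNION rewrite is not merely a performance workaround; unless B is guaranteed to supply exactly one row, the OR form gives different (duplicated) results.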

      Comment by Jonathan Lewis — February 20, 2014 @ 7:58 pm BST Feb 20,2014 | Reply

