Oracle Scratchpad

June 2, 2015

Predicate Order

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 7:10 pm BST Jun 2,2015

A recent OTN post demonstrated a very important point about looking at execution plans – especially when you don’t use the right data types. The question was:

We’ve this query which throws invalid number

SELECT * FROM table A
WHERE A.corporate_id IN (59375,54387) AND TRUNC(created_dt) BETWEEN '19-DEC-14' AND '25-DEC-14';

However it works fine if we use not in instead of in

SELECT * FROM table A  
WHERE A.corporate_id  NOT IN (59375,54387) AND TRUNC(created_dt) BETWEEN '19-DEC-14' AND '25-DEC-14';

Please assist.

A follow-up post told us that corporate_id was a varchar2() type – so the root cause of the ORA-01722: invalid number error is simply that you shouldn’t be mixing data types. Either corporate_id should have been defined as numeric or the in-list should have been a list of varchar2() values. (And, of course, the character strings that look like dates should have been converted explicitly to date data types, using either the to_date() function with a 4-digit year or the date ‘yyyy-mm-dd’ literal syntax. Another little point to remember is that for date-only usage the pair of predicates “created_dt >= 19th Dec 2014 and created_dt < 26th Dec 2014” – note the strictly less than on the following date – would have given the optimizer a chance to get a slightly better cardinality estimate.)
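To see why the strictly-less-than upper bound matters when the column may hold a time component, here’s a small sketch of the half-open range – in Python rather than SQL, purely for illustration:

```python
# Sketch (not Oracle syntax): the date-only range 19th-25th Dec 2014
# written as a half-open interval, so rows with a time component late
# on the 25th are still included and no trunc() is needed on the column.
from datetime import datetime

lo = datetime(2014, 12, 19)   # created_dt >= 19th Dec 2014
hi = datetime(2014, 12, 26)   # created_dt < 26th Dec 2014 (strictly less than)

def in_range(created_dt):
    # A plain range predicate on the raw column - the SQL equivalent
    # leaves any index on created_dt usable by the optimizer.
    return lo <= created_dt < hi

print(in_range(datetime(2014, 12, 25, 23, 59)))  # True - late on the 25th
print(in_range(datetime(2014, 12, 26, 0, 0)))    # False - midnight of the 26th
```

The same half-open shape is what the explicit to_date() version of the SQL predicate would express.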

The answer to the slightly more specific problem – why does changing NOT IN to IN allow the query to run rather than crashing – is (probably) one that I first addressed in an article in Oracle Magazine just over eleven years ago. With CPU costing enabled (optional in 9i, enabled by default in 10g) Oracle can change the order in which it applies filter predicates to a table. It’s also a question that can easily be answered by my commonest response to many of the optimizer questions that appear on OTN – “look at the execution plan”.

In this example it’s a fairly safe bet that there’s a reasonably small volume of data (according to the optimizer’s estimate) where to_number(corporate_id) is one of the required values, a much larger volume of data where it is not, and some intermediate volume of data where the created_dt falls in the required date range. With CPU costing enabled the optimizer would then do some arithmetic to calculate the most cost-effective order of applying the filter predicates, based on things like:

  • the number of CPU cycles it takes to walk along a row to find a particular column;
  • the number of CPU cycles it takes to convert a character column to a number and compare it with a number;
  • the number of CPU cycles it takes to truncate a date column and compare it with a string;
  • the number of rows that would pass the numeric test and then require the first date test to be applied, compared with the number of rows that would survive the first date test and then require either the second date test or the numeric test to take place.
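The flavour of that arithmetic can be captured in a toy model – the per-row costs and selectivities below are invented for illustration and are not Oracle’s internal figures:

```python
# Toy model of predicate-order costing: each filter has a per-row CPU
# cost and a selectivity (fraction of rows surviving). The cost of an
# order is the cost of applying each test to the rows that survived the
# earlier tests. All numbers here are made up for illustration.
from itertools import permutations

predicates = {
    "to_number(v1) test": (5.0, 0.02),   # cheap-ish and very selective (IN case)
    "date lower bound":   (8.0, 0.60),
    "date upper bound":   (8.0, 0.60),
}

def order_cost(order, n_rows=10000):
    cost, surviving = 0.0, float(n_rows)
    for name in order:
        per_row, selectivity = predicates[name]
        cost += surviving * per_row      # pay the test on every surviving row
        surviving *= selectivity         # only this fraction goes forward
    return cost

# IN case: the numeric test kills almost everything, so it goes first.
best = min(permutations(predicates), key=order_cost)
print(best)

# NOT IN case: almost every row survives the numeric test, so the date
# predicates become the cheaper rows-eliminated-per-cycle choice.
predicates["to_number(v1) test"] = (5.0, 0.98)
best_not_in = min(permutations(predicates), key=order_cost)
print(best_not_in)
```

With these made-up numbers the model reproduces the behaviour described above: IN puts the numeric test first, NOT IN pushes it last.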

Here’s some code to demonstrate the point. It may require your system stats to be adjusted to a particular set of values to ensure that it is repeatable (though there’s probably some flexibility in the range)  and that’s why I’ve called dbms_stats.set_system_stats() in the first few lines:

rem
rem     Script:         predicate_order_3.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2015
rem
rem     Last tested 
rem             11.2.0.4
rem

drop table t1 purge;

begin
        dbms_stats.set_system_stats('MBRC',16);
        dbms_stats.set_system_stats('MREADTIM',10);
        dbms_stats.set_system_stats('SREADTIM',5);
        dbms_stats.set_system_stats('CPUSPEED',1000);
exception
        when others then null;
end;
/

create table t1 (
        v1      varchar2(10),
        d1      date
)
;

insert into t1 values(1,'01-Jan-2015');

insert into t1 values('x','02-Jan-2015');

insert into t1 values(3,'03-Jan-2015');
insert into t1 values(4,'04-Jan-2015');
insert into t1 values(5,'05-Jan-2015');
insert into t1 values(6,'06-Jan-2015');
insert into t1 values(7,'07-Jan-2015');
insert into t1 values(8,'08-Jan-2015');
insert into t1 values(9,'09-Jan-2015');
insert into t1 values(10,'10-Jan-2015');

execute dbms_stats.gather_table_stats(user,'t1');

First we create a table, load some data, and gather stats. You’ll notice that I’ve got a varchar2(10) column into which I’ve inserted numbers for all rows except one where it holds the value ‘x’. Now we just run some code to check the execution plans for a couple of queries.


explain plan for
select
        *
from    t1
where   v1 in (4,6)
and     d1 between '03-Jan-2015' and '09-Jan-2015'
;

select * from table(dbms_xplan.display);

explain plan for
select
        *
from    t1
where   v1 not in (4,6)
and     d1 between '03-Jan-2015' and '&1-Jan-2015'
;

select * from table(dbms_xplan.display);

As with the original question I’ve taken a query with an IN operator and changed it to NOT IN. The in-list is numeric even though the relevant column is varchar2(10). The first query crashes with ORA-01722: invalid number, the second one runs and returns the correct result. You’ll notice, of course, that the “bad” value for v1 is not in the set of rows where d1 is between 3rd and 9th Jan 2015. You’ll also notice that in my code I’ve used &1 for the end day in the query with the NOT IN clause so that I can re-run the query a few times to show the effects of changing the date range. Here are the execution plans – first with the IN clause:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     2 |    20 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    20 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter((TO_NUMBER("V1")=4 OR TO_NUMBER("V1")=6) AND
              "D1" .ge. TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "D1" .le. TO_DATE(' 2015-01-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

NOTE: I’ve changed the “less/greater than or equal to” symbols to .le. and .ge. to avoid a WordPress format issue.

The optimizer arithmetic predicts 2 rows returned using a full tablescan – it doesn’t know the query is going to crash. Take note of the Predicate Information section, though. The first predicate says Oracle will attempt to convert v1 to a number and compare it with 4 and then (if the first test fails) with 6. The query will crash as soon as it hits a row with a non-numeric value for v1. In outline, the optimizer has decided that the numeric conversion and test is very cheap (on CPU) and only a few rows will survive to take the more expensive date comparison; whereas either of the (expensive) date comparisons would leave a lot of rows that would still have to be checked with the numeric test. It makes sense to do the numeric comparison first.
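The short-circuit effect can be sketched outside the database – a Python analogue, with ValueError standing in for ORA-01722:

```python
# Why evaluation order decides whether the query crashes: the ('x', ...)
# row raises on numeric conversion, just as Oracle raises ORA-01722, but
# a date filter applied first removes that row before the conversion is
# ever attempted. Data mirrors the demo table: 'x' is dated 2nd Jan.
from datetime import date

rows = [("4", date(2015, 1, 4)), ("x", date(2015, 1, 2)), ("6", date(2015, 1, 6))]

def numeric_in(row):
    return int(row[0]) in (4, 6)          # raises ValueError on 'x'

def date_range(row):
    return date(2015, 1, 3) <= row[1] <= date(2015, 1, 9)

# Numeric predicate first (the IN plan's order): crashes on the 'x' row.
try:
    [r for r in rows if numeric_in(r) and date_range(r)]
    crashed = False
except ValueError:
    crashed = True
print(crashed)       # True

# Date predicate first: the 'x' row is filtered out before conversion.
survivors = [r for r in rows if date_range(r) and numeric_in(r)]
print(survivors)     # the '4' and '6' rows
```

The same predicates, the same data – only the evaluation order differs.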

Here’s the plan for the query with the NOT IN clause when I set the date range to be 3rd Jan to 7th Jan.


Execution plan for NOT IN:  7th Jan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     5 |    50 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     5 |    50 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1" .le. TO_DATE(' 2015-01-07 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D1" .ge. TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND TO_NUMBER("V1") != 4 AND TO_NUMBER("V1") != 6)

NOTE: I’ve replaced Oracle’s choice of symbols for “not equal” with “!=” to avoid a WordPress format issue.

The plan is still a full tablescan – there are no indexes available – and the estimated number of rows has gone up to 5. The important thing, though, is the Predicate Information section again. In this case the optimizer has decided that the first thing it will apply is the (relatively expensive) date predicate “d1 <= 7th Jan” before worrying about the NOT IN numeric predicate. The optimizer has worked out that almost all the data will survive the NOT IN predicate, so it’s not efficient to apply it before using other predicates that eliminate more data.

By a stroke of luck my simple example happened to be a very good example. Here’s what happened when I set the end date to 8th Jan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     6 |    60 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     6 |    60 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1" .ge.TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D1" .le. TO_DATE(' 2015-01-08 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND TO_NUMBER("V1") != 4 AND TO_NUMBER("V1") != 6)

The estimated rows has gone up to 6 – but the interesting thing, as before, is the Predicate Information section: in the previous example Oracle did the tests in the order “date upper bound”, “date lower bound”, “numeric”; in this test it has done “date lower bound”, “date upper bound”, “numeric”.

And this is what I got when I ran the test with 9th Jan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     7 |    70 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     7 |    70 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1" .ge. TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND TO_NUMBER("V1") != 4 AND TO_NUMBER("V1") != 6 AND
              "D1" .le. TO_DATE(' 2015-01-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Again the estimated rows has gone up by one, but the ever-interesting Predicate Information section now shows the evaluation order as: “date lower bound”, “numeric”, “date upper bound”.

There are only 6 possible orders for the predicate evaluation for the query with the NOT IN clause, and we’ve seen three of them. Three will fail, three will succeed – and I got all three of the successful orders. It wouldn’t take much fiddling around with the data (careful choice of duplicate values, variation in ranges of low and high values, and so on) to find a data set where small changes in the requested date range would reproduce all six variations. In fact when I changed my nls_date_format to “dd-mon-yy” and used a 2-digit year for testing I got two of the three possible failing predicate evaluation orders – “numeric”, “date lower bound”, “date upper bound” and “date upper bound”, “numeric”, “date lower bound” – without changing the data set. (To be able to get all six orders with a single data set I’d probably need a data set where the “bad” v1 value corresponded to a d1 value somewhere near the middle of the d1 range.)
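For this particular data set the six orders can be enumerated mechanically – again a Python analogue of the evaluation, not Oracle internals:

```python
# Enumerate the 3! = 6 evaluation orders for the NOT IN query against
# this data set (the bad value 'x' is dated 2nd Jan, i.e. below the
# lower bound of the 3rd-9th range): any order that reaches the numeric
# test before the date lower bound has excluded the 'x' row will crash.
from datetime import date
from itertools import permutations

rows = [(str(i), date(2015, 1, i)) for i in range(3, 11)]
rows.append(("x", date(2015, 1, 2)))     # the bad row, outside the date range

tests = {
    "numeric":          lambda r: int(r[0]) not in (4, 6),
    "date lower bound": lambda r: r[1] >= date(2015, 1, 3),
    "date upper bound": lambda r: r[1] <= date(2015, 1, 9),
}

failing = 0
for order in permutations(tests):
    try:
        # all() short-circuits, mimicking left-to-right filter evaluation
        [r for r in rows if all(tests[name](r) for name in order)]
    except ValueError:                   # the ORA-01722 analogue
        failing += 1
print(failing)   # 3 of the 6 orders crash
```

Because only the lower-bound predicate excludes the bad row, exactly the three orders that evaluate the numeric test before the lower bound fail – matching the three-fail/three-succeed split above.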

Bottom Line

  • use the correct data types
  • make sure your date literals are dates with a 4-digit year
  • use explicit conversion formats for dates, don’t rely on the default setting
  • check the Predicate Information section to see if the optimizer did any implicit conversions with your predicates and what order it used them in.

If you don’t follow these rules you may find that a query will work perfectly for months, then crash because you finally got unlucky with the optimizer arithmetic.

 

7 Comments »

  1. Hi Jonathan,
    Is there a way to force (or change) optimizer’s evaluation order ? I mean ,for example, force optimizer to use numeric evaluation first (or last) .

    Ugurcan

    Comment by Murphy — June 2, 2015 @ 8:17 pm BST Jun 2,2015 | Reply

    • Something like the ordered_predicates hint, which was deprecated in 10g?

      Comment by Murphy — June 2, 2015 @ 9:40 pm BST Jun 2,2015 | Reply

    • Murphy/Ugurcan

      I think you’ve answered your own question – the only option I know of is the /*+ ordered_predicates */ hint. I was very disappointed when that hint was deprecated, but possibly that happened because its behaviour was hard to predict for a typical user – after all, what would the hint mean if Oracle did some predicate transformation that introduced or eliminated some predicates? What would it mean if Oracle unnested a subquery, or if a rewrite turned what you thought was going to be an access predicate into a filter predicate?

      Technically there is another “unpredictable” option. If you disable CPU costing (hint /*+ no_cpu_costing */, or set hidden parameter _optimizer_cost_model to ‘io’) the optimizer evaluates (filter) predicates from the top down (whatever that turns into after query transformation).

      I wouldn’t want to see either approach used in production, but I’d like to see some method of taking control of predicate order for the special cases where the optimizer is going to get it wrong.

      Comment by Jonathan Lewis — June 3, 2015 @ 9:17 am BST Jun 3,2015 | Reply

  2. If there is an enhancement I’d like to see done to Oracle RDBMS, it’s that we get a flag to CLEARLY identify implicit data-type conversions in predicates via a specific warning code.
    Or alternatively, a flag to disable ALL implicit conversions.
    999 times out of 1000, the implicit conversion is the direct cause of performance problems, if not plain wrong results. It’s a badly needed enhancement.
    I first requested it over 20 years ago, only to be told “not possible, it’d break lots of code out there”.
    Hardly the case with a flag, but who am I to split hairs with the reasoning of the “geniuses” at Oracle?

    Comment by Noons — June 3, 2015 @ 2:41 am BST Jun 3,2015 | Reply

    • Noons,

      Funnily enough there’s been a (not even hidden) parameter you can set to get that effect since 9.2 (if not earlier) – but only for nchar/char conversion:

      nls_nchar_conv_excp: NLS raise an exception instead of allowing implicit conversion

      Comment by Jonathan Lewis — June 3, 2015 @ 9:25 am BST Jun 3,2015 | Reply

      • Yeah, I used that in 9.2 when I was working with 3 Korean alphabet URLs back in 2006.
        But what’s really needed is one that stops implicit conversion for ALL data types.
        That alone would resolve so many “performance problems” with Peoplesoft and JDE…
        And I’m willing to bet whatever the old Financials is called nowadays would also benefit from it.

        Comment by Noons — June 4, 2015 @ 12:51 am BST Jun 4,2015 | Reply

  3. […] Best answer: on predicate order – see https://jonathanlewis.wordpress.com/2015/06/02/predicate-order-2/ […]

    Pingback by Oracle 12c inline view evaluation – 算法网 — July 28, 2019 @ 3:47 am BST Jul 28,2019 | Reply



Comments and related questions are welcome.
