Oracle Scratchpad

June 2, 2015

Predicate Order

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 7:10 pm BST Jun 2,2015

A recent OTN post demonstrated a very important point about looking at execution plans – especially when you don’t use the right data types. The question was:

We’ve this query which throws invalid number

SELECT * FROM table A
WHERE A.corporate_id IN (59375,54387) AND TRUNC(created_dt) BETWEEN '19-DEC-14' AND '25-DEC-14';

However it works fine if we use not in instead of in

SELECT * FROM table A  
WHERE A.corporate_id  NOT IN (59375,54387) AND TRUNC(created_dt) BETWEEN '19-DEC-14' AND '25-DEC-14';

Please assist.

A follow-up post told us that corporate_id was a varchar() type – so the root cause of the ORA-01722: invalid number error is simply that you shouldn’t be mixing data types. Either the corporate_id should have been defined as numeric or the in-list should have been a list of varchar2() values. (And, of course, the character strings that look like dates should have been converted explicitly to date data types using either the to_date() function with a 4-digit year or the date ‘yyyy-mm-dd’ syntax; and using “created_dt >=  19th Dec and created_dt < 26th Dec” would have given the optimizer a chance to get a better cardinality estimate)

The answer to the slightly more specific problem – why does changing NOT IN to IN allow the query to run rather than crashing – is (probably) one that I first addressed in an article in Oracle Magazine just over eleven years ago: with CPU costing enabled Oracle can change the order in which it applies filter predicates to a table. It’s also a question that can easily be answered by my commonest response to many of the optimizer questions that appear on OTN – look at the execution plan.

In this example it’s a fairly safe bet that there’s a reasonable small volume of data (according to the optimizer’s estimate) where to_number(corporate_id) is one of the required values, and a much larger volume of data where it is not; with some intermediate volume of data where the created_dt falls in the required date range. With CPU costing enabled (optional in 9i, enabled by default in 10g) the optimizer would then do some arithmetic to calculate the most cost-effective order of applying the filter predicates based on things like: the number of CPU cycles it takes to walk along a row to find a particular column. the number of CPU cycles it takes to convert a character column to a number and compare it with a number; the number of CPU cycles it takes truncate a date column and compare it with a string, the number of rows that would pass the numeric test hence requiring the first-applied date test, compared with the number of rows that would survive the first-applied date test hence requiring either the second date test or the numeric test to take place.

Here’s some code to demonstrate the point. It may require the system stats to be set to a particular values to ensure that it is probably repeatable, but there’s probably some flexibility in the range, which is why I’ve called dbms_stats.set_system_stats() in the first few lines:

drop table t1 purge;

create table t1 (
        v1      varchar2(10),
        d1      date
)
;

insert into t1 values(1,'01-Jan-2015');

insert into t1 values('x','02-Jan-2015');

insert into t1 values(3,'03-Jan-2015');
insert into t1 values(4,'04-Jan-2015');
insert into t1 values(5,'05-Jan-2015');
insert into t1 values(6,'06-Jan-2015');
insert into t1 values(7,'07-Jan-2015');
insert into t1 values(8,'08-Jan-2015');
insert into t1 values(9,'09-Jan-2015');
insert into t1 values(10,'10-Jan-2015');

execute dbms_stats.gather_table_stats(user,'t1');

First we create a table, load some data, and gather stats. You’ll notice that I’ve got a varchar2(10) column into which I’ve inserted numbers for all rows except one where it holds the value ‘x’. Now we just run some code to check the execution plans for a couple of queries.


explain plan for
select
        *
from    t1
where   v1 in (4,6)
and     d1 between '03-Jan-2015' and '09-Jan-2015'
;

select * from table(dbms_xplan.display);

explain plan for
select
        *
from    t1
where   v1 not in (4,6)
and     d1 between '03-Jan-2015' and '&1-Jan-2015'
;

select * from table(dbms_xplan.display);

As with the original question I’ve take a query with an IN operator and changed it to NOT IN. The in-list is numeric even though the relevant column is varchar2(10). The first query crashes with ORA-01722: invalid number, the second one runs and returns the correct result. You’ll notice, of course, that the “bad” value for v1 is not in the set of rows
where d1 is between 3rd and 9th Jan 2015. You’ll also notice that in my code I’ve used &1 for the end day in the query with the NOT IN clause so that I can re-run the query a few times to show the effects of changing the date range. Here are the execution plans – first with the IN clause:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     2 |    20 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    20 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter((TO_NUMBER("V1")=4 OR TO_NUMBER("V1")=6) AND
              "D1">=TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "D1"<=TO_DATE(' 2015-01-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The optimizer arithmetic predicts 2 rows returned using a full tablescan – it doesn’t know the query is going to crash. Notice the predicate information, though. The first predicate says Oracle will attempt to convert v1 to a number and compare it with 4 and then (if the first test fails) with 6. The query will crash as soon as it hits a row with a non-numeric value for v1. In outline, the optimizer has decided that the numeric conversion and test is very cheap (on CPU) and only a few rows will survive to take the more expensive date comparison; wherease either of the (expensive) date comparisons would leave a lot of rows that would still have to be checked with the numeric test. It makes sense to do the numeric comparison first.

Here’s the plan for the query with the NOT IN clause when I set the date range to be 3rd Jan to 7th Jan.


Execution plan for NOT IN:  7th Jan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     5 |    50 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     5 |    50 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1"<=TO_DATE(' 2015-01-07 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D1">=TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND TO_NUMBER("V1")<>4 AND TO_NUMBER("V1")<>6)

The plan is still a full tablescan – there are no indexes available – and the estimated number of rows has gone up to 5. The important thing, though, is the predicate section. In this case the optimizer has decided that the first thing it will apply is the (relatively expensive) predicate “d1 >= 3rd Jan” before worrying about the “NOT IN” numeric predicate. The optimizer has worked out that almost all the data will survive the NOT IN predicate, so it’s not efficient to apply it before using other predicates that eliminate more data.

By a stroke of luck my simple example happened to be a very good example. Here’s what happened when I set the end date to 8th Jan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     6 |    60 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     6 |    60 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1">=TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "D1"<=TO_DATE(' 2015-01-08 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND TO_NUMBER("V1")<>4 AND TO_NUMBER("V1")<>6)

The estimated rows has gone up to 6 – but the interesting thing, as before, is the predicate section: in the previous example Oracle did the tests in the order “upper bound”, “lower bound”, “numeric”; in this test it has done “lower bound”, “upper bound”, “numeric”.

And this is what I got when I ran the test with 9th Jan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     7 |    70 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     7 |    70 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1">=TO_DATE(' 2015-01-03 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND TO_NUMBER("V1")<>4 AND TO_NUMBER("V1")<>6 AND
              "D1"<=TO_DATE(' 2015-01-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Again the estimated rows has gone up by one, but the ever-interesting predicate section now shows the evaluation order as: “lower bound”, “numeric”, “upper bound”.

There are only 6 possible orders for the predicate evaluation for the query with the NOT IN clause, and we’ve seen three of them. Three will fail, three will succeed – and I got all three of the successful orders. It wouldn’t take much fiddling around with the data (careful choice of duplicate values, variation in ranges of low and high values, and so on) and I could find a data set where small changes in the requested date range would allow me to reproduce all six variations. In fact when I changed my nls_date_format to “dd-mon-yy” and used a 2 digit year for testing I got two of the three possible failing predicate evaluation orders – “numeric”, “lower bound”, “higher bound” and “higher bound”, “numeric”, “lower bound” without changing the data set. (To be able to get all six orders with a single data set I’d probably need a data set where the “bad” v1 value corresponded to a d1 value somewhere mear the middle of the d1 range.)

The bottom line – use the correct data types; make sure your date literals are dates with a 4-digit year; check the predicate section to see if the optimizer did any implicit conversions with your predicates and what order it used them in. If you don’t do this you may find that a query can work perfectly for months, then crash because you finally got unlucky with the arithmetic.

6 Comments »

  1. Hi Jonathan,
    Is there a way to force (or change) optimizer’s evaluation order ? I mean ,for example, force optimizer to use numeric evaluation first (or last) .

    Ugurcan

    Comment by Murphy — June 2, 2015 @ 8:17 pm BST Jun 2,2015 | Reply

    • Like a hint + ordered_predicates deprecated in 10g

      Comment by Murphy — June 2, 2015 @ 9:40 pm BST Jun 2,2015 | Reply

    • Murphy/Ugurcan

      I think you’ve answered your own question – the only option I know of is the /*+ ordered_predicates */ hint. I was very disappointed when that hint was deprecated, but possibly that happened because it’s behaviour was hard to predict for a typical user – after all, what would the hint mean if Oracle did some predicate transformation that introduced or eliminated some predicates, what would it mean if Oracle unnested a subquery, what would it mean if a rewrite turned what you thought was going to be an access predicate into a filter predicate ?

      Technically there is another “unpredictable” option. If you disable CPU costing (hint /*+ no_cpu_costing */, or set hidden parameter _optimizer_cost_model to ‘io’) the optimizer evaluates (filter) predicates from the top down (whatever that turns into after query transformation).

      I wouldn’t want to see either approach used in production, but I’d iike to see some method of taking control of predicate order for the special cases where the optimizer is going to get it wrong.

      Comment by Jonathan Lewis — June 3, 2015 @ 9:17 am BST Jun 3,2015 | Reply

  2. If there is an enhancement I’d like to see done to Oracle RDBMS, it’s that we get a flag to CLEARLY identify implicit data-type conversions in predicates via a specific warning code.
    Or alternatively, a flag to disable ALL implicit conversions.
    999 times out of 1000, the implicit conversion is the direct cause of performance problems, if not plain wrong results. It’s a badly needed enhancement.
    I first requested it over 20 years ago, only to be told “not possible, it’d break lots of code out there”.
    Hardly the case with a flag, but who am I to split hairs with the reasoning of the “geniuses” at Oracle?

    Comment by Noons — June 3, 2015 @ 2:41 am BST Jun 3,2015 | Reply

    • Noons,

      Funnily enough there’s been a (not even hidden) parameter you can set to get that effect since 9.2 (if not earlier) – but only for nchar/char conversion:

      nls_nchar_conv_excp: NLS raise an exception instead of allowing implicit conversion

      Comment by Jonathan Lewis — June 3, 2015 @ 9:25 am BST Jun 3,2015 | Reply

      • Yeah, I used that in 9.2 when I was working with 3 Korean alphabet URLs back in 2006.
        But what’s really needed is one that stops implicit conversion for ALL data types.
        That alone would resolve so many “performance problems” with Peoplesoft and JDE…
        And I’m willing to bet whatever the old Financials is called nowadays would also benefit from it.

        Comment by Noons — June 4, 2015 @ 12:51 am BST Jun 4,2015 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.