Oracle Scratchpad

Conditional SQL – 5

Here’s a note that has been sitting around for more than 3 years (the draft date is Jan 2015), waiting for me to finish it off; and in that time we’ve got a new version of Oracle that changes the solution to the problem it presented. (I also managed to write “Conditional SQL –  6” in the intervening period !)

This posting started with a question on the OTN (now ODC) database forum about an execution plan used by 11.2.0.3.  Here’s a model to represent the data and the query:


rem
rem     Script:         null_plan_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2015
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

create table catentry as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment here to avoid format issue
)
select
        rownum  catentry_id,
        case
                when mod(rownum-1,100) > 0 then mod(rownum-1,100)
        end     member_id,
        case
                when trunc((rownum-1)/100) > 0 then trunc((rownum-1)/100)
        end     partnumber,
        rpad('x',100)   padding
from
        generator,
        generator
where
        rownum <= 100 * 100 -- > comment here to avoid format issue
;

execute dbms_stats.gather_table_stats(user,'catentry');

create unique index cat_i0 on catentry(member_id, partnumber) compress 1;
--  create        index cat_i1 on catentry(member_id, partnumber, 0) compress 1;
--  create        index cat_i2 on catentry(partnumber, member_id, 0) compress 1;

variable b1 number
variable b2 number
variable b3 number
variable b4 number

begin
        :b1 := 22;
        :b2 := 1;
        :b3 := 44;
        :b4 := 1;
end;
/

select
        catentry_id
from
        catentry
where
        (   partnumber= :b1
         or (0 = :b2 and partnumber is null)
        )
and     (    member_id= :b3
         or (0 = :b4 and member_id is null)
        )
;

select * from table(dbms_xplan.display_cursor);

==============================================================================

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |       |       |    23 (100)|          |
|*  1 |  TABLE ACCESS FULL| CATENTRY |     1 |    10 |    23   (5)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter((("PARTNUMBER"=:B1 OR ("PARTNUMBER" IS NULL AND 0=:B2))
              AND ("MEMBER_ID"=:B3 OR ("MEMBER_ID" IS NULL AND 0=:B4))))


The question this example raised was, effectively: “Why didn’t Oracle use bind peeking to work out that the best plan for this query – with these input values – was an index range scan?”

The basic answer to this question is this: “No matter how clever Oracle can be with bind peeking and executions plans it has to produce an execution plan that will give the right answer whatever the supplied values might be.”

The OP was hoping that the optimizer would see :b2 and :b4 were arriving with the value 1, infer that “0 = 1” is always false, and reduce the query predicate to “partnumber =22 and member_id = 44” to produce the following plan:



----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | CAT_I0   |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PARTNUMBER"=22 AND "MEMBER_ID"=44)


But this plan could produce the wrong results if the next execution of the query supplied zeros for :b2 or :b4, so Oracle has to do something more generic. (Bear in mind that adaptive cursor sharing keeps reusing the same execution plan until it detects that the performance for some input values is bad; it doesn’t pre-emptively create new plans based on the incoming values – though in principle it might be possible for the Oracle developers to introduce code that can recognise special cases for predicates of the form “constant1 = constant2”).

If you review the SQL statement you can see that it’s clearly trying to allow the user to find data about member_ids and partnumbers where both, neither, or either value is allowed to be null: a couple of “if – then – else” conditions that should have been handled in the client code have been embedded in the code. As we have seen several times before if you can’t change the client code then you have to hope that Oracle will use some clever transformation to handle the query in sections.

We can infer from various details of the posting that the member_id and partnumber columns were both allowed to be null, so if we want to make sure that Oracle always uses an indexed access path to acquire data for this query we need to have an index which starts with those two columns and then has at least one column which is guaranteed to be non-null so, for example, we could simply drop the current index and replace it with one that has a fixed zero on the end:


create index cat_i1 on catentry(member_id, partnumber, 0) compress 1;

With my particular data set, query, and version of Oracle this didn’t make any difference to the plan. But then I thought about the data definition and realised (and checked) that the index had a terrible clustering_factor, so I dropped it and created it with the first two columns in the opposite order:


create index cat_i2 on catentry(partnumber, member_id, 0) compress 1;

Side note:
You’ll notice that I’ve replaced the original unique index with a non-unique index. This was necessary because there were many rows where both partnumber and member_id were null, so if I want to maintain the logic of the previous unique index I’ll need to add a unique constraint on (member_id, partnumber). It’s possible, of course, that in similar circumstances I might want both indexes – one for the uniqueness and to access the data using only one of the columns, the other to access the data using only the other column.

With this index in place, and unhinted, the plan I got from 11.2.0.4 changed to use concatenation with an impressive four-way split:



------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |       |       |    12 (100)|          |
|   1 |  CONCATENATION                |          |       |       |            |          |
|*  2 |   FILTER                      |          |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   FILTER                      |          |       |       |            |          |
|*  6 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|*  8 |   FILTER                      |          |       |       |            |          |
|*  9 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS BY INDEX ROWID | CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN           | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((0=:B2 AND 0=:B4))
   4 - access("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL)
       filter("MEMBER_ID" IS NULL)
   5 - filter(0=:B2)
   6 - filter((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4)))
   7 - access("PARTNUMBER" IS NULL AND "MEMBER_ID"=:B3)
       filter("MEMBER_ID"=:B3)
   8 - filter(0=:B4)
   9 - filter((LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2)))
  10 - access("PARTNUMBER"=:B1 AND "MEMBER_ID" IS NULL)
  11 - filter(((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4)) AND
              (LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2))))
  12 - access("PARTNUMBER"=:B1 AND "MEMBER_ID"=:B3)



To execute this plan the run-time engine works as follows:

The critical feature of this plan, of course, is that we got it because we have given Oracle an efficient option to find the rows where both member_id and partnumber are null – and that allows the rest of the concatenation options to take place.

Hints and Upgrades

Interestingly, after the clue that 11g would happily use concatenation with a “good enough” index I went back to the example where I’d just added a zero to the existing index and checked to see what would happen if I added a /*+ use_concat */ hint (without any of the qualifying parameters that the hint can now use) and got the same concatenated plan. The fact that the path appeared wasn’t the interesting bit – see if you can spot what is the interesting bit:


------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |       |       |    12 (100)|          |
|   1 |  CONCATENATION                |          |       |       |            |          |
|*  2 |   FILTER                      |          |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   FILTER                      |          |       |       |            |          |
|*  6 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|*  8 |   FILTER                      |          |       |       |            |          |
|*  9 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS BY INDEX ROWID | CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN           | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------


Check the cost, and compare it with the cost of the full tablescan. The hinted path has a lower cost than the default path. I think this may be another case of an “unknowable” range scan being ignored in favour of a known alternative.

Finally, we get to today – when I tested the code against 12.1.0.2 and 12.2.0.1. Nothing exciting happened in 12.1.0.2 – the plans were just like the 11g plans, but here’s the plan I got in 12.2 with the “bad” indexing (original column order with added zero column – index cat_i1) without any hints in the SQL:



----------------------------------------------------------------------------------------------------------
| Id  | Operation			       | Name		 | Rows  | Bytes | Cost (%CPU)| Time	 |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		       |		 |	 |	 |     9 (100)| 	 |
|   1 |  VIEW				       | VW_ORE_37A7142B |     4 |    52 |     9   (0)| 00:00:01 |
|   2 |   UNION-ALL			       |		 |	 |	 |	      | 	 |
|   3 |    TABLE ACCESS BY INDEX ROWID	       | CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN		       | CAT_I1 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  5 |    FILTER			       |		 |	 |	 |	      | 	 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN		       | CAT_I1 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  8 |    FILTER			       |		 |	 |	 |	      | 	 |
|   9 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN		       | CAT_I1 	 |     1 |	 |     1   (0)| 00:00:01 |
|* 11 |    FILTER			       |		 |	 |	 |	      | 	 |
|  12 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     3   (0)| 00:00:01 |
|* 13 |      INDEX RANGE SCAN		       | CAT_I1 	 |     1 |	 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      ...
      OR_EXPAND(@"SEL$1" (1) (2) (3) (4))
      ...
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1)
   5 - filter(0=:B4)
   7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1)
       filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3)))
   8 - filter(0=:B2)
  10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL)
       filter(LNNVL("PARTNUMBER"=:B1))
  11 - filter((0=:B4 AND 0=:B2))
  13 - access("MEMBER_ID" IS NULL AND "PARTNUMBER" IS NULL)
       filter(("PARTNUMBER" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND LNNVL("MEMBER_ID"=:B3)))


The full tablescan didn’t appear – but it wasn’t eliminated by concatenation but by the “new” 12.2  variant: “OR EXPANSION”. In this case the net effect is remarkably similar – we still have filter operations comparing :b2 and :b4 with zero, and we still have a scattering of lnnvl() function calls being used to discard rows we’ve already accessed, but the pattern is slightly different and we have a union all rather than concatenation.

This change prompted me to go back to testing with just the original index (member_id, partnumber – index cat_i0) … which took me back to the full tablescan until I added the hint /*+ or_expand */ to the query to get the following plan:



----------------------------------------------------------------------------------------------------------
| Id  | Operation			       | Name		 | Rows  | Bytes | Cost (%CPU)| Time	 |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		       |		 |	 |	 |    29 (100)| 	 |
|   1 |  VIEW				       | VW_ORE_37A7142B |     4 |    52 |    29   (4)| 00:00:01 |
|   2 |   UNION-ALL			       |		 |	 |	 |	      | 	 |
|   3 |    TABLE ACCESS BY INDEX ROWID	       | CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN		       | CAT_I0 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  5 |    FILTER			       |		 |	 |	 |	      | 	 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN		       | CAT_I0 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  8 |    FILTER			       |		 |	 |	 |	      | 	 |
|   9 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN		       | CAT_I0 	 |     1 |	 |     1   (0)| 00:00:01 |
|* 11 |    FILTER			       |		 |	 |	 |	      | 	 |
|* 12 |     TABLE ACCESS FULL		       | CATENTRY	 |     1 |    10 |    23   (5)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1)
   5 - filter(0=:B4)
   7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1)
       filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3)))
   8 - filter(0=:B2)
  10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL)
       filter(LNNVL("PARTNUMBER"=:B1))
  11 - filter((0=:B4 AND 0=:B2))
  12 - filter(("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND
	      LNNVL("MEMBER_ID"=:B3)))


The plan shows “or expansion”, and highlights the most significant difference between concatenation and expansion – concatenation requires indexed access paths in all branches, or-expansion doesn’t.

At first sight this plan with its full tablescan at operation 12 might seem like a bad idea; but check operation 11, the guarding filter, the tablescan occurs only if both :b2 and :b4 are null. Perhaps that special condition is never supposed to appear, perhaps it’s going to do a lot of work whether or not you can use an index. The fact that you can now handle the original problem without adding or altering existing indexes – provided you can inject this or_expand hint – may be of significant benefit. (Of course, being able to recreate the original index with the columns in the reverse order would even avoid the necessity of worrying about the hint.)

tl;dr

Applications that push “if-then-else” decisions into the SQL and down to the optimizer are a pain in the backside; the performance problems they produce can sometimes be bypassed by the addition of extra indexes that might give you plans (possibly hinted) that report the use of the concatentation operation. In 12cR2 the optimizer has an improved strategy for damage limitation “Cost-based Or Expansion” that can produce very similar effects without the addition of extra indexes. These plans will report union all operations, referencing views with names like: VW_ORE_xxxxxxxxx.

When you next upgrade you may find a few cases where you can get rid of indexes that were only created to work around defective application coding strategies. You may also want to hunt down any code where you’ve added use_concat hints and see if they can be removed, or if they should be replaced by or_expand. since the former hint will disable the latter feature.