Oracle Scratchpad

March 2, 2018

Conditional SQL – 5

Filed under: Conditional SQL,Execution plans,Hints,Indexing,Oracle — Jonathan Lewis @ 12:49 pm BST Mar 2,2018

Here’s a note that has been sitting around for more than 3 years (the draft date is Jan 2015), waiting for me to finish it off; and in that time we’ve got a new version of Oracle that changes the solution to the problem it presented. (I also managed to write “Conditional SQL –  6” in the intervening period !)

This posting started with a question on the OTN (now ODC) database forum about an execution plan used by 11.2.0.3.  Here’s a model to represent the data and the query:

rem
rem     Script:         null_plan_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2015
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

create table catentry as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment here to avoid format issue
)
select
        rownum  catentry_id,
        case
                when mod(rownum-1,100) > 0 then mod(rownum-1,100)
        end     member_id,
        case
                when trunc((rownum-1)/100) > 0 then trunc((rownum-1)/100)
        end     partnumber,
        rpad('x',100)   padding
from
        generator,
        generator
where
        rownum <= 100 * 100 -- > comment here to avoid format issue
;

execute dbms_stats.gather_table_stats(user,'catentry');

create unique index cat_i0 on catentry(member_id, partnumber) compress 1;
--  create        index cat_i1 on catentry(member_id, partnumber, 0) compress 1;
--  create        index cat_i2 on catentry(partnumber, member_id, 0) compress 1;

variable b1 number
variable b2 number
variable b3 number
variable b4 number

begin
        :b1 := 22;
        :b2 := 1;
        :b3 := 44;
        :b4 := 1;
end;
/

select
        catentry_id
from
        catentry
where
        (   partnumber= :b1
         or (0 = :b2 and partnumber is null)
        )
and     (    member_id= :b3
         or (0 = :b4 and member_id is null)
        )
;

select * from table(dbms_xplan.display_cursor);

==============================================================================

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |       |       |    23 (100)|          |
|*  1 |  TABLE ACCESS FULL| CATENTRY |     1 |    10 |    23   (5)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter((("PARTNUMBER"=:B1 OR ("PARTNUMBER" IS NULL AND 0=:B2))
              AND ("MEMBER_ID"=:B3 OR ("MEMBER_ID" IS NULL AND 0=:B4))))

The question this example raised was, effectively: “Why didn’t Oracle use bind peeking to work out that the best plan for this query – with these input values – was an index range scan?”

The basic answer to this question is this: “No matter how clever Oracle can be with bind peeking and executions plans it has to produce an execution plan that will give the right answer whatever the supplied values might be.”

The OP was hoping that the optimizer would see :b2 and :b4 were arriving with the value 1, infer that “0 = 1” is always false, and reduce the query predicate to “partnumber =22 and member_id = 44” to produce the following plan:


----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | CAT_I0   |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PARTNUMBER"=22 AND "MEMBER_ID"=44)

But this plan could produce the wrong results if the next execution of the query supplied zeros for :b2 or :b4, so Oracle has to do something more generic. (Bear in mind that adaptive cursor sharing keeps reusing the same execution plan until it detects that the performance for some input values is bad; it doesn’t pre-emptively create new plans based on the incoming values – though in principle it might be possible for the Oracle developers to introduce code that can recognise special cases for predicates of the form “constant1 = constant2”).

If you review the SQL statement you can see that it’s clearly trying to allow the user to find data about member_ids and partnumbers where both, neither, or either value is allowed to be null: a couple of “if – then – else” conditions that should have been handled in the client code have been embedded in the code. As we have seen several times before if you can’t change the client code then you have to hope that Oracle will use some clever transformation to handle the query in sections.

We can infer from various details of the posting that the member_id and partnumber columns were both allowed to be null, so if we want to make sure that Oracle always uses an indexed access path to acquire data for this query we need to have an index which starts with those two columns and then has at least one column which is guaranteed to be non-null so, for example, we could simply drop the current index and replace it with one that has a fixed zero on the end:

create index cat_i1 on catentry(member_id, partnumber, 0) compress 1;

With my particular data set, query, and version of Oracle this didn’t make any difference to the plan. But then I thought about the data definition and realised (and checked) that the index had a terrible clustering_factor, so I dropped it and created it with the first two columns in the opposite order:

create index cat_i2 on catentry(partnumber, member_id, 0) compress 1;

Side note:
You’ll notice that I’ve replaced the original unique index with a non-unique index. This was necessary because there were many rows where both partnumber and member_id were null, so if I want to maintain the logic of the previous unique index I’ll need to add a unique constraint on (member_id, partnumber). It’s possible, of course, that in similar circumstances I might want both indexes – one for the uniqueness and to access the data using only one of the columns, the other to access the data using only the other column.

With this index in place, and unhinted, the plan I got from 11.2.0.4 changed to use concatenation with an impressive four-way split:


------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |       |       |    12 (100)|          |
|   1 |  CONCATENATION                |          |       |       |            |          |
|*  2 |   FILTER                      |          |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   FILTER                      |          |       |       |            |          |
|*  6 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|*  8 |   FILTER                      |          |       |       |            |          |
|*  9 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS BY INDEX ROWID | CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN           | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((0=:B2 AND 0=:B4))
   4 - access("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL)
       filter("MEMBER_ID" IS NULL)
   5 - filter(0=:B2)
   6 - filter((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4)))
   7 - access("PARTNUMBER" IS NULL AND "MEMBER_ID"=:B3)
       filter("MEMBER_ID"=:B3)
   8 - filter(0=:B4)
   9 - filter((LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2)))
  10 - access("PARTNUMBER"=:B1 AND "MEMBER_ID" IS NULL)
  11 - filter(((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4)) AND
              (LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2))))
  12 - access("PARTNUMBER"=:B1 AND "MEMBER_ID"=:B3)


To execute this plan the run-time engine works as follows:

  • Operation 2: If :b2 and :b4 are both zero we use the index to find the rows where member_id and partnumber are null (the filter “member_id is null” seems to be redundant)
  • Operation 5: if :b2 is zero we use the index to find rows where the partnumber is null and the member_id is the supplied value (and if that’s null the access will immediately return zero rows because of the equality predicate), and discard any rows that have already been returned by operation 2
  • Operation 8: if :b4 is zero we will use the index to find rows where the partnumber is the supplied value (and if the partnumber is null, that access will immediately return zero rows because of the equality predicate) and the member_id is null, and discard any rows that have already been returned by operation 2.
  • Operations 11 and 12 will always run – using the index to find rows that match with equality on both the incoming member_id and partnumber, discarding any rows already returned by the previous operations, and obviously not matching any rows where either column “IS” null.

The critical feature of this plan, of course, is that we got it because we have given Oracle an efficient option to find the rows where both member_id and partnumber are null – and that allows the rest of the concatenation options to take place.

Hints and Upgrades

Interestingly, after the clue that 11g would happily use concatenation with a “good enough” index I went back to the example where I’d just added a zero to the existing index and checked to see what would happen if I added a /*+ use_concat */ hint (without any of the qualifying parameters that the hint can now use) and got the same concatenated plan. The fact that the path appeared wasn’t the interesting bit – see if you can spot what is the interesting bit:

------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |       |       |    12 (100)|          |
|   1 |  CONCATENATION                |          |       |       |            |          |
|*  2 |   FILTER                      |          |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   FILTER                      |          |       |       |            |          |
|*  6 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|*  8 |   FILTER                      |          |       |       |            |          |
|*  9 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS BY INDEX ROWID | CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN           | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Check the cost, and compare it with the cost of the full tablescan. The hinted path has a lower cost than the default path. I think this may be another case of an “unknowable” range scan being ignored in favour of a known alternative.

Finally, we get to today – when I tested the code against 12.1.0.2 and 12.2.0.1. Nothing exciting happened in 12.1.0.2 – the plans were just like the 11g plans, but here’s the plan I got in 12.2 with the “bad” indexing (original column order with added zero column – index cat_i1) without any hints in the SQL:


----------------------------------------------------------------------------------------------------------
| Id  | Operation			       | Name		 | Rows  | Bytes | Cost (%CPU)| Time	 |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		       |		 |	 |	 |     9 (100)| 	 |
|   1 |  VIEW				       | VW_ORE_37A7142B |     4 |    52 |     9   (0)| 00:00:01 |
|   2 |   UNION-ALL			       |		 |	 |	 |	      | 	 |
|   3 |    TABLE ACCESS BY INDEX ROWID	       | CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN		       | CAT_I1 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  5 |    FILTER			       |		 |	 |	 |	      | 	 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN		       | CAT_I1 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  8 |    FILTER			       |		 |	 |	 |	      | 	 |
|   9 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN		       | CAT_I1 	 |     1 |	 |     1   (0)| 00:00:01 |
|* 11 |    FILTER			       |		 |	 |	 |	      | 	 |
|  12 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     3   (0)| 00:00:01 |
|* 13 |      INDEX RANGE SCAN		       | CAT_I1 	 |     1 |	 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      ...
      OR_EXPAND(@"SEL$1" (1) (2) (3) (4))
      ...
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1)
   5 - filter(0=:B4)
   7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1)
       filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3)))
   8 - filter(0=:B2)
  10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL)
       filter(LNNVL("PARTNUMBER"=:B1))
  11 - filter((0=:B4 AND 0=:B2))
  13 - access("MEMBER_ID" IS NULL AND "PARTNUMBER" IS NULL)
       filter(("PARTNUMBER" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND LNNVL("MEMBER_ID"=:B3)))

The full tablescan didn’t appear – but it wasn’t eliminated by concatenation but by the “new” 12.2  variant: “OR EXPANSION”. In this case the net effect is remarkably similar – we still have filter operations comparing :b2 and :b4 with zero, and we still have a scattering of lnnvl() function calls being used to discard rows we’ve already accessed, but the pattern is slightly different and we have a union all rather than concatenation.

This change prompted me to go back to testing with just the original index (member_id, partnumber – index cat_i0) … which took me back to the full tablescan until I added the hint /*+ or_expand */ to the query to get the following plan:


----------------------------------------------------------------------------------------------------------
| Id  | Operation			       | Name		 | Rows  | Bytes | Cost (%CPU)| Time	 |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		       |		 |	 |	 |    29 (100)| 	 |
|   1 |  VIEW				       | VW_ORE_37A7142B |     4 |    52 |    29   (4)| 00:00:01 |
|   2 |   UNION-ALL			       |		 |	 |	 |	      | 	 |
|   3 |    TABLE ACCESS BY INDEX ROWID	       | CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN		       | CAT_I0 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  5 |    FILTER			       |		 |	 |	 |	      | 	 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN		       | CAT_I0 	 |     1 |	 |     1   (0)| 00:00:01 |
|*  8 |    FILTER			       |		 |	 |	 |	      | 	 |
|   9 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY	 |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN		       | CAT_I0 	 |     1 |	 |     1   (0)| 00:00:01 |
|* 11 |    FILTER			       |		 |	 |	 |	      | 	 |
|* 12 |     TABLE ACCESS FULL		       | CATENTRY	 |     1 |    10 |    23   (5)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1)
   5 - filter(0=:B4)
   7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1)
       filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3)))
   8 - filter(0=:B2)
  10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL)
       filter(LNNVL("PARTNUMBER"=:B1))
  11 - filter((0=:B4 AND 0=:B2))
  12 - filter(("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND
	      LNNVL("MEMBER_ID"=:B3)))

The plan shows “or expansion”, and highlights the most significant difference between concatenation and expansion – concatenation requires indexed access paths in all branches, or-expansion doesn’t.

At first sight this plan with its full tablescan at operation 12 might seem like a bad idea; but check operation 11, the guarding filter, the tablescan occurs only if both :b2 and :b4 are null. Perhaps that special condition is never supposed to appear, perhaps it’s going to do a lot of work whether or not you can use an index. The fact that you can now handle the original problem without adding or altering existing indexes – provided you can inject this or_expand hint – may be of significant benefit. (Of course, being able to recreate the original index with the columns in the reverse order would even avoid the necessity of worrying about the hint.)

tl;dr

Applications that push “if-then-else” decisions into the SQL and down to the optimizer are a pain in the backside; the performance problems they produce can sometimes be bypassed by the addition of extra indexes that might give you plans (possibly hinted) that report the use of the concatentation operation. In 12cR2 the optimizer has an improved strategy for damage limitation “Cost-based Or Expansion” that can produce very similar effects without the addition of extra indexes. These plans will report union all operations, referencing views with names like: VW_ORE_xxxxxxxxx.

When you next upgrade you may find a few cases where you can get rid of indexes that were only created to work around defective application coding strategies. You may also want to hunt down any code where you’ve added use_concat hints and see if they can be removed, or if they should be replaced by or_expand. since the former hint will disable the latter feature.

 

June 22, 2016

Conditional SQL- 6

Filed under: Conditional SQL,Execution plans,Oracle,Performance — Jonathan Lewis @ 2:16 pm BST Jun 22,2016

An odd little anomaly showed up on the OTN database forum a few days ago where a query involving a table covered by Oracle Label Security (OLS) seemed to wrap itself into a non-mergeable view when written using traditional Oracle SQL, but allowed for view-merging when accessed through ANSI standard SQL. I don’t know why there’s a difference but it did prompt a thought about non-mergeable views and what I’ve previously called “conditional SQL” – namely SQL which holds a predicate that should have been tested in the client code and not passed to the database engine.

The thought was this – could the database engine decide to do a lot of redundant work if you stuck a silly predicate inside a non-mergeable view: the answer turns out to be yes. Here’s a demonstration I’ve run on 11g and 12c:


rem
rem     Script:         conditional_fail.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum,1e5)         n1,
        lpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum,1e3)         n1,
        lpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;
-- gather simple statistics (not needed in 12c) 

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index t2_i1 on t2(id);

variable b1 number
variable b2 number

exec :b1 := 1; :b2 := 0

There’s nothing terribly significant about the data, beyond the fact that I’ve got a “small” table and a “large” table that I can use to encourage the optimizer to do a hash join. I’ve also created a couple of bind variables and set them to values that ensure that we can see that b1 is definitely not smaller than b2. So here’s a simple query – with a mergeable inline view in the first instance which is then hinted to make the view non-mergeable.


select
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;

select
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select /*+ no_merge */ t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;

Clearly, for our values of b1 and b2, the query will not return any data. In fact we can go further and say that the presence of the “bind variable predicate” in the inline view either has no effect on the volume of data returned or it eliminates all the data. But the presence of the no_merge hint makes a difference to how much work Oracle does for the “no data” option. Here are the two plans, pulled from the memory of an 11g instance after enabling rowsource execution statistics – first when the view is mergeable:


-----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |      1 |        |   508 (100)|      0 |00:00:00.01 |
|   1 |  SORT GROUP BY NOSORT          |       |      1 |      1 |   508   (2)|      0 |00:00:00.01 |
|*  2 |   FILTER                       |       |      1 |        |            |      0 |00:00:00.01 |
|   3 |    NESTED LOOPS                |       |      0 |    100 |   508   (2)|      0 |00:00:00.01 |
|   4 |     NESTED LOOPS               |       |      0 |    100 |   508   (2)|      0 |00:00:00.01 |
|*  5 |      TABLE ACCESS FULL         | T1    |      0 |    100 |   208   (4)|      0 |00:00:00.01 |
|*  6 |      INDEX RANGE SCAN          | T2_I1 |      0 |      1 |     2   (0)|      0 |00:00:00.01 |
|   7 |     TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |     3   (0)|      0 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:B1<:B2)
   5 - filter("T1"."N1"=0)
   6 - access("T2"."ID"="T1"."ID")

Notice how, despite t2 being the second table in the join, the bind variable predicate has worked its way to the top of the execution plan and execution has terminated after the run-time engine has determined that 1 is not less than zero.

Compare this with the plan when the view is non-mergeable:

------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |      1 |        |  2300 (100)|      0 |00:00:00.01 |    1599 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |         |      1 |      1 |  2300   (4)|      0 |00:00:00.01 |    1599 |       |       |          |
|*  2 |   HASH JOIN            |         |      1 |    100 |  2300   (4)|      0 |00:00:00.01 |    1599 |  2061K|  2061K| 1109K (0)|
|   3 |    JOIN FILTER CREATE  | :BF0000 |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|*  4 |     TABLE ACCESS FULL  | T1      |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |    VIEW                |         |      1 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
|*  6 |     FILTER             |         |      1 |        |            |      0 |00:00:00.01 |       0 |       |       |          |
|   7 |      JOIN FILTER USE   | :BF0000 |      0 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2      |      0 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)
   6 - filter(:B1<:B2)
   8 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

Thanks to the no_merge hint the bind variable predicate has not been promoted to the top of the plan, so the run-time engine has produced a plan that requires it to access data from table t1 before visiting table t2. In fact the optimizer has decided to do a hash join gathering all the relevant data from t1 and building an in-memory hash table before deciding that 1 is greater than zero and terminating the query.

Be careful if you write SQL that compares bind variables (or other pseudo-constants such as calls to sys_context) with bind variables (etc.); you may find that you’ve managed to produce code that forces the optimizer to do work that it could have avoided if only it had found a way of doing that comparison at the earliest possible moment.

These plans were from 11g, but 12c can behave the same way although, with my specific data set, I had to add the no_push_pred() hint to the query to demonstrate the effect of hash join appearing.

Footnote 1

An interesting side effect of this plan is that it has also allowed a Bloom filter to appear in a serial hash join – not something you’d normally expect to see, so I changed the predicate to :b1 > :b2 to see if the no_merge hint combined with the Bloom filter was faster than merging without the Bloom filter. Here are the two plans, pulled from memory. Running this on 11g I had to hint the hash join when I removed the no_merge hint:


Non-mergeable view - Bloom filter appears
------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |      1 |        |  2300 (100)|      1 |00:00:00.12 |   17725 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |         |      1 |      1 |  2300   (4)|      1 |00:00:00.12 |   17725 |       |       |          |
|*  2 |   HASH JOIN            |         |      1 |    100 |  2300   (4)|    100 |00:00:00.12 |   17725 |  2061K|  2061K| 1106K (0)|
|   3 |    JOIN FILTER CREATE  | :BF0000 |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|*  4 |     TABLE ACCESS FULL  | T1      |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |    VIEW                |         |      1 |   1000K|  2072   (3)|   1605 |00:00:00.10 |   16126 |       |       |          |
|*  6 |     FILTER             |         |      1 |        |            |   1605 |00:00:00.09 |   16126 |       |       |          |
|   7 |      JOIN FILTER USE   | :BF0000 |      1 |   1000K|  2072   (3)|   1605 |00:00:00.08 |   16126 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2      |      1 |   1000K|  2072   (3)|   1605 |00:00:00.07 |   16126 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)
   6 - filter(:B1<B2)
   8 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

View merging allowed - no Bloom filter
-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |  2300 (100)|      1 |00:00:07.56 |   17725 |       |       |          |
|   1 |  SORT GROUP BY NOSORT|      |      1 |      1 |  2300   (4)|      1 |00:00:07.56 |   17725 |       |       |          |
|*  2 |   FILTER             |      |      1 |        |            |    100 |00:00:07.56 |   17725 |       |       |          |
|*  3 |    HASH JOIN         |      |      1 |    100 |  2300   (4)|    100 |00:00:07.56 |   17725 |  2061K|  2061K| 1446K (0)|
|*  4 |     TABLE ACCESS FULL| T1   |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |     TABLE ACCESS FULL| T2   |      1 |   1000K|  2072   (3)|   1000K|00:00:01.94 |   16126 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:B1<:B2)
   3 - access("T2"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)

Things to note – the Bloom filter eliminated all but 1,605 rows from the tablescan before passing them to the hash join operation to probe the hash table; the run time of the query without filtering was 7.56 seconds (!) compared to 0.12 seconds with the Bloom filter. Fortunately I decided that this was too good to be true BEFORE I published the results and decided to re-run the tests with statistics_level set back to typical and found that most of the difference was CPU time spent on collecting rowsource execution statistics. The query with the Bloom filter was still faster, but only just – the times were more like 0.09 seconds vs. 0.12 seconds.

Footnote 2

The source of the problem on OTN was that as well as using OLS the query in question included a user-defined function. Since you can write a user-defined function that “spies” on the data content and uses (e.g.) dbms_output to write data to the terminal this poses a security risk; if a predicate calling that function executed before the security predicate had been tested then your function could output data that your query shouldn’t be able to report. To avoid this security loophole Oracle restricts the way it merges views (unless you set optimizer_secure_view_merging to false). For a more detailed explanation and demonstration of the issues, see this item on Christian Antognini’s blog.

I still don’t know why the ANSI form of the query managed to bypass this issue, but the predicate with the user-defined function was applied as a filter at the very last step of the plan, so perhaps there was something about the transformation Oracle took to get from ANSI syntax to its internal syntax (with cascading lateral views) that made it possible for the optimizer to recognize and eliminate the security threat efficiently.

January 2, 2014

Conditional SQL – 4

Filed under: Conditional SQL,Execution plans,Oracle — Jonathan Lewis @ 6:14 pm BST Jan 2,2014

This is one of those posts where the investigation is left as an exercise – it’s not difficult, just something that will take a little time that I don’t have, and just might end up with me chasing half a dozen variations (so I’d rather not get sucked into looking too closely). It comes from an OTN question which ends up reporting this predicate:

WHERE ( LENGTH ( :b7) IS NULL OR
         UPPER (TRIM (CODFSC)) = UPPER (TRIM ( :b8)) or
         UPPER (TRIM (CODUIC)) = UPPER (TRIM ( :b9)))
       AND STATE = 0;

The three bind variables all hold the same value; there is a function-based index on upper(trim(codfsc)), and another on upper(trim(coduic)). The execution plan for this query is a full tablescan, but if you eliminate the first predicate Oracle can do a concatenation of two index range scans. This variation doesn’t surprise me, the optimizer’s ability to introduce concatenation is limited; however, I did wonder whether some small variation in the SQL would allow the optimizer to get just a little more clever.

Would you get concatenation if you changed the first predicate to (:b7 is null); if not, would a similar query that didn’t depend on function-based indexes do concatenation; if not is there any rewrite of this query that could do a tablescan ONLY for the case where :b7 was null ?

Demonstrations of any levels of success can be left in the comments if anyone’s interested. To get a fixed font that preserves space start the code with “sourcecode” and end with “/sourcecode” (removing the quotation marks and replacing them with square brackets).

August 13, 2013

NVL()

Filed under: CBO,Conditional SQL,Execution plans,NULL,Oracle — Jonathan Lewis @ 7:14 am BST Aug 13,2013

Here’s a little detail about how the optimizer can handle the nvl() function that I hadn’t noticed before (and it goes back to at least 8i). This is running on 11.2.0.3, and table t1 is just all_objects where rownum <= 20000:

(more…)

October 31, 2010

Conditional SQL – 3

Filed under: CBO,Conditional SQL,Execution plans,Tuning — Jonathan Lewis @ 5:54 pm BST Oct 31,2010

I’ve spent the last week in Spain – sightseeing, rather than working – with a minimum amount of access to the Internet.

Inevitably I now have to work hard to catch up with my email. As a moment of light relief in an otherwise irritating chore I thought I’d respond to an emailed request for help. (Regular readers of the blog will know that I don’t usually respond to private email requests for solutions, but sometimes someone gets lucky.)

The question was basically this:  why do I get different execution plans for the following two statements:

(more…)

February 14, 2007

Conditional SQL (2)

Filed under: CBO,Conditional SQL,Execution plans,Indexing,Tuning — Jonathan Lewis @ 8:48 am BST Feb 14,2007

Some time ago I wrote a note on conditional SQL, commenting on code with predicates like:

         where	t1.n1 = nvl(:n1,t1.n1)

In that article, I showed how Oracle can transform a query of this type using concatenation so that at run time one of two possible execution paths can be taken. But, as often happens,  a simple example leads to more complex questions.

What happens if you have several similar predicates in your query – does Oracle keep doubling the number of possible execution paths for the query: if so, how far does it go, if not are there any side effects to consider? What happens with more complex queries, including joins.
(more…)

January 9, 2007

Conditional SQL

Filed under: CBO,Conditional SQL,Execution plans — Jonathan Lewis @ 7:57 pm BST Jan 9,2007

Here’s a simplified example of the type of SQL that I often used to see coming from screen generators:


rem
rem     Script:         null_plan.sql
rem     Author:         Jonathan Lewis
rem     Dated:          February 2003
rem

create table t1 (
        n1              number(5),
        n2              number(5),
        v1              varchar2(10),
        v2              varchar2(10),
        v3              varchar2(10),
        v4              varchar2(10),
        v5              varchar2(10),
        padding         varchar2(100),
        constraint t1_pk primary key(n1)
);

insert into t1
select
        rownum,
        rownum,
        rownum,
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        rpad('x',100)
from all_objects
where
        rownum <= 1000 -- > comment to avoid WordPress format issue
;

--  gather stats

select 	*
from 	t1
where	t1.n1 = nvl(:n1,t1.n1)
;

This code is basically trying to say: “if the user supplies an input value for some on-screen field then return data for that value, otherwise return all the data unconditionally.”
(more…)

Powered by WordPress.com.