Oracle Scratchpad

March 2, 2018

Conditional SQL – 5

Filed under: Conditional SQL,Execution plans,Hints,Indexing,Oracle — Jonathan Lewis @ 12:49 pm BST Mar 2,2018

Here’s a note that has been sitting around for more than 3 years (the draft date is Jan 2015), waiting for me to finish it off; in that time we’ve got a new version of Oracle that changes the solution to the problem it presented. (I also managed to write “Conditional SQL – 6” in the intervening period!)

This posting started with a question on the OTN (now ODC) database forum about an execution plan used by 11.2.0.3.  Here’s a model to represent the data and the query:

rem
rem     Script:         null_plan_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2015
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

create table catentry as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment here to avoid format issue
)
select
        rownum  catentry_id,
        case
                when mod(rownum-1,100) > 0 then mod(rownum-1,100)
        end     member_id,
        case
                when trunc((rownum-1)/100) > 0 then trunc((rownum-1)/100)
        end     partnumber,
        rpad('x',100)   padding
from
        generator,
        generator
where
        rownum <= 100 * 100 -- > comment here to avoid format issue
;

execute dbms_stats.gather_table_stats(user,'catentry');

create unique index cat_i0 on catentry(member_id, partnumber) compress 1;
--  create        index cat_i1 on catentry(member_id, partnumber, 0) compress 1;
--  create        index cat_i2 on catentry(partnumber, member_id, 0) compress 1;

variable b1 number
variable b2 number
variable b3 number
variable b4 number

begin
        :b1 := 22;
        :b2 := 1;
        :b3 := 44;
        :b4 := 1;
end;
/

select
        catentry_id
from
        catentry
where
        (   partnumber= :b1
         or (0 = :b2 and partnumber is null)
        )
and     (    member_id= :b3
         or (0 = :b4 and member_id is null)
        )
;

select * from table(dbms_xplan.display_cursor);

==============================================================================

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |       |       |    23 (100)|          |
|*  1 |  TABLE ACCESS FULL| CATENTRY |     1 |    10 |    23   (5)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter((("PARTNUMBER"=:B1 OR ("PARTNUMBER" IS NULL AND 0=:B2))
              AND ("MEMBER_ID"=:B3 OR ("MEMBER_ID" IS NULL AND 0=:B4))))

The question this example raised was, effectively: “Why didn’t Oracle use bind peeking to work out that the best plan for this query – with these input values – was an index range scan?”

The basic answer to this question is this: “No matter how clever Oracle can be with bind peeking and execution plans, it has to produce an execution plan that will give the right answer whatever the supplied values might be.”

The OP was hoping that the optimizer would see :b2 and :b4 were arriving with the value 1, infer that “0 = 1” is always false, and reduce the query predicate to “partnumber = 22 and member_id = 44” to produce the following plan:


----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | CAT_I0   |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PARTNUMBER"=22 AND "MEMBER_ID"=44)

But this plan could produce the wrong results if the next execution of the query supplied zeros for :b2 or :b4, so Oracle has to do something more generic. (Bear in mind that adaptive cursor sharing keeps reusing the same execution plan until it detects that the performance for some input values is bad; it doesn’t pre-emptively create new plans based on the incoming values – though in principle it might be possible for the Oracle developers to introduce code that can recognise special cases for predicates of the form “constant1 = constant2”.)
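To make the point concrete, here is a sketch of the second execution that the optimizer has to allow for: the same statement text (so the same shareable cursor), but with the two flag variables set to zero. The bind values are my own choice for illustration.

```sql
rem     Same SQL text as before, so the cached cursor can be shared;
rem     only the bind values change (values chosen for illustration).

begin
        :b1 := 22;
        :b2 := 0;       -- "0 = :b2" is now true: rows with a null partnumber qualify
        :b3 := 44;
        :b4 := 0;       -- "0 = :b4" is now true: rows with a null member_id qualify
end;
/

select
        catentry_id
from
        catentry
where
        (   partnumber= :b1
         or (0 = :b2 and partnumber is null)
        )
and     (    member_id= :b3
         or (0 = :b4 and member_id is null)
        )
;
```

A plan that had folded the predicate down to simple equality on both columns could not return the rows with nulls that this execution asks for.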

If you review the SQL statement you can see that it’s clearly trying to allow the user to find data about member_ids and partnumbers where both, neither, or either value is allowed to be null: a couple of “if – then – else” conditions that should have been handled in the client code have been embedded in the SQL. As we have seen several times before, if you can’t change the client code then you have to hope that Oracle will use some clever transformation to handle the query in sections.
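As a rough illustration of what “handled in the client code” might mean, here is a minimal PL/SQL sketch that tests the flags first and issues the simple equality query when neither flag asks for nulls. The structure and the use of dbms_output are my assumptions, not anything from the original application.

```sql
set serveroutput on

begin
        if :b2 != 0 and :b4 != 0 then
                -- Neither flag asks for nulls: plain equality on both columns,
                -- which the optimizer can satisfy from the unique index.
                for r in (
                        select  catentry_id
                        from    catentry
                        where   partnumber = :b1
                        and     member_id  = :b3
                ) loop
                        dbms_output.put_line(r.catentry_id);
                end loop;
        else
                -- At least one flag is zero (or null): fall back to the general form.
                for r in (
                        select  catentry_id
                        from    catentry
                        where   (partnumber = :b1 or (0 = :b2 and partnumber is null))
                        and     (member_id  = :b3 or (0 = :b4 and member_id  is null))
                ) loop
                        dbms_output.put_line(r.catentry_id);
                end loop;
        end if;
end;
/
```

A fuller version would split the second branch further, but even this two-way split gives the most common case its own statement and its own plan.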

We can infer from various details of the posting that the member_id and partnumber columns were both allowed to be null. So if we want to make sure that Oracle always uses an indexed access path to acquire data for this query we need to have an index which starts with those two columns and then has at least one column which is guaranteed to be non-null. For example, we could simply drop the current index and replace it with one that has a fixed zero on the end:

create index cat_i1 on catentry(member_id, partnumber, 0) compress 1;

With my particular data set, query, and version of Oracle this didn’t make any difference to the plan. But then I thought about the data definition and realised (and checked) that the index had a terrible clustering_factor, so I dropped it and created it with the first two columns in the opposite order:

create index cat_i2 on catentry(partnumber, member_id, 0) compress 1;
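The clustering_factor check mentioned above can be done from the data dictionary. This is only a sketch of the sort of query I mean; the usual rule of thumb is that a clustering_factor close to the number of table blocks is good and one close to the number of table rows is bad.

```sql
rem     Compare each index's clustering_factor with the table's
rem     blocks and num_rows (statistics must be up to date).

select
        index_name, blevel, leaf_blocks, distinct_keys, clustering_factor
from
        user_indexes
where
        table_name = 'CATENTRY'
;

select
        blocks, num_rows
from
        user_tables
where
        table_name = 'CATENTRY'
;
```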

Side note:
You’ll notice that I’ve replaced the original unique index with a non-unique index. This was necessary because there were many rows where both partnumber and member_id were null, so if I want to maintain the logic of the previous unique index I’ll need to add a unique constraint on (member_id, partnumber). It’s possible, of course, that in similar circumstances I might want both indexes – one for the uniqueness and to access the data using only one of the columns, the other to access the data using only the other column.

With this index in place, and unhinted, the plan I got from 11.2.0.4 changed to use concatenation with an impressive four-way split:


------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |       |       |    12 (100)|          |
|   1 |  CONCATENATION                |          |       |       |            |          |
|*  2 |   FILTER                      |          |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   FILTER                      |          |       |       |            |          |
|*  6 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|*  8 |   FILTER                      |          |       |       |            |          |
|*  9 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS BY INDEX ROWID | CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN           | CAT_I2   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((0=:B2 AND 0=:B4))
   4 - access("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL)
       filter("MEMBER_ID" IS NULL)
   5 - filter(0=:B2)
   6 - filter((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4)))
   7 - access("PARTNUMBER" IS NULL AND "MEMBER_ID"=:B3)
       filter("MEMBER_ID"=:B3)
   8 - filter(0=:B4)
   9 - filter((LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2)))
  10 - access("PARTNUMBER"=:B1 AND "MEMBER_ID" IS NULL)
  11 - filter(((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4)) AND
              (LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2))))
  12 - access("PARTNUMBER"=:B1 AND "MEMBER_ID"=:B3)


To execute this plan the run-time engine works as follows:

  • Operation 2: If :b2 and :b4 are both zero we use the index to find the rows where member_id and partnumber are null (the filter “member_id is null” seems to be redundant)
  • Operation 5: if :b2 is zero we use the index to find rows where the partnumber is null and the member_id is the supplied value (and if that’s null the access will immediately return zero rows because of the equality predicate), and discard any rows that have already been returned by operation 2
  • Operation 8: if :b4 is zero we will use the index to find rows where the partnumber is the supplied value (and if the partnumber is null, that access will immediately return zero rows because of the equality predicate) and the member_id is null, and discard any rows that have already been returned by operation 2.
  • Operations 11 and 12 will always run – using the index to find rows that match with equality on both the incoming member_id and partnumber, discarding any rows already returned by the previous operations, and obviously not matching any rows where either column “IS” null.
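The four branches can be modelled by a hand-written union all. This is my own sketch of the logic, not the text that Oracle generates internally: because each branch differs from the others in whether a column is null or equal to a value, the branches cannot return the same row twice, so I have left out the lnnvl() predicates that the optimizer adds as a general-purpose safety net.

```sql
select  catentry_id                     -- operations 2 - 4
from    catentry
where   partnumber is null
and     member_id  is null
and     0 = :b2
and     0 = :b4
union all
select  catentry_id                     -- operations 5 - 7
from    catentry
where   partnumber is null
and     member_id  = :b3
and     0 = :b2
union all
select  catentry_id                     -- operations 8 - 10
from    catentry
where   partnumber = :b1
and     member_id  is null
and     0 = :b4
union all
select  catentry_id                     -- operations 11 - 12
from    catentry
where   partnumber = :b1
and     member_id  = :b3
;
```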

The critical feature of this plan, of course, is that we got it because we have given Oracle an efficient option to find the rows where both member_id and partnumber are null – and that allows the rest of the concatenation options to take place.
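The lnnvl() calls in the predicate section may be unfamiliar: lnnvl(condition) evaluates to true when the condition is false or unknown (null), and to false when the condition is true. A small stand-alone sketch, using a made-up three-row data set, shows the effect:

```sql
with v as (
        select 1 n from dual union all
        select 2 n from dual union all
        select to_number(null) n from dual
)
select
        n
from
        v
where
        lnnvl(n = 1)    -- keeps the rows where "n = 1" is false or unknown
;
```

The query returns the row where n = 2 and the row where n is null. That is the behaviour needed to say “any row the earlier branch would not have returned”, which a simple “n != 1” could not do because it would lose the null.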

Hints and Upgrades

Interestingly, after the clue that 11g would happily use concatenation with a “good enough” index I went back to the example where I’d just added a zero to the existing index and checked to see what would happen if I added a /*+ use_concat */ hint (without any of the qualifying parameters that the hint can now use) and got the same concatenated plan. The fact that the path appeared wasn’t the interesting bit – see if you can spot what is the interesting bit:

------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |       |       |    12 (100)|          |
|   1 |  CONCATENATION                |          |       |       |            |          |
|*  2 |   FILTER                      |          |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   FILTER                      |          |       |       |            |          |
|*  6 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|*  8 |   FILTER                      |          |       |       |            |          |
|*  9 |    TABLE ACCESS BY INDEX ROWID| CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
|* 11 |   TABLE ACCESS BY INDEX ROWID | CATENTRY |     1 |    10 |     3   (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN           | CAT_I1   |     1 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Check the cost, and compare it with the cost of the full tablescan. The hinted path has a lower cost than the default path. I think this may be another case of an “unknowable” range scan being ignored in favour of a known alternative.
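For reference, the hinted statement is just the original query with the unqualified hint added. This sketch assumes the cat_i1 index is the only index in place:

```sql
select
        /*+ use_concat */
        catentry_id
from
        catentry
where
        (   partnumber= :b1
         or (0 = :b2 and partnumber is null)
        )
and     (    member_id= :b3
         or (0 = :b4 and member_id is null)
        )
;

select * from table(dbms_xplan.display_cursor);
```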

Finally, we get to today – when I tested the code against 12.1.0.2 and 12.2.0.1. Nothing exciting happened in 12.1.0.2 – the plans were just like the 11g plans, but here’s the plan I got in 12.2 with the “bad” indexing (original column order with added zero column – index cat_i1) without any hints in the SQL:


----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |       |       |     9 (100)|          |
|   1 |  VIEW                                  | VW_ORE_37A7142B |     4 |    52 |     9   (0)| 00:00:01 |
|   2 |   UNION-ALL                            |                 |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID         | CATENTRY        |     1 |    10 |     2   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN                  | CAT_I1          |     1 |       |     1   (0)| 00:00:01 |
|*  5 |    FILTER                              |                 |       |       |            |          |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY        |     1 |    10 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | CAT_I1          |     1 |       |     1   (0)| 00:00:01 |
|*  8 |    FILTER                              |                 |       |       |            |          |
|   9 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY        |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN                  | CAT_I1          |     1 |       |     1   (0)| 00:00:01 |
|* 11 |    FILTER                              |                 |       |       |            |          |
|  12 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY        |     1 |    10 |     3   (0)| 00:00:01 |
|* 13 |      INDEX RANGE SCAN                  | CAT_I1          |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      ...
      OR_EXPAND(@"SEL$1" (1) (2) (3) (4))
      ...
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1)
   5 - filter(0=:B4)
   7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1)
       filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3)))
   8 - filter(0=:B2)
  10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL)
       filter(LNNVL("PARTNUMBER"=:B1))
  11 - filter((0=:B4 AND 0=:B2))
  13 - access("MEMBER_ID" IS NULL AND "PARTNUMBER" IS NULL)
       filter(("PARTNUMBER" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND LNNVL("MEMBER_ID"=:B3)))

The full tablescan didn’t appear – though it wasn’t concatenation that eliminated it, it was the “new” 12.2 variant: “OR EXPANSION”. In this case the net effect is remarkably similar – we still have filter operations comparing :b2 and :b4 with zero, and we still have a scattering of lnnvl() function calls being used to discard rows we’ve already accessed, but the pattern is slightly different and we have a union all rather than concatenation.
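The Outline Data section shown above is not part of the default dbms_xplan output. As far as I know, the long-standing 'outline' format option is the simplest way to get it for the last statement executed by the session:

```sql
rem     Report the plan of the last statement run by this session,
rem     including the Outline Data section (where OR_EXPAND shows up).

select * from table(dbms_xplan.display_cursor(null, null, 'outline'));
```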

This change prompted me to go back to testing with just the original index (member_id, partnumber – index cat_i0) … which took me back to the full tablescan until I added the hint /*+ or_expand */ to the query to get the following plan:


----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |       |       |    29 (100)|          |
|   1 |  VIEW                                  | VW_ORE_37A7142B |     4 |    52 |    29   (4)| 00:00:01 |
|   2 |   UNION-ALL                            |                 |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID         | CATENTRY        |     1 |    10 |     2   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN                  | CAT_I0          |     1 |       |     1   (0)| 00:00:01 |
|*  5 |    FILTER                              |                 |       |       |            |          |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY        |     1 |    10 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | CAT_I0          |     1 |       |     1   (0)| 00:00:01 |
|*  8 |    FILTER                              |                 |       |       |            |          |
|   9 |     TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY        |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      INDEX RANGE SCAN                  | CAT_I0          |     1 |       |     1   (0)| 00:00:01 |
|* 11 |    FILTER                              |                 |       |       |            |          |
|* 12 |     TABLE ACCESS FULL                  | CATENTRY        |     1 |    10 |    23   (5)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1)
   5 - filter(0=:B4)
   7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1)
       filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3)))
   8 - filter(0=:B2)
  10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL)
       filter(LNNVL("PARTNUMBER"=:B1))
  11 - filter((0=:B4 AND 0=:B2))
  12 - filter(("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND
              LNNVL("MEMBER_ID"=:B3)))

The plan shows “or expansion”, and highlights the most significant difference between concatenation and expansion – concatenation requires indexed access paths in all branches, or-expansion doesn’t.

At first sight this plan with its full tablescan at operation 12 might seem like a bad idea; but check operation 11, the guarding filter: the tablescan occurs only if both :b2 and :b4 are zero. Perhaps that special condition is never supposed to appear, perhaps it’s going to do a lot of work whether or not you can use an index. The fact that you can now handle the original problem without adding or altering existing indexes – provided you can inject this or_expand hint – may be of significant benefit. (Of course, being able to recreate the original index with the columns in the reverse order would even avoid the necessity of worrying about the hint.)
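For completeness, this is a sketch of the hinted statement in its simplest form, with only the original cat_i0 index in place. It needs 12.2 or later, since that is where the or_expand hint and the transformation appear.

```sql
select
        /*+ or_expand */
        catentry_id
from
        catentry
where
        (   partnumber= :b1
         or (0 = :b2 and partnumber is null)
        )
and     (    member_id= :b3
         or (0 = :b4 and member_id is null)
        )
;

select * from table(dbms_xplan.display_cursor);
```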

tl;dr

Applications that push “if-then-else” decisions into the SQL and down to the optimizer are a pain in the backside; the performance problems they produce can sometimes be bypassed by the addition of extra indexes that might give you plans (possibly hinted) that report the use of the concatenation operation. In 12cR2 the optimizer has an improved strategy for damage limitation, “Cost-based Or Expansion”, that can produce very similar effects without the addition of extra indexes. These plans will report union all operations, referencing views with names like: VW_ORE_xxxxxxxxx.

When you next upgrade you may find a few cases where you can get rid of indexes that were only created to work around defective application coding strategies. You may also want to hunt down any code where you’ve added use_concat hints and see if they can be removed, or if they should be replaced by or_expand, since the former hint will disable the latter feature.
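A starting point for that hunt might look like the following sketch. It assumes you have access to v$sql and dba_source; bear in mind that v$sql.sql_text holds only the first 1,000 characters of each statement, and that neither query will find hints in SQL built dynamically by the client.

```sql
rem     Statements currently in the library cache that carry the hint

select
        sql_id, child_number, sql_text
from
        v$sql
where
        lower(sql_text) like '%use_concat%'
and     lower(sql_text) not like '%v$sql%'      -- skip this query itself
;

rem     Stored PL/SQL source lines that carry the hint

select
        owner, name, type, line, text
from
        dba_source
where
        lower(text) like '%use_concat%'
order by
        owner, name, type, line
;
```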

 

June 22, 2016

Conditional SQL – 6

Filed under: Conditional SQL,Execution plans,Oracle,Performance — Jonathan Lewis @ 2:16 pm BST Jun 22,2016

An odd little anomaly showed up on the OTN database forum a few days ago where a query involving a table covered by Oracle Label Security (OLS) seemed to wrap itself into a non-mergeable view when written using traditional Oracle SQL, but allowed for view-merging when accessed through ANSI standard SQL. I don’t know why there’s a difference but it did prompt a thought about non-mergeable views and what I’ve previously called “conditional SQL” – namely SQL which holds a predicate that should have been tested in the client code and not passed to the database engine.

The thought was this – could the database engine decide to do a lot of redundant work if you stuck a silly predicate inside a non-mergeable view: the answer turns out to be yes. Here’s a demonstration I’ve run on 11g and 12c:


rem
rem     Script:         conditional_fail.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum,1e5)         n1,
        lpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum,1e3)         n1,
        lpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;
-- gather simple statistics (not needed in 12c) 

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index t2_i1 on t2(id);

variable b1 number
variable b2 number

exec :b1 := 1; :b2 := 0

There’s nothing terribly significant about the data, beyond the fact that I’ve got a “small” table and a “large” table that I can use to encourage the optimizer to do a hash join. I’ve also created a couple of bind variables and set them to values that ensure that we can see that b1 is definitely not smaller than b2. So here’s a simple query – with a mergeable inline view in the first instance which is then hinted to make the view non-mergeable.


select
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;

select
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select /*+ no_merge */ t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;

Clearly, for our values of b1 and b2, the query will not return any data. In fact we can go further and say that the presence of the “bind variable predicate” in the inline view either has no effect on the volume of data returned or it eliminates all the data. But the presence of the no_merge hint makes a difference to how much work Oracle does for the “no data” option. Here are the two plans, pulled from the memory of an 11g instance after enabling rowsource execution statistics – first when the view is mergeable:


-----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |      1 |        |   508 (100)|      0 |00:00:00.01 |
|   1 |  SORT GROUP BY NOSORT          |       |      1 |      1 |   508   (2)|      0 |00:00:00.01 |
|*  2 |   FILTER                       |       |      1 |        |            |      0 |00:00:00.01 |
|   3 |    NESTED LOOPS                |       |      0 |    100 |   508   (2)|      0 |00:00:00.01 |
|   4 |     NESTED LOOPS               |       |      0 |    100 |   508   (2)|      0 |00:00:00.01 |
|*  5 |      TABLE ACCESS FULL         | T1    |      0 |    100 |   208   (4)|      0 |00:00:00.01 |
|*  6 |      INDEX RANGE SCAN          | T2_I1 |      0 |      1 |     2   (0)|      0 |00:00:00.01 |
|   7 |     TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |     3   (0)|      0 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:B1<:B2)
   5 - filter("T1"."N1"=0)
   6 - access("T2"."ID"="T1"."ID")

Notice how, despite t2 being the second table in the join, the bind variable predicate has worked its way to the top of the execution plan and execution has terminated after the run-time engine has determined that 1 is not less than zero.

Compare this with the plan when the view is non-mergeable:

------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |      1 |        |  2300 (100)|      0 |00:00:00.01 |    1599 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |         |      1 |      1 |  2300   (4)|      0 |00:00:00.01 |    1599 |       |       |          |
|*  2 |   HASH JOIN            |         |      1 |    100 |  2300   (4)|      0 |00:00:00.01 |    1599 |  2061K|  2061K| 1109K (0)|
|   3 |    JOIN FILTER CREATE  | :BF0000 |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|*  4 |     TABLE ACCESS FULL  | T1      |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |    VIEW                |         |      1 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
|*  6 |     FILTER             |         |      1 |        |            |      0 |00:00:00.01 |       0 |       |       |          |
|   7 |      JOIN FILTER USE   | :BF0000 |      0 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2      |      0 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)
   6 - filter(:B1<:B2)
   8 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

Thanks to the no_merge hint the bind variable predicate has not been promoted to the top of the plan, so the optimizer has produced a plan that requires the run-time engine to access data from table t1 before visiting table t2. In fact the optimizer has decided to do a hash join, so the run-time engine gathers all the relevant data from t1 and builds an in-memory hash table before deciding that 1 is not less than zero and terminating the query.

Be careful if you write SQL that compares bind variables (or other pseudo-constants such as calls to sys_context) with bind variables (etc.); you may find that you’ve managed to produce code that forces the optimizer to do work that it could have avoided if only it had found a way of doing that comparison at the earliest possible moment.
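When you control the calling code, the safest way of doing the comparison “at the earliest possible moment” is to do it before the SQL is called at all. Here is a minimal PL/SQL sketch of the idea; the loop and the dbms_output call are illustrative assumptions about what the caller does with the result.

```sql
set serveroutput on

begin
        -- Test the "constant" predicate in the caller, not in the query
        if :b1 < :b2 then
                for r in (
                        select
                                t1.n1, count(*) ct, sum(t2.n1) sum_n1
                        from
                                t1, t2
                        where
                                t1.n1 = 0
                        and     t2.id = t1.id
                        group by
                                t1.n1
                ) loop
                        dbms_output.put_line(r.n1 || ' ' || r.ct || ' ' || r.sum_n1);
                end loop;
        end if;
end;
/
```

With :b1 = 1 and :b2 = 0 the block never touches t1 or t2, whatever the optimizer might have done with the view.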

These plans were from 11g, but 12c can behave the same way although, with my specific data set, I had to add the no_push_pred() hint to the query to demonstrate the effect of the hash join appearing.
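The hinted 12c statement would look something like the sketch below. The placement of the hint in the main query block, naming the view alias v1, is my assumption about the form used:

```sql
select
        /*+ no_push_pred(v1) */
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select /*+ no_merge */ t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;
```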

Footnote 1

An interesting side effect of this plan is that it has also allowed a Bloom filter to appear in a serial hash join – not something you’d normally expect to see, so I changed the predicate to :b1 > :b2 to see if the no_merge hint combined with the Bloom filter was faster than merging without the Bloom filter. Here are the two plans, pulled from memory. Running this on 11g I had to hint the hash join when I removed the no_merge hint:


Non-mergeable view - Bloom filter appears
------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |      1 |        |  2300 (100)|      1 |00:00:00.12 |   17725 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |         |      1 |      1 |  2300   (4)|      1 |00:00:00.12 |   17725 |       |       |          |
|*  2 |   HASH JOIN            |         |      1 |    100 |  2300   (4)|    100 |00:00:00.12 |   17725 |  2061K|  2061K| 1106K (0)|
|   3 |    JOIN FILTER CREATE  | :BF0000 |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|*  4 |     TABLE ACCESS FULL  | T1      |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |    VIEW                |         |      1 |   1000K|  2072   (3)|   1605 |00:00:00.10 |   16126 |       |       |          |
|*  6 |     FILTER             |         |      1 |        |            |   1605 |00:00:00.09 |   16126 |       |       |          |
|   7 |      JOIN FILTER USE   | :BF0000 |      1 |   1000K|  2072   (3)|   1605 |00:00:00.08 |   16126 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2      |      1 |   1000K|  2072   (3)|   1605 |00:00:00.07 |   16126 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)
   6 - filter(:B1<:B2)
   8 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

View merging allowed - no Bloom filter
-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |  2300 (100)|      1 |00:00:07.56 |   17725 |       |       |          |
|   1 |  SORT GROUP BY NOSORT|      |      1 |      1 |  2300   (4)|      1 |00:00:07.56 |   17725 |       |       |          |
|*  2 |   FILTER             |      |      1 |        |            |    100 |00:00:07.56 |   17725 |       |       |          |
|*  3 |    HASH JOIN         |      |      1 |    100 |  2300   (4)|    100 |00:00:07.56 |   17725 |  2061K|  2061K| 1446K (0)|
|*  4 |     TABLE ACCESS FULL| T1   |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |     TABLE ACCESS FULL| T2   |      1 |   1000K|  2072   (3)|   1000K|00:00:01.94 |   16126 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:B1<:B2)
   3 - access("T2"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)

Things to note – the Bloom filter eliminated all but 1,605 rows from the tablescan before passing them to the hash join operation to probe the hash table; the run time of the query without filtering was 7.56 seconds (!) compared to 0.12 seconds with the Bloom filter. Fortunately I decided that this was too good to be true BEFORE I published the results, so I re-ran the tests with statistics_level set back to typical and found that most of the difference was CPU time spent on collecting rowsource execution statistics. The query with the Bloom filter was still faster, but only just – the times were more like 0.09 seconds vs. 0.12 seconds.

Footnote 2

The source of the problem on OTN was that as well as using OLS the query in question included a user-defined function. Since you can write a user-defined function that “spies” on the data content and uses (e.g.) dbms_output to write data to the terminal this poses a security risk; if a predicate calling that function executed before the security predicate had been tested then your function could output data that your query shouldn’t be able to report. To avoid this security loophole Oracle restricts the way it merges views (unless you set optimizer_secure_view_merging to false). For a more detailed explanation and demonstration of the issues, see this item on Christian Antognini’s blog.

I still don’t know why the ANSI form of the query managed to bypass this issue, but the predicate with the user-defined function was applied as a filter at the very last step of the plan, so perhaps there was something about the transformation Oracle took to get from ANSI syntax to its internal syntax (with cascading lateral views) that made it possible for the optimizer to recognize and eliminate the security threat efficiently.

January 2, 2014

Conditional SQL – 4

Filed under: Conditional SQL,Execution plans,Oracle — Jonathan Lewis @ 6:14 pm BST Jan 2,2014

This is one of those posts where the investigation is left as an exercise – it’s not difficult, just something that will take a little time that I don’t have, and just might end up with me chasing half a dozen variations (so I’d rather not get sucked into looking too closely). It comes from an OTN question which ends up reporting this predicate:

WHERE ( LENGTH ( :b7) IS NULL OR
         UPPER (TRIM (CODFSC)) = UPPER (TRIM ( :b8)) or
         UPPER (TRIM (CODUIC)) = UPPER (TRIM ( :b9)))
       AND STATE = 0;

The three bind variables all hold the same value; there is a function-based index on upper(trim(codfsc)), and another on upper(trim(coduic)). The execution plan for this query is a full tablescan, but if you eliminate the first predicate Oracle can do a concatenation of two index range scans. This variation doesn’t surprise me: the optimizer’s ability to introduce concatenation is limited. However, I did wonder whether some small variation in the SQL would allow the optimizer to get just a little more clever.

Would you get concatenation if you changed the first predicate to (:b7 is null)? If not, would a similar query that didn’t depend on function-based indexes do concatenation? If not, is there any rewrite of this query that could do a tablescan ONLY for the case where :b7 was null?
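As a starting point for the last of those questions, here is an untested sketch of a hand-written rewrite that splits the query into two UNION ALL branches keyed on :b7. The table name t_codes and the select list are hypothetical (the table from the OTN thread isn't shown here); only the predicate comes from the original question. Whether the optimizer then uses a concatenation of two index range scans inside the second branch is exactly the thing that would need checking.

```sql
-- Hypothetical table name (t_codes); only the predicate is from the OTN question.
-- Branch 1 can return rows only when :b7 is null, branch 2 only when it is not null,
-- so the branches are mutually exclusive and no lnnvl() guard is needed.
select  *
from    t_codes
where   :b7 is null
and     state = 0
union all
select  *
from    t_codes
where   :b7 is not null
and     (
            upper(trim(codfsc)) = upper(trim(:b8))
         or upper(trim(coduic)) = upper(trim(:b9))
        )
and     state = 0
;
```

The sketch relies on length(:b7) being null exactly when :b7 is null, which holds in Oracle because the empty string is treated as null.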

Demonstrations of any levels of success can be left in the comments if anyone’s interested. To get a fixed font that preserves space start the code with “sourcecode” and end with “/sourcecode” (removing the quotation marks and replacing them with square brackets).

October 31, 2010

Conditional SQL – 3

Filed under: CBO,Conditional SQL,Execution plans,Tuning — Jonathan Lewis @ 5:54 pm BST Oct 31,2010

I’ve spent the last week in Spain – sightseeing, rather than working – with a minimum amount of access to the Internet.

Inevitably I now have to work hard to catch up with my email. As a moment of light relief in an otherwise irritating chore I thought I’d respond to an emailed request for help. (Regular readers of the blog will know that I don’t usually respond to private email requests for solutions, but sometimes someone gets lucky.)

The question was basically this:  why do I get different execution plans for the following two statements:

(more…)

February 14, 2007

Conditional SQL (2)

Filed under: CBO,Conditional SQL,Execution plans,Indexing,Tuning — Jonathan Lewis @ 8:48 am BST Feb 14,2007

Some time ago I wrote a note on conditional SQL, commenting on code with predicates like:

         where	t1.n1 = nvl(:n1,t1.n1)

In that article, I showed how Oracle can transform a query of this type using concatenation so that at run time one of two possible execution paths can be taken. But, as often happens,  a simple example leads to more complex questions.

What happens if you have several similar predicates in your query – does Oracle keep doubling the number of possible execution paths for the query? If so, how far does it go; if not, are there any side effects to consider? What happens with more complex queries, including joins?
(more…)

January 9, 2007

Conditional SQL

Filed under: CBO,Conditional SQL,Execution plans — Jonathan Lewis @ 7:57 pm BST Jan 9,2007

Here’s a simplified example of the type of SQL that I often used to see coming from screen generators:


rem
rem     Script:         null_plan.sql
rem     Author:         Jonathan Lewis
rem     Dated:          February 2003
rem

create table t1 (
        n1              number(5),
        n2              number(5),
        v1              varchar2(10),
        v2              varchar2(10),
        v3              varchar2(10),
        v4              varchar2(10),
        v5              varchar2(10),
        padding         varchar2(100),
        constraint t1_pk primary key(n1)
);

insert into t1
select
        rownum,
        rownum,
        rownum,
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        rpad('x',100)
from all_objects
where
        rownum <= 1000 -- > comment to avoid WordPress format issue
;

--  gather stats

select 	*
from 	t1
where	t1.n1 = nvl(:n1,t1.n1)
;

This code is basically trying to say: “if the user supplies an input value for some on-screen field then return data for that value, otherwise return all the data unconditionally.”
(more…)

May 19, 2009

Odd Filter

Filed under: Execution plans,subqueries — Jonathan Lewis @ 6:39 pm BST May 19,2009

A little while ago someone sent me a brief email about an odd section of execution plan that they had seen. To make things a little more challenging they didn’t send the SQL statement, and they only sent me a few lines from the middle of the plan to see if I could explain what was going on. This is what it looked like:
(more…)

February 14, 2018

Join Factorization

Filed under: CBO,Oracle — Jonathan Lewis @ 3:38 pm BST Feb 14,2018

This item is, by a roundabout route, a follow-up to yesterday’s note on a critical difference in cardinality estimates that appeared if you used the coalesce() function in its simplest form as a substitute for the nvl() function. Connor McDonald wrote a followup note about how using the nvl() function in a suitable predicate could lead to Oracle splitting a query into a union all (in version 12.2), which led me to go back to a note I’d written on the same topic about 10 years earlier where the precursor of this feature already existed but used concatenation instead of OR-expansion.

The script I’d used for my earlier article was actually one I’d written in February 2003 and tested fairly regularly since – which brings me to this article, because I finally tested my script against 12.2.0.1 to discover a very cute bit of optimisation.

The business of splitting a query into two parts can be used even when the queries are more complex and include joins; this doesn’t always happen automatically and sometimes has to be hinted (but that may be a costs/statistics thing). For example, here is a query and its execution plan from 12.1.0.2:


select
        *
from
        t1, t2
where
        t1.v1 = nvl(:v1,t1.v1)
and     t2.n1 = t1.n1
;

---------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |         |  1001 |   228K|    11   (0)| 00:00:01 |
|   1 |  CONCATENATION                          |         |       |       |            |          |
|*  2 |   FILTER                                |         |       |       |            |          |
|*  3 |    HASH JOIN                            |         |  1000 |   228K|     8   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL                   | T2      |  1000 |   106K|     4   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL                   | T1      |  1000 |   122K|     4   (0)| 00:00:01 |
|*  6 |   FILTER                                |         |       |       |            |          |
|   7 |    NESTED LOOPS                         |         |     1 |   234 |     3   (0)| 00:00:01 |
|   8 |     NESTED LOOPS                        |         |     1 |   234 |     3   (0)| 00:00:01 |
|   9 |      TABLE ACCESS BY INDEX ROWID BATCHED| T1      |     1 |   125 |     2   (0)| 00:00:01 |
|* 10 |       INDEX RANGE SCAN                  | T1_IDX1 |     1 |       |     1   (0)| 00:00:01 |
|* 11 |      INDEX UNIQUE SCAN                  | T2_PK   |     1 |       |     0   (0)| 00:00:01 |
|  12 |     TABLE ACCESS BY INDEX ROWID         | T2      |     1 |   109 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:V1 IS NULL)
   3 - access("T2"."N1"="T1"."N1")
   5 - filter("T1"."V1" IS NOT NULL)
   6 - filter(:V1 IS NOT NULL)
  10 - access("T1"."V1"=:V1)
  11 - access("T2"."N1"="T1"."N1")

You can see in this plan how Oracle has split the query into two queries combined through concatenation with filter operations at lines 2 (:v1 is null) and 6 (:v1 is not null) to allow the runtime engine to execute only the appropriate branch. You’ll also note that each branch can be optimised separately and in this case the two branches get dramatically different paths because of the enormous difference in the estimated volumes of data.
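To make the two branches concrete, here is a hand-written query that mirrors the predicate section of the plan. It is a sketch of what the transformation amounts to, not text generated by the optimizer, and I haven't checked that it produces exactly the same plan.

```sql
-- Hand-written equivalent of the two CONCATENATION branches (a sketch, not optimizer output).
select  *
from    t1, t2
where   :v1 is null             -- FILTER at operation 2
and     t1.v1 is not null       -- operation 5: t1.v1 = nvl(null,t1.v1) can't be true for a null v1
and     t2.n1 = t1.n1
union all
select  *
from    t1, t2
where   :v1 is not null         -- FILTER at operation 6
and     t1.v1 = :v1             -- index access at operation 10
and     t2.n1 = t1.n1
;
```

Because the two bind-variable filters are mutually exclusive, at most one branch can return rows for any given execution.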

So let’s move up to 12.2.0.1 and see what happens to this query – but first I’m going to execute a cunning “alter session…” command which I’ll say more about later:


------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                 |  1001 |   180K|    11   (0)| 00:00:01 |
|   1 |  VIEW                                    | VW_ORE_F79C84EE |  1001 |   180K|    11   (0)| 00:00:01 |
|   2 |   UNION-ALL                              |                 |       |       |            |          |
|*  3 |    FILTER                                |                 |       |       |            |          |
|   4 |     NESTED LOOPS                         |                 |     1 |   234 |     3   (0)| 00:00:01 |
|   5 |      NESTED LOOPS                        |                 |     1 |   234 |     3   (0)| 00:00:01 |
|   6 |       TABLE ACCESS BY INDEX ROWID BATCHED| T1              |     1 |   125 |     2   (0)| 00:00:01 |
|*  7 |        INDEX RANGE SCAN                  | T1_IDX1         |     1 |       |     1   (0)| 00:00:01 |
|*  8 |       INDEX UNIQUE SCAN                  | T2_PK           |     1 |       |     0   (0)| 00:00:01 |
|   9 |      TABLE ACCESS BY INDEX ROWID         | T2              |     1 |   109 |     1   (0)| 00:00:01 |
|* 10 |    FILTER                                |                 |       |       |            |          |
|* 11 |     HASH JOIN                            |                 |  1000 |   228K|     8   (0)| 00:00:01 |
|  12 |      TABLE ACCESS FULL                   | T2              |  1000 |   106K|     4   (0)| 00:00:01 |
|* 13 |      TABLE ACCESS FULL                   | T1              |  1000 |   122K|     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(:V1 IS NOT NULL)
   7 - access("T1"."V1"=:V1)
   8 - access("T2"."N1"="T1"."N1")
  10 - filter(:V1 IS NULL)
  11 - access("T2"."N1"="T1"."N1")
  13 - filter("T1"."V1" IS NOT NULL)

There’s nothing terribly exciting about the change – except for the disappearance of the CONCATENATION operator and the appearance of the VIEW and UNION ALL operators to replace it (plus you’ll see that the two branches appear in the opposite order in the plan). But let’s try again, without doing that “alter session…”:


--------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                    |  1001 |   229K|    10   (0)| 00:00:01 |
|*  1 |  HASH JOIN                              |                    |  1001 |   229K|    10   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL                     | T2                 |  1000 |   106K|     4   (0)| 00:00:01 |
|   3 |   VIEW                                  | VW_JF_SET$A2355C8B |  1001 |   123K|     6   (0)| 00:00:01 |
|   4 |    UNION-ALL                            |                    |       |       |            |          |
|*  5 |     FILTER                              |                    |       |       |            |          |
|*  6 |      TABLE ACCESS FULL                  | T1                 |  1000 |   122K|     4   (0)| 00:00:01 |
|*  7 |     FILTER                              |                    |       |       |            |          |
|   8 |      TABLE ACCESS BY INDEX ROWID BATCHED| T1                 |     1 |   125 |     2   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN                  | T1_IDX1            |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N1"="ITEM_1")
   5 - filter(:V1 IS NULL)
   6 - filter("T1"."V1" IS NOT NULL)
   7 - filter(:V1 IS NOT NULL)
   9 - access("T1"."V1"=:V1)

The plan now shows a view which is a union all involving only table t1 in both its branches. The result set from the view is then used as the probe table of a hash join with t2. You’ll note that the name of the view is now VW_JF_SET$A2355C8B – that’s JF for “Join Factorization”, and the alter session I executed to get the first plan was to disable the feature: 'alter session set "_optimizer_join_factorization" = false;'.

Join factorization can occur when the optimizer sees a union all view that includes some tables that are common to both (all) branches of the query, and finds that it can move those tables outside the union all view while getting the same end result at a lower cost. In this case it happens to be a nice example of how the optimizer can transform then transform again to get to the lowest cost plan.
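In SQL terms the factorized plan corresponds roughly to the following hand-written query, where the join to t2 has been pulled out of the union all. This is only a sketch: the inline view alias v is my own, and the column order of the output differs from the original query because t2 now comes first.

```sql
-- Sketch of the shape of the query after join factorization:
-- t2 appears once, outside a union all view that references only t1.
select  *
from    t2,
        (
        select  *
        from    t1
        where   :v1 is null
        and     t1.v1 is not null
        union all
        select  *
        from    t1
        where   :v1 is not null
        and     t1.v1 = :v1
        ) v                     -- hypothetical alias standing in for VW_JF_SET$...
where   t2.n1 = v.n1
;
```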

It’s worth noting that Join Factorization has been around since 11.2.x.x, and Or Expansion has been around for even longer – but it’s not until 12.2 that nvl() transforms through OR-expansion, which is what allows it to transform onwards through Join Factorization.

You’ll note, by the way, that with this plan we always do a full tablescan of t2 whereas if we stop after just OR-expansion the tablescan is just a potential threat that may never (or hardly ever) be realised.  That’s a point to check if you find that the transformation starts to appear inappropriately on an upgrade. There is a hint to disable the feature for a query, but it’s not trivial to get it right so if you do need to block the feature the smart hint (or SQL Patch) would be “opt_param('_optimizer_join_factorization' 'false')”.
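Applied to the test query, the opt_param() approach would look like this (a minimal sketch using the tables from the script in the footnote):

```sql
-- Disable join factorization for this statement only, leaving the session setting alone.
select  /*+ opt_param('_optimizer_join_factorization' 'false') */
        *
from    t1, t2
where   t1.v1 = nvl(:v1,t1.v1)
and     t2.n1 = t1.n1
;
```

The hint scopes the change to one statement, which is why it is easier to manage than the alter session used in the test script.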

Footnote:

If you want to run the experiments yourself, here’s the script I used to generate the data. It’s more complicated than it needs to be because I use the same tables in several different tests:

rem
rem     Script:         null_plan_122.sql
rem     Author:         Jonathan Lewis
rem     Dated:          February 2018
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1        Join Factorization
rem             12.1.0.2        Concatenation
rem
rem

drop table t2;
drop table t1;

-- @@setup  -- various set commands etc.

create table t1 (
        n1              number(5),
        n2              number(5),
        v1              varchar2(10),
        v2              varchar2(10),
        v3              varchar2(10),
        v4              varchar2(10),
        v5              varchar2(10),
        padding         varchar2(100),
        constraint t1_pk primary key(n1)
);

insert into t1
select
        rownum,
        rownum,
        rownum,
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        trunc(100 * dbms_random.value),
        rpad('x',100)
from all_objects
where
        rownum <= 1000 -- > comment to avoid WordPress format mess
;

create unique index t1_n2 on t1(n2);

create index t1_idx1 on t1(v1);
create index t1_idx2 on t1(v2,v1);
create index t1_idx3 on t1(v3,v2,v1);

create table t2 (
        n1              number(5),
        v1              varchar2(10),
        padding         varchar2(100),
        constraint t2_pk primary key(n1)
);

insert into t2
select
        rownum,
        rownum,
        rpad('x',100)
from all_objects
where
        rownum <= 1000 -- > comment to avoid WordPress format mess
;

create index t2_idx on t2(v1);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

variable n1 number
variable n2 number
variable v1 varchar2(10)
variable v2 varchar2(10)
variable v3 varchar2(10)

exec :n1 := null
exec :n2 := null
exec :v1 := null
exec :v2 := null
exec :v3 := null

spool null_plan_122

set autotrace traceonly explain

prompt  ============================================
prompt  One colx = nvl(:b1,colx) predicate with join
prompt  ============================================

select
        *
from
        t1, t2
where
        t1.v1 = nvl(:v1,t1.v1)
and     t2.n1 = t1.n1
;

alter session set "_optimizer_join_factorization" = false;

select
        *
from
        t1, t2
where
        t1.v1 = nvl(:v1,t1.v1)
and     t2.n1 = t1.n1
;

alter session set "_optimizer_join_factorization" = true;

set autotrace off

spool off

June 15, 2015

Dynamic Sampling

Filed under: CBO,Oracle — Jonathan Lewis @ 9:41 pm BST Jun 15,2015

Following on from an OTN posting about dynamic sampling difficulties I had planned to write a blog post about the difference between “not sampling when hinted” and “ignoring the sample” – but Mohamed Houri got there before me.

It’s just worth highlighting a little detail that is often overlooked, though: there are two versions of the dynamic_sampling() hint, the cursor level and the table level, and the number of blocks sampled at a particular level is dependent on which version you are using. Level 4 at the cursor level, for example, will sample 64 blocks if and only if a certain condition is met, but at the table level it will sample 256 blocks unconditionally.

So try to be a little more specific when you say “I told the optimizer to use dynamic sampling …”, it’s either:

“I told the optimizer to use cursor level dynamic sampling at level X …”

or

“I told the optimizer to use table level dynamic sampling at level Y for table A and …”

Note – apart from the changes to dynamic sampling that allow for a level 11, there’s also a change introduced (I think) in 10g for the sample() clause applied to the table during sampling – it’s the addition of a seed() clause which ensures that when you repeat the same level you generate the same set of random rows.
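For anyone who hasn't met the seed() clause, here is a minimal sketch of how it looks in an ordinary query. It assumes a table t1 copied from all_objects; the 10 percent sample size and the seed value 1 are arbitrary choices for illustration, and this is not the text of the recursive query the optimizer generates when it samples.

```sql
-- Block-level sample of roughly 10% of the table.
-- Repeating the statement with the same seed should pick the same sample,
-- so the count should be repeatable as long as the table doesn't change.
select  count(*)
from    t1 sample block (10) seed (1)
where   owner = 'SYS'
and     object_type = 'SYNONYM'
;
```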

Addendum

Here’s a little code I wrote some time ago to check the effect of the two options at different levels. I started by creating a (nologging) table from the first 50,000 rows of all_objects, then doubled it up a few times to 400,000 rows total, and ensured that there were no stats on the table. I then executed, in turn, each variant of the following anonymous pl/sql block (note that I have the execute privilege on the dbms_system package):


declare
	m_ct number;
begin
	execute immediate 'alter session set events ''10053 trace name context forever''';
	for i in 1..10 loop
		sys.dbms_system.ksdwrt(1,'=============');
		sys.dbms_system.ksdwrt(1,'Level ' || i);
		sys.dbms_system.ksdwrt(1,'=============');

		execute immediate 
			'select /*+ dynamic_sampling('    || i || ') */ count(*) from t1 ' ||
--			'select /*+ dynamic_sampling(t1 ' || i || ') */ count(*) from t1 ' ||
			'where owner = ''SYS'' and object_type = ''SYNONYM'''
			into m_ct;
	end loop;
end;
/

Obviously I could examine the resulting trace file to pick out bits of each optimisation, but for a quick check a simple grep for “sample block cnt” is almost all I need to do – with the following (slightly decorated) results from 11.2.0.4:


Table level
===========
Level 1
    max. sample block cnt. : 32
    sample block cnt. : 31
    max. sample block cnt. : 64
    sample block cnt. : 63
    max. sample block cnt. : 128
    sample block cnt. : 127
    max. sample block cnt. : 256
    sample block cnt. : 255
    max. sample block cnt. : 512
    sample block cnt. : 511
    max. sample block cnt. : 1024
    sample block cnt. : 1023
    max. sample block cnt. : 2048
    sample block cnt. : 2047
    max. sample block cnt. : 4096
    sample block cnt. : 4095
    max. sample block cnt. : 8192
    sample block cnt. : 8191
Level 10
    max. sample block cnt. : 4294967295
    sample block cnt. : 11565

Cursor level
============
No sampling at level 1
Level 2
    max. sample block cnt. : 64
    sample block cnt. : 63
    max. sample block cnt. : 64
    sample block cnt. : 63
    max. sample block cnt. : 64
    sample block cnt. : 63
    max. sample block cnt. : 64
    sample block cnt. : 63
    max. sample block cnt. : 128
    sample block cnt. : 127
    max. sample block cnt. : 256
    sample block cnt. : 255
    max. sample block cnt. : 1024
    sample block cnt. : 1023
    max. sample block cnt. : 4096
    sample block cnt. : 4095
Level 10
    max. sample block cnt. : 4294967295
    sample block cnt. : 11565


You’ll notice that the cursor level example didn’t do any sampling at level 1. Although the manual doesn’t quite make it clear, sampling at cursor level 1 will only occur if three conditions are met:

  • The table has no statistics
  • The table has no indexes
  • The table is involved in a join so that a sample could affect the join order and method

If only the first two conditions are met then the execution path will be a full tablescan whatever the sample looks like and the number of rows returned has no further impact as far as the optimizer is concerned – hence the third requirement (which doesn’t get mentioned explicitly in the manuals). If you do have a query that meets all three requirements then the sample size is 32 (31) blocks.

 

January 1, 2014

NVL()

Filed under: Oracle,Performance — Jonathan Lewis @ 6:11 pm BST Jan 1,2014

Here’s a point that’s probably fairly well-known, but worth repeating: nvl() and coalesce() aren’t identical in behaviour but you may have some cases where you’re using nvl() when coalesce() would be a more efficient option.

The reason for this is “short-circuiting”. The expression nvl(expr1, expr2) will return expr2 if expr1 is null, otherwise it will return expr1; the expression coalesce(expr1, expr2, …, exprN) will return the first non-null expression in the list so, in particular, coalesce(expr1, expr2) will give the same result as nvl(expr1, expr2) ; the big difference is that nvl() will evaluate both expressions while coalesce will evaluate expr2 only if it needs to (i.e. only if expr1 evaluates to null). In many cases the difference in performance will be insignificant – but consider the following type of construct (t1 is a table with a single, numeric, column n1 and a single row):

(more…)
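As a separate illustration of the short-circuiting point (this is my own sketch, not the construct from the full article), a function with a visible side effect makes the difference easy to see. The function name noisy_default is hypothetical, and the sketch assumes the single row in t1 has a non-null n1.

```sql
set serveroutput on

-- Hypothetical helper: reports every call through dbms_output.
create or replace function noisy_default return number
as
begin
        dbms_output.put_line('noisy_default called');
        return 0;
end;
/

-- nvl() evaluates both arguments, so the function should be called
-- even though n1 is not null and the function result is discarded.
select nvl(n1, noisy_default())      from t1;

-- coalesce() stops at the first non-null argument, so the function
-- should not be called for a row where n1 is not null.
select coalesce(n1, noisy_default()) from t1;
```

If the function were expensive (a lookup against another table, say) the cost of the unnecessary calls would scale with the number of rows processed.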

December 11, 2013

Null Quiz

Filed under: Oracle,Performance,Troubleshooting,Tuning — Jonathan Lewis @ 6:42 pm BST Dec 11,2013

Here’s an example I saw a few months ago of the confusion caused by NULL. As the owner of the problem put it: the first query, run from SQL*Plus for testing purposes, takes no time to complete; but when “put into a pl/sql cursor” (as shown in the second query) it takes ages to complete.

(more…)

August 13, 2013

NVL()

Filed under: CBO,Conditional SQL,Execution plans,NULL,Oracle — Jonathan Lewis @ 7:14 am BST Aug 13,2013

Here’s a little detail about how the optimizer can handle the nvl() function that I hadn’t noticed before (and it goes back to at least 8i). This is running on 11.2.0.3, and table t1 is just all_objects where rownum <= 20000:

(more…)

January 31, 2011

ANSI Outer

Filed under: ANSI Standard,CBO,Execution plans,lateral view,Oracle — Jonathan Lewis @ 6:59 pm BST Jan 31,2011

Here’s an example of ANSI SQL that does something in a fashion that arguably looks a little tidier than the strategy you have to adopt in Oracle. As so often when I compare Oracle syntax and ANSI syntax it’s an example that relates to an outer join. We start with two tables – as usual I have locally managed tablespaces, 8KB blocks, 1MB uniform extents and freelist management. I’ve also disabled system statistics (CPU costing):

(more…)

February 23, 2010

Dynamic Sampling

Filed under: CBO,Hints,Oracle — Jonathan Lewis @ 7:41 pm BST Feb 23,2010

If you read the manual pages about “dynamic sampling” it’s easy to get just a little lost in the detail; so this is a brief overview of the variations in the strategies used.

(more…)

June 30, 2009

Connect By

Filed under: Infrastructure — Jonathan Lewis @ 9:22 am BST Jun 30,2009

If you have to deal with hierarchical queries (or “connect by” queries, as they’re commonly known) you might be interested in this note on the OTN Forum where someone has a problem with repeated executions of the same query (same inputs, same results, same execution plan) operating at wildly different speeds with amazing changes in workload.

Ultimately the solution was the same as for the more obvious question: “why does my sort sometimes take much longer than usual?”  – shortage of memory (in this case a very low setting for the pga_aggregate_target).

This example is an interesting variation, though, as it demonstrates how the availability of resources can make Oracle choose to execute different parts of an execution plan – giving another reason for the “conditional plan” strategy I’ve described in the past.

There’s an interesting post on hierarchical queries on Christian Antognini’s blog (which I’ve also referenced from the OTN posting)
