Oracle Scratchpad

October 5, 2018

Join Cardinality – 2

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 3:37 pm BST Oct 5,2018

In the previous note I posted about Join Cardinality I described a method for calculating the figure that the optimizer would give for the special case where you had a query that:

  • joined two tables
  • used a single-column to join on equality
  • had no nulls in the join columns
  • had a perfect frequency histogram on the columns at the two ends of the join
  • had no filter predicates associated with either table

The method simply said: “Match up rows from the two frequency histograms, multiply the corresponding frequencies” and I supplied a simple SQL statement that would read and report the two sets of histogram data, doing the arithmetic and reporting the final cardinality for you. In an update I also added an adjustment needed in 11g (or, you might say, removed in 12c) where gaps in the histograms were replaced by “ghost rows” with a frequency that was half the lowest frequency in the histogram.

This is a nice place to start as the idea is very simple, and it’s likely that extensions of the basic idea will be used in all the other cases we have to consider. There are 25 possibilities that could need separate testing – though only 16 of them ought to be relevant from 12c onwards. Oracle allows for four kinds of histograms – in order of how precisely they describe the data they are:

  • Frequency – with a perfect description of the data
  • Top-N (a.k.a. Top-Frequency) – which describes all but a tiny fraction (ca. one bucket’s worth) of data perfectly
  • Hybrid – which can (but doesn’t usually, by default) describe up to 2,048 popular values perfectly and gives an approximate distribution for the rest
  • Height-balanced – which can (but doesn’t usually, by default) describe at most 1,024 popular values with some scope for misinformation.

Finally, of course, we have the general case of no histogram, using only 4 numbers (low value, high value, number of rows, number of distinct values) to give a rough picture of the data – and the need for histograms appears, of course, when the data doesn’t look anything like an even distribution of values between the low and high with close to “number of rows”/”number of distinct values” for each value.

So there are 5 possible statistical descriptions for the data in a column – which means there are 5 * 5 = 25 possible options to consider when we join two columns, or 4 * 4 = 16 if we label height-balanced histograms as obsolete and ignore them (which would be a pity because Chinar has done some very nice work explaining them).

Of course, once we’ve worked out a single-column equijoin between two tables there are plenty more options to consider:  multi-column joins, joins involving range-based predicates, joins involving more than 2 tables, and queries which (as so often happens) have predicates which aren’t involved in the joins.

For the moment I’m going to stick to the simplest case – two tables, one column, equality – and comment on the effects of filter predicates. It seems to be very straightforward as I’ll demonstrate with a new model

rem
rem     Script:         freq_hist_join_03.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2018
rem

execute dbms_random.seed(0)

create table t1(
        id      number(8,0),
        n0040   number(4,0),
        n0090   number(4,0),
        n0190   number(4,0),
        n0990   number(4,0),
        n1      number(4,0)
)
;

create table t2(
        id      number(8,0),
        n0050   number(4,0),
        n0110   number(4,0),
        n0230   number(4,0),
        n1150   number(4,0),
        n1      number(4,0)
)
;

insert into t1
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        mod(rownum,   40) + 1                   n0040,
        mod(rownum,   90) + 1                   n0090,
        mod(rownum,  190) + 1                   n0190,
        mod(rownum,  990) + 1                   n0990,
        trunc(30 * abs(dbms_random.normal))     n1
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5 -- > comment to avoid WordPress format issue
;

insert into t2
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        mod(rownum,   50) + 1                   n0050,
        mod(rownum,  110) + 1                   n0110,
        mod(rownum,  230) + 1                   n0230,
        mod(rownum, 1150) + 1                   n1150,
        trunc(30 * abs(dbms_random.normal))     n1
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5 -- > comment to avoid WordPress format issue
;

begin
        dbms_stats.gather_table_stats(
                ownname => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1 for columns n1 size 254'
        );
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T2',
                method_opt  => 'for all columns size 1 for columns n1 size 254'
        );
end;
/

You’ll notice that in this script I’ve created empty tables and then populated them. This is because of an anomaly that appeared in 18.3 when I used “create as select”, and should allow the results from 18.3 be an exact match for 12c. You don’t need to pay much attention to the Nxxx columns, they were there so I could experiment with a few variations in the selectivity of filter predicates.

Given the purpose of the demonstration I’ve gathered histograms on the column I’m going to use to join the tables (called n1 in this case), and here are the summary results:


TABLE_NAME           COLUMN_NAME          HISTOGRAM       NUM_DISTINCT NUM_BUCKETS
-------------------- -------------------- --------------- ------------ -----------
T1                   N1                   FREQUENCY                119         119
T2                   N1                   FREQUENCY                124         124

     VALUE  FREQUENCY  FREQUENCY      PRODUCT
---------- ---------- ---------- ------------
         0       2488       2619    6,516,072
         1       2693       2599    6,999,107
         2       2635       2685    7,074,975
         3       2636       2654    6,995,944
...
       113          1          3            3
       115          1          2            2
       116          4          3           12
       117          1          1            1
       120          1          2            2
                                 ------------
sum                               188,114,543

We’ve got frequencyy histograms, and we can see that they don’t have a perfect overlap. I haven’t printed every single line from the cardinality query, just enough to show you the extreme skew, a few gaps, and the total. So here are three queries with execution plans:


set serveroutput off

alter session set statistics_level = all;
alter session set events '10053 trace name context forever';

select
        count(*)
from
        t1, t2
where
        t1.n1 = t2.n1
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

select
        count(*)
from
        t1, t2
where
        t1.n1 = t2.n1
and     t1.n0990 = 20
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));


select
        count(*)
from
        t1, t2
where
        t1.n1 = t2.n1
and     t1.n0990 = 20
and     t2.n1150 = 25
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

I’ve queried the pure join – the count was exactly the 188,114,543 predicted by the cardinality query, of course – then I’ve applied a filter to one table, then to both tables. The first filter n0990 = 20 will (given the mod(,990)) definition identify one row in 990 from the original 100,000 in t1; the second filter n1150 = 25 will identify one row in 1150 from t2. That’s filtering down to 101 rows and 87 rows respectively from the two tables. So what do we see in the plans:


-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:23.47 |     748 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:23.47 |     748 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |    188M|    188M|00:00:23.36 |     748 |  6556K|  3619K| 8839K (0)|
|   3 |    TABLE ACCESS FULL| T1   |      1 |    100K|    100K|00:00:00.01 |     374 |       |       |          |
|   4 |    TABLE ACCESS FULL| T2   |      1 |    100K|    100K|00:00:00.01 |     374 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")



-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:00.02 |     748 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.02 |     748 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |    190K|    200K|00:00:00.02 |     748 |  2715K|  2715K| 1647K (0)|
|*  3 |    TABLE ACCESS FULL| T1   |      1 |    101 |    101 |00:00:00.01 |     374 |       |       |          |
|   4 |    TABLE ACCESS FULL| T2   |      1 |    100K|    100K|00:00:00.01 |     374 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")
   3 - filter("T1"."N0990"=20)



-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:00.01 |     748 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.01 |     748 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |    165 |    165 |00:00:00.01 |     748 |  2715K|  2715K| 1678K (0)|
|*  3 |    TABLE ACCESS FULL| T2   |      1 |     87 |     87 |00:00:00.01 |     374 |       |       |          |
|*  4 |    TABLE ACCESS FULL| T1   |      1 |    101 |    101 |00:00:00.01 |     374 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")
   3 - filter("T2"."N1150"=25)
   4 - filter("T1"."N0990"=20)


The first execution plan shows an estimate of 188M rows – but we’ll have to check the trace file to confirm whether that’s only an approximate match to our calculation, or whether it’s an exact match. So here’s the relevant pair of lines:


Join Card:  188114543.000000 = outer (100000.000000) * inner (100000.000000) * sel (0.018811)
Join Card - Rounded: 188114543 Computed: 188114543.000000

Yes, the cardinality calculation and the execution plan estimates match perfectly. But there are a couple of interesting things to note. First, Oracle seems to be deriving the cardinality by multiplying the individual cardinalities of the two tables with a figure it calls “sel” – the thing that Chinar Aliyev has labelled Jsel the “Join Selectivity”. Secondly, Oracle can’t do arithmetic (or, removing tongue from cheek) the value it’s reported for the join selectivity is reported at only 6 decimal places, but stored to far more. What is the Join Selectivity, though ? It’s the figure we derive from the cardinality SQL divided by the cardinality of the cartesian join of the two tables – i.e. 188,114,543 / (100,000 * 100,000).

With the clue from the first trace file, can we work out why the second and third plans show 190K and 165 rows respectively. How about this – multiply the filtered cardinalities of the two separate tables, then multiply the result by the join selectivity:

  • 1a)   n0990 = 20: gives us 1 row in every 990.    100,000 / 990 = 101.010101…    (echoing the rounded execution plan estimate).
  • 1b)   100,000 * (100,000/990) * 0.0188114543 = 190,014.69898989…    (which is in the ballpark of the plan and needs confirmation from the trace file).

 

  • 2a)   n1150 = 25: gives us 1 row in every 1,150.    100,000 / 1,150 = 86.9565217…    (echoing the rounded execution plan estimate)
  • 2b)   (100,000/990) * (100,000/1,150) * 0.0188114543 = 165.2301651..    (echoing the rounded execution plan estimate).

Cross-checking against extracts from the 10053 trace files:


Join Card:  190014.689899 = outer (101.010101) * inner (100000.000000) * sel (0.018811)
Join Card - Rounded: 190015 Computed: 190014.689899

Join Card:  165.230165 = outer (86.956522) * inner (101.010101) * sel (0.018811)
Join Card - Rounded: 165 Computed: 165.230165

Conclusion.

Remembering that we’re still looking at very simple examples with perfect frequency histograms: it looks as if we can work out a “Join Selectivity” (Jsel) – the selectivity of a “pure” unfiltered join of the two tables – by querying the histogram data then use the resulting value to calculate cardinalities for simple two-table equi-joins by multiplying together the individual (filtered) table cardinality estimates and scaling by the Join Selectivity.

Acknowledgements

Most of this work is based on a document written by Chinar Aliyev in 2016 and presented at the Hotsos Symposium the same year. I am most grateful to him for responding to a recent post of mine and getting me interested in spending some time to get re-acquainted with the topic. His original document is a 35 page pdf file, so there’s plenty more material to work through, experiment with, and write about.

 

October 3, 2018

Join Cardinality

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 12:01 pm BST Oct 3,2018

Following up my “Hacking for Skew” article from a couple of days ago, Chinar Aliyev has written an article about a method for persuading the optimizer to calculate the correct cardinality estimate without using any undocumented, or otherwise dubious, mechanisms. His method essentially relies on the optimizer’s mechanism for estimating join cardinality when there are histograms at both ends of the join, so I thought I’d write a short note describing the simplest possible example of the calculation – an example where the query is a single column equi-join with no nulls in either column and a perfect frequency histograms at both ends of the join.  (For a detailed description of more general cases I always refer to the work done by Alberto Dell’Era a few years ago). We start with two data sets that exhibit a strong skew in their data distributions:

rem
rem     Script:         freq_hist_join_02.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2018
rem

execute dbms_random.seed(0)

create table t1
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        trunc(3 * abs(dbms_random.normal))      n1
from
        generator       v1
;

create table t2
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        trunc(3 * abs(dbms_random.normal))      n1
from
        generator       v1
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 254'
        );
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T2',
                method_opt  => 'for all columns size 254'
        );
end;
/


I’ve generated two tables of 10,000 randomly generated values using the dbms_random.normal() function, but I’ve scaled the value up by a factor of three and taken the absolute value – which has given me a range of 12 distinct integer values with a nicely skewed distribution. Then I’ve gathered stats requesting histograms of up to 254 buckets. Since I’ve tested this only on versions from 11.2.0.4 onwards this means I’ll get a perfect histogram on the n1 columns on both tables.

Now I’m going run a query that reports the values and frequencies from the two tables by querying user_tab_histograms using a variant of an analytic query I published a long time ago to convert the cumulative frequencies recorded as the endpoint values into simple frequencies. If, for some reason, this query doesn’t run very efficiently in your tests you could always /*+ materialize */ the two factored subqueries (CTEs – common table expressions):


prompt  =======================================================================
prompt  Multiply and sum matching frequencies. An outer join is NOT needed
prompt  because rows that don't match won't contributed to the join cardinality
prompt  =======================================================================

break on report skip 1
compute sum of product on report
column product format 999,999,999

with f1 as (
select
        endpoint_value                                                            value,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) frequency
from
        user_tab_histograms
where
        table_name  = 'T1'
and     column_name = 'N1'
order by
        endpoint_value
),
f2 as (
select
        endpoint_value                                                            value,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) frequency
from
        user_tab_histograms
where
        table_name  = 'T2'
and     column_name = 'N1'
order by
        endpoint_value
)
select
        f1.value, f1.frequency, f2.frequency, f1.frequency * f2.frequency product
from
        f1, f2
where
        f2.value = f1.value
;


     VALUE  FREQUENCY  FREQUENCY      PRODUCT
---------- ---------- ---------- ------------
         0       2658       2532    6,730,056
         1       2341       2428    5,683,948
         2       1828       1968    3,597,504
         3       1305       1270    1,657,350
         4        856        845      723,320
         5        513        513      263,169
         6        294        249       73,206
         7        133        117       15,561
         8         40         54        2,160
         9         23         17          391
        10          5          5           25
        11          4          2            8
                                 ------------
sum                                18,746,698

As you can see, the two columns do have a highly skewed data distribution. The pattern of the two data sets is similar though the frequencies aren’t identical, of course. The total I get from this calculation is (I claim) the cardinality (rows) estimate that the optimizer will produce for doing an equi-join on these two tables – so let’s see the test:


set serveroutput off
alter session set statistics_level = all;
alter session set events '10053 trace name context forever';

select
        count(*)
from
        t1, t2
where
        t1.n1 = t2.n1
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
alter session set statistics_level = typical;
alter session set events '10053 trace name context off';

And the resulting output:

Session altered.
Session altered.


  COUNT(*)
----------
  18746698


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0wxytnyqs4b5j, child number 0
-------------------------------------
select  count(*) from  t1, t2 where  t1.n1 = t2.n1

Plan hash value: 906334482
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:03.23 |      40 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:03.23 |      40 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |     18M|     18M|00:00:02.96 |      40 |  2616K|  2616K| 2098K (0)|
|   3 |    TABLE ACCESS FULL| T1   |      1 |  10000 |  10000 |00:00:00.01 |      20 |       |       |          |
|   4 |    TABLE ACCESS FULL| T2   |      1 |  10000 |  10000 |00:00:00.01 |      20 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")

As we can see, the estimate for the hash join is “18M” which is in the right ballpark but, in its current format, isn’t entirely helpful which is why I’ve enabled the 10053 trace to get an exact figure from the trace file, and this is what we see:


***********************
Best so far:  Table#: 0  cost: 4.352468  card: 9487.000000  bytes: 28461.000000
              Table#: 1  cost: 378.482370  card: 18467968.000000  bytes: 110807808.000000
***********************

The optimizer’s estimate is exactly the sum of the products of the frequencies of matching values from the (frequency) histogram data. There is a simple rationale for this – it gets the right answer. For each row in t1 with value ‘X’ the (frequency) histogram on t2 tells Oracle how many rows will appear in the join, so multiplying the frequency of ‘X’ in t1 by the frequency of ‘X’ in t2 tells Oracle how many rows the ‘X’s will contribute to the join. Repeat for every distinct value that appears in both (frequency) histograms and sum the results.

As a refinement on this (very simple) example, let’s delete data from the two tables so that we have rows in t1 that won’t join to anything in t2, and vice versa – then re-gather stats, query the histograms, and check the new prediction. We want to check whether a value that appears in the t1 histogram contributes to the join cardinality estimate even if there are no matching values in the t2 histogram (and vice versa):


delete from t1 where n1 = 4;
delete from t2 where n1 = 6;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 254', no_invalidate=>false)
execute dbms_stats.gather_table_stats(user,'t2',method_opt=>'for all columns size 254', no_invalidate=>false)

with f1 as (
select
        endpoint_value                                                            value,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) frequency
from
        user_tab_histograms
where
        table_name  = 'T1'
and     column_name = 'N1'
order by
        endpoint_value
),
f2 as (
select
        endpoint_value                                                            value,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) frequency
from
        user_tab_histograms
where
        table_name  = 'T2'
and     column_name = 'N1'
order by
        endpoint_value
)
select
        f1.value, f1.frequency, f2.frequency, f1.frequency * f2.frequency product
from
        f1, f2
where
        f2.value = f1.value
;


set serveroutput off
alter session set statistics_level = all;
alter session set events '10053 trace name context forever';

select
        count(*)
from
        t1, t2
where
        t1.n1 = t2.n1
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
alter session set statistics_level = typical;
alter session set events '10053 trace name context off';

And the output – with a little cosmetic tidying:


856 rows deleted.
249 rows deleted.

PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.


     VALUE  FREQUENCY  FREQUENCY      PRODUCT
---------- ---------- ---------- ------------
         0       2658       2532    6,730,056
         1       2341       2428    5,683,948
         2       1828       1968    3,597,504
         3       1305       1270    1,657,350
         5        513        513      263,169
         7        133        117       15,561
         8         40         54        2,160
         9         23         17          391
        10          5          5           25
        11          4          2            8
                                 ------------
sum                                17,950,172


Session altered.
Session altered.


  COUNT(*)
----------
  17950172


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0wxytnyqs4b5j, child number 0
-------------------------------------
select  count(*) from  t1, t2 where  t1.n1 = t2.n1

Plan hash value: 906334482
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |00:00:02.89 |      40 |       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:02.89 |      40 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |     17M|     17M|00:00:02.61 |      40 |  2616K|  2616K| 2134K (0)|
|   3 |    TABLE ACCESS FULL| T1   |      1 |   9144 |   9144 |00:00:00.01 |      20 |       |       |          |
|   4 |    TABLE ACCESS FULL| T2   |      1 |   9751 |   9751 |00:00:00.01 |      20 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")


From the 10053 trace file:
***********************
Best so far:  Table#: 0  cost: 4.340806  card: 9144.000000  bytes: 27432.000000
              Table#: 1  cost: 368.100010  card: 17950172.000000  bytes: 107701032.000000
***********************

You can see from the frequency histogram report that we “lost” values 4 and 6 from the report; then the total from the report matches the actual number of rows returned by the query, and the cardinality estimate in the plan is again in the right ballpark – with the trace file showing an exact match.

I’ve run this test on 11.2.0.4,  12.1.0.2,  12.2.0.1 and  18.3.0.0 (which generated a different set of random values) – and there’s an anomaly that appears in 11.2.0.4 (though maybe that should be “disappeared from”): the optimizer’s estimate for the cardinality was a little larger than the value generated in the query against user_tab_histograms. [Now explained (probably)]

Conclusion:

For an incredibly simple class of queries with perfect frequency histograms there’s a very simple way to calculate the cardinality estimate that the optimizer will predict. Match up rows from the two frequency histograms, multiply the corresponding frequencies (making sure you don’t multiply the cumulative frequencies), and sum.

This is, of course, only a tiny step in the direction of seeing how Oracle uses histograms and covers only a type of query that is probably too simple to appear in a production system, but it’s a basis on which I may build in future notes over the next few weeks.

Update (5th Oct)

The “error” in the 11g calculation irritated me a little, and I woke up this morning with an idea about the solution. In 10.2.0.4 Oracle changed the way the optimizer calculated for a predicate that used a value that did not appear in the frequency histogram: it did the arithmetic for  “half the least frequently occurring value”. So I thought I’d run up a test where for my “sum of products” query I emulated this model. I had to change my query to an “ANSI”-style full outer join, and here it is:

with f1 as (
select 
        endpoint_value                                                            value, 
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) frequency
from 
        user_tab_histograms 
where 
        table_name  = 'T1' 
and     column_name = 'N1'
),
f2 as (
select 
        endpoint_value                                                            value, 
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) frequency
from 
        user_tab_histograms 
where 
        table_name  = 'T2' 
and     column_name = 'N1'
)
select
        f1.value, f2.value,
        nvl(f1.frequency, 0)                t1_frequency,
        nvl(f2.frequency, 0)                t2_frequency,
        nvl(f1.frequency, &t1_least / 2) *
        nvl(f2.frequency, &t2_least / 2)    product
from
        f1
full outer join
        f2
on
        f2.value = f1.value
order by
        coalesce(f1.value, f2.value)
;

Running this code, and noting that the least frequent value in t1 was 4, while the least frequence in t2 was 2, I got the following results (with the 10053 trace file summary following the output)


     VALUE      VALUE T1_FREQUENCY T2_FREQUENCY      PRODUCT
---------- ---------- ------------ ------------ ------------
         0          0         2658         2532    6,730,056
         1          1         2341         2428    5,683,948
         2          2         1828         1968    3,597,504
         3          3         1305         1270    1,657,350
                    4            0          845        1,690
         5          5          513          513      263,169
         6                     294            0          294
         7          7          133          117       15,561
         8          8           40           54        2,160
         9          9           23           17          391
        10         10            5            5           25
        11         11            4            2            8
                      ------------ ------------ ------------
sum                           9144         9751   17,952,156


Join Card:  17952157.000000 = outer (9751.000000) * inner (9144.000000) * sel (0.201341)
Join Card - Rounded: 17952157 Computed: 17952157.00
 

That’s a pretty good match to the trace file result – and the difference of 1 may simply be a rounding error (despite the trace files text suggesting it is accurate to 6 d.p.)

Footnote

Following an exchange of email with Chinar Aliyev, it’s fairly clear that the “half the least frequency” can actually be derived as “table.num_rows * column.density”.

 

September 27, 2018

Column Group Catalog

Filed under: CBO,extended stats,Indexing,Oracle,Statistics — Jonathan Lewis @ 5:16 pm BST Sep 27,2018

I seem to have written a number of aricles about column groups – the rather special, and most useful, variant on extended stats. To make it as easy as possible to find the right article I’ve decided to produce a little catalogue (catalog) of all the relevant articles, with a little note about the topic each article covers. Some of the articles will link to others in the list, and there are a few items in the list from other blogs. There are also a few items which are the titles of drafts which have been hanging around for the last few years.

 

July 31, 2018

Extended Histograms

Filed under: CBO,extended stats,Histograms,Oracle,Statistics — Jonathan Lewis @ 11:05 pm BST Jul 31,2018

Today’s little puzzle comes courtesy of the Oracle-L mailing list. A table has two columns (c2 and c3), which contain only the values ‘Y’ and ‘N’, with the following distribution:


select   c2, c3, count(*)
from     t1
group by c2, c3
;

C C   COUNT(*)
- - ----------
N Y       1994
Y N      71482

2 rows selected.

The puzzle is this – how do you get the optimizer to predict a cardinality of zero (or, using its best approximation, 1) if you execute a query where the predicate is:

where   c2 = 'N' and c3 = 'N'

Here are 4 tests you might try:

  • Create simple stats (no histograms) on c2 and c3.
  • Create frequency histograms on c2 and c3
  • Create a column group (extended stats) on (c2,c3) but no histograms
  • Create a column group (extended stats) on (c2,c3) with a histogram on (c2, c3)

If you do these tests you’ll find the estimated cardinalities are (from 12.1.0.2):

  • 18,369 – derived as 73,476 / 4  … total rows over total possible combinations
  • 1,940   – derived as 73,476 * (1,994/73,476) * (71,482/73,476) … total rows * fraction where c2 = ‘N’ * fraction where c3 = ‘N’
  • 36,738 – derived as 73,476 / 2 … total rows / number of distinct combinations of (c2, c3)
  • 997      – derived as 1,994 / 2 … half the frequency of the least frequently occurring value in the histogram

The last algorithm appeared in 10.2.0.4; prior to that a “value not in frequency histogram” would have been given an estimated cardinality of 1 (which is what the person on Oracle-L wanted to see).

In fact the optimizer’s behaviour can be reverted to the 10.2.0.3 mechanism by setting fix-control 5483301 to zero (or off), either with an “alter session” call or inside the /*+ opt_param() */ hint. There is, however, another option – if you get the column stats, then immediately set them (dbms_stats.get_column_stats(), dbms_stats.set_column_stats()) the optimizer defines the stats as “user defined” and (for reasons I don’t know – perhaps it’s an oversight) reverts to the 10.2.0.3 behaviour. Here’s some code to demonstrate the point; as the srcipt header says, I’ve tested it on versions up to 18.1


rem
rem     Script:         histogram_hack_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2018
rem
rem     Last tested 
rem             18.1.0.0        via LiveSQL (with some edits)
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1
as
select 'Y' c2, 'N' c3 from all_objects where rownum <= 71482 -- > comment to avoid format issue
union all
select 'N' c2, 'Y' c3 from all_objects where rownum <= 1994 -- > comment to avoid format issue
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 10 for columns (c2,c3) size 10');

column column_name format a128 new_value m_colname

select  column_name
from    user_tab_cols
where   table_name = 'T1'
and     column_name not in ('C2','C3')
;

set autotrace traceonly explain
select /* pre-hack */ * from t1 where c2 = 'N' and c3 = 'N';
set autotrace off

declare
        l_distcnt number default null;
        l_density number default null;
        l_nullcnt number default null;
        l_srec    dbms_stats.statrec;
        l_avgclen number default null;

begin

        dbms_stats.get_column_stats (
                ownname =>user,
                tabname =>'t1',
                colname =>'&m_colname',
                distcnt =>l_distcnt,
                density =>l_density,
                nullcnt =>l_nullcnt,
                srec    =>l_srec,
                avgclen =>l_avgclen
        );

        dbms_stats.set_column_stats(
                ownname =>user,
                tabname =>'t1',
                colname =>'&m_colname',
                distcnt =>l_distcnt,
                density =>l_density,
                nullcnt =>l_nullcnt,
                srec    =>l_srec,
                avgclen =>l_avgclen
        );

end;
/

set autotrace traceonly explain
select /* post-hack */  * from t1 where c2 = 'N' and c3 = 'N';
set autotrace off

I’ve created a simple table for the data and collected stats including histograms on the two columns and on the column group. I’ve taken a simple strategy to find the name of the column group (I could have used the function dbms_stats.create_extended_stats() to set an SQL variable to the name of the column group, of course), and then run a little bit of PL/SQL that literally does nothing more than copy the column group’s stats into memory then write them back to the data dictionary.

Here are the “before” and “after” execution plans that we get from autotrace:

BEFORE
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   997 |  3988 |    23  (27)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   997 |  3988 |    23  (27)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='N' AND "C3"='N')


AFTER
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     4 |    23  (27)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |     4 |    23  (27)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='N' AND "C3"='N')

As required – the estimate for the (‘N’,’N’) rows drops down to (the optimizer’s best approximation to ) zero.

Footnote:

An alternative strategy (and, I’d say, a better strategic approach) would have been to create a “fake” frequency histogram that included the value (‘N’,’N’) giving it a frequency of 1 – a method I’ve suggested in the past  but with the little problem that you need to be able to work out the value to use in the array passed to dbms_stats.set_column_stats() to represent the value for the (‘N’,’N’) combination – and I’ve written about that topic in the past as well.

You might wonder why the optimizer is programmed to use “half the least popular” for predicates references values not in the index. Prior to 12c it’s easy to make an argument for the algorithm. Frequency histograms used to be sampled with a very small sample size, so if you were unlucky a “slightly less popular” value could be missed completely in the sample; if you were requesting a value that didn’t appear in the histogram then (presumably) you knew it should exist in the data, so guessing a cardinality somewhat less than the least popular must have seemed like a good idea.

In 12c, of course, you ought to be taking advantage of the “approximate NDV” implementation for using a 100% sample to generate frequency (and Top-N / Top-Frequency histograms). If you’ve got a 12c frequency histogram then the absence of a value in the histogram means the data really wasn’t there so a cardinality estimate of 1 makes more sense. (Of course, you might have allowed Oracle to gather the histogram at the wrong time – but that’s a different issue). If you’ve got a Top-N histogram then the optimizer will behave as if a “missing” value is one of those nominally allowed for in the “low frequency” bucket and use neither the 1 nor the “half the least popular”.

So, for 12c and columns with frequency histograms it seems perfectly reasonably to set the fix control to zero – after getting approval from Oracle support, of course.

 

July 13, 2018

pushing predicates

Filed under: CBO,Execution plans,Hints,Oracle — Jonathan Lewis @ 1:05 pm BST Jul 13,2018

I came across this odd limitation (maybe defect) with pushing predicates (join predicate push down) a few years ago that made a dramatic difference to a client query when fixed but managed to hide itself rather cunningly until you looked closely at what was going on. Searching my library for something completely different I’ve just rediscovered the model I built to demonstrate the issue so I’ve tested it against a couple of newer versions  of Oracle (including 18.1) and found that the anomaly still exists. It’s an interesting little detail about checking execution plans properly so I’ve written up the details. The critical feature of the problem is a union all view:


rem
rem	Script:		push_pred_limitation.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Jan 2015
rem
rem	Last tested 
rem		18.1.0.0	via LiveSQL
rem		12.2.0.1
rem		12.1.0.2
rem		11.2.0.4
rem

create table t1
as
select	* 
from	all_objects
where	rownum <= 10000 -- > comment to avoid WordPress format issue
;

create table t2
as
select	* 
from	all_objects
where	rownum <= 10000 -- > comment to avoid WordPress format issue
;

create table t3
as
select	* 
from	all_objects
where	rownum <= 10000 -- > comment to avoid WordPress format issue
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1 for columns owner size 254'
	);
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T2',
		method_opt	 => 'for all columns size 1'
	);
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T3',
		method_opt	 => 'for all columns size 1'
	);
end;
/

create index t2_id on t2(object_id);
-- create index t2_id_ot on t2(object_id, object_type);

create index t3_name_type on t3(object_name, object_type);

create or replace view v1
as
select 
	/*+ qb_name(part1) */
	t2.object_id,
	t2.object_type	object_type_2,
	t3.object_type	object_type_3,
	t2.created	date_2,
	t3.created	date_3
from
	t2, t3
where
	t3.object_name = t2.object_name
union all
select
	/*+ qb_name(part2) */
	t2.object_id,
	t2.object_type	object_type_2,
	t3.object_type	object_type_3,
	t2.last_ddl_time	date_2,
	t3.last_ddl_time	date_3
from
	t2, t3
where
	t3.object_name = t2.object_name
;

Two points to note so far: first, the view is basically joining the same two tables in the same way twice but selecting different columns. It’s a close model of what the client was doing but so much simpler that it wouldn’t be hard to find a different way of getting the same result: the client’s version would have been much far harder to rewrite. Secondly, I’ve listed two possible indexes for table t2 but commented one of them out. The indexing will make a difference that I’ll describe later.

So here’s the query with execution plan (from explain plan – but pulling the plan from memory gives the same result):


select
	/*+ qb_name(main) */
	t1.object_name, t1.object_type,
	v1.object_id, v1.date_2, v1.date_3
from
	t1,
	v1
where
	v1.object_id = t1.object_id
and	v1.object_type_2 = t1.object_type
and	v1.object_type_3 = t1.object_type
and	t1.owner = 'OUTLN'
;

Plan hash value: 4123301926

---------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |              |     7 |   588 |    82   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                            |              |     7 |   588 |    82   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                      | T1           |     7 |   280 |    26   (4)| 00:00:01 |
|*  3 |   VIEW                                   | V1           |     1 |    44 |     8   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE            |              |       |       |            |          |
|   5 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|   6 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|   7 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
|  11 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|  12 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|  13 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|* 14 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|* 15 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  16 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."OWNER"='OUTLN')
   3 - filter("V1"."OBJECT_TYPE_2"="T1"."OBJECT_TYPE")
   8 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
   9 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  14 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
  15 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")

The execution plan appears to be fine – we can see at operation 4 that the union all view has been access with the pushed predicate option and that the subsequent sub-plan has
used index driven nested loop joins in both branches – until we look a little more closely and examine the Predicate section of the plan. What, exactly, has been pushed ?

Look at the predicate for operation 3: “V1″.”OBJECT_TYPE_2″=”T1″.”OBJECT_TYPE”. It’s a join predicate that hasn’t been pushed into the view. On the other hand the original, and similar, join predicate v1.object_type_3 = t1.object_type has been pushed into the view, appearing at operations 9 and 15. There is a difference, of course, the object_type_3 column appears as the second column of the index on table t3.

Two questions then: (a) will the object_type_2 predicate be pushed if we add it to the relevant index on table t2, (b) is there a way to get the predicate pushed without adding it to the index. The answer to both questions is yes. First the index – re-run the test but create the alternative index on t2 and the plan changes to:

Plan hash value: 497545587

---------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |              |     7 |   553 |    82   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                            |              |     7 |   553 |    82   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                      | T1           |     7 |   280 |    26   (4)| 00:00:01 |
|   3 |   VIEW                                   | V1           |     1 |    39 |     8   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE            |              |       |       |            |          |
|   5 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|   6 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|   7 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN                  | T2_ID_OT     |     1 |       |     1   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
|  11 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|  12 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|  13 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|* 14 |        INDEX RANGE SCAN                  | T2_ID_OT     |     1 |       |     1   (0)| 00:00:01 |
|* 15 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  16 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."OWNER"='OUTLN')
   8 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID" AND "T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
   9 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  14 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID" AND "T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  15 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")

Notice how the predicate at operation 3 has disappeared, and the access predicate at operation 8 now includes the predicate “T2″.”OBJECT_TYPE”=”T1″.”OBJECT_TYPE”.

Alternatively, don’t mess about with the indexes – just tell Oracle to push the predicate. Normally I would just try /*+ push_pred(v1) */ as the hint to do this, but the Outline section of the original execution plan already included a push_pred() hint that looked like this: PUSH_PRED(@”MAIN” “V1″@”MAIN” 3 1), so I first copied exactly that into the SQL to see if it would make any difference. It did – I got the following plan (and the hint in the outline changed to PUSH_PRED(@”MAIN” “V1″@”MAIN” 3 2 1) so this may be a case where the plan produced by a baseline will perform better than the plan that the produced the baseline!):

Plan hash value: 4123301926

---------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |              |     7 |   553 |    82   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                            |              |     7 |   553 |    82   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL                      | T1           |     7 |   280 |    26   (4)| 00:00:01 |
|   3 |   VIEW                                   | V1           |     1 |    39 |     8   (0)| 00:00:01 |
|   4 |    UNION ALL PUSHED PREDICATE            |              |       |       |            |          |
|   5 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|   6 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
|  11 |     NESTED LOOPS                         |              |     1 |    77 |     4   (0)| 00:00:01 |
|  12 |      NESTED LOOPS                        |              |     1 |    77 |     4   (0)| 00:00:01 |
|* 13 |       TABLE ACCESS BY INDEX ROWID BATCHED| T2           |     1 |    41 |     2   (0)| 00:00:01 |
|* 14 |        INDEX RANGE SCAN                  | T2_ID        |     1 |       |     1   (0)| 00:00:01 |
|* 15 |       INDEX RANGE SCAN                   | T3_NAME_TYPE |     1 |       |     1   (0)| 00:00:01 |
|  16 |      TABLE ACCESS BY INDEX ROWID         | T3           |     1 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."OWNER"='TEST_USER')
   7 - filter("T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
   8 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
   9 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  13 - filter("T2"."OBJECT_TYPE"="T1"."OBJECT_TYPE")
  14 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID")
  15 - access("T3"."OBJECT_NAME"="T2"."OBJECT_NAME" AND "T3"."OBJECT_TYPE"="T1"."OBJECT_TYPE")

In this case we see that the critical late-joining predicate has disappeared from operation 3 and re-appeared as a filter predicate at operation 7 In many cases you may find that the change in predicate use makes little difference to the performance – in my example the variation in run time over several executions of each query was larger than the average run time of the query; nevertheless it’s worth noting that the delayed use of the predicate could have increased the number of probes into table t3 for both branches of the union all and resulted in redundant data passing up through several layers of the call stack before being eliminated … and “eliminate early” is one of the major commandments of optimisation.

You might notice that the Plan Hash Value for the hinted execution plan is the same as for the original execution plan: the hashing algorithm doesn’t take the predicates into account (just one of many points that Randolf Geist raised in a blog post several years ago). This is one of the little details that makes it easy to miss the little changes in a plan that can make a big difference in performance.

Summary

If you have SQL that joins simple tables to set based (union all, etc.) views and you see the pushed predicate option appearing take a little time to examine the predicate section of the execution plan to see if the optimizer is pushing all the join predicates that it should and, if it isn’t, test the effects of pushing more predicates.

In many cases adding the hint /*+ push_pred(your_view_name) */ at the top of the query may be sufficient to get the predicate pushing you need, but you may need to look at the outline section of the execution plan and add a series of more complicated push_pred() and no_push_pred() hints because the push_pred hint has evolved over time to deal with increasingly complicated transformations.

 

July 2, 2018

Clustering_Factor

Filed under: CBO,Indexing,Oracle,Statistics — Jonathan Lewis @ 1:24 pm BST Jul 2,2018

Here’s another little note on the clustering_factor for an index and the table preference table_cached_blocks that can be set with a call to dbms_stats.set_table_prefs(). I might be repeating a point that someone made in a comment on an older posting but if that’s the case I can’t find the comment at present, and it’s worth its own posting anyway.

The call to dbms_stats.set_table_prefs(null,'{tablename}’,’table_cached_blocks’,N) – where N can be any integer between 1 and 255, will modify Oracle’s algorithm for calculating the clustering_factor of an index. The default is 1, which often means the clustering_factor is much higher than it ought to be from a humanly visible perspective and leads to Oracle not using an index that could be a very effective index.

The big point is this: the preference has no effect when you execute a “create index” statement, or an “alter index rebuild” statement. Here’s a simple script to demonstrate the point.


rem
rem     Script:         table_cached_blocks_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2018
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem

drop table t1 purge;
create table t1
segment creation immediate
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum-1,100)               n1,
        mod(rownum-1,100)               n2,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator
;

column blocks new_value m_blocks

select  blocks 
from    user_tables
where   table_name = 'T1'
;

column preference_value format a40

select  preference_name, preference_value
from    user_tab_stat_prefs
where
        table_name = 'T1'
;

I’ve created a very simple table of 10,000 rows with two identical columns and captured the number of blocks (which I know will be less than 256) in a substitution variable which I’m going to use in a call to set_table_prefs(). I’ve also run a quick check to show there are no table preferences set for the table. I’ll be running the same check again after setting the table_cached_blocks preference. Step 1 – create two indexes, but set the preference after building the first one; I’ve shown the result of the query against user_indexes immediately after the query:


create index t1_i1 on t1(n1);

execute dbms_stats.set_table_prefs(null,'t1','table_cached_blocks',&m_blocks)

create index t1_i2 on t1(n2);

select
        index_name, clustering_factor, to_char(last_analyzed, 'dd-mon-yyyy hh24:mi:ss') analyzed
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;

INDEX_NAME	     CLUSTERING_FACTOR ANALYZED
-------------------- ----------------- -----------------------------
T1_I1				 10000 26-jun-2018 14:13:51
T1_I2				 10000 26-jun-2018 14:13:51


Now we check the effect of rebuilding the t1_i2 index – the one second sleep is so that we can use the last_analyzed time to see that new stats have been created for the index:


execute dbms_lock.sleep(1)
alter index t1_i2 rebuild /* online */ ;

select
        index_name, clustering_factor, to_char(last_analyzed, 'dd-mon-yyyy hh24:mi:ss') analyzed
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;

INDEX_NAME	     CLUSTERING_FACTOR ANALYZED
-------------------- ----------------- -----------------------------
T1_I1				 10000 26-jun-2018 14:13:51
T1_I2				 10000 26-jun-2018 14:13:52


Finally we do an explicit gather_index_stats():


execute dbms_lock.sleep(1)
execute dbms_stats.gather_index_stats(null,'t1_i2')

select
        index_name, clustering_factor, to_char(last_analyzed, 'dd-mon-yyyy hh24:mi:ss') analyzed
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;

INDEX_NAME	     CLUSTERING_FACTOR ANALYZED
-------------------- ----------------- -----------------------------
T1_I1				 10000 26-jun-2018 14:13:51
T1_I2				   179 26-jun-2018 14:13:53

At last – on the explicit call to gather stats – the table_cached_blocks preference is used.

Dire Threat

Think about what this means: you’ve carefully worked out that a couple of indexes really need a special setting of table_cached_blocks and you gathered stats on those indexes so you have a suitable value for the clustering_factor. Then, one night, someone decides that they’re going to rebuild some of those indexes. The following morning the clustering_factor is much higher and a number of critical execution plans change as a consequence, and don’t revert until the index statistics (which are perfectly up to date!) are re-gathered.

Footnote

The same phenomenon appears even when you’ve set the global preference for stats collection with dbms_stats.set_global_prefs().

June 23, 2018

Cursor_sharing force

Filed under: CBO,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 12:05 pm BST Jun 23,2018

Prompted by a recent ODC (OTN) question I’ve just written up an example of one case where setting the cursor_sharing parameter to force doesn’t work as you might expect. It’s a specific example of what I believe is a theme that can appear in several different circumstances: if your SQL mixes “genuine” bind variable with literals then the literals may not be substituted.

Here’s a simple data set to start with:


rem
rem     Script:         cursor_sharing_limit.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2018
rem     Purpose:
rem
rem     Last tested
rem             18.1.0.0
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1
as
select
        rownum            n1,
        rownum            n2,
        lpad(rownum,10)   small_vc,
        rpad('x',100,'x') padding
from dual
connect by
        level <= 1e4 -- > comment to avoid WordPress format issue
;

I haven’t bothered to gather stats – it’s not needed in trivial cases of CTAS.

Here’s one way to execute SQL that doesn’t get literal substitution when (apparently) it should:


alter system flush shared_pool;
alter session set cursor_sharing=force;

declare
        m_ct number;
        m_n1 number := 20;
begin
        execute immediate
                'select /*+ trace this */ count(*) from t1 where n2 = 15 and n1 = :b1'
                into m_ct using m_n1;
        dbms_output.put_line(m_ct);

        execute immediate
                'select /*+ trace this too */ count(*) from t1 where n1 = 15 and n2 = 15'
                into m_ct;
        dbms_output.put_line(m_ct);
end;
/

alter session set cursor_sharing=exact;

I’ve included a “hint” that allows me to find the SQL statements in v$sql very easily – and here they are, preceded by the query I used to find them:


select  sql_id, parse_calls, executions, rows_processed, sql_text 
from    v$sql
where   sql_text like 'select%trace this%'
and     sql_text not like '%v$sql%'
;


SQL_ID        PARSE_CALLS EXECUTIONS ROWS_PROCESSED
------------- ----------- ---------- --------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
cbu4s78h5pfj5           1          1              1
select /*+ trace this too */ count(*) from t1 where n1 = :"SYS_B_0" and n2 = :"SYS_B_1"

cru67sufptx8x           1          1              1
select /*+ trace this */ count(*) from t1 where n2 = 15 and n1 = :b1

Notice how one statement – which used only literals – underwent bind variable transformation but the other – with its mix of literals and bind variables – didn’t. On the other hand, if I execute the mixed format statement from a basic SQL*Plus session then pull the plan from memory, this is what I get:


SQL> select /*+ SQL*Plus session */ count(*) from t1 where n2 = 15 and n1 = :b1;

  COUNT(*)
----------
         0

1 row selected.

SQL> select  sql_id, parse_calls, executions, rows_processed, sql_text
  2  from    v$sql
  3  where   sql_text like 'select%Plus session%'
  4  and     sql_text not like '%v$sql%'
  5  ;

SQL_ID        PARSE_CALLS EXECUTIONS ROWS_PROCESSED
------------- ----------- ---------- --------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
gq2qy2a9yuta7           1          1              1
select /*+ SQL*Plus session */ count(*) from t1 where n2 = :"SYS_B_0" and n1 = :b1

When trying to take advantage of cursor_sharing = force, you have to think about how the SQL gets to the database engine – is it delivered directly, or is it sent through some layer of code that means the raw SQL statement is protected in some way from the substitution code.

Footnote:

I happen to have used an example that puts the bind variable into the where clause; you will see the same effect even if the bind variables are in the select list – for example if you’ve selected something like to_char(date_col, :formatting_variable).

May 30, 2018

Index Bouncy Scan 3

Filed under: 12c,Bugs,CBO,Oracle — Jonathan Lewis @ 1:15 pm BST May 30,2018

This is a follow-up to a problem I had with yesterday’s example of using recursive CTEs to “bounce” along a multi-column index to pick out the unique set of combinations of the first two columns. Part of the resulting query used a pair of aggregate scalar subqueries in a select list – and Andrew Sayer improved on my query by introducing a “cross apply” (which I simply hadn’t thought of) which the optimizer transformed into a lateral view (which I had thought of, but couldn’t get to work).

After seeing what the Andrew and the optimizer had done I looked a little more closely at my lateral view experiment and modified it so that it worked. Here are the three critical versions of the relevant code fragment; first is my original code, then Andrew’s cross apply, then my working lateral view version:

select
        (select min(t1.val1) val1 from t1 where t1.val1 > bounce1.val1) val1,
        (select min(t1.val2) val2 from t1 where t1.val1 > bounce1.val1 and rownum = 1) val2
from    bounce1
where   bounce1.val1 is not null
 
 
select
        ca.val1 ,ca.val2
from    bounce1
cross  apply (select val1, val2
              from  (select /*+ index(t1) no_index_ffs(t1) */
                             val1, val2
                     from    t1
                     where   t1.val1 > bounce1.val1
                     and     rownum = 1
                    )
             ) ca
where  bounce1.val1 is not null
 
----

select
        ca.val1 ,ca.val2
from    bounce1, 
        lateral(select val1, val2
              from  (select /*+ index(t1) no_index_ffs(t1) */
                             val1, val2
                     from    t1
                     where   t1.val1 > bounce1.val1
                     and     rownum = 1
                    )
             ) ca
where  bounce1.val1 is not null

All I’ve done to modify Andrew’s code is put a comma after the table (actually CTE) bounce1, then change “cross apply” to “lateral”. Compare the resulting text with the following lateral version that doesn’t work:


select
        ca.val1 ,ca.val2
from    bounce1, 
        lateral (
                   select /*+ index(t1) no_index_ffs(t1) */
                             val1, val2
                     from    t1
                     where   t1.val1 > bounce1.val1
                     and     rownum = 1
             ) ca
where  bounce1.val1 is not null

To get from not working to working all I’ve done is wrap the text in my lateral() subquery inside one more (apparently redundant) layer of “select * from ()”!

In fact my assumption that my code wasn’t working was incorrect – what was really going on was that the code I had written was producing the wrong results but I thought that I had made a mistake in the way I was writing it and couldn’t figure out what I had done wrong.

Problem Solving:

To get a better idea of what’s going on, I took a closer look at the execution plans. Here are the plans (main body only) for the two variants of using the lateral() view – the first from the SQL with the “redundant” select, the second as I originally wrote it. Notice that the number of rows (A-Rows) returned in the first case is the 30 expected while in the second case it’s only 10.


---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                 |      1 |        |   125 (100)|     30 |00:00:00.01 |      40 |     28 |       |       |          |
|   1 |  SORT ORDER BY                               |                 |      1 |      4 |   125   (3)|     30 |00:00:00.01 |      40 |     28 |  2048 |  2048 | 2048  (0)|
|*  2 |   VIEW                                       |                 |      1 |      4 |   124   (2)|     30 |00:00:00.01 |      40 |     28 |       |       |          |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST  |                 |      1 |        |            |     33 |00:00:00.01 |      40 |     28 |  1024 |  1024 |          |
|*  4 |     VIEW                                     |                 |      1 |      2 |    61   (2)|      3 |00:00:00.01 |       8 |      4 |       |       |          |
|   5 |      UNION ALL (RECURSIVE WITH) BREADTH FIRST|                 |      1 |        |            |      3 |00:00:00.01 |       8 |      4 |  1024 |  1024 |          |
|*  6 |       VIEW                                   |                 |      1 |      1 |    29   (0)|      1 |00:00:00.01 |       2 |      1 |       |       |          |
|*  7 |        WINDOW NOSORT STOPKEY                 |                 |      1 |  10000 |    29   (0)|      1 |00:00:00.01 |       2 |      1 | 73728 | 73728 |          |
|   8 |         INDEX FULL SCAN                      | T1_PK           |      1 |  10000 |    29   (0)|      2 |00:00:00.01 |       2 |      1 |       |       |          |
|   9 |       NESTED LOOPS                           |                 |      3 |      1 |    31   (0)|      2 |00:00:00.01 |       6 |      3 |       |       |          |
|  10 |        RECURSIVE WITH PUMP                   |                 |      3 |        |            |      3 |00:00:00.01 |       0 |      0 |       |       |          |
|  11 |        VIEW                                  | VW_LAT_1BBF5C63 |      3 |      1 |     2   (0)|      2 |00:00:00.01 |       6 |      3 |       |       |          |
|  12 |         VIEW                                 |                 |      3 |      1 |     2   (0)|      2 |00:00:00.01 |       6 |      3 |       |       |          |
|* 13 |          COUNT STOPKEY                       |                 |      3 |        |            |      2 |00:00:00.01 |       6 |      3 |       |       |          |
|* 14 |           INDEX RANGE SCAN                   | T1_PK           |      3 |      1 |     2   (0)|      2 |00:00:00.01 |       6 |      3 |       |       |          |
|  15 |     SORT AGGREGATE                           |                 |     30 |      1 |            |     30 |00:00:00.01 |      32 |     24 |       |       |          |
|  16 |      FIRST ROW                               |                 |     30 |      1 |     2   (0)|     27 |00:00:00.01 |      32 |     24 |       |       |          |
|* 17 |       INDEX RANGE SCAN (MIN/MAX)             | T1_PK           |     30 |      1 |     2   (0)|     27 |00:00:00.01 |      32 |     24 |       |       |          |
|  18 |     RECURSIVE WITH PUMP                      |                 |     11 |        |            |     30 |00:00:00.01 |       0 |      0 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------


------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                 |      1 |        |   125 (100)|     10 |00:00:00.01 |      16 |       |       |          |
|   1 |  SORT ORDER BY                               |                 |      1 |      4 |   125   (3)|     10 |00:00:00.01 |      16 |  2048 |  2048 | 2048  (0)|
|*  2 |   VIEW                                       |                 |      1 |      4 |   124   (2)|     10 |00:00:00.01 |      16 |       |       |          |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST  |                 |      1 |        |            |     11 |00:00:00.01 |      16 |  1024 |  1024 |          |
|*  4 |     VIEW                                     |                 |      1 |      2 |    61   (2)|      1 |00:00:00.01 |       4 |       |       |          |
|   5 |      UNION ALL (RECURSIVE WITH) BREADTH FIRST|                 |      1 |        |            |      1 |00:00:00.01 |       4 |  1024 |  1024 |          |
|*  6 |       VIEW                                   |                 |      1 |      1 |    29   (0)|      1 |00:00:00.01 |       2 |       |       |          |
|*  7 |        WINDOW NOSORT STOPKEY                 |                 |      1 |  10000 |    29   (0)|      1 |00:00:00.01 |       2 | 73728 | 73728 |          |
|   8 |         INDEX FULL SCAN                      | T1_PK           |      1 |  10000 |    29   (0)|      2 |00:00:00.01 |       2 |       |       |          |
|   9 |       NESTED LOOPS                           |                 |      1 |      1 |    31   (0)|      0 |00:00:00.01 |       2 |       |       |          |
|  10 |        RECURSIVE WITH PUMP                   |                 |      1 |        |            |      1 |00:00:00.01 |       0 |       |       |          |
|* 11 |        VIEW                                  | VW_DCL_1BBF5C63 |      1 |      1 |     2   (0)|      0 |00:00:00.01 |       2 |       |       |          |
|* 12 |         COUNT STOPKEY                        |                 |      1 |        |            |      1 |00:00:00.01 |       2 |       |       |          |
|  13 |          INDEX FULL SCAN                     | T1_PK           |      1 |      1 |     2   (0)|      1 |00:00:00.01 |       2 |       |       |          |
|  14 |     SORT AGGREGATE                           |                 |     10 |      1 |            |     10 |00:00:00.01 |      12 |       |       |          |
|  15 |      FIRST ROW                               |                 |     10 |      1 |     2   (0)|      9 |00:00:00.01 |      12 |       |       |          |
|* 16 |       INDEX RANGE SCAN (MIN/MAX)             | T1_PK           |     10 |      1 |     2   (0)|      9 |00:00:00.01 |      12 |       |       |          |
|  17 |     RECURSIVE WITH PUMP                      |                 |     11 |        |            |     10 |00:00:00.01 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------

Most importantly we can see that the optimizer has used two different transformations. For the working query we see the view name VW_LAT_xxxxxxxx at operation 11, this is Oracle implementing a lateral view; for the problem query we see the view name VW_DCL_xxxxxxxx at operation 11, which is Oracle implementing a transformation to a “decorrelated lateral view”.

My first test after noting this difference was to see what would happen in I added the hint /*+ no_query_transformation */ to the query: it resulted in the VW_DCL_xxxxxxxx view name changing to VW_LAT_xxxxxxxx and the query producing the right result. Andrew Sayer, on the ODC thread, then pointed out that he’d done a couple more experiments and used the /*+ no_decorrelate() */ hint so I tried that with my query, adding it (with no parameters) to the subquery inside the lateral() clause – again the plan changed from using VW_DCL to VW_LAT and the results were correct.

Test Case

Bottom line on this – it looks like the optimizer is decorrelating a subquery when it shouldn’t, leading to wrong results. To make it easier to see this anomaly I stripped the original sample down to a basic test case starting with the table that I used in the previous posting:

rem
rem     Script:         decorralate.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem
rem     Last tested 
rem             18.1.0.0  -- via liveSQL
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1
segment creation immediate
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum-1,3)                 val1,
        mod(rownum-1,10)                val2,
        lpad('x',100,'x')               padding
from
        generator       v1
order by
        dbms_random.value
;

begin
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

alter table t1 add constraint t1_pk primary key(val1, val2, id);

Now two versions of a simplified piece of code that should select the distinct values of val1 greater than the lowest value (each row in the UNION ALL of dual is emulating the way in which yesterday’s recursive CTE was effectively saying “this is a current known value, find the next higher”):


prompt  =============
prompt  Right results
prompt  =============

select
        v1.val1, v1.val2
from    (
        select  0 val1, 0 val2 from dual
        union all
        select 1,0 from dual
        union all
        select 2,0 from dual
        ) bounce1,
        lateral (
            select val1, val2 from (
              select  /*+ index(t1) no_index_ffs(t1) */
                      t1.val1, t1.val2
              from    t1
              where   t1.val1 > bounce1.val1
              and     rownum = 1
            )
        ) v1
;

prompt  ===========================================
prompt  Wrong results -- "redundant" select removed
prompt  ===========================================

select
        v1.val1, v1.val2
from    (
        select  0 val1, 0 val2 from dual
        union all
        select 1,0 from dual
        union all
        select 2,0 from dual
        ) bounce1,
        lateral (
            -- select val1, val2 from (
              select  /*+ index(t1) no_index_ffs(t1) */
                      t1.val1, t1.val2
              from    t1
              where   t1.val1 > bounce1.val1
              and     rownum = 1
            -- )
        ) v1
;

Here’s a cut-n-paste from running the two queries:


=============
Right results
=============

      VAL1       VAL2
---------- ----------
         1          0
         2          0

2 rows selected.

============================================
Wrong results  -- "redundant" select removed
============================================

no rows selected

Finally, to get an idea of what’s gone wrong – and to show that the optimizer has done something wrong when attempting to decorrelate – we can take a look at the optimizer trace file to see the final transformed SQL that the optimizer has produced a plan for. (I enabled the trace with the command “alter session set events ‘trace [rdbms.SQL_Transform.*]’;” to limit the trace to just the information about optimizer transformations.) This – cosmetically altered – is the final “unparsed” query:

select 
        vw_dcl_a18161ff.val1 val1,
        vw_dcl_a18161ff.val2 val2 
from    ( 
                (select 0 val1 from sys.dual dual) 
                union all  
                (select 1 1 from sys.dual dual) 
                union all  
                (select 2 2 from sys.dual dual)
        ) bounce1, 
        (
        select
                 /*+ no_index_ffs (t1) index (t1) */ 
                t1.val1 val1_0,
                t1.val2 val2_1 
        from
                test_user.t1 t1
        where 
                rownum = 1
        ) vw_dcl_a18161ff 
where 
        vw_dcl_a18161ff.val1 > bounce1.val1

As you can see, the lateral view has turned into a non-mergeable inline view which selects the first row available from t1 by following the supplied hints, and joins that single row result set to bounce1. I have a suspicion that lateral views which include rownum predicates should not be decorrelated. I have looked on MoS to see if I can find any bugs related to decorrelating lateral views, but either there are none or my search terms weren’t good enough.

 

May 28, 2018

Filtering LOBs

Filed under: CBO,Execution plans,LOBs,Oracle,subqueries — Jonathan Lewis @ 8:25 am BST May 28,2018

A two-part question about the FILTER operation appeared on the Oracle-L list server a couple of days ago. The first part was a fairly common question – one that’s often prompted by the way the optimizer used to behave in older versions of Oracle. Paraphrased, it was: “Why is the total cost of the query so high compared to the sum of its parts?”

Here’s the query, and the execution plan.

 INSERT INTO TEMP
  SELECT DISTINCT 'id',
    PHT.emplid
  FROM PHOTO PHT
  WHERE 1               =1
  AND PHT.PHOTO IS NOT NULL
  AND NOT EXISTS
    (SELECT 'x'
    FROM TEMP TMP
    WHERE PHT.EMPLID=TMP.EMPLID_SRCH
    AND TMP.OPRID  = 'id'
    )
  ;  

  
-------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |          | 21210 |  3334K|  5802K  (2)| 00:03:47 |
|   1 |  LOAD TABLE CONVENTIONAL | TEMP     |       |       |            |          |
|*  2 |   FILTER                 |          |       |       |            |          |
|*  3 |    TABLE ACCESS FULL     | PHOTO    | 21211 |  3334K|   313   (1)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL     | TEMP     |     1 |    17 |   380   (2)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( NOT EXISTS (SELECT 0 FROM " TEMP" "TMP" WHERE
              "TMP"."EMPLID_SRCH"=:B1 AND "TMP"."OPRID"='id'))
   3 - filter("PHT"."PHOTO" IS NOT NULL)
   4 - filter("TMP"."EMPLID_SRCH"=:B1 AND "TMP"."OPRID"='id')

Note that the “not exists” subquery against temp runs as a filter subquery with a cost of 380 for the tablescan. Combine that with the cost of 313 for the driving tablescan of photo and you might wonder why the resulting cost isn’t something like 693 – and in some old versions of Oracle that’s probably how it would be reported.

Historically the optimizer has been very bad about producing a final cost when queries have included subqueries – whether as filter subqueries in the predicate section or as scalar subqueries in the select list. Sometimes the cost would simply vanish from the final cost, sometimes it would be added just once to the final cost regardless of how many times the subquery might actually execute.

In this example the subquery against temp is a correlated subquery and might have to run once for every row in photo where the column photo was not null. At best it would have to run at least once for every distinct value of the photo.emplid column (the correlation column) found in those rows. In recent versions of Oracle the optimizer has tried to introduce some estimate of how many times the subquery would run as part of its calculation of the total cost. So (to a crude approximation) 5802K = 313 + N * 380. Unfortunately if we try to work backwards to N we find it would be about 15,267 which is about 72% of the 21,200 rows estimated as the result of the tablescan of photo – I haven’t tried to investigate the algorithms yet but presumably the optimizer makes some allowances somewhere for “self caching” as the subquery runs.

The more interesting part of the question came when the OP decided to test the effect of getting rid of the subquery. Check the costs in the resulting plan:


  INSERT INTO TEMP
  SELECT DISTINCT 'id',
    PHT.emplid
  FROM PHOTO PHT
  WHERE 1               =1
  AND PHT.UC_PBI_PHOTO IS NOT NULL;

  
-------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |          | 21211 |  3334K|  3659   (1)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL | TEMP     |       |       |            |          |
|*  2 |   TABLE ACCESS FULL      | PHOTO    | 21211 |  3334K|  3659   (1)| 00:00:01 |
-------------------------------------------------------------------------------------
  
Predicate Information (identified by operation id):
---------------------------------------------------
     2 - filter("PHT"."PHOTO" IS NOT NULL)

Note how the cost of the tablescan of photo has gone up from 313 in the previous query to 3,659 in the simpler query! How can a tablescan that drives a subquery have a lower cost than the tablescan on its own? Bear in mind that in both cases the Cost attributed to the operation “Table Access Full” is purely about scanning the rows in the photo table and is (or should be) entirely disconnected from the cost and frequency of the subquery.

The clue is in the table definition. The column photo.photo is a BLOB.

Models

I think there are potentially two errors in the optimizer displayed by this example. The first is that it’s adding in a cost that it shouldn’t even be considering; the second is that it’s being inconsistent in the way that it’s deriving that cost.

To demonstrate what I think is happening, I built a variant of the OP’s example as follows:


rem
rem     Script:         optimizer_lob_costs.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem

create table photo (
        emplid          varchar2(11) not null,
        photo           clob,
        other_col       varchar2(1000)
)
lob (photo) 
store as
        photo_lob(
        disable storage in row 
        cache
        logging
)
;

create unique index ph_uk on photo(emplid);

insert /*+ append */ into photo
select
        lpad(2 * rownum,10,0),
        rpad('x',1000),
        rpad('x',1000)
from
        all_objects
where
        rownum <= 10000 -- > comment to avoid wordpress format issue
;

commit;

create table temp(
        oprid           varchar2(30),
        emplid_srch     varchar2(11)
)
;

insert /*+ append */ into temp
select
        'id',
        lpad(2 * rownum,10,0)
from
        all_objects
where
        rownum <= 1000 -- > comment to avoid wordpress format issue
;

commit;

execute dbms_stats.gather_table_stats(user,'photo',method_opt=>'for all columns size 1', cascade=>true)
execute dbms_stats.gather_table_stats(user,'temp', method_opt=>'for all columns size 1', cascade=>true)


I’ve changed the BLOB to a CLOB defined with storage in row disabled, and I’ve introduced a varchar2() column of the same size as the CLOB column. I’ve declared the correlating column not null and created a unique index on it. Here are the two queries I want to review – slightly simplified versions of the original:


explain plan for
insert into temp(emplid_srch)
select 
        distinct pht.emplid
from 
        photo pht
where 
        1 = 1
and  pht.photo is not null
-- and     pht.other_col is not null
and     not exists (
                select /*+ no_unnest */
                        null
                from 
                        temp tmp
                where 
                        pht.emplid=tmp.emplid_srch
        )
;  

select * from table(dbms_xplan.display);

explain plan for
insert into temp(emplid_srch)
select
        distinct pht.emplid
from    photo pht
where   1               =1
and  pht.photo is not null
-- and     pht.other_col is not nulL
;  

select * from table(dbms_xplan.display);

As you can see I’ve had to include a /*+ no_unnest */ hint in my SQL to get the FILTER operation to appear in the plan (the OP had the hidden parameter “_unnest_subquery” set to false); I’ve also allowed for two variants of each query, one referencing the CLOB column the other referencing the varchar2() column. The only results I’ll show are for the queries accessing the CLOB, and here are the plans first with, then without, the subquery. Check the cost of the tablescan of the photo table in the two cases:


----------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |       |  9999 |   956K| 10458   (3)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL | TEMP  |       |       |            |          |
|*  2 |   FILTER                 |       |       |       |            |          |
|*  3 |    TABLE ACCESS FULL     | PHOTO | 10000 |   957K|   216   (1)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL     | TEMP  |     1 |    11 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( NOT EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "TEMP" "TMP"
              WHERE "TMP"."EMPLID_SRCH"=:B1))
   3 - filter("PHT"."PHOTO" IS NOT NULL)
   4 - filter("TMP"."EMPLID_SRCH"=:B1)


----------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |       | 10000 |   957K|   285   (2)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL | TEMP  |       |       |            |          |
|*  2 |   TABLE ACCESS FULL      | PHOTO | 10000 |   957K|   285   (2)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("PHT"."PHOTO" IS NOT NULL)

With the subquery in place the tablescan of photo reports a cost of 285, in the absence of the subquery it reports a cost of 216, a difference of 69. Repeating the test but using the varchar2() column the cost of the tablescan was 213 in both cases – suggesting that the variation was due to the column being a LOB.

With no further clues in the plan it looked like one of those rare occasions when I have to look at the 10053 (optimizer) trace file – and this is what I got from the 12.1.0.2 trace, looking at the section headed “SINGLE TABLE ACCESS PATH” for the photo table. First the base query without the subquery:


SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for PHOTO[PHT]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Column (#2): PHOTO(LOB)
    AvgLen: 87 NDV: 0 Nulls: 0 Density: 0.000000
  Table: PHOTO  Alias: PHT
    Card: Original: 10000.000000  Rounded: 10000  Computed: 10000.000000  Non Adjusted: 10000.000000
  Scan IO  Cost (Disk) =   210.000000
  Scan CPU Cost (Disk) =   13571440.480000
  Total Scan IO  Cost  =   210.000000 (scan (Disk))
                         + 70.000000 (io filter eval) (= 0.007000 (per row) * 10000.000000 (#rows))
                       =   280.000000
  Total Scan CPU  Cost =   13571440.480000 (scan (Disk))
                         + 9138463.200000 (cpu filter eval) (= 913.846320 (per row) * 10000.000000 (#rows))
                       =   22709903.680000

Note the “Total Scan IO Cost” described at line 13 includes a component at line 12 labelled “(io filter eval)” – why, for the predicate “photo is null”, would we do any special I/O when that predicate can be met in the basic table scan.

(Note: A predicate like “lob_column is null” means there is no lob locator in place, so no lob access need be done for that test. In fact the related, but very different, predicate “length(lob_column) = 0” meaning the lob locator exists but the lob is “empty” could also be satisfied during the tablescan without reference to the physical lob segment(s) because the length of the lob is included in the lob locator.)

Let’s assume that the optimizer is incorrectly assuming the run-time engine will have to access the lob in some way to determine that the lob is null. The worst case scenario is that Oracle will start by accessing the LOBindex – so why don’t we check how big the LOBindex is. The first step I took was to check the object_id of the LOBindex and then do a tree dump (which showed 66 leaf blocks) and then I checked the segment header block and dumped that with the following results:


  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 1      #blocks: 127
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x01400447  ext#: 0      blk#: 70     ext size: 127
  #blocks in seg. hdr's freelists: 4
  #blocks below: 70
  mapblk  0x00000000  offset: 0
                   Unlocked
     Map Header:: next  0x00000000  #extents: 1    obj#: 194295 flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x01400401  length: 127

See the “Highwater::” information at line 6 – the allocated space in the segment is the first 70 blocks of the first extent. That’s (almost certainly) where the incremental cost of 70 (single block I/Os) comes from.  (And I did couple of big updates to the LOB, designed to expand the LOBindex without changing the segment size of the underlying table, to corroborate that hypothesis.)

This brings us to the question of why the cost of the tablescan drops when the subquery is included. Again we generate the 10053 trace and examine the details under the “SINGLE TABLE ACCESS PATH”:


SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for PHOTO[PHT]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
  Table: PHOTO  Alias: PHT
    Card: Original: 10000.000000  Rounded: 10000  Computed: 10000.000000  Non Adjusted: 10000.000000
  Scan IO  Cost (Disk) =   210.000000
  Scan CPU Cost (Disk) =   13571440.480000
  Total Scan IO  Cost  =   210.000000 (scan (Disk))
                         + 3.500000 (io filter eval) (= 0.000350 (per row) * 10000.000000 (#rows))
                       =   213.500000
  Total Scan CPU  Cost =   13571440.480000 (scan (Disk))
                         + 656923.160000 (cpu filter eval) (= 65.692316 (per row) * 10000.000000 (#rows))
                       =   14228363.640000


In this case the “(io filter eval)” at line 10 is only 3.5 – and if you know your optimizer and how it handles subqueries you’re allowed to guess that could be one of Oracle’s standard guesses of 5% coming into play. (Again, growing the index seemed to corroborate this hypothesis.)

So here’s (possible) bug number 2: the first bug is adding a cost for accessing the LOBindex when there should be no need to  access the index at all – the execution plan says we will get 10,000 rows from the table, the filter predicate does report a cardinality reduced by just 1 on a column that has been declared with a uniqueness constraint, but a fairly commonly used “guess factor” of 5% is used as an indicator of the number of times the lob predicate will be tested. The various bits of the arithmetic are not consistent with each other.

Summary notes:

If you have a tablescan with a predicate that references a lob column then the cost of the tablescan includes the cost of the lob access – and there are cases where lob access is not needed but still gets costed {this is bug number 1 – the predicates are column is/is not null, and length(column) = / != 0)}.

If the lob data itself does not need to be accessed then the size of the lob index – which you can’t easily find – may have a significant impact on the cost of the tablescan.

If the query also includes predicates that result in the optimizer guessing about cardinality effects (1%, 5%, 0.25% are common guesses) then that guess may be used to scale the assumed (and potentially irrelevant) cost of the lob access. (There is scope for further experimentation in this area to examine the effects of “non-guess” predicates and the assumed order of application of predicates, e.g. are lob predicates costed as the last to be applied, does the algorithm for costing matched the execution order.)

As often happens it’s easy to see that there are oddities in the arithmetic that affect the cost of a query in ways that might make the optimizer pick a silly execution plan. Unfortunately it’s not easy to predict when you’re likely to see the effects of these oddities; the best we can do is remember that there is an anomaly with costing lob-based predicates and hope that we think of it when we see the optimizer picking a bad plan for reasons that initially are not obvious.

May 18, 2018

Bitmap Join Indexes

Filed under: bitmaps,CBO,Execution plans,Indexing,Oracle,Statistics — Jonathan Lewis @ 2:29 pm BST May 18,2018

I’ve been prompted by a recent question on the ODC database forum to revisit a note I wrote nearly five years ago about bitmap join indexes and their failure to help with join cardinalities. At the time I made a couple of unsupported claims and suggestions without supplying any justification or proof. Today’s article finally fills that gap.

The problem is this – I have a column which exhibits an extreme skew in its data distribution, but it’s in a “fact” table where most columns are meaningless ids and I have to join to a dimension table on its primary key to translate an id into a name. While there is a histogram on the column in the fact table the information in the histogram ceases to help if I do the join to the dimension and query by name, and the presence of a bitmap join index doesn’t make any difference. Let’s see this in action – some of the code follows a different pattern and format from my usual style because I started by copying and editing the example supplied in the database forum:


rem
rem     Script:         bitmap_join_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem
rem     Last tested 
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem
rem     Notes:
rem     Bitmap join indexes generate virtual columns on the fact table
rem     but you can't get stats on those columns - which means if the
rem     data is skewed you can have a histogram on the raw column but
rem     you don't have a histogram on the bitmap virtual column.
rem

drop table t1;
drop table dim_table;

create table dim_table (type_code number, object_type varchar2(10));

insert into dim_table values (1,'TABLE');
insert into dim_table values (2,'INDEX');
insert into dim_table values (3,'VIEW');
insert into dim_table values (4,'SYNONYM');
insert into dim_table values (5,'OTHER');

alter table dim_table add constraint dim_table_pk primary key (type_code) using index;

exec dbms_stats.gather_table_stats(user,'dim_table',cascade=>true);

create table t1 
nologging
as 
select 
        object_id, object_name, 
        decode(object_type, 'TABLE',1,'INDEX',2,'VIEW',3,'SYNONYM',4,5) type_code 
from 
        all_objects
where
        rownum <= 50000 -- > comment to bypass wordpress format issue
;

insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;


create  bitmap index t1_b1 on t1(dt.object_type)
from    t1, dim_table dt
where   t1.type_code = dt.type_code
;

exec dbms_stats.gather_table_stats(null, 't1', cascade=>true, method_opt=>'for all columns size 254');


select
        dt.object_type, count(*)
from
        t1, dim_table  dt
where
        t1.type_code   = dt.type_code
group by
        dt.object_type
order by
        dt.object_type
;

I’ve started with a dimension table that lists 5 type codes and has a primary key on that type code; then I’ve used all_objects to generate a table of 400,000 rows using those type codes, and I’ve created a bitmap join index on the fact (t1) table based on the dimension (dim_table) table column. By choice the distribution of the five codes is massively skewed so after gathering stats (including histograms on all columns) for the table I’ve produced a simple aggregate report of the data showing how many rows there are of each type – by name. Here are the results – with the execution plan from 12.1.0.2 showing the benefit of the “group by placement” transformation:


OBJECT_TYP   COUNT(*)
---------- ----------
INDEX           12960
OTHER          150376
SYNONYM        177368
TABLE           12592
VIEW            46704

5 rows selected.

-------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost  |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |       |       |   735 |
|   1 |  SORT GROUP BY        |           |     5 |   125 |   735 |
|*  2 |   HASH JOIN           |           |     5 |   125 |   720 |
|   3 |    VIEW               | VW_GBF_7  |     5 |    80 |   717 |
|   4 |     HASH GROUP BY     |           |     5 |    15 |   717 |
|   5 |      TABLE ACCESS FULL| T1        |   400K|  1171K|   315 |
|   6 |    TABLE ACCESS FULL  | DIM_TABLE |     5 |    45 |     2 |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ITEM_1"="DT"."TYPE_CODE")

Having established the basic result we can now examine some execution plans to see how well the optimizer is estimating cardinality for queries relating to that skewed distribution. I’m going to generate the execution plans for a simple select of all the rows of type ‘TABLE’ – first by code, then by name, showing the execution plan of each query:


explain plan for
select  t1.object_id
from
        t1
where
        t1.type_code = 1
;

select * from table(dbms_xplan.display(null,null,'outline'));


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 12592 |    98K|   281   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 12592 |    98K|   281   (8)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T1"."TYPE_CODE"=1)

Thanks to the histogram I generated on the type_code table the optimizer’s estimate of the number of rows is very accurate. So how well does the optimizer handle the join statistics:


prompt  =============
prompt  Unhinted join
prompt  =============

explain plan for
select  t1.object_id
from
        t1, dim_table  dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE'
;

select * from table(dbms_xplan.display(null,null,'outline'));

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 80000 |  1328K|   287  (10)| 00:00:01 |
|*  1 |  HASH JOIN         |           | 80000 |  1328K|   287  (10)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| DIM_TABLE |     1 |     9 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1        |   400K|  3125K|   277   (7)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")
   2 - filter("DT"."OBJECT_TYPE"='TABLE')

Taking the default execution path the optimizer’s estimate of rows identified by type name is 80,000 – which is one fifth of the total number of rows. Oracle knows that the type_code is skewed in t1, but at compile time doesn’t have any idea which type_code corresponds to type ‘TABLE’, so it’s basically using the number of distinct values to dictate the estimate.

We could try hinting the query to make sure it uses the bitmap join index – just in case this somehow helps the optimizer (and we’ll see in a moment why we might have this hope, and why it is forlorn):


prompt  ===================
prompt  Hinted index access
prompt  ===================

explain plan for
select 
        /*+ index(t1 t1_b1) */
        t1.object_id
from
        t1, dim_table dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE'
;

select * from table(dbms_xplan.display(null,null,'outline'));

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       | 80000 |   625K|   687   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    | 80000 |   625K|   687   (1)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS       |       |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE        | T1_B1 |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."SYS_NC00004$"='TABLE')

The plan tells us that the optimizer now realises that it doesn’t need to reference the dimension table at all – all the information it needs is in the t1 table and its bitmap join index – but it still comes up with an estimate of 80,000 for the number of rows. The predicate section tells us what to do next – it identifies a system-generated column, which is the virtual column underlying the bitmap join index: let’s see what the stats on that column look like:


select
        column_name, histogram, num_buckets, num_distinct, num_nulls, sample_size
from
        user_tab_cols
where
        table_name = 'T1'
order by
        column_id
;


COLUMN_NAME          HISTOGRAM       NUM_BUCKETS NUM_DISTINCT  NUM_NULLS SAMPLE_SIZE
-------------------- --------------- ----------- ------------ ---------- -----------
OBJECT_ID            HYBRID                  254        50388          0        5559
OBJECT_NAME          HYBRID                  254        29224          0        5560
TYPE_CODE            FREQUENCY                 5            5          0      400000
SYS_NC00004$         NONE

4 rows selected.

There are no stats on the virtual column – and Oracle won’t try to collect any, and even if you write some in (using dbms_stats.set_column_stats) it won’t use them for the query. The optimizer seems to be coded to use the number of distinct keys from the index in this case.

Workaround

It’s very disappointing that there seems to be no official way to work around this problem – but Oracle has their own (undocumented) solution to the problem that comes into play with OLAP – the hint /*+ precompute_subquery() */. It’s possible to tell the optimizer to execute certain types of subquery as the first stage of optimising a query, then changing the query to take advantage of the resulting data:


explain plan for
select
        /*+
                qb_name(main)
                precompute_subquery(@subq)
        */
        t1.object_id
from
        t1
where
        t1.type_code in (
                select
                        /*+
                                qb_name(subq)
                        */
                        dt.type_code
                from    dim_table dt
                where   dt.object_type = 'TABLE'
        )
;

select * from table(dbms_xplan.display(null,null,'outline'));

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 12592 |    98K|   281   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 12592 |    98K|   281   (8)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T1"."TYPE_CODE"=1)

Oracle hasn’t optimized the query I wrote, instead it has executed the subquery, derived a (very short, in this case) list of values, then optimized and executed the query I first wrote using the constant(s) returned by the subquery. And you can’t see the original subquery in the execution plan. Of course, with the literal values in place, the cardinality estimate is now correct.

It’s such a pity that this hint is undocumented, and one that you shouldn’t use in production.

 

May 11, 2018

Skip Scan 3

Filed under: CBO,Index skip scan,Indexing,Oracle — Jonathan Lewis @ 2:26 pm BST May 11,2018

If you’ve come across any references to the “index skip scan” operation for execution plans you’ve probably got some idea that this can appear when the number of distinct values for the first column (or columns – since you can skip multiple columns) is small. If so, what do you make of this demonstration:


rem
rem     Script:         skip_scan_cunning.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem

begin
        dbms_stats.set_system_stats('MBRC',16);
        dbms_stats.set_system_stats('MREADTIM',10);
        dbms_stats.set_system_stats('SREADTIM',5);
        dbms_stats.set_system_stats('CPUSPEED',1000);
end;
/

create table t1
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        rownum                          id1,
        rownum                          id2,
        lpad(rownum,10,'0')             v1,
        lpad('x',150,'x')               padding
/*
        cast(rownum as number(8,0))                     id,
        cast(lpad(rownum,10,'0') as varchar2(10))       v1,
        cast(lpad('x',100,'x') as varchar2(100))        padding
*/
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(id1, id2);

begin
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

For repeatability I’ve set some system statistics, but if you’ve left the system stats to default you should see the same effect. All I’ve done is create a table and an index on that table. The way I’ve defined the id1 and id2 columns means they could individually support unique constraints and the index clearly has 1 million distinct values for id1 in the million index entries. So what execution plan do you think I’m likely to get from the following simple query:


set serveroutput off
alter session set statistics_level = all;

prompt  =======
prompt  Default
prompt  =======

select  id 
from    t1
where   id2 = 999
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

You’re probably not expecting an index skip scan to appear, but given the title of this posting you may have a suspicion that it will; so here’s the plan I got running this test on 12.2.0.1:


SQL_ID  8r5xghdx1m3hn, child number 0
-------------------------------------
select id from t1 where id2 = 999

Plan hash value: 400488565

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |      1 |        |  2929 (100)|      1 |00:00:00.17 |    2932 |      5 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      1 |      1 |  2929   (1)|      1 |00:00:00.17 |    2932 |      5 |
|*  2 |   INDEX SKIP SCAN                   | T1_I1 |      1 |      1 |  2928   (1)|      1 |00:00:00.17 |    2931 |      4 |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID2"=999)
       filter("ID2"=999)


So, an index skip scan doesn’t require a small number of distinct values for the first column of the index (unless you’re running a version older than 11.2.0.2 where a code change appeared that could be disabled by setting fix_control 9195582 off).

When the optimizer doesn’t do what you expect it’s always worth hinting the code to follow the plan you were expecting – so here’s the effect of hinting a full tablescan (which happened to do direct path reads):

SQL_ID  bxqwhsjwqfm7q, child number 0
-------------------------------------
select  /*+ full(t1) */  id from t1 where id2 = 999

Plan hash value: 3617692013

----------------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |  3317 (100)|      1 |00:00:00.12 |   25652 |  25635 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |      1 |  3317   (3)|      1 |00:00:00.12 |   25652 |  25635 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID2"=999)

Note that the cost is actually more expensive than the cost of the indexed access path.  For reference you need to know that the blocks statistic for the table was 25,842 while the number of index leaf blocks was 2,922. The latter figure (combined with a couple of other details regarding the clustering_factor and undeclared uniqueness of the index) explains why the cost of the skip scan was only 2,928: the change that appeared in 11.2.0.2 limited the I/O cost of an index skip scan to the total number of leaf blocks in the index.  The tablescan cost (with my system stats) was basically dividing my table block count by 16 (to get the number of multi-block reads) and then doubling (because the multiblock read time is twice the single block read time).

As a quick demo of how older versions of Oracle would behave after setting “_fix_control”=’9195582:OFF’:


SQL_ID	bn0p9072w9vfc, child number 1
-------------------------------------
select	/*+ index_ss(t1) */  id from t1 where id2 = 999

Plan hash value: 400488565

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |	A-Time	 | Buffers |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |	    |	   1 |	      |  1001K(100)|	  1 |00:00:00.13 |    2932 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |	   1 |	    1 |  1001K	(1)|	  1 |00:00:00.13 |    2932 |
|*  2 |   INDEX SKIP SCAN		    | T1_I1 |	   1 |	    1 |  1001K	(1)|	  1 |00:00:00.13 |    2931 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID2"=999)
       filter("ID2"=999)

The cost of the skip scan is now a little over 1,000,000 – corresponding (approximately) to the 1 million index probes that will have to take place. You’ll notice that the number of buffer visits recorded is 2931 for the index operation, though: this is the result of the run-time optimisation that keeps buffers pinned very aggressively for skip scan – you might expect to see a huge number of visits recorded as “buffer is pinned count”, but for some reason that doesn’t happen. The cost is essentially Oracle calculating (with pinned root and branch) the cost of “id1 = {constant} and id2 = 999” and multiplying by ndv(id1).

Footnote:

Ideally, of course, the optimizer ought to work out that an index fast full scan followed by a table access ought to have a lower cost (using multi-block reads rather than walking the index in leaf block order one block at a time (which is what this particular skip scan will have to do) – but that’s not (yet) an acceptable execution plan though it does now appear a plan for deleting data.

tl;dr

If you have an index that is very much smaller than the table you may find examples where the optimizer does what appears to be an insanely stupid index skip scan when you were expecting a tablescan or, possibly, some other less efficient index to be used. There is a rationale for this, but such a plan may be much more CPU and read intensive than it really ought to be.

 

May 4, 2018

FBI Limitation

Filed under: CBO,distributed,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 9:19 am BST May 4,2018

A recent question on the ODC (OTN) database forum prompted me to point out that the optimizer doesn’t consider function-based indexes on remote tables in distributed joins. I then spent 20 minutes trying to find the blog note where I had demonstrated this effect, or an entry in the manuals reporting the limitation – but I couldn’t find anything, so I’ve written a quick demo which I’ve run on 12.2.0.1 to show the effect. First, the SQL to create a couple of tables and a couple of indexes:


rem
rem     Script:         fbi_limitation.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem

-- create public database link orcl@loopback using 'orcl'; 
define m_target = orcl@loopback

create table t1
segment creation immediate
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        rownum                          n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

create table t2
nologging
as
select * from t1
;

alter table t1 add constraint t1_pk primary key(id);
alter table t2 add constraint t2_pk primary key(id);
create unique index t2_f1 on t2(id+1);

begin
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T1',
                cascade     => true,
                method_opt  => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T2',
                cascade     => true,
                method_opt  => 'for all columns size 1'
        );
end;
/


The code is very simple, it creates a couple of identical tables with an id column that will produce an index with a very good clustering_factor. You’ll notice that I’ve (previously) created a public database link that is (in my case) a loopback to the current database and the code defines a variable that I can use as a substitution variable later on. If you want to do further tests with this model you’ll need to make some changes in these two lines.

So now I’m going to execute a query that should result in the optimizer choosing a nested loop between the tables – but I have two versions of the query, one which treats t2 as the local table it really is, and one that pretends (through the loopback) that t2 is remote.


set serveroutput off

select
        t1.v1, t2.v1
from
        t1,
        t2
--      t2@orcl@loopback
where
        t2.id+1 = t1.id
and     t1.n1 between 101 and 110
;


select * from table(dbms_xplan.display_cursor);

select
        t1.v1, t2.v1
from
        t1,
--      t2
        t2@orcl@loopback
where
        t2.id+1 = t1.id
and     t1.n1 between 101 and 110
;

select * from table(dbms_xplan.display_cursor);

Here are the two execution plans, pulled from memory – including the “remote” section in the distributed case:


SQL_ID  fthq1tqthq8js, child number 0
-------------------------------------
select  t1.v1, t2.v1 from  t1,  t2 -- t2@orcl@loopback where  t2.id+1 =
t1.id and t1.n1 between 101 and 110

Plan hash value: 1798294492

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |       |       |  2347 (100)|          |
|   1 |  NESTED LOOPS                |       |    11 |   407 |  2347   (3)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL          | T1    |    11 |   231 |  2325   (4)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T2    |     1 |    16 |     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | T2_F1 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("T1"."N1"<=110 AND "T1"."N1">=101))
   4 - access("T2"."SYS_NC00005$"="T1"."ID")

Note
-----
   - this is an adaptive plan




SQL_ID  ftnmywddff1bb, child number 0
-------------------------------------
select  t1.v1, t2.v1 from  t1, -- t2  t2@orcl@loopback where  t2.id+1 =
t1.id and t1.n1 between 101 and 110

Plan hash value: 1770389500

-------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |  4663 (100)|          |        |      |
|*  1 |  HASH JOIN         |      |    11 |   616 |  4663   (4)| 00:00:01 |        |      |
|*  2 |   TABLE ACCESS FULL| T1   |    11 |   231 |  2325   (4)| 00:00:01 |        |      |
|   3 |   REMOTE           | T2   |  1000K|    33M|  2319   (3)| 00:00:01 | ORCL@~ | R->S |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."ID"="T2"."ID"+1)
   2 - filter(("T1"."N1"<=110 AND "T1"."N1">=101))

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - SELECT "ID","V1" FROM "T2" "T2" (accessing 'ORCL@LOOPBACK' )

Both plans show that the optimizer has estimated the number of rows that would be retrieved from t1 correctly (very nearly); but while the fully local query does a nested loop join using the high-precision, very efficient function-based index (reporting the internal supporting column referenced in the predicate section) the distributed query seems to have no idea about the remote function-based index and select all the required rows from the remote table and does a hash join.

Footnote:

Another reason for changes in execution plan when you test fully local and then run distributed is due to the optimizer ignoring remote histograms, as demonstrated in a much older blog note (though still true in 12.2.0.1).

Addendum

After finishing this note, I discovered that I had written a similar note about reverse key indexes nearly five years ago. Arguably a reverse key is just a special case of a function-based index – except it’s not labelled as such in user_tab_cols, and doesn’t depend on a system-generated hidden column.

 

March 13, 2018

Deferred Invalidation

Filed under: 12c,CBO,Infrastructure,Oracle,Troubleshooting,Upgrades — Jonathan Lewis @ 6:30 pm BST Mar 13,2018

I was going to write an article on the way 12.2 has introduced the option for “deferred invalidation” for a number of DDL operations, but I did a quick google search before I started writing and found that both Franck Pachot and Richard Foote (yes, rebuild index is one of the operations) had got there long ago so here are a couple of links – as much for my own benefit as anything else:

Richard Foote:

Franck Pachot:

Franck’s 2nd example may be particularly relevant to some clients of mine who were having problems with SQL queries that were crashing (slowly and randomly) instead of running very efficiently because they were running queries against one subpartition of a table while another subpartition of the same table was subject to exchange. With a little bad luck in the timing an exchange that took place between a parse and an execute would cause a query to have its cursor invalidated and re-parsed in a way that failed to do (sub-)partition elimination the way it should have because the local indexes were in an indeterminate state.

 

March 8, 2018

Column Groups

Filed under: CBO,extended stats,Oracle,Statistics — Jonathan Lewis @ 6:54 am BST Mar 8,2018

There’s a question on the ODC database forum about column groups that throws up an interesting side point. The OP is looking at a query like the following and asking about which column groups might help the optimizer get the best plan:

select
        a.*, b.*, c.*
from
        a, b, c
where
        a.id   = b.id
and     a.id1  = b.id1
and     a.id   = c.id
and     b.id2  = c.id2
and     a.id4  = 66
and     b.id7  = 44
and     c.id88 = 88
;

I’m going to start by being a bit boring about presentation and meaning (although this query has fairly obviously being engineered to conceal any meaningful column and table names) and do a cosmetic edit of the query because if I had a from clause reading “a, b, c” it would be because I thought the optimizer should identify that as the best join order, in which case I would also have written the predicate section to display the order in which the predicates would be used:

select
        a.*, b.*, c.*
from
        a, b, c
where
        a.id4  = 66
--
and     b.id   = a.id
and     b.id1  = a.id1
and     b.id7  = 44
--
and     c.id   = a.id
and     c.id2  = b.id2
and     c.id88 = 88
;

Having (to my mind) cosmetically enhanced the query, I’ll now ask the question: “Would it make sense to create column groups on a(id, id1), b(id, id1) and c(id, id2) ?”

I’ve written various articles on cases where column groups have effects (or not): “out of range” predicates, “is null” predicates, “histograms issues”, “statistics at both ends of the join”, and “multi-column indexes vs. column groups” are just some of the key areas. Assuming there are no reasons to stop a particular column group from working we can look at the join from table A to table B: it’s a two-column join so if there’s some strong correlation between the id and id1 columns of these two tables then creating the two column groups (one at each end of the join) can make a difference to the optimizer’s calculations with the most likely effect that the cardinality estimate on the join will go up and, as a side effect the join order and join method may change.

If we then consider the join to table C – we note that it involves two columns from table C being joined to one column from table A and one from table B so, while we could create a column group on those two columns at the table C end of the join, a column group is simply not possible at the A/B of the join. This means that one end of the join may have a selectivity that is hugely increased (far fewer combinations) because the column group has quantified the correlation, but the selectivity at the other end is simply based on the two separate selectivities from a.id and b.id2, and that’s likely to be smaller than the selectivity of (c.id, c.id2), and the optimizer will choose the smaller join selectivity hence producing a lower cardinality estimate.

This is where a collateral point appears – and it’s a point which also justifies the careful rearrangement of the SQL text – there is an opportunity for transitive closure that the human eye can see but the optimizer is not coded to manipulate. We have two predicates: “b.id = a.id” and “c.id = a.id” but they can only both be true when “c.id = b.id”, so let’s replace “c.id = a.id” with “c.id = b.id” and the join predicate to table C becomes:

and     c.id   = b.id
and     c.id2  = b.id2

Both left hand sides reference table C, both right hand sides reference table B – so if we now create a column group on c(id, id2) and an additional column group on b(id, id2) then we may give Oracle some better information about this join as well. In fact, even if we create NO column groups at all this specific change may be enough to result in a change in the selectivity calculations with a subsequent change in cardinality estimate and execution plan.

 

February 20, 2018

Assumptions

Filed under: CBO,Oracle,Philosophy — Jonathan Lewis @ 8:57 am BST Feb 20,2018

As the years roll on I’ve found it harder and harder to supply quick answers to “simple” questions on the Oracle-L list server and OTN/ODC forum because things are constantly changing and an answer that may have been right the last time I checked could now be wrong. A simple example of the consequences of change showed up recently on the OTN/ODC forum where one reply to a question started:

Just why do you need distinct in a subquery??? That’s the first thing that appears really shocking to me. If it’s a simple in (select …) adding a distinct to the subquery would just impose a sort unique(as you can see in the explain plan), which may be quite costly.

Three question-marks is already tip-toeing its way to the Pratchett limit – but “really shocking” ? It’s bad enough that the comment goes operatic, but going operatic in order to introduce an error pushes the thing into tragedy (or possibly comedy – or maybe both). To make the self-inflicted injury worse, there were two execution plans supplied in the original post anyway of which only one showed any attempt to achieve uniqueness.

Bottom line – when you’re about to correct someone for doing something that is “obviously” wrong, be a little bit kind about it and then be kind to yourself and do a quick sanity check that your attempt at correction is itself correct. A good guideline would be to ask yourself: “How do I know what I know – and am I about to make myself look like an idiot (again).”

Check It

Question: Does a  “distinct” in a subquery impose a sort (or hash) unique ?

Answer: No – a uniqueness operation may appear, but it’s not guaranteed to appear.

Here’s a quick example which does not result in any attempt at imposing uniqueness (running 11.2.0.4):

rem
rem     Script:         unnest_demo.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Feb 2018
rem

drop table t2 purge;
drop table t1 purge;
create table t1 as select * from all_objects where rownum  <= 100;
create table t2 as select * from all_objects where rownum <= 100;

create index t1_i1 on t1(owner);
create index t2_i2 on t2(object_type);


set autotrace traceonly explain

select  * 
from    t1 
where   owner = 'OUTLN' 
and     object_name in (
                select distinct object_name 
                from   t2 
                where  object_type = 'TABLE'
        )
;


Execution Plan
----------------------------------------------------------
Plan hash value: 3169044451

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     3 |   558 |     4   (0)| 00:00:01 |
|*  1 |  HASH JOIN SEMI              |       |     3 |   558 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |     3 |   474 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |     3 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2    |    12 |   336 |     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | T2_I2 |    12 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_NAME"="OBJECT_NAME")
   3 - access("OWNER"='OUTLN')
   5 - access("OBJECT_TYPE"='TABLE')

Note
-----
   - dynamic sampling used for this statement (level=2)


There’s no sign of a sort unique or hash unique. The optimizer has decided that the IN subquery can be transformed into an EXISTS subquery, which can then be transformed into a semi-join.

I can think of three other execution plan strategies that might have appeared depending on the data, indexing, and statistics – initially I had just hacked the text above to produce the plans and hadn’t saved anything as a library script, but following a request in the comments below I decided to recreate the whole text and report the hints I’d used. In all the following cases the hints I quote go in the subquery, not in the main body of the query:

a) Hinting /*+ no_unnest */ transforms the IN subquery to an EXISTS subquery then operate as a filter subquery (with no uniqueness imposed):


--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |   158 |     5   (0)| 00:00:01 |
|*  1 |  FILTER                      |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |     3 |   474 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |     3 |       |     1   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| T2    |     1 |    28 |     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | T2_I2 |    12 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

b) Hinting /*+ unnest no_merge no_semijoin */ gets a simple unnest with sort/hash unique and join


--------------------------------------------------------------------------------------------
| Id  | Operation                       | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |          |     3 |   525 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN                      |          |     3 |   525 |     5  (20)| 00:00:01 |
|   2 |   JOIN FILTER CREATE            | :BF0000  |     3 |   474 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID  | T1       |     3 |   474 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN            | T1_I1    |     3 |       |     1   (0)| 00:00:01 |
|   5 |   VIEW                          | VW_NSO_1 |    12 |   204 |     3  (34)| 00:00:01 |
|   6 |    HASH UNIQUE                  |          |    12 |   336 |     3  (34)| 00:00:01 |
|   7 |     JOIN FILTER USE             | :BF0000  |    12 |   336 |     2   (0)| 00:00:01 |
|   8 |      TABLE ACCESS BY INDEX ROWID| T2       |    12 |   336 |     2   (0)| 00:00:01 |
|*  9 |       INDEX RANGE SCAN          | T2_I2    |    12 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("OBJECT_NAME"="OBJECT_NAME")
   4 - access("OWNER"='OUTLN')
   9 - access("OBJECT_TYPE"='TABLE')


For this data set I actually had to take the optimizer_features_enable back to ‘8.1.7’ to get this plan – On recreating the tests I realised that there was a way to get this plan from basic hints (though the modern versions of Oracle can slip a Bloom filter into the hash join). As you can see that there’s a HASH UNIQUE at operation 6, but that would have been there whether or not the DISTINCT keyword had appeared in the SQL. Effectively the query has been transformed to:


select  t1.*
from    (
                select  distinct t2.object_name object_name
                from    t2
                where   t2.object_type='TABLE'
        )
        vw_nso_1,
        t1
where   t1.owner = 'OUTLN'
and     t1.object_name = vw_nso_1.object_name
/

c) Hinting /*+ unnest no_semijoin merge */ results in unnesting, then a “transform distinct aggregation” so that the distinct is applied after the join. (In the original text I had said this was using “place group by”. But that’s the transformation that pushes an aggregate inside a join while what’s happening here is one of the variants of the opposite transformation.)

--------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |     3 |   474 |     5  (20)| 00:00:01 |
|   1 |  VIEW                          | VM_NWVW_1 |     3 |   474 |     5  (20)| 00:00:01 |
|   2 |   HASH UNIQUE                  |           |     3 |   594 |     5  (20)| 00:00:01 |
|*  3 |    HASH JOIN                   |           |     3 |   594 |     4   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| T1        |     3 |   510 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | T1_I1     |     3 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T2        |    12 |   336 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | T2_I2     |    12 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."OBJECT_NAME"="T2"."OBJECT_NAME")
   5 - access("T1"."OWNER"='OUTLN')
   7 - access("T2"."OBJECT_TYPE"='TABLE')

Note
-----
   - dynamic sampling used for this statement (level=2)

Again, the plan would be the same whether or not the original subquery had a redundant distinct.

The things you think you know may have been true 10 years ago – but maybe they’re not true any longer, or maybe they’re still true on your version of the database but not every  version of the database. So I often end up looking at a question, thinking the poster’s claim can’t be right, and then working out and modelling the circumstances that might make the poster’s observations appear (and learning something new).

Remember: “I’ve never seen it before” doesn’t mean “It doesn’t happen”.

Update (1st March 2018)

In a remarkably timely coincidence – showing that there’s always more to see, no matter how carefully you think you’ve been looking – Nenad Noveljic shows us that sometimes it’s actually a positively good thing to have a “redundant” distinct, because it bypasses an optimizer bug.

 

Next Page »

Powered by WordPress.com.