Oracle Scratchpad

December 8, 2014

Cardinality Change

Filed under: CBO,Oracle — Jonathan Lewis @ 9:35 pm GMT Dec 8,2014

Here’s an entertaining little change across versions of Oracle, brought to my attention by Tony Hasler during UKOUG Tech 14. It’s a join cardinality estimate, so here are a couple of tables to demonstrate the issue – the only columns needed are the alpha_06 columns, but I reused some code from other demonstrations to create my test case, so there are lots of irrelevant columns in the create table script:


create table t1 nologging as
with generator as (
        select rownum id
        from dual
        connect by rownum <= 1000
)
select
        rownum                                          id,
        mod(rownum-1,200)                               mod_200,
        trunc(dbms_random.value(0,300))                 rand_300,
        mod(rownum-1,10000)                             mod_10000,
        trunc(sysdate) +
                trunc(dbms_random.value(0,1000))        date_1000,
        dbms_random.string('l',6)                       alpha_06,
        dbms_random.string('l',20)                      alpha_20
from
        generator,
        generator
where
        rownum <= 1e6
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1')

create table t2 nologging as select * from t1;
execute dbms_stats.gather_table_stats(user,'t2',method_opt=>'for all columns size 1')

I’m going to join t1 to t2 with a predicate based on the alpha_06 columns – using a LIKE predicate. Before I do so I’ll point out that there are are 1,000,000 rows in the table, and (checking the column stats) 985,920 distinct values for alpha_06. Here’s my query, with the execution plan I got from 11.1.0.7:


select
        count(*)
from
        t1, t2
where
        t2.alpha_06 like t1.alpha_06
;

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    14 |  1122M  (6)|999:59:59 |
|   1 |  SORT AGGREGATE     |      |     1 |    14 |            |          |
|   2 |   NESTED LOOPS      |      |    50G|   651G|  1122M  (6)|999:59:59 |
|   3 |    TABLE ACCESS FULL| T1   |  1000K|  6835K|  1123   (6)| 00:00:06 |
|*  4 |    TABLE ACCESS FULL| T2   | 50000 |   341K|  1122   (6)| 00:00:06 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("T2"."ALPHA_06" LIKE "T1"."ALPHA_06")

The 50,000 cardinality estimate for t2 looks like the standard 5% guess for “column >= {unknown value}”, following which the join cardinality of 50G is the same 5% guess applied to the Cartesian join between t1 and t2 (1M * 1M * 0.05). It’s not a good estimate in my case because the right answer happens to be close to 1M rows, specifically 1,003,176. So let’s upgrade to 11.2.0.4 and see what we get instead:

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    14 |  1050M  (6)|999:59:59 |
|   1 |  SORT AGGREGATE     |      |     1 |    14 |            |          |
|   2 |   NESTED LOOPS      |      |  2014K|    26M|  1050M  (6)|999:59:59 |
|   3 |    TABLE ACCESS FULL| T1   |  1000K|  6835K|  1051   (6)| 00:00:06 |
|*  4 |    TABLE ACCESS FULL| T2   |     2 |    14 |  1050   (6)| 00:00:06 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("T2"."ALPHA_06" LIKE "T1"."ALPHA_06")

The estimate has dropped from 50 Billion rows down to 2 Million – a factor of about 25,000: possibly an indicator that the algorithm has changed, and that a few people might find execution plans changing as they upgrade to a newer version of Oracle. The change occurred at 11.2.0.2 as revealed by fix control 9303766 which has the description: “use 1/NDV+1/NROWS for col1 LIKE col2 selectivities”.

Just as a quick check on the arithmetic: there are 1 million rows in table t2, with (as noted above) 985,920 distinct values in the column, so the selectivity should be: 1/1000000 + 1/985920 = 2.014281 * e-6. Multiply the selectivity by 1e6 and you get 2, the cardinality estimate for t2; multiply the selectivity by 1M*1M (the Cartesian join) and you get 2,014,281, the cardinality estimate of the join. QED.

There are workarounds, of course. One would be to reverse out the fix control, either as an initialisation parameter or in a session logon trigger, another might be to modify the SQL – I think the following would be equivalent:


select
        *
from    t1, t2
where
        t2.alpha_06 like substr(t1.alpha_06,1,length(t1.alpha_06))||'%'
and     t1.alpha_06 is not null
and     t2.alpha_06 is not null

This changes the critical predicate from the form “col1 like col2″ to “col1 like {unknown value from function}” i.e. back to a case where the optimizer uses the 5% guess, and the cardinality estimates go back the original values.

December 5, 2014

Closure

Filed under: CBO,Oracle,Upgrades — Jonathan Lewis @ 8:11 am GMT Dec 5,2014

It’s been a long time since I said anything interesting about transitive closure in Oracle, the mechanism by which Oracle can infer that if a = b and b = c then a = c but only (in Oracle’s case) if one of a, b, or c is a literal constant rather than a column. So with that quick reminder in place, here’s an example of optimizer mechanics to worry you. It’s not actually a demonstration of transitive closure coming into play, but I wanted to remind you of the logic to set the scene.

I have three identical tables, one million rows, no indexes. The SQL to create the first table is one I supplied a couple of days ago to demonstrate changes in join cardinality dependent on Oracle version:


create table t1
nologging
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	trunc(dbms_random.value(0,1000))	n_1000,
	trunc(dbms_random.value(0,750))		n_750,
	trunc(dbms_random.value(0,600))		n_600,
	trunc(dbms_random.value(0,400))		n_400,
	trunc(dbms_random.value(0,90))		n_90,
	trunc(dbms_random.value(0,72))		n_72,
	trunc(dbms_random.value(0,40))		n_40,
	trunc(dbms_random.value(0,3))		n_3
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

Here’s a simple SQL statement that joins the three tables:


select
	t1.*, t2.*, t3.*
from
	t1, t2, t3
where
	t2.n_90  = t1.n_90
and	t3.n_90  = t2.n_90
and	t3.n_600 = t2.n_600
and	t1.n_400 = 1
and	t2.n_400 = 2
and	t3.n_400 = 3
;

Given the various n_400 = {constant} predicates we should expect to see close to 2,500 rows from each table participating in the join – and that is exactly what Oracle predicts in the execution plan. The question is: what is the cardinality of the final join? Before showing you the execution plan and its prediction I’m going to bring transitivity into the picture.  Note the lines numbered 6 and 7.  If t2.n_90 = t1.n_90 and t3.n_90 = t2.n_90 then t3.n_90 = t1.n_90; so I might have written my query slightly differently – note the small change at line 7 below:


select
	t1.*, t2.*, t3.*
from
	t1, t2, t3
where
	t2.n_90  = t1.n_90
and	t3.n_90  = t1.n_90		-- changed
and	t3.n_600 = t2.n_600
and	t1.n_400 = 1
and	t2.n_400 = 2
and	t3.n_400 = 3
;

So here’s the exciting bit. My two queries are logically equivalent, and MUST return exactly the same row set. Check the final cardinality predictions in these two execution plans (from 12.1.0.2, but you get the same results in 11.2.0.4, older versions have other differences):


First Version - note the predicate for operation 3
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 70949 |  5820K|  1869  (10)| 00:00:01 |
|*  1 |  HASH JOIN          |      | 70949 |  5820K|  1869  (10)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL | T1   |  2500 | 70000 |   622  (10)| 00:00:01 |
|*  3 |   HASH JOIN         |      |  2554 |   139K|  1245  (10)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |  2500 | 70000 |   622  (10)| 00:00:01 |
|*  5 |    TABLE ACCESS FULL| T3   |  2500 | 70000 |   622  (10)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_90"="T1"."N_90")
   2 - filter("T1"."N_400"=1)
   3 - access("T3"."N_90"="T2"."N_90" AND "T3"."N_600"="T2"."N_600")
   4 - filter("T2"."N_400"=2)
   5 - filter("T3"."N_400"=3)

Second Version - note the predicate for operation 1
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |  3264 |   267K|  1868  (10)| 00:00:01 |
|*  1 |  HASH JOIN          |      |  3264 |   267K|  1868  (10)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL | T1   |  2500 | 70000 |   622  (10)| 00:00:01 |
|*  3 |   HASH JOIN         |      | 10575 |   578K|  1245  (10)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |  2500 | 70000 |   622  (10)| 00:00:01 |
|*  5 |    TABLE ACCESS FULL| T3   |  2500 | 70000 |   622  (10)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_90"="T1"."N_90" AND "T3"."N_90"="T1"."N_90")
   2 - filter("T1"."N_400"=1)
   3 - access("T3"."N_600"="T2"."N_600")
   4 - filter("T2"."N_400"=2)
   5 - filter("T3"."N_400"=3)

The a small change in the choice of presenting the predicates gives me a factor of 22 in the cardinality estimate – oops!

The actual result with my data was close to 3,000 rows – so one of the estimates in the second version was pretty good; but the point of the blog isn’t that you can “tune” the optimizer by carefully picking your way through transitive closure, the point is that a small “cosmetic” change you might make to a query could result in a significant change in the cardinality calculations which could then make a dramatic difference to the final execution plan. This example, by the way, depends on the same “multi-column sanity check” that showed up in the previous posting.

I will be expanding on this posting some time in the next couple of weeks but, again, the example should come up in my session on calculating selectivity at “Super Sunday” at UKOUG Tech 14.

 

 

December 3, 2014

Upgrades

Filed under: CBO,Oracle,Upgrades — Jonathan Lewis @ 8:24 am GMT Dec 3,2014

I have a simple script that creates two identical tables , collects stats (with no histograms) on the pair of them, then executes a join. Here’s the SQL to create the first table:


create table t1
nologging
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	trunc(dbms_random.value(0,1000))	n_1000,
	trunc(dbms_random.value(0,750))		n_750,
	trunc(dbms_random.value(0,600))		n_600,
	trunc(dbms_random.value(0,400))		n_400,
	trunc(dbms_random.value(0,90))		n_90,
	trunc(dbms_random.value(0,72))		n_72,
	trunc(dbms_random.value(0,40))		n_40,
	trunc(dbms_random.value(0,3))		n_3
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

-- gather stats: no histograms

The two tables have 1,000,000 rows each and t2 is created from t1 with a simple “create as select”. The columns are all defined to be integers, and the naming convention is simple – n_400 holds 400 distinct values with uniform distribution from 0 – 399, n_750 holds 750 values from 0 – 749, and so on.

Here’s the simple query:


select
        t1.*, t2.*
from
        t1, t2
where
        t1.n_400 = 0
and     t2.n_72  = t1.n_90
and     t2.n_750 = t1.n_600
and     t2.n_400 = 1
;

Since I’ve created no indexes you might expect the query to do a couple of and a hash join to get its result – and you’d be right; but what do you think the predicted cardinality would be ?

Here are the results from running explain plan on the query and then reporting the execution plan – for three different versions of Oracle:



9.2.0.8
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |    96 |  4992 |  1230  (10)|
|*  1 |  HASH JOIN           |             |    96 |  4992 |  1230  (10)|
|*  2 |   TABLE ACCESS FULL  | T1          |  2500 | 65000 |   617  (11)|
|*  3 |   TABLE ACCESS FULL  | T2          |  2500 | 65000 |   613  (10)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_750"="T1"."N_600" AND "T2"."N_72"="T1"."N_90")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

***************************************************************************

10.2.0.5
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   116 |  6032 |  1229  (10)| 00:00:07 |
|*  1 |  HASH JOIN         |      |   116 |  6032 |  1229  (10)| 00:00:07 |
|*  2 |   TABLE ACCESS FULL| T1   |  2500 | 65000 |   616  (11)| 00:00:04 |
|*  3 |   TABLE ACCESS FULL| T2   |  2500 | 65000 |   612  (10)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_750"="T1"."N_600" AND "T2"."N_72"="T1"."N_90")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

***************************************************************************

11.2.0.4
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  2554 |   139K|  1225  (10)| 00:00:07 |
|*  1 |  HASH JOIN         |      |  2554 |   139K|  1225  (10)| 00:00:07 |
|*  2 |   TABLE ACCESS FULL| T1   |  2500 | 70000 |   612  (10)| 00:00:04 |
|*  3 |   TABLE ACCESS FULL| T2   |  2500 | 70000 |   612  (10)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_72"="T1"."N_90" AND "T2"."N_750"="T1"."N_600")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

The change for 11.2.0.4 (which is still there for 12.1.0.2. I didn’t check to see if it also appears in 11.1.0.7) is particularly worrying. When you see a simple query like this changing cardinality on the upgrade you can be fairly confident that some of your more complex queries will change their plans – even if there are no clever new optimizer transformations coming into play.

I’ll write up an explanation of how the optimizer has produced three different estimates some time over the next couple of weeks; but if you want an earlier answer this is one of the things I’ll be covering in my presentation on calculating selectivity at “Super Sunday” at UKOUG Tech 14.

November 26, 2014

Lunchtime quiz

Filed under: CBO,dbms_xplan,Oracle — Jonathan Lewis @ 12:41 pm GMT Nov 26,2014

There was a question on OTN a few days ago asking the following question:

Here’s a query that ran okay on 11g, but crashed with Oracle error “ORA-01843: not a valid month” after upgrade to 12c; why ?

The generically correct answer, of course, is that the OP had been lucky (or unlucky, depending on your point of view) on 11g – and I’ll explain that answer in another blog posting.

That isn’t the point of this posting, though. This posting is a test of observation and deduction. One of the respondants in the thread had conveniently supplied a little bit of SQL that I copied and fiddled about with to demonstrate a point regarding CPU costing, but as I did so I thought I’d show you the following and ask a simple question.’


drop table T;

Create Table T
As
with
periods as (
                  Select 'January' period, 1 cal  From Dual
        union all Select 'February' period , 2 cal From Dual
        union all Select 'March' period , 3 cal From Dual
        union all Select 'April' period , 4 cal From Dual
        union all Select 'May'  period, 5 cal From Dual
        union all Select 'June' period, 6 cal From Dual
        union all Select 'July' period, 7 cal From Dual
        union all Select 'August' period, 8 cal From Dual
        union all Select 'September' period, 9 cal  From Dual
        union all Select 'October' period, 10 cal From Dual
        union all Select 'November' period, 11 cal From Dual
        Union All Select 'December' Period, 12 Cal From Dual
        Union All Select '13 Series' Period, Null Cal  From Dual
)
Select  Period,Cal
from periods;

prompt  ==================================
prompt  When we invoke below SQL it works.
prompt  ==================================

set autotrace on explain

select *
from    (
        select
                Period,
                Cal,
                to_date(Period || ', ' || 2014,'Month, YYYY') col1 ,
                to_date('November, 2014','Month, YYYY') col2
        From  T
        Where  Cal > 0
        )
;

prompt  ================================================
prompt  But when we add comparison operations , it fails
prompt  ================================================

select *
from    (
        select
                Period,
                Cal,
                to_date(Period || ', ' || 2014,'Month, YYYY')   col1,
                to_date('November, 2014','Month, YYYY')         col2
        From  T
        Where  Cal > 0
        )
where
        col1 >= col2
;

set autotrace off



All I’ve done is create a table then run and generate the execution plans for two queries – with a comment that if you try to run one query it will succeed but if you try to run the other it will fail (and raise ORA-01843). As far as the original supplier was concerned, both queries succeeded in 11g and the failure of the second one appeared only in 12c. In fact, for reasons that I won’t discuss here, it is POSSIBLE for the failure to appear in 11g as well, though not necessarily with this exact data set.

Here’s the COMPLETE output I got from running the code above on an 11.2.0.4 instance:



Table dropped.


Table created.

==================================
When we invoke below SQL it works.
==================================

PERIOD           CAL COL1      COL2
--------- ---------- --------- ---------
January            1 01-JAN-14 01-NOV-14
February           2 01-FEB-14 01-NOV-14
March              3 01-MAR-14 01-NOV-14
April              4 01-APR-14 01-NOV-14
May                5 01-MAY-14 01-NOV-14
June               6 01-JUN-14 01-NOV-14
July               7 01-JUL-14 01-NOV-14
August             8 01-AUG-14 01-NOV-14
September          9 01-SEP-14 01-NOV-14
October           10 01-OCT-14 01-NOV-14
November          11 01-NOV-14 01-NOV-14
December          12 01-DEC-14 01-NOV-14

12 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    12 |   228 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |    12 |   228 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CAL">0)

Note
-----
   - dynamic sampling used for this statement (level=2)

================================================
But when we add comparison operations , it fails
================================================

PERIOD           CAL COL1      COL2
--------- ---------- --------- ---------
November          11 01-NOV-14 01-NOV-14
December          12 01-DEC-14 01-NOV-14

2 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    19 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    19 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CAL">0 AND TO_DATE("PERIOD"||', '||'2014','Month,
              YYYY')>=TO_DATE(' 2014-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


So this is the question. What’s the anomaly in this output ?

Bonus question: What’s the explanation for the anomaly ?

Answers:

If I had asked why the query might, or might not, crash – the answer would be about the order of predicate evaluation, and simply collecting stats (or not) might have made a difference. Ever since “system stats”  and “CPU costing” appeared the optimizer has been able to change the order in which it applies filter predicates to a table (there’s a pdf of an article of mine from Oracle magazine in the 9i / 10g timeline linked at this URL) .  In this case, applying the “cal > 0″ predicate first luckily eliminates the rows that would fail the second predicate. Since the effect is driven by the optimizer’s stats this type of failure could occur ANY TIME you have a predicate that requires coercion between types to take place – which is one reason why you see the injunctions to use the correct data types; and why, if you need coercion to work around incorrect data types you have to consider writing your own functions to trap and resolve the necessary errors raised by Oracle’s implicit conversion mechanisms.

For a quick sketch of the optimizer strategy, the arithmetic is roughly:  predicate A costs c1 and predicate B costs c2; if I apply predicate A to every row I have to apply predicate B to only N surviving rows; if I apply predicate B to every row I have to apply predicate A to M surviving rows; which is smaller: (input_rows * c1 + N * c2) or (input_rows * c2 + M * c1).

The answer to the question I actually asked is this, though: I stressed the fact that this was the COMPLETE output because, as Narenda highlighted in comment 7 below –  the first query shows a note about dynamic sampling and the second query does not. This is a little surprising; we don’t have stats on the table, and the two queries are different so we have to optimizer both of them.  In 12c, of course, it’s possible that the optimizer may have done something clever with statistics feedback (formerly cardinality feedback) and created an SQL directive – but even then we should have seen a note about that.

For the bonus question: given the second output doesn’t report dynamic sampling we should be curious why not – did the optimizer simply decide not to try, did it try then decide not to use the results for some reason, or is there some other reason.  The obvious next step is to look at the 10053 (optimizer) trace – where you find that the optimizer DID do dynamic sampling or rather, it tried to do dynamic sampling but the query generated to take the sample failed with Oracle error ORA-01843, as suggested by Chinar Aliyev in comment 9  and expanded by Mohamed Houri in comment 11.

The irony of the sampling problem (hinted by Chinar Aliyev in comment 10) is that you could be in a position where you have a large table and oracle picks a small sample which happens to miss any of the problem rows and then return a sample that persuades the optimizer to pick an execution plan that is bound to find a problem row; alternatively the SQL used to generate the sample might apply the predicate in an order that manages to eliminate the problem rows, while the final plan derived after sampling persuades the optimizer to use the predicate in the order B, A.

 

November 23, 2014

Baselines

Filed under: CBO,Hints,Oracle,Troubleshooting — Jonathan Lewis @ 8:58 pm GMT Nov 23,2014

I’m not very keen on bending the rules on production systems, I’d prefer to do things that look as if they could have happened in a completely legal fashion, but sometimes it’s necessary to abuse the system and here’s an example to demonstrate the point. I’ve got a simple SQL statement consisting of nothing more than an eight table join where the optimizer (on the various versions I’ve tested, including 12c) examines 5,040 join orders (even though _optimizer_max_permutations is set to the default of 2,000 – and that might come as a little surprise if you thought you knew what that parameter was supposed to do):

select
	/*+ star */
	d1.p1, d2.p2, d3.p3,
	d4.p4, d5.p5, d6.p6,
	d7.p7,
	f.small_vc
from
	dim1		d1,
	dim2		d2,
	dim3		d3,
	dim4		d4,
	dim5		d5,
	dim6		d6,
	dim7		d7,
	fact_tab	f
where
	d1.v1 = 'abc'
and	d2.v2 = 'def'
and	d3.v3 = 'ghi'
and	d4.v4 = 'ghi'
and	d5.v5 = 'ghi'
and	d6.v6 = 'ghi'
and	d7.v7 = 'ghi'
and	f.id1 = d1.id
and	f.id2 = d2.id
and	f.id3 = d3.id
and	f.id4 = d4.id
and	f.id5 = d5.id
and	f.id6 = d6.id
and	f.id7 = d7.id
;

It’s useful to have such extreme examples because they make it easy to notice certain features of Oracle’s behaviour – in this case the difference between SQL Plan Baselines and SQL Profiles. After I had created a baseline for this statement the optimizer still examined 5,040 join orders because that’s the way that baselines work – the optimizer first tries to optimize the statement without the baseline in case it can produce a better plan (for future evaluation and evolution) than the plan dictated by the baseline.

In my example this wasn’t really a problem (apart from the memory requirement in the SGA) but one of my clients has a production query that takes 7 seconds to optimize then runs in a few seconds more, so I’d like to get rid of that 7 seconds … without touching the code. Adding a baseline won’t reduce the time to optimize. (Note: 7 seconds may not seem like a lot, but when several copies of this query are executed concurrently using a different literal value as an input, that 7 seconds can go up to anything between 40 and 500 seconds of CPU parse time per query).

If I take a different tack and write some code to acquire the content of the SQL Plan Baseline (viz. the outline section from the in-memory execution plan) but store it as an SQL Profile the optimizer simply follows the embedded hints and examines just one join order (because the set of hints includes a leading() hint specifying exactly the join order required). This is why I will, occasionally, take advantage of the code that Kerry Osborne and Randolf Geist produced some time ago to raid the library cache for a child cursor and store its plan outline as an SQL profile.

Footnote:

This dirty trick doesn’t always solve the problem – the first example I tried to use for my demonstration did a complex concatenation that took a couple of minutes to optimise, and storing the baseline as a profile didn’t help.

 

 

November 21, 2014

Parallel Costs

Filed under: CBO,Oracle,Parallel Execution — Jonathan Lewis @ 11:36 am GMT Nov 21,2014

While creating a POC of a SQL rewrite recently I received a little surprise as I switched my query from serial execution to parallel execution and saw the optimizer’s estimated cost increase dramatically. I’ll explain why in a moment, but it made me think it might be worth setting up a very simple demonstration of the anomaly. I created a table t1 by copying view all_source – which happened to give me a table with about 100,000 rows and 1117 blocks – and then ran the query ‘select max(line) from t1;’ repeating the query with a /*+ parallel(t1 2) */ hint. From 11.2.0.4 here are the two execution plans I got:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |   153   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 99173 |   387K|   153   (4)| 00:00:01 |
---------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |     4 |  1010   (1)| 00:00:05 |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |     4 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     4 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |     4 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          | 99173 |   387K|  1010   (1)| 00:00:05 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| T1       | 99173 |   387K|  1010   (1)| 00:00:05 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------

It seems that the optimizer thinks that running the query parallel 2 will take five times as long as running it serially ! Is it, perhaps, some cunning fix to the optimizer that is trying to allow for process startup time for the parallel execution slaves ? Is it a bug ? No – it’s just that I hadn’t got around to setting my system stats and they were complete garbage thanks to various other tests I had been running over the previous couple of weeks. Critically, I had not adjusted the “parallel-oriented” statistics to be consistent with the “serial-oriented” stats.

Here, from the 10053 trace file for the parallel run, is the section on the Single Table Access costing, together with the system stats and base statistical information:


SYSTEM STATISTICS INFORMATION
-----------------------------
  Using WORKLOAD Stats
  CPUSPEED: 1110 millions instructions/sec
  SREADTIM: 4.540000 milliseconds
  MREADTIM: 12.440000 millisecons
  MBRC:     21 blocks
  MAXTHR:   3000000 bytes/sec
  SLAVETHR: 1000000 bytes/sec

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: T1  Alias: T1
    #Rows: 99173  #Blks:  1117  AvgRowLen:  76.00  ChainCnt:  0.00
Access path analysis for T1
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  Table: T1  Alias: T1
    Card: Original: 99173.000000  Rounded: 99173  Computed: 99173.00  Non Adjusted: 99173.00
  Access Path: TableScan
    Cost:  151.13  Resp: 1010.06  Degree: 0
      Cost_io: 147.00  Cost_cpu: 20826330
      Resp_io: 1007.76  Resp_cpu: 11570183

I’m going to walk through the optimizer’s calculations that got the serial I/O cost (cost_io: 147.00) and the parallel I/O cost (Resp_io: 1007.76), but before I do that I’ll point out how inconsistent some of the system stat are. The multiblock read time (mreadtim) is 12.44 milliseconds, to read an average of 21 blocks (mbrc) which, converting to bytes per second means (21 * 8192) * 1000/12.44 = 13,828,938 bytes per second; but the I/O rate for a single parallel execution slave (slavethr) is only 1,000,000 bytes per second – which means a single serial session can (apparently) tablescan nearly 14 times as fast as an individual parallel execution slave. It’s not surprising that somehow the optimizer thinks a serial tablescan will be faster than parallel 2) – but let’s check exactly how the arithmetic falls out.

Serial:

  • #Blks: 1117, MBRC = 21 => read requests = 1117/21 = 53.19
  • sreadtim = 4.54 milliseconds, mreadtim = 12.44 milliseconds = 2.74 * sreadtim
  • Cost = 53.19 * 2.74 + 1 (_tablescan_cost_plus_one = true) = 146.74 — which looks close enough.

Parallel:

  • #Blks: 1117, block size = 8KB => total I/O requirement = 9,150,464 bytes
  • slavethr: 1,000,000 bytes/sec, degree 2 => total throughput 2,000,000 bytes/sec => elapsed I/O time = 4.575232 seconds
  • sreadtim = 4.54 milliseconds => cost = 4.575232 / 0.00454 = 1007.76 QED.

Two final thoughts:

First, if I increase the degree of parallelism to 3 the cost drops to 673 (671.84 plus a bit of CPU); if I increase the degree any further the cost doesn’t drop any further – that’s because the maxthr (maximum throughput) is 3,000,000. The optimizer uses the smaller of maxthr and (degree * slavethr) in the parallel arithmetic.

Secondly, and most importantly, the way the optimizer behaves will be affected by the various parameters relating to parallelism, so here’s a list of the relevant settings on the instance I was using when creating this example:


SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_degree_limit                string      CPU
parallel_degree_policy               string      MANUAL
parallel_execution_message_size      integer     16384
parallel_force_local                 boolean     FALSE
parallel_instance_group              string
parallel_io_cap_enabled              boolean     FALSE
parallel_max_servers                 integer     80
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_min_time_threshold          string      AUTO
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_servers_target              integer     32
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

Note, particularly, that I have not enabled parallel_automatic_tuning.

For further details on parallel costs and the cost/time equivalence that the optimizer uses in recent versions, here’s a link to an article by Randolf Geist.

November 20, 2014

Quantum Data

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 11:30 am GMT Nov 20,2014

That’s data that isn’t there until you look for it, sort of, from the optimizer’s perspective.

Here’s some code to create a sample data set:


create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum					id,
	mod(rownum-1,200)			mod_200,
	mod(rownum-1,10000)			mod_10000,
	lpad(rownum,50)				padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 1'
	);
end;
/

Now derive the execution plans for a couple of queries noting, particularly, that we are using queries that are NOT CONSISTENT with the current state of the data (or more importantly the statistics about the data) – we’re querying outside the known range.


select * from t1 where mod_200  = 300;
select * from t1 where mod_200 >= 300;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  2462 |   151K|  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  2462 |   151K|  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=300)

SQL> select * from t1 where mod_200 >=300;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  2462 |   151K|  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  2462 |   151K|  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200">=300)

The predicted cardinality for mod_200 = 300 is the same as that for mod_200 >= 300. So, to be self-consistent, the optimizer really ought to report no rows (or a token 1 row) for any value of mod_200 greater than 300 – but it doesn’t.


SQL> select * from t1 where mod_200 = 350;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1206 | 75978 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  1206 | 75978 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=350)

SQL> select * from t1 where mod_200 =360;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   955 | 60165 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   955 | 60165 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=360)

SQL> select * from t1 where mod_200 =370;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   704 | 44352 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   704 | 44352 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=370)

SQL> select * from t1 where mod_200 =380;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   452 | 28476 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   452 | 28476 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=380)

SQL> select * from t1 where mod_200 in (350, 360, 370, 380);

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  3317 |   204K|  1275   (7)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  3317 |   204K|  1275   (7)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=350 OR "MOD_200"=360 OR "MOD_200"=370 OR "MOD_200"=380)

The IN-list results are consistent with the results for the individual values – but the result for the IN-list is NOT consistent with the result for the original mod_200 >= 300. The optimizer uses a “linear decay” strategy for handling predicates that go out of range, but not in a consistent way. It seems that, as far as out-of-range, range-based predicates are concerned, the data doesn’t exist until the wave front collapses.

Footnote:

This type of anomaly COULD affect some execution plans if your statistics haven’t been engineered to avoid the problems of “out of range” traps.

November 17, 2014

Plan puzzle

Filed under: CBO,Oracle,Troubleshooting — Jonathan Lewis @ 1:43 pm GMT Nov 17,2014

I was in Munich a few weeks ago running a course on Designing Optimal SQL and Troubleshooting and Tuning, but just before I flew in to Munich one of the attendees emailed me with an example of a statement that behaved a little strangely and asked me if we could look at it during the course.  It displays an odd little feature, and I thought it might be interesting to write up what I did to find out what was going on. We’ll start with the problem query and execution plan:


select     s section_size,
           max(program_id) ,avg(program_id)
from       fullclones_test
cross join  (select distinct section_size s from fullclones_test)
where      section_size = (select min(section_size) from fullclones_test)
and        clone_size >= s
group by   s
order by   1; 

Since I found this a little difficult to follow (with repetitions of the same table name, and column aliases switching back and forth) I did a little cosmetic modification; all I’ve done is add table aliases and query block names, then arranged the text for my visual benefit. The aliases and query block names can help when dissecting the anomaly.


select
	/*+ qb_name(main) */
	max(ft1.program_id),
	avg(ft1.program_id),
	ftv.s	section_size
from
	fullclones_test	ft1
cross join
	(
	select	/*+ qb_name(inline) */
		distinct ft2.section_size s
	from	fullclones_test	ft2
	)	ftv
where	ft1.section_size = (
		select 	/*+ qb_name(subq) */
			min(section_size)
		from	fullclones_test	ft3
	)
and	ft1.clone_size >= ftv.s
group by
	ftv.s
order by
	ftv.s
; 

This query ran reasonably quickly (about half a second), producing the following execution plan:


--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                 |      1 |        |   404 (100)|      4 |00:00:00.05 |    1116 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |                 |      1 |      5 |   404   (6)|      4 |00:00:00.05 |    1116 |       |       |          |
|   2 |   MERGE JOIN           |                 |      1 |  48637 |   299   (7)|  50361 |00:00:00.58 |    1116 |       |       |          |
|   3 |    SORT JOIN           |                 |      1 |      5 |   114  (11)|      5 |00:00:00.02 |     372 |  2048 |  2048 | 2048  (0)|
|   4 |     VIEW               |                 |      1 |      5 |   114  (11)|      5 |00:00:00.02 |     372 |       |       |          |
|   5 |      HASH UNIQUE       |                 |      1 |      5 |   114  (11)|      5 |00:00:00.02 |     372 |  5894K|  3254K|  884K (0)|
|   6 |       TABLE ACCESS FULL| FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.31 |     372 |       |       |          |
|*  7 |    SORT JOIN           |                 |      5 |  20000 |   185   (4)|  50361 |00:00:00.16 |     744 |   619K|   472K|  550K (0)|
|*  8 |     TABLE ACCESS FULL  | FULLCLONES_TEST |      1 |  20000 |   106   (4)|  20076 |00:00:00.09 |     744 |       |       |          |
|   9 |      SORT AGGREGATE    |                 |      1 |      1 |            |      1 |00:00:00.01 |     372 |       |       |          |
|  10 |       TABLE ACCESS FULL| FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.29 |     372 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$071BB01A
   4 - INLINE       / FTV@SEL$1
   5 - INLINE   
   6 - INLINE       / FT2@INLINE   
   8 - SEL$071BB01A / FT1@SEL$1
   9 - SUBQ 
  10 - SUBQ         / FT3@SUBQ

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - access("FT1"."CLONE_SIZE">="FTV"."S")
       filter("FT1"."CLONE_SIZE">="FTV"."S")
   8 - filter("FT1"."SECTION_SIZE"=)

As you can see by comparing the block name / object alias information, we can identify a single full tablescan being executed at line 9 to produce the min(section_size) in the subquery.

We can also see that the “select distinct” executes at lines 5/6 to produce 5 rows which are then joined with a merge join to the first full tablescan of t1.

If you’re wondering about the appearance of a sel$1 despite my efforts to name every query block, that’s the (almost) inevitable side effect of using ANSI syntax – virtually every join after the first two tables will introduce a new (unnameable) query block to introduce the next table.

Now here’s the anomaly: if we eliminate the avg() from the select list we’re going to produce a result that ought to require less work – but look what happens:

--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                 |      1 |        | 10802 (100)|      4 |00:02:48.83 |    1116 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |                 |      1 |      5 | 10802  (94)|      4 |00:02:48.83 |    1116 |       |       |          |
|   2 |   MERGE JOIN           |                 |      1 |    972M| 10697  (95)|   1007M|03:21:28.41 |    1116 |       |       |          |
|   3 |    SORT JOIN           |                 |      1 |  99999 |   380   (4)|  80042 |00:00:00.39 |     372 |  2037K|   674K| 1810K (0)|
|   4 |     TABLE ACCESS FULL  | FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.26 |     372 |       |       |          |
|*  5 |    SORT JOIN           |                 |  80042 |  20000 |   185   (4)|   1007M|00:57:11.13 |     744 |   619K|   472K|  550K (0)|
|*  6 |     TABLE ACCESS FULL  | FULLCLONES_TEST |      1 |  20000 |   106   (4)|  20076 |00:00:00.11 |     744 |       |       |          |
|   7 |      SORT AGGREGATE    |                 |      1 |      1 |            |      1 |00:00:00.01 |     372 |       |       |          |
|   8 |       TABLE ACCESS FULL| FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.28 |     372 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$6B65F52B
   4 - SEL$6B65F52B / FT2@INLINE
   6 - SEL$6B65F52B / FT1@SEL$1
   7 - SUBQ
   8 - SUBQ         / FT3@SUBQ

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("FT1"."CLONE_SIZE">="FT2"."SECTION_SIZE")
       filter("FT1"."CLONE_SIZE">="FT2"."SECTION_SIZE")
   6 - filter("FT1"."SECTION_SIZE"=)

Ignore the timings from lines 2 and 5 – I was using the hint gather_plan_statistics to collect the rowsource execution stats, and those lines are showing a massive sampling error. The query took about 7 minutes 30 seconds to run. The key difference is that line 4 shows that the “select distinct” is NOT aggregated early – the optimizer has used complex view merging to “join then aggregate” rather than “aggregate then join”. As you can see, this was a bad choice and the join has produced over a billion (US) rows at line 2 which then have to aggregated down to just 4 rows in line 1.

The question then is why ? If I put a /*+ no_merge */ hint in query block named “inline” the optimizer accepts the hint and goes back to the plan that aggregates early and runs very quickly – so it’s not a question of the optimizer bypassing some mechanism to avoid getting the wrong answer. I think the only option available to use for further investigation at this point is to examine the 10053 (optimizer) trace file to see what’s going on.

From the (12c)  trace file where we select the avg() we see the following lines:

OJE: Begin: find best directive for query block INLINE (#0)
OJE: End: finding best directive for query block INLINE (#0)
CVM: CBQT Marking query block INLINE (#0) as valid for CVM.
CVM:   Not Merging INLINE (#0) into SEL$1 (#0) due to CBQT directive.

From the equivalent position in the trace file where we select only the max() we see the lines:

OJE: Begin: find best directive for query block INLINE (#0)
OJE: End: finding best directive for query block INLINE (#0)
CVM:   Merging SPJ view INLINE (#0) into SEL$1 (#0)

It’s always hard to decide exactly WHY things happen – but it looks as if the optimizer merges the view heuristically in the max() case “because it can”, but has a heuristic (internal directive) that stops it from merging in the avg() case. What a good job we have hints !

Footnote:

In cases like this it’s always possible that there’s a generic case that might produce wrong results even though the specific case won’t – so it’s worth spending a little time thinking about how the wrong results might appear. It’s also worth keep hold of the SQL as a test case because if there’s a generic block in place to handle specific cases you may find that future enhancements allow the block to be lifted for some cases, and it’s nice to be able to check for such cases as you upgrade.

On the other hand, you can get back to the efficient plan if you change the inline view to be:


        (
        select  /*+
                        qb_name(inline)
                */
                ft2.section_size s , count(*) ct
        from    fullclones_test ft2
        group by
                ft2.section_size
        )       ftv


That’s just a little bit surprising – but I’m not going to pursue this one any further, beyond noting that there are some interesting anomalies available with inline aggregate views, even in 12.1.0.2.

Footnote 2:

If you want to experiment further, here’s the SQL to generate the data set:

create table fullclones_test (
        program_id      number(5,0),
        start_line      number,
        clone_id        number(5,0),
        clone_line      number,
        clone_size      number,
        range_start     number,
        section_size    number(4,0)
)
;

insert into fullclones_test (
        program_id, start_line, clone_id, clone_line,
        clone_size, range_start, section_size
)
Select
        1,
        trunc(dbms_random.value(1,1000)),
        trunc(dbms_random.value(1,10)),
        trunc(dbms_random.value(1,1000)),
        trunc(dbms_random.value(20,100)),
        0,
        20*trunc(dbms_random.value(1,6))
from
        dual
connect by
        level <100000
;

commit;

exec dbms_stats.gather_table_stats(user,'fullclones_test')

Finally, for consistently reproducible results I had engineered my system stats as follows:


        begin
                dbms_stats.set_system_stats('MBRC',16);
                dbms_stats.set_system_stats('MREADTIM',10);
                dbms_stats.set_system_stats('SREADTIM',5);
                dbms_stats.set_system_stats('CPUSPEED',1000);
        end;

November 5, 2014

Cardinality Feedback

Filed under: CBO,Oracle,Troubleshooting — Jonathan Lewis @ 6:43 pm GMT Nov 5,2014

A fairly important question, and a little surprise, appeared on Oracle-L a couple of days ago. Running 11.2.0.3 a query completed quickly on the first execution then ran very slowly on the second execution because Oracle had used cardinality feedback to change the plan. This shouldn’t really be entirely surprising – if you read all the notes that Oracle has published about cardinality feedback – but it’s certainly a little counter-intuitive.

Of course there are several known bugs related to cardinality feedback that could cause this anomaly to appear – a common complaint seems to relate to views on the right-hand (inner table) side of nested loop joins, and cardinality feedback being used on a table inside the view; but there’s an inherent limitation to cardinality feedback that makes it fairly easy to produce an example of a query doing more work on the second execution.

The limitation is that cardinality feedback generally can’t be used (sensibly) on all the tables where better information is needed. This blog describes the simplest example I can come up with to demonstrate the point. Inevitably it’s a little contrived, but it captures the type of guesswork and mis-estimation that can make the problem appear in real data sets. Here’s the query I’m going to use:


select
	t1.n1, t1.n2, t2.n1, t2.n2
from
	t1, t2
where
	t1.n1 = 0
and	t1.n2 = 1000
and	t2.id = t1.id
and	t2.n1 = 0
and	t2.n2 = 400
;

You’ll notice that I’ve got two predicates on both tables so, in the absence of “column-group” extended stats the optimizer will enable cardinality feedback as the query runs to check whether or not its “independent columns” treatment of the predicates gives a suitably accurate estimate of cardinality and a reasonable execution plan. If the estimates are bad enough the optimizer will use information it has gathered as the query ran as an input to re-optimising the query on the next execution.

So here’s the trick.  I’m going to set up the data so that there seem to be only two sensible plans:  (a) full scan of t1, with nested loop unique index access to t2; (b) full scan of t2, with nested loop unique index access to t1. But I’m going to make sure that the optimizer thinks that (a) is more efficient than (b) by making making the stats look as if (on average) the predicates on t1 should return 100 rows while the predicates on t2 return 200 rows.

On the other hand I’ve set the data up so that (for this specific set of values) t1 returns 1,000 rows which means Oracle will decide that its estimate was so far out that it will re-optimize with 1,000 as the estimated single table access cardinality for t1 – and that means it will decide to do the nested loop from t2 to t1. But what the optimizer doesn’t know (and hasn’t been able to find out by running the first plan) is that with this set of predicates t2 will return 20,000 rows to drive the nested loop into t1 – and the new execution plan will do more buffer gets and use more CPU (and time) than the old plan. Since cardinality feedback is applied only once, the optimizer won’t be able to take advantage of the second execution to change the plan again, or even to switch back to the first plan.

Here’s the setup so you can test the behaviour for yourselves:


create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum			id,
	mod(rownum,2)		n1,
	mod(rownum,2000)	n2,	-- 200 rows for each value on average
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 4e5
;

alter table t1 add constraint t1_pk primary key(id);

create table t2
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum			id,
	mod(rownum,2)		n1,
	2 * mod(rownum,1000)	n2,	-- 400 rows for each value on average, same range as t1
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 4e5
;

alter table t2 add constraint t2_pk primary key(id);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T2',
		method_opt	 => 'for all columns size 1'
	);
end;
/

--
-- Now update both tables to put the data out of sync with the statistics
-- We need a skewed value in t1 that is out by a factor of at least 8 (triggers use of CF)
-- We need a skewed value in t2 that is so bad that the second plan is more resource intensive than the first
--

update t1 set n2 = 1000 where n2 between 1001 and 1019;
update t2 set n2 =  400 where n2 between 402 and 598;
commit;

Here are the execution plans for the first and second executions (with rowsource execution statistics enabled, and the “allstats last” option used in a call to dbms_xplan.display_cursor()).


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |      1 |        |  1049 (100)|      0 |00:00:00.36 |   11000 |   6588 |
|   1 |  NESTED LOOPS                |       |      1 |    100 |  1049   (3)|      0 |00:00:00.36 |   11000 |   6588 |
|   2 |   NESTED LOOPS               |       |      1 |    100 |  1049   (3)|   2000 |00:00:00.35 |    9000 |   6552 |
|*  3 |    TABLE ACCESS FULL         | T1    |      1 |    100 |   849   (4)|   2000 |00:00:00.30 |    6554 |   6551 |
|*  4 |    INDEX UNIQUE SCAN         | T2_PK |   2000 |      1 |     1   (0)|   2000 |00:00:00.02 |    2446 |      1 |
|*  5 |   TABLE ACCESS BY INDEX ROWID| T2    |   2000 |      1 |     2   (0)|      0 |00:00:00.01 |    2000 |     36 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("T1"."N2"=1000 AND "T1"."N1"=0))
   4 - access("T2"."ID"="T1"."ID")
   5 - filter(("T2"."N2"=400 AND "T2"."N1"=0))

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |      1 |        |  1249 (100)|      0 |00:00:00.66 |   32268 |   1246 |
|   1 |  NESTED LOOPS                |       |      1 |    200 |  1249   (3)|      0 |00:00:00.66 |   32268 |   1246 |
|   2 |   NESTED LOOPS               |       |      1 |    200 |  1249   (3)|  20000 |00:00:00.56 |   12268 |    687 |
|*  3 |    TABLE ACCESS FULL         | T2    |      1 |    200 |   849   (4)|  20000 |00:00:00.12 |    6559 |    686 |
|*  4 |    INDEX UNIQUE SCAN         | T1_PK |  20000 |      1 |     1   (0)|  20000 |00:00:00.19 |    5709 |      1 |
|*  5 |   TABLE ACCESS BY INDEX ROWID| T1    |  20000 |      1 |     2   (0)|      0 |00:00:00.15 |   20000 |    559 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("T2"."N2"=400 AND "T2"."N1"=0))
   4 - access("T2"."ID"="T1"."ID")
   5 - filter(("T1"."N2"=1000 AND "T1"."N1"=0))

Note
-----
   - cardinality feedback used for this statement

The second plan does fewer reads because of the buffering side effects from the first plan – but that’s not what the optimizer is looking at. The key feature is that the first plan predicts 100 rows for t1, with 100 starts for the index probe, but discovers 2,000 rows and does 2,000 probes. Applying cardinality feedback the optimizer decides that fetching 200 rows from t2 and probing t1 200 times will be lower cost than running the join the other way round with the 2,000 rows it now knows it will get – but at runtime Oracle actually gets 20,000 rows, does three times as many buffer gets, and spends twice as much time as it did on the first plan.

Hinting

Oracle hasn’t been able to learn (in time) that t2 will supply 20,000 rows – but if you knew this would happen you could use the cardinality() hint to tell the optimizer the truth about both tables /*+ cardinality(t1 2000) cardinality(t2 20000) */ this is the plan you would get:

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |  1698 (100)|      0 |00:00:00.06 |   13109 |  13105 |       |       |          |
|*  1 |  HASH JOIN         |      |      1 |   2000 |  1698   (4)|      0 |00:00:00.06 |   13109 |  13105 |  1696K|  1696K| 1647K (0)|
|*  2 |   TABLE ACCESS FULL| T1   |      1 |   2000 |   849   (4)|   2000 |00:00:00.05 |    6554 |   6552 |       |       |          |
|*  3 |   TABLE ACCESS FULL| T2   |      1 |  20000 |   849   (4)|  20000 |00:00:00.09 |    6555 |   6553 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."ID"="T1"."ID")
   2 - filter(("T1"."N2"=1000 AND "T1"."N1"=0))
   3 - filter(("T2"."N2"=400 AND "T2"."N1"=0))

Unfortunately, unless you have used hints, it doesn’t matter how many times you re-run the query after cardinality feedback has pushed you into the bad plan – it’s not going to change again (unless you mess around flushing the shared pool or using dbms_shared_pool.purge() to kick out the specific statement).

Upgrade

If you upgrade to 12c the optimizer does a much better job of handling this query – it produces an adaptive execution plan (starting with the nested loop join but dynamically switching to the hash join as the query runs). Here’s the full adaptive plan pulled from memory after the first execution – as you can see both the t1/t2 nested loop and hash joins were considered, then the nested loop was discarded in mid-execution. Checking the 10053 trace file I found that Oracle has set the inflexion point (cross-over from NLJ to HJ) at 431 rows.


----------------------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                     | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT              |       |      1 |        |  1063 (100)|      0 |00:00:00.06 |   13113 |  13107 |       |       |          |
|  *  1 |  HASH JOIN                    |       |      1 |    100 |  1063   (3)|      0 |00:00:00.06 |   13113 |  13107 |  1519K|  1519K| 1349K (0)|
|-    2 |   NESTED LOOPS                |       |      1 |    100 |  1063   (3)|   2000 |00:00:00.11 |    6556 |   6553 |       |       |          |
|-    3 |    NESTED LOOPS               |       |      1 |    100 |  1063   (3)|   2000 |00:00:00.10 |    6556 |   6553 |       |       |          |
|-    4 |     STATISTICS COLLECTOR      |       |      1 |        |            |   2000 |00:00:00.09 |    6556 |   6553 |       |       |          |
|  *  5 |      TABLE ACCESS FULL        | T1    |      1 |    100 |   863   (4)|   2000 |00:00:00.08 |    6556 |   6553 |       |       |          |
|- *  6 |     INDEX UNIQUE SCAN         | T2_PK |      0 |      1 |     1   (0)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|- *  7 |    TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |     2   (0)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|  *  8 |   TABLE ACCESS FULL           | T2    |      1 |      1 |     2   (0)|  20000 |00:00:00.07 |    6557 |   6554 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."ID"="T1"."ID")
   5 - filter(("T1"."N2"=1000 AND "T1"."N1"=0))
   6 - access("T2"."ID"="T1"."ID")
   7 - filter(("T2"."N2"=400 AND "T2"."N1"=0))
   8 - filter(("T2"."N2"=400 AND "T2"."N1"=0))

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

Footnote:

For reference, here are a couple of the bug (or patch)  numbers associated with cardinality feedback:

  • Patch 13454409: BAD CARDINALITY FROM FEEDBACK (CFB) ON VIEW RHS OF NLJ
  • Bug 16837274 (fixed in 12.2): Bad cost estimate for object on RHS of NLJ
  • Bug 12557401: The table that is being incorrectly adjusted is in the right hand side of a nested loops.
  • Bug 8521689: Bad cardinality feedback estimate for view on right-hand side of NLJ

 

October 31, 2014

first_rows(10)

Filed under: Bugs,CBO,Execution plans,Oracle — Jonathan Lewis @ 5:31 pm GMT Oct 31,2014

No, not the 10th posting about first_rows() this week – whatever it may seem like – just an example that happens to use the “calculate costs for fetching the first 10 rows” optimizer strategy and does it badly. I think it’s a bug, but it’s certainly a defect that is a poster case for the inherent risk of using anything other than all_rows optimisation.  Here’s some code to build a couple of sample tables:


begin
	dbms_stats.set_system_stats('MBRC',16);
	dbms_stats.set_system_stats('MREADTIM',12);
	dbms_stats.set_system_stats('SREADTIM',5);
	dbms_stats.set_system_stats('CPUSPEED',1000);
end;
/

create table t1
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 1e4
)
select
	rownum					id,
	trunc(dbms_random.value(1,1000))	n1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

create index t1_n1 on t1(id, n1);

create table t2
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 1e4
)
select
	rownum					id,
	trunc(dbms_random.value(10001,20001))	x1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

create index t2_i1 on t2(x1);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		estimate_percent => 100,
		method_opt	 => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T2',
		estimate_percent => 100,
		method_opt	 => 'for all columns size 1'
	);

end;
/


create or replace view  v1
as
select 
	id, n1, small_vc, padding
from	t1 
where	n1 between 101 and 300
union all
select 
	id, n1, small_vc, padding
from	t1 
where	n1 between 501 and 700
;

The key feature of this demonstration is the UNION ALL view and what the optimizer does with it when we have first_rows_N optimisation – this is a simplified model of a production problem I was shown a couple of years ago, so nothing special, nothing invented. Here’s a query that behaves badly:


select
	/*+ gather_plan_statistics */
	v1.small_vc,
	v1.n1
from
	v1,
	t2
where
	t2.id = v1.id
and	t2.x1 = 15000
;

I’m going to execute this query in three different ways – as is, using all_rows optimisation; as is, using first_rows_10 optimisation, then using all_rows optimisation but with the necessary hints to make it follow the first_rows_10 execution path. Here are the resulting plans from an instance of 12.1.0.2 (the same thing happens in 11.2.0.4):


first_rows_10 plan
----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |     1 |    35 |   107   (0)| 00:00:01 |
|*  1 |  HASH JOIN                           |       |     1 |    35 |   107   (0)| 00:00:01 |
|   2 |   VIEW                               | V1    |    24 |   600 |     4   (0)| 00:00:01 |
|   3 |    UNION-ALL                         |       |       |       |            |          |
|*  4 |     TABLE ACCESS FULL                | T1    |    12 |   240 |     2   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL                | T1    |    12 |   240 |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |   100 |  1000 |   103   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN                  | T2_I1 |   100 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

all_rows plan
------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |       |    40 |  1400 |   904   (1)| 00:00:01 |
|   1 |  NESTED LOOPS                          |       |    40 |  1400 |   904   (1)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED  | T2    |   100 |  1000 |   103   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                    | T2_I1 |   100 |       |     3   (0)| 00:00:01 |
|   4 |   VIEW                                 | V1    |     1 |    25 |     8   (0)| 00:00:01 |
|   5 |    UNION ALL PUSHED PREDICATE          |       |       |       |            |          |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    20 |     4   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
|   8 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    20 |     4   (0)| 00:00:01 |
|*  9 |      INDEX RANGE SCAN                  | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

first_rows_10 plan hinted under all_rows optimisation
---------------------------------------------------------------------------------------------- 
| Id  | Operation                    | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | 
---------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT             |       |   200 |  8600 |       |  6124   (3)| 00:00:01 | 
|*  1 |  HASH JOIN                   |       |   200 |  8600 |    17M|  6124   (3)| 00:00:01 |
|   2 |   VIEW                       | V1    |   402K|    12M|       |  5464   (3)| 00:00:01 | 
|   3 |    UNION-ALL                 |       |       |       |       |            |          | 
|*  4 |     TABLE ACCESS FULL        | T1    |   201K|  3933K|       |  2731   (3)| 00:00:01 | 
|*  5 |     TABLE ACCESS FULL        | T1    |   201K|  3933K|       |  2733   (3)| 00:00:01 | 
|   6 |   TABLE ACCESS BY INDEX ROWID| T2    |   100 |  1000 |       |   103   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN          | T2_I1 |   100 |       |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

I’m not sure why the first_rows_10 plan uses “table access by rowid batched”, but I’d guess it’s because the optimiser calculates that sorting the index rowids before visiting the table may have a small benefit on the speed of getting the first 10 rows – eventually I’ll get around to checking the 10053 trace file. The important thing, though, is the big mistake in the strategy, not the little difference in table access.

In the first_rows_10 plan the optimizer has decided building an in-memory hash table from the UNION ALL of the rows fetched from the two copies of the t1 table will be fast and efficient; but it’s made that decision based on the assumption that it will only get 10 rows from each copy of the table – and at run-time it HAS to get all the relevant t1 rows to build the hash table before it can get any t2 rows. We can get some idea of the scale of this error when we look at the hinted plan under all_rows optimisation – it’s a lot of redundant data and a very expensive hash table build.

In contrast the all_rows plan does an efficient indexed access into the t2 table then, for each row, does a join predicate pushdown into the union all view using an indexed access path. If we only wanted to fetch 10 rows we could stop after doing a minimum amount of work. To demonstrate the error more clearly I’ve re-run the experiment for the first two plans from SQL*PLus, setting the arraysize to 11, the pagesize to 5, and stopping after the first 10 rows. Here are the plans showing the rowsource execution stats:


first_rows_10 plan
------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |      1 |        |   107 (100)|     12 |00:00:00.43 |   35150 |       |       |          |
|*  1 |  HASH JOIN                           |       |      1 |      1 |   107   (0)|     12 |00:00:00.43 |   35150 |    24M|  3582K|   23M (0)|
|   2 |   VIEW                               | V1    |      1 |     24 |     4   (0)|    400K|00:00:06.48 |   35118 |       |       |          |
|   3 |    UNION-ALL                         |       |      1 |        |            |    400K|00:00:04.20 |   35118 |       |       |          |
|*  4 |     TABLE ACCESS FULL                | T1    |      1 |     12 |     2   (0)|    200K|00:00:00.71 |   17559 |       |       |          |
|*  5 |     TABLE ACCESS FULL                | T1    |      1 |     12 |     2   (0)|    200K|00:00:00.63 |   17559 |       |       |          |
|   6 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |      1 |    100 |   103   (0)|     28 |00:00:00.01 |      32 |       |       |          |
|*  7 |    INDEX RANGE SCAN                  | T2_I1 |      1 |    100 |     3   (0)|     28 |00:00:00.01 |       4 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

all_rows plan
-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |       |      1 |        |   904 (100)|     12 |00:00:00.01 |     213 |
|   1 |  NESTED LOOPS                          |       |      1 |     43 |   904   (1)|     12 |00:00:00.01 |     213 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED  | T2    |      1 |    100 |   103   (0)|     28 |00:00:00.01 |      32 |
|*  3 |    INDEX RANGE SCAN                    | T2_I1 |      1 |    100 |     3   (0)|     28 |00:00:00.01 |       4 |
|   4 |   VIEW                                 | V1    |     28 |      1 |     8   (0)|     12 |00:00:00.01 |     181 |
|   5 |    UNION ALL PUSHED PREDICATE          |       |     28 |        |            |     12 |00:00:00.01 |     181 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     28 |    212K|     4   (0)|      8 |00:00:00.01 |      93 |
|*  7 |      INDEX RANGE SCAN                  | T1_N1 |     28 |      1 |     3   (0)|      8 |00:00:00.01 |      85 |
|   8 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     28 |    213K|     4   (0)|      4 |00:00:00.01 |      88 |
|*  9 |      INDEX RANGE SCAN                  | T1_N1 |     28 |      1 |     3   (0)|      4 |00:00:00.01 |      84 |
-----------------------------------------------------------------------------------------------------------------------

If I had set the optimizer_mode to first_rows_10 because I really only wanted to fetch (about) 10 rows then I’ve managed to pay a huge overhead in buffer visits, memory and CPU for the privilege – the all_rows plan was much more efficient.

Remember – we often see cases where the first_rows(n) plan will do more work to get the whole data set in order to be able to get the first few rows more quickly (the simplest example is when the optimizer uses a particular index to get the first few rows of a result set in order without sorting rather than doing a (faster) full tablescan with sort. This case, though, is different: the optimizer is choosing to build a hash table as if it only has to put 10 rows into that hash table when it actually HAS to build the whole has table before it can take any further steps – we don’t get 10 rows quicker and the rest more slowly; we just get 10 very slow rows.

Footnote:

It’s possible that this is an example of bug 9633142: (FIRST_ROWS OPTIMIZER DOES NOT PUSH PREDICATES INTO UNION INLINE VIEW) but that’s reported as fixed in 12c, with a couple of patches for 11.2.0.2/3. However, setting “_fix_control”=’4887636:off’, does bypass the problem. (The fix control, introduced in 11.1.0.6 has description: “remove restriction from first K row optimization”)

October 30, 2014

Quiz night

Filed under: CBO,Oracle — Jonathan Lewis @ 6:43 pm GMT Oct 30,2014

Here’s a little puzzle that came up on OTN recently.  (No prizes for following the URL to find the answer) (Actually, no prizes anyway). There’s more in the original code sample than was really needed, so although I’ve done a basic cut and paste from the original I’ve also eliminated a few lines of the text:


execute dbms_random.seed(0)

create table t
as
select rownum as id,
       100+round(ln(rownum/3.25+2)) aS val2,
       dbms_random.string('p',250) aS pad
from dual
connect by level <= 1000
order by dbms_random.value;

begin
  dbms_stats.gather_table_stats(ownname          => user,
                                tabname          => 'T',
                                method_opt       => 'for all columns size 254'
  );
end;
/

column endpoint_value format 9999
column endpoint_number format 999999
column frequency format 999999

select endpoint_value, endpoint_number,
       endpoint_number - lag(endpoint_number,1,0)
                  OVER (ORDER BY endpoint_number) AS frequency
from user_tab_histograms
where table_name = 'T'
and column_name = 'VAL2'
order by endpoint_number
;

alter session set optimizer_mode = first_rows_100;

explain plan set statement_id '101' for select * from t where val2 = 101;
explain plan set statement_id '102' for select * from t where val2 = 102;
explain plan set statement_id '103' for select * from t where val2 = 103;
explain plan set statement_id '104' for select * from t where val2 = 104;
explain plan set statement_id '105' for select * from t where val2 = 105;
explain plan set statement_id '106' for select * from t where val2 = 106;

select statement_id, cardinality from plan_table where id = 0;

The purpose of the method_opt in the gather_table_stats() call is to ensure we get a frequency histogram on val2; and the query against the user_tab_columns view should give the following result:


ENDPOINT_VALUE ENDPOINT_NUMBER FREQUENCY
-------------- --------------- ---------
           101               8         8
           102              33        25
           103             101        68
           104             286       185
           105             788       502
           106            1000       212

Given the perfect frequency histogram, the question then arises why the optimizer seems to calculate incorrect cardinalities for some of the queries; the output from the last query is as follows:


STATEMENT_ID                   CARDINALITY
------------------------------ -----------
101                                      8
102                                     25
103                                     68
104                                    100           -- expected prediction 185
105                                    100           -- expected prediction 502
106                                    100           -- expected prediction 212

I’ve disabled comments so that you can read the answer at OTN if you want to – but see if you can figure out the reason before reading it. (This reproduces on 11g and 12c – and probably on earlier versions all the way back to 9i).

I haven’t done anything extremely cunning with hidden parameters, materialized views, query rewrite, hidden function calls, virtual columns or any other very dirty tricks, by the way.

October 28, 2014

First Rows

Filed under: CBO,Hints,Oracle — Jonathan Lewis @ 7:01 am GMT Oct 28,2014

Following on from the short note I published about the first_rows optimizer mode yesterday here’s a note that I wrote on the topic more than 2 years ago but somehow forgot to publish.

I can get quite gloomy when I read some of the material that gets published about Oracle; not so much because it’s misleading or wrong, but because it’s clearly been written without any real effort being made to check whether it’s true. For example, a couple of days ago [ed: actually some time around May 2012] I came across an article about optimisation in 11g that seemed to be claiming that first_rows optimisation somehow “defaulted” to first_rows(1) , or first_rows_1, optimisation if you didn’t supply a final integer value.

For at least 10 years the manuals have described first_rows (whether as a hint or as a parameter value) as being available for backwards compatibility; so if it’s really just a synonym for first_rows_1 (or first_rows(1)) you might think that the manuals would actually mention this. Even if the manuals didn’t mention it you might just consider a very simple little test before making such a contrary claim, and if you did make such a test and found that your claim was correct you might actually demonstrate (or describe) the test so that other people could check your results.

It’s rather important, of course, that people realise (should it ever happen) that first_rows has silently changed into first_rows_1 because any code that’s using it for backwards compatibility might suddenly change execution path when you did the critical upgrade where the optimizer changed from “backwards compatibility” mode to “completely different optimisation strategy” mode. So here’s a simple check (run from 11.2.0.4 – to make sure I haven’t missed the switch):

begin
        dbms_stats.set_system_stats('MBRC',16);
        dbms_stats.set_system_stats('MREADTIM',10);
        dbms_stats.set_system_stats('SREADTIM',5);
        dbms_stats.set_system_stats('CPUSPEED',1000);
end;
/

create table t2 as
select
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from    all_objects
where rownum <= 3000
;

begin
        dbms_stats.gather_table_stats(
                user,
                't2',
                method_opt => 'for all columns size 1'
 
        );
end;
/

create index t2_i1 on t2(n1);

SQL> select /*+ all_rows */ n2 from t2 where n1 = 15;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    15 |   120 |    12   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N1"=15)

You’ll notice that I’ve created my data in a way that means I’ll have 15 rows with the value 15, scattered evenly through the table. As a result of the scattering the clustering_factor on my index is going to be similar to the number of rows in the table, and the cost of fetching all the rows by index is going to be relatively high. Using all_rows optimization Oracle has chosen a tablescan.

So what happens if I use the first_rows(1) hint, and how does this compare with using the first_rows hint ?

SQL> select /*+ first_rows(1) */ n2 from t2 where n1 = 15;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     2 |    16 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2    |     2 |    16 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_I1 |       |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=15)

SQL> select /*+ first_rows */ n2 from t2 where n1 = 15;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    15 |   120 |    16   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2    |    15 |   120 |    16   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_I1 |    15 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=15)

You might not find it too surprising to see that Oracle used the indexed access path in both cases. Does this mean that first_rows really means (or defaults to) first_rows(1) ?

Of course it doesn’t – you need only look at the estimated cost and cardinality to see this. The two mechanisms are clearly implemented through difference code paths. The first_rows method uses some heuristics to restrict the options it examines, but still gives us the estimated cost and cardinality of fetching ALL the rows using the path it has chosen. The first_rows(1) method uses arithmetic to decide on the best path for getting the first row, and adjusts the cost accordingly to show how much work it thinks it will have to do to fetch just that one row.

Of course, no matter how inane a comment may seem to be, there’s always a chance that it might be based on some (unstated) insight. Is there any way in which first_rows(n) and first_rows are related ? If so could you possibly manage to claim that this establishes a “default value” link?

Funnily enough there is a special case: if you try hinting with first_rows(0) – that’s the number zero – Oracle will use the old first_rows optimisation method – you can infer this from the cost and cardinality figures, or you can check the 10053 trace file, or use a call to dbms_xplan() to report the outline.  It’s an interesting exercise (left to the reader) to decide whether this is the lexical analyzer deciding to treat the “(0)” as a new – and meaningless – token following the token “first_rows”, or whether it is the optimizer recognising the lexical analyzer allowing “first_rows(0)” as a token which the optimizer is then required to treat as first_rows.

Mind you, if you only want the first zero rows of the result set there’s a much better trick you can use to optimise the query – don’t run the query.

 

October 27, 2014

First Rows

Filed under: CBO,Oracle — Jonathan Lewis @ 7:21 am GMT Oct 27,2014

I received an email earlier on this year asking me my opinion of the first_rows option for the optimizer mode. My correspondent was looking at a database with the following settings:

optimizer_mode=first_rows
_sort_elimination_cost_ratio=4

He felt that first_rows was a very old optimizer instruction that might cause suboptimal execution plans in it’s attempt to avoid blocking operations. As for the cost ratio, no-one seemed to be able to explain why it was there.

He was correct; I’ve written the first_rows option a few times in the past – it was left in for backwards compatibility, and reported as such from 9i onwards!

As for the _sort_elimination_cost_ratio – it’s (probably) there to work around the problems caused by first_rows optimisation when you have an ORDER BY clause that could be met by walking an index in competition with a WHERE clause that could be met from another index. Under first_rows the optimizer is highly likely to choose the index for the order by to avoid sorting; but the _sort_elimination_cost_ratio says:  “if the cost of using the index for the ORDER BY is more than N times the cost of using the other index for the WHERE clause then use an index on the WHERE clause and sort the result.”

 The fact that the parameter has been set so low in this case suggests that the end-user:
  1. set first_rows because “it’s an OLTP system” – a myth that even the manuals promoted
  2. found lots of queries taking silly index-driven execution plans because they’d use (say) a primary key index to access and discard vast amounts of  data in the required order, instead of picking up a small amount of data using a better choice of index and then sorting it.
I’ve said many times it in the past: you probably don’t need any first_rows(n) or first_rows_N optimisation, but if you’re using first_rows (i.e. the old option) you really ought to get away from it. Depending on the time you have for testing and your aversion to risk, you might go straight to all_rows, or switch to first_rows_10.  (First_rows_1 can be over-aggressive and introduce some of the same side effects as first_rows).

August 21, 2014

Quiz night

Filed under: CBO,Indexing,NULL,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 6:05 pm GMT Aug 21,2014

Here’s a script to create a table, with index, and collect stats on it. Once I’ve collected stats I’ve checked the execution plan to discover that a hint has been ignored (for a well-known reason):

create table t2
as
select
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from
        all_objects
where
        rownum <= 3000
;

create index t2_i1 on t2(n1);

begin
        dbms_stats.gather_table_stats(
                user,
                't2',
                method_opt => 'for all columns size 1'
        );
end;
/

explain plan for
select  /*+ index(t2) */
        n1
from    t2
where   n2 = 45
;

select * from table(dbms_xplan.display);

----------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    15 |   120 |    15 |
|*  1 |  TABLE ACCESS FULL| T2   |    15 |   120 |    15 |
----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N2"=45)

Of course we don’t expect the optimizer to use the index because we didn’t declare n1 to be not null, so there may be rows in the table which do not appear in the index. The only option the optimizer has for getting the right answer is to use a full tablescan. So the question is this – how come Oracle will obey the hint in the following SQL statement:


explain plan for
select
        /*+
                leading (t2 t1)
                index(t2) index(t1)
                use_nl(t1)
        */
        t2.n1, t1.n2
from
        t2      t2,
        t2      t1
where
        t2.n2 = 45
and     t2.n1 = t1.n1
;

select * from table(dbms_xplan.display);

-------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |   225 |  3600 |  3248 |
|   1 |  NESTED LOOPS                         |       |   225 |  3600 |  3248 |
|   2 |   NESTED LOOPS                        |       |   225 |  3600 |  3248 |
|*  3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T2    |    15 |   120 |  3008 |
|   4 |     INDEX FULL SCAN                   | T2_I1 |  3000 |       |     8 |
|*  5 |    INDEX RANGE SCAN                   | T2_I1 |    15 |       |     1 |
|   6 |   TABLE ACCESS BY INDEX ROWID         | T2    |    15 |   120 |    16 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("T2"."N2"=45)
   5 - access("T2"."N1"="T1"."N1")

I ran this on 11.2.0.4, but it does the same on earlier versions.

Update:

This was clearly too easy – posted at 18:04, answered correctly at 18:21. At some point in it’s evolution the optimizer acquired a rule that allowed it to infer unwritten “is not null” predicates from the join predicate.

 

 

 

July 6, 2014

SQL Plan Baselines

Filed under: CBO,Hints,Oracle — Jonathan Lewis @ 6:34 pm GMT Jul 6,2014

Here’s a thread from Oracle-L that reminded of an important reason why you still have to hint SQL sometimes (rather than following the mantra “if you can hint it, baseline it”).

I have a query that takes 77 seconds to optimize (it’s not a production query, fortunately, but one I engineered to make a point). I can enable sql plan baseline capture and create a baseline for it, and given the nature of the query I can be confident that the resulting plan will always be exactly the plan I want. If I have to re-optimize the query at any time  (because it runs once per hour, say, and is constantly being flushed from the library cache) how much time will the SQL plan baseline save for me ?

The answer is NONE.

The first thing that the optimizer does for a query with a stored sql plan baseline is to optimize it as if the baseline did not exist.

If I want to get rid of that 77 seconds I’ll have to extract (most of) the hints from the SQL Plan Baseline and write them into the query.  (Or, maybe, create a Stored Outline – except that they’re deprecated in the latest version of Oracle, and I’d have to check whether the optimizer used the same strategy with stored outlines or whether it applied the outline before doing any optimisation). Maybe we could do with a hint which forces the optimizer to attempt to use an existing, accepted SQL Baseline without attempting the initial optimisation pass.

 

Next Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,521 other followers