Oracle Scratchpad

December 3, 2014

Upgrades

Filed under: CBO,Oracle,Upgrades — Jonathan Lewis @ 8:24 am BST Dec 3,2014

I have a simple script that creates two identical tables , collects stats (with no histograms) on the pair of them, then executes a join. Here’s the SQL to create the first table:


create table t1
nologging
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	trunc(dbms_random.value(0,1000))	n_1000,
	trunc(dbms_random.value(0,750))		n_750,
	trunc(dbms_random.value(0,600))		n_600,
	trunc(dbms_random.value(0,400))		n_400,
	trunc(dbms_random.value(0,90))		n_90,
	trunc(dbms_random.value(0,72))		n_72,
	trunc(dbms_random.value(0,40))		n_40,
	trunc(dbms_random.value(0,3))		n_3
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

-- gather stats: no histograms

The two tables have 1,000,000 rows each and t2 is created from t1 with a simple “create as select”. The columns are all defined to be integers, and the naming convention is simple – n_400 holds 400 distinct values with uniform distribution from 0 – 399, n_750 holds 750 values from 0 – 749, and so on.

Here’s the simple query:


select
        t1.*, t2.*
from
        t1, t2
where
        t1.n_400 = 0
and     t2.n_72  = t1.n_90
and     t2.n_750 = t1.n_600
and     t2.n_400 = 1
;

Since I’ve created no indexes you might expect the query to do a couple of and a hash join to get its result – and you’d be right; but what do you think the predicted cardinality would be ?

Here are the results from running explain plan on the query and then reporting the execution plan – for three different versions of Oracle:



9.2.0.8
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |    96 |  4992 |  1230  (10)|
|*  1 |  HASH JOIN           |             |    96 |  4992 |  1230  (10)|
|*  2 |   TABLE ACCESS FULL  | T1          |  2500 | 65000 |   617  (11)|
|*  3 |   TABLE ACCESS FULL  | T2          |  2500 | 65000 |   613  (10)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_750"="T1"."N_600" AND "T2"."N_72"="T1"."N_90")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

***************************************************************************

10.2.0.5
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   116 |  6032 |  1229  (10)| 00:00:07 |
|*  1 |  HASH JOIN         |      |   116 |  6032 |  1229  (10)| 00:00:07 |
|*  2 |   TABLE ACCESS FULL| T1   |  2500 | 65000 |   616  (11)| 00:00:04 |
|*  3 |   TABLE ACCESS FULL| T2   |  2500 | 65000 |   612  (10)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_750"="T1"."N_600" AND "T2"."N_72"="T1"."N_90")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

***************************************************************************

11.2.0.4
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  2554 |   139K|  1225  (10)| 00:00:07 |
|*  1 |  HASH JOIN         |      |  2554 |   139K|  1225  (10)| 00:00:07 |
|*  2 |   TABLE ACCESS FULL| T1   |  2500 | 70000 |   612  (10)| 00:00:04 |
|*  3 |   TABLE ACCESS FULL| T2   |  2500 | 70000 |   612  (10)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_72"="T1"."N_90" AND "T2"."N_750"="T1"."N_600")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

The change for 11.2.0.4 (which is still there for 12.1.0.2. I didn’t check to see if it also appears in 11.1.0.7) is particularly worrying. When you see a simple query like this changing cardinality on the upgrade you can be fairly confident that some of your more complex queries will change their plans – even if there are no clever new optimizer transformations coming into play.

I’ll write up an explanation of how the optimizer has produced three different estimates some time over the next couple of weeks; but if you want an earlier answer this is one of the things I’ll be covering in my presentation on calculating selectivity at “Super Sunday” at UKOUG Tech 14.

11 Comments »

  1. Hey Jonathan,
    unfortunately it is not quiz night, but afaik the cardinality change is caused by (newly) introduced sanity checks (or better said multi column sanity checks – one side selectivity) :-))

    Have fun at UKOUG Tech 14, hope to chat with you at DOAG 2015 again.

    Regards
    Stefan

    Comment by Stefan Koehler — December 3, 2014 @ 10:16 am BST Dec 3,2014 | Reply

    • Stefan,

      Correct – that explains the relatively small change in this example from 9i to 10g; and the larger change from 10g to 11g is explained by a change in the way the sanity check is applied.

      Comment by Jonathan Lewis — December 3, 2014 @ 10:20 am BST Dec 3,2014 | Reply

  2. Jonathan,
    since Stefan already started to handle it as a quiz night I would guess that the calculation in 10.2.0.5 is using the standard formula and uses the values from only one side (t2) to calculate the join selectivity: 1/72 * 1/750 * 2500 * 2500 = 115.74…

    Martin

    Comment by Martin Preiss — December 3, 2014 @ 11:34 am BST Dec 3,2014 | Reply

    • Martin,

      That is correct – but it’s an error, and the error has been addressed (arguably incorrectly) in the 11g example.

      Comment by Jonathan Lewis — December 3, 2014 @ 11:46 am BST Dec 3,2014 | Reply

      • playing with the post 11g example and its multi-column cardinality sanity check it seems that:
        – the cardinality is still determined by one side of the join
        – the cardinality does not get smaller than 2500 in the given example with different join columns
        – the cardinality above 2500 (54 in the given example) is somehow related to 1/NDV (54000 for the example according to a CBO trace created with dbms_sqldiag.dump_trace); but I am not able to catch this “somehow”.
        So I have to wait for the following articles …

        Comment by Martin Preiss — December 3, 2014 @ 8:26 pm BST Dec 3,2014 | Reply

        • Martin,

          Let me just point you in the critical direction – a paper (dated 2007, so Oracle’s application of the algorithm may have changed) by Alberto Dell’Era: http://www.adellera.it/investigations/select_without_replacement/index.html

          Comment by Jonathan Lewis — December 3, 2014 @ 9:15 pm BST Dec 3,2014

        • Hi Martin,
          i don’t want to anticipate Jonathan’s blog series about upgrades, but you may want to try this algorithm.

          cardinality(T1) = 1/400(= N_400) x 1000000 = 2500 rows (= ID line 2)
          cardinality(T2) = 1/400(= N_400) x 1000000 = 2500 rows (= ID line 3)
          cartesian_join(T1.T2) = cardinality(T1) x cardinality(T2) = 2500 rows x 2500 rows = 6250000 rows

          Multi column sanity check with larger selectivity from one side (does not apply in this case)
          NUM_DIST(T2side) = NUM_DIST(T2.N_72) x NUM_DIST(T2.N_750) = 72 x 750 = 54000
          NUM_DIST(T1side) = NUM_DIST(T1.N_90) x NUM_DIST(T1.N_600) = 90 x 600 = 54000

          NUM_DIST(JOIN) = SWRU(54000,1000000,2500) = 54000 x (1 – power(1-2500/1000000,1000000/54000)) = 2446.00097

          JOIN_CARD = cartesian_join(T1.T2) / NUM_DIST(JOIN) = 6250000 / 2446.00097 = 2555.19114

          Not quite sure about the difference of one, but maybe caused by some simplification or rounding issues. I am pretty sure Jonathan knows the answer to that :-))

          Regards
          Stefan

          Comment by Stefan Koehler — December 4, 2014 @ 10:14 am BST Dec 4,2014

        • Stefan,

          That’s the arithmetic.

          At the moment I’m prepared to put the difference of 1 down to a choice of rounding until I run an automated range test that suggest otherwise. It may be that the optimizer always rounds the adjusted num_distinct up to the next integer value.

          Comment by Jonathan Lewis — December 4, 2014 @ 6:10 pm BST Dec 4,2014

  3. Hi Stefan,
    looks promising: I get convincing results for my test cases. Alberto Dell’Era’s SWRU function shows me once again that a solid mathematical background has some value when you try to answer this kind of questions…

    Regards
    Martin

    Comment by Martin Preiss — December 4, 2014 @ 1:50 pm BST Dec 4,2014 | Reply

  4. […] indexes. The SQL to create the first table is one I supplied a couple of days ago to demonstrate changes in join cardinality dependent on Oracle […]

    Pingback by Closure | Oracle Scratchpad — December 5, 2014 @ 8:12 am BST Dec 5,2014 | Reply

  5. […] selectivity}”, and in the absence of any nulls the join selectivity – thanks to the “multi-column sanity check” – will be 1/(greater number of distinct combinations). So we get 1,000,000 / 100 = […]

    Pingback by Column Groups | Oracle Scratchpad — December 29, 2015 @ 1:13 pm BST Dec 29,2015 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.