Oracle Scratchpad

December 3, 2014

Upgrades

Filed under: CBO,Oracle,Upgrades — Jonathan Lewis @ 8:24 am GMT Dec 3,2014

I have a simple script that creates two identical tables , collects stats (with no histograms) on the pair of them, then executes a join. Here’s the SQL to create the first table:

rem
rem     Script:         aggregate_selectivity_b.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2014
rem

create table t1
nologging
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4        -- > comment to avoid wordpress format issue
)
select
	trunc(dbms_random.value(0,1000))	n_1000,
	trunc(dbms_random.value(0,750))		n_750,
	trunc(dbms_random.value(0,600))		n_600,
	trunc(dbms_random.value(0,400))		n_400,
	trunc(dbms_random.value(0,90))		n_90,
	trunc(dbms_random.value(0,72))		n_72,
	trunc(dbms_random.value(0,40))		n_40,
	trunc(dbms_random.value(0,3))		n_3
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6        -- > comment to avoid wordpress format issue
;

-- gather stats: no histograms

The two tables have 1,000,000 rows each and t2 is created from t1 with a simple “create as select”. The columns are all defined to be integers, and the naming convention is simple – n_400 holds 400 distinct values with uniform distribution from 0 – 399, n_750 holds 750 values from 0 – 749, and so on.

Here’s the simple query:


select
        t1.*, t2.*
from
        t1, t2
where
        t1.n_400 = 0
and     t2.n_72  = t1.n_90
and     t2.n_750 = t1.n_600
and     t2.n_400 = 1
;

Since I’ve created no indexes you might expect the query to do a couple of and a hash join to get its result – and you’d be right; but what do you think the predicted cardinality would be ?

Here are the results from running explain plan on the query and then reporting the execution plan – for three different versions of Oracle:



9.2.0.8
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |    96 |  4992 |  1230  (10)|
|*  1 |  HASH JOIN           |             |    96 |  4992 |  1230  (10)|
|*  2 |   TABLE ACCESS FULL  | T1          |  2500 | 65000 |   617  (11)|
|*  3 |   TABLE ACCESS FULL  | T2          |  2500 | 65000 |   613  (10)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_750"="T1"."N_600" AND "T2"."N_72"="T1"."N_90")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

***************************************************************************

10.2.0.5
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   116 |  6032 |  1229  (10)| 00:00:07 |
|*  1 |  HASH JOIN         |      |   116 |  6032 |  1229  (10)| 00:00:07 |
|*  2 |   TABLE ACCESS FULL| T1   |  2500 | 65000 |   616  (11)| 00:00:04 |
|*  3 |   TABLE ACCESS FULL| T2   |  2500 | 65000 |   612  (10)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_750"="T1"."N_600" AND "T2"."N_72"="T1"."N_90")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

***************************************************************************

11.2.0.4
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  2554 |   139K|  1225  (10)| 00:00:07 |
|*  1 |  HASH JOIN         |      |  2554 |   139K|  1225  (10)| 00:00:07 |
|*  2 |   TABLE ACCESS FULL| T1   |  2500 | 70000 |   612  (10)| 00:00:04 |
|*  3 |   TABLE ACCESS FULL| T2   |  2500 | 70000 |   612  (10)| 00:00:04 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_72"="T1"."N_90" AND "T2"."N_750"="T1"."N_600")
   2 - filter("T1"."N_400"=0)
   3 - filter("T2"."N_400"=1)

The change for 11.2.0.4 (which is still there for 12.1.0.2. I didn’t check to see if it also appears in 11.1.0.7) is particularly worrying. When you see a simple query like this changing cardinality on the upgrade you can be fairly confident that some of your more complex queries will change their plans – even if there are no clever new optimizer transformations coming into play.

11 Comments »

  1. Hey Jonathan,
    unfortunately it is not quiz night, but afaik the cardinality change is caused by (newly) introduced sanity checks (or better said multi column sanity checks – one side selectivity) :-))

    Have fun at UKOUG Tech 14, hope to chat with you at DOAG 2015 again.

    Regards
    Stefan

    Comment by Stefan Koehler — December 3, 2014 @ 10:16 am GMT Dec 3,2014 | Reply

    • Stefan,

      Correct – that explains the relatively small change in this example from 9i to 10g; and the larger change from 10g to 11g is explained by a change in the way the sanity check is applied.

      Comment by Jonathan Lewis — December 3, 2014 @ 10:20 am GMT Dec 3,2014 | Reply

  2. Jonathan,
    since Stefan already started to handle it as a quiz night I would guess that the calculation in 10.2.0.5 is using the standard formula and uses the values from only one side (t2) to calculate the join selectivity: 1/72 * 1/750 * 2500 * 2500 = 115.74…

    Martin

    Comment by Martin Preiss — December 3, 2014 @ 11:34 am GMT Dec 3,2014 | Reply

    • Martin,

      That is correct – but it’s an error, and the error has been addressed (arguably incorrectly) in the 11g example.

      Comment by Jonathan Lewis — December 3, 2014 @ 11:46 am GMT Dec 3,2014 | Reply

      • playing with the post 11g example and its multi-column cardinality sanity check it seems that:
        – the cardinality is still determined by one side of the join
        – the cardinality does not get smaller than 2500 in the given example with different join columns
        – the cardinality above 2500 (54 in the given example) is somehow related to 1/NDV (54000 for the example according to a CBO trace created with dbms_sqldiag.dump_trace); but I am not able to catch this “somehow”.
        So I have to wait for the following articles …

        Comment by Martin Preiss — December 3, 2014 @ 8:26 pm GMT Dec 3,2014 | Reply

        • Martin,

          Let me just point you in the critical direction – a paper (dated 2007, so Oracle’s application of the algorithm may have changed) by Alberto Dell’Era.

          Comment by Jonathan Lewis — December 3, 2014 @ 9:15 pm GMT Dec 3,2014

        • Hi Martin,
          i don’t want to anticipate Jonathan’s blog series about upgrades, but you may want to try this algorithm.

          cardinality(T1) = 1/400(= N_400) x 1000000 = 2500 rows (= ID line 2)
          cardinality(T2) = 1/400(= N_400) x 1000000 = 2500 rows (= ID line 3)
          cartesian_join(T1.T2) = cardinality(T1) x cardinality(T2) = 2500 rows x 2500 rows = 6250000 rows

          Multi column sanity check with larger selectivity from one side (does not apply in this case)
          NUM_DIST(T2side) = NUM_DIST(T2.N_72) x NUM_DIST(T2.N_750) = 72 x 750 = 54000
          NUM_DIST(T1side) = NUM_DIST(T1.N_90) x NUM_DIST(T1.N_600) = 90 x 600 = 54000

          NUM_DIST(JOIN) = SWRU(54000,1000000,2500) = 54000 x (1 – power(1-2500/1000000,1000000/54000)) = 2446.00097

          JOIN_CARD = cartesian_join(T1.T2) / NUM_DIST(JOIN) = 6250000 / 2446.00097 = 2555.19114

          Not quite sure about the difference of one, but maybe caused by some simplification or rounding issues. I am pretty sure Jonathan knows the answer to that :-))

          Regards
          Stefan

          Comment by Stefan Koehler — December 4, 2014 @ 10:14 am GMT Dec 4,2014

        • Stefan,

          That’s the arithmetic.

          At the moment I’m prepared to put the difference of 1 down to a choice of rounding until I run an automated range test that suggest otherwise. It may be that the optimizer always rounds the adjusted num_distinct up to the next integer value.

          Comment by Jonathan Lewis — December 4, 2014 @ 6:10 pm GMT Dec 4,2014

  3. Hi Stefan,
    looks promising: I get convincing results for my test cases. Alberto Dell’Era’s SWRU function shows me once again that a solid mathematical background has some value when you try to answer this kind of questions…

    Regards
    Martin

    Comment by Martin Preiss — December 4, 2014 @ 1:50 pm GMT Dec 4,2014 | Reply

  4. […] indexes. The SQL to create the first table is one I supplied a couple of days ago to demonstrate changes in join cardinality dependent on Oracle […]

    Pingback by Closure | Oracle Scratchpad — December 5, 2014 @ 8:12 am GMT Dec 5,2014 | Reply

  5. […] selectivity}”, and in the absence of any nulls the join selectivity – thanks to the “multi-column sanity check” – will be 1/(greater number of distinct combinations). So we get 1,000,000 / 100 = […]

    Pingback by Column Groups | Oracle Scratchpad — December 29, 2015 @ 1:13 pm GMT Dec 29,2015 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: