Oracle Scratchpad

October 23, 2018

Upgrade threat

Filed under: 12c,18c,Histograms,Oracle,Statistics,Upgrades — Jonathan Lewis @ 7:50 pm GMT Oct 23,2018

Here’s one I’ve just discovered while trying to build a reproducible test case – that didn’t reproduce because an internal algorithm has changed.

If you upgrade from 12c to 18c and have a number of hybrid histograms in place you may find that some execution plans change because of a change in the algorithm for producing hybrid histograms (and that’s not just if you happen to get the patch that fixes the top-frequency/hybrid bug relating to high values).

Here’s a little test to demonstrate how I wasted a couple of hours trying to solve the wrong problem – first a simple data set:


rem
rem     Script:         18c_histogram_upgrade.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2018
rem 

drop table t2 purge;

execute dbms_random.seed(0)

create table t2(
        id              number(8,0),
        n20             number(6,0),
        n30             number(6,0),
        n50             number(6,0),
        j2              number(6,0)
)
;

insert into t2
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        mod(rownum,   20) + 1                   n20,
        mod(rownum,   30) + 1                   n30,
        mod(rownum,   50) + 1                   n50,
        28 - round(abs(7*dbms_random.normal))        j2
from
        generator       v1
where
        rownum <= 800 -- > comment to avoid WordPress format issue
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => null,
                tabname          => 'T2',
                method_opt       => 'for all columns size 1 for columns j2 size 13'
        );
end;
/

I’ve created a skewed data set which (we will see) has 22 distinct values and created a histogram of 13 buckets on it. This will be a hybrid histogram – but different versions of Oracle will produce different histograms (even though the data set is the same for both versions):


select
        j2, count(*)
from
        t2
group by
        j2
order by
        j2
;

select
        endpoint_value                                                            value,
        endpoint_number,
        endpoint_number - lag(endpoint_number,1,0) over(order by endpoint_number) bucket_size,
        endpoint_repeat_count
from
        user_tab_histograms
where
        table_name  = 'T2'
and     column_name = 'J2'
order by
        endpoint_value
;

Here’s the dataset from 12.2.0.1 and 18.3.0.0


        J2   COUNT(*)
---------- ----------
         1          1
         8          3
         9          1
        10          5
        11          4
        12          8
        13         14
        14          9
        15         11
        16         22
        17         34
        18         31
        19         36
        20         57
        21         44
        22         45
        23         72
        24         70
        25         87
        26        109
        27         96
        28         41

22 rows selected.



And here are the histograms - 12.2.0.1 then 18.3.0.0:



     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1               1           1                     1
        15              56          55                    11
        17             112          56                    34
        18             143          31                    31
        19             179          36                    36
        20             236          57                    57
        21             280          44                    44
        22             325          45                    45
        23             397          72                    72
        24             467          70                    70
        25             554          87                    87
        26             663         109                   109
        28             800         137                    41

13 rows selected.

     VALUE ENDPOINT_NUMBER BUCKET_SIZE ENDPOINT_REPEAT_COUNT
---------- --------------- ----------- ---------------------
         1               1           1                     1
        15              56          55                    11
        17             112          56                    34
        19             179          67                    36
        20             236          57                    57
        21             280          44                    44
        22             325          45                    45
        23             397          72                    72
        24             467          70                    70
        25             554          87                    87
        26             663         109                   109
        27             759          96                    96
        28             800          41                    41

13 rows selected.

Both histograms have 13 buckets as requested; both are hybrid histograms as expected.

But why does 12c have the value 18 when 18c doesn’t, and why does 18c have the value 27 when 12c doesn’t ?

That’s the second time in two weeks I’ve had reproducible test cases not reproducing – thanks to an 18c upgrade.

Update (See comments)

I had completely forgotten that a previous defect in the construction of hybrid (and Top-N) histograms had been addressed in 18.3 but needed a fix in 12.2 and a backport patch in 12.1.0.2.

Since the defect could “lose” a popular value in order to ensure that both the low and high values were captured in the histogram it’s not surprising that a fix could result in one of the popular values in a histogram dissappearing (after the upgrade) even when the gather had used a 100% sample. Quite possibly the algorithm used to ensure the presence of the high value has had a cascading effect down the histogram that can affect which popular values get into the histogram with repeat counts.

I think I’m going to have to grit my teeth and patch a 12.1.0.2, or update a 12.2.0.1 with exactly the right patch-set to find out.

[It has now been confirmed by Nigel Bayliss that this is a side effect of the fix to the bug 25994960]

October 8, 2018

Random Upgrade

Filed under: 18c,Oracle,Upgrades — Jonathan Lewis @ 1:36 pm GMT Oct 8,2018

Here’s a problem that (probably) won’t affect the day to day running of most systems – but it could be a pain in the backside for people who write programs to generate repeatable test data. I’m not going to say much about the problem, just leave you with a test script.


rem
rem	Script	random_upgrade.sql
rem	Author:	Jonathan Lewis
rem	Dated:	Oct 2018
rem
rem	Last tested
rem		18.3.0.0
rem		12.2.0.1
rem	Notes
rem	In the upgrade from 12.2.0.1 something
rem	changed that meant
rem		create as select dbms_random
rem	gets different data from
rem		select dbms_random
rem

drop table t4 purge;
drop table t3 purge;
drop table t2 purge;
drop table t1 purge;
drop table t0 purge;

set feedback off

create table t0 as
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
;


execute dbms_random.seed(0);

create table t1
as
select dbms_random.normal
from
	t0
;

execute dbms_random.seed(0);

create table t2
as
with g1 as (
	select rownum id
	from dual
	connect by
		level <= 1e4 -- > comment to avoid WordPress format issue
)
select
	dbms_random.normal
from
	g1
;

prompt	=================
prompt	Diff the two CTAS
prompt	=================

select count(*)
from (
select * from t1
minus
select * from t2
union all
select * from t2
minus
select * from t1
)
;


create table t3 
as 
select * from t2 
where rownum < 1 -- > comment to avoid WordPress format issue
;

create table t4 
as 
select * from t2 
where rownum < 1 -- > comment to avoid WordPress format issue
;

execute dbms_random.seed(0)

insert into t3
select dbms_random.normal
from
	t0
;

execute dbms_random.seed(0)

insert into t4
with g1 as (
	select rownum id
	from dual
	connect by
		level <= 1e4 -- > comment to avoid WordPress format issue
)
select
	dbms_random.normal
from
	g1
;


prompt	===================
prompt	Diff the two Insert
prompt	===================

select count(*)
from (
select * from t3
minus
select * from t4
union all
select * from t4
minus
select * from t3
)
;


prompt	===========
prompt	Sum of CTAS
prompt	===========

select sum(normal) from t1;

prompt	=============
prompt	Sum of Insert
prompt	=============

select sum(normal) from t3;


execute dbms_random.seed(0)

prompt	=============
prompt	Sum of select
prompt	=============

with g1 as (
	select rownum id
	from dual
	connect by
		level <= 1e4 -- > comment to avoid WordPress format issue
)
select sum(n) from (
select
	dbms_random.normal n
from
	g1
)
;


I’m repeatedly using dbms_random.seed(0) to reset the random number generator and trying to generate 10,000 normally distributed numbers. (I’ve chosen the normal distribution because that happened to be the function in a script I sent someone with the comment that “this will recreate the data for the demonstration” – and they wrote back to say that it didn’t.)

I’ve got two “create as select”, and two “insert as select”. One of each pair selects from a real existing table to get 10,000 rows, the other uses the “select dual connect by” trick to generate rows. I’ve written SQL that shows whether or not the two pairs of tables end up with the same data (they do, pairwise), then I’ve summed one table from each pair to see if the different mechanisms produce the same data – and that depends on the version of Oracle you’re using. Finally I’ve reset the random number generator and summed across a pure select to see what that produces.

If you run this code on 12.2.0.1 or earlier you’ll see that the “diffs” report zeros and the “sums” report -160.39249. If you upgrade to 18.3 the diffs will still report zeros and some of the sums will still report -160.39249 but the sum of the CTAS will report -91.352172.

Bottom Line

If you’ve got code that you wrote to create reproducible test cases and the code uses: “create table … as select … dbms_random …” then it won’t produce the same data when you upgrade to 18.3. You’ll have to modify the code to do “create table (); insert as select …”.

As of this afternoon I have 1,209 test scripts on my laptop that use the dbms_random package to model data distribution patterns. It is almost certain that I will end up modifying every single one of them eventually.

There are words to express how I feel about this – but not ones that I would consider publishing.

Powered by WordPress.com.