Oracle Scratchpad

October 8, 2018

Random Upgrade

Filed under: 18c,Oracle,Upgrades — Jonathan Lewis @ 1:36 pm BST Oct 8,2018

Here’s a problem that (probably) won’t affect the day to day running of most systems – but it could be a pain in the backside for people who write programs to generate repeatable test data. I’m not going to say much about the problem, just leave you with a test script.


rem
rem     Script  random_upgrade.sql
rem     Author: Jonathan Lewis
rem     Dated:  Oct 2018
rem
rem     Last tested
rem             19.3.0.0
rem             18.3.0.0
rem             12.2.0.1
rem

drop table t4 purge;
drop table t3 purge;
drop table t2 purge;
drop table t1 purge;
drop table t0 purge;

set feedback off

create table t0 as
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
;


execute dbms_random.seed(0);

create table t1
as
select dbms_random.normal
from
        t0
;

execute dbms_random.seed(0);

create table t2
as
with g1 as (
        select rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        dbms_random.normal
from
        g1
;

prompt  =================
prompt  Diff the two CTAS
prompt  =================

select count(*)
from (
select * from t1
minus
select * from t2
union all
select * from t2
minus
select * from t1
)
;


create table t3 
as 
select * from t2 
where rownum < 1 -- > comment to avoid WordPress format issue
;

create table t4 
as 
select * from t2 
where rownum < 1 -- > comment to avoid WordPress format issue
;

execute dbms_random.seed(0)

insert into t3
select dbms_random.normal
from
        t0
;

execute dbms_random.seed(0)

insert into t4
with g1 as (
        select rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        dbms_random.normal
from
        g1
;


prompt  ===================
prompt  Diff the two Insert
prompt  ===================

select count(*)
from (
select * from t3
minus
select * from t4
union all
select * from t4
minus
select * from t3
)
;


prompt  ===========
prompt  Sum of CTAS
prompt  ===========

select sum(normal) from t1;

prompt  =============
prompt  Sum of Insert
prompt  =============

select sum(normal) from t3;


execute dbms_random.seed(0)

prompt  =============
prompt  Sum of select
prompt  =============

with g1 as (
        select rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select sum(n) from (
select
        dbms_random.normal n
from
        g1
)
;


I’m repeatedly using dbms_random.seed(0) to reset the random number generator and trying to generate 10,000 normally distributed numbers. (I’ve chosen the normal distribution because that happened to be the function in a script I sent someone with the comment that “this will recreate the data for the demonstration” – and they wrote back to say that it didn’t.)

I’ve got two “create as select”, and two “insert as select”. One of each pair selects from a real existing table to get 10,000 rows, the other uses the “select dual connect by” trick to generate rows. I’ve written SQL that shows whether or not the two pairs of tables end up with the same data (they do, pairwise), then I’ve summed one table from each pair to see if the different mechanisms produce the same data – and that depends on the version of Oracle you’re using. Finally I’ve reset the random number generator and summed across a pure select to see what that produces.

If you run this code on 12.2.0.1 or earlier you’ll see that the “diffs” report zeros and the “sums” report -160.39249. If you upgrade to 18.3 the diffs will still report zeros and some of the sums will still report -160.39249 but the sum of the CTAS will report -91.352172.

Bottom Line

If you’ve got code that you wrote to create reproducible test cases and the code uses: “create table … as select … dbms_random …” then it won’t produce the same data when you upgrade to 18.3. You’ll have to modify the code to do “create table (); insert as select …”.

As of this afternoon I have 1,209 test scripts on my laptop that use the dbms_random package to model data distribution patterns. It is almost certain that I will end up modifying every single one of them eventually.

There are words to express how I feel about this – but not ones that I would consider publishing.

Update (March 2020)

I’ve just run this test on 19.3 and things have changed again. The simplest summary I can give is a little chart of results I got for the three (four) summations and three versions of Oracle. There are three sum() statement in the code above, but four results below because I ran the insert test with and without the /*+ append */ hint, so I’ve got numbers for:

  • select …
  • create as select …
  • insert as select …
  • insert /*+ append */ as select ….

I’ve also tested the code with parameter deferred_segment_creation set to true and then set to false, but that had no effect on the results.

Select          CTAS            Insert           Append          Version
 -160.39249     -160.39249      -160.39249       -160.39249      12.2.0.1
 -160.39249     -91.352172      -160.39249       -91.352172      18.3.0.0
 -160.39249     -91.352172      (not repeatable) -91.352172      19.3.0.0

So, if you’ve gone to the trouble of rewriting all your test code to change “create as select …” into “create table; insert as select” – you’ll be feeling just a little irritated because while it kept 18.3 producing the same results as 12.2 (and every earlier version) it will produce different results every time you run it in 19.3.

I hesitate to suggest creating a pl/sql pipelined function using a simple select to return a set of random numbers. At present it might get you back to the 12.2 results irrespective of how you use that stream – but who know how things will change in 20c.  (I might get around to testing it, one day.)

 

1 Comment »

  1. Well, its called DBMS_RANDOM for a reason :-) Too soon ?

    Comment by Connor McDonald — March 6, 2020 @ 3:43 am GMT Mar 6,2020 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.