Oracle Scratchpad

January 27, 2016

Add primary key.

Filed under: Indexing,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 9:07 am GMT Jan 27,2016

I thought I had written this note a few years ago, on OTN or Oracle-L if not on my blog, but I can’t find any sign of it so I’ve decided it’s time to write it (again) – starting as a question about the following code:


rem
rem     Script: pk_overhead.sql
rem     Author: J.P.Lewis
rem     Dated:  Feb 2012
rem

create table t1
as
with generator as (
        select  rownum  id
        from    dual
        connect by
                rownum <= 1000 --> comment to avoid WordPress format issue
)
select
        rownum                      id,
        trunc((rownum-1)/50)        clustered,
        mod(rownum,20000)           scattered,
        lpad(rownum,10)             vc_small,
        rpad('x',100,'x')           vc_padding
from
        generator       g1,
        generator       g2
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1')

alter system flush buffer_cache;

alter table t1 add constraint t1_pk primary key(id, scattered);


I’ve generated a table with 1,000,000 rows, including a column that’s going to hold unique values; then I’ve added a (two-column) primary key constraint to that table.

Because of the guaranteed unique column the call to add constraint will succeed. Because Oracle will automatically create a unique index to support that constraint it will have to do a tablescan of the table. So here’s the question:

HOW MANY TIMES will Oracle tablescan that table (and how many rows will it scan) ?

Space for thought …

The answer is three tablescans, 3 million rows.

Oracle will scan the table to check the validity of adding a NOT NULL definition and constraint for the id column, repeat the scan to do the same for the scattered column, then one final scan to accumulate the key data and rowids to sort and create the index.

Knowing this, you may be able to find ways to modify bulk data loading operations to minimise overheads.

Update – May 2016

The extra tablescans occur even if you have pre-existing check constraints (not declarations) on the columns to ensure that they are not null (i.e. things like: “alter table t1 add constraint t1_nn_id check (id is not null)”).

Update – Jan 2021

Test cases are wonderful: I’ve just re-run this one, and the effect is still the same in 19.3.0.0.  Adding the primary key required three tablescans of the table.  Then I repeated the test that tried to short-circuit some of the work a little by modifying the two columns to “not null” before adding the primary key; it still took two tablescans to modify the two columns.

Update (April 2024)

With a small upgrade (up to 19.11) the session statistics on the tablescans show an interesting change in how they are reported. There are three tablescans visible (in the row counts) during creation of the two-column primary key, but two of them are treated differently from (presumably) the first one:

table scan rows gotten                3,000,543
table scan disk non-IMC rows gotten   1,000,543

Although “IMC” hints at the use of the InMemory cache (rather than the boring old buffer cache) for 2M rows, the numbers reported for “session logical reads” and “consistent gets” still looked as if the table had been read 3 times from blocks in the buffer cache.

Nothing further has changed in 23.3

 

 

8 Comments »

  1. […] the key column (or column set) is not null [update Jan 2016: couldn’t find it, so finally got around to writing it (again)]– which means that adding a primary key may also result in a tablescan for every […]

    Pingback by Add Constraint | Oracle Scratchpad — January 27, 2016 @ 9:15 am GMT Jan 27,2016 | Reply

  2. I think I already hit the same kind of issue, and I raised an ER to Oracle for this: ER – add constraint executing one full table scan per PK column. At that time we had the Unique index in place on a table, but all columns were declared NULL in the table definition. The create constraint USING INDEX was scanning 7 times a 47 Gb table, which took some time.

    Response from Oracle Support was Bug 16781054 opened, and finally closed as “Not a Bug” but finally an ER was created on their side:
    Explain why this is not a bug:
    This is expected behavior and not a Bug as it reproduces on all version through 12.2 Beta(MAIN). A new enhancement request has been filed for this in ER Bug 20505359.

    I’ve quickly tested your code and traced the ADD CONSTRAINT but can only see one single full table scan in the tracefile. I’ve tested this on a 11.2.0.4.

    Comment by sfromholtz — January 27, 2016 @ 9:43 am GMT Jan 27,2016 | Reply

    • sfromholz,

      Thanks for the note.

      The trace file can be a little deceptive and, at first sight, appear to be inconsistent with the session stats; how were you checking ?

      When I’ve cranked the sizes up to exceed the size of the buffer cache, and played around with the index definition (varying number of columns) the tkprof file shows the “add constraint” statement doing one tablescan less than expected (and that’s derived from the disk reads and buffer gets figures), but then there’s one more tablescan that gets recorded under a recursive “CREATE UNIQUE INDEX” statement.

      (I used the session stats to tell me about the number of tablescans and rows and blocks scanned as well as buffer gets and disk reads for my little test, along with the v$buffer_pool_statistics as a source to cross-check the buffer gets and disk reads).

      Comment by Jonathan Lewis — January 27, 2016 @ 10:26 am GMT Jan 27,2016 | Reply

      • True, I’ve done the test with a 4,000,000 rows T2 table (by setting 2000 instead of 1000 in the generator), and I have a 74203 blocks T2 table. Session stats show:

        consistent gets 222,638
        session logical reads 237,336
        table scan rows gotten 12,016,695

        Which is 3 times the T2 table. Tracefile only shows a single full table scan and 74415 blocks read, inconsistent with session stats.

        Comment by sfromholtz — January 27, 2016 @ 1:02 pm GMT Jan 27,2016 | Reply

    • So it must not be bug, as it has always worked that way.
      Their logic is unassailable.

      Comment by jkstill — January 27, 2016 @ 2:52 pm GMT Jan 27,2016 | Reply

      • As someone from Oracle support once (ca. 7.2) said to me: “The specification doesn’t say that it shouldn’t do it that way, so it’s not a bug” – I can understand that. Oracle never promises to expose mechanisms, nor does it guarantee the immutability of existing mechanisms.

        Comment by Jonathan Lewis — January 27, 2016 @ 3:48 pm GMT Jan 27,2016 | Reply

  3. […] I had on adding constraints with minimum service interruption – a topic I last wrote about a couple of years ago {ed. now nearly 5 years ago]. This time around I did something a little different, and here’s […]

    Pingback by Check Constraints | Oracle Scratchpad — January 13, 2021 @ 3:17 pm GMT Jan 13,2021 | Reply

  4. […] to the OP, one is about the effect of dropping a primary key constraint the other about the cost of adding a primary key to a […]

    Pingback by Quiz Night | Oracle Scratchpad — July 12, 2021 @ 10:24 am BST Jul 12,2021 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.