Oracle Scratchpad

August 28, 2020

FK on delete

Filed under: constraints,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 8:51 am BST Aug 28,2020

This is part 1 of a short reference note about the work load created by referential integrity constraints when you delete from a parent table. It was prompted by a question on the Oracle Developer Community forum about how to handle a very large delete from a table which (a) included a column of type CLOB and (b) had 9 child tables.

The 9 referential integrity constraints were declared with “on delete cascade”, but the delete was taking too long even though all the related child data had been deleted before the parent delete. In outline the process was designed to operate in batches as follows:

  • populate global temporary table with a small set of IDs
  • delete from 9 child tables where parent_id in (select id from gtt)
  • delete from parent where id in (select id from gtt);

The process was running very slowly. At first sight – and without examining any trace files or workload statistics – the “obvious” guess would be that this was something to do with the CLOBs – but a CLOB delete is usually a “logical” delete, i.e. it sets a flag amd shouldn’t really be doing a lot of work actually deleting and freeing space, and it doesn’t generate undo for the CLOB itself. Examination of the session statistics showed that the problem was with the work that Oracle was doing to handle the referential integrity – even though all related rows had been deleted in advance of the parent delete.

A highly suggestive statistic from the session stats (v$sesstat for the session) for a controlled test that deleted 1170 parent rows was the “execute count” which was 10,892.  Why would you execute that many statements when all you’re doing is a simple “delete from parent where id in (select from gtt)” – check the arithmetic: 1,170 * 9 = 10,530 which is a fairly good indicator that every row deleted from the parent results in 9 statements being executed to delete from the 9 child tables.

Foreign Key options

Oracle is a little restrictive in how it allows you to define foreign key constraints – MySQL, for example, allows you to specify the action the database should take on update or delete of a parent row,, and it allows 5 possible actions, so you have:


on update [restrict | cascade | set null | no action | set default]
on delete [restrict | cascade | set null | no action | set default]

In comparison Oracle only implements “on delete”, and the only options it allows are “cascade”, “set null”, and “no action” (which is the default, though you can’t actually specify it).

Side note: “No Action” and “Restrict” are very similar – if you try to delete a parent for which a child row exists then the attempt will raise an error. The difference between the two operations is that “no action” will delete the parent then rollback when it finds the child while “restrict” will check to see if the child exists before attempting to delete the parent. (See footnote 1 for detailed examination of how Oracle handles “no action”).

As an initial investigation of the work that Oracle does to handle referential integrity and deletes I want to examine one simple but extreme example. I’m going to create a parent/child pair of tables with “on delete no action”, and delete one parent row. The thing that makes the demonstration “extreme”  is that I will first create, and then delete, a large number of rows for that parent row before deleting the parent. Here’s the initial data setup:

rem
rem     Script:         fk_lock_stress.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2012
rem
rem     Last tested 
rem             19.3.0.0
rem

create table child
as
with generator as (
        select
                rownum id 
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid wordpress format issue
)
select
        trunc((rownum-1)/1200)  n1,
        lpad('x',80)            idx_padding,
        lpad(rownum,8,'0')      small_vc
from
        generator       v1,
        generator       v2
where
        rownum <= 6e4 -- > comment to avoid wordpress format issue
;

create index child_ix on child(n1, idx_padding) pctfree 95;

create table parent as
select
        id,
        lpad(rownum,8,'0')      small_vc,
        lpad('x',80)            padding
from    (
        select 
                distinct n1             id
        from
                child
        )
;

alter table parent add constraint par_pk primary key(id);
alter table child add constraint chi_fk_par foreign key(n1) references parent;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'PARENT',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'CHILD',
                method_opt       => 'for all columns size 1'
        );
end;
/


select  index_name, num_rows, distinct_keys, leaf_blocks, avg_leaf_blocks_per_key 
from    user_indexes
where   index_name in ('PARENT','CHILD')
;

select  object_id, object_name
from    user_objects
order by
        object_id
;

delete from child where n1 = 10;
commit;
delete from child where n1 = 20;
commit;
delete from child where n1 = 30;
commit;
delete from child where n1 = 40;
commit;

execute dbms_stats.gather_table_stats(user, 'child', cascade=>true)

alter system flush buffer_cache;

For each parent row there are 1,200 child rows, and I’ve given the “foreign key” index on child a pctfree of 95% which has resulted in the index needing roughly 600 leaf blocks per key. (When I first created this test my value for x$kcbbf (buffer pins) was 500 and I wanted to see what would happen if I needed to pin more buffers that could fit the array).

After creating the index and gathering stats I’ve deleted all the rows for 4 of the possible parent values from the child table because I want to see how much work it takes to delete a parent for which there are no longer any child rows, as follows:

execute snap_events.start_snap
execute snap_my_stats.start_snap

alter session set events '10046 trace name context forever, level 8';
alter session set tracefile_identifier = 'del';

delete from parent where id = 10;
commit;

delete from parent where id = 40;
commit;

alter session set tracefile_identifier = '';
alter session set events '10046 trace name context off';

execute snap_my_stats.end_snap
execute snap_events.end_snap

The “snap_my_stats / snap_events” packages are just my usual capture of workload information over the interval. I’ve enabled extended tracing at level 8 (waits) so that I can see if Oracle runs any interesting recursive SQL and, since I;ve flushed the buffer cache, this will also let me see what datablocks Oracle has to read from the database.

Here are the key results from a test run on 19.3.0.0:


Statistic                                                Value
---------                                                -----
physical reads                                           1,225

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
db file sequential read                           1,225           0           0.86        .001           1
events in waitclass Other                            11           0           0.01        .001          73

Is it a coincidence that I’ve done 1,225 single block reads when the number of leaf blocks per key in the child index is a little over 600 and I’ve just deleted two parent key values? (Spoiler: NO).

When we apply tkprof to the trace file we find the following summary of the first delete:


SQL ID: 0u6t174agkq27 Plan Hash: 3366423708

delete from parent
where
 id = 10


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          0          1           0
Execute      1      0.00       0.02        608          1        609           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.03        608          1        610           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 107
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  DELETE  PARENT (cr=1 pr=608 pw=0 time=23591 us starts=1)
         1          1          1   INDEX UNIQUE SCAN PAR_PK (cr=1 pr=1 pw=0 time=60 us starts=1 cost=0 size=3 card=1)(object id 94905)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                       608        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
********************************************************************************

The “Row Source Operation” statistics tell use that it has taken 608 single block reads to access a single row from the parent table by unique index and delete it. So let’s look at the content of the trace file – looking for any WAIT, that is a “db file sequential read” wait  relating to this cursor. (I started by looking for the statement in the trace file so that I could extract its cursor number before using the following grep command):


grep "#139773970345352" or19_ora_17541_del.trc | grep "WAIT.*db file sequential read" | sed "s/^.*obj#=//" | sed "s/ .*$//" | sort | uniq -c | sort -n

      2 0
    604 94903
      1 94904
      1 94905

If you’re wondering which objects these numbers correspond to (though you may have made a reasonalble guess by now), here are the results from my earlier query against user_objects:


 OBJECT_ID OBJECT_NAME
---------- --------------------
     94902 CHILD
     94903 CHILD_IX
     94904 PARENT
     94905 PAR_PK

I visit one index block in the parent index, then one block in the parent table, then 604 blocks in the child index to check that there are no current, or potential, index entries in each leaf block. I’ve done that with single block reads, walking the child index in order, and I do it without showing any SQL that might help me spot that it’s happening.

Exactly the same mechanism then comes into play for deleting parent_id = 40. We effectively I do a “silent” index range scan of the child index where n1 = 40 to make sure that there are no current child rows.

When I first created this test it was because I wanted to see if Oracle would pin all the buffers holding the current image of the relevant child index leaf blocks in exclusive mode as it walked through the index – this seemed to be the obvious way to ensure that no other session could insert a child row into a leaf block that had already been checked – but when I enabled pin tracing I found 600+ pairs of “Aquire/Release”. This may help to explain Oracle’s choice of “No Action” (rather than “Restrict”): if the parent row is locked and marked for deletion then there is no need to keep the child leaf blocks pinned exclusively as any attempt by another session to introduce a new child would first check the parent and discover that it was marked for deletion and wait for the deleting transaction to commit or rollback.

There is a gap in this hypothesis, though, which shows up when there is no appropriate index on the table. (Which includes the case where the “foreign key” index has been created as a bitmap instead of a B-tree, or where there’s a B-tree index that includes the foreign key column(s) but not at the start of the index definition, or if the foreign key columns are at the start of the index but declared as descending).

If there is no appropriate “foreign key” index there are two changes in the activity. First, the child table will temporarily be locked in mode 4 (share mode) or mode 5 (share row exclusive mode) for the duration of the referential integrity check. Secondly you will see the text of the recursive SQL statement that does the check; here, for example, is the SQL extracted from a 19.3 trace file in a few variants of the original test:


select /*+ all_rows */ count(1)
from
 "TEST_USER"."CHILD" where "N1" = :1

Oracle’s treatment of this statement is cost-based, and in my case, depending on the choice of index, level of compression, and settings for system statistics I found that I could get any one of the following plans:


Bitmap index
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=3 pr=2 pw=0 time=139 us starts=1)
         0          0          0   BITMAP CONVERSION COUNT (cr=3 pr=2 pw=0 time=132 us starts=1 cost=2 size=3600 card=1200)
         0          0          0    BITMAP INDEX RANGE SCAN CHILD_IX (cr=3 pr=2 pw=0 time=128 us starts=1)(object id 95098)

No index at all
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=826 pr=412 pw=0 time=4880 us starts=1)
         0          0          0   TABLE ACCESS FULL CHILD (cr=826 pr=412 pw=0 time=4870 us starts=1 cost=114 size=3600 card=1200)

Index on (idx_padding, n1)
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=826 pr=412 pw=0 time=5601 us starts=1)
         0          0          0   TABLE ACCESS FULL CHILD (cr=826 pr=412 pw=0 time=5591 us starts=1 cost=114 size=3600 card=1200)

Index on (idx_padding, n1) compress 1 (and tweaking MBRC)
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=128 pr=126 pw=0 time=2467 us starts=1)
         0          0          0   INDEX SKIP SCAN CHILD_IX (cr=128 pr=126 pw=0 time=2460 us starts=1 cost=122 size=3600 card=1200)(object id 95111)

You’ll note that I have two cases where the strategy chosen for the query seems to be a close match to the operation performed with the “proper foreign key index” in place – moreover the number of blocks read for the tablescan strategy is less than the number of blocks read when we had the “foreign key” index in place. So what’s the difference that makes it possible to for Oracle to avoid locking the table in mode 4/5 when we have that index?

These results, by the way, explain why Mikhail Velikikh (in comment #1) can say that large numbers of buffer gets and disk reads for the parent delete are NOT an indication of a missing foreign key index … when the index is missing the large numbers are correctly reported against the SQL that checks the child, not against the parent delete.

Summary

Even if you’ve deleted all the child data before attempting to delete a parent row, and YOU know that that there’s no child data when ORACLE is told to delete the parent row it has to check that there’s no child data, and in a large scale delete (particularly where there may be many child rows per parent) you may end up seeing a a surprising amount of I/O on single block reads of the “foreign key” indexes on any child tables.

This means for a very large “house-keeping” or archiving delete with a limited time-window you may want to take special steps, such as rebuilding “foreign key” indexes after deleting large number of child rows and before deleting the parent. You may even want to disable/drop the foreign key constraint before the parent delete and re-enable it afterwards if you think you can avoid getting any bad data into the job is running.

In the next article:  on delete cascade, on delete set null..

Footnote 1

To demonstrate Oracle’s behaviour for the default “no action  foreign key constraint when you attempt to delete a parent row for which child data exists (which will result in raising an Oracle error or the form: “ORA-02292: integrity constraint (TEST_USER.CHI_FK_PAR) violated – child record found”) you need only set up a
small amount of data, attempt the delete and then dump the redo for the transaction – if nothing else is going on you’ll find the following set or redo change vectors (taken from a trace file generated in 19.3, using the command “grep OP {filename}”):

CHANGE #1 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00084 OBJ:94828 SCN:0x00000b860f303fa4 SEQ:2 OP:11.3 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:35 AFN:17 DBA:0x04400110 OBJ:4294967295 SCN:0x00000b860f303f98 SEQ:2 OP:5.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #3 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c0009c OBJ:94829 SCN:0x00000b860f303fa4 SEQ:2 OP:10.4 ENC:0 RBL:0 FLG:0x0000
CHANGE #4 CON_ID:3 TYP:0 CLS:36 AFN:17 DBA:0x04404564 OBJ:4294967295 SCN:0x00000b860f303f98 SEQ:3 OP:5.1 ENC:0 RBL:0 FLG:0x0000
CHANGE #5 CON_ID:3 TYP:0 CLS:36 AFN:17 DBA:0x04404564 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:1 OP:5.1 ENC:0 RBL:0 FLG:0x0000

CHANGE #1 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c0009c OBJ:94829 SCN:0x00000b860f304062 SEQ:1 OP:10.5 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:36 AFN:17 DBA:0x04404564 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:2 OP:5.6 ENC:0 RBL:0 FLG:0x0000

CHANGE #1 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00084 OBJ:94828 SCN:0x00000b860f304062 SEQ:1 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:35 AFN:17 DBA:0x04400110 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:1 OP:5.11 ENC:0 RBL:0 FLG:0x0000

CHANGE #1 CON_ID:3 TYP:0 CLS:35 AFN:17 DBA:0x04400110 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:2 OP:5.4 ENC:0 RBL:0 FLG:0x0000

11.3  delete row piece
5.2   start transaction
10.4  delete leaf row    (ed: this is for the primary key index)
5.1   create undo record for table block
5.1   create undo record for index leaf block

10.5  restore leaf row during rollback
5.6   mark index undo as applied

11.2  insert rowpiece
5.11  mark table undo as applied

5.4   commit;

So Oracle deletes the parent, (discovers the child) then rolls back the parent delete.

 

 

5 Comments »

  1. Hi Jonathan,

    To identify what objects contribute to those current gets, I was making snapshots of V$SEGMENT_STATISTICS.
    Thus, it can be seen that those are lookups against CHILD_IX:

    select object_name, statistic_name, value
      from v$segment_statistics
     where object_name in ('CHILD', 'CHILD_IX', 'PARENT', 'PAR_PK')
       and statistic_name in ('db block changes', 'logical reads')
     order by object_name, statistic_name;
    
    -- before delete from parent
    OBJECT_NAME STATISTIC_NAME         VALUE
    ----------- ----------------- ----------
    CHILD       db block changes        5280
    CHILD       logical reads          13872
    CHILD_IX    db block changes        8704
    CHILD_IX    logical reads         116240
    PARENT      db block changes          16
    PARENT      logical reads             48
    PAR_PK      db block changes           0
    PAR_PK      logical reads             48
    
    -- after delete from parent
    OBJECT_NAME STATISTIC_NAME         VALUE
    ----------- ----------------- ----------
    CHILD       db block changes        5280
    CHILD       logical reads          13872
    CHILD_IX    db block changes        9904 -- 1200 more than before
    CHILD_IX    logical reads         117440 -- 1200 more than before
    PARENT      db block changes          32
    PARENT      logical reads             48
    PAR_PK      db block changes           0
    PAR_PK      logical reads             48
    

    Although tkprof usually does a great job in identifying recursive statements and reporting them separately: https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/performing-application-tracing.html#GUID-681AB1DC-082B-460C-8656-DC3286627D0C , it is not the case here. Which might be due to the fact that those child lookups are bypassing the SQL layer, so they are not statements per se.

    Your post itself reminds me another old one about the same topic: https://jonathanlewis.wordpress.com/2013/05/27/cu-gets/
    The reason why I remember it is that there was a recent question on AskTom:
    https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9529107800346040562

    Essentially, there is something that looks like a tkprof output (bulk delete, 6 x 1000 rows):

    delete /*+ parallel(1) */  from DOCUMENTS s 
    where
    s.id IN (:c_id)
    
    
    call     count       cpu    elapsed       disk      query    current        rows
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    Parse        1      0.00       0.00          0          0          0           0
    Execute      6    264.88     460.49       2591     140034  245669840        6000
    Fetch        0      0.00       0.00          0          0          0           0
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    total        7    264.88     460.49       2591     140034  245669840        6000
    

    Please notice that there are 245,669,840/6,000=40,944 current gets per parent row. I would not consider your example “extreme” after that.

    The topic starter was asked the following question:
    [quote]Do you have a FK index in place ? [/quote]

    When we have something like this in tkprof, it is definitely not due to missing FK indexes – those would be accounted separately from the parent statement that initiated them.
    Could it be something worth including in your upcoming series on this FK topic?

    Do you happen to know off the top of your head when we might have such high current gets while deleting rows from a parent table? Fragmented indexes on child tables is the only cause that I am aware of right now.

    Regards,
    Mikhail.

    Comment by Mikhail Velikikh — August 28, 2020 @ 2:07 pm BST Aug 28,2020 | Reply

    • Mikhail,

      Thanks for the comments.

      It’s a smart move to look at v$segstat / v$segment_statistics – it’s one of those views that can add a lot of useful information if you happen to know which segments are likely to be the (small number of) interesting ones. If you’re lucky (or maybe that should be “very unlucky”) the impact the delete could be so great that it would be sufficient to trigger an awr/statspack report (the latter at level 7) before and after the delete and check the “Top Segments” section of the report.

      I’ve added a couple of comments about what happens when there is not “proper” foreign key index just to mention the locking issue and to point out that the check is implemented as a recursive SQL statement rather than being hidden activity.

      There are two other possibilities, I think, that might get figures like these – though it would be surprising to see such large values. Some of the code for handling matierialized views also bypasses the SQL layer (by using pre-compiled statements) it’s possible that this could have some strange effects on the numbers – but I’d have to test that. The other possibility is the problem of long-running uncommitted transactions and the need to do a huge amount of work for read-consistency purposes – but you’d get some clues about that from segment statistics (shortfall in numbers) and from the session stats (“xxx – undo records applied”). Otherwise I would tend to assume that it was something to do with the nature of the data or processing that had left a large number of index leaf blocks for specific key values with no data in them.

      Regards
      Jonathan Lewis

      P.S. I’d forgotten all about that previous post.

      Comment by Jonathan Lewis — August 30, 2020 @ 11:41 am BST Aug 30,2020 | Reply

    • Mikhail,

      I’ve just taken a look at the question on AskTOM and the wait reported for that statement include 555 waits for “library cache: mutex X”, which suggests that the delete is causing some other SQL to be parsed and executed. There’s also a possible hint in the name of the table in the delete statement: “documents”

      The mutex X waits might be an indication of something to do with materialized views, but the table name suggests the possibility that the workload might be something to do with whatever Oracle calls its Text option (context ?). Maybe there are some context indexes on documents with synchronous update enabled.

      It’s also interesting to note that the difference between the CPU time and Elapsed time in the summary is nearly 200 seconds, but the wait time reported in the wait events summary is a little less than 3 seconds – which suggest that there’s something a little unusual going one that may not have proper instrumentation.

      It’s a pity that there was no follow-up; it would be nice to know a little more of the background.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — August 31, 2020 @ 1:20 pm BST Aug 31,2020 | Reply

  2. Hi Jonathan,
    I suppose that an index coalesce should be sufficient here rather than rebuilding it.
    It would be nice that oracle allows us to fallback to the non-indexed foreign key behaviour (locking the child table in share mode for the short duration of the delete). It tried to lock it explicitly before the delete but this doesn’t bypass the lookup on index leaves.
    Regards,
    Franck.

    Comment by @FranckPachot — August 30, 2020 @ 9:19 am BST Aug 30,2020 | Reply

    • Franck,

      Thanks for the comment.

      Yes, a coalesce (or a shrink space on the relevant index) should be sufficient – the critical thing is to ensure that the number of child leaf blocks that have to be visited for any one parent is kept to a minimum.

      It has also ooccured to me that in some cases a delete that followed the order of the parent key value MIGHT help in some special (lucky) cases because it could improve the chances of child leaf blocks still being cached from a previous parent delete.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — August 30, 2020 @ 11:48 am BST Aug 30,2020 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.