Oracle Scratchpad

July 12, 2020

Massive Deletes

Filed under: 12c,Infrastructure,Oracle,Upgrades — Jonathan Lewis @ 7:36 pm BST Jul 12,2020

One of the recurrent questions on the Oracle Developer Commuity forum is:

What’s the best way to delete millions of rows from a table?

There are an enormous number of relevant details that you need to know before you can give the “right” answer to this question, e.g.

  • Which version of Oracle
  • Standard or Enterprise Edition
  • Is “millions” a tiny percentage of the table or a large percentage
  • Are there any referential integrity constraints in place
  • Does the system have to keep running while the deletion takes place
  • Is the table compressed
  • How many indexes are there – and can you drop some of them
  • How much space do you have to do this job
  • How much time do you have to do this job

One of the most important ones, of course, is “Which version of Oracle?” because it can make an enormous difference to the range of possible strategies. I’m writing this particular note because the question came up a little while ago where the user wanted to delete all the data from 2008 through to the end of 2018, keeping only the last 18 months of data.

That sounds like the volume of data to be deleted (11 years) is very much larger than the volume of data to be kept (1.5 years) – but we can’t be sure of that since businesses tend to grow over time so that last 18 months of data might actually be just as big as the previous 11 years.

As usually happens in response to this question there were suggestions to “create a new table selecting the data you want to keep”, “use dbms_parallel_execute to delete by rowid ranges in parallel”, and a relatively new one “convert to a partitioned table so that the data you want to keep is in its own partition and drop the other partition(s)”. 

I wrote a note a few years ago giving an example of converting a simple heap table to a partitioned table – while eliminating the data you don’t want to keep so there’s no need to waste resources copying redundant data, maintaining indexes (choosing between local and global) and doing the whole job online.  So, after learning that the OP was running 12.2 Enterprise Edition with the Partitioning option, I suggested that (s)he convert the table into a hash partitioned table with a single partition as this should (for purposes of optimisation) behave just like a simple heap table using the “including rows” clause to copy only the last 18 months of data.

I pointed out that their version of Oracle(EE + PO) gave them the 2nd best option – because I knew that in 19c you could simply do something like:

rem
rem     Script:         122_move.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:        
rem
rem     Last tested 
rem             12.2.0.1
rem

create table t1
as
select  *
from    all_objects
where   rownum <= 50000
;

alter table t1 add constraint t1_pk primary key(object_id);

alter table t1 move
        including rows where owner != 'SYS'
        online
;

It wasn’t until a little later that a tiny nagging doubt crept into my mind that maybe this enhancement to the basic move command may have appeared at the same time as the modify partition enhancement – in other words in 12.2.0.1; so I ran the test above and found that it did actually seem to work. (And I haven’t yet found any bugs on MOS suggesting that it shouldn’t be used.)

Having discovered that the command was available I thought that I’d better check whether it was also documented, and found that it was in the 12.2 SQL Reference Manual (though not the 12.1 reference manual – for the obvious reason) under Alter Table. Page down to the “tram-tracks” for the Alter Table command and follow the link for the “move_table_clause”, and from there follow the link for “filter_condition”.

Note:

This option is not available on 12.1, and if you run the test Oracle will raise error “ORA-25184: column name expected” at the point where the word “rows” appears. This may look somewhat counter-intuitive, but in that version of Oracle a command like “alter table TabX move including ColY online”; is how you would rebuild an index organized table (IOT) with all columns up to ColY in the “IOT_TOP” segment.

Update [The following morning]

Once you’ve got the framework of a test in place it really doesn’t take very long to start running through “what if” cases or potential boundary conditions.  So this morning I added one very obvious test – what happens if you have referential integrity declared between two tables and try to move both of them including a subset of rows from each that ensures that the referential integrity is still in place:


rem
rem     Script:         122_move_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0        Parent can't move
rem             12.2.0.1        Parent can't move
rem

create table parent
as
select  *
from    all_objects
where   rownum <= 50000
;

alter table parent add constraint par_pk primary key(object_id);


create table child
as
select  *
from    parent
order by
        dbms_random.value
;


alter table child add constraint chi_pk primary key(object_id);
alter table child add constraint chi_fk_par foreign key(object_id) references parent;

I’ve created the child table from the parent data, with random ordering. Now I’m going to delete all the child rows where owner = ‘PUBLIC’ using an online move, then I’ll try and do the same for the parent.


alter table child move
        including rows where owner != 'PUBLIC'
        online
;

-- Child move succeeds (of course)

alter table parent move
        including rows where owner != 'PUBLIC'
        online
;

--
-- Trying to do the matching move on the parent results in:
-- ORA-02266: unique/primary keys in table referenced by enabled foreign keys
--

So there’s a gap in the functionality that makes it less desirable than the simplest case suggests. The referential integrity constraint has to be disabled before the parent table can be restructured.

But something that merits a little further investigation is the option to set the foreign key to “disable validate” (which is sufficient to allow the parent move to take place) and then to set the constraint back to “enable”. When I tried this I had expected Oracle to do a lot of work to revalidate the constraint before enabling it, but I couldn’t find any indication that any such work had taken place.

2 Comments »

  1. (full disclosure Oracle employee) but when all of the online changes in 12.2 came out (heap move, convert to partitioned etc) I was maliciously looking forward to coming up with all sorts of nasty permutations to break it. So I’m ending up with things like this – intervals + compression + including + global pars + local par etc etc…..I won’t claim it to be infallible, but its pretty damn robust in my experience

    SQL> alter table T modify
      2  partition by range (object_id) interval (10000) compress
      3  (
      4    partition p1 values less than (20000) 
      5  ) online
      6  including rows where created &gt; date '2010-01-01'
      7  update indexes
      8  ( ix  local tablespace new_idx_ts,
      9    ix2 global partition by range (created)
     10    (
     11      partition ix2_p1 values less than (date '2016-08-01'),
     12      partition ix2_p2 values less than (maxvalue)
     13    )
     14  );
    
    Table altered.
    

    Comment by Connor McDonald — July 13, 2020 @ 2:13 am BST Jul 13,2020 | Reply

    • Connor,

      Thanks for the comment. The surprising thing about the general availability of online restructuring, though, is that my only memories of its introduction were all about the fanfare of “turn simple tables into partitioned tables” – and the option for getting rid of rows while simply moving a table online never got mentioned. Unless I’ve got the history wrong 12.1 gave us online move of partitions but not of simple heap tables, then the fanfare was 12,2 allows all sorts of amazing online partitioning (and, in passing, simple heap move is now online).

      It’s been several years since this “online move and forget about some of the data” has been available, and I don’t think I’ve yet seen anyone suggest using it to “delete old data”.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — July 13, 2020 @ 8:08 am BST Jul 13,2020 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.