Oracle Scratchpad

July 12, 2020

Massive Deletes

Filed under: 12c,Infrastructure,Oracle,Upgrades — Jonathan Lewis @ 7:36 pm BST Jul 12,2020

One of the recurrent questions on the Oracle Developer Commuity forum is:

What’s the best way to delete millions of rows from a table?

There are an enormous number of relevant details that you need to know before you can give the “right” answer to this question, e.g.

  • Which version of Oracle
  • Are you running Standard Edition or Enterprise Edition
  • Is “millions” a tiny percentage of the table or a large percentage
  • Are there any referential integrity constraints in place
  • Does the system have to keep running while the deletion takes place
  • Is the table compressed
  • How many indexes are there – and can you drop, or mark unusable, some of them
  • How much space do you have to do this job
  • How much time do you have to do this job

One of the most important ones, of course, is “Which version of Oracle?” because it can make an enormous difference to the range of possible strategies. I’m writing this particular note because the question came up a little while ago where the user wanted to delete all the data from 2008 through to the end of 2018, keeping only the last 18 months of data.

That sounds like the volume of data to be deleted (11 years) is very much larger than the volume of data to be kept (1.5 years) – but we can’t be sure of that since businesses tend to grow over time so that last 18 months of data might actually be just as big as the previous 11 years.

As usually happens in response to this question there were suggestions to “create a new table selecting the data you want to keep”, “use dbms_parallel_execute to delete by rowid ranges in parallel”, and a relatively new one “convert to a partitioned table so that the data you want to keep is in its own partition and drop the other partition(s)”. 

I wrote a note a few years ago giving an example of converting a simple heap table to a partitioned table online while maintaining indexes (choosing between local and global) and eliminating the data you don’t want to keep so there’s no need to waste resources copying redundant data.  So, after learning that the OP was running 12.2 Enterprise Edition with the Partitioning option, I suggested that (s)he convert the table into a hash partitioned table with a single partition as this should (for purposes of optimisation) behave just like a simple heap table using the “including rows” clause to copy only the last 18 months of data.

I pointed out that their version of Oracle(EE + PO) gave them the 2nd best option – because I knew that in 19c you could simply do something like:

rem
rem     Script:         122_move.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem

create table t1
as
select  *
from    all_objects
where   rownum <= 50000 -- > comment to avoid wordpress format issue
;

alter table t1 add constraint t1_pk primary key(object_id);

alter table t1 move
        including rows where owner != 'SYS'
        online
;

It wasn’t until a little later that a tiny nagging doubt crept into my mind that maybe this enhancement to the basic move command may have appeared at the same time as the modify partition enhancement – in other words in 12.2.0.1; so I ran the test above and found that it did actually seem to work. (And I haven’t yet found any bugs on MOS suggesting that it shouldn’t be used.)

Having discovered that the command was available I thought that I’d better check whether it was also documented, and found that it was in the 12.2 SQL Reference Manual (though not the 12.1 reference manual – for the obvious reason) under Alter Table. Page down to the “tram-tracks” for the Alter Table command and follow the link for the “move_table_clause”, and from there follow the link for “filter_condition”.

Note:

This option is not available on 12.1 and, if you run the test using that version, Oracle will raise error “ORA-25184: column name expected” at the point where the word “rows” appears. This may look somewhat counter-intuitive but, for a very long time, a command like “alter table TabX move including ColY online” is how you would rebuild an index organized table (IOT) with all columns up to ColY in the “IOT_TOP” segment.

Update [The following morning]

Once you’ve got the framework of a test in place it really doesn’t take very long to start running through “what if” cases or potential boundary conditions.  So this morning I added one very obvious test – what happens if you have referential integrity declared between two tables and try to move both of them including a subset of rows from each that ensures that the referential integrity is still in place:


rem
rem     Script:         122_move_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:
rem
rem     Last tested
rem             19.3.0.0        Parent can't move
rem             12.2.0.1        Parent can't move
rem

create table parent
as
select  *
from    all_objects
where   rownum <= 50000 -- > comment to avoid wordpress format issue
;

alter table parent add constraint par_pk primary key(object_id);

create table child
as
select  *
from    parent
order by
        dbms_random.value
;

alter table child add constraint chi_pk primary key(object_id);
alter table child add constraint chi_fk_par foreign key(object_id) references parent;

I’ve created the child table from the parent data, with random ordering. Now I’m going to delete all the child rows where owner = ‘PUBLIC’ using an online move, then I’ll try and do the same for the parent.


alter table child move
        including rows where owner != 'PUBLIC'
        online
;

-- Child move succeeds (of course)

alter table parent move
        including rows where owner != 'PUBLIC'
        online
;

--
-- Trying to do the matching move on the parent results in:
-- ORA-02266: unique/primary keys in table referenced by enabled foreign keys
--

So there’s a gap in the functionality that makes it less desirable than the simplest case suggests. The referential integrity constraint has to be disabled before the parent table can be restructured.

But something that merits a little further investigation is the option to set the foreign key to “disable validate” (which is sufficient to allow the parent move to take place) and then to set the constraint back to “enable”. When I tried this I had expected Oracle to do a lot of work to revalidate the constraint before enabling it, but I couldn’t find any indication that any such work had taken place.

Update (Nov 2020 – following comment #3 below about IOTs)

It’s important to check the manuals before getting stuck into too many experiments. The “problem” with foreign key constaints is one of the specified restrictions, as given in the 19c Language Reference manual under Alter Table:

Restrictions on Filter Conditions

The following restrictions apply to the filter_condition clause:

    • Filter conditions are supported only for heap-organized tables.
    • Filter conditions can refer only to columns in the table being altered. Filter conditions cannot contain operations, such as joins or subqueries, that reference other database objects.
    • Filter conditions are unsupported for tables with primary or unique keys that are referenced by enabled foreign keys.

Update (Feb 2021)

Another example of experimentation and reading the manuals – there are restrictions on the online move command.  Among other things the online option is not available if the table has domain indexes (e.g. context indexes). The annoying thing about such restrictions, of course, is that it might not even occur to you that it’s something you would need to check the manuals for before you build the model. From the 19c SQL Language Reference manual for Alter Table again:

Restrictions on the ONLINE Clause

The ONLINE clause is subject to the following restrictions when moving table partitions:

    • You cannot specify the ONLINE clause for tables owned by SYS
    • You cannot specify the ONLINE clause for index-organized tables.
    • You cannot specify the ONLINE clause for heap-organized tables that contain object types or on which bitmap join indexes or domain indexes are defined.
    • Parallel DML and direct path INSERT operations require an exclusive lock on the table. Therefore, these operations are not supported concurrently with an ongoing online partition MOVE, due to conflicting locks.

There are a number of little features scattered through this section of the manual (e.g. drop column) that have their own restrictions on the use of the ONLINE option, so you may have to search for “online” if you want to get complete coverage.

 

7 Comments »

  1. (full disclosure Oracle employee) but when all of the online changes in 12.2 came out (heap move, convert to partitioned etc) I was maliciously looking forward to coming up with all sorts of nasty permutations to break it. So I’m ending up with things like this – intervals + compression + including + global pars + local par etc etc…..I won’t claim it to be infallible, but its pretty damn robust in my experience

    SQL> alter table T modify
      2  partition by range (object_id) interval (10000) compress
      3  (
      4    partition p1 values less than (20000) 
      5  ) online
      6  including rows where created &gt; date '2010-01-01'
      7  update indexes
      8  ( ix  local tablespace new_idx_ts,
      9    ix2 global partition by range (created)
     10    (
     11      partition ix2_p1 values less than (date '2016-08-01'),
     12      partition ix2_p2 values less than (maxvalue)
     13    )
     14  );
    
    Table altered.
    

    Comment by Connor McDonald — July 13, 2020 @ 2:13 am BST Jul 13,2020 | Reply

    • Connor,

      Thanks for the comment. The surprising thing about the general availability of online restructuring, though, is that my only memories of its introduction were all about the fanfare of “turn simple tables into partitioned tables” – and the option for getting rid of rows while simply moving a table online never got mentioned. Unless I’ve got the history wrong 12.1 gave us online move of partitions but not of simple heap tables, then the fanfare was 12,2 allows all sorts of amazing online partitioning (and, in passing, simple heap move is now online).

      It’s been several years since this “online move and forget about some of the data” has been available, and I don’t think I’ve yet seen anyone suggest using it to “delete old data”.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — July 13, 2020 @ 8:08 am BST Jul 13,2020 | Reply

  2. RE: It’s been several years since this “online move and forget about some of the data” has been available, and I don’t think I’ve yet seen anyone suggest using it to “delete old data”.
    RE: When I tried this I had expected Oracle to do a lot of work to revalidate the constraint before enabling it, but I couldn’t find any indication that any such work had taken place.

    We’re pondering this as a possible alternative to an existing homegrown bulk-friendly but DML based “delete old data” process we have that would avoid the redo logging overhead for some high-volume log tables. Those tables don’t have PK/FK issues, but if it works well for those tables and there’s no indication that Oracle has to do a lot of work to revalidate the constraint before enabling it, we might try it on some transactional business data tables if there’s not a strong reason not to. It would be easy enough to detect a and alter an enabled FK to DISABLE VALIDATE before and ENABLE after the ALTER.

    Comment by Rick Wiggins (@tarpaw) — October 15, 2020 @ 5:22 pm BST Oct 15,2020 | Reply

    • Rick,

      Thanks for the comment.
      If you get any interesting results from your tests (whether positive or negative) it would be really nice to see them.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — October 19, 2020 @ 10:48 am BST Oct 19,2020 | Reply

  3. Hey Jonathan,

    are you aware of a syntax to use this ‘including rows where …’ on an Index-Organized-Table in Oracle 19?
    I still get the Error ‘ORA-25184: column name expected’ there.

    My testcase is as follows:

    ```
    create table t1
    (OWNER,
     OBJECT_NAME,
     OBJECT_ID,
     LAST_DDL_TIME,
     constraint t1_pk primary key(object_id)
     )
     ORGANIZATION INDEX
    as select
     OWNER,
     OBJECT_NAME,
     OBJECT_ID,
     LAST_DDL_TIME
    from    all_objects
    where   rownum <= 50000
    ;
    
    Table created.
    
    alter table t1 move including rows where owner != 'SYS';
    alter table t1 move including rows where owner != 'SYS'
                *
    ERROR at line 1:
    ORA-25184: column name expected
    
    
    SELECT name, value FROM v$parameter WHERE name = 'compatible';
    
    NAME          VALUE
    ------------ ------------
    compatible   19.0.0
    ```
    

    Kind regards,
    Steffen

    Comment by Steffen Busch — November 11, 2020 @ 2:51 pm GMT Nov 11,2020 | Reply

    • Steffen,

      Thanks for the email.

      It’s an interesting point that I managed to describe why 12.1 would report an ORA-25184 if you tried to use the “including rows” syntax with a heap table but didn’t then consider the possibility of what would happen to a heap table in 12.2 onwards. There are always a few more questions to ask and tests to run.

      In fact, after searching the SQL Language Reference Manual under “alter table” I found the explicit list of restrictions – including IOTs; it’s applicable only to Heap Tables. Start here and search for “filter_condition”.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — November 11, 2020 @ 4:28 pm GMT Nov 11,2020 | Reply

  4. […] fairly recent generic article on massive deletes and new […]

    Pingback by Massive Delete | Oracle Scratchpad — January 31, 2021 @ 5:15 pm GMT Jan 31,2021 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.