I was on a customer site recently where I needed to add a NOT NULL constraint to a table of 200 million rows – without taking any downtime. It’s not difficult (provided you are happy with a check constraint rather than a column definition.)
alter table t1 add constraint t1_ck_colX_nn check (colX is not null) enable novalidate ;
The first step creates the constraint and enables it – but doesn’t validate it. This means that future data (and changes) will obey the constraint, but there may be illegal data already in the table that will not be checked. You will have an interruption to service doing this, as your session will wait to lock the table in share mode (mode 4) to add the constraint – so will be blocked by current update transactions, and will block new update transactions. In a typical OLTP system this should result in just a brief pause.
The second step validates the constraint, which needs a slightly more complex piece of code – perhaps something like the following:
declare resource_busy exception; pragma EXCEPTION_INIT(resource_busy, -54); begin loop begin execute immediate 'alter table t1 modify constraint t1_ck_colX_nn validate'; dbms_output.put_line('Succeeded'); exit; exception when resource_busy then dbms_output.put_line('Failed'); end; dbms_lock.sleep(0.01); end loop; end; /
This code tries to validate the constraint and goes into a loop, sleeping for 1/100 second, if it hits the “resource busy” error. Unlike the call to add the constraint, the call to validate it doesn’t wait for a lock – it requests an immediate lock, so it’s going to fail immediately if there are any active transactions on the table. Once it has the lock it drops it, so you’re not going to shut your users out while the validation takes place.
Finally, we get to the quiz.
I ran the code above on a table with 200 million rows. It took Oracle just 0.15 seconds to validate the constraint. How could that be possible ?
Update 28th Aug:
What a lot of excellent responses. The answer in my case was that the table was list partitioned and I didn’t have a partition defined as capable of accepting NULL. But there are other options (as one of the comments said the partitioning could have been range partitioning then in the absence of a “maxvalue” partition there’s nowhere for NULL to go. [See also comment 10 below])
When validating the constraint, Oracle simply has to execute the following query as quickly as possible (sample taken from an 18.104.22.168 trace file):
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "TEST_USER"."T1" A where not ( subobject_name is not null)
Here are three plans which I caused to appear by changing the infrastructure of the table:
STAT #6 id=1 cnt=0 pid=0 pos=1 obj=0 op='PARTITION LIST EMPTY (cr=0 pr=0 pw=0 time=0 us cost=309 size=37 card=1)' STAT #6 id=2 cnt=0 pid=1 pos=1 obj=75588 op='TABLE ACCESS FULL TP (cr=0 pr=0 pw=0 time=0 us cost=309 size=37 card=1)' STAT #4 id=1 cnt=0 pid=0 pos=1 obj=0 op='BITMAP CONVERSION TO ROWIDS (cr=2 pr=2 pw=2 time=0 us cost=1 size=37 card=1)' STAT #4 id=2 cnt=0 pid=1 pos=1 obj=75586 op='BITMAP INDEX SINGLE VALUE T1_B1 (cr=2 pr=2 pw=2 time=0 us)' STAT #6 id=1 cnt=0 pid=0 pos=1 obj=75587 op='INDEX RANGE SCAN T1_I1 (cr=3 pr=3 pw=3 time=0 us cost=2 size=37 card=1)'
The first came from my list-partitioned table, so the optimizer went for a plan that would search just the partition that was allowed to hold NULLs, and found at compile time that there wasn’t one. (But it still had to produce a plan that would describe the action it would have taken if it had found one.)
The second plan came from having a bitmap index on the column. NULLs are included in bitmap indexes, so an index access to find the first NULL would be the fastest thing that Oracle could do to test whether there were any rows failing the constraint.
The last one came from having a multi-column B-tree index with the leading column as my constrained column and one of the other columns defined as NOT NULL. Because at least one of the columns in the index was declared NOT NULL (and I should point out that this particular constraint was both enabled and validated) the optimizer knew that every single row in the table had to be in the index. Because the index started with the correct column Oracle was able to do an index range scan on it to check for NULLs very quickly.
If I’d switched the column order in the index the optimizer could still have chosen to use the index for the check, but it would have to have chosen an index skip scan, index full scan, or index fast full scan to do so and this probably wouldn’t have matched my 0.15 second response time. (Unless I were using Exadata, perhaps.)