When I was a child I could get lost for hours in an encyclopedia because I’d be looking for one topic, and something in it would make me want to read another, and another, and …
The same thing happens with MOS (My Oracle Support) – I search for something and the search result throws up a completely irrelvant item that looks much more interesting so I follow a hyperlink, which mentions a couple of other notes, and a couple of hours later I can’t remember what I had started looking for.
Today’s note is a side effect of that process. A comment made yesterday about count(*)/count(1) referenced Oracle bug “19450314: UNNECESSARY INVALIDATIONS IN 12C”, and when I searched MOS for more information on this bug I discovered bug 30404639 : TRIGGER DOES NOT WORK CORRECTLY AFTER ALTER TABLE DROP UNUSED COLUMN. The impact of this bug is easy to demonstrate, and the ramifications are as follows:
Exercise extreme care with the “alter table drop column” command in 18c and above.
The problem is easy to work around, but the impact of not knowing about it could be catastrophic if your pre-production testing wasn’t quite good enough. Here’s a little demonstration script – the bug note says the problem appeared in 18.3 but I ran this test against 19.3. The script is a modified version of the SQL in the bug note:
rem rem Script: drop_col_bug_18c.sql rem Author: Jonathan Lewis rem Dated: Jan 2020 rem create table t1 (c0 varchar2(30), c1 varchar2(30), c2 varchar2(30), c3 varchar2(30), c4 varchar2(30)); create table t2 (c_log varchar2(30)); create or replace trigger t1_ariu after insert or update on t1 for each row begin IF :new.c3 is not null then insert into t2 values (:new.c3); end if; end; / spool drop_col_bug_18c.lst insert into t1(c3) values ('Inserting c3 - should log'); select * from t2; insert into t1(c4) values ('Inserting c4 - should not log'); select * from t2; prompt =================================== prompt Drop some columns in two steps then prompt truncate t2 and repeat the test prompt =================================== alter table t1 set unused (c1, c2); alter table t1 drop unused columns; truncate table t2; insert into t1(c3) values ('Inserting c3 - should log'); select * from t2; insert into t1(c4) values ('Inserting c4 - should not log'); select * from t2;
The code is very simple. It creates a couple of tables an “after row” trigger on one of them to copy one column value across to the other table on an insert or update provided the new column value is not null.
To check that the trigger is (at least in part) behaving the code does two inserts – one which should copy a value and one which should not – and we see that the copy takes place as expected.
Now comes the critical part. We mark two of the columns in the table as unused, then drop all unused columns, truncate the second table and repeat the inserts.
If you run the test on 12.2.0.1 then you should find that the second run behaves just like the first run. If you’re running 18c or 19c be prepared for the following:
insert into t1(c3) values ('Inserting c3 - should log') * ERROR at line 1: ORA-00600: internal error code, arguments: [insChkBuffering_1], [4], [4], [], [], [], [], [], [], [], [], [] no rows selected insert into t1(c4) values ('Inserting c4 - should not log') * ERROR at line 1: ORA-00600: internal error code, arguments: [insChkBuffering_1], [4], [4], [], [], [], [], [], [], [], [], []
This is not good – but it gets worse. If your application starts consistently breaking with an ORA-00600 error that’s going to annoy a lot of users for (we hope) a brief interval, but if your application keeps running and corrupting your data that’s a much bigger problem. Re-run the whole script (dropping the two tables first) but change it to mark just one of the two columns as unused, and you’ll get results for the second pass that look like this:
Table truncated. 1 row created. no rows selected 1 row created. C_LOG ------------------------------ Inserting c4 - should not log 1 row selected.
The trigger seems to “lose count” of the columns in the table (presumably it’s compiled to refer to something like “column_position = 3” and doesn’t adjust on the “drop column” – the linked bug notes on MOS refer to the problem being associated with the project to increase fine-grained dependencies) so it manages to survive with one column dropped because there’s still a “column 3” which happens now to be the column that used to be “column 4”.
Workaround
There is a simple workaround if you run into this problem after modifying your production system (and before you’ve corrupted a huge amount of data – recompile the trigger manually immediately after the drop completes: “alter trigger t1_ariu compile”.
Refinement
The problem seems to appear only if the following two conditions are true:
- you use a two-step approach to dropping a column, viz: set unused then drop. If you simply issue “alter table t1 drop column c1” (with or without a “checkpoint NNN”) then the problem does not appear. It’s a great shame that in the past I’ve given advice that setting columns unused and dropping them later is a better option than doing an immediate drop.
- you drop columns that appear earlier in the table than the highest position column mentioned in the trigger. But this isn’t something you should gamble on, particularly since the workaround is so easy to implement, because the order the columns appear in the table declaration isn’t necessarily the internal column ordering so you might get it wrong (not that I’ve tried to test for that threat) – and what if there are some selective materialized view logs where you don’t explicitly create triggers and forget to cater for.
I don’t expect anyone to be dropping columns in production systems with any great frequency, and you would expect such a significant operation to be tested quite carefully, but it’s easy to envisage a scenario where the testing might be split into two pieces viz:
- test the application on a pre-prod version of the database where a table has been created as a subset of the production data without the column that’s due to be dropped
- test how long it takes to execute the actual drop on a (minimal) recovered backup of production, but don’t test the new production code on the resulting table.
Sometimes it’s easy to overlook things that “obviously” won’t need testing, especially when it’s something that has always worked in the past with no special treatment required.
Footnote
If you try running this model on LiveSQL you’ll find that the code stops and the web page reports “Error: Internal Server Error” so you can’t tell that the problem is exactly the same there – but it seems quite likely that it is.
Given how easy it is to bypass the problem I haven’t bothered to do any further research on the issue – is it only related to insert and update trigger, and do they have to be after row for the update, and what about before row delete triggers (with materialized view logs in mind).
Update Aug 2020
I’ve just rediscovered this note and done a quick check on the bug. It’s now reported as fixed in 21.1 and there are a few patches for earlier versions of Oracle including 18.10 and 19.0. [Jan 2021: see comment 5 below: now included in the 19.10 release].
Hi Jonathan,
As it is said that this issue is related to fine-grained dependencies, I have just quickly tested it and found that setting “_ignore_fg_deps”=tables can be used as a preventive workaround (the tests were run on a local 19.5 database and on LiveSQL):
Comment by Mikhail Velikikh — January 14, 2020 @ 4:07 pm GMT Jan 14,2020 |
Mikhail,
Thanks for the comment.
Excellent observation.
Checking MOS I see there have been just a few bugs reported that resort to using _ignore_fg_deps – though the ORA-00600 error they report had a different first parameter. I think the appearance of [kkmmctbf:bad intcoln] as the first parameter is a bit of a clue to which of the “column id” columns in col$ Oracle is having a problem with.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — January 14, 2020 @ 4:29 pm GMT Jan 14,2020 |
Hi Jonathan,
Running Oracle 19.8:
I have a table with 4 triggers,
Dropping 4 extended stats (exec DBMS_STATS.drop_extended_stats, because of indexes created on that table with same cols as extended stats, also hitting max extended stats for that table)
And creating a new extended stats gives ORA-00600: internal error code, arguments: [kkmmctbf:bad intcoln], [75] and ORA 600 [kktget: bind pos > num bind vars] on DML statements.
Recompiling the triggers on that table solves the problem ! So be careful when dropping/creating extended stats in a Oracle 19.8 (+some one off patches) (Never seen that before) .
Created a SR,
Kind regards
Erik
Comment by Erik — January 27, 2021 @ 5:57 pm GMT Jan 27,2021 |
Erik,
Thanks for the warning.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — January 27, 2021 @ 6:07 pm GMT Jan 27,2021 |
Hi Jonathan,
Forgotten to say in previous update: the triggers are valid and we have the ORA-600.
Kind regards
Comment by Erik — January 27, 2021 @ 6:10 pm GMT Jan 27,2021 |
Hello,
I have done some tests
This patch solves in my cases also the issues with extended stats!
Ggoog news!
Kind regards
Comment by Erik — January 29, 2021 @ 4:56 pm GMT Jan 29,2021 |
Erik,
Good news. Thanks for letting us know
Regards
Jonathan Lewis
Comment by Jonathan Lewis — January 29, 2021 @ 5:37 pm GMT Jan 29,2021 |
Hi,
30404639 is listed as fixed in 19.10
Best regards,
Phil
Comment by Phil Florent — January 30, 2021 @ 9:27 am GMT Jan 30,2021 |
[…] Drop column bug (Jan 2020) – introduced in 18c, fixed in 19.10, can do nasty things if you “drop unused columns” and have triggers on the table. […]
Pingback by Bug Catalogue | Oracle Scratchpad — February 15, 2022 @ 12:31 pm GMT Feb 15,2022 |
[…] Drop unused columns bug (Jan 2020) – fixed in 21.1, patches needed prior to 19.10. This bug may affect you if you have materialized view logs on tables where you’ve marked columns unused then drop them. […]
Pingback by Materialized Views catalogue | Oracle Scratchpad — February 16, 2022 @ 6:36 pm GMT Feb 16,2022 |