In a recent article (which has prompted several excellent comments) I highlighted the suggestion raised by Don Burleson that:
We would have expected (at least, I did, anyway) that Oracle would check to see if a SQL update statement failed to result in any row changes. A simple test could protect against “false updates”, and perhaps bypass the update and send an alert log message that a DML statement is doing superfluous updates.
Mr. Burleson’s comment was made with reference to an earlier page of mine explaining that I had clawed back some time from an update statement by changing it from
update HISTORY SET FLAG = 0 WHERE CLASS = 'X';
update HISTORY SET FLAG = 0 WHERE CLASS = 'X' and FLAG != 0;
thus avoiding the undo and redo overheads of roughly 5,000,000 redundant row updates per night.
At first sight, Mr. Burleson’s comment might seem quite sensible – after all it shouldn’t take much CPU to work out which rows should and should not be updated, and it ought to be quite simple to skip the rows that aren’t going to change. (I have had to make a few guesses about what Mr. Burleson is trying to say as he fails to distinguish between updates to columns, updates to rows, and the entire update statement. My initial assumption is that he is trying to say that Oracle should not update a row when it sees that none of the columns specified in the update statement have changed their values).
This, however, is a very naive piece of analysis that demonstrates a common lack of attention to the concept of generalisation. The one specific statement that you are looking at right now is not typical of everything you do. The “obvious” enhancement for this statement may have a horrid impact on everything else you do. In fact, the “obvious” enhancement may cause logical errors because the statement is no longer behaving according to its specification.
It’s the classic “silver bullet” issue – it’s often easy to come up with a “quick fix” for a local problem, but you then have to demonstrate that the fix caters for all possible variations that might occur, and that the global consequences of applying the fix aren’t going to be worse than the impact of the original problem.
Let’s start with the critical reason why Oracle must not try using clever tricks to avoid the “false updates”. If I update a row in a way that changes none of the columns I have still managed to lock the row and no-one else can modify it. If Oracle bypasses my update simply because it appears to be redundant then it may be subverting my intentions by leaving rows modifiable by other sessions when I want them locked.
Of course, you could argue that Oracle should therefore simply lock the row but not update it. However, in simple cases like the example above, most of the undo and redo overhead relates to the tracking and locking information – the volume of data relating to the before and after images of the data is really a small percentage of the total. So for the increased complexity, which becomes apparent in the more general case, we gain very little benefit. (In passing , when I advised the client to add the extra predicate, I first made sure that the “excess” 5 million rows were being updated by accident rather than by design – I also made sure that the column was not allowed to hold nulls.)
The General Case
The original example had just one column in the update statement – but think what Oracle would have to deal with to handle an update like:
update t1 set col1 = :b1, col2 = :b2, col3 = :b3, ... col50 = :b50 where rowid = :rowid ;
This is the type of code you get from many GUI screen-painters – the “one statement fits all changes” feature.
On the plus side you tend to get only one SQL statement in the shared pool to cover every possible update (though there may be multiple child cursors if you have columns declared as varchar2() or nvarchar2() with a length of more than 32 bytes), on the minus side you generate undo (and redo) for every column used in the screen. So an approach similar to Mr. Burleson’s suggestion definitely has some appeal, but would it be the “simple test” that he thinks it is.
Interestingly, there already seems to be some kernel code in place to detect “no-change” updates. If you update an indexed column so that the value does not change Oracle doesn’t seem to go through the full cycle of work that would normally go into updating the indexes – the amount of work done varies with version of Oracle, and in some scenarios seems to have gone up significantly in 10g. (I haven’t worked out all the details, and I can’t be certain that some of the optimisation isn’t simply an accidental side-effect rather than deliberate code – and it is possibly that the increase in 10g is a bug).
So why doesn’t Oracle perform a “simple test” to avoid updating columns that haven’t changed. I don’t know, I haven’t seen the kernel specifications and I didn’t write the kernel code, but I’ll take a couple of wild guesses.
Maybe it’s just not so simple to unravel the existing code that creates the undo and redo records – and remember in the case of my 50 column update above we have to test every single incoming field and (presumably) set a flag that says “we finally changed one column so we need undo and redo“ so that the redo and undo will be generated.
And when it comes to flagging the “false update” into the alert log, should we do this for the row when we find a single column that doesn’t change, for every column that doesn’t change, only if the entire set of columns in a row doesn’t change, or just once for the entire statement irrespective of how little, or how much, isn’t changed ? Whichever option you pick, it’s arguably the wrong option for some (possibly most) people.
And then you have to consider just what the overhead of this “simple test” might be. It has to operate for every single update you ever do. Most developers (we hope) will write code that does exactly what it’s supposed to do, efficiently, and with no redundant activity. So if we install this “simple test” we may have to pay for it millions of time a day in order to get a notification occasionally about a piece of silly code. Is this a sensible trade ? Possibly not, especially if there is some complexity involved that introduces a risk of error in a critical piece of the Oracle kernel.
Still, given all the other work that goes on around an update, the overhead of checking whether the old and new values match and not generating redundant undo and redo does sound like a cost-effective run-time trade-off – especially in the light of the GUI-generated screen. But I’m not going to make any guesses about the complexity of the task, or the impact it might have in the library cache if some component of the cursor heap has to be restructured as a consequence.
When thinking about “simple tests” you always need to be aware that some things can go on that fall outside the scope of your initial “simple example”. For example:
What should you do about triggers – consider the code fragment:
create or replace trigger t1_bru before update of n01, n02 on t1 for each row ...
If you do a “no-change” update on the columns referenced in a row-level trigger, should the trigger still fire ?
If you say no, you’ll have to explain how to cope with the case where the trigger changes the incoming value; if you say yes, you’ll have to explain what to do if the trigger is being used to summarize the rows that have been updated – and according to the “simple test” approach this row may have fallen into the “false update” category and not have been updated at all – and then compare the logic of your argument with the home-grown (and slightly unsafe) logic I’ve seen occasionally which does the equivalent of the following pseudo-code:
SQL> select rowid from history where flag = 'X' for update; 5000000 rows selected SQL> update history set status = 0 where flag = 'X'; 120 rows updated if select_count != update_count rollback and try again.
Consider all the flavours of using supplemental logging (for streams, change data capture, etc.). Supplemental logging works by adding to the undo records (yes, that does say undo, not redo, that is the way it works) a set of columns that can act as a unique identifier for a row – even if those columns have not changed. If someone does decide to add that “simple test”, I hope they don’t make it happen after the supplemental logging bit has been generated – or the supplemental log will disappear and streams et. al. will break.
No doubt there will be people reading this note who will think come up with other example of the less well-known features that also need special consideration before the “simple test” could work safely.
Most of the code in the Oracle kernel behaves as if application developers understand exactly what they are doing, and know how to use relational databases – this is, of course, a serious design flaw. But from time to time bits of the kernel code do get modified to cope with the worst excesses of bad applications.
Perhaps the time will come when someone modifies the kernel to eliminate the redundant undo and redo generated by the default behaviour of most GUI screen generators. When they do, they had better not change the locking behaviour, and they had better be very careful that they allow for all the variations of what else could be going on in all the peripheral activities involved with row updates.
And there had better be a parameter to turn the feature off; because when you start to think carefully, you realise that there could be all sorts of unexpected side-effects which make the “simple test” not so simple after all.