Consider the following update statement:
rem rem Script: index_update_01.sql rem Author: Jonathan Lewis rem Dated: Oct 2006 rem update /*+ index(t1 t1_n1) */ t1 set n1 = 0, n2 = n2 + 1, status = 'X' where n1 > 100 and status = 'A' ;
Did you realise that the access method that Oracle uses for an update can make a difference to the type of work done by your system ? And if the type of work is different, this could make a big difference to the amount of work done by your system ?
In recent tests on 8i, 9i, and 10g, the work done by an update depends on whether the update operates through a tablescan or an index. [Edit: the same is also true of a delete – see note 15 below]
If Oracle uses a (B-tree) index in ascending order (index_desc() doesn’t co-operate) to find the data to be updated, it postpones any (B-tree) index updates needed until the end of the update, then sorts the index keys (with their rowids) for the before and after values before applying bulk updates to the indexes. This bulk update strategy may reduce undo and redo generation – along with logical I/O and latch activity – but will expend extra resources on sorting. For example, if your update affects three indexes, and you update 100 rows, you will see sorts = 3, rows sorted = 600 (3 * 2 * 100) in your session statistics (v$mystat).
If Oracle uses a tablescan to acquire the data to be updated, then it updates each index entry as it goes. This eliminates the need for sorting but doesn’t get any benefits of the undo and redo optimisations that pack multiple changes into a single record. I’ve also noticed from 10g onwards (and possibly it’s a bug since it doesn’t seem to happen in 8i or 9i) that a large tablescan update can generate a surprising number of redundant block cleanout records for the table blocks.
In general it seems that an indexed mechanism is likely to be more efficient than the tablescan – even when a tablescan would be more efficient for the equivalent query. This may be particularly effective if you can pick an index with a low clustering_factor (or, to expressing that in terms of the underlying data, if as you walk the index in order you find that the data is well clustered in the same order).
As always, though, you ought to test the impact a change in access method might have on your more interesting updates. Depending on your bottleneck, either option might be the more appropriate for your particular circumstances, notwithstanding any abstract argument about which is theoretically “more efficient”.
Footnote:
I’ve included specific references to B-tree indexes in the above. I haven’t yet checked what might happen with bitmap indexes – but normally you wouldn’t expect to be doing this type of update with bitmap indexes enabled anyway.
Footnote 2:
See comments 12 – 15, the same index-driven mechanism applies to delete statements.
Footnote 3:
If you have any B-tree indexes supporting foreign key constraints, or non-unique indexes supporting unique/pk constraint, the delayed maintenance optimisation will not apply to them (though it can still be used on any other B-tree indexes on the table).
I am curious, what triggered your research on this ?
Comment by Christo Kutrovsky — November 22, 2006 @ 9:30 pm GMT Nov 22,2006 |
Christo, the usual – a client complaining about a performance problem.
Comment by Jonathan Lewis — November 22, 2006 @ 9:41 pm GMT Nov 22,2006 |
it postpones any (B-tree) index updates needed until the end of the update
Clarification : are all index updates postponed, or perhaps the index used to find the data (t1_n1 in your hinted example) is updated immediately, then the table rows, and then the remaining indices in bulk ?
May you also kindly sketch briefly how you investigate the internal workings in this scenarios – do you use logminer, or some events ?
Thanks in advance of course – as always, even when I don’t follow up with “Thanks” (I don’t only to avoid cluttering your blog with my thank-you messages).
Comment by Alberto Dell'Era — November 22, 2006 @ 9:41 pm GMT Nov 22,2006 |
Alberto, all indexes. In this case I started by creating a couple of test cases like the above, with index() hints, full() hints, and varying numbers of row, checking session stats for the different cases.
The stats suggested that it was every index, and that the number of “rows” sorted was twice the number of rows updated. I think I then dumped a couple of undo blocks to see what was in them; and looked at the 10032 trace to check whether the sort stats reported by the session stats matched the sorting reported by the 10032.
What I did not get around to in my test cases, was to force the sorting to spill to the temp tablespace so I have assumed – but not yet proved – that the items sorted are of the form:
(delete/insert flag, old/new key value, rowid)
Comment by Jonathan Lewis — November 22, 2006 @ 9:51 pm GMT Nov 22,2006 |
Curious by the way that the index updates are optimized for an index-driven access, since if the CBO chooses an index, it means that only “a few” rows are estimated to be inspected, and so “less than a few” updated, whereas for a FTS, the converse is true. Why optimize for the least resource intensive operation ?
Unless this is done, mainly, to avoid mixing read (consistent gets) and write (current gets) operations on t1_n1, which may be cumbersome if block splits are needed, and could mean (just thinking out loud) a lot of visits to the rollback segment to (re) build the CR version of the index blocks. Just an idea.
Comment by Alberto Dell'Era — November 22, 2006 @ 10:51 pm GMT Nov 22,2006 |
Jonathan – great!
About a month ago we came over similar issue. Including “dummy” update of indexed column “automagically” resulted in less latches. Unfortunately we didn’t have much time for tracking done this issue. You did that for us – thank you.
Comment by Alexander Fatkulin — November 23, 2006 @ 12:07 am GMT Nov 23,2006 |
Jonathan,
With reference to your observation that sorts=3 when the optimizer chooses the index update access path – I also noticed that sorts=2 for the full tablescan path when I tried to repeat your expirement. At first I thought this might be due to resorting of the index (which needs to happen irrespective of the access path used) but when I removed the index altogher and just ran a table update using your query I still get sorts=2 by diffing the v$mystat before and after. So my question is: would you be able to enlighten me on what the 3 sorts are doing and why they appear for a full tablescan when no index is present ? Thanks and Regards.
Comment by Jeroen — November 23, 2006 @ 3:17 am GMT Nov 23,2006 |
Very nice!
I’m wondering if this may apply to merge.
This is the situation:
A table with no index join with an external table, in the explain plan for a merge there are 2 sort operation (they got the highest number in the plan).
One for the insert and one for the update? Is it possible?
Comment by Antonio — November 23, 2006 @ 9:22 am GMT Nov 23,2006 |
Antonio – from your description of seeing two sorts in the exection plan for a merge command, then I suspect you are seeing a merge join (outer). The merge command always requires an outer join between the new data and the old data – and that join can be a hash, merge, or nested loop join.
Jeroen, I can’t tell you why you are seeing those two sorts; but I would guess it is either some (version-dependent) detail of recursive SQL; but it may be related to the SQL you are using to “diff” the results from v$mystat. Check the number of rows sorted – does it seem to vary with the number of rows updated, or is it basically constant.
Comment by Jonathan Lewis — November 23, 2006 @ 7:01 pm GMT Nov 23,2006 |
Hmm that’s very interesting. What about something like that:
UPDATE table SET col1=val1
WHERE colX in (SELECT rowid FROM tmptable)
I think I will try that out. No FTS, no Index access, just access by rowid.
I’ll post my results here.
Wolfgang
Comment by Wolfgang — November 24, 2006 @ 10:26 pm GMT Nov 24,2006 |
Just in case somebody is curious about the results Wolfgang got. I investigated this case in version 11.2.0.4 and came to the conclusion that Oracle doesn’t use the delayed maintenance optimisation.
Comment by Vasiliy — August 18, 2017 @ 9:32 am BST Aug 18,2017 |
[…] to find the data, the bitmap example uses the indexed access path – and I’ve written a little note about the different update costs you get using different access mechanisms. This explains the […]
Pingback by Mything in action « Oracle Scratchpad — December 19, 2006 @ 11:05 pm GMT Dec 19,2006 |
Hello Mr.Lewis
Is this true when we do a delete. I have a batch program which does an average of two million rows every day but it does thru index scan, so I am not worried. But I am curious.
Thanks
CT
Comment by CT — January 24, 2007 @ 2:27 pm GMT Jan 24,2007 |
CT – if you are curious, build a test case and try it. Write a script to create a table with a lot of data and delete with a full() hint, then repeat with an index() hint. Check the execution plan to make sure you’ve got the hint working. Disconnect and reconnect before the delete, and check v$mystat after the delete.
Comment by Jonathan Lewis — January 25, 2007 @ 12:13 am GMT Jan 25,2007 |
A recent question on OTN prompted me to check what would happen if the indexes that needed to be updated were bitmap indexes (and the driving index was still a b-tree).
The answer was that in both cases delayed index maintenance took place. The index entries were sorted and updated after the table update was complete – whether the driving access path was a tablescan or index.
Comment by Jonathan Lewis — October 17, 2008 @ 9:42 pm BST Oct 17,2008 |
CT,
You never told us if you had followed my suggestions for a test – but the answer to your original question is yes.
I did a quick test just before supplying the suggestions in this thread of the Oracle Forum:
The results seemed to have been quite beneficial
Update: This thread has since been removed from OTN thanks to the comments made by one of the posters – unfortunately I don’t have a backup, but the point was that the strategy works for deletes, not just updates.
Comment by Jonathan Lewis — January 3, 2009 @ 9:18 am GMT Jan 3,2009 |
Can you suggest something for an update statement without a where clause. Since I need to update all rows in the table, how can I get better performance?
UPDATE test_table
SET col1 = newvalue1,
col2 = newvalue2;
Any help is appreciated.
Chris
Comment by Chris C. — March 23, 2009 @ 3:22 pm GMT Mar 23,2009 |
Chris,
I think there are several questions we could ask before coming up with the best solution. For example – since you’re updating every row would you be allowed to do something like:
create new_table nologging as
select newvalue1, newvalue2, {rest of columns} from old table;
create new_indexes
rename old_table ...
rename new_table ...
rename new_indexes
drop old table
If you can’t play games with dropping and recreating, have you tried something like:
update /*+ index(pk index) */ table set ... where pk is not null;
Could you do an update in parallel ?
Can you disable and recreate (nologging) any indexes that would otherwise be affected by the update ?
Comment by Jonathan Lewis — March 23, 2009 @ 5:18 pm GMT Mar 23,2009 |
Hi Jonathan,
I have tested update on similer test tables in 9i and 10g. Buffer gets was alarmingly high on 11g compared to 10g.
Here is my test.
Oracle 10g. (8K Block size, buffer cached flushed before test executed, no other operation running while this testing).
Here I can see 10021 MB Buffer gets.
Oracle 11g: (Block size 16K, Buffer cache flushed before running this query, no other job/sql running while this test)
I can see very high 39049 MB buffer gets. Execution time also 5 times more compared to 10g database.
Both 10g and 11g servers are similer CPU and memory (also SGA) confuguration.
Thx
Dilip Patel.
Comment by Dilip Patel — April 7, 2010 @ 12:32 pm BST Apr 7,2010 |
Dilip,
There’s not enough information there to comment – and it may simply be that the execution plans have changed. But I note you’ve changed from 8KB blocks to 16KB blocks and you’re doing an update. If that update is causing a lot of row migration and you’re using ASSM you might be seeing an example of at this bug.
In passing – if you’re trying to track down a problem, then it’s best to start tailoring your queries to give yourself the information you need to see. For example, it makes things harder to grasp if you report “AVG_BUFFER_MB” when (a) you’re looking at one query, and (b) you’re interested in the number of buffer visits.
Comment by Jonathan Lewis — April 12, 2010 @ 1:53 pm BST Apr 12,2010 |
[…] pointed out before now that choosing a different execution plan for an update can have a big impact on performance – this is just another example demonstrating the […]
Pingback by More CR « Oracle Scratchpad — April 19, 2011 @ 6:35 pm BST Apr 19,2011 |
[…] may also recall, by the way, that when you have B-tree indexes on a table you can choose the optimal update or delete strategy by selecting a tablescan or index range scan as the execution path. If you update or delete […]
Pingback by Bitmap loading | Oracle Scratchpad — April 22, 2014 @ 2:14 pm BST Apr 22,2014 |
[…] That 20M rows sorted is 5M rows x 4 indexes. With my memory sizing the sorting spilled to disc (as 4 separate sort operations, though I think I used to see the entire operation handled as a single combined sort in earlier versions of Oracle). So you have a choice of mechanism for doing large deletes, and a choice of HOW you use resources that may impact significantly on where the work is done and how long it takes. (Note: The same variation in strategy is available on an update.) […]
Pingback by Massive Deletes – Part 3 – All Things Oracle — April 5, 2016 @ 8:58 am BST Apr 5,2016 |
[…] Every now and again I need to point out that driving an update (or delete) through an index may result in a reduction of undo and redo – but it takes me ages to find the blog note I wrote about it because I never remember the title and always search for “big update” and get to this post. So here’s the link to “Tuning Updates”. […]
Pingback by Big Update | Oracle Scratchpad — June 29, 2016 @ 8:52 am BST Jun 29,2016 |
[…] the index optimization also takes place on updates where the values of indexed columns change (see this blog post) so it’s left as an exercise to the interested reader to see which (if any) of the trigger […]
Pingback by Massive Deletes – Part 4 – All Things Oracle — September 5, 2016 @ 12:16 pm BST Sep 5,2016 |
[…] moreover the first strategy gives Oracle the option to use the index-driven optimisation for maintaining indexes and this method doesn’t. You might note, by the way, that you could include an “order […]
Pingback by Delete/Insert | Oracle Scratchpad — November 22, 2016 @ 12:59 pm GMT Nov 22,2016 |
we are facing a similar issue where update to a table using a predicate that happens to be part of PK. we are updating 3 columns in the table which are part of a unique key. when monitored in production environment (where update is happening thru home grown Java App), buffer gets is running into 700 to 2K range. There are only 2 indexes on the table – PK and UK indexes. Since update is based on PK we expect only one row to be updated and even if we considered changes to index, we shouldn’t be expecting such high buffer_gets for this sql_id. We opened one of our standby in snapshot standby mode and tried to mimic the same update thru sqlplus but we aren’t noticing such high buffer gets in this environment. What else can we consider in this situation?
Comment by Dina — March 17, 2017 @ 2:24 am GMT Mar 17,2017 |
we have traced the sql_id and didnt find any useful information. Considered using snapper from Tanel, but he has clearly called out that snapper would be useful only in a test ground and not in real production env as there would be other sessions that might not help us find the reason for such high buffer_gets on the sql_id.
Comment by Dina — March 17, 2017 @ 2:30 am GMT Mar 17,2017 |
if it is of any use we are on 11.2.0.3 and 3 node RAC. The table in question is interval partitioned on a date column.
Comment by Dina — March 17, 2017 @ 2:34 am GMT Mar 17,2017 |
Dina,
This isn’t really connect to the posting, so you could try asking the question on the OTN database forum.
However, a couple of starting thoughts: you say the predicate is PART OF the primary key – but that means you may be updating many rows. Also, since the table is partitioned if the PK is locally partitioned and the PART OF doesn’t include the partition key then Oracle may be probing every index partition to do the update.
After checking the easy options first, there are fairly typical problems that may be a combination of concurrency, read-consistency, and “FIFO” activity that leave you doing a lot of buffer visits to find the correct point in and index and make sure that you are looking at the correct version of the data. As a quick initial check you may be able to take a snaphost of v$sesstat for a session that you know is executing this query and compare the number of consistent gets with the number of “undo records applied” – if the number of undo records applied (there are three possible statistics) is a large fraction of the total you have a concurrency problem. (This is a snapper-like approach, of course, but limited to one session).
An alternative is to pick a session that you know is going to execute this statement a few times, and trace it’s logical I/O activity – events 10200, 10201, 10202, 10203: you can be selective in 11g as follows (logged on as a suitably privileged user):
Find the PID (v$process.pid) for a suspect session, and the SQL_ID for the important query and change the values below; you might as well enable all 4 events at the same time:
— wait for a brief interval
If you have 700 to 2,000 consistent gets per execution then the trace will be quite big.
When you look at the trace remember that the excess buffer gets may be UNDO blocks, but or blocks from EITHER of the two indexes.
Comment by Jonathan Lewis — March 17, 2017 @ 11:54 am GMT Mar 17,2017 |
It turns out that when batch updates from App tier uses prepared statement and executeBatch calls, then oracle reports it as 1 execution though we might impact multiple rows based on the set of bind variables we have setup. We tested this using code defined at mkyong dot com and we found this to be true. Rows processed per execution and trace file showing multiple binds and only one exec was the clue we had to follow down this path.
Comment by Dina — March 18, 2017 @ 12:38 am GMT Mar 18,2017 |
Dina,
It’s very easy to be misled by little details or (as in the case) the absence of some little detail. Possibly you were looking at an AWR report and the “SQL ordered by Gets” section which shows you the execution count and buffer gets, but doesn’t tell you how many rows were processed – that’s one of the ‘favourite’ deficiencies in the AWR. The effects (and small inconsistencies of statistics) of array processing are easy to overlook.
Comment by Jonathan Lewis — March 21, 2017 @ 2:26 pm GMT Mar 21,2017 |
[…] You might have noticed the (almost) hint for a full tablescan of the table. This was because I was in the middle of looking into something else when I noticed this behaviour, the method of accessing the table being updated doesn’t seem to make a difference here (unlike in https://jonathanlewis.wordpress.com/2006/11/22/tuning-updates/). […]
Pingback by DML and SQL Plan Projections | CTAndrewSayer — June 10, 2018 @ 9:37 am BST Jun 10,2018 |