Here’s a little follow-on from Friday’s posting. I’ll start it off as a quiz, and follow-up tomorrow with an explanation of the results (though someone will probably have given the correct solution by then anyway).
I have a simple heap table t1(id number(6,0), n1 number, v1 varchar2(10), padding varchar2(100)). The primary key is the id column, and the table holds 3,000 rows where id takes the values from 1 to 3,000. There are no other indexes. (I’d show you the code, but I don’t want to make it too easy to run the code, I want you to try to work it out in your heads).
I run the following pl/sql block.
rem
rem     Script:         transaction_demo.sql
rem     Dated:          Feb 2013
rem     Author:         Jonathan Lewis
rem

begin
        --
        --      Going to fail on primary key constraint
        --
        for i in 1..200 loop
                begin
                        insert into t1 values(50,i,'x','x');
                        commit;
                exception
                        when others then null;
                end;
        end loop;

        --
        --      Explicit rollback
        --
        for i in 1..40 loop
                begin
                        insert into t1 values(4000 + i,i,'x','x');
                        rollback;
                end;
        end loop;

        --
        --      Explicit commit
        --
        for i in 1..50 loop
                begin
                        insert into t1 values(4500 + i,i,'x','x');
                        commit;
                end;
        end loop;

        --
        --      Rolling back to savepoint
        --
        for i in 1..70 loop
                begin
                        savepoint A;
                        insert into t1 values(4600 + i,i,'x','x');
                        rollback to savepoint A;
                end;
        end loop;

end;
/
If (from another session) you take an AWR (or Statspack) snapshot before and after running the block, what will you see for the following statistics:
- user commits
- user rollbacks
- transaction rollbacks
- rollback changes – undo records applied
and, for a bonus point, what will the value of “Rollback per transaction %” be?
Note – since the snapshots are global, your results will be affected by other work on the system; in particular I noticed that simply taking two AWR snapshots very quickly one after the other on the instance I used resulted in 8 user commits.
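If you do decide to run it, you can avoid most of that noise by capturing the session-level figures immediately before and after the block from the session that runs it – a sketch (v$mystat holds the current session’s statistics, joined to v$statname for the names):

select
        sn.name, ms.value
from
        v$mystat        ms,
        v$statname      sn
where
        sn.statistic# = ms.statistic#
and     sn.name in (
                'user commits',
                'user rollbacks',
                'transaction rollbacks',
                'rollback changes - undo records applied'
        )
order by
        sn.name
;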
Update
And the answers – cut from an AWR report – are:
Statistic                                     Total     per Second     per Trans
-------------------------------- ------------------ -------------- -------------
rollback changes - undo records                 420           25.2           4.3
transaction rollbacks                           240            4.7           2.5
user commits                                     58            1.1           0.6
user rollbacks                                   40            0.8           0.4

Rollback per transaction %:    40.82
Rows per Sort:                122.70
The 40 “user rollbacks” come from the loop with the explicit rollback. At the same time the 40 rollbacks introduced 80 “rollback changes – undo records applied” – one for the table’s index and one for the table for each call to rollback. Our “user rollbacks” have been “real” rollbacks, so they’ve introduced 40 “transaction rollbacks” at the same time.
The 58 “user commits” come from the loop with the explicit commit – except that the picture is slightly blurred by the fact that simply running the AWR snapshot introduced a few extra commits.
The 40.82% is 40 / (40 + 58) expressed as a percentage.
The “rollback to savepoint” calls haven’t contributed to the “user rollbacks” – even though, in this case, they have rolled back 70 transaction starts, and a check of various other statistics would show TX enqueues being requested and released and transactions rotating through the undo segments. Despite this clearing of the TX enqueues, these 70 calls to “rollback to savepoint” don’t count towards “transaction rollbacks”; they have, however, introduced a further 140 “rollback changes – undo records applied” (again, one for the table and one for the table’s index for each call).
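If you want to see that clearing effect for yourself, here’s a quick sketch (assuming you have access to v$session and v$transaction, and start with no open transaction): take a savepoint before doing any work, insert a row, roll back to the savepoint, then check whether the session still owns a transaction entry.

savepoint a;
insert into t1 values(5000, 0, 'x', 'x');
rollback to savepoint a;

-- zero means the session no longer owns a transaction entry
select  count(*)
from    v$transaction
where   ses_addr = (
                select  saddr
                from    v$session
                where   sid = sys_context('userenv','sid')
        )
;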
Finally, the failed attempts to insert duplicate keys have, in these circumstances, introduced 200 “transaction rollbacks” (though not “user rollbacks” – we didn’t ask for them explicitly). They are also responsible for the final 200 “rollback changes – undo records applied” that we need to account for. The changes, very specifically, are the changes to the table: Oracle actually has to insert the row into the table before trying to insert the index entry (after all, the index entry needs to know the rowid), and so it has to reverse that insert when it discovers that the relevant entry already exists in the primary key index. The requirement to insert and then roll back is one of the reasons why, when doing “batch maintenance by pl/sql loop”, it is often better to try an update and then insert if sql%rowcount is zero, rather than trying to insert, capturing the exception, and then updating. It’s also a good reason for creating the unique indexes on a table before creating the non-unique indexes – Oracle maintains indexes in the order they were created, so you want it to find the duplicate key error as soon as possible to minimise the amount of change and rollback.
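To make the comparison concrete, here’s a sketch of the two strategies using the t1 example (the values are purely illustrative):

begin
        -- update first, insert only if the row wasn't there
        update t1 set n1 = 99 where id = 50;
        if sql%rowcount = 0 then
                insert into t1 values(50, 99, 'x', 'x');
        end if;
end;
/

rather than:

begin
        -- insert first, catch the duplicate, then update
        insert into t1 values(50, 99, 'x', 'x');
exception
        when dup_val_on_index then
                update t1 set n1 = 99 where id = 50;
end;
/

The second version does the table insert and its subsequent rollback (with the associated undo) every time the row already exists; the first version touches the table just once in that case.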
If you want to experiment further, there are a couple of slightly different variations of the code that exhibit interesting variations in results. There are also other events that can cause transaction rollbacks to appear without user rollbacks being invoked.
Here are my guesses:
The first loop will be responsible for 200 ORA-1 errors, which should turn into:
0 transaction rollbacks (because the transaction does not start at all)
0 user rollbacks (since these are not explicit rollbacks)
200 rollback changes – undo records applied (to “repair” the ORA-1 errors on the index)
The second loop will be responsible for 40 insert+rollback, which should turn into:
40 transaction rollbacks
40 user rollbacks (since these are explicit rollbacks)
80 rollback changes – undo records applied (40 changes on the index and 40 changes on the table)
The third loop will be responsible for 50 insert+commit, which should turn into:
0 transaction rollbacks
50 user commits
0 rollback changes – undo records applied (since the ORA-1 errors will not allow us to change anything)
The fourth loop will be responsible for 70 insert+rollback to savepoint, which should turn into:
0 transaction rollbacks (the “rollback to savepoint” does not close the transaction even if we rollback to the “starting” savepoint)
0 user rollbacks (same as before)
140 rollback changes – undo records applied (70 changes on the index and 70 changes on the table)
Comment by Andrea Monti — February 18, 2013 @ 8:50 am GMT
Obviously the “(since the ORA-1 errors will not allow us to change anything)” sentence is due to a wrong cut and paste.
I forgot to do the arithmetic: we should get
420 rollback changes – undo records applied
240 transaction rollbacks
40 user rollbacks
50 user commits
regards
Andrea
Comment by Andrea Monti — February 18, 2013 @ 9:01 am GMT
How do you get “240 transaction rollbacks” if you posted earlier that “0 transaction rollbacks (because the transaction does not start at all)”, and the others give you just 40?
Comment by Valentin Nikotin — February 18, 2013 @ 11:41 am GMT
I was in a hurry and my first post was (partially) wrong. The right picture is :
The first loop will be responsible for 200 ORA-1 errors which should turn into:
200 transaction rollbacks
0 user rollbacks
200 rollback changes – undo records applied (to “repair” the ORA-1 errors on the index only: we did no change on the base table, thus we have no undo record applied on the base table)
The second loop will be responsible for 40 insert+rollback which should turn into:
40 transaction rollbacks
40 user rollbacks (since these are explicit rollbacks)
80 rollback changes – undo records applied (40 changes on the index and 40 changes on the table)
The third loop will be responsible for 50 insert+commit which should turn into:
0 transaction rollbacks
50 user commits
0 rollback changes – undo records applied
The fourth loop will be responsible for 70 insert+rollback to savepoint which should turn into:
0 transaction rollbacks (the “rollback to savepoint” does not close the transaction even if we rollback to the “starting” savepoint)
0 user rollbacks (same as before)
140 rollback changes – undo records applied (70 changes on the index and 70 changes on the table)
We should thus get
420 rollback changes – undo records applied
240 transaction rollbacks
40 user rollbacks
50 user commits
(200+40+50) = 290 closed transactions from the first three loops,
1 open transaction from the last loop
40 / (40+50) = 44% “Rollback per transaction %”, since we should only account for “user commits” and “user rollbacks” when evaluating this metric
Next time I’ll double-check my guesses. Regards, Andrea
Comment by Andrea Monti — February 18, 2013 @ 1:39 pm GMT
50 user commits
40 user rollbacks
240 transaction rollbacks = 200 + 40
420 rollback changes – undo records applied = 200 + (40 + 70) * 2 (for index and table)
0.137 Rollback per transaction % = 40 / (200 + 40 + 50 + 1)
Comment by Valentin Nikotin — February 18, 2013 @ 9:14 am GMT
Comment by Mohamed Houri — February 18, 2013 @ 9:39 am GMT
A small correction for user commits.
Comment by Mohamed Houri — February 18, 2013 @ 9:54 am GMT
I’ve now updated the post with my results and a few comments.
Comment by Jonathan Lewis — February 18, 2013 @ 9:52 pm GMT
Jonathan,
Unless I missed something, I found a different behavior for the “transaction rollbacks” statistic when using DML error logging: the statistic seems not to be incremented in this particular case. The column id is the primary key of table t1, and id = 1 has already been inserted into t1. The test was conducted under Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bit.
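A minimal sketch of the kind of test I mean (err$_t1 is the default error table name that dbms_errlog generates; the tag text is arbitrary):

execute dbms_errlog.create_error_log('T1')

insert into t1 values(1, 1, 'x', 'x')
log errors into err$_t1 ('dup key test') reject limit unlimited;

-- the duplicate row is written to err$_t1, the insert completes with
-- 0 rows inserted, and "transaction rollbacks" is not incremented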
Comment by Houri — February 19, 2013 @ 8:31 am GMT
The same for forall save exceptions:
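Something along these lines (a sketch – the collection values are illustrative, all colliding with an existing key):

declare
        type num_t is table of number;
        m_id            num_t := num_t(1, 1, 1);
        bulk_errors     exception;
        pragma exception_init(bulk_errors, -24381);
begin
        forall i in 1..m_id.count save exceptions
                insert into t1 values(m_id(i), i, 'x', 'x');
exception
        when bulk_errors then
                dbms_output.put_line(
                        sql%bulk_exceptions.count || ' rows rejected'
                );
end;
/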
Comment by Valentin Nikotin — February 19, 2013 @ 9:18 am GMT
Valentin,
Thanks for the demo; if you hadn’t come up with this example, I would have suggested it as another area for Mohamed to look at.
One of the big problems I find with Oracle – as soon as I start looking at something a little bit too closely I keep finding more and more cases that ought to be examined at the same time. Are there any differences regarding recursive call counts between the bulk exceptions strategy and the error logging strategy, for example? And does this change with version?
I thought I had a blog with some quick demos of the interesting side effects and costs of error logging – but it turns out that the thing I had in mind was an article that Mark Rittman wrote about it nearly 8 years ago, with a little 11g follow-up reference a couple of years later.
Comment by Jonathan Lewis — February 19, 2013 @ 10:59 am GMT
Mohamed,
Thanks for the example. I think the results are as I would have guessed. Having covered this case, it would be interesting to extend the checks to (a) multiple indexes, not all unique, and (b) multiple rows rejected, including the first, middle, and last rows of an insert as select. It would also be interesting to take a note of how the logging action affected the number of recursive calls.
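For (b), something like this sketch would do – the first, a middle, and the last generated row collide with existing keys (1, 1500, 3000) while the rest are new:

insert into t1
select
        decode(rownum, 1, 1, 5, 1500, 9, 3000, 5000 + rownum),
        rownum, 'x', 'x'
from    all_objects
where   rownum <= 9
log errors into err$_t1 ('multi-reject') reject limit unlimited;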
Comment by Jonathan Lewis — February 19, 2013 @ 10:51 am GMT
Jonathan,
I have tried what you suggested and here are the results:
The model is:
Run without DML error logging
Run with DML error logging
When I changed my DML error logging script a little in order to use a direct path load, I got the following results:
As long as DML error logging is able to catch the error and let the insert continue, there seem to be no transaction rollbacks recorded. But when you hit a restriction of DML error logging (such as a unique constraint violation during a direct path load) the transaction rollbacks statistic starts to be incremented.
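For reference, the direct path variation is something like this sketch – a unique constraint violation is one of the documented restrictions on error logging under direct path, so the statement still fails and is rolled back:

insert /*+ append */ into t1
select  id, n1, v1, padding             -- re-selecting existing rows: guaranteed duplicates
from    t1
where   rownum <= 10
log errors into err$_t1 ('direct path') reject limit unlimited;

-- fails with ORA-00001 despite the log errors clause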
Comment by Houri — February 20, 2013 @ 8:27 am GMT
Hello Jonathan,
Let me share one more test case about almost the same:
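Something along these lines (a sketch, reconstructed from the numbers discussed below – 3,000 single-row inserts, each failing on the duplicate key):

begin
        for i in 1..3000 loop
                begin
                        insert into t1 values(1, i, 'x', 'x');  -- id = 1 already exists
                exception
                        when dup_val_on_index then null;
                end;
        end loop;
end;
/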
That gives on my database:
One more interesting fact is a SQL statistic in this case:
It seems Oracle treats all the recursive activity (internal rollbacks, related undo lookups, etc.) as CPU activity, which is probably not quite correct in this case.
Comment by dmitryremizov — January 24, 2017 @ 5:59 pm GMT
dmitry,
I’m not sure what other time you might expect to find in v$sql – the insert is going to do some buffer visits, generate redo, rollback the table change; it’s almost all going to be CPU time unless you keep flushing the buffer cache or try running concurrent copies of the same code.
The interesting number is the “recursive calls” – I can account for 18,000 of them, but there are 9,000 that I can’t account for. (The 18,000 is the exec/close x 3,000 for the insert – 6,000 calls – plus parse/exec/fetch/close x 3,000 – 12,000 calls – for a query to acquire the constraint name for the index that caused the duplicate key error, which shouldn’t have been needed, of course.)
The difference seems to be 3 per insert – so maybe a hidden parse/exec/close, or exec/fetch/close, but nothing obvious shows up in the library cache.
Comment by Jonathan Lewis — January 26, 2017 @ 9:33 am GMT
Hello Jonathan,
I would expect to see at least some I/O activity reflected in the v$sql stats. I saw several (many) visits to the undo (rollback) segments in v$sesstat and v$session_event (db file sequential read) during the execution; however, I see almost nothing in v$sql.user_io_wait_time.
PS
The test has been performed on a quite busy system.
Comment by dmitryremizov — March 3, 2017 @ 6:15 am GMT
The number of undo blocks used should have been fairly small – maybe 8; and if you ran your test immediately after the setup then the table and its index would have been cached. Some of your db file sequential reads might have been related to parse calls, some to reading the constraint definition (possibly multiple times if the system was busy) – and neither of those would be attributed to the execution of the INSERT.
The trunc() may have made the most important difference – especially if you also had tracing enabled: I ran your test a few times and finally got a little over 1% of the time as user_io_wait time, but I’m running off SSD so all file I/O time was very fast.
Comment by Jonathan Lewis — March 3, 2017 @ 3:08 pm GMT
Thank you, Jonathan,
Understood.
PS
I have tried to create a test case for a real production situation I observed some time ago, but it seems this particular test case is not relevant.
Comment by dmitryremizov — March 5, 2017 @ 6:48 am GMT
Hi Jonathan, very interesting post.
Here is an example of statistics on a production database during 60 minutes.
I was very surprised by the high volume of “undo records applied” in comparison to the low “transaction rollbacks”.
What could be the explanation for such a surprising statistical profile?
Reading your post, could the explanation be a large number of SQL statements failing on an Oracle error, like your example (insert with dup-key error)?
Regards
Comment by Jean-Yves — May 10, 2017 @ 10:49 am BST
Jean-Yves,
There could be all sorts of reasons dependent on what the application does.
The simplest reason would be that a single large insert or update affecting several indexes failed and rolled back.
In the right circumstances, with or without savepoints, it’s possible for pl/sql loops that do single row processing to generate lots of “undo records applied” without showing any user or transaction rollbacks; so the fact that you’ve also got 1.16M user commits might be a clue that two large-scale processes doing lots of single row processing have collided, leaving one of them applying lots of undo because the other has beaten it to creating identical (unique) rows.
Comment by Jonathan Lewis — May 10, 2017 @ 12:38 pm BST
Thank you Jonathan, I will look in this direction.
Comment by Jean-Yves — May 11, 2017 @ 8:33 am BST
[…] Footnote 2: Is anyone getting worried by my “first approximation” comment, and has anyone started wondering if there may be more to transaction rollbacks than user rollbacks, and if so where they fit into the arithmetic ? To be continued. [Late update – here’s part 2] […]
Pingback by Transactions | Oracle Scratchpad — July 10, 2019 @ 6:16 pm BST