Here’s a deadlock graph that might cause a little confusion:
Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-001a0015-00014787        34      90     X             32       3           S
TX-00190008-0000601b        32       3     X             34      90           S

session 90: DID 0001-0022-00000327	session 3: DID 0001-0020-000009E9
session 3: DID 0001-0020-000009E9	session 90: DID 0001-0022-00000327

Rows waited on:
  Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
  (dictionary objn - 199909, file - 6, block - 276654, slot - 0)
  Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
  (dictionary objn - 199909, file - 6, block - 433952, slot - 0)
Both sessions are holding X and waiting S on a TX lock. There are several well-known reasons why you might see a share (mode 4) lock on a transaction table slot: collisions on bitmap indexes, unique indexes, index organized tables, or referential integrity checks are the commonest “external” examples, and problems with interested transaction lists (ITLs) or freelists are the common “internal” ones – so the presence of the waits in share mode shouldn’t, of itself, be a source of confusion.
The confusion is in the reported rowids. If you try to interpret them as real rowids you may be unlucky and discover that they seem to be related to the SQL reported for the deadlocked sessions when really the information they hold is garbage. (Just to avoid confusion, let me make it clear that there are cases where the rowids reported definitely WILL be garbage; on the other hand there will be some cases where the rowids are relevant – slot 0 is, after all, a legal slot for a row in a block.)
When I see a deadlock graph on transaction locks and the waits are for S mode I tend to assume that the information about the rows waited on is probably misleading; when the slot number for the rowid is zero this increases my confidence that the rowid is rubbish. (Zero is a legal value for a rowid slot so a zero doesn’t prove that the rowid is rubbish, it’s just a coincidence that allows me to continue following a hypothesis.)
The problem is that Oracle doesn’t waste resources tidying up after itself, and in the case of deadlock graphs this laziness shows up in the rowids reported. The trace is simply reporting whatever happens to be in the row_wait_obj#, row_wait_file#, row_wait_block# and row_wait_row# columns of v$session; and if the waiting process hasn’t updated these columns with current row information you could be looking at the details of the last row (or block) that the session waited for. Here’s the description of a test to demonstrate the behaviour:
create table t1 (n1 number, n2 number);
insert into t1 values (1,1);
create unique index t1_i1 on t1(n1);
create unique index t1_i2 on t1(n2);

session 1:  insert into t1 values (2,11);
session 2:  insert into t1 values (3,21);
session 1:  insert into t1 values (4,21);    -- waits: duplicate 21 in t1_i2 uncommitted
session 2:  insert into t1 values (5,11);    -- waits: duplicate 11 in t1_i2 - deadlock
With this table, and the sequence of inserts following it, session 1 raised a deadlock error and dumped the following trace:
Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-000a0021-0000c4aa        16      93     X             13      90           S
TX-00020020-0000fdb9        13      90     X             16      93           S

session 93: DID 0001-0010-00000057	session 90: DID 0001-000D-000000F2
session 90: DID 0001-000D-000000F2	session 93: DID 0001-0010-00000057

Rows waited on:
  Session 90: obj - rowid = 00000009 - AAAAAJAABAAAQJcAAA
  (dictionary objn - 9, file - 1, block - 66140, slot - 0)
  Session 93: obj - rowid = 0002E7DC - AAAufaAAFAAAAAJAAA
  (dictionary objn - 190428, file - 5, block - 9, slot - 0)
Object 9 is the I_FILE#_BLOCK# index in the data dictionary – and session 90 is definitely not doing anything with that object in this transaction.
Object 190428 is another table in the test schema, but session 93 didn’t access it in this transaction, and the block referenced is the segment header block, not a block that could hold a row.
In fact, just before I started the sequence of inserts I ran this query from a third session (connected as sys) with the following results:
select
        sid, row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
from    v$session
where   sid in (90, 93)
;

       SID ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
---------- ------------- -------------- --------------- -------------
        90             9              1           66140             0
        93        190428              5               9             0
The “rows” reported for the deadlock simply echoed the values that were already in the row_wait columns before the test started. (The sessions were waiting on “SQL*Net message from client” at the time.)
Footnote:
Session 90 created the tables and indexes; that’s probably why it happened to have an outstanding reference to the i_file#_block# index.
Session 93 had just run a script to drop all the objects in the schema, which may explain why it happened to have an outstanding reference to a segment header block.
If the row_wait_obj# had been set to -1 for either session then the deadlock graph would have reported “No row” for that session.
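As a rough check (a sketch only – the “No row” decode is my interpretation of what the trace writer does, and you need select privilege on v$session) you can see what the deadlock trace would report for a session with something like:

```sql
-- Sketch: what the "Rows waited on" section would echo for each session;
-- a row_wait_obj# of -1 is what produces "No row" in the trace.
select  sid,
        case
                when row_wait_obj# = -1
                then 'No row'
                else 'objn '   || row_wait_obj#   ||
                     ' file '  || row_wait_file#  ||
                     ' block ' || row_wait_block# ||
                     ' slot '  || row_wait_row#
        end     as row_waited_on
from    v$session
where   sid in (90, 93)
;
```

Run from a third session while the other two are idle, this makes it easy to see the “stale” values that would be echoed into a subsequent deadlock graph.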
[…] Despite what Jonathan Lewis wrote in Deadlocks: […]
Pingback by Technical Note on Blocking Using INSERT – Yet Another OCM — April 14, 2013 @ 12:50 pm BST Apr 14,2013 |
Thank you!
Comment by joelpatt — April 15, 2013 @ 8:02 pm BST Apr 15,2013 |
[…] https://jonathanlewis.wordpress.com/2013/04/13/deadlocks/ […]
Pingback by Can block contention and NOT row contention produce deadlock in Oracle ? The answer is … YES ! | Massimo Tinelli — October 28, 2014 @ 11:30 am GMT Oct 28,2014 |
Hello,
Thank you very much for your note.
I had the same deadlock graph as yours: TX X S; TX X S with 2 rows waited on having slot – 0.
Further more, the two sessions were both trying to execute the same query template: Update Table1 set last_update=:1 by request_id:2
Do you have any idea what could be the reason for that kind of deadlock and what can we do to eliminate it please?
Thank you !
Bang
Comment by docongbang — March 5, 2020 @ 2:59 pm GMT Mar 5,2020 |
The first step is to make sure you have identified exactly the right SQL statements.
The trace file should have a section labelled something like “Current SQL Statement for this session” – the exact text may vary with version (which you didn’t supply) – and there should also be a section labelled something like “Information for the OTHER waiting sessions”; in that section a line saying “current SQL:” is followed by the other session’s SQL statement.
Having the right statements may help. I’ve supplied a short list of common reasons for a wait on TX mode 4 in the paragraph after the first deadlock graph: “collisions on bitmap indexes, unique indexes, index organized tables, or referential integrity checks are the commonest ‘external’ examples” – so your first check should be to see whether anything about your DML could be related to those four options.
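To narrow down those four options, a few dictionary queries along these lines may help (a sketch only – ‘TABLE1’ stands in for the table named in the deadlocked statement, and you may need the dba_/all_ variants of the views depending on your privileges):

```sql
-- Sketch: check a table for features associated with TX mode 4 waits.
select  index_name, index_type, uniqueness
from    user_indexes
where   table_name = 'TABLE1';              -- bitmap or unique indexes?

select  iot_type
from    user_tables
where   table_name = 'TABLE1';              -- non-null means index organized

select  constraint_name, constraint_type, r_constraint_name
from    user_constraints
where   table_name = 'TABLE1'
and     constraint_type in ('P','U','R');   -- PK / unique / referential
```

Any hit from these queries gives you a candidate mechanism to investigate before looking for more exotic (ITL or freelist) explanations.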
Regards
Jonathan Lewis
Comment by Jonathan Lewis — March 5, 2020 @ 3:31 pm GMT Mar 5,2020 |
[…] Rows waited on (April 2013): be a little cautious about “rows waited on” information following a deadlock graph; the figures may be held over from a previous operation. […]
Pingback by Lock Catalogue | Oracle Scratchpad — February 19, 2022 @ 10:34 am GMT Feb 19,2022 |
I encountered the same deadlock graph whilst debugging an issue in an Oracle package we have – holds X waits S with the “slot – 0” stuff. We have parallel tasks deleting from multiple tables (no collision in what each task was deleting).
Out of the multitude of tables we delete from, we hit a deadlock with only one in particular. The only thing that was different about this table was that the primary key constraint was backed up by a non-unique index (not by intention). It was this index which the deadlock graph pointed to when I looked up the data object id for the sessions. So far re-creating the PK with a corresponding unique index hasn’t seen the deadlocking issue re-occur.
I’m having trouble understanding why the PK backed by a non-unique index could have been the source of the issue though. If you have any thoughts on this I’d love to hear them.
Thanks, grateful for this website and its articles.
Comment by Randeep Singh — August 9, 2022 @ 4:58 pm BST Aug 9,2022 |
Randeep,
Thanks for the comment.
The first thing is to be reasonably certain that the disappearing deadlocks were due to the index being non-unique. It’s very easy to assume that the change you “chose” to make was the cause of an effect when the effect was actually due to a side effect of the change. (e.g. maybe if you’d simply rebuilt the index without changing to a unique index it might have had the same success, and maybe the problem will come back in a few weeks’ time.)
Was the problem happening fairly regularly before the rebuild, or only intermittently? It might be a problem relating to non-unique indexes for PKs, but it might require such a special coincidence that it appears only rarely (e.g. two processes competing for adjacent values that are actually in (logically) adjacent leaf blocks, with the session locking the higher value getting in first).
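For reference, a quick (sketched) dictionary query to find primary or unique key constraints that are supported by non-unique indexes – run it in the owning schema, or switch to the dba_ views with an owner predicate:

```sql
-- Sketch: PK/UK constraints whose supporting index is non-unique.
select  con.table_name, con.constraint_name, con.index_name, ind.uniqueness
from    user_constraints  con,
        user_indexes      ind
where   con.constraint_type in ('P','U')
and     ind.index_name = con.index_name
and     ind.uniqueness = 'NONUNIQUE'
;
```

Note that this combination can appear by design (e.g. deferrable constraints must use non-unique indexes), so a hit isn’t automatically a mistake.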
Off the top of my head I can’t think of a reason why you might see deadlocks in this instance, but there are some very odd things that happen with deferrable constraints (especially in the PK/FK area) so if you want to email me the dump file generated by the ORA-00060, along with details of tables, indexes and constraints that treat this table as the parent, I’ll take a look at it.
You didn’t say which version of Oracle – and that might be significant.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — August 10, 2022 @ 1:04 pm BST Aug 10,2022 |
[…] are plenty of notes on the web these days to tell you that TX mode 4 relates in some way to a unique index (or some associated referential integrity) or an ITL wait. (Inevitably there are a couple of other […]
Pingback by Deadlocks | Oracle Scratchpad — October 31, 2022 @ 9:15 am GMT Oct 31,2022 |