Oracle Scratchpad

April 13, 2013

Deadlocks

Filed under: deadlocks,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 11:36 am BST Apr 13,2013

Here’s a deadlock graph that might cause a little confusion:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
 Resource Name          process session holds waits  process session holds waits
 TX-001a0015-00014787        34      90     X             32       3           S
 TX-00190008-0000601b        32       3     X             34      90           S

 session 90: DID 0001-0022-00000327      session 3: DID 0001-0020-000009E9
 session 3: DID 0001-0020-000009E9       session 90: DID 0001-0022-00000327

 Rows waited on:
  Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
  (dictionary objn - 199909, file - 6, block - 276654, slot - 0)
  Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
  (dictionary objn - 199909, file - 6, block - 433952, slot - 0)


Both sessions are holding X and waiting S on a TX lock. There are several well-known reasons why you might see a share (mode 4) lock on a transaction table slot: collisions on bitmap indexes, unique indexes, index organized tables, or referential integrity checks are the commonest “external” examples, and problems with interested transaction lists (ITLs) or freelists are the common “internal” ones – so the presence of the waits in share mode shouldn’t, of themselves, be a source of confusion.

The confusion is in the reported rowids. If you try to interpret them as real rowids you may be unlucky and discover that they seem to be related to the SQL reported for the deadlocked sessions when really the information they hold is garbage. (Just to avoid confusion, let me make it clear that there are cases where the rowids reported definitely WILL be garbage; on the other hand there will be some cases where the rowids are relevant – slot 0 is, after all, a legal slot for a row in a block.)

When I see a deadlock graph on transaction locks and the waits are for S mode I tend to assume that the information about the rows waited on is probably misleading; when the slot number for the rowid is zero this increases my confidence that the rowid is rubbish. (Zero is a legal value for a rowid slot so a zero doesn’t prove that the rowid is rubbish, it’s just a coincidence that allows me to continue following a hypothesis.)

The problem is that Oracle doesn’t waste resources tidying up after itself, and in the case of deadlock graphs this laziness shows up in the rowids reported. The trace is simply reporting whatever happens to be in the row_wait_obj#, row_wait_file#, row_wait_block# and row_wait_row# columns of v$session; and if the waiting process hasn’t updated these columns with current row information you could be looking at the details of the last row (or block) that the session waited for. Here’s the description of a test to demonstrate the behaviour:

create table t1 (n1 number, n2 number);
insert into t1 values(1,1);
create unique index t1_i1 on t1(n1);
create unique index t1_i2 on t1(n2);

session 1:	insert into t1 values(2,11);
session 2:	insert into t1 values(3,21);
session 1:	insert into t1 values(4,21);
session 2:	insert into t1 values(5,11);

With this table, and sequence of events following it, session 1 raised a deadlock error, and dumped the following trace:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-000a0021-0000c4aa        16      93     X             13      90           S
TX-00020020-0000fdb9        13      90     X             16      93           S

session 93: DID 0001-0010-00000057	session 90: DID 0001-000D-000000F2
session 90: DID 0001-000D-000000F2	session 93: DID 0001-0010-00000057

Rows waited on:
Session 90: obj - rowid = 00000009 - AAAAAJAABAAAQJcAAA
  (dictionary objn - 9, file - 1, block - 66140, slot - 0)
Session 93: obj - rowid = 0002E7DC - AAAufaAAFAAAAAJAAA
  (dictionary objn - 190428, file - 5, block - 9, slot - 0)

Object 9 is the I_FILE#_BLOCK# index in the data dictionary – and session 90 is definitely not doing anything with that object in this transaction.

Object 190428 is another table in the test schema, but session 93 didn’t access it in this transaction, and the block referenced is the segment header block, not a block that could hold a row.

In fact, just before I started the sequence of inserts I ran this query from a third session (connected as sys) with the following results:

select
        sid,
        row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
from
        v$session
where
        sid in (90,93)
;

       SID ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
---------- ------------- -------------- --------------- -------------
        90             9              1           66140             0
        93        190428              5               9             0

The “rows” reported for the deadlock simply echoed the values that were already in the row_wait columns before the test started. (The sessions were waiting on “SQL*Net message from client” at the time.)

Footnote:

Session 90 created the tables and indexes, that’s probably why it happened to have an outstanding reference to the i_file#_block# index.

Session 93 had just run a script to drop all the objects in the schema, which may explain why it happened to have an outstanding reference to a segment header block.

If the row_wait_obj# had been set to -1 for either session then the deadlock graph would have reported “No row” for that session.

9 Comments »

  1. […] Despite what Jonathan Lewis wrote in Deadlocks: […]

    Pingback by Technical Note on Blocking Using INSERT – Yet Another OCM — April 14, 2013 @ 12:50 pm BST Apr 14,2013 | Reply

  2. Thank you!

    Comment by joelpatt — April 15, 2013 @ 8:02 pm BST Apr 15,2013 | Reply

  3. Hello,

    Thank you very much for your note.

    I had the same deadlock graph as yours: TX X S; TX X S with 2 rows waited on having slot – 0.

    Further more, the two sessions were both trying to execute the same query template: Update Table1 set last_update=:1 by request_id:2

    Do you have any idea what could be the reason for that kind of deadlock and what can we do to eliminate it please?

    Thank you !

    Bang

    Comment by docongbang — March 5, 2020 @ 2:59 pm GMT Mar 5,2020 | Reply

    • The first step is to make sure you have identified exactly the right SQL statements.

      The trace file should have a section labelled something like “Current SQL Statement for this session” – the exact text may vary with version (which you didn’t supply), and there should also be a section labelled something like: “Information for the OTHER waiting sessions” and in that section something saying: “current SQL:” followed by its SQL statement.

      Having the right statements may help. I’ve supplied a short list of common reasons for a wait on TX mode 4 in the paragraph after the first deadlock graph: “collisions on bitmap indexes, unique indexes, index organized tables, or referential integrity checks are the commonest “external” examples” so your first check should be based on seeing if anything about your DML could be related to those 4 options.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — March 5, 2020 @ 3:31 pm GMT Mar 5,2020 | Reply

  4. […] Rows waited on (April 2013): be a little cautious about “rows waited on” information following a deadlock graph; the figures may be held over from a previous operation. […]

    Pingback by Lock Catalogue | Oracle Scratchpad — February 19, 2022 @ 10:34 am GMT Feb 19,2022 | Reply

  5. I encountered the same deadlock graph whilst debugging an issue in an Oracle package we have – holds X waits S with the “slot – 0” stuff.Wwe have parallel tasks deleting from multiple tables (no collision in what each task was deleting).

    Out of the multitude of tables we delete from, we hit a deadlock with only one in particular. The only thing that was different about this table was that the primary key constraint was backed up by a non-unique index (not by intention). It was this index which the deadlock graph pointed to when I looked up the data object id for the sessions. So far re-creating the PK with a corresponding unique index hasn’t seen the deadlocking issue re-occur.

    I’m having trouble understanding why the PK backed by a non-unique index could have been the source of the issue though. If you have any thoughts on this I’d love to hear them.

    Thanks, grateful for this website and it’s articles.

    Comment by Randeep Singh — August 9, 2022 @ 4:58 pm BST Aug 9,2022 | Reply

    • Randeep,

      Thanks for the comment.

      The first thing is to be reasonably certain that the disappearing deadlocks were due to the index being non-unique. It’s very easy to assume that the change you “chose” to make was the trigger to an effect when the effect was actually due to a side effect of the change. (e.g. maybe if you’d simply rebuilt the index without changing to a unique index it might have had the same success, and maybe the problem will come back in a few weeks time.)

      Was the problem happening fairly regularly before the rebuild, or only intermittently – it might be a problem relating to non-unique indexes for PKs, but it might require such a special coincidence (e.g. two processes competing for adjacent values that are actually in (logically) adjacent leaf blocks and the session locking the higher value gets in first).

      Off the top of my head I can’t think of a reason why you might see deadlocks in this instance, but there are some very odd things that happen with deferrable constraints (especially in the PK/FK area) so if you want to email me the dump file generated by the ORA-00060, along with details of tables, indexes and constraints that treat this table as the parent, I’ll take a look at it.

      You didn’t say which version of Oracle – and that might be significant.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — August 10, 2022 @ 1:04 pm BST Aug 10,2022 | Reply

  6. […] are plenty of notes on the web these days to tell you that TX mode 4 relates in some way to a unique index (or some associated referential integrity) or an ITL wait. (Inevitably there are a couple of other […]

    Pingback by Deadlocks | Oracle Scratchpad — October 31, 2022 @ 9:15 am GMT Oct 31,2022 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.