Here’s a deadlock graph the appeared on Oracle-L and OTN a couple of days ago.
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-001a0002-0002a0fe 196 197 X 166 1835 S
TM-0000c800-00000000 166 1835 SX 196 197 SX SSX
It’s a little unusual because instead of the common TX mode 6 (eXclusive) crossover we have one TX and one TM lock, the TX wait is for mode 4 (S) and the TM wait is for a conversion from 3 (SX) to 5 (SSX).
The modes and types give us some clues about what’s going on: TX/4 is typically about indexes involved in referential integrity (though there are a couple of more exotic reasons such as wait for ITLs, Freelists or tablespace status change); conversion of a TM lock from mode 3 to mode 5 is only possible (as far as I know) in the context of missing foreign key indexes when you delete a parent row.
Here’s a simple data set to help demonstrate the type of thing that could have caused this deadlock:
drop table child; drop table parent; create table parent ( id number(4), name varchar2(10), constraint par_pk primary key (id) ) ; create table child( id_p number(4), id number(4), name varchar2(10), constraint chi_pk primary key (id, id_p), constraint chi_fk_par foreign key(id_p) references parent on delete cascade ) ; insert into parent values (1,'Smith'); insert into parent values (2,'Jones'); insert into child values(1, 1, 'Simon'); insert into child values(2, 1, 'Janet'); commit;
Note that I have define the primary key on the child the “wrong way round”, so that the foreign key doesn’t have a supporting index. Note also that the foreign key constraint is defined as ‘on delete cascade’ – this isn’t a necessity, but it means I won’t have to delete child rows explicitly in my demo.
Now we take the following steps:
Session 1: delete from parent where id = 1;
This will delete the child row – temporarily taking a mode 4 (S) lock on the child table – then delete the parent row. Both tables will end up locked in mode 3.
Session 2: insert into child values (1,2,'Sally');
This will lock the parent table in mode 2, lock the child table in mode 3, then wait with a TX mode 4 for session 1 to commit or rollback. If session 1 commits it will raise Oracle error: “ORA-02291: integrity constraint (TEST_USER.CHI_FK_PAR) violated – parent key not found”; if session 1 rolls back the insert will succeed.
Session 1: delete from parent where id = 2;
This will attempt to lock the child table in mode 4, find that there it already has the child locked in mode three (which is incompatible with mode 4) and therefore attempt to convert to mode 5 (SSX). This will make it queue, waiting for session 2 to commit.
Three seconds later session 2 (the first to start waiting) will timeout and report a deadlock with the follow deadlock graph:
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-00015818-00000000 14 371 SX 17 368 SX SSX
TX-0009000e-000054ae 17 368 X 14 371 S
session 371: DID 0001-000E-00000018 session 368: DID 0001-0011-00000005
session 368: DID 0001-0011-00000005 session 371: DID 0001-000E-00000018
Rows waited on:
Session 368: no row
Session 371: no row
Session 368:
pid=17 serial=66 audsid=2251285 user: 52/TEST_USER
O/S info: user: HP-LAPTOPV1\jonathan, term: HP-LAPTOPV1, ospid: 2300:3528, machine: WORKGROUP_JL\HP-LAPTOPV1
program: sqlplus.exe
application name: SQL*Plus, hash value=3669949024
Current SQL Statement:
delete from parent where id = 2
End of information on OTHER waiting sessions.
Current SQL statement for this session:
insert into child values(1,2,'new')
You’ll notice that there are no rows waited for – session 1 isn’t waiting for a row it’s waiting for a table and session 2 isn’t waiting for a table row it’s waiting for an index entry.
Footnote: There are several variations on the theme of one session inserting child rows when the other session has deleted (or inserted) the parent. The uncommitted parent change is an easy cause of the TX/4; the delete with unindexed foreign key is a necessary cause of the SX -> SSX.

> … session 1 (the first to start waiting) …
Isn’t session 2 the first to start waiting?
And is this deterministic – the first session to start waiting will be the one detecting the deadlock? As far as I know (from somewhere), every session timing out on an enqueue wait will check for a deadlock before going to sleep again. Now, in almost all cases the first session to start waiting will be the first to time-out, but there’s a difference between being runnable and actually running, so it seems possible that the second session to start waiting might end up being the one detecting the deadlock.
Not that this matters for the topic of the post, I’m just curious.
Comment by Flado — August 30, 2011 @ 10:31 am UTC Aug 30,2011 |
Flado, correct – session 2 is the first to start waiting, and it is the one that will raise the deadlock. (Now corrected.)
I would like to think that the “1″ was just a case of missing the key while typing, and not noticing the error.
Technically I suppose it is possible for the two waiters to start running in the wrong order – but since they go off the run queue in order they will presumable go back on the run queue in order: but on a multi-CPU machine is there one run queue that services all CPUs or one runqueue per CPU ? In the latter case if process one goes onto the run queue of CPU A and process two goes onto the run queue of CPU B then I guess process two could get to the top of its run queue before process one.
Comment by Jonathan Lewis — September 1, 2011 @ 4:16 pm UTC Sep 1,2011 |
It is the session(session 2) doing the insert that will time out and will report the deadlock error
mhouri.world> insert into child values (1,2,'Sally'); insert into child values (1,2,'Sally') * ERROR at line 1: ORA-00060: deadlock detected while waiting for resourceBest regards
Comment by Houri Mohamed — August 31, 2011 @ 8:20 am UTC Aug 31,2011 |
Mohamed,
Now corrected. Thanks
Comment by Jonathan Lewis — September 1, 2011 @ 4:17 pm UTC Sep 1,2011 |
[...] From Jonathan Lewis’ Oracle Scratchpad blog. [...]
Pingback by Deadlock – All Things Oracle — September 8, 2011 @ 5:10 pm UTC Sep 8,2011 |
What would be a reason for ITL waits on an index segment?. Especially if the index is on a child table on a FK constraint column.
This ITL waits showingup while concurrent inserts into the child table.
Also can FK indexes can be reverse key? Would that have any negative impact (TM locks )
Thanks
Balaji.
Comment by Balaji — October 16, 2011 @ 5:00 pm UTC Oct 16,2011 |
Balaji,
If all you were doing was inserting rows into a child table with an enabled foreign key that was protected by a suitable index, I wouldn’t have expected to see ITL waits. As far as I know if there are no free ITL entries in a leaf block the process that wants one will simply intitiate a leaf block split. I could probably come up with a couple of scenarios to check if necessary – e.g. two sessions wanting to insert for the same parent key simultaneously when there are no free ITLs (though I might expect a buffer busy wait or “enq: TX – index contention”); or two leaf blocks under the same branch block having to split simuiltanously and causing contention for the branch ITL entry (although again I think I’d expect “Enq: TX – index contention”).
I haven’t tested FKs with reverse key indexes – I can think on no reason why they would cause any problems. It’s easy to test though, and you could apply the test case to different versions of Oracle because it’s often the case that a detail that ought to work simply doesn’t work on the first release. (e.g. you couldn’t do index fast full scans on bitmap indexes in the first release where index fast full scans appeared.)
Comment by Jonathan Lewis — November 13, 2011 @ 12:36 pm UTC Nov 13,2011 |
I’ve just had an exchange of email with Tom Kyte about a thread on AskTom that quoted the previous comment, specifically the bit:
When I wrote this I was writing in the context of the question, of course, which was all about inserting rows, not about updating or deleting rows; so when Tom emailed me about it my immediate reply was that it didn’t apply to deletes or updates (and rememmber that an update is implemented at the index level as a delete followed by an insert).
However, as I sat down to write a short note to add to this blog, I suddenly realised that I wasn’t sure that I had told Tom the truth – so I did a quick search of my working directory for any scripts with the words “index split delete”, and found a script called (appropriately enough) ind_split_del.sql that I had written a few years ago to demonstrate a couple of variations of how a block split could occur during a delete that wasn’t able to find or create an ITL entry in the leaf block. (I may write this up some day – it’s another of the edge cases where an index can become larger than it should because of concurrent activity.)
If you’re wondering how Oracle can split a leaf block when it can’t find an ITL to change the block, by the way, you need to know that Oracle reserves the first ITL entry in a leaf block for “service” transactions – viz: block splits. (See this blog note for a couple of comments about service transactions and service ITLs as they relate to branch blocks)
This does leave unanswered the question of how you can get ITL waits on an index – but I haven’t tested many variations of deletions, or mixing inserts and deletes; and it’s always possible that the waits appear if two deletes start on the same leaf block at the same time and both need to split the block. There may be cases were the waits relate to branch block contention if two sessions try to split two different leaf blocks under the same branch at the same time. Given that we know what it means to wait for an ITL, it’s relatively easy to come up with scenarios where they may occur – unfortunately it takes time (and sometimes a lot of patience) to create and run tests for those scenarios.
Comment by Jonathan Lewis — November 28, 2011 @ 5:26 pm UTC Nov 28,2011 |