Oracle Scratchpad

March 20, 2013

Lock Bug

Filed under: Bugs,deadlocks,Locks,Oracle — Jonathan Lewis @ 12:08 pm BST Mar 20,2013

Here’s an oddity that I ran into a little while ago while trying to prepare a sample trace file showing a particular locking pattern; it was something that I’d done before, but trace files can change with different versions of Oracle so I decided to use a copy of 11.2.0.2 that happened to be handy at the time to check if anything had changed since the previous (11gR1) release. I never managed to finish the test; here are the steps I got through:


-- created a table t1 to work with, no data needed, column definitions don't matter.

Session 1:
lock table t1 in row exclusive mode;        -- mode 3, lock acquired

Session 2:
lock table t1 in row exclusive mode;        -- mode 3, lock acquired

Session 3:
lock table t1 in exclusive mode;            -- mode 6, goes into wait for session 1

Session 1 (again):
lock table t1 in share row exclusive mode;  -- mode 5, should move to converter queue, wait for session 2

The mode 3 to mode 5 conversion is what happens if you have a foreign key referential integrity constraint without a covering index and run code that does something like “delete child rows for parent X, delete parent X” (which also happens under the covers if your foreign key is declared as “on delete cascade”).

Notice the “should” on the lock conversion line – this is exactly what happens in 10.2.0.3 and 11.2.0.3; but on the system I was using session 1 got an immediate deadlock (ORA-00060) error – in the absence of a deadlock scenario !

My problem is this – the instance that gives the deadlock error is 64-bit 11.2.0.2 running RAC on Linux (OEL 5); the instances that don’t give the error are 32-bit, non-RAC, running on Windows XP Pro. SO is the anomaly due to some difference in:

  • Windows vs. Linux
  • 32 bit vs. 64 bit
  • RAC vs. non-RAC
  • Specific version of Oracle

I’m inclined to think it’s a version dependent bug, but it’s possible that it’s a necessary side effect of RAC. So if you’ve got 11.2.0.3 RAC, or 11.2.0.2 non-RAC, or the terminal releases of 10g and 11.1 whether RAC or non-RAC, I’d like to hear from you which versions – if any – produce the same deadlock and which don’t. (And the scientific method being what it is, anyone with 11.2.0.2 on RAC on Linux might like to confirm – or contradict – my result.)

Update 8:30 p.m. GMT

The results in so far tend to support the idea that this may be a RAC-related issue. Justin Mitchell’s result is particularly interesting, and I confirmed on my 11.2.0.2 RAC system that if the two starting sessions are on different nodes then I don’t get the deadlock behaviour.

Users of RAC will probably be aware of the fact that v$lock.block commonly sets itself to 2 (potential blocker) on RAC the moment you acquire the lock; and this is what happens to both the “row exclusive” locks. When we request the exclusive lock, both TM locks change from block=2 to block=1 if they are on the same node; if they are on different nodes then only one of them changes.

Unfortunately¬†Oleksandr didn’t see a deadlock when he tried RAC 11.2.0.3 – and since he didn’t make any comment about using multiple nodes we will have to wait to see if he can repeat the test and tell us whether his results are consistent with Justin’s, or whether they suggest that there are further considerations in effect.

13 Comments »

  1. I was unable to reproduce this behaviour neither on 11.2.0.2 non-RAC, nor 11.2.0.3 RAC (both 64-bit Linux).

    Comment by Oleksandr alesinskyy — March 20, 2013 @ 1:06 pm BST Mar 20,2013 | Reply

    • Oleksandr,

      Thanks for that note. When you ran the 11.2.0.3 RAC test were the first two sessions running on the same node or different nodes ? The results from Justin Mitchell below indicate that it might make a difference.

      Comment by Jonathan Lewis — March 20, 2013 @ 8:29 pm BST Mar 20,2013 | Reply

  2. Hi Jonathan,
    two node rac 11.2.0.3.5 (latest PSU) OEL 6.2 64bit and session 1 gets

    SQL> lock table t1 in share row exclusive mode;
    lock table t1 in share row exclusive mode
    *
    ERROR at line 1:
    ORA-00060: deadlock detected while waiting for resource
    

    Not sure how to format code so only conclusion showed .
    Regards
    GregG

    Comment by goryszewskig — March 20, 2013 @ 1:36 pm BST Mar 20,2013 | Reply

  3. I was able to get the same deadlock scenario reproduced in a 11.2.0.3 on a 3 node OEL RAC as well. However, the placement of the sessions seems to make a difference as well:

    session1@node1
    session2@node1
    session3@node1
    = Deadlock scenario as above

    session1@node1
    session2@node1
    session3@node2
    = Deadlock scenario as above

    session1@node1
    session2@node2
    session3@node2
    = No deadlock, session 1 waits on session 2

    I’ll try to get some additional testing in on some other systems a bit later.

    Comment by Justin Mitchell — March 20, 2013 @ 1:51 pm BST Mar 20,2013 | Reply

    • Testing on a 11.2.0.2 2-node RAC system yielded the same results as my earlier 11.2.0.3 test, with deadlocks when session 1 and session 2 are on the same node, and waits when session 1 and 2 are on different nodes.

      Comment by Justin Mitchell — March 21, 2013 @ 1:25 am BST Mar 21,2013 | Reply

  4. I was unable to reproduce this on 11.2.0.2.2 non-RAC on HP-UX Itanium.

    Comment by David Leinbach — March 20, 2013 @ 1:57 pm BST Mar 20,2013 | Reply

  5. I have trialled your experiment on Oracle 11.2.0.1 non RAC, 64bit OEL 5.5 and can confirm that no deadlock was observed, session 1 is waiting on session 2.

    Comment by Tony Sleight — March 20, 2013 @ 3:43 pm BST Mar 20,2013 | Reply

  6. Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 – 64bit Production
    on OEL 5 ( Linux 2.6.18-164.el5 #1 SMP )
    non-RAC

    no deadlock:

     sid TY     ID1 id 1       ID2 id 2   LMODE  REQUEST  CTIME  BLOCK
    ---- -- ------- --------- ---- ----- ------ -------- ------ ------
      44 TM   86501 SCOTT.T1     0            0        6     47      0
      31 TM   86501 SCOTT.T1     0            3        0     67      1
      17 TM   86501 SCOTT.T1     0            3        5    443      1
    

    Comment by Wolfgang — March 20, 2013 @ 4:17 pm BST Mar 20,2013 | Reply

  7. On 11.2.0.2 non-RAC 64-bit Linux I get a deadlock.
    Same outcome with sessions on the same nodes on 11.2.0.3. RAC, also on 64-bit Linux

    Comment by Dom Brooks — March 20, 2013 @ 8:35 pm BST Mar 20,2013 | Reply

  8. No deadlock observed on a NON RAC 10.2.0.4.0 database

    Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
    PL/SQL Release 10.2.0.4.0 - Production
    CORE    10.2.0.4.0      Production
    TNS for Solaris: Version 10.2.0.4.0 - Production
    
    SQL> start c:\getlocks2
    
           SID       WSID LOCK_TYPE            MODE_HELD       MODE_REQUESTED
    ---------- ---------- -------------------- --------------- ---------------
           305        305 DML                  ROW-X (SX)      S/ROW-X (SSX)
           305        282 DML                  ROW-X (SX)      EXCLUSIVE
           340        305 DML                  ROW-X (SX)      S/ROW-X (SSX)
           340        282 DML                  ROW-X (SX)      EXCLUSIVE
    

    Comment by Houri — March 21, 2013 @ 2:00 pm BST Mar 21,2013 | Reply

  9. I’m not Oleksandr Alesinskyy, but we are also running RAC 11.2.0.3 on OEL5.6 (2 node)

    Deadlock if all is done on same RAC node

    No deadlock if session 2 is started on the other RAC node.

    Comment by Wessel van Norel — March 21, 2013 @ 5:00 pm BST Mar 21,2013 | Reply

  10. ORA-00060: deadlock detected while waiting for resource

    Two node RAC running on SunOS 5.10 Generic_144488-17 sun4v sparc SUNW,T5440.

    Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
    PL/SQL Release 11.2.0.3.0 – Production
    CORE 11.2.0.3.0 Production
    TNS for Solaris: Version 11.2.0.3.0 – Production
    NLSRTL Version 11.2.0.3.0 – Production

    Comment by nakinov — April 5, 2013 @ 10:03 am BST Apr 5,2013 | Reply

  11. I am sorry Jonathan, I forgot to write that the test I did was all on same instance. However I did two other test by mixing the instances and the results are bellow.

    Test#1 – All on same instance
    — ORA-00060: deadlock detected while waiting for resource
    Instance#1
    lock table t1 in row exclusive mode;

    Instance#1
    lock table t1 in row exclusive mode;

    Instance#1
    lock table t1 in exclusive mode;

    Instance#1
    lock table t1 in share row exclusive mode;

    Test#2 – Mixing instances
    — No ORA-600
    Instance#1
    lock table t1 in row exclusive mode;

    Instance#2
    lock table t1 in row exclusive mode;

    Instance#1
    lock table t1 in exclusive mode;

    Instance#1
    lock table t1 in share row exclusive mode;

    Test#3 – Mixing instances
    — ORA-00060: deadlock detected while waiting for resource
    Instance#1
    lock table t1 in row exclusive mode;

    Instance#2
    lock table t1 in row exclusive mode;

    Instance#2
    lock table t1 in exclusive mode;

    Instance#1
    lock table t1 in share row exclusive mode;

    Comment by nakinov — April 5, 2013 @ 10:21 am BST Apr 5,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,266 other followers