Oracle Scratchpad

July 8, 2022

ORA-00054 pt.2

Filed under: Infrastructure,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 11:03 am BST Jul 8,2022

This is the follow-up to an initial post that covered some details of using the errorstack and ksq traces as an aid to finding the cause of an intermittent ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired. We were (hypothetically) looking at a scenario where a batch-like process would occasionally fail raising this error, leaving us to deal with an error that we could not reproduce on demand.

Recapping the previous article, we saw that we could set a system-wide call to dump an errorstack at level 1 whenever the error occurred, producing a trace file containing the statement that had raised the error along with its SQL ID and a call stack that would allow us to find the object_id of the specific object that had caused the lock conflict. Once we had the SQL ID we then had the option to set a system-wide call to dump the ksq (Kernel Service Enqueues) trace whenever that statement was executed [but see footnote 1]. Whenever the statement succeeded this would give us a complete listing of the locks (enqueues) needed by the statement, and when the statement failed (due to ORA-00054) we would be able to see very clearly where the breakdown had occurred.

alter system set events '54 trace name errorstack level 1'; 
alter system set events 'trace[ksq][SQL:3afvh3rtqqwyg] disk=highest';

Neither option, however, would tell us anything about the competing session or about the SQL that caused the competing lock to come into existence; all we could hope for was some hint about why the ORA-00054 appeared, with little clue about the part of the application that was causing the conflict.

Using the SystemState

The basic systemstate dump tends to be rather large – for starters, it’s going to include a lot of information about currently open cursors for every (user) session – so it’s not something you really want to make frequent use of, and you don’t want it to be triggered frequently. But if you have an occasional (and critical) batch failure due to an intermittent locking problem then you can issue a call like:

alter system set events '00054 trace name systemstate level 2, lifetime 1';

The “lifetime 1” ensures that a session will only dump a systemstate once in its lifetime – which may be necessary to ensure the system isn’t overloaded by large numbers of systemstate dumps being generated in a very short time interval. You may need to allow for more than just one dump per session, though.
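
If one dump per session turns out to be too restrictive you can simply raise the limit; as a sketch, the same call allowing (say) up to three dumps per session would be:

alter system set events '00054 trace name systemstate level 2, lifetime 3';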

In fact, since I want to dump both the errorstack and the systemstate when the ORA-00054 occurs, the three critical lines in my model were as follows:

alter system set events '54 trace name errorstack level 1; name systemstate level 2, lifetime 1';

alter table child enable novalidate constraint chi_fk_par;

alter system set events '54 trace name systemstate off; name errorstack off';

So what do you get if you make this call and then try to re-enable a foreign key constraint when the parent table is locked? In my very small system, with just a couple of live sessions, and shortly after instance startup, my tracefile was about 1.5MB and 16,000 lines in size – not something to read through without a little filtering.

From the Call Stack Trace produced by the errorstack dump I could see that the first argument to ktaiam (and the associated function calls) was 00001EA61. This told me that I would find at least one session holding a lock identified as TM-0001EA61, so that’s the text I searched for next. Note the little trap: the value reported in the call stack has an extra leading zero. I found the TM enqueue 11,000 lines further down the file in a “State Object (SO:)” of type “DML Lock”:

        SO: 0x9cf0e708, type: DML lock (83), map: 0x9bac7b98
            state: LIVE (0x4532), flags: 0x0
            owner: 0x9cf648d8, proc: 0xa0eed9f0
            link: 0x9cf0e728[0x9cf64948, 0x9cf64948]
            conid: 3, conuid: 3792595, SGA version=(1,0), pg: 0
        SOC: 0x9bac7b98, type: DML lock (83), map: 0x9cf0e708
             state: LIVE (0x99fc), flags: INIT (0x1)
        DML LOCK: tab=125537 flg=11 chi=0
                  his[0]: mod=6 spn=348
2022-07-06 23:34:49.285*:ksq.c@10787:ksqdmc(): Enqueue Dump        (enqueue) TM-0001EA61-00000000-0039DED3-00000000  DID: ksqlkdid: 0001-0029-0000001F

        lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  res_flag: 0x16
        mode: X, lock_flag: 0x20, lock: 0x9bac7bc0, res: 0x9e7ec5e8
        own: 0xa086dc70, sess: 0xa086dc70, proc: 0xa05617d0, prv: 0x9e7ec5f8
        SGA version=(1,0)

In my case there was only one holder for this lock, but in a live system the same object could be locked by many users.
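
Of course, if the conflict is still in progress when you get to look at it you don’t need the dump at all to find the holders – a quick query against v$lock will list them. A minimal sketch (converting the hex id from the lock name back to a decimal object_id):

select  s.sid, s.serial#, s.sql_id, s.prev_sql_id, l.lmode, l.request, l.ctime
from    v$lock          l,
        v$session       s
where   l.type = 'TM'
and     l.id1  = to_number('1EA61','XXXXXXXX')
and     s.sid  = l.sid
;

In the scenario we’re dealing with, though, the evidence has usually evaporated by the time anyone gets to look, so back to the trace file.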

Having found a state object for the lock I had to identify the process holding this lock – which means searching backwards to find the parent of this state object, then its parent, and so on. Here, in the order I found them (which is the reverse of the order in which they appear in the file), are the lines I found:

        SO: 0x9cf0e708, type: DML lock (83), map: 0x9bac7b98
      SO: 0x9cf648d8, type: transaction (85), map: 0x9bbebdc8
      SO: 0x8df2ac88, type: LIBRARY OBJECT LOCK (118), map: 0x63693c60
      SO: 0x8df4fa78, type: LIBRARY OBJECT LOCK (118), map: 0x81af83e8
    SO: 0xa0efc020, type: session (4), map: 0xa086dc70
  SO: 0xa0eed9f0, type: process (2), map: 0xa05617d0

So the DML lock is owned by a transaction, which is owned by a session (with a couple of “library object lock” state objects “in the way”) which is owned by a process. You may find other “extraneous” lines on the way but the key detail to note is the hierarchical pattern of state objects – keep going until you’ve reached the session and process state objects.

Once we’ve got this information we need to search the systemstate dump for any cursors that this session/process has open to see if we can find something that looks like a statement that could have created the lock, so we need to search for state objects of type “LIBRARY OBJECT LOCK” with the correct owner information.

Technically they would have to appear in the trace file between the process state object we’ve found and the next process state object listed in the file, but it would be a little tedious to do this search with a text editor so I switched from using vi to using grep – and here’s a search condition that will identify and print part of each state object owned by this session and process:

grep -B+2 -A+13 "owner: 0xa0efc020, proc: 0xa0eed9f0" or19_ora_19369.trc >temp.txt

The hexadecimal value following “owner: “ is from the session state object, the value following “proc: “ is from the process state object. or19_ora_19369.trc is my trace file, and each time I’ve found a matching line I’ll write the 2 lines before it, the line itself, and 13 lines after it to the file temp.txt.

In my example I found 29 state objects owned by the process, of which 25 were of type “LIBRARY OBJECT LOCK” – and I’ve reported two of them below:

      SO: 0x8df36d38, type: LIBRARY OBJECT LOCK (118), map: 0x7b54afa0
          state: LIVE (0x4532), flags: 0x1
          owner: 0xa0efc020, proc: 0xa0eed9f0
          link: 0x8df36d58[0x8df496c8, 0x8df31608]
          child list count: 0, link: 0x8df36da8[0x8df36da8, 0x8df36da8]
          conid: 3, conuid: 3792595, SGA version=(1,0), pg: 0
      SOC: 0x7b54afa0, type: LIBRARY OBJECT LOCK (118), map: 0x8df36d38
           state: LIVE (0x99fc), flags: INIT (0x1)

      LibraryObjectLock:  Address=0x7b54afa0 Handle=0x82967ac0 Mode=N
        CanBeBrokenCount=1 Incarnation=1 ExecutionCount=1

        User=0xa086dc70 Session=0xa086dc70 ReferenceCount=1
        Flags=CNB/[0001] SavepointNum=155 Time=07/06/2022 23:29:53
      LibraryHandle:  Address=0x82967ac0 Hash=5ce74124 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
        ObjectName:  Name=lock table parent in exclusive mode


      SO: 0x8defb228, type: LIBRARY OBJECT LOCK (118), map: 0x944f8880
          state: LIVE (0x4532), flags: 0x1
          owner: 0xa0efc020, proc: 0xa0eed9f0
          link: 0x8defb248[0x96fbe5e8, 0x8df496c8]
          child list count: 0, link: 0x8defb298[0x8defb298, 0x8defb298]
          conid: 3, conuid: 3792595, SGA version=(1,0), pg: 0
      SOC: 0x944f8880, type: LIBRARY OBJECT LOCK (118), map: 0x8defb228
           state: LIVE (0x99fc), flags: INIT (0x1)

      LibraryObjectLock:  Address=0x944f8880 Handle=0x896af550 Mode=N
        CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0
        Context=0x7f125ecf34b8
        User=0xa086dc70 Session=0xa08664b8 ReferenceCount=1
        Flags=[0000] SavepointNum=0 Time=07/06/2022 23:29:53
      LibraryHandle:  Address=0x896af550 Hash=0 LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
        Name:  Namespace=SQL AREA(00) Type=CURSOR(00) ContainerId=3

Details to note:

  • The last line I’ve selected from the first state object looks like a good candidate SQL statement for creating the blocking lock. The line above it, showing “Hash=5ce74124”, gives us the hexadecimal equivalent of the v$sql.hash_value for this statement (see the query sketch after this list).
  • I believe the last line of the second state object is telling us that the associated statement has been flushed from the library cache, but I’m not sure that I’m interpreting that correctly. You’ll notice, though, that the line does give us a suitable namespace and type for something to do with a SQL or PL/SQL cursor (and a hash value of zero – so if it is/was a (PL/)SQL statement that’s the clue that it’s not in memory any more).
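
If the statement is still in the library cache that hash value lets you pick up the full text and child details from v$sql. A minimal sketch of the conversion (it will, of course, return nothing if the cursor has already been flushed):

select  sql_id, hash_value, sql_text
from    v$sql
where   hash_value = to_number('5ce74124','xxxxxxxx')
;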

Comparing these two state objects, the things I want to find with minimal hassle are (I hope) the lines that start with the text “ObjectName” that appear one line after a line holding the text “Hash=” followed by anything but a zero, from state objects labelled “LIBRARY OBJECT LOCK”. Here’s a one-line (wrapped) grep command to do that, followed by the (slightly re-formatted) results I got from my trace file:

grep -B+2 -A+13 "owner: 0xa0efc020, proc: 0xa0eed9f0" or19_ora_19369.trc |
                grep -A+15 "SO:.*LIBRARY OBJECT LOCK" |
                grep -A+1  "Hash=[^0]" |
                grep "ObjectName"


        ObjectName:  Name=lock table parent in exclusive mode 
        ObjectName:  Name=select pctfree_stg, pctused_stg, size_stg,initial_stg, next_stg, minext_stg, maxext_stg,
                     maxsiz_stg, lobret_stg,mintim_stg, pctinc_stg, initra_stg, maxtra_stg, optimal_stg,
                     maxins_stg,frlins_stg, flags_stg, bfp_stg, enc_stg, cmpflag_stg, cmplvl_stg,imcflag_stg, 
                     ccflag_stg, flags2_stg from deferred_stg$  where obj# =:1 

It looks like I’ve got the information I need (or, at least, a good clue) about why my batch session raised an ORA-00054; and, in a real system, the other open cursors reported for this session might give me enough information to work out where the problem is coming from.

Warnings

The first warning is just a reminder that there may have been multiple sessions/processes holding locks on the table, so don’t stop after finding the first occurrence of the TM-xxxxxxxx lock: check to see if there are any more and repeat the search for each one’s owning process and owned Library Object Locks.

The second warning is that all this work may not give you an answer. A session may have locked a table ages ago and still have an active transaction open; if you’re unlucky the statement that produced the lock may have been flushed from the library cache. A comment I made in 2009 about finding the locking SQL is just as relevant here for the systemstate dump immediately after the ORA-00054 as it was when I first wrote about querying v$sql all those years ago. You may get lucky, and this prompt dumping of the systemstate may make you luckier, but there’s no guarantee you’ll find the guilty statement.

Furthermore, the state objects that I’ve been looking at are “LIBRARY OBJECT LOCK” state objects – these are the things that link to a cursor that’s held open by the session (i.e. things you’d see in v$open_cursor), so if a session introduced a table lock then closed the cursor (and hasn’t committed) the table will still be locked but the systemstate won’t have a state object for the statement that locked the table. For example, when I created and executed the following procedure to lock the table using an “execute immediate” I found a state object for the procedure call, but I didn’t find a state object for the “lock table” statement:

create or replace procedure lock_p as
begin
        execute immediate 'lock table PARENT in exclusive mode';
end;
/

On the other hand when I created the procedure with embedded SQL I found state objects for both the procedure call and the SQL statement.

create or replace procedure lock_p as
begin
        LOCK TABLE test_user.PARENT in exclusive mode;
end;
/

In passing, the text of the “ObjectName:” you find for the procedure call varies depending on whether you “execute lock_p” or “call lock_p()” from SQL*Plus. The former shows up as “BEGIN lock_p; END;” and the latter as “call lock_p()”.

Conclusion

If you need to track down the cause of an intermittent locking problem that results in an Oracle error ORA-00054 then enabling a system-wide dump of the systemstate (level 2 is sufficient) on error 54 may allow you to find out what everyone else was doing around the time of the problem.

If you don’t already know which object is the locked object that’s the direct cause of the ORA-00054 then enabling the errorstack trace at the same time will allow you to find the object_id of the object, so that you can then find the processes/sessions that are holding a DML (TM-) lock with the correct id.

For each State Object for the relevant DML lock you can track backwards up the trace file, and then use the address of each pair of session and process state objects to find all of their “open cursor”/“library object lock” state objects, and check the “ObjectName” of each to see the SQL or PL/SQL text. This may give you the information you need to identify where/how the application is going wrong.

Cursors close, and cursors that are still open (but not pinned) can be flushed from the library cache, and a lock may have been placed by a cursor whose text is no longer available, or not part of the systemstate dump, so this method is not perfect – however, since the systemstate dump takes place the instant the error occurs it does improve your chances that the problem statement is still available and reported.

Footnote 1

Trouble-shooting when the problem is not reproducible on demand often puts you in a position where you have to make a trade-off between information gained and overheads required. Dumping an errorstack for every occurrence of an ORA-00054 is probably a small overhead since (you hope that) you don’t generate thousands of locking problems per hour. In the unlikely case that the errors occur very frequently to every session that connects you might be able to limit the overhead by adding the “lifetime” clause to the call, e.g.:

alter system set events '00054 trace name errorstack level 1 , lifetime 5';

This would result in every session being able to dump the trace only on the first 5 occasions it triggered the error.

On the other hand, I can think of no effective way of limiting the ksq trace (beyond restricting its action to a specific SQL_ID). If the problem statement executes a couple of times in each batch run, and there are only a few batch runs per day, then the overhead will be small when you’re trying to find the details of a problem that happens once a week. But if the problem statement runs thousands of times in each batch run then it would probably be very expensive to enable the ksq trace to catch an intermittent error.

July 6, 2022

ORA-00054 pt.1

Filed under: Infrastructure,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 8:20 am BST Jul 6,2022

If you don’t recognise the number the relevant extract from the oraus.msg file is:

00054, 00000, "resource busy and acquire with NOWAIT specified or timeout expired"
// *Cause:  Interested resource is busy.
// *Action: Retry if necessary or increase timeout.

The error is typically the result of application code trying to do some DDL on an object that is locked in an incompatible mode by some other session, and when it shows up in the log from some batch process it can be difficult to find out what was going on at the same time to cause the problem because the error message tells you nothing about the blocker.

Nenad Noveljic has just published a note discussing what you can do to trouble-shoot this type of problem, but I thought I’d write up a note on one of the ways I’d address the problem, in this case starting from a specific question on the Oracle-L list-server.

Statement of problem

A piece of application code disables the foreign key constraints on a table, inserts (using “insert as select” with the /*+ append */ hint) a very large volume of data (tens to hundreds of millions of rows), then executes a pl/sql loop to re-enable, in novalidate mode, all the foreign key constraints on that table.

****Sample code *****
INSERT /*+Append*/ INTO TAB1 (c1,c2.....)
        SELECT (ROWNUM + sqq_key) ,col1, col2, col3..... from....tab2,
tab3, tab4.....;
COMMIT;

FOR I IN (
        SELECT TABLE_NAME, CONSTRAINT_NAME 
        FROM ALL_CONSTRAINTS 
        WHERE TABLE_NAME = v_table_nm 
        AND CONSTRAINT_TYPE = 'R' 
        AND STATUS = 'DISABLED'
) LOOP
   EXECUTE IMMEDIATE ('ALTER TABLE ' || v_table_nm || ' ENABLE NOVALIDATE CONSTRAINT '|| I.CONSTRAINT_NAME);
 END LOOP I;

From time to time one of the calls to re-enable a constraint fails raising ORA-00054, so the OP had set event 54 to do a systemstate dump to see if that would help identify the cause of the error:

alter system set events '54 trace name systemstate level 266, lifetime 1';

Picking through the resulting trace file, though, the OP got the impression that the session was blocking itself, leading to a worry that somehow the “commit;” wasn’t releasing locks properly so that the lock due to the insert was blocking the lock needed for the “enable constraint”.

Trouble-shooting

Reading the key question “is the commit not working properly?” my first thought was “it’s almost guaranteeable that the commit is doing what it’s supposed to do”; and I had no intention of reading a systemstate dump (or reading the bits that had been extracted by someone who had (almost guaranteeably) misinterpreted it).

Where you start trouble-shooting does depend to a degree on how much you already know about what’s going on. The OP, for example, already knew that the error appeared in response to one of the “alter table” commands and was also able to identify which constraint had caused the error to appear – but if you don’t even have that information how do you begin?

Since Oracle is raising an error (and one that probably doesn’t occur very frequently) you could just set the system to dump an errorstack every time the error occurred. (For a repeatable test you might use “alter session”, for a randomly occurring event you might have to “alter system” unless you were able to modify the batch code itself to issue its own “alter session” at the right point.) To minimise the size of the trace file level 1 should suffice, at least to begin with:

alter system set events '54 trace name errorstack level 1';

Here’s the start of the trace information that I produced by setting this event and trying to enable a constraint when I knew the call would be blocked by a competing lock:

dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
<error barrier> at 0x7ffcdfb0dd20 placed dbkda.c@296
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
----- Current SQL Statement for this session (sql_id=3afvh3rtqqwyg) -----
alter table child enable novalidate constraint chi_fk_par

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst1()+95         call     kgdsdst()            7FFCDFB0D180 000000002
                                                   7FFCDFB074B0 ? 7FFCDFB075C8 ?
                                                   000000000 000000082 ?

Key points: the report starts with the SQL that triggered the error and gives us its SQL_ID. Before doing anything else, then, let’s consider what locks might be necessary for the constraint to be enabled (noting, particularly, that in this case the constraint is being enabled with the “novalidate” option). The OP suspects that the problem appears because the session can’t acquire a lock on the child table – but maybe there are other locks involved.

Let’s model the scenario with a parent/child table and referential integrity constraint and see what locks appear as we try to enable the constraint with “novalidate”. But we want to find out what locking goes on when there are no problems with competing sessions and no errors raised. You’ll note that I used the pause command in mid-script so that I could connect through another session if I wanted to introduce some competing locks:

rem
rem     Script:         errorstack.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2022
rem

create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
)
;

create table child(
        id_p    number(4)       constraint chi_fk_par references parent,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id) 
)
;

alter table child disable constraint chi_fk_par;

pause Press return

-- alter system set events '54 trace name errorstack level 1'; 
alter system set events 'trace[ksq][SQL:3afvh3rtqqwyg] disk=highest';

alter table child enable novalidate constraint chi_fk_par;

alter system set events 'trace[ksq][SQL:3afvh3rtqqwyg] off';
-- alter system set events '54 trace name errorstack off'; 

This is using the “new” trace mechanism, tracing the “ksq” (Kernel Service Enqueues) component of the RDBMS library (“oradebug doc component rdbms”), restricted to tracing only when the current SQL statement has a specific SQL_ID.

If you examine the trace file you’ll find lots of lines referencing the source file ksq.c, with call names like ksqgtlctx (get lock context ?) and ksqcli (clear lock information?). I’m just going to grep out the lines that contain the text “mode=”.
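
A one-liner like the following is all it takes (the trace file name here is just a placeholder for whatever name your session generated):

grep "mode=" or19_ora_nnnnn.trc

and these are the lines it picked out: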

2022-07-05 10:44:13.642*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8C8-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
2022-07-05 10:44:13.642*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8C6-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
2022-07-05 10:44:13.644*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000005-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.644*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000006-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.646*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000007-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.646*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000008-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.646*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-00000009-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.647*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-0000000A-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.647*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-0000000B-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.648*:ksq.c@9175:ksqgtlctx(): *** ZH-0001E8C8-0000000C-0039DED3-00000000 mode=6 flags=0x10021 why=225 timeout=0 ***
2022-07-05 10:44:13.649*:ksq.c@9175:ksqgtlctx(): *** TX-00090014-000022C2-0039DED3-00000000 mode=6 flags=0x401 why=176 timeout=0 ***
2022-07-05 10:44:13.649*:ksq.c@9175:ksqgtlctx(): *** TM-00000061-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.651*:ksq.c@9175:ksqgtlctx(): *** TM-00000049-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.651*:ksq.c@9175:ksqgtlctx(): *** TM-00000004-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.652*:ksq.c@9175:ksqgtlctx(): *** TM-0000001F-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.653*:ksq.c@9175:ksqgtlctx(): *** TM-00000012-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.654*:ksq.c@9175:ksqgtlctx(): *** RC-00000002-00000012-0039DED3-00000000 mode=4 flags=0x10401 why=294 timeout=5 ***
2022-07-05 10:44:13.654*:ksq.c@9175:ksqgtlctx(): *** RC-00000002-0000001F-0039DED3-00000000 mode=4 flags=0x10401 why=294 timeout=5 ***
2022-07-05 10:44:13.657*:ksq.c@9175:ksqgtlctx(): *** TM-00004887-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.657*:ksq.c@9175:ksqgtlctx(): *** TM-0001E643-00000000-0039DED3-00000000 mode=3 flags=0x401 why=173 timeout=21474836 ***
2022-07-05 10:44:13.657*:ksq.c@9175:ksqgtlctx(): *** TX-0003001C-0000223A-0039DED3-00000000 mode=6 flags=0x401 why=176 timeout=0 ***

  • The first two locks are table locks in mode 4 (share) on the child and parent tables respectively (The values 0x0001E8C8 and 0x0001E8C6 are the hexadecimal equivalents of the object_ids).
  • The next 8 locks in mode 6 (exclusive) are something to do with the child table (the same object_id appearing) and a check in v$lock_type tells us they’re something to do with compression (see the query sketch after this list).
  • Then we see a TX (transaction) lock in mode 6, five TM locks on very low (data dictionary) object_ids in mode 3, and two RC (result cache) locks in mode 4.
  • Finally there are two TM locks and a TX lock – and the table locks are for the aud$unified table and one of its partitions.
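
The v$lock_type check is trivial; a minimal sketch:

select  type, name, id1_tag, id2_tag, description
from    v$lock_type
where   type = 'ZH'
;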

It’s probably safe to ignore the locking related to the recursive transactions (especially since the locks show non-zero timeouts); I don’t know what the ZH locks are about but the increasing nature of the second component of the lock id suggests that they’re not likely to be a problem (even though they have to be acquired without a timeout).

The thing that catches my eye is that we have to lock both the child and the parent – and until I did this test I wasn’t certain that for a “novalidate” constraint there would be any need for a data lock on the parent – though a rowcache lock to check for the legality of the constraint definition would make sense.

So maybe the problem isn’t about the child table, possibly it’s about the parent. I’m going to rerun the whole test again, enabling the ksq trace and the errorstack, and in the pause that’s in the script I’m going to lock the parent table from another session before enabling the constraint. From the new trace file I’m going to show you more lines about some of the TM locks (which will be for new object_ids since the code drops and recreates tables) and then a few more lines from the error stack.

First the locking information:

2022-07-05 11:20:54.686*:ksq.c@9100:ksqgtlctx(): ksqtgtlctx: PDB mode 
2022-07-05 11:20:54.688*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8D4-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
...
2022-07-05 11:20:54.688*:ksq.c@9851:ksqgtlctx(): ksqgtlctx: updated lock mode, mode:4 req:0
2022-07-05 11:20:54.688*:ksq.c@9960:ksqgtlctx(): SUCCESS

2022-07-05 11:20:54.689*:ksq.c@9100:ksqgtlctx(): ksqtgtlctx: PDB mode 
2022-07-05 11:20:54.689*:ksq.c@9175:ksqgtlctx(): *** TM-0001E8D2-00000000-0039DED3-00000000 mode=4 flags=0x401 why=173 timeout=0 ***
...
2022-07-05 11:20:54.689*:ksq.c@9001:ksqcmi(): returns 51
2022-07-05 11:20:54.689*:ksq.c@9948:ksqgtlctx(): FAILURE: returns 51

Note how we can see here that it’s the attempt to lock the parent (the lower object_id, 0x0001E8D2) that makes Oracle raise the error. Notice, by the way, that internally it’s raising error 51 (which is “timeout occurred while waiting for a resource”) not error 54.

And here’s a section of the error stack – quite a long way down:

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
...
ktagetg0()+929       call     ktaiam()             00001E8D2 000000004 000000000
                                                   7FFF26A8BC38 ? 000000000 ?
                                                   000000000 ?
ktagetp_internal()+  call     ktagetg0()           00001E8D2 ? 000000004 ?
141                                                000000004 ? 7FFF26A8BC38 ?
                                                   000000000 ? 000000000 ?
ktagetg_ddlX()+323   call     ktagetp_internal()   00001E8D2 000000000 000000004
                                                   000000000 ? 000000000
                                                   F7A4F57D00000000
ktagetg_ddl()+30     call     ktagetg_ddlX()       00001E8D2 ? 000000000 ?
                                                   000000004 ? 000000000 ?
                                                   000000000 ? 000000000
kkdllk0()+1551       call     ktagetg_ddl()        00001E8D2 ? 000000000 ?
                                                   000000004 ? 000000000 ?
                                                   000000000 ? 000000000 ?

You’ll notice in the list of arguments for these calls (which relate to getting locks for DDL) that the value 00001E8D2 (the object_id) and 000000004 (requested lock mode) keep appearing. So simply setting errorstack to level 1 will give you the SQL statement that caused the ORA-00054, and you will be able to find the object_id that Oracle was unable to lock and the attempted lock mode.
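
Translating that hexadecimal value back to an object name is then a one-line query against dba_objects. A minimal sketch (assuming you can see the view):

select  object_id, owner, object_name, object_type
from    dba_objects
where   object_id = to_number('1E8D2','XXXXXXXX')
;

which, in this little test, would simply point back at the parent table.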

Conclusion

It can be a little tricky to track down the source of Oracle error ORA-00054 when it appears unpredictably and cannot be reproduced on demand, but there are two options that help you to get started.

The error is about attempting to lock something – so if you set the system to dump an error stack (even at only level 1) when the error occurs you should be able to find the object_id of the object that is the source of the problem and the lock mode being requested. It’s probably the case that this will be a small overhead when run at the system level since you probably don’t generate lock timeouts very often.

If you are lucky enough to know the SQL ID of the statement that runs into the problem you can enable ksq (lock) tracing for that specific SQL statement and that will make it a lot easier to see exactly which lock attempt failed. If you have no idea of the SQL_ID, then ksq tracing for the whole system will probably be too much of an overhead to leave in place. The benefit of the ksq trace is that if you don’t know what locking your application code needs you will be able to see all the locks involved, and simply knowing what locks are involved may be enough to point you in the right direction.
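
For completeness, I assume the system-wide version of the call would simply omit the SQL_ID filter, along the lines of the following (untested sketch) – but think hard about the volume of trace it might generate before leaving it in place:

alter system set events 'trace[ksq] disk=highest';
alter system set events 'trace[ksq] off';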

Note (which I haven’t tested): if the guilty SQL is called from inside a package, then using the SQL_ID of the package call may result in ksq tracing for every statement executed inside the package call, and that might be a bearable overhead.

Footnote:

Although this allows us to discover where the locking conflict appeared, it doesn’t tell us what the blocking session did to get in our way. In the next installment I’ll describe how we can drill down through the systemstate dump to find out (if the information is still there) what the other session was doing to cause the problem.

Footnote 2:

It’s worth mentioning that in some cases of locking it can be a good idea to use the “wait N” (for a small value of N) option in your code as a wait of a few seconds may allow you to find some clues about blockers in the ASH (v$active_session_history / dba_hist_active_sess_history) information when a timeout occurs. In this specific case, though, I don’t think there’s a variant of the syntax that would allow you to do something like “alter table modify constraint …. wait 5”.

Update (a few minutes after publishing)

If you check comment #1 below you’ll see that Alexander Chervinskiy has supplied a method for getting the recursive locks to wait for a limited period by setting the parameter ddl_lock_timeout to a small value (in seconds). This can be done at the session or at the system level.
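
In its simplest (session-level) form that’s just something like:

alter session set ddl_lock_timeout = 5;

after which the DDL will wait up to 5 seconds for each lock it needs before raising ORA-00054 – which may also be long enough to leave some evidence in the ASH data (see footnote 2).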

February 18, 2022

Index ITL Limit

Filed under: deadlocks,Indexing,Infrastructure,Locks,Oracle,trace files — Jonathan Lewis @ 5:02 pm GMT Feb 18,2022

Here’s a little script that I wrote more than 10 years ago when investigating some undesirable behaviour with indexes. I’ve just rediscovered it after seeing it mentioned in a comment to an old article that I had been prompted to revisit. This isn’t going to help you solve any specific problem, but it might give you some feel for how much work Oracle has to do to cater for efficient index maintenance.

The script is just a dirty little hack to crash a session by calling a procedure recursively until something breaks – in this case when all the ITL slots of an index leaf block are full and the block doesn’t split for the next insert (which was a little surprising in its own right).

rem
rem     Script:         itl_limit.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2010
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem             11.1.0.6
rem  

create table itl_limit(n1 number) pctfree 0;
create unique index il_01 on itl_limit(n1) pctfree 0;

create or replace procedure recursive_itl(i_tx_count number)
as
        pragma autonomous_transaction;
begin
        if i_tx_count != 0 then
                insert into itl_limit values(200 - i_tx_count);
                recursive_itl(i_tx_count - 1);
                commit;
        end if;
end;
/

alter session set events '10046 trace name context forever, level 8';

execute recursive_itl(200);

alter system checkpoint;

alter session set events '10046 trace name context off';

prompt  ==========================================
prompt  If there is no index on the table then you
prompt  should see 169 rows in one block and 31 in
prompt  the other. But if there is an index there
prompt  should be no rows thanks to the rollback
prompt  caused by the error.
prompt  ==========================================

select 
        dbms_rowid.rowid_block_number(rowid), count(*) 
from 
        itl_limit
group by 
        dbms_rowid.rowid_block_number(rowid)
;

prompt  =================================
prompt  Try for a tree dump on the index
prompt  after which you can dump the root
prompt  block to see the ITL entries
prompt  =================================

column object_id new_value m_object_id

select  object_id, object_type, object_name
from    user_objects
where   object_name = 'IL_01'
/

alter session set events 'immediate trace name treedump level &m_object_id ';

If you comment out the creation of the index il_01 then the script completes very quickly (complaining, of course, about the attempt to do a treedump with a null level: ORA-49100: Failed to process event statement). Assuming you’re using an 8KB block size, the rowid count query will show that you’ve got 169 rows in one block and 31 rows in the other – and if you dump the block with 169 rows you will find that (a) there’s loads of empty space in the block, and (b) the number of ITL entries has reached 169, and that’s the limiting factor that restricted the number of rows we could insert.

If you create the index then you’ll have to wait roughly 142 seconds for the procedure call to fail (with an ORA-00060: deadlock detected error) and the script to complete. And when you generate the tkprof output from the trace file you’ll find that most of the time is spent in the following statement:

INSERT INTO ITL_LIMIT
VALUES
(200 - :B1 )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    169      0.96     143.69         11      56324       4939         168
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      170      0.96     143.69         11      56324       4939         168

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 138     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  ITL_LIMIT (cr=5 pr=0 pw=0 time=192 us starts=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                        11        0.00          0.00
  enq: TX - allocate ITL entry                   48        5.01        142.49
********************************************************************************

We’ve executed 169 times, but reported only 168 rows, which suggests something went wrong on one insert. The 48 waits for “enq: TX – allocate ITL entry” invite further investigation, of course – so let’s get back to the raw trace file and find them.

grep -T -n  "WAIT.*alloc" or19_ora_15879.trc

 571   :WAIT #139861052053576: nam='enq: TX - allocate ITL entry' ela= 1000682 name|mode=1415053316 usn<<16 | slot=10420232 sequence=2 obj#=94939 tim=5285776687
 574   :WAIT #139861052053576: nam='enq: TX - allocate ITL entry' ela= 1016171 name|mode=1415053316 usn<<16 | slot=1310726 sequence=6209 obj#=94939 tim=5286793280
 577   :WAIT #139861052053576: nam='enq: TX - allocate ITL entry' ela= 1001580 name|mode=1415053316 usn<<16 | slot=10223624 sequence=2 obj#=94939 tim=5287795235
...
 709   :WAIT #139861052053576: nam='enq: TX - allocate ITL entry' ela= 4999483 name|mode=1415053316 usn<<16 | slot=11468804 sequence=2 obj#=94939 tim=5423905081
42381  :WAIT #139861052053576: nam='enq: TX - allocate ITL entry' ela= 2999710 name|mode=1415053316 usn<<16 | slot=9633800 sequence=2 obj#=94939 tim=5426905028

All 48 waits occur after the 169th attempt to insert a row. Oracle rotates through 12 ITL slots waiting one second on each, then goes round the loop again waiting 2 seconds on each, then 4 seconds, then 5 seconds – except it doesn’t wait on the 12th ITL on the final loop, instead it reports “DEADLOCK DETECTED (ORA-00060)” in the trace file and dumps a Deadlock graph of the form:

Deadlock graph:
                                          ------------Blocker(s)-----------  ------------Waiter(s)------------
Resource Name                             process session holds waits serial  process session holds waits serial
TX-00930008-00000002-0039DED3-00000000         44      49     X        18979      44      49           S  18979

Note the self-deadlock – the holder and waiter are the same session. After the deadlock graph we get the usual stack dump and after 48,000 lines of trace we see the message “Attempting to break deadlock by signaling ORA-00060” after which the session waits on the 12th ITL for 3 seconds and then the whole stack of autonomous transactions rolls back:

WAIT #139861052053576: nam='enq: TX - allocate ITL entry' ela= 2999710 name|mode=1415053316 usn<<16 | slot=9633800 sequence=2 obj#=94939 tim=5426905028
EXEC #139861052053576:c=429678,e=142567211,p=0,cr=504,cu=11,mis=0,r=0,dep=1,og=1,plh=0,tim=5427342894
ERROR #139861052053576:err=60 tim=5427342914
CLOSE #139861052053576:c=1,e=1,dep=1,type=3,tim=5427343141
XCTEND rlbk=1, rd_only=0, tim=5427343240
XCTEND rlbk=1, rd_only=0, tim=5427343336
...
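
As a quick cross-check on the numbers: 12 x 1 + 12 x 2 + 12 x 4 + 11 x 5 + 3 = 142 seconds of waiting, spread across 12 + 12 + 12 + 11 + 1 = 48 waits – which ties up nicely with the 48 waits, 142.49 seconds of “enq: TX – allocate ITL entry”, and roughly 142 second elapsed time reported above.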

My test script reports the object_id (not the data_object_id) of the index and does a treedump of it, which should show just a root block (which is also a leaf block) with no entries, but does report the address of that root block. The treedump will show something like:

----- begin tree dump
leaf: 0x400008b 67109003 (0: row:0.0 avs:3988)
----- end tree dump

Notice how the available space (avs) in this root/leaf block is only 3988 bytes rather than roughly 8,000 for an 8KB block size. That’s because the ITL area has taken advantage of its ability to grow to half the size of the block, and once it has grown it doesn’t shrink (Note: there is a hard limit of 255 which will only become apparent with larger block sizes – and if you want to test that you’ll have to edit my script to change both occurrences of the constant 200 to (at least) 256).

The data block address (DBA) of the root/leaf block is given in both hexadecimal and decimal; but if you can’t work out which file and block number this represents (I happen to know it’s file 16, and I can convert 0x8b to 139 decimal in my head) then you can check for the segment header block and add one to the block number; or use calls to the dbms_utility package to translate the DBA before doing a block dump:

SQL> select
  2          dbms_utility.data_block_address_file(67109003) file#,
  3          dbms_utility.data_block_address_block(67109003) block#
  4  from
  5          dual
  6  /

     FILE#     BLOCK#
---------- ----------
        16        139

1 row selected.

SQL> alter system flush buffer_cache;

System altered.

SQL> alter system dump datafile 16 block 139;

System altered.

Here’s the section of the resulting trace that shows you how bad things had got before the deadlock error finally appeared:

 seg/obj: 0x172dd  csc:  0x0000000001311be1  itc: 169  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x4000088 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn  0x0000000001311be1
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x04   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
...
0xa8   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0xa9   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
Leaf block dump

A final comment: if you’re wondering why the table (without an index) can get to 169 rows while the index achieves 168 rows and fails on the 169th – there’s one ITL in the index ITL area that is reserved for the “service ITL” (see answer 2), the one that Oracle uses to register and hold the block on an “index node split”.

Footnote

One of the entertaining things about poking around with Oracle is the way that you discover unexpected details – some of which are side-effects that you realise you should have predicted, some of which are just a little bizarre. In this case the easter egg in the trace file was the following statement (reproduced from the tkprof output, and made highly visible because I used the sort=execnt option):

update /*+ rule */ undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,
  undosqn=:7,xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13
where
 us#=:1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      336      0.00       0.00          0          0          0           0
Execute    336      0.02       0.02          3        336        531         336
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      672      0.03       0.03          3        336        531         336

The predictable side effect was that Oracle was going to create a number of new undo segments: as a standard detail of minimising contention Oracle tries to give every concurrent transaction (which means every one of my autonomous transactions) its own undo segment.

The surprising detail was the /*+ rule */ hint – still there in 19.11.0.0. I guess that was to ensure that Oracle executed the update through an index access – but possibly a suitable /*+ index() */ hint would be more appropriate for almost every version of Oracle.

There were a number of other segment/undo related statements that operated 140+ times in the course of this test – which did make me wonder if a very busy OLTP system (doing lots of tiny, concurrent transactions) with a lot of undo segments could spend a significant amount of its time managing undo segments – but that’s a thought for another day. Another “look at that some time” thing that appeared was the large number of selects and inserts to a table called undohist$.

January 18, 2019

DML Tablescans

Filed under: deadlocks,Infrastructure,Locks,Oracle,Parallel Execution,Performance — Jonathan Lewis @ 1:25 pm GMT Jan 18,2019

This note is a follow-up to a recent comment on a blog note about Row Migration:

So I wonder what is the difference between the two, parallel dml and serial dml with parallel scan, which makes them behave differently while working with migrated rows. Why might the strategy of serial dml with parallel scan case not work in parallel dml case? I am going to make a service request to get some clarifications but maybe I miss something obvious?

The comment also referenced a couple of MOS notes:

  • Bug 17264297 “Serial DML with Parallel scan performs single block reads during full table scan when table has chained rows in 11.2”
  • Doc ID 1514011.1 “Performance decrease for parallel DML on compressed tables or regular tables after 11.2 Upgrade”

The latter document included a comment to the effect that 11.2 uses a “Head Piece Scan” while 11.1 uses a “First Piece scan”, which is a rather helpful comment. Conveniently my blog note referenced an earlier note on the potential for differentiating between migrated and chained rows through a “flag” byte associated with each row piece. The flag byte has an H bit for the row head piece, an F bit for the row first piece, and an L bit for the row last piece and {no bits set} for a row piece in the middle of a chained row.

Side note:

      • A “typical” simple row will be a single row-piece with the H, F and L bits all set.
      • A simple migrated row will start with an “empty” row-piece in one block with the H bit set and a pointer (nrid – next rowid) to a row in another block that will have the F and L bits set and a pointer (hrid – head rowid) back to the head piece.
      • A chained row could start with a row piece holding a few columns and the H and F bits set and a pointer to the next row piece which might lead to a long chain of row pieces with no bits set each pointing to the next row piece until you get to a row piece with the L bit set. 
      • Alternatively you might have a row which had migrated and chained – which means it could start with an empty row piece with just the H bit and a pointer to the next row piece, then a row piece with the F bit set, a back pointer to the header, and a next pointer to the next row piece, which could lead to a long chain of row pieces with no bits set until you reach a row piece with the L bit set.

Combining the comments about “head piece” and “first piece” scans with the general principles of DML and locking it’s now possible to start making some guesses about why the Oracle developers might want updates through tablescans to behave differently for serial and parallel tablescans. There are two performance targets to consider:

  • How to minimise random (single block) I/O requests
  • How to minimise the risk of deadlock between PX server processes.

Assume you’re doing a serial tablescan to find rows to update – assume for simplicity that there are no chained rows in the table. When you hit a migrated row (H bit only) you could follow the next rowid pointer (nrid) to find and examine the row. If you find that it’s a row that doesn’t need to be updated you’ve just done a completely redundant single block read; so it makes sense to ignore row pieces which are “H”-only row pieces and do a table scan based on “F” pieces (which will be FL “whole row” pieces thanks to our assumption of no chained rows). If you find a row which is an F row and it needs to be updated then you can do a single block read using the head rowid pointer (hrid) to lock the head row piece then lock the current row piece and update it; you only do the extra single block read for rows that need updates, not for all migrated rows. So this is (I guess) the “First Piece Scan” referenced in Doc ID 1514011.1. (And, conversely, if you scan the table looking only for row pieces with the H flag set this is probably the “Head Piece Scan”).

But there’s a potential problem with this strategy if the update is a parallel update. Imagine parallel server process p000 is scanning the first megabyte of a table and process p001 is scanning the second megabyte using the “first piece” algorithm.  What happens if p001 finds a migrated row (flags = FL) that needs to be updated and follows its head pointer back into a block in the megabyte being scanned by p000?  What if p000 has been busy updating rows in that block and there are no free ITLs for p001 to acquire to lock the head row piece? You have the potential for an indefinite deadlock.

On the other hand, if the scan is using the “head piece” algorithm p000 would have found the migrated row’s head piece and followed the next rowid pointer into a block in the megabyte being scanned by p001. If the row needs to be updated p000 can lock the head piece and the migrated piece.

At this point you might think that the two situations are symmetrical – aren’t you just as likely to get a deadlock because p000 now wants an ITL entry in a block that p001 might have been updating? Statistically the answer is “probably not”. When you do lots of updates it is possible for many rows to migrate OUT of a block; it is much less likely that you will see many rows migrate INTO a specific block. This means that in a parallel environment you’re more likely to see several PX servers all trying to acquire ITL entries in the same originating block than you are  to see several PX servers trying to acquire ITL entries in the same destination block. There’s also the feature that when a row (piece) migrates into a block Oracle adds an entry to the ITL list if the number of inwards migrated pieces is more than the current number of ITL entries.

Conclusion

It’s all guesswork of course, but I’d say that for a serial update by tablescan Oracle uses the “first piece scan” to minimise random I/O requests while for a parallel update by tablescan Oracle uses the “head piece scan” to minimise the risk of deadlocks – even though this is likely to increase the number of random (single block) reads.

Finally (to avoid ambiguity) if you’ve done an update which does a parallel tablescan but a serial update (by passing rowids to the query co-ordinator) then I’d hope that Oracle would use the “first piece scan” for the parallel tablescan because there’s no risk of deadlock when the query co-ordinator is the only process doing the locking and updating, which makes it safe to use the minimum I/O strategy. (And a parallel query with serial update happens quite frequently because people forget to enable parallel dml.)
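
If you’re not sure which case you’ve got, v$session will tell you whether parallel DML has been enabled for your session. A minimal sketch of the check, and of the fix:

select  pdml_status, pddl_status, pq_status
from    v$session
where   sid = sys_context('userenv','sid')
;

alter session enable parallel dml;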

Footnote

While messing around to see what happened with updates and rows that were both migrated and chained I ran the following script to create one nasty row, so that I could dump a few table blocks to check for ITLs, pointers, and locks. The aim was to get a row with a head-only piece (“H” bit), an F-only piece, a piece with no bits set, then an L-only piece. With an 8KB block size and 4,000 byte maximum for varchar2() this is what I did:


rem
rem     Script:         migrated_lock.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2019
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0
rem

create table t1 (
        n1 number,
        l1 varchar2(4000),
        s1 varchar2(200),
        l2 varchar2(4000),
        s2 varchar2(200),
        l3 varchar2(4000),
        s3 varchar2(200)
);

insert into t1 (n1,l1,s1) values(0,rpad('X',4000,'X'),rpad('X',200,'X'));
commit;

insert into t1 (n1,l1) values(1,null);
commit;

update t1 set
        l1 = rpad('A',4000),
        s1 = rpad('A',200),
        l2 = rpad('B',4000),
        s2 = rpad('B',200),
        l3 = rpad('C',4000),
        s3 = rpad('C',200)
where
        n1 = 1
;

commit;

execute dbms_stats.gather_table_stats(user,'t1');

update t1 set
        s1 = lower(s1),
        s2 = lower(s2),
        s3 = lower(s3)
where
        n1 = 1
;

alter system flush buffer_cache;

select
        dbms_rowid.rowid_relative_fno(rowid)    rel_file_no,
        dbms_rowid.rowid_block_number(rowid)    block_no,
        count(*)                                rows_starting_in_block
from
        t1
group by
        dbms_rowid.rowid_relative_fno(rowid),
        dbms_rowid.rowid_block_number(rowid)
order by
        dbms_rowid.rowid_relative_fno(rowid),
        dbms_rowid.rowid_block_number(rowid)
;

The query with all the calls to dbms_rowid gave me the file and block number of the row I was interested in, so I dumped the block, then read the trace file to find the next block in the chain, and so on. The first block held just the head piece, the second block held the n1 and l1 columns (which didn’t get modified by the update), the third block held the s1 and l2 columns, the last block held the s2, l3 and s3 columns. I had been expecting to see the split as (head-piece), (n1, l1, s1), (l2, s2), (l3, s3) – but as it turned out the unexpected split was a bonus.
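
For anyone repeating the experiment, the block dumps are simply a matter of flushing the buffer cache and dumping each file/block pair in turn; a sketch (the substitution variables stand in for whatever numbers the dbms_rowid query reports):

alter system flush buffer_cache;
alter system dump datafile &m_file_no block &m_block_no;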

Here are extracts from each of the blocks (in the order they appeared in the chain), showing the ITL information and the “row overhead” information. If you scan through the list you’ll see that three of the 4 blocks have an ITL entry for transaction id (xid) of 8.1e.df3, using three consecutive undo records in undo block 0x0100043d. My update has locked 3 of the 4 rowpieces – the header and the two that have changed. It didn’t need to “lock” the piece that didn’t change. (This little detail was the bonus of the unexpected split.)


Block 184
---------
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.00b.00000ee1  0x01000bc0.036a.36  C---    0  scn  0x00000000005beb39
0x02   0x0008.01e.00000df3  0x0100043d.0356.2e  ----    1  fsc 0x0000.00000000

...

tab 0, row 1, @0xf18
tl: 9 fb: --H----- lb: 0x2  cc: 0
nrid:  0x00800089.0



Block 137       (columns n1, l1 - DID NOT CHANGE so no ITL entry acquired)
---------       (the lock byte relates to the previous, not cleaned, update) 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.00b.00000ee1  0x01000bc0.036a.35  --U-    1  fsc 0x0000.005beb39
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn  0x0000000000000000

...

tab 0, row 0, @0xfcb
tl: 4021 fb: ----F--- lb: 0x1  cc: 2
hrid: 0x008000b8.1
nrid:  0x00800085.0



Block 133 (columns s1, l2)
--------------------------
Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.00b.00000ee1  0x01000bc0.036a.34  C---    0  scn  0x00000000005beb39
0x02   0x0008.01e.00000df3  0x0100043d.0356.2f  ----    1  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn  0x0000000000000000

...

tab 0, row 0, @0xf0b
tl: 4213 fb: -------- lb: 0x2  cc: 2
nrid:  0x008000bc.0



Block 188 (columns s2, l3, s3)
------------------------------
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.00b.00000ee1  0x01000bc0.036a.33  C---    0  scn  0x00000000005beb39
0x02   0x0008.01e.00000df3  0x0100043d.0356.30  ----    1  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  C---    0  scn  0x0000000000000000

...

tab 0, row 0, @0xe48
tl: 4408 fb: -----L-- lb: 0x2  cc: 3

Note, by the way, how there are nrid (next rowid) entries pointing forward in every row piece (except the last), but it’s only the “F” (First) row-piece that has the hrid (head rowid) pointer pointing backwards.

 

March 28, 2018

Lock Types

Filed under: Locks,Oracle — Jonathan Lewis @ 12:24 pm BST Mar 28,2018

Every now and again I have to check what a particular lock (or enqueue) type is for and what the associated parameter values represent. This often means I have to think about the names of a couple of views and a collection of columns – then create a few column formats to make the output readable (though sometimes I can take advantage of the “print_table()” procedure that Tom Kyte published a long time ago). It only takes a little time to get the code right, but it’s a nuisance when I’m in a hurry, so I’ve just scribbled out a few lines of a script that takes a lock type as an input parameter and reports all the information I want.

rem
rem     Script:         lock_types.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2018
rem     Usage:          start lock_types {lock type}
rem

define m_lock_type='&1'

column  display new_value m_display

select
        case when substr(version,1,2) in ('12', '18', '19')
        then
                'display_name'
        else
                'name'
        end                     display
from
        v$instance
;

set linesize 160
set pagesize 60
set trimspool on

column type             format a4
column name             format a32
column description      format a80
column id1_tag          format a32
column id2_tag          format a32
column is_user          format a4 heading "User"
column is_recycle       format a4 heading "Rcyc"

set feedback off
set verify off
break on report skip 1

spool lock_types

select  *
from    V$lock_type
where   type = upper('&m_lock_type')
order by
        type
;

column  name                    format a42
column  parameter1              format a9
column  parameter2              format a24
column  parameter3              format a22
column  wait_class              format a14
column  display_name            format a42

select
        eve.name,
        eve.parameter1,
        eve.parameter2,
        eve.parameter3,
        eve.wait_class,
        nullif(eve.&m_display, eve.name) display_name
from
        v$event_name    eve
where
        eve.name like 'enq: ' || upper('&m_lock_type') || '%'
order by
        nullif(eve.wait_class,'Other'),
        eve.name
;

set feedback on

I’ve included a check (and hack) on the value of the major version because 12c introduced a “display_name” as well as a “name” for events, and the latter is sometimes a little more descriptive than the former, so it’s nice to have a single script that can report both values for the versions that have them.

Here’s a sample of the output when I pass ‘IV’ as an input parameter:

TYPE NAME                             ID1_TAG                          ID2_TAG                          User Rcyc
---- -------------------------------- -------------------------------- -------------------------------- ---- ----
DESCRIPTION                                                                                              CON_ID
---------------------------------------------------------------------------------------------------- ----------
IV   Library Cache Invalidation       object #                         time stamp                       NO   NO
Synchronizes library cache object invalidations across instances                                              0


NAME                                       PARAMETER PARAMETER2               PARAMETER3             WAIT_CLASS     DISPLAY_NAME
------------------------------------------ --------- ------------------------ ---------------------- -------------- -----------------------
enq: IV -  contention                      type|mode id1                      id2                    Other

As you can see from the presence of the con_id column in v$lock_type this output came from a 12c instance. I picked the IV lock because that’s the one that prompted me to check the meanings of the id[12] and parameter[123] columns when a question about waits for the IV lock appeared recently on Oracle-L. I’ve got two reasons for carrying on with this particular example – first that it demonstrates that the descriptions can be wrong, second that it allows me to demonstrate a quick tip on translation.

The question on Oracle-L related to a 4-node RAC system and reported one instance suffering long waits on the IV enqueue on a fairly regular basis when running a particular batch task. The OP reported the following values as the p1, p2, p3 values from v$session while the wait was going on:


P1    type|mode 1230372869
P2    id1       1398361667
P3    id2                3

According to the details in v$lock_type the enqueue is about library cache invalidation across instances – and that fits the OP’s complaint because the system is a RAC system. The id1 value is supposed to be an obj# (object_id), but the OP said it wasn’t; and the id2 value is supposed to be a timestamp, but 3 is an odd value for a timestamp (though it might represent – for example – the 3 second wait that is a common time-out interval for enqueues). So, clearly, the descriptions can be wrong.

Translation

Take another look at p1 and p2, and turn them into Hexadecimal:


1230372869 (dec) = 0x49560005 (hex)
1398361667 (dec) = 0x53594E43 (hex)

If you happen to be good with Hex and ASCII code you’ll know that 2-byte values in the range 41-5F are mostly the capital letters of the Roman alphabet (while 61 – 7f are mostly the lower case letters), so a second translation step gives us:


1230372869 (dec) = 0x49560005 (hex) = 'IV'  5
1398361667 (dec) = 0x53594E43 (hex) = 'SYNC'
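If you’d rather let the database do the conversion, a query along the following lines (just a sketch, with the values hard-coded) handles the dec -> hex -> ASCII steps, splitting p1 into its lock type and mode at the same time:

select
        to_char(1230372869, 'fmXXXXXXXX')                               p1_hex,
        utl_raw.cast_to_varchar2(
                hextoraw(to_char(trunc(1230372869/65536), 'fmXXXX'))
        )                                                               p1_type,
        mod(1230372869, 65536)                                          p1_mode,
        utl_raw.cast_to_varchar2(
                hextoraw(to_char(1398361667, 'fmXXXXXXXX'))
        )                                                               p2_text
from
        dual
;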

The p1 parameter is described (correctly) as “type/mode” – this is an IV enqueue held or requested in mode 5; the p2 parameter is not an object number, it looks more like a text description of why the enqueue is being requested (the enqueue is, after all, described as being used to “synchronize library cache object invalidation”).

I still don’t know what the final parameter represents – I doubt if it’s really about a three second wait (but that could be checked by examining v$session_wait over a period of several seconds or v$session_wait_history), it might be an indication of the instance that the session is trying to synchronize with (though, again, that seems a long shot), or it might just be a “reason-code” describing why the synchronisation is necessary.

Whenever in doubt about the meaning for the various parameters, it’s always worth a quick dec -> hex -> ASCII check, just in case it offers some clues about the function of the enqueue.

Update (Feb 2021)

I’ve just received the following (paraphrased) email:

We are getting the wait event “enq: IV – contention” very frequently while doing a MERGE statement in our RAC Oracle database, and because of it the merge statement is taking a long time after upgrading to Oracle 12c.

I happened to see this article but it does not give any info as to how to resolve this issue. 

Any comments from you would be appreciated.

The little script was originally written to run on 12c, but I’ve just tweaked it to include 18c and 19c and run it on 19c and got an additional line in the results from v$event_name; here’s the latest output:


TYPE NAME                             ID1_TAG                          ID2_TAG                          User Rcyc
---- -------------------------------- -------------------------------- -------------------------------- ---- ----
DESCRIPTION                                                                                              CON_ID
---------------------------------------------------------------------------------------------------- ----------
IV   Library Cache Invalidation       object #                         time stamp                       NO   NO
Synchronizes library cache object invalidations across instances                                              0


NAME                                       PARAMETER PARAMETER2               PARAMETER3             WAIT_CLASS     DISPLAY_NAME
------------------------------------------ --------- ------------------------ ---------------------- -------------- ----------------------
enq: IV - cross instance invalidation      lock idx                                                  Concurrency
enq: IV -  contention                      type|mode id1                      id2                    Other

This tells us something very important about the IV wait events, which we can probably carry back from 19c to 12c: “IV Library Cache Invalidation” isn’t necessarily the same as “enq: IV – contention”, so we shouldn’t assume that we can interpret the P2 and P3 values according to the id1 and id2 tags in v$lock_type.

With a new reason for researching the enqueue it’s easy to start with a simple search, resulting in two hits on MOS documents:

  • Doc ID 2028503.1 12c RAC DDL Performance Issue: High “enq: IV – contention” etc if CPU Count is Different
  • Doc ID 2301046.1 “Enq: IV – Contention”, summary invalidations, and summary management compile

The first of these has already been mentioned in the comments below and relates to a configuration problem where cross-instance library cache invalidation can be slow if the number of LMD processes differs between instances. It’s interesting to note that the document specifically mentions a bug (closed as a duplicate) that reported performance degradation of a grant statement after upgrade to 12c (and presumably the same would have been true for revoke; I can’t help wondering if this is also true of the set role command – it doesn’t in single instance – since set role acts indirectly to grant privileges to users).

The second document describes, amongst other bits of information, the effect of DML on the base tables of materialized views and the subsequent need to invalidate “summary objects” across all instances – and it includes a simple query against [g]v$db_object_cache that might help to identify objects that are frequently invalidated. Following a slightly circuitous trail this also leads to bug 21418655: “PATCH 19450314 INCOMPLETE: MORE UNNECESSARY INVALIDATIONS IN 12C”, which is reported as fixed in 18.1 but has fixes included in patch sets for earlier versions.

So – bottom line response to the email that triggered this update:

  • if you’ve managed to collect some p1, p2, p3 values (e.g. from tracing or ASH) for the IV enqueue waits then you may get clues from any object numbers they contain;
  • if you check gv$db_object_cache for objects with high values for invalidations that may give you clues about interesting objects (see the sketch below);
  • if you have code that frequently grants or revokes privileges, that might have something to do with the problem;
  • if you have code that frequently modifies the base tables of materialized views (especially if you have many, possibly related, materialized views), that’s another possible source of the invalidations.
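As a starting point, here’s the sort of query I have in mind. It’s only a sketch, the threshold is arbitrary, and the “fetch first” clause needs 12c or later (use rownum in older versions):

select
        inst_id, owner, name, namespace, type,
        invalidations, loads, locks, pins
from
        gv$db_object_cache
where
        invalidations > 100             -- arbitrary threshold, adjust to taste
order by
        invalidations desc
fetch first 20 rows only
;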

 

May 9, 2016

RI Locks

Filed under: deadlocks,Indexing,Locks,Oracle,trace files,Troubleshooting — Jonathan Lewis @ 12:24 pm BST May 9,2016

RI = Referential Integrity: also known informally as parent/child integrity, and primary (or unique) key/foreign key checking.

I’m on a bit of a roll with things that I must have explained dozens or even hundreds of times in different environments without writing them up on my blog, so here’s a blog note I could have used to answer a question that came up over the weekend on the Oracle database forum.

What happens in the following scenario:


create table parent (
        id        number(8,0),
        constraint par_pk primary key(id)
);

create table child  (
        id_p      number(8,0) not null references parent,
        id_c      number(8,0) not null,
        constraint child_pk primary key(id_p, id_c)
)
;

--
--      session 1
--

insert into parent values(1);

--
--      session 2
--

insert into child values(1,1);

Since the parent row corresponding to the child row doesn’t seem to exist as far as session 2 is concerned you might expect session 2 to respond immediately with an error message like:

ERROR at line 1:
ORA-02291: integrity constraint (TEST_USER.SYS_C0017926) violated - parent key not found

In fact, although the end-user session is not allowed to see the uncommitted parent row, the user’s shadow process can see the uncommitted row and will wait until session 1 commits or rolls back – so if you examine v$lock for the current locks for the two sessions you’d see something like this:

  1  select  sid, type, id1, id2, lmode, request, ctime, block
  2  from    V$lock
  3  where   sid in (select sid from V$session where username = 'TEST_USER')
  4  and     type != 'AE'
  5  order by
  6*         sid, type desc
  7  /

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------- -- ---------- ---------- ---------- ---------- ---------- ----------
         3 TX     327709      12584          6          0        283          1
           TM     143734          0          2          0        283          0
           TM     143732          0          3          0        283          0

       250 TX     589829      12877          6          0        240          0
           TX     327709      12584          0          4        240          0
           TM     143734          0          3          0        240          0
           TM     143732          0          3          0        240          0


7 rows selected.

In the above, SID 250 is session 2: it’s holding a transaction lock (TX) in mode 6 (exclusive) because it has acquired an undo segment and has generated some undo, it’s also waiting for a transaction lock in mode 4 (share) and – checking id1 and id2 – we can see that the transaction table entry it’s waiting for is held by SID 3 (session 1)  in mode 6 (and we also note that the lock held by SID 3 is marked as a blocker).

If SID 3 commits (thus releasing the transaction lock) SID 250 will continue processing the insert; if SID 3 rolls back SID 250 will raise error ORA-02291 and roll back its insert statement. (Note: if this were a multi-statement transaction it would only be the insert into child that would be rolled back; that’s another one of those details that is important but often isn’t stated explicitly, leaving people believing that the entire transaction would be rolled back.)

Updates and deletes can produce the same effects. Imagine that we have just created the two tables, and then run the following:



--
--      session 1
--

insert into parent values(1);
commit;
delete from parent where id = 1;

--
--      session 2
--

insert into child values(1,1);

Again session 2 will wait for session 1 to commit or roll back. In this case if session 1 commits session 2 will raise Oracle error ORA-02291, if session 1 rolls back session 2 will continue with the insert.

Deadlocks

Whenever you can demonstrate a way of producing a wait chain you can also manage to produce a deadlock. Consider the following (starting, again, from empty tables);


-- (1) session 1
insert into parent values(1);

-- (2) session 2
insert into parent values(2);

-- (3) session 1
insert into child values(2,2);

-- (4)session 2
insert into child values(1,1);

Session 1 will start waiting for session 2 to commit (or rollback) at step 3, then session 2 will start to wait for session 1 at step 4 – with the result that session 1 will recognise the deadlock after about three seconds and rollback its last statement, raising exception ORA-00060 and dumping a trace file. (Note: session 1 will not, as many people think, roll back the entire transaction, it will roll back only the single statement that allowed the deadlock to develop). Session 2 will still be waiting for session 1 to commit or rollback its insert into parent.

Contrary to the popular claim, Oracle will not “resolve” the deadlock, it will simply break the deadlock leaving one session waiting for the other session to respond appropriately to the deadlock error.

For reference, here’s the deadlock graph (from a 12c trace file) produced by session 1 (SID = 3) for this demo:


Deadlock graph:
                                          ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name                             process session holds waits  process session holds waits
TX-00010017-000026C7-00000000-00000000          6       3     X             33     250           S
TX-000A000D-000026F8-00000000-00000000         33     250     X              6       3           S

session 3: DID 0001-0006-00000004       session 250: DID 0001-0021-00000041
session 250: DID 0001-0021-00000041     session 3: DID 0001-0006-00000004

Rows waited on:
  Session 3: no row
  Session 250: no row

When you see a deadlock graph with TX waits of type S (share, mode 4) it’s a very good bet that the wait has something to do with indexes – which may mean referential integrity as discussed here, but may mean collisions on primary keys, and may mean something to do with simple collisions on index-organized tables. You’ll notice that the “Rows waited on:” section shows no row – unfortunately in earlier versions of Oracle you may find a spurious row entry here because the wait information from some other (block) wait has been left in the relevant columns in v$session.

Footnote

The “lock at the opposite end” strategy has changed several times over the many versions of Oracle. At one point the lock taken at the opposite end of the constraint was a mode 2; then it changed to a mode 3. In the latest versions of Oracle an insert into a parent produces a mode 2 on the child but an insert into a child produces a mode 3 on the parent.
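If you want to check the behaviour on your own version, a query like the following will show the TM modes held while the two inserts are still uncommitted. It’s only a sketch; it assumes you run it as the owner of the demo parent and child tables, from a third session:

select
        lck.sid, lck.type, obj.object_name, lck.lmode, lck.request, lck.ctime
from
        v$lock          lck,
        user_objects    obj
where
        lck.type      = 'TM'
and     obj.object_id = lck.id1
and     obj.object_name in ('PARENT', 'CHILD')
order by
        lck.sid, obj.object_name
;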

 

 

March 11, 2015

Flashback Logging

Filed under: Flashback,Infrastructure,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 3:21 pm GMT Mar 11,2015

One of the waits that is specific to ASSM (automatic segment space management) is the “enq: FB – contention” wait. Despite the title, the “FB” in this case doesn’t mean “FlashBack”; instead, when you query v$lock_type and v$event_name you will find that the “FB” enqueue has the following description and wait information:


SQL> execute print_table('select * from v$lock_type where type = ''FB''')
TYPE                          : FB
NAME                          : Format Block
ID1_TAG                       : tablespace #
ID2_TAG                       : dba
IS_USER                       : NO
DESCRIPTION                   : Ensures that only one process can format data blocks in auto segment space managed tablespaces

SQL> execute print_table('select * from v$event_name where name like ''enq: FB%''')
EVENT#                        : 806
EVENT_ID                      : 1238611814
NAME                          : enq: FB - contention
PARAMETER1                    : name|mode
PARAMETER2                    : tablespace #
PARAMETER3                    : dba
WAIT_CLASS_ID                 : 1893977003
WAIT_CLASS#                   : 0
WAIT_CLASS                    : Other

This tells us that a process will acquire the lock when it wants to format a batch of blocks in a segment in a tablespace using ASSM – and prior experience tells us that this is a batch of 16 consecutive blocks in the current extent of the segment. When we see a wait for an FB enqueue we can assume that two sessions have simultaneously tried to format the same new batch of blocks and one of them is waiting for the other to complete the format. In some ways this wait can be viewed (like the “read by other session” wait) in a positive light – if the second session weren’t waiting for the first session to complete the block format it would have to do the formatting itself, which means the second end-user is actually going to get an improved response time thanks to the work done by the first session. On the other hand the set of 16 blocks picked by a session is dependent on its process id so the second session might have picked a different set of 16 blocks to format, which means that in the elapsed time of one format call the segment could have had 32 blocks formatted – this wouldn’t have improved the end-user’s response time, but it would mean that more time would pass before another session had to spend time formatting blocks. Basically, in a highly concurrent system, there’s not a lot you can do about FB waits (unless, of course, you do some clever partitioning of the hot objects).

There is actually one set of circumstances where you can have some control of how much time is spent on the wait but before I mention it I’d like to point out a couple more details about the event itself. First, parameter3 / id2_tag is a little misleading: you can use it to work out which blocks are being formatted (if you really need to) but the “dba” is NOT a data block address (which you might think if you look at the name and a few values). There is a special case when the FB enqueue is being held while you format blocks in a 64KB extent if you’re using system allocated extents, and there’s probably a special case (which I haven’t bothered to examine) if you create a tablespace with uniform extents that aren’t a multiple of 16 blocks, but in the general case the “dba” seems to consist of two parts – a base “data block address” and a single (hex) digit offset identifying which batch of 16 blocks will be formatted.

For example: a value of 0x01800242 means start at data block address 0x01800240, count forward 2 * 16 blocks then format 16 blocks from that point onwards. Since the last digit can only range from 0x0 to 0xf this means the first 7 (hex) digits of a “dba” can only reference 16 batches of 16 blocks, i.e. 256 blocks. It’s not coincidence (I assume) that a single bitmap space management block can cover a maximum of 256 consecutive blocks in a segment – the FB enqueue is tied very closely to the bitmap block.
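As a sketch of the arithmetic, here’s how you might decode such a value. The id2 used here is the hypothetical 0x01800242 (25166402 in decimal) from the example, and the query assumes the general case described above rather than one of the special cases:

-- hypothetical FB enqueue id2: 0x01800242 = 25166402 decimal
define m_id2 = 25166402

select
        -- mask off the last hex digit (bitand with 0xfffffff0) to get the base dba
        dbms_utility.data_block_address_file(bitand(&m_id2, 4294967280))       file_no,
        -- then step forward (last digit * 16) blocks to the start of the batch
        dbms_utility.data_block_address_block(bitand(&m_id2, 4294967280))
                + 16 * bitand(&m_id2, 15)                                      first_block_of_batch
from
        dual
;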

Connecting the FB enqueue to Flashback logging.

So now it’s time to ask why this discussion of the FB enqueue appears in an article titled “Flashback Logging”. Enable the 10704 trace at level 10, along with the 10046 trace at level 8 and you’ll see. Remember that Oracle may have to log the old version of a block before modifying it and if it’s a block that’s being reused it may contribute to the statistic “physical reads for flashback new” – here’s a trace of a “format block” event:


*** 2015-03-10 12:50:35.496
ksucti: init session DID from txn DID:
ksqgtl:
        ksqlkdid: 0001-0023-00000014

*** 2015-03-10 12:50:35.496
*** ksudidTrace: ksqgtl
        ktcmydid(): 0001-0023-00000014
        ksusesdi:   0000-0000-00000000
        ksusetxn:   0001-0023-00000014
ksqgtl: RETURNS 0
WAIT #140627501114184: nam='db file sequential read' ela= 4217 file#=6 block#=736 blocks=1 obj#=192544 tim=1425991835501051
WAIT #140627501114184: nam='db file sequential read' ela= 674 file#=6 block#=737 blocks=1 obj#=192544 tim=1425991835501761
WAIT #140627501114184: nam='db file sequential read' ela= 486 file#=6 block#=738 blocks=1 obj#=192544 tim=1425991835502278
WAIT #140627501114184: nam='db file sequential read' ela= 522 file#=6 block#=739 blocks=1 obj#=192544 tim=1425991835502831
WAIT #140627501114184: nam='db file sequential read' ela= 460 file#=6 block#=740 blocks=1 obj#=192544 tim=1425991835503326
WAIT #140627501114184: nam='db file sequential read' ela= 1148 file#=6 block#=741 blocks=1 obj#=192544 tim=1425991835504506
WAIT #140627501114184: nam='db file sequential read' ela= 443 file#=6 block#=742 blocks=1 obj#=192544 tim=1425991835504990
WAIT #140627501114184: nam='db file sequential read' ela= 455 file#=6 block#=743 blocks=1 obj#=192544 tim=1425991835505477
WAIT #140627501114184: nam='db file sequential read' ela= 449 file#=6 block#=744 blocks=1 obj#=192544 tim=1425991835505985
WAIT #140627501114184: nam='db file sequential read' ela= 591 file#=6 block#=745 blocks=1 obj#=192544 tim=1425991835506615
WAIT #140627501114184: nam='db file sequential read' ela= 449 file#=6 block#=746 blocks=1 obj#=192544 tim=1425991835507157
WAIT #140627501114184: nam='db file sequential read' ela= 489 file#=6 block#=747 blocks=1 obj#=192544 tim=1425991835507684
WAIT #140627501114184: nam='db file sequential read' ela= 375 file#=6 block#=748 blocks=1 obj#=192544 tim=1425991835508101
WAIT #140627501114184: nam='db file sequential read' ela= 463 file#=6 block#=749 blocks=1 obj#=192544 tim=1425991835508619
WAIT #140627501114184: nam='db file sequential read' ela= 685 file#=6 block#=750 blocks=1 obj#=192544 tim=1425991835509400
WAIT #140627501114184: nam='db file sequential read' ela= 407 file#=6 block#=751 blocks=1 obj#=192544 tim=1425991835509841

*** 2015-03-10 12:50:35.509
ksqrcl: FB,16,18002c2
ksqrcl: returns 0

Note: we acquire the lock (ksqgtl), read 16 blocks by “db file sequential read”, write them to the flashback log (buffer), format them in memory, release the lock (ksqrcl). That lock can be held for quite a long time – in this case 13 milliseconds. Fortunately, as the output above shows us, the single block reads after the first one may all be accelerated by O/S prefetching:  your timings may vary.
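If you want to reproduce this type of trace the calls to enable and disable the two events at the session level are the usual ones; here’s a sketch as a reminder:

alter session set events '10046 trace name context forever, level 8';
alter session set events '10704 trace name context forever, level 10';

--      do something that has to format a fresh batch of blocks

alter session set events '10704 trace name context off';
alter session set events '10046 trace name context off';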

The higher the level of concurrent activity the more likely it is that processes will collide trying to format the same 16 blocks (the lock is exclusive, so the second will request and wait, then find that the blocks are already formatted when it finally gets the lock). This brings me to the special case where waits for the FB enqueue might have a noticeable impact … if you’re running parallel DML and Oracle decides to use “High Water Mark Brokering”, which means the parallel slaves are inserting data into a single segment instead of each using its own private segment and leaving the query co-ordinator to clean up round the edges afterwards. I think this is most likely to happen if you have a tablespace using fairly large extents and Oracle thinks you’re going to process a relatively small amount of data (e.g. small indexes on large tables) – the trade-off is between collisions between processes and wasted space from the private segments.

October 22, 2014

Deadlocks

Filed under: deadlocks,Locks,Oracle — Jonathan Lewis @ 6:34 pm BST Oct 22,2014

A recent question on the OTN forum asked about narrowing down the cause of deadlocks, and this prompted me to set up a little example. Here’s a deadlock graph of a not-quite-standard type:


Deadlock graph:
                                          ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name                             process session holds waits  process session holds waits
TX-00040001-000008EC-00000000-00000000         50     249     X             48       9           X
TX-000A001F-000008BC-00000000-00000000         48       9     X             50     249           S


My session (the one that dumped the trace file) is 249, and I was blocked by session 9. The slight anomaly, of course, is that I was waiting on a TX lock in mode 4 (Share) rather than the more common mode 6 (eXclusive).

There are plenty of notes on the web these days to tell you that TX mode 4 relates in some way to a unique index (or some associated referential integrity) or an ITL wait. (Inevitably there are a couple of other less frequently occurring and less well documented reasons, such as waits for tablespaces to change state, but I’m going to ignore those for now.) The question is, how do I tell whether this example is related to uniqueness (indexing) or ITLs? For recent versions of Oracle the answer is in the rest of the trace file which now holds the “Current Wait Stack” and “Session Wait History” for the session that dumped the trace file.

Reading down my trace file, past the line which says “Information for THIS session”, I eventually get to this:


    Current Wait Stack:
     0: waiting for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn<<16 | slot=0xa001f, sequence=0x8bc
        wait_id=80 seq_num=81 snap_id=1
 

So it didn’t take me long to find out I had an ITL problem (which should be a pretty rare occurrence in newer versions of Oracle); but there’s more:

...

    There is at least one session blocking this session.
      Dumping 1 direct blocker(s):
        inst: 1, sid: 9, ser: 40192
      Dumping final blocker:
        inst: 1, sid: 9, ser: 40192
    There are 2 sessions blocked by this session.
    Dumping one waiter:
      inst: 1, sid: 357, ser: 7531
      wait event: 'enq: TX - allocate ITL entry'

...

    Session Wait History:
        elapsed time of 0.000035 sec since current wait
     0: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn<<16 | slot=0x5000c, sequence=0xa39
        wait_id=79 seq_num=80 snap_id=1
        wait times: snap=5.002987 sec, exc=5.002987 sec, total=5.002987 sec
        wait times: max=5.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000047 sec of elapsed time
     1: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn<<16 | slot=0xa001f, sequence=0x8bc
        wait_id=78 seq_num=79 snap_id=1
        wait times: snap=1 min 4 sec, exc=1 min 4 sec, total=1 min 4 sec
        wait times: max=1 min 4 sec
        wait counts: calls=22 os=22
        occurred after 0.000032 sec of elapsed time

...
     8: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn<<16 | slot=0x5000c, sequence=0xa39
        wait_id=71 seq_num=72 snap_id=1
        wait times: snap=5.001902 sec, exc=5.001902 sec, total=5.001902 sec
        wait times: max=5.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000042 sec of elapsed time
     9: waited for 'enq: TX - allocate ITL entry'
        name|mode=0x54580004, usn<<16 | slot=0xa001f, sequence=0x8bc
        wait_id=70 seq_num=71 snap_id=1
        wait times: snap=4.005342 sec, exc=4.005342 sec, total=4.005342 sec
        wait times: max=4.000000 sec
        wait counts: calls=2 os=2
        occurred after 0.000031 sec of elapsed time

...

    Sampled Session History of session 249 serial 3931
    ---------------------------------------------------

    The history is displayed in reverse chronological order.

    sample interval: 1 sec, max history 120 sec
    ---------------------------------------------------
      [9 samples,                                          11:14:50 - 11:14:58]
        waited for 'enq: TX - allocate ITL entry', seq_num: 81
          p1: 'name|mode'=0x54580004
          p2: 'usn<<16 | slot'=0xa001f
          p3: 'sequence'=0x8bc
          time_waited: >= 8 sec (still in wait)
      [5 samples,                                          11:14:45 - 11:14:49]
        waited for 'enq: TX - allocate ITL entry', seq_num: 80
          p1: 'name|mode'=0x54580004
          p2: 'usn<<16 | slot'=0x5000c
          p3: 'sequence'=0xa39
          time_waited: 5.002987 sec (sample interval: 4 sec)
...

The little report that follows the initial wait state shows that the situation was a little messy – session 9 was my first and last blocker, but there was another session tangled up in the chain of waits, session 357.

Following this there’s a set of entries from my v$session_wait_history – and if you look carefully at the slot and sequence that appears on the second line of each wait you’ll notice that my waits have been alternating between TWO other sessions/transactions before I finally crashed.

Finally there’s a set of entries for my session extracted from v$active_session_history. (Question: I’m only allowed to query v$active_session_history if I’ve licensed the Diagnostic Pack – so should I shut my eyes when I get to this part of the trace file?) This breakdown also shows my session alternating between waits on the two different blockers, giving me a pretty good post-event breakdown of what was going on around the time of the deadlock.

September 24, 2014

Lock Time

Filed under: AWR,Locks,Oracle,Statspack — Jonathan Lewis @ 12:42 pm BST Sep 24,2014

Here’s a little detail I was forced to re-learn yesterday. It’s one of those things where it’s easy to say “yes, obviously” after you’ve had it explained, so I’m going to start by posing it as a question. Here are two samples of PL/SQL that use locking to handle a simple synchronisation mechanism; one uses a table as an object that can be locked, the other uses Oracle’s dbms_lock package. I’ve posted the code for each fragment and a sample of what you see in v$lock if two sessions execute the code one after the other so that the second session has to wait for the first session to commit (or otherwise release the lock):

Table locking.


begin
        lock table t1 in exclusive mode;
end;
/

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
00007FF409E57BF8 00007FF409E57C58         15 TM     157778          0          0          6         65          0
00007FF409E57BF8 00007FF409E57C58        125 TM     157778          0          6          0         91          1

Using dbms_lock.

variable m_handle       varchar2(255);

declare
        n1              number;
begin

        dbms_lock.allocate_unique(
                lockname        => 'Synchronize',
                lockhandle      => :m_handle
        );

        dbms_output.put_line(:m_handle);

        n1 := dbms_lock.request(
                lockhandle              => :m_handle,
                lockmode                => dbms_lock.x_mode,
                timeout                 => dbms_lock.maxwait,
                release_on_commit       => true         -- the default is false !!
        );

        dbms_output.put_line(n1);

end;
/

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
000000008ED8F738 000000008ED8F790         15 UL 1073742427          0          0          6         42          0
000000008ED902B0 000000008ED90308        125 UL 1073742427          0          6          0        103          1

The big question is this: although the two code fragments produce the same effects in terms of lock waits and the reports from v$lock, what’s the big difference in the way that they are reported in the AWR report?

The high-level difference appears in the Time Model stats. Here are two extracts showing the difference:

Using dbms_lock.

Time Model Statistics

Statistic Name                                       Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time                                 65.4         99.9
PL/SQL execution elapsed time                            63.8         97.4


SQL ordered by Elapsed Time

-> Captured SQL account for    2.8% of Total DB Time (s):              65
-> Captured PL/SQL account for   99.4% of Total DB Time (s):              65

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
            63.7              1         63.72   97.3     .0     .0 10u1qbw4a27sp
Module: SQL*Plus
declare n1 number; begin dbms_lock.allocate_unique
( lockname => 'Synchronize', lockhandle
 => :m_handle ); dbms_output.put_line(:m_handle); n
1 := dbms_lock.request( lockhandle => :m_handle,

Table locking method:

Time Model Statistics

Statistic Name                                       Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time                                 95.5         99.9
DB CPU                                                    0.9           .9
parse time elapsed                                        0.1           .1
hard parse elapsed time                                   0.1           .1
PL/SQL execution elapsed time                             0.1           .1


SQL ordered by Elapsed Time

-> Captured SQL account for   99.6% of Total DB Time (s):              96
-> Captured PL/SQL account for   98.7% of Total DB Time (s):              96

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
            93.9              1         93.88   98.3     .0     .0 8apkdghttmndx
Module: SQL*Plus
begin lock table t1 in exclusive mode; end;

            93.9              1         93.88   98.3     .0     .0 29fwr53agvbc0
Module: SQL*Plus
LOCK TABLE T1 IN EXCLUSIVE MODE

The time spent waiting for the table lock is reported purely as SQL execution time in the Time Model stats but the time spent waiting for the user-defined lock is reported as SQL execution time and as PL/SQL execution time. I had forgotten this yesterday as I was skipping through the various headline figures of an hourly snapshot, so I was amazed to see the Time Model stats reporting 33,000 seconds of PL/SQL and 66,000 seconds of SQL – how on earth do you manage to do that much PL/SQL on any Oracle system? (To add to the embarrassment, it had only been a few moments earlier that I’d gone through the Top 5 Timed Events and said something like: “you can ignore all time spent on ‘enq: UL – contention’, it’s probably just synchronisation code”.)

In this case the SQL ordered by Elapsed Time gives you a strong visual clue about what’s going on with the two lines that are clearly related – but it won’t always be that obvious.

Bottom Line:

PL/SQL execution time includes the time spent waiting for UL enqueues (user-defined locks), so don’t forget that you may need to subtract the wait time for ‘enq: UL – contention’ from the PL/SQL time before you start to worry about how much actual work you’re doing in PL/SQL.
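If you just want a rough sanity check at the instance level (rather than the delta between two AWR snapshots), a query along the following lines shows how much of the PL/SQL time is really UL waiting. It’s only a sketch, it reports totals since instance startup, and it assumes the UL waits happened inside PL/SQL calls (which is where dbms_lock.request is usually invoked from):

select
        stm.value / 1e6                                 plsql_elapsed_seconds,
        evt.time_waited_micro / 1e6                     ul_wait_seconds,
        (stm.value - evt.time_waited_micro) / 1e6       plsql_excluding_ul_seconds
from
        v$sys_time_model        stm,
        v$system_event          evt
where
        stm.stat_name = 'PL/SQL execution elapsed time'
and     evt.event     = 'enq: UL - contention'
;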

Update (Sept 2020)

A recent test on 19.3 produced exactly the same behaviour.

July 13, 2014

Deferrable RI – 2

Filed under: Infrastructure,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 7:46 pm BST Jul 13,2014

A question came up on Oracle-L recently about possible locking anomalies with deferrable referential integrity constraints.

An update by primary key is taking a long time; the update sets several columns, one of which is the child end of a referential integrity constraint. A check on v$active_session_history shows lots of waits for “enq: TX – row lock contention” in mode 4 (share), and many of these waits also identify the current object as the index that has been created to avoid the “foreign key locking” problem on this constraint (though many of the waits show the current_obj# as -1). A possibly significant feature of the issue is that the foreign key constraint is defined as “deferrable initially deferred”. The question is, could such a constraint result in TX/4 waits?

My initial thought was that if the constraint was deferrable this was unlikely; there would have to be other features coming into play.

When the foreign key is NOT deferrable it’s easy to set up cases where a TX/4 appears; for example, you insert a new parent value without issuing a commit, then I insert a new matching child – at that point my session will wait for your session to commit or rollback. If you commit, my insert succeeds; if you rollback, my session raises an error (ORA-02291: integrity constraint (schema_name.constraint_name) violated – parent key not found). But if the foreign key is deferred my first guess would be that the non-existence (or potential existence, or not) of the parent shouldn’t matter and you could get away with things like this so long as you fixed up the data in time for the commit.

I was wrong. Here’s a little example:

rem
rem     Script:         c_pk_ck_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Feb 2014
rem
rem     Last tested
rem             11.2.0.4
rem

create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
)
;

create table child(
        id_p    number(4)
                constraint chi_fk_par
                references parent
                deferrable initially deferred,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id)
)
;

insert into parent values (1,'Smith');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Jack');
insert into child values(2,2,'Jill');

commit;

begin
        dbms_stats.gather_table_stats(user,'parent');
        dbms_stats.gather_table_stats(user,'child');
end;
/

pause Press return

update child set id_p = 3 where id_p = 2 and id = 2;

If you don’t do anything in a second session during the pause then the update will succeed – but the transaction will fail on a subsequent commit unless you insert parent 3 before committing. But if you take advantage of the pause to use another session to insert parent 3 first, the update will then hang waiting for the parent insert to commit or rollback – and what happens next may surprise you. Basically the deferrability doesn’t protect you from the side effects of conflicting transactions.
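For anyone following along, the “other session” activity during the pause is just a single uncommitted insert; here’s a sketch (the name is arbitrary):

--
--      session 2, during the pause - do NOT commit
--

insert into parent values (3, 'Brown');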

The variations on what can happen next (insert the parent elsewhere, commit or rollback) are interesting and left as an exercise.

I was slightly surprised to find that I had had a conversation about this sort of thing some time ago, triggered by a comment to an earlier post. If you want to read a more thorough investigation of the things that can happen and how deferrable RI works then there’s a good article at this URL. [Update: the URL is, alas, no longer live].

March 13, 2014

Shrink Space

Filed under: Bugs,deadlocks,fragmentation,Index Rebuilds,Indexing,Locks,Oracle — Jonathan Lewis @ 7:08 am GMT Mar 13,2014

Here’s an example of a nasty accident that can be seen in a slightly unusual output from v$lock (on 11.2.0.4).

Rather conveniently it demonstrates a comment I’ve often made about deadlocks not being “resolved” automatically, with the result that incorrectly written applications can end up with two sessions taking it in turns (see-sawing) to block each other.
(more…)

May 2, 2013

v$lock

Filed under: Locks,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 7:03 pm BST May 2,2013

The problem of slow queries on v$lock just came up again on the OTN database forum, so I thought I’d better push out a post that’s been hanging around on my blog for the last few months. This is actually mentioned in MOS note 1328789.1: “Query Against v$lock Run from OEM Performs Slowly” which points out that it is basically a problem of bad statistics and all you have to do is collect the stats.

(more…)

April 13, 2013

Deadlocks

Filed under: deadlocks,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 11:36 am BST Apr 13,2013

Here’s a deadlock graph that might cause a little confusion:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
 Resource Name          process session holds waits  process session holds waits
 TX-001a0015-00014787        34      90     X             32       3           S
 TX-00190008-0000601b        32       3     X             34      90           S

 session 90: DID 0001-0022-00000327      session 3: DID 0001-0020-000009E9
 session 3: DID 0001-0020-000009E9       session 90: DID 0001-0022-00000327

 Rows waited on:
  Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
  (dictionary objn - 199909, file - 6, block - 276654, slot - 0)
  Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
  (dictionary objn - 199909, file - 6, block - 433952, slot - 0)

(more…)

March 20, 2013

Lock Bug

Filed under: Bugs,deadlocks,Locks,Oracle — Jonathan Lewis @ 12:08 pm GMT Mar 20,2013

Here’s an oddity that I ran into a little while ago while trying to prepare a sample trace file showing a particular locking pattern; it was something that I’d done before, but trace files can change with different versions of Oracle so I decided to use a copy of 11.2.0.2 that happened to be handy at the time to check if anything had changed since the previous (11gR1) release. I never managed to finish the test; here are the steps I got through:
(more…)

February 22, 2013

Deadlock Detection

Filed under: deadlocks,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 6:19 pm GMT Feb 22,2013

By some strange coincidence, the “London Bus” effect perhaps, there have been three posts on the OTN database forum in the last couple of days relating to deadlocks; and they have prompted me to indulge in a little rant about the myth of Oracle and deadlock detection; it’s the one that goes:

“Oracle detects and resolves deadlocks automatically.”

Oracle may detect deadlocks automatically, but it doesn’t resolve them, it simply reports them (by raising error ORA-00060 and rolling back one statement) then leaves the deadlocked sessions stuck until the session that received the report resolves the problem or an external agent resolves the problem.

Consider the following example (which, I have to admit, I wrote without access to a live instance):

(more…)

