Oracle Scratchpad

May 31, 2010

Skip Locked

Filed under: Troubleshooting — Jonathan Lewis @ 7:41 pm BST May 31,2010

You may already be familiar with the syntax:

select ... for update skip locked;

It appeared (internally) some time back in Oracle 8i to implement some features of advanced queueing, but was only legalised and documented in one of the more recent versions of Oracle. If, like me, you checked the manuals to understand what this did you probably came to the conclusion that it allowed you to issue a query that selected, locked, and returned the rows in a table that matched your query and were not already locked. This isn’t quite correct – the rows your query selects, locks and returns are the ones that can be locked … and that may not give you the results you expect.

I’m in Detroit at the moment, about to start a consulting assignment, and Mark Bobak mentioned this one to me on the drive into town from the airport. As he said: “It’s one of those things that’s obvious after you’ve worked out what’s going on.” If you want to read more, here’s a link to the blog item he wrote about it.

16 Comments »

  1. I believe Mark got it wrong. Where rownum<100 is applied before the “skip locked” and therefore he should get the unlocked rows from within the first 100. It is expected that the number will be less than 100. I could easily reproduce his issue with just two sessions:

    Session 1:

    SQL> create table t1 as select object_id from dba_objects where rownum<102;
    
    Table created.
    
    SQL> select count(*),max(object_id) from t1;
    
      COUNT(*) MAX(OBJECT_ID)
    ---------- --------------
           101            102
    
    SQL> update t1 set object_id=500 where object_id=90;
    
    1 row updated.
    
    SQL> update t1 set object_id=501 where object_id=91;
    
    1 row updated.
    

    Session 2:

    SQL> select * from t1 where rownum<100 for update skip locked;
    
    .....
    
    97 rows selected.
    
    

    101 rows, two locked. Wouldn’t you expect 99 rows to return? But the two locked are below 100. If I lock two above 100, this would not be an issue:

    session 1:

    SQL> rollback;
    
    Rollback complete.
    
    SQL>  update t1 set object_id=500 where object_id=101;
    
    1 row updated.
    
    SQL> update t1 set object_id=501 where object_id=102;
    
    1 row updated.
    
    

    session 2:

    SQL>  select * from t1 where rownum<100 for update skip locked;
    
    ....
    
    99 rows selected.
    

    Comment by Gwen Shapira — June 1, 2010 @ 1:19 am BST Jun 1,2010 | Reply

  2. Hi Chen,

    I take your point, and it’s correct.

    But, your demo is substantially different than my scenario.

    Further, I don’t believe I “got it wrong”.

    Allow me to explain why:

    First, the sample query in my write-up was:

    select ...
    from    [table_name]
    where   [some predicates]
    and     rownum &lt; 101
    for update skip locked;
    

    Note, in particular, the "[some predicates]".

    This is important, in that, in my case the table has 10s of thousands or 100s of thousands of rows, and the "[some predicates]" clause is going to allow for hundreds or thousands of rows that will satisfy the condition. Further, multiple, concurrent processes are running with different values of binds in the predicates. So, in this case, it's quite possible that, given high levels of concurrency, and with each process running with unique bind variable values, you'll run into ITL contention. Also, consider that when we edited our code to remove the "skip locked" clause, we did, in fact, immediately see ITL waits on TX enqueues. If something else were actually going on, the ITL waits would not have been visible.

    The point of my blog post was simply that, when using "skip locked", consider that the skipping could actually happen at the block level. This is something that I found counterintuitive, and, to my knowledge, not documented at all.

    Comment by mbobak — June 1, 2010 @ 5:39 am BST Jun 1,2010 | Reply

  3. Adding an ORDER BY may be able to prove the ITL hypothesis. Take a table with two blocks with 100 records each and ITLs that allowed only 25 of them to be locked. Lock 25 of the records on the first block in one session, then add another session grabbing 10 records with skip locked with an ORDER BY such that the records on the first block are preferred. If it grabs 10 records off the second block, then it is confirmed that it has skipped those rows on the first block which aren’t individually locked.

    Personally it sounds like a well implemented feature, at least from a processing perspective. You want lots of consumers processing available/unencumbered rows with minimal waiting. If it just moved waits from row locks to ITL, you’d still be waiting.

    In regards the ROWNUM filter being applied before the SKIP LOCKED exclusions, this one might be better to do with a PL/SQL BULK FETCH with a LIMIT clause.

    Comment by Gary — June 1, 2010 @ 6:55 am BST Jun 1,2010 | Reply

  4. Chen’s point is an important one – the rownum applies before the skip locked so if you see a case where Oracle is not returning data that “obviously” should be returned you should consider the possibility she offers before assuming you are seeing a case of Mark’s problem.

    In the discussion we had in the car, though, it was very obvious that we were talking about a case where rows which could not have been (individually) locked were being skipped – and that was the point of Mark’s posting.

    For a quick demo case – which may be very sensitive to configuration: I used 10.2.0.3, freelist management, 8KB block sizes, and uniform extents of 1MB:

    create table t1 (
    	id	number(6),
    	modded	number(6)
    )
    pctused 99
    pctfree 0
    ;
    
    insert into t1
    select
    	rownum		id,
    	mod(rownum,3)	modded
    from
    	all_objects
    where
    	rownum <= 5000
    ;
    
    commit;
    
    -- gather stats at this point.
    
    

    This gave me 693 rows in the first block, 2 entries in the ITL, and 10 bytes of free space so that I could not add an ITL entry to the block.

    Run the following from three different sessions – supply 0, 1, and 2 as the input parameter in turn.

    select  id
    from    t1
    where   modded = &1
    and     rownum <=100
    for update
    skip locked
    /
    
    

    The first two sessions will return 100 rows, the third session will (should) return no rows – even though there are no locked rows for the given value of modded – but the first hundred rows that will be scanned cannot be locked because an ITL entry cannot be acquired.

    Remove the skip locked from the query and repeat the test – the third session will get stuck on “enq: TX – allocate ITL entry”.

    Comment by Jonathan Lewis — June 1, 2010 @ 7:43 am BST Jun 1,2010 | Reply

    • Hello Sir,

      “The first two sessions will return 100 rows, the third session will (should) return no rows – even though there are no locked rows for the given value of modded – but the first hundred rows that will be scanned cannot be locked because an ITL entry cannot be acquired….”

      As per your example theree are 693 rows in the first block and I assume the same for the rest of the blocks so there will be 231 rows per modded value

      so why not oracle goes to another block where it could find rows where modded=2 since the intent is to return 100 rows matching the condition also we donot have any order by clause?

      could please clarify

      Many Thanks

      Comment by Henish — June 4, 2010 @ 4:31 pm BST Jun 4,2010 | Reply

      • Henish,

        The clue is in Gwen’s comment (repeated in the opening line of my comment). The “rownum <= 100" predicate takes precedence over the locking. Oracle stops after attempting to lock 100 rows, not after locking 100 rows. It's just the way it behaves – maybe at some point release it will just change.

        Comment by Jonathan Lewis — June 8, 2010 @ 10:03 pm BST Jun 8,2010 | Reply

  5. Gary,

    The bulk collect (or any variant on removing the rownum and using array fetches) is an interesting possibility and worth testing.

    A possible drawback (which I am just speculating on, and have not tested) would be that Oracle may need to lock ALL the relevant data before the first array was returned.

    Jeff Kemp in a comment on Mark’s blog comes up with a similar idea, but then suggests closing the cursor after fetching – but my first thought on that is that the rows would no longer be locked if you closed the cursor.

    Comment by Jonathan Lewis — June 1, 2010 @ 7:47 am BST Jun 1,2010 | Reply

  6. Correction: Mark “got it right” and was simply discussing a different (and more tricky) gotcha than the one I ran into :)

    Comment by Gwen Shapira — June 1, 2010 @ 10:56 pm BST Jun 1,2010 | Reply

  7. Hi,

    I am facing similar problem. SKIP LOCKED query is not returning specified number of records, but returns random number of records each time when multiple threads are run.
    In my scenario,
    Number of concurrent threads = 6
    DB Block size = 8K
    INITRANS = 8
    PCTFREE = 10
    Average row length (according to dba_tab_statistics.avg_row_len) = 260

    Do you think that INITRANS value should be increased ?

    Thanks.

    Comment by Amy — December 27, 2010 @ 9:20 pm BST Dec 27,2010 | Reply

    • Amy,

      Insufficient information to be certain – but since you have concurrency 6 and initrans 8 it seems unlikely. It’s possible, of course, that someone changed the initrans after the table was created and put into service, leaving some older blocks with too few ITL slots, and no space to grow them.

      The first guess, though, would simply be that your six sessions have overlapping requests as described by Chen Shapira in her comment.

      Comment by Jonathan Lewis — December 28, 2010 @ 3:31 pm BST Dec 28,2010 | Reply

  8. Hi Jonathan,

    We are using the same skip locked with rownum <= 30 rows. This is a single table select with two indexes. The table size is around 25M and indexes less than 10M.

    There are 20 sessions running the same sql. I understand from the blog that we can get less than 30 rows on each execution. I see different issue, that the buffer gets per execution is around 1-3 Million. All this cause 100% cpu on the box.

    Our DB version is 10.2.0.4.0.

    I am not able to understand how come it is using 1-3M buffer gets per exec. It happens only when there is some data in the table otherwise buffer gets arr less than 900/exec.

    Comment by DS — August 2, 2011 @ 6:09 pm BST Aug 2,2011 | Reply

  9. We seem to be running into a different problem that possibly some of the others above are also running into. We fetch the rows using skip locked from within a PL/SQL procedure using a bulk collect with a limit. Our limit needs to be on the small size, i.e., 20 rows. However, PL/SQL seems to be “helping” us by pre-fetching 100 rows, causing them to be locked and unavailable to other threads calling this same procedure. But, since our limit is 20, we don’t see the other 80 rows, either. They are eventually unlocked at the commit, but they are still unprocessed, lowering our overall throughput.

    Is there any way to lower the amount PL/SQL prefetches for us or turn it off altogether?

    Thanks,
    -Roger

    Comment by rogflies — September 1, 2011 @ 9:23 pm BST Sep 1,2011 | Reply

    • rogflies,

      Interesting side effect. That was an optimisation introduced in 10g for implicit cursor loops in pl/sql. You can disable the feature by setting plsql_optimize_level to 1 (or possibly zero) from the default of 2, then you will really fetch 1 row at a time.

      The level is stored with the object when you recompile it, so future recompile (but not “create or replace” should continue to use the lower level.

      Comment by Jonathan Lewis — September 1, 2011 @ 10:04 pm BST Sep 1,2011 | Reply

  10. [...] two interesting articles here and [...]

    Pingback by Force Oracle to return TOP N rows with SKIP LOCKED — October 23, 2011 @ 6:32 am BST Oct 23,2011 | Reply

  11. [...] specified by the ORDER BY clause.  Why might this be an important question?  Possibly if the SKIP LOCKED clause is implemented in the SELECT FOR UPDATE statement?  Possibly if a procedure is hanging, and [...]

    Pingback by Select For Update – In What Order are the Rows Locked? « Charles Hooper's Oracle Notes — November 21, 2011 @ 4:58 pm BST Nov 21,2011 | Reply

  12. Hi,

    if we move the locking in the example at a record level in pl/sql then we are able to process 99 records in the second session.

    try this code in the second session.

    create or replace procedure pr_prcess_first_row
    as
    
    cursor c2 
    is select 
    rowid
    from t1
    where 
    rownum < 100;
    
    l_rowid rowid;
    
    type typ_2 is table of c2%rowtype index by binary_integer;
    
    tab_2 typ_2;
    
    begin
    
        open c2;
        fetch c2 bulk collect into tab_2 ;
        for i in tab_2.first..tab_2.last
        loop
    	begin 
    		select rowid 
            	into l_rowid
    		from
    		t1 
    	        where rowid = tab_2(i).rowid 
    		for update skip locked; 
    
    		dbms_output.put_line(' i is ' || i || ' row id ' || l_rowid);	
    	exception
    		when no_data_found
    		then 
    			null;
    	end;
    	
    
        end loop;
    end;
    

    Comment by Sriram K — May 30, 2012 @ 5:53 am BST May 30,2012 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,161 other followers