Skip Locked

May 31, 2010

Skip Locked

Filed under: Troubleshooting — Jonathan Lewis @ 7:41 pm BST May 31,2010

You may already be familiar with the syntax:

select ... for update skip locked;

It appeared (internally) some time back in Oracle 8i to implement some features of advanced queueing, but was only legalised and documented in one of the more recent versions of Oracle. If, like me, you checked the manuals to understand what this did you probably came to the conclusion that it allowed you to issue a query that selected, locked, and returned the rows in a table that matched your query and were not already locked. This isn’t quite correct – the rows your query selects, locks and returns are the ones that can be locked … and that may not give you the results you expect.

I’m in Detroit at the moment, about to start a consulting assignment, and Mark Bobak mentioned this one to me on the drive into town from the airport. As he said: “It’s one of those things that’s obvious after you’ve worked out what’s going on.” If you want to read more, here’s a link to the blog item he wrote about it.

Comments (16)

16 Comments »

I believe Mark got it wrong. Where rownum<100 is applied before the “skip locked” and therefore he should get the unlocked rows from within the first 100. It is expected that the number will be less than 100. I could easily reproduce his issue with just two sessions:

Session 1:

SQL> create table t1 as select object_id from dba_objects where rownum<102;

Table created.

SQL> select count(*),max(object_id) from t1;

  COUNT(*) MAX(OBJECT_ID)
---------- --------------
       101            102

SQL> update t1 set object_id=500 where object_id=90;

1 row updated.

SQL> update t1 set object_id=501 where object_id=91;

1 row updated.

Session 2:

SQL> select * from t1 where rownum<100 for update skip locked;

.....

97 rows selected.

101 rows, two locked. Wouldn’t you expect 99 rows to return? But the two locked are below 100. If I lock two above 100, this would not be an issue:

session 1:

SQL> rollback;

Rollback complete.

SQL>  update t1 set object_id=500 where object_id=101;

1 row updated.

SQL> update t1 set object_id=501 where object_id=102;

1 row updated.

session 2:

SQL>  select * from t1 where rownum<100 for update skip locked;

....

99 rows selected.

Comment by Gwen Shapira — June 1, 2010 @ 1:19 am BST Jun 1,2010 | Reply

Hi Chen,

I take your point, and it’s correct.

But, your demo is substantially different than my scenario.

Further, I don’t believe I “got it wrong”.

Allow me to explain why:

First, the sample query in my write-up was:
```
select ...
from    [table_name]
where   [some predicates]
and     rownum < 101
for update skip locked;
```
Note, in particular, the "[some predicates]".

This is important, in that, in my case the table has 10s of thousands or 100s of thousands of rows, and the "[some predicates]" clause is going to allow for hundreds or thousands of rows that will satisfy the condition. Further, multiple, concurrent processes are running with different values of binds in the predicates. So, in this case, it's quite possible that, given high levels of concurrency, and with each process running with unique bind variable values, you'll run into ITL contention. Also, consider that when we edited our code to remove the "skip locked" clause, we did, in fact, immediately see ITL waits on TX enqueues. If something else were actually going on, the ITL waits would not have been visible.

The point of my blog post was simply that, when using "skip locked", consider that the skipping could actually happen at the block level. This is something that I found counterintuitive, and, to my knowledge, not documented at all.

Comment by mbobak — June 1, 2010 @ 5:39 am BST Jun 1,2010 | Reply
Adding an ORDER BY may be able to prove the ITL hypothesis. Take a table with two blocks with 100 records each and ITLs that allowed only 25 of them to be locked. Lock 25 of the records on the first block in one session, then add another session grabbing 10 records with skip locked with an ORDER BY such that the records on the first block are preferred. If it grabs 10 records off the second block, then it is confirmed that it has skipped those rows on the first block which aren’t individually locked.

Personally it sounds like a well implemented feature, at least from a processing perspective. You want lots of consumers processing available/unencumbered rows with minimal waiting. If it just moved waits from row locks to ITL, you’d still be waiting.

In regards the ROWNUM filter being applied before the SKIP LOCKED exclusions, this one might be better to do with a PL/SQL BULK FETCH with a LIMIT clause.

Comment by Gary — June 1, 2010 @ 6:55 am BST Jun 1,2010 | Reply
Chen’s point is an important one – the rownum applies before the skip locked so if you see a case where Oracle is not returning data that “obviously” should be returned you should consider the possibility she offers before assuming you are seeing a case of Mark’s problem.

In the discussion we had in the car, though, it was very obvious that we were talking about a case where rows which could not have been (individually) locked were being skipped – and that was the point of Mark’s posting.

For a quick demo case – which may be very sensitive to configuration: I used 10.2.0.3, freelist management, 8KB block sizes, and uniform extents of 1MB:
```
create table t1 (
	id	number(6),
	modded	number(6)
)
pctused 99
pctfree 0
;

insert into t1
select
	rownum		id,
	mod(rownum,3)	modded
from
	all_objects
where
	rownum <= 5000
;

commit;

-- gather stats at this point.
```
This gave me 693 rows in the first block, 2 entries in the ITL, and 10 bytes of free space so that I could not add an ITL entry to the block.

Run the following from three different sessions – supply 0, 1, and 2 as the input parameter in turn.
```
select  id
from    t1
where   modded = &1
and     rownum <=100
for update
skip locked
/
```
The first two sessions will return 100 rows, the third session will (should) return no rows – even though there are no locked rows for the given value of modded – but the first hundred rows that will be scanned cannot be locked because an ITL entry cannot be acquired.

Remove the skip locked from the query and repeat the test – the third session will get stuck on “enq: TX – allocate ITL entry”.

Comment by Jonathan Lewis — June 1, 2010 @ 7:43 am BST Jun 1,2010 | Reply
- Hello Sir,
  
  “The first two sessions will return 100 rows, the third session will (should) return no rows – even though there are no locked rows for the given value of modded – but the first hundred rows that will be scanned cannot be locked because an ITL entry cannot be acquired….”
  
  As per your example theree are 693 rows in the first block and I assume the same for the rest of the blocks so there will be 231 rows per modded value
  
  so why not oracle goes to another block where it could find rows where modded=2 since the intent is to return 100 rows matching the condition also we donot have any order by clause?
  
  could please clarify
  
  Many Thanks
  
  Comment by Henish — June 4, 2010 @ 4:31 pm BST Jun 4,2010 | Reply
  - Henish,
    
    The clue is in Gwen’s comment (repeated in the opening line of my comment). The “rownum <= 100" predicate takes precedence over the locking. Oracle stops after attempting to lock 100 rows, not after locking 100 rows. It's just the way it behaves – maybe at some point release it will just change.
    
    Comment by Jonathan Lewis — June 8, 2010 @ 10:03 pm BST Jun 8,2010 | Reply
Gary,

The bulk collect (or any variant on removing the rownum and using array fetches) is an interesting possibility and worth testing.

A possible drawback (which I am just speculating on, and have not tested) would be that Oracle may need to lock ALL the relevant data before the first array was returned.

Jeff Kemp in a comment on Mark’s blog comes up with a similar idea, but then suggests closing the cursor after fetching – but my first thought on that is that the rows would no longer be locked if you closed the cursor.

Comment by Jonathan Lewis — June 1, 2010 @ 7:47 am BST Jun 1,2010 | Reply
Correction: Mark “got it right” and was simply discussing a different (and more tricky) gotcha than the one I ran into :)

Comment by Gwen Shapira — June 1, 2010 @ 10:56 pm BST Jun 1,2010 | Reply
Hi,

I am facing similar problem. SKIP LOCKED query is not returning specified number of records, but returns random number of records each time when multiple threads are run.
In my scenario,
Number of concurrent threads = 6
DB Block size = 8K
INITRANS = 8
PCTFREE = 10
Average row length (according to dba_tab_statistics.avg_row_len) = 260

Do you think that INITRANS value should be increased ?

Thanks.

Comment by Amy — December 27, 2010 @ 9:20 pm GMT Dec 27,2010 | Reply
- Amy,
  
  Insufficient information to be certain – but since you have concurrency 6 and initrans 8 it seems unlikely. It’s possible, of course, that someone changed the initrans after the table was created and put into service, leaving some older blocks with too few ITL slots, and no space to grow them.
  
  The first guess, though, would simply be that your six sessions have overlapping requests as described by Chen Shapira in her comment.
  
  Comment by Jonathan Lewis — December 28, 2010 @ 3:31 pm GMT Dec 28,2010 | Reply
Hi Jonathan,

We are using the same skip locked with rownum <= 30 rows. This is a single table select with two indexes. The table size is around 25M and indexes less than 10M.

There are 20 sessions running the same sql. I understand from the blog that we can get less than 30 rows on each execution. I see different issue, that the buffer gets per execution is around 1-3 Million. All this cause 100% cpu on the box.

Our DB version is 10.2.0.4.0.

I am not able to understand how come it is using 1-3M buffer gets per exec. It happens only when there is some data in the table otherwise buffer gets arr less than 900/exec.

Comment by DS — August 2, 2011 @ 6:09 pm BST Aug 2,2011 | Reply
We seem to be running into a different problem that possibly some of the others above are also running into. We fetch the rows using skip locked from within a PL/SQL procedure using a bulk collect with a limit. Our limit needs to be on the small size, i.e., 20 rows. However, PL/SQL seems to be “helping” us by pre-fetching 100 rows, causing them to be locked and unavailable to other threads calling this same procedure. But, since our limit is 20, we don’t see the other 80 rows, either. They are eventually unlocked at the commit, but they are still unprocessed, lowering our overall throughput.

Is there any way to lower the amount PL/SQL prefetches for us or turn it off altogether?

Thanks,
-Roger

Comment by rogflies — September 1, 2011 @ 9:23 pm BST Sep 1,2011 | Reply
- rogflies,
  
  Interesting side effect. That was an optimisation introduced in 10g for implicit cursor loops in pl/sql. You can disable the feature by setting plsql_optimize_level to 1 (or possibly zero) from the default of 2, then you will really fetch 1 row at a time.
  
  The level is stored with the object when you recompile it, so future recompile (but not “create or replace” should continue to use the lower level.
  
  Comment by Jonathan Lewis — September 1, 2011 @ 10:04 pm BST Sep 1,2011 | Reply
[…] specified by the ORDER BY clause. Why might this be an important question? Possibly if the SKIP LOCKED clause is implemented in the SELECT FOR UPDATE statement? Possibly if a procedure is hanging, and […]

Pingback by Select For Update – In What Order are the Rows Locked? « Charles Hooper's Oracle Notes — November 21, 2011 @ 4:58 pm GMT Nov 21,2011 | Reply

Hi,

if we move the locking in the example at a record level in pl/sql then we are able to process 99 records in the second session.

try this code in the second session.

create or replace procedure pr_prcess_first_row
as

cursor c2 
is select 
rowid
from t1
where 
rownum < 100;

l_rowid rowid;

type typ_2 is table of c2%rowtype index by binary_integer;

tab_2 typ_2;

begin

    open c2;
    fetch c2 bulk collect into tab_2 ;
    for i in tab_2.first..tab_2.last
    loop
	begin 
		select rowid 
        	into l_rowid
		from
		t1 
	        where rowid = tab_2(i).rowid 
		for update skip locked; 

		dbms_output.put_line(' i is ' || i || ' row id ' || l_rowid);	
	exception
		when no_data_found
		then 
			null;
	end;
	

    end loop;
end;

Comment by Sriram K — May 30, 2012 @ 5:53 am BST May 30,2012 | Reply

[…] Here is a very interesting post by Jonathan Lewis about it. […]

Pingback by 11g Feature: Skip Locked Syntax in SELECT FOR UPDATE – |GREP ORA — November 15, 2017 @ 11:15 am GMT Nov 15,2017 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome. Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Oracle Scratchpad

May 31, 2010

Skip Locked

16 Comments »

Comments and related questions are welcome. Cancel reply

Click on the motto below

Everything Changes Eventually

Search this blog

Categories

Special Links

Recent Posts

Recent Comments

Popular articles

Popular References

Archives

Blogroll

Posts by RSS

Oracle Scratchpad

May 31, 2010

Skip Locked

Share this:

Related

16 Comments »

Comments and related questions are welcome. Cancel reply

Click on the motto below

Everything Changes Eventually

Search this blog

Categories

Special Links

Recent Posts

Recent Comments

Popular articles

Popular References

Archives

Blogroll

Posts by RSS