Oracle Scratchpad

September 30, 2010

Rownum effects

Filed under: CBO, Performance, Troubleshooting — Jonathan Lewis @ 6:42 pm GMT

Here’s a hidden threat in the optimizer strategy that may cause performance problems if you’re trying to operate a series of batch updates (or batch deletes).

In the past I’ve pointed out that a predicate like “rownum <= N” generally makes the optimizer use “first_rows(N)” optimisation methods – known in the code as first_k_rows optimisation.

This isn’t true for updates and deletes, as the following simple example indicates:


create table t1
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		rownum <= 10000
)
select
	rownum			id,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 10000
;

create index t1_i1 on t1(id);

-- gather_table_stats, no histograms, compute, cascade
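--
-- a hedged sketch of that call (the exact parameters are an assumption,
-- not the original script): 'for all columns size 1' = no histograms,
-- estimate_percent 100 = compute, cascade = true to cover the index
--
begin
	dbms_stats.gather_table_stats(
		ownname          => user,
		tabname          => 'T1',
		estimate_percent => 100,
		method_opt       => 'for all columns size 1',
		cascade          => true
	);
end;
/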

explain plan for
update t1 set 
	small_vc = upper(small_vc)
where 
	id > 100
and	rownum <= 200
;

select * from table(dbms_xplan.display);

explain plan for
select
	small_vc
from
	t1
where 
	id > 100
and	rownum <= 200
;

select * from table(dbms_xplan.display);


As usual I ran this with system statistics (CPU costing) disabled, using a locally managed tablespace with uniform 1MB extents and freelist management – simply because this leads to a repeatable test. Since I was running 11.1.0.6 I didn’t set the db_file_multiblock_read_count parameter (thus allowing the _db_file_optimizer_read_count to default to 8). These are the plans I got for the update and select respectively:

------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost  |
------------------------------------------------------------
|   0 | UPDATE STATEMENT    |      |   200 |  3000 |    27 |
|   1 |  UPDATE             | T1   |       |       |       |
|*  2 |   COUNT STOPKEY     |      |       |       |       |
|*  3 |    TABLE ACCESS FULL| T1   |  9901 |   145K|    27 |
------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(ROWNUM<=200)
   3 - filter("ID">100)


----------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost  |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |   200 |  3000 |     6 |
|*  1 |  COUNT STOPKEY               |       |       |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |   200 |  3000 |     6 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |       |       |     2 |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=200)
   3 - access("ID">100)

Note how the select statement uses an index range scan with stop key as the best strategy for finding 200 rows and then stopping – and the total cost of 6 is the cost of visiting the (well-clustered) table data for two hundred rows. The update statement uses a full tablescan to find the first 200 rows with a total cost of 27 – which happens to be the cost of a completed tablescan, not the cost of “enough of the tablescan to find 200 rows”. The update statement has NOT been optimized using the first_k_rows strategy – it has used the all_rows strategy.
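
As a quick cross-check (not part of the original demonstration) you can get the optimizer to cost the completed tablescan in isolation and compare it with the table’s block count; with CPU costing disabled and a multiblock read count of 8 the I/O cost should come out close to 1 + blocks/6.59 (6.59 being the “adjusted” read count for a setting of 8), which is consistent with the 27 reported above:

select	blocks
from	user_tables
where	table_name = 'T1'
;

explain plan for
select	/*+ full(t1) */
	count(*)
from	t1
;

select * from table(dbms_xplan.display);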

The demonstration is just a starting-point of course – you need to do several more checks and tests to convince yourself that first_k_rows optimisation isn’t going to appear for updates (and deletes) and to discover why it can be a problem that needs to be addressed. One of the simplest checks is to look at the 10053 (CBO) trace files to see the critical difference, especially to notice what’s in the trace for the select but missing from the trace for the update. The critical lines show the following type of information – but only in the trace file for the select:


First K Rows: K = 200.00, N = 9901.00
First K Rows: Setup end

First K Rows: K = 200.00, N = 9901.00
First K Rows: old pf = -1.0000000, new pf = 0.0202000

SINGLE TABLE ACCESS PATH (First K Rows)

First K Rows: unchanged join prefix len = 1

Final cost for query block SEL$1 (#0) - First K Rows Plan:
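
For reference – the original text doesn’t show this step – a minimal way to produce the two trace files for comparison is to wrap each explain plan in the 10053 event, along these lines:

alter session set tracefile_identifier = 'rownum_select';
alter session set events '10053 trace name context forever, level 1';

explain plan for
select
	small_vc
from
	t1
where
	id > 100
and	rownum <= 200
;

alter session set events '10053 trace name context off';

Repeat with the update statement (and a different tracefile_identifier), then search both files for the “First K Rows” lines.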

But why might it matter anyway? Here’s the shape of a piece of SQL, embedded in pl/sql, that I found recently at a client site:


update	tabX set
	col1 = {constant}
where
	col2 in (
		complex subquery
	)
and	{list of other predicates}
and	rownum <= 200
returning
	id
into
	:bind_array
;

For most of the calls to this SQL there would be a small number of rows ready for update, and the pl/sql calling this update statement would populate an array (note the “returning” clause) with the ids for the rows updated and then do something with those ids. Unfortunately there were occasions when the data (and the statistics about the data) covered tens of thousands of rows that needed the update. When this happened the optimizer chose to unnest the complex subquery – instead of using a very precise and efficient filter subquery approach – and do a massive hash semi-join that took a couple of CPU minutes per 200 rows and hammered the system to death for a couple of hours.
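
As an aside, the array-population mechanism described above looks something like the following – a minimal sketch against the demo table t1 from earlier, not the client’s code; the point to note is the bulk collect in the returning clause:

declare
	type id_tab is table of t1.id%type index by pls_integer;
	m_ids	id_tab;
begin
	update	t1 set
		small_vc = upper(small_vc)
	where	id > 100
	and	rownum <= 200
	returning id bulk collect into m_ids;

	dbms_output.put_line('Rows updated: ' || m_ids.count);
	rollback;		-- preserve the demo data
end;
/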

If Oracle had followed the first_k_rows optimizer strategy it would have used the “small data” access path and taken much less time to complete the task. As it was we ended up using hints to force the desired access path – in this case it was sufficient to add a /*+ no_unnest */ hint to the subquery.
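
For illustration only (keeping the original placeholders), the hint goes inside the text of the subquery itself, so the repaired statement had roughly this shape:

update	tabX set
	col1 = {constant}
where
	col2 in (
		select	/*+ no_unnest */	-- tells the optimizer not to unnest this subquery
			{rest of the complex subquery}
	)
and	{list of other predicates}
and	rownum <= 200
returning
	id
into
	:bind_array
;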

10 Comments

  1. Couldn’t a first_rows(200) hint do the job in a more straightforward and self-documenting fashion? Since you’ve chosen no_unnest, I’m guessing no, which raises the question “why not?”

    Cheers,
    Flado

    Comment by Flado — September 30, 2010 @ 8:46 pm GMT

  2. What was the significance of adding “rownum <= 200” to the list of predicates? Was this just an attempt to make the update faster in the knowledge that there would never be more than 200 rows? Or did they just want to do 200 rows at a time because …?

    Comment by John Seaman — October 1, 2010 @ 4:59 am GMT

    • John,

      I think the intent in the original code was to avoid having to handle a very large array from the “returning” clause – with the assumption that the limit wouldn’t impose a significant overhead if they had to handle a small number of batches.

      Comment by Jonathan Lewis — October 1, 2010 @ 6:01 am GMT

  3. Please, could you explain why the CBO does not decide to use the index in this situation?

    update /*+ gather_plan_statistics */ t1
    set    small_vc = upper(small_vc)
    where  id > 100
    
    Plan hash value: 2927627013
    
    ----------------------------------------------------------------------------------------------
    | Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
    ----------------------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT   |      |      1 |        |      0 |00:00:00.11 |     469 |    172 |
    |   1 |  UPDATE            | T1   |      1 |        |      0 |00:00:00.11 |     469 |    172 |
    |*  2 |   TABLE ACCESS FULL| T1   |      1 |   9901 |   9900 |00:00:01.11 |     174 |    171 |
    ----------------------------------------------------------------------------------------------
    

    But if the CBO is forced to use the index then the LIO statistic is better:

    update /*+ gather_plan_statistics index(t1, t1_i1) */ t1
    set    small_vc = upper(small_vc)
    where  id > 100
    
    Plan hash value: 3664600488
    
    ----------------------------------------------------------------------------------------------
    | Id  | Operation         | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
    ----------------------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT  |       |      1 |        |      0 |00:00:00.07 |     319 |     22 |
    |   1 |  UPDATE           | T1    |      1 |        |      0 |00:00:00.07 |     319 |     22 |
    |*  2 |   INDEX RANGE SCAN| T1_I1 |      1 |   9901 |   9900 |00:00:00.01 |      22 |     22 |
    ----------------------------------------------------------------------------------------------
    

    Comment by Yuri — October 6, 2010 @ 9:53 am GMT

  4. Jonathan,

    My humble thinking is that we are comparing different execution paths: in an update, to apply “rownum <= 200” the result from the table has to be materialized, more like:

    select *
    from (select rownum ….)

    My thinking here is that, since in Oracle 11g full table scans are done in the PGA, the cost of materialization has gone down and it’s a valid plan.

    Thanks
    Amir Riaz

    Comment by Amir Riaz — October 18, 2010 @ 7:52 am GMT

    • Amir Riaz,

      It’s always worth coming up with a small collection of hypotheses to explain an anomaly. The trick is then to think up the simplest test that would eliminate some of those hypotheses.

      In your case a couple of quick checks would be:
      a) Does the (estimated) cost of a tablescan change from 10g to 11g?
      b) Is there any difference in the more complex case when tested against 10g rather than 11g?

      If you’re thinking about the case in the previous comment, there’s also a related question about direct path reads for tablescans – if you’re going to do an update you (probably) have to do it in the buffer cache, so would this automatically ensure that Oracle won’t read into the PGA for the tablescan?

      Comment by Jonathan Lewis — October 19, 2010 @ 10:47 am GMT

  5. [...] I have been seeing a lot on stopkey optimization lately. First, at the Michigan OakTable Symposium, I attended Randolf Geist’s presentation on FIRST_ROWS_N. It was a very good presentation with great examples that got me wanting to test some other cases to see the results. Second, Jonathan Lewis posted about a stopkey limitation with updates. [...]

    Pingback by My Possible Pasts » Blog Archive » Stopkey Optimization – Part I — October 28, 2010 @ 3:28 am GMT

