Oracle Scratchpad

May 9, 2008

Manual Optimisation – 2

Filed under: Oracle,Performance,Tuning — Jonathan Lewis @ 1:13 pm BST May 9,2008

[Back to Manual Optimisation part 1][Forward to Manual Optimisation part 3]

A few days ago I posted an example of SQL that could be used to reduce the impact of sorting a large volume of data by sorting the smallest possible subset of the data with its rowids, and then joining back to the original table by rowid.

This produced a few comments, backed by Tom Kyte, about the dangers of depending on (a) SQL returning data in order without a final “order by” clause, (b) the exact and unchangeable use of hints, and (c) an assumption that internal mechanisms would not change.

It’s worth saying a little more about these issues, but I thought I’d start with the background to the SQL that appeared in the previous post as it’s actually derived from a generic strategy that I’ve used a couple of times as a temporary performance fix for Web-based applications.

The basic requirement for many Web-based reporting systems is to be able to run “page-based” reports, which means the ability to respond efficiently to queries like: “return rows 21 to 40 of an ordered set” – searches of Google or Amazon give you the general principle of the need for this type of pagination.

The mechanism of using a couple of “rownum” predicates against an inline view is quite well known as a way of optimising this type of page-based access; but it usually requires you to build a suitable index to support the underlying query. For example:


select
	v2.id,
	v2.small_vc
from
	(
	select
		v1.id,
		v1.small_vc,
		rownum	rn
	from
		(
		select
			t1.id,
			t1.small_vc
		from
			t1
		where
			t1.rep = 100
		order by
			t1.id
		)	v1
	where
		rownum <= 20	-- First N rows, typically a bind variable
	)	v2
where
	v2.rn >= 11		-- Last M rows, typically a bind variable
order by
	v2.rn
;

And this is the execution Plan (10.2.0.3 – dbms_xplan.display_cursor() edited for clariry to remove a few columns):


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Id| Operation                       | Name  | E-Rows | A-Rows | Buffers | Used-Mem |
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|  1|  SORT ORDER BY                  |       |     20 |     10 |       4 | 2048  (0)|
|* 2|   VIEW                          |       |     20 |     10 |       4 |          |
|* 3|    COUNT STOPKEY                |       |        |     20 |       4 |          |
|  4|     VIEW                        |       |     21 |     20 |       4 |          |
|  5|      TABLE ACCESS BY INDEX ROWID| T1    |     21 |     20 |       4 |          |
|* 6|       INDEX RANGE SCAN          | T1_PK |        |     20 |       3 |          |
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Predicate Information (identified by operation id):
- - - - - - - - - - - - - - - - - - - - - - - - - -
   2 - filter("V2"."RN">=11)
   3 - filter(ROWNUM<=20)
   6 - access("T1"."REP"=100)

For the best performance both the “where” clause and “order by” clause have to be “captured” in a single index, where the “where” clause is an equality condition against the leading column(s) of the index, and the “order by” clause follows the next columns of the index definition.

If you can’t match these requirements, your query will have to collect all the data identified by the “where” clause and sort it according to the “order by” clause before restricting the result set to the single “page” that you wanted. If you do meet the requirements, the optimizer is able to produce a plan that can avoid some (most) of the work by “pushing” the row limit inside the index range scan, as indicated above.

So what do you do when you have to handle a query that does something a little more awkward. Here’s an example of how to minimise the overheads:


select
	/*+
		qb_name(main)
		leading(ptr@main t1@main)
		use_nl(@main t1@main)
	*/
	t1.*
from
	(
	select
		/*+ qb_name(last_M) */
		rid,	rn
	from	(
		select
			/*+ qb_name(first_N) */
			rid,
			rownum	rn
		from
			(
			select
				/*+ qb_name(rowids) */
				rowid	rid
			from
				t1
			where
				log_id = 1
			and	transaction_date >= trunc(sysdate - 1)
			order by
				transaction_id	desc,
				id 		desc
			)
		where
			rownum <= 500
 		)
	where rn >= 481
	)			ptr,
	t1			t1
where
	t1.rowid = ptr.rid
order by
	ptr.rn
;

The driving query doesn’t acquire all the data it needs, it acquires the minimum necessary data, which (in this case) is just the set of rowids for the target data, and sorts that minimum set according to the required ordering clause.

Once you have the sorted set of rowids, you can do the usual “get_N / discard_M” trick with the rownum to get the rowids for the page you want – and with these rowids you can do an efficient join back to the “real” data, accessing exactly the required rows in the most efficient way possible.

In these circumstances, ending the statement with an “order by” clause that repeats the ordering implied by the earlier “order by” clause will ensure that the final result is ordered correctly with only a small increase in the work load … even if you think the “order by” seems to be unnecessary.

But I’ve had to include some hints in the final join back to make sure that the optimizer does something that I know to be sensible.  The example shows the stage where we’re trying to get rows 481 to 500 of the underlying report – and in this case the optimizer can use the 500 that appears in the first “rownum” predicate as part of its cost calculation; but it can’t handle the 481 that appears in the second “concealed” rownum predicate. So the optimizer’s estimate of cardinality for this query is 500, despite the fact that we know that the size of the result is going to be 20.

The result of this is that (with my data set) a point came as I paged through the data where the optimizer switched from the nested loop with rowid access (low cardinality path) to a hash join that scanned the t1 table (high cardinality path) – completely defeating the point of the complex query.

My clever query depends on hints to do what I know to be the right thing all the time, every time.

But how much of a surprise is that ? Go back to the simpler example at the start of this page. Although we’ve written a query that should obviously walk through an index to pick up 20 rows at a time in the right order, it is perfectly feasible for the optimizer to ignore the index when optimising this query, especially if you are after (say) rows 201 to 220 … so even in this simple case you really need to include at least one hint (viz: to use the appropriate index) to make sure that the path doesn´t go wrong.

The point I want to make, of course, is this: any time you want “page-based”  SQL to operate with maximum efficiency you are trying to do something that the optimizer has not been programmed to do for reasons that only you can see. So you will have to supply some hints to block any execution paths that might be inefficient. The argument that my original SQL depends on hints to work efficiently is not a sufficiently powerful argument to stop you from using it – it’s just a reminder that (as with all code that’s hinted) you have to document and manage the code properly.

In the original article a more important criticism of the query I showed was that it depended on assumptions about the implementation of a particular join mechanism. That is a much more powerful criticism – and one that I shall address in my next note on this topic.

Update Sep 2009: There’s a question on the OTN forum of the way in which the Optimizer has treated this query – and a couple of variations – including a couple of points highlighting errors in the optimizer code. (The discussion includes a link back to this post).

[Back to Manual Optimisation part 1][Forward to Manual Optimisation part 3]

12 Comments »

  1. [...] by Manual Optimisation – 2 « Oracle Scratchpad — May 9, 2008 @ 1:13 pm UTC May [...]

    Pingback by Manual Optimisation « Oracle Scratchpad — May 9, 2008 @ 1:15 pm BST May 9,2008 | Reply

  2. Jonathan,

    Last year I posted a similar idea on asktom.oracle.com.

    http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:76812348057#411983300346159899

    The few times I have done this, a cardinality hint on the driving query seemed sufficient. If you have a moment, I would appreciate your comments, especially on what conditions might make more complex hints necessary.

    Tom’s attitude seems to be avoid such queries, don’t optimise them :how often do people really need to see more than the first 10 or so pages?

    Comment by Stew Ashton — May 10, 2008 @ 3:24 pm BST May 10,2008 | Reply

  3. Stew,

    If you check back to the previous posting in this series, you’ll see that my initial question about this query was: “Why does someone want to see 300,000 rows from the database?” So Tom and I agree on that.

    Having said that, if someone really thinks it’s necessary, then you need to be able to do it. There’s also the argument that you may need a technique like this when you only want to see the first few pages of a set that can’t be identified easily anyway.

    I didn’t think of the cardinality() hint – it’s quite a cute option in this case. But the point remains that it’s still a hint and you have to check all hinted code on every patch and upgrade. You never know, one day Oracle may deprecate the cardinality() hint in favour of the opt_estimate() equivalent – remember the selectivity() hint ?

    Comment by Jonathan Lewis — May 10, 2008 @ 4:33 pm BST May 10,2008 | Reply

  4. Every time I read of “pagination-query” I feel the need of the MySql “LIMIT” keyword! :(

    But, on the other hand, we must admit that Oracle and MySql are two very different DB.

    Only a question (but I think the answer is yes because of read consistency): the rowid is copied in the undo segment (or temp) along with the rest of the row data?

    In another way: There’s no possibility that I try to access a row that someone else deleted from the db.

    Comment by lascoltodelvenerdi — May 12, 2008 @ 10:39 am BST May 12,2008 | Reply

  5. lascoltodelvenerdi,

    It’s not quite accurate to say that the rowid is copied into the undo segment – but it’s close enough to explain why there is no read-consistency problem.

    Here’s a sample undo record from 9.2.0.8 after a simple update to a single column:

    *-----------------------------
    * Rec #0x1f  slt: 0x00  objn: 46117(0x0000b425)  objd: 46117  tblspc: 9(0x00000009)
    *       Layer:  11 (Row)   opc: 1   rci 0x00   
    Undo type:  Regular undo    Begin trans    Last buffer split:  No 
    Temp Object:  No 
    Tablespace Undo:  No 
    rdba: 0x00000000
    *-----------------------------
    uba: 0x0080005b.0aef.1e ctl max scn: 0x0000.07e73982 prv tx scn: 0x0000.07e7415c
    KDO undo record:
    KTB Redo 
    op: 0x03  ver: 0x01  
    op: Z
    KDO Op code: URP row dependencies Disabled
      xtype: XA  bdba: 0x0240038a  hdba: 0x02400389
    itli: 3  ispac: 0  maxfr: 4863
    tabn: 0 slot: 15(0xf) flag: 0x2c lock: 0 ckix: 0
    ncol: 4 nnew: 1 size: -2
    col  3: [ 1]  31

    Notice the reference to:
    The object number and data object number:objn: 46117, objd: 46117
    The block address of the block: bdba: 0x0240038a
    The table number within block (for clusters): tabn: 0
    The row directory entry within block (which is the ‘row_number’ component of the rowid slot: 15

    The rowid isn’t in the undo – but it’s part of the undo.

    Comment by Jonathan Lewis — May 12, 2008 @ 8:41 pm BST May 12,2008 | Reply

  6. Thank you Jonathan for the clarification.

    Comment by lascoltodelvenerdi — May 13, 2008 @ 6:28 am BST May 13,2008 | Reply

  7. Same feeling as L’ascolto about MySQL keyword LIMIT: I really wonder if there is a built in Oracle engine limitation to add this keyword. It does not only concern web page builders but also any Field picker routines when you provide a part of a string in order to display all candidates rows. Something like LIMIT(20[,COLUMN][,A|D)) would just means provide the first 20 rows using, if exists, any index on COLUMN in ascending or descending with the usual default features. if no index exists or it was dropped, let’s perform a FT and this query will appear on top IO statspack/AWR as usual bad SQL. The main advantage of the clause would be to stop any operation when the clauses is satisfied, like EXISTS.

    Comment by Polarski Bernard — May 13, 2008 @ 7:52 am BST May 13,2008 | Reply

  8. do you know about pagination-query the google style ?
    (I learnt this from Tom Kyte):
    try

    http://www.google.com/search?q=google&start=991

    you query for google and all you find is – google !

    Comment by Matthias Rogel — May 13, 2008 @ 7:44 pm BST May 13,2008 | Reply

  9. Bernard,
    That would be a nice touch for drop down fields – not that I’m all that keen on that type of thing, it’s too easy to abuse and ruin response times, but sometimes you just have to let people do it – but it would still need some refinement for the “page 2″ type of queries. Does MySQL have an option like “LIMIT 20 LAST 10″

    Matthias,
    What a cute feature: “Sorry, Google does not serve more than 1000 results for any query. (You asked for results starting from 991.)”

    Comment by Jonathan Lewis — May 13, 2008 @ 8:07 pm BST May 13,2008 | Reply

  10. [...] Manual Optimisation 3 Filed under: Execution plans, Hints, Performance, Tuning — Jonathan Lewis @ 6:38 pm UTC Oct 23,2008 [Back to Manual Optimisation part 2] [...]

    Pingback by Manual Optimisation 3 « Oracle Scratchpad — October 23, 2008 @ 6:38 pm BST Oct 23,2008 | Reply

  11. I’ve just added a footnote to this post to link to a question on the same topic that’s just appeared on the OTN database forum.

    Comment by Jonathan Lewis — September 20, 2009 @ 11:28 am BST Sep 20,2009 | Reply

  12. I used the method and it worked great. We had a big join – many rows, lots of columns selected – and the TEMP tablespace was blown away. But the nature of the system ensured that only fraction of the rows will come out as a result. So I did what you recommend here – join to select only the rowid, and for the few rows comming out, join again to select the rest of the infomation needed. It worked – thank you for iinventing and sharing it!

    Comment by Todor Botev — December 13, 2012 @ 10:07 pm BST Dec 13,2012 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,258 other followers