Oracle Scratchpad

July 1, 2008

Ancient History

Filed under: Infrastructure — Jonathan Lewis @ 9:23 pm UTC Jul 1,2008

I’ve just been browsing through a subdirectory on my laptop that has been copied from machine to machine over the years - even though some of the material needs programs that probably don’t exist any more.

This slide, from a presentation I gave at the UKOUG annual conference in 1996, raised a wry smile. The title of the presentation was “Fact, Folklore, or Fairy-table” - posisbly the first ever “Mythbuster” presentation on the Oracle scene. The title of each slide declared a popular belief, and the rest of the slide commented on the sense (or lack thereof) in the belief.

The notes on this particularly slide show that I was talking about Oracle 7 at the time, but I remember explaining the concept to the DBA at one client site shortly after 6.0.33 went into production.

Amazingly, you can still find questions on the internet about tablespace fragmentation - even though the topic should have been done to death years ago.

June 17, 2008

Blog Roll

Filed under: Uncategorized — Jonathan Lewis @ 7:28 pm UTC Jun 17,2008

I’ve added a couple of entries to my link list recently, so I thought I’d give them a quick mention in case you hadn’t noticed them.

Alberto Dell’Era - Lots of material about optimizer join costing, and now a growing collection of interesting pl/sql packages.

ASHMasters - Kyle Hailey, Graham Wood, John Beresniewicz: some of the big names behind ASH who want you to get more out of it.

Tanel Poder - possibly the new Steve Adams.

TED - A site with a number of interesting video presentations that have nothing to do with Oracle.

 

June 15, 2008

Event snapshots

Filed under: Performance, Troubleshooting — Jonathan Lewis @ 8:13 pm UTC Jun 15,2008

[Back to part 2]

Some time ago I wrote a note about using dbms_lock() to slow down your SQL, and followed up with an example of using this type of code to help with trouble-shooting.

One of the most useful views I’ve used this trick on is v$event_histogram - the view which gives you a more detailed breakdown of where the time is going on each of the events reported by v$system_event. As with my example of faking v$session_wait_history, the idea is simple - create a “union all” view with a time-wasting query in the middle.

First we create a simple function that waits for a number of seconds specified by the first input, then returns the second (numeric) input:


create or replace function jpl_numeric_wait (
	i_seconds in number,
	i_value in number default 0
) return number
deterministic
as
begin
	dbms_lock.sleep(i_seconds);
	return i_value;
end;
/

 Then we query v$event_histogram twice - with a delay in the middle:


define m_event = 'db file sequential read'
define m_pause = 60

column tstamp new_value m_tstamp

select to_char(sysdate, 'dd_hh24_mi') tstamp from dual;

spool snap_&m_tstamp

select
	event, wait_time_milli, sum(wait_count)
from (
	select
		event, wait_time_milli, -1 * wait_count  wait_count
	from	v$event_histogram
	where	event = '&m_event'
	union all
	select
		null, jpl_numeric_wait( &m_pause ,-1), null
	from	dual
	union all
	select
		event, wait_time_milli, wait_count
	from	v$event_histogram
	where	event = '&m_event'
	)
where
	wait_time_milli != -1
group by
	event, wait_time_milli
order by
	event, wait_time_milli
;

prompt Output file: snap_&m_tstamp..lst

spool off

You need only create the function once, of course, and you will have to have the privilege to execute the dbms_lock package to do so. The rest of the code can be used in a script which can be modified to pick you favourite events, set the snapshot time, and so on.

Here’s a sample of output from a slightly stressed client system - using the event and time period of the sample code:


EVENT                          WAIT_TIME_MILLI SUM(WAIT_COUNT)
------------------------------ --------------- ---------------
db file sequential read                      1           43430
db file sequential read                      2            6558
db file sequential read                      4            3738
db file sequential read                      8           12241
db file sequential read                     16            4388
db file sequential read                     32             214
db file sequential read                     64             125
db file sequential read                    128              18
db file sequential read                    256               5
db file sequential read                    512               1
db file sequential read                   1024               0
db file sequential read                   2048               0
db file sequential read                   4096               0
db file sequential read                   8192               0

14 rows selected.

Output file: snap_13_14_09.lst

 As you can see, this system has done about 70,000 single block read requests in 60 seconds - and luckily most of them (about 50,000 - the ones that took less than 2 milliseconds) came out of a SAN cache.

The spike of 12,200 reads in the 4 - 8 millisecond region is a good response time, with a trailer of 4,400 out into the 8 - 16 millsecond range that’s a little suspect but allowable for a reasonably loaded system.

But the tail is very worrying - there’s one read that took more than a quarter of a second, and could have been anything up to half a second; there’s 149 reads in total that took more than 32 milliseconds to complete. It’s only a small percentage, but it’s an indication that this is an I/O subsystem that’s on the fringes of catastrophe.  (If you know Cary Millsap’s work, you’ll recognise the knee of the response-time curve in these numbers).

It’s a useful little script: average wait times can often look reasonable when there’s really a catastrophe waiting in the wings.

[Back to part 2]

 

 

June 13, 2008

Recent SQL

Filed under: Troubleshooting — Jonathan Lewis @ 5:01 pm UTC Jun 13,2008

If you’re doing a little casual trouble-shooting, it’s always worth an occasional browse through v$sql just on the off-chance that you’ll find some sort of resource hog that could do with a little fixing.

There are many ways to slice and dice your way through v$sql, but here’s one that I often use when asked to do a quick health-check - especially if a Statspack / AWR report sugggests that the amount of “hard parsing” is unreasonable. The aim is simple - list any SQL that appeared in the last N minutes:


rem
rem	Script:		recent_sql.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Oct 2001
rem	Last update:	Oct 2005
rem
rem	Last tested
rem		11.1.0.6
rem		10.2.0.3
rem		10.1.0.4
rem		 9.2.0.6
rem		 8.1.7.4**
rem
rem	Quick and dirty to get the most recently
rem	arrived bits of SQL from V$sql.
rem
rem	This may be worth running if you see from
rem	Statspack (AWR) that you are reporting
rem	more hard parses than you think reasonable
rem
rem	You may want to adjust the number of
rem	minutes into the past.
rem
rem	The particular target of this query is to
rem	find 'literal string' SQL, so it sorts by
rem	sql_text as this tends to pull "identical"
rem	SQL statements together.
rem
rem	The scan on v$sql will hit the library cache
rem	latches quite hard, so don't run this query
rem	casually - especially if you have a large
rem	shared_pool setting.
rem
rem	** there are some columns in the query
rem	that do not exist in 8i
rem

define m_minutes = 1

set timing off

set linesize 160
set pagesize 100
set termout off

column	timestamp	new_value	m_timestamp
column	baseline	new_value	m_baseline

select
	to_char(sysdate,'ddhh24mi') 		timestamp,
	to_char(
		sysdate - &m_Minutes /(24 * 60),
		'yyyy-mm-dd/hh24:mi:ss'
	)					baseline
from
	dual;

spool recent_sql_&m_timestamp

select
	first_load_time,
	hash_value,	-- 9i and earlier
--	sql_id,		-- 10g
	child_latch,	-- 9i (9.2 ?)
	executions,
	parse_calls,
	sorts,
	fetches,	-- 9i (9.2 ?)
	disk_reads,
	buffer_gets,
	rows_processed,
	cpu_time,	-- 9i
	elapsed_time,	-- 9i
	sql_text
from
	v$sql
where
	first_load_time >= '&m_baseline'
order by
	sql_text
;

prompt Interval:         &m_minutes minute(s)
prompt Output filename:  recent_sql_&m_timestamp..lst

spool off

set termout on

ed recent_sql_&m_timestamp..lst

The code picks up sysdate, substracts one minute (which can be adjusted in the define at the top of the script), and formats the result so that it can be compared with the first_load_time column from v$sql.

As a convenience to the user, the script prints the snapshot interval, the output file name, and then starts an edit on its output.

There are plenty of other columns you might choose, and the ordering can be set to suit your purposes - an obvious alternative would be cpu_time or elapsed time. As the comments say, I like to list the data in order of sql_text as this makes it easy to pick up the SQL which is identical apart from the literal values. (Can I claim a worst-case record here of 16,000 statements in 5 minutes ?).

Just one warning: when you access v$sql, the only efficient entry points are by hash_value and sql_id. Any other access into v$sql (such as the ‘tablescan’ here, will hammer the library cache latches - possibly one or two gets for every object in the library cache - so don’t get too carried away running this script on a system with a large library cache.

June 10, 2008

Firefox

Filed under: Uncategorized — Jonathan Lewis @ 2:53 pm UTC Jun 10,2008

Here’s an interesting URL that I found by following an incoming link a little while ago.

If you run Firefox as your web browser, this “Customizegoogle” add-in allows you to “filter spammy websites from search results”. (It’s got a lot of other features, but this one seems likely to be the most useful to Oracle users).

I’ve got Firefox on my Linux RAC stack - but I may have to download it for my Windows boxes too, especially the laptop which is the machine I use for most of my writing.

Updated: And here’s a URL that should let you do the same thing with Internet Explorer, too.

June 8, 2008

Scientific Method

Filed under: Uncategorized — Jonathan Lewis @ 9:53 pm UTC Jun 8,2008

I’ve finally found out why I seem to disagree with Don Burleson more frequently that I do with other people on the internet.

From a recent OTN thread:

Me: “you’re supposed to design a theory to match the facts, not select the facts to match the theory.”

Burleson: “I think it’s the other way around, Jonathan, the scientific method requires that you start with a hypothesis.”

So that’s my problem - I let the facts stand in the way of a perfectly nice theory.

 

June 5, 2008

Advert - New Book

Filed under: Uncategorized — Jonathan Lewis @ 4:02 pm UTC Jun 5,2008

I posted an early warning about Christian Antognini’s new book some time back.  Well, it’s now on the shelves. For more details, see this URL to read the forewords that Cary Millsap and I supplied

If you’re waiting for volume 2 of my Cost Based Oracle series, then this is the book you should get to keep you going while you wait. If you’re not waiting for my volume 2, then this is a book you should get anyway.

 

May 30, 2008

Nothing Changed

Filed under: CBO, Performance, Statistics, Tuning — Jonathan Lewis @ 10:42 am UTC May 30,2008

The following request appeared on the Oracle Forum a few days ago:

I have a select query

select col1,col2,col3 from table1 order by col1,col3

This table contains 4.5 million records initially and the select was returning records in less than 2 minutes .

This query has been running for the last year with out any issues, but from last week onwards this query is taking more than 15 min to complete . No change in database and other components.

How can I find out the root cause of this issue ? Any specific area I need to check ?

 

 
There are two important points  in this request - first the statement that “nothing changed”, and second the need to discover “what was happening in the past”.

Obviously something did change. It may have been the statistics, it may have been both the data and the statistics.  (And it may have been the index definitions that changed, or an index may have been rebuilt, or some parameter may have been adjusted).

Since the query is now persistently slow, the most innocent explanation is that the data has changed; the second most innocent explanation is that the statistics have changed. The former could change the execution time, the latter could change the execution plan - which could be a poor plan and change the execution time.

The query (as supplied) is not selective. It’s got to acquire 4.5 million rows and supply them in order. There are only a few execution paths for the query, and these depend on the available indexes and constraints:

  1. Full tablescan with sort
  2. Index fast full scan with sort (if an index holding all three columns exists)
  3. Index full scan with no visit to table and no sort if the index in (2) starts (col1, col3)
  4. Index full scan with table access by rowid (and no sort) if an index starts (col1, col3) but doesn’t hold col2

Increase the amount of data and the same plan could take a lot longer: a sort might spill to disk instead of completing in memory; a full scan with or without table access might lose a caching benefit if the index (option 3) or table (option 4) cannot be kept in memory while the scan progresses.

So how you do find out what was happening when things went well ? Possibly the documentation includes a statement of the expected execution plan. Possibly you’ve been running statspack or AWR and this was a query that was captured with its execution plan. Possibly the only thing you can do is guess.

Work out all the possible execution plans for the query, then use hints to check the performance for each execution plan.  If one of the plans drops back to 2 minutes, then maybe that’s the plan you used to have and you work out the root cause by comparing the old plan, the new plan, and the costs.

If none of the other plans gets you close to the original execution time, then something has changed on your system. (Although, possibly, you aren’t doing the tests fairly).

 

May 16, 2008

Best Practices

Filed under: Uncategorized — Jonathan Lewis @ 7:26 pm UTC May 16,2008

This is a note that I wrote for the Northern California Oracle User Group a few months ago. It was published in the February issue of the magazine in the section “Ask the Oracles”, an interesting and innovative section that I first contributed to in August 2006 with a note on hints.

The topic under discussion in the February 2008 issue was actually”Best and Worst Practices”. These were my thoughts - limited to 600 words:

It’s quite hard to say anything sensible or inspiring about a topic like this. Many of the specifics we call best practices are extremely obvious “common sense” but sometimes circumstances conspire to make it impossible to follow them. Many of the worst practices are things that we should obviously avoid, but again circumstances (often pressures of time) may make them unavoidable.

The worst generic practice I know of, then, is applying a fix to a problem without understanding why that fix might work.

I have a simple approach to problem-solving; it involves three steps:

  • What is the problem?
  • Why will my solution fix the problem?
  • Where am I going to pay for implementing this solution?

The last question will sometimes tell us that we can’t afford to implement the “perfect” solution – perhaps we want to change a heap table to an index-organized table (IOT), but can’t because we have too much code that has done something a little exotic with traditional rowids and needs to be rewritten to deal with the “urowid” used for IOTs.

But before we worry about such side effects, overheads and implementation costs, we need to be confident that our solution really is addressing the root cause; because if it isn’t we may spend time and effort implementing a change which seems to fix our problem temporarily – until things go wrong again.

Take a simple example. A query takes an unreasonable amount of time. You check the execution plan and decide the problem might go away if Oracle used a particular index. So you rebuild the index and the query runs much more quickly. Is your job finished?

No doubt the first thing you do is check the execution plan to see that it changed to use the index the way you expected. But does that prove that the performance improvement came from the change in plan – is it possible that your index rebuild used a table-scan that resulted in the target table being cached somewhere (in the SAN cache, for example) so that accesses to the table during your test were much quicker than they would normally be?

Even if the improvement was due to the change in execution plan, do you know why the plan changed? Was it because the rebuild packed the index giving you a smaller leaf block count (or height even), or was it because index rebuilds automatically compute statistics in your version of Oracle and a small change in the statistics (distinct_keys or leaf_blocks being the most likely) made the difference. Will the new plan survive the next statistics collection – even if the index doesn’t start growing immediately after the rebuild?

If you don’t check, you may end up performing a regular, yet redundant and potentially dangerous, rebuild of this index; and when, exactly, does it need to be rebuilt anyway?

What if the change was due to a change in the distinct_keys, there may be “random” occasions when the rebuild strategy just doesn’t work; or a day may come when the rebuild will no longer work because even the freshly rebuilt index has grown past a critical number of distinct_keys or leaf_blocks.

Whenever you make a change that’s supposed to fix a problem, try to capture the before and after information (e.g. statistics, execution plans, work done, number and type of waits). If there’s any doubt in your mind about the root cause, never be afraid to document what you’ve done, and the reasons why you did it – it may save you a lot of time in the future when your “fix” turned out to have been just a temporary lucky coincidence.

 

May 14, 2008

Index Efficiency

Filed under: Infrastructure, Performance, Troubleshooting — Jonathan Lewis @ 9:30 pm UTC May 14,2008

I was looking for an old posting on the Oracle newsgroup recently, when I found a set of posts that talked about measuring the effectiveness of rebuilding some indexes.

I had contributed by pointing to my website where I have a script that demonstrates how to get some detailed information about how well the space in an index is used, and Joel Garry had responded with some results from a production instance.

The full thread starts at this URL

But for a quick summary of the conversation I had with Joel, you could go to each of these posts in turn:

 For other notes I’ve written about index rebuilds:

Happy Birthday!

Filed under: Uncategorized — Jonathan Lewis @ 7:45 pm UTC May 14,2008

This year marks the 25th anniversary of the UK Oracle User Group - and we have a target of 430 presentations covering the many different faces of Oracle’s presence in the IT and business world.

The call for papers has just gone out, with a closing date of May 9th - so if you want to join the fun, sign up soon.

Update:

(1) The closing date has been extended to 20th May - I know it seems early for a conference at the start of December, but there’s a lot of work in putting together an event for 2,500 people and early submissions help a lot.

(2) Since Oracle has just bought BEA, the UKOUG will be putting together a stream for users of their product as well - even if the list of topics in the ‘abstracts’ screen doesn’t include the right label  yet.
 

May 13, 2008

Interesting notes

Filed under: Uncategorized — Jonathan Lewis @ 7:35 pm UTC May 13,2008

This (Oracle-related) blog site caught my eye a few days ago. There isn’t much on it at present, but what’s there is worth reading, so I thought I’d post a pointer to it for a few days.

 

May 12, 2008

Synchronisation

Filed under: Infrastructure, Performance, Tuning — Jonathan Lewis @ 9:19 pm UTC May 12,2008

There are several performance problems that show up only when you start running concurrency tests, and sometimes you need to manage a very precise degree of synchronisation to demonstrate these problems repeatably in a test environment.

For a clean and simply strategy for making sure that processes try to do the same thing at (virtually) the same moment, you need look no further than the dbms_lock package. Here’s a section of code to introduce the technique:


 declare
	n1		number(38);
	m_handle	varchar2(60);
begin

	dbms_lock.allocate_unique(
		lockname	=> 'Synchronize',
		lockhandle	=> m_handle
	);

	n1 := dbms_lock.request(
		lockhandle		=> m_handle,
		lockmode		=> dbms_lock.x_mode,
		timeout 		=> dbms_lock.maxwait,
		release_on_commit 	=> true		-- the default is false !!
	);

	dbms_output.put_line(n1);

end;
/

The code uses the dbms_lock.allocate() call to create a ‘meaningless handle’ for a lock. This allows me to have many programs that refer to a lockable object by name, rather than number - this is a simple feature to enhance safety and readability in your code. Bear in mind, however, that a call to dbms_lock.allocate_unique() issues a commit, so you may want to wrap the call inside an autonomous transaction in your own code.

After generating a lock “handle”, my code fragment requests an exclusive lock (x_mode) on the handle. Other options would be a share lock (s_mode) and all the other possibilities that you might expect of Oracle’s normal locking methods. (See $ORACLE_HOME/rdbms/admin/dbmslock.sql for more details).

You can be very inventive with this code fragment, but the use I make of it is usually very simple:

Session 1: Run the code fragment to acquire an exclusive lock on the “Synchronize” lock.

Session 2 - N: Start the scripts that are supposed to run concurrently. Each script starts with the same code fragment as session 1, but with a request for a share lock (s_mode).

Sessions 2 - N start to wait for their share lock: no-one can hold a share lock on a resource if anyone is holding an exclusive lock.  (On the other hand, you can hold a share lock on a resource if the only other locks on that resource are also share locks).

When sessions 2 - N are all waiting issue a commit in session 1 - at this point the exclusive lock is released (release_on_commit => true), and sessions 2 - N are free to acquire their share locks and resume running.

There are all sorts of ways you can use the dbms_lock package for synchronisation. But one of the really nice reasons for using it is that if a session crashes the  locks disappear as the session dies - so there’s no mess for someone to clear up afterwards.

 

May 10, 2008

CPU usage

Filed under: Performance, Troubleshooting — Jonathan Lewis @ 10:32 am UTC May 10,2008

I made a few comments recently on a post in the Oracle Forum that raised the issue of CPU usage and how busy a CPU can get.

Shortly afterwards my blog dashboard showed a couple of incoming references from a blog entry that Chen Shapira had made about my comments. Her blog had received a couple of follow-up comments (from Nuno Souto, among others) that were worth collating, and also prompted me to demonstrate how nasty things can get if you start to work very hard in a very small area of the database.

If you follow the blog trail, you’ll see that I’ve mentioned a query that jumped from 15 seconds of CPU usage to 45 seconds - purely because of the effect of spin_gets on latches.

If you want to try the same test on different platforms and different versions of Oracle the code to build the test case is shown below. (It’s based on a simple script I wrote many years ago to demonstrate how pointless it was to depend on the buffer cache hit ratio as a source of meanigful information - subsequently hi-jacked by Connor McDonald for his ‘pick a hit ratio’ procedure).


alter session set "_old_connect_by_enabled" = true;

create table kill_cpu_&1(n primary key)
organization index
as
select
	rownum n
from
	all_objects
where
	rownum <= 25  	--  Adjust number (slightly) to suit
;

pause Press return

set timing on
spool latch_test_&1

select
	count(*) X
from
	kill_cpu_&1
connect by
	n > prior n
start with
	n = 1
;

set timing off
spool off

A couple of notes:

The code expect an input parameter, and uses that parameter as part of the table name it creates and queries, and also in the name of an output file.  This means you can run multiple copies of the job against the ’same’ table, although each copy will drop and recreate the table created by the previous copy.

To ensure that no query starts until all the copies of the script have dropped and recreated their tables, I have a ’synchronisation’ step in the code.  The code above shows a simple “pause” - I’ll be posting a note some time in the future about the method I usually use.

Something else I’ve omitted from the sample is the code I use to take snapshots of various dynamic performance views to see where the work goes. Looking at v$session_event, v$mystat (your session’s slice of v$sesstat), and v$latch is very informative.  (In 10g, watch out for  variations in results that depend on whether or not your table happens to collide with the “simulator” latches).

Warning - on my laptop with a Turion dual-core CPU running at 1.6Ghz, the query (with rownum <= 25) ran for 91 seconds under 10.2.0.3. The run time will double for every row you add to the index organized table (IOT) that I’ve used to hold the driving data.

As a flavour of what you might see, here are a few results from my (dual-core) laptop; the first set compares a few versions of Oracle, the second set compares a few scenarios in the same version:

  • 8.1.7.4: Single run 59.35 CPU seconds
  • 9.2.0.8: Single run 64.56 CPU seconds
  • 10.2.0.3: Single run 91.5 CPU seconds

Now trying to cause collisions (using 9.2.0.8, which was the worst case by a significant margin):

  • Single run: 64.56 CPU seconds
  • Two concurrent runs, using different tables: 70.5 CPU seconds each
  • Two concurrent runs against the same table: 177.8 CPU seconds each, plus 7.7 seconds each on “latch free” waits.

The extra CPU time for the last test was basically the consequence of 33 million misses and spin_gets (out of 67M gets) for the cache buffers chains latch. The latch sleep time, by the way, came from 1,500 sleeps (each).

Moral: Although the test is deliberately designed to exaggerate the issue, latch sleeps aren’t necessarily your biggest performance problem when your system is running with a high degree of concurrency - misses and spin_gets can have a significant impact on your CPU usage: and the performance impact can be exacerbated if your basic CPU load is high.

 
Footnote: Here’s a link to an interesting little note from Hermant Chitale about CPU disappearing “outside” Oracle.

 

May 9, 2008

Manual Optimisation - 2

Filed under: Performance, Tuning — Jonathan Lewis @ 1:13 pm UTC May 9,2008

A few days ago I posted an example of SQL that could be used to reduce the impact of sorting a large volume of data by sorting the smallest possible subset of the data with its rowids, and then joining back to the original table by rowid.

This produced a few comments, backed by Tom Kyte, about the dangers of depending on (a) SQL returning data in order without a final “order by” clause, (b) the exact and unchangeable use of hints, and (c) an assumption that internal mechanisms would not change.

It’s worth saying a little more about these issues, but I thought I’d start with the background to the SQL that appeared in the previous post as it’s actually derived from a generic strategy that I’ve used a couple of times as a temporary performance fix for Web-based applications.

The basic requirement for many Web-based reporting systems is to be able to run “page-based” reports, which means the ability to respond efficiently to queries like: “return rows 21 to 40 of an ordered set” - searches of Google or Amazon give you the general principle of the need for this type of pagination.

The mechanism of using a couple of “rownum” predicates against an inline view is quite well known as a way of optimising this type of page-based access; but it usually requires you to build a suitable index to support the underlying query. For example:


select
	v2.id,
	v2.small_vc
from
	(
	select
		v1.id,
		v1.small_vc,
		rownum	rn
	from
		(
		select
			t1.id,
			t1.small_vc
		from
			t1
		where
			t1.rep = 100
		order by
			t1.id
		)	v1
	where
		rownum <= 20	-- First N rows, typically a bind variable
	)	v2
where
	v2.rn >= 11		-- Last M rows, typically a bind variable
order by
	v2.rn
;

Execution Plan (10.2.0.3 - dbms_xplan.display_cursor() edited to remove columns)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Id| Operation                       | Name  | E-Rows | A-Rows | Buffers | Used-Mem |
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|  1|  SORT ORDER BY                  |       |     20 |     10 |       4 | 2048  (0)|
|* 2|   VIEW                          |       |     20 |     10 |       4 |          |
|* 3|    COUNT STOPKEY                |       |        |     20 |       4 |          |
|  4|     VIEW                        |       |     21 |     20 |       4 |          |
|  5|      TABLE ACCESS BY INDEX ROWID| T1    |     21 |     20 |       4 |          |
|* 6|       INDEX RANGE SCAN          | T1_PK |        |     20 |       3 |          |
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Predicate Information (identified by operation id):
- - - - - - - - - - - - - - - - - - - - - - - - - -
   2 - filter(”V2″.”RN”>=11)
   3 - filter(ROWNUM<=20)
   6 - access(”T1″.”REP”=100)

For the best performance both the “where” clause and “order by” clause have to be “captured” in a single index, where the “where” clause is an equality condition against the leading column(s) of the index, and the “order by” clause follows the next columns of the index definition.

If you can’t match these requirements, your query will have to collect all the data identified by the “where” clause and sort it according to the “order by” clause before restricting the result set to the single “page” that you wanted. If you do meet the requirements, the optimizer is able to produce a plan that can avoid some (most) of the work by “pushing” the row limit inside the index range scan, as indicated above.

So what do you do when you have to handle a query that does something a little more awkward. Here’s an example of how to minimise the overheads:


select
	/*+
		qb_name(main)
		leading(ptr@main t1@main)
		use_nl(@main t1@main)
	*/
	t1.*
from
	(
	select
		/*+ qb_name(last_M) */
		rid,	rn
	from	(
		select
			/*+ qb_name(first_N) */
			rid,
			rownum	rn
		from
			(
			select
				/*+ qb_name(rowids) */
				rowid	rid
			from
				t1
			where
				log_id = 1
			and	transaction_date >= trunc(sysdate - 1)
			order by
				transaction_id	desc,
				id 		desc
			)
		where
			rownum <= 500
 		)
	where rn >= 481
	)			ptr,
	t1			t1
where
	t1.rowid = ptr.rid
order by
	ptr.rn
;

The driving query doesn’t acquire all the data it needs, it acquires the minimum necessary data, which (in this case) is just the set of rowids for the target data, and sorts that minimum set according to the required ordering clause.

Once you have the sorted set of rowids, you can do the usual “get_N / discard_M” trick with the rownum to get the rowids for the page you want - and with these rowids you can do an efficient join back to the “real” data, accessing exactly the required rows in the most efficient way possible.

In these circumstances, ending the statement with an “order by” clause that repeats the ordering implied by the earlier “order by” clause will ensure that the final result is ordered correctly with only a small increase in the work load … even if you think the “order by” seems to be unnecessary.

But I’ve had to include some hints in the final join back to make sure that the optimizer does something that I know to be sensible.  The example shows the stage where we’tr trying to get rows 481 to 500 of the underlying report - and in this case the optimizer can use the 500 that appears in the first “rownum” predicate as part of its cost calculation; but it can’t handle the 481 that appears in the second “concealed” rownum predicate. So the optimizer’s estimate of cardinality for this query is 500, despite the fact that we know that the size of the result is going to be 20.

The result of this is that (with my data set) a point came as I paged through the data where the optimizer switched from the nested loop with rowid access (low cardinality path) to a hash join that scanned the t1 table (high cardinality path) - completely defeating the point of the complex query.

My clever query depends on hints to do what I know to be the right thing all the time, every time.

But how much of a surprise is that ? Go back to the simpler example at the start of this page. Although we’ve written a query that should obviously walk through an index to pick up 20 rows at a time in the right order, it is perfectly feasible for the optimizer to ignore the index when optimising this query, especially if you are after (say) rows 201 to 220 … so even in this simple case you really need to include at least one hint (viz: to use the apporpriate index) to make sure that the path doesn´t go wrong.

The point I want to make, of course, is this - any time you want “page-based”  SQL to operate with maximum efficiency, you are trying to do something that the optimizer has not been programmed to do, for reasons that only you can see. So you will have to supply some hints to block any execution paths that might be inefficient. The argument that my original SQL depends on hints to work efficiently is not a sufficiently powerful argument to stop you from using it - it’s just a reminder that (as with all code that’s hinted) you have to document and manage the code properly.

In the original article a more important criticism of the query I showed was that it depended on assumptions about the implementation of a particular join mechanism. That is a much more powerful criticism - and one that I shall address in my next note on this topic.

[Back to Manual Optimisation part 1]

 

Next Page »

Blog at WordPress.com.