Oracle Scratchpad

May 16, 2008

Best Practices

Filed under: Uncategorized — Jonathan Lewis @ 7:26 pm UTC May 16,2008

This is a note that I wrote for the Northern California Oracle User Group a few months ago. It was published in the February issue of the magazine in the section “Ask the Oracles”, an interesting and innovative section that I first contributed to in August 2006 with a note on hints.

The topic under discussion in the February 2008 issue was actually”Best and Worst Practices”. These were my thoughts - limited to 600 words:

It’s quite hard to say anything sensible or inspiring about a topic like this. Many of the specifics we call best practices are extremely obvious “common sense” but sometimes circumstances conspire to make it impossible to follow them. Many of the worst practices are things that we should obviously avoid, but again circumstances (often pressures of time) may make them unavoidable.

The worst generic practice I know of, then, is applying a fix to a problem without understanding why that fix might work.

I have a simple approach to problem-solving; it involves three steps:

  • What is the problem?
  • Why will my solution fix the problem?
  • Where am I going to pay for implementing this solution?

The last question will sometimes tell us that we can’t afford to implement the “perfect” solution – perhaps we want to change a heap table to an index-organized table (IOT), but can’t because we have too much code that has done something a little exotic with traditional rowids and needs to be rewritten to deal with the “urowid” used for IOTs.

But before we worry about such side effects, overheads and implementation costs, we need to be confident that our solution really is addressing the root cause; because if it isn’t we may spend time and effort implementing a change which seems to fix our problem temporarily – until things go wrong again.

Take a simple example. A query takes an unreasonable amount of time. You check the execution plan and decide the problem might go away if Oracle used a particular index. So you rebuild the index and the query runs much more quickly. Is your job finished?

No doubt the first thing you do is check the execution plan to see that it changed to use the index the way you expected. But does that prove that the performance improvement came from the change in plan – is it possible that your index rebuild used a table-scan that resulted in the target table being cached somewhere (in the SAN cache, for example) so that accesses to the table during your test were much quicker than they would normally be?

Even if the improvement was due to the change in execution plan, do you know why the plan changed? Was it because the rebuild packed the index giving you a smaller leaf block count (or height even), or was it because index rebuilds automatically compute statistics in your version of Oracle and a small change in the statistics (distinct_keys or leaf_blocks being the most likely) made the difference. Will the new plan survive the next statistics collection – even if the index doesn’t start growing immediately after the rebuild?

If you don’t check, you may end up performing a regular, yet redundant and potentially dangerous, rebuild of this index; and when, exactly, does it need to be rebuilt anyway?

What if the change was due to a change in the distinct_keys, there may be “random” occasions when the rebuild strategy just doesn’t work; or a day may come when the rebuild will no longer work because even the freshly rebuilt index has grown past a critical number of distinct_keys or leaf_blocks.

Whenever you make a change that’s supposed to fix a problem, try to capture the before and after information (e.g. statistics, execution plans, work done, number and type of waits). If there’s any doubt in your mind about the root cause, never be afraid to document what you’ve done, and the reasons why you did it – it may save you a lot of time in the future when your “fix” turned out to have been just a temporary lucky coincidence.

 

May 14, 2008

Index Efficiency

Filed under: Infrastructure, Performance, Troubleshooting — Jonathan Lewis @ 9:30 pm UTC May 14,2008

I was looking for an old posting on the Oracle newsgroup recently, when I found a set of posts about talked about measuring the effectiveness of rebuilding some indexes.

I had contributed by pointing to my website where I have a script that demonstrates how to get some detailed information about how well the space in an index is used, and Joel Garry had responded with some results from a production instance.

The full thread starts at this URL

But for a quick summary of the conversation I had with Joel, you could go to each of these posts in turn:

 For other notes I’ve written about index rebuilds:

Happy Birthday!

Filed under: Uncategorized — Jonathan Lewis @ 7:45 pm UTC May 14,2008

This year marks the 25th anniversary of the UK Oracle User Group - and we have a target of 430 presentations covering the many different faces of Oracle’s presence in the IT and business world.

The call for papers has just gone out, with a closing date of May 9th - so if you want to join the fun, sign up soon.

Update:

(1) The closing date has been extended to 20th May - I know it seems early for a conference at the start of December, but there’s a lot of work in putting together an event for 2,500 people and early submissions help a lot.

(2) Since Oracle has just bought BEA, the UKOUG will be putting together a stream for users of their product as well - even if the list of topics in the ‘abstracts’ screen doesn’t include the right label  yet.
 

May 13, 2008

Interesting notes

Filed under: Uncategorized — Jonathan Lewis @ 7:35 pm UTC May 13,2008

This (Oracle-related) blog site caught my eye a few days ago. There isn’t much on it at present, but what’s there is worth reading, so I thought I’d post a pointer to it for a few days.

 

May 12, 2008

Synchronisation

Filed under: Infrastructure, Performance, Tuning — Jonathan Lewis @ 9:19 pm UTC May 12,2008

There are several performance problems that show up only when you start running concurrency tests, and sometimes you need to manage a very precise degree of synchronisation to demonstrate these problems repeatably in a test environment.

For a clean and simply strategy for making sure that processes try to do the same thing at (virtually) the same moment, you need look no further than the dbms_lock package. Here’s a section of code to introduce the technique:


 declare
	n1		number(38);
	m_handle	varchar2(60);
begin

	dbms_lock.allocate_unique(
		lockname	=> 'Synchronize',
		lockhandle	=> m_handle
	);

	n1 := dbms_lock.request(
		lockhandle		=> m_handle,
		lockmode		=> dbms_lock.x_mode,
		timeout 		=> dbms_lock.maxwait,
		release_on_commit 	=> true		-- the default is false !!
	);

	dbms_output.put_line(n1);

end;
/

The code uses the dbms_lock.allocate() call to create a ‘meaningless handle’ for a lock. This allows me to have many programs that refer to a lockable object by name, rather than number - this is a simple feature to enhance safety and readability in your code. Bear in mind, however, that a call to dbms_lock.allocate_unique() issues a commit, so you may want to wrap the call inside an autonomous transaction in your own code.

After generating a lock “handle”, my code fragment requests an exclusive lock (x_mode) on the handle. Other options would be a share lock (s_mode) and all the other possibilities that you might expect of Oracle’s normal locking methods. (See $ORACLE_HOME/rdbms/admin/dbmslock.sql for more details).

You can be very inventive with this code fragment, but the use I make of it is usually very simple:

Session 1: Run the code fragment to acquire an exclusive lock on the “Synchronize” lock.

Session 2 - N: Start the scripts that are supposed to run concurrently. Each script starts with the same code fragment as session 1, but with a request for a share lock (s_mode).

Sessions 2 - N start to wait for their share lock: no-one can hold a share lock on a resource if anyone is holding an exclusive lock.  (On the other hand, you can hold a share lock on a resource if the only other locks on that resource are also share locks).

When sessions 2 - N are all waiting issue a commit in session 1 - at this point the exclusive lock is released (release_on_commit => true), and sessions 2 - N are free to acquire their share locks and resume running.

There are all sorts of ways you can use the dbms_lock package for synchronisation. But one of the really nice reasons for using it is that if a session crashes the  locks disappear as the session dies - so there’s no mess for someone to clear up afterwards.

 

May 10, 2008

CPU usage

Filed under: Performance, Troubleshooting — Jonathan Lewis @ 10:32 am UTC May 10,2008

I made a few comments recently on a post in the Oracle Forum that raised the issue of CPU usage and how busy a CPU can get.

Shortly afterwards my blog dashboard showed a couple of incoming references from a blog entry that Chen Shapira had made about my comments. Her blog had received a couple of follow-up comments (from Nuno Souto, among others) that were worth collating, and also prompted me to demonstrate how nasty things can get if you start to work very hard in a very small area of the database.

If you follow the blog trail, you’ll see that I’ve mentioned a query that jumped from 15 seconds of CPU usage to 45 seconds - purely because of the effect of spin_gets on latches.

If you want to try the same test on different platforms and different versions of Oracle the code to build the test case is shown below. (It’s based on a simple script I wrote many years ago to demonstrate how pointless it was to depend on the buffer cache hit ratio as a source of meanigful information - subsequently hi-jacked by Connor McDonald for his ‘pick a hit ratio’ procedure).


alter session set "_old_connect_by_enabled" = true;

create table kill_cpu_&1(n primary key)
organization index
as
select
	rownum n
from
	all_objects
where
	rownum <= 25  	--  Adjust number (slightly) to suit
;

pause Press return

set timing on
spool latch_test_&1

select
	count(*) X
from
	kill_cpu_&1
connect by
	n > prior n
start with
	n = 1
;

set timing off
spool off

A couple of notes:

The code expect an input parameter, and uses that parameter as part of the table name it creates and queries, and also in the name of an output file.  This means you can run multiple copies of the job against the ’same’ table, although each copy will drop and recreate the table created by the previous copy.

To ensure that no query starts until all the copies of the script have dropped and recreated their tables, I have a ’synchronisation’ step in the code.  The code above shows a simple “pause” - I’ll be posting a note some time in the future about the method I usually use.

Something else I’ve omitted from the sample is the code I use to take snapshots of various dynamic performance views to see where the work goes. Looking at v$session_event, v$mystat (your session’s slice of v$sesstat), and v$latch is very informative.  (In 10g, watch out for  variations in results that depend on whether or not your table happens to collide with the “simulator” latches).

Warning - on my laptop with a Turion dual-core CPU running at 1.6Ghz, the query (with rownum <= 25) ran for 91 seconds under 10.2.0.3. The run time will double for every row you add to the index organized table (IOT) that I’ve used to hold the driving data.

As a flavour of what you might see, here are a few results from my (dual-core) laptop; the first set compares a few versions of Oracle, the second set compares a few scenarios in the same version:

  • 8.1.7.4: Single run 59.35 CPU seconds
  • 9.2.0.8: Single run 64.56 CPU seconds
  • 10.2.0.3: Single run 91.5 CPU seconds

Now trying to cause collisions (using 9.2.0.8, which was the worst case by a significant margin):

  • Single run: 64.56 CPU seconds
  • Two concurrent runs, using different tables: 70.5 CPU seconds each
  • Two concurrent runs against the same table: 177.8 CPU seconds each, plus 7.7 seconds each on “latch free” waits.

The extra CPU time for the last test was basically the consequence of 33 million misses and spin_gets (out of 67M gets) for the cache buffers chains latch. The latch sleep time, by the way, came from 1,500 sleeps (each).

Moral: Although the test is deliberately designed to exaggerate the issue, latch sleeps aren’t necessarily your biggest performance problem when your system is running with a high degree of concurrency - misses and spin_gets can have a significant impact on your CPU usage: and the performance impact can be exacerbated if your basic CPU load is high.

 
Footnote: Here’s a link to an interesting little note from Hermant Chitale about CPU disappearing “outside” Oracle.

 

May 9, 2008

Manual Optimisation - 2

Filed under: Performance, Tuning — Jonathan Lewis @ 1:13 pm UTC May 9,2008

A few days ago I posted an example of SQL that could be used to reduce the impact of sorting a large volume of data by sorting the smallest possible subset of the data with its rowids, and then joining back to the original table by rowid.

This produced a few comments, backed by Tom Kyte, about the dangers of depending on (a) SQL returning data in order without a final “order by” clause, (b) the exact and unchangeable use of hints, and (c) an assumption that internal mechanisms would not change.

It’s worth saying a little more about these issues, but I thought I’d start with the background to the SQL that appeared in the previous post as it’s actually derived from a generic strategy that I’ve used a couple of times as a temporary performance fix for Web-based applications.

The basic requirement for many Web-based reporting systems is to be able to run “page-based” reports, which means the ability to respond efficiently to queries like: “return rows 21 to 40 of an ordered set” - searches of Google or Amazon give you the general principle of the need for this type of pagination.

The mechanism of using a couple of “rownum” predicates against an inline view is quite well known as a way of optimising this type of page-based access; but it usually requires you to build a suitable index to support the underlying query. For example:


select
	v2.id,
	v2.small_vc
from
	(
	select
		v1.id,
		v1.small_vc,
		rownum	rn
	from
		(
		select
			t1.id,
			t1.small_vc
		from
			t1
		where
			t1.rep = 100
		order by
			t1.id
		)	v1
	where
		rownum <= 20	-- First N rows, typically a bind variable
	)	v2
where
	v2.rn >= 11		-- Last M rows, typically a bind variable
order by
	v2.rn
;

Execution Plan (10.2.0.3 - dbms_xplan.display_cursor() edited to remove columns)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| Id| Operation                       | Name  | E-Rows | A-Rows | Buffers | Used-Mem |
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|  1|  SORT ORDER BY                  |       |     20 |     10 |       4 | 2048  (0)|
|* 2|   VIEW                          |       |     20 |     10 |       4 |          |
|* 3|    COUNT STOPKEY                |       |        |     20 |       4 |          |
|  4|     VIEW                        |       |     21 |     20 |       4 |          |
|  5|      TABLE ACCESS BY INDEX ROWID| T1    |     21 |     20 |       4 |          |
|* 6|       INDEX RANGE SCAN          | T1_PK |        |     20 |       3 |          |
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Predicate Information (identified by operation id):
- - - - - - - - - - - - - - - - - - - - - - - - - -
   2 - filter(”V2″.”RN”>=11)
   3 - filter(ROWNUM<=20)
   6 - access(”T1″.”REP”=100)

For the best performance both the “where” clause and “order by” clause have to be “captured” in a single index, where the “where” clause is an equality condition against the leading column(s) of the index, and the “order by” clause follows the next columns of the index definition.

If you can’t match these requirements, your query will have to collect all the data identified by the “where” clause and sort it according to the “order by” clause before restricting the result set to the single “page” that you wanted. If you do meet the requirements, the optimizer is able to produce a plan that can avoid some (most) of the work by “pushing” the row limit inside the index range scan, as indicated above.

So what do you do when you have to handle a query that does something a little more awkward. Here’s an example of how to minimise the overheads:


select
	/*+
		qb_name(main)
		leading(ptr@main t1@main)
		use_nl(@main t1@main)
	*/
	t1.*
from
	(
	select
		/*+ qb_name(last_M) */
		rid,	rn
	from	(
		select
			/*+ qb_name(first_N) */
			rid,
			rownum	rn
		from
			(
			select
				/*+ qb_name(rowids) */
				rowid	rid
			from
				t1
			where
				log_id = 1
			and	transaction_date >= trunc(sysdate - 1)
			order by
				transaction_id	desc,
				id 		desc
			)
		where
			rownum <= 500
 		)
	where rn >= 481
	)			ptr,
	t1			t1
where
	t1.rowid = ptr.rid
order by
	ptr.rn
;

The driving query doesn’t acquire all the data it needs, it acquires the minimum necessary data, which (in this case) is just the set of rowids for the target data, and sorts that minimum set according to the required ordering clause.

Once you have the sorted set of rowids, you can do the usual “get_N / discard_M” trick with the rownum to get the rowids for the page you want - and with these rowids you can do an efficient join back to the “real” data, accessing exactly the required rows in the most efficient way possible.

In these circumstances, ending the statement with an “order by” clause that repeats the ordering implied by the earlier “order by” clause will ensure that the final result is ordered correctly with only a small increase in the work load … even if you think the “order by” seems to be unnecessary.

But I’ve had to include some hints in the final join back to make sure that the optimizer does something that I know to be sensible.  The example shows the stage where we’tr trying to get rows 481 to 500 of the underlying report - and in this case the optimizer can use the 500 that appears in the first “rownum” predicate as part of its cost calculation; but it can’t handle the 481 that appears in the second “concealed” rownum predicate. So the optimizer’s estimate of cardinality for this query is 500, despite the fact that we know that the size of the result is going to be 20.

The result of this is that (with my data set) a point came as I paged through the data where the optimizer switched from the nested loop with rowid access (low cardinality path) to a hash join that scanned the t1 table (high cardinality path) - completely defeating the point of the complex query.

My clever query depends on hints to do what I know to be the right thing all the time, every time.

But how much of a surprise is that ? Go back to the simpler example at the start of this page. Although we’ve written a query that should obviously walk through an index to pick up 20 rows at a time in the right order, it is perfectly feasible for the optimizer to ignore the index when optimising this query, especially if you are after (say) rows 201 to 220 … so even in this simple case you really need to include at least one hint (viz: to use the apporpriate index) to make sure that the path doesn´t go wrong.

The point I want to make, of course, is this - any time you want “page-based”  SQL to operate with maximum efficiency, you are trying to do something that the optimizer has not been programmed to do, for reasons that only you can see. So you will have to supply some hints to block any execution paths that might be inefficient. The argument that my original SQL depends on hints to work efficiently is not a sufficiently powerful argument to stop you from using it - it’s just a reminder that (as with all code that’s hinted) you have to document and manage the code properly.

In the original article a more important criticism of the query I showed was that it depended on assumptions about the implementation of a particular join mechanism. That is a much more powerful criticism - and one that I shall address in my next note on this topic.

[Back to Manual Optimisation part 1]

 

May 5, 2008

Hash Clusters - 1

Filed under: Infrastructure, Performance, Tuning — Jonathan Lewis @ 7:36 pm UTC May 5,2008

Someone posted a question on the Oracle Forum yesterday about single table hash clusters, and I’ve given some initial commentary on the topic. Since this might be generally useful - and since there’s quite a lot more that could (and probably will) be said on the topic I thought I’d put a link to my respone here.

 

May 2, 2008

Rules for Hinting

Filed under: Hints — Jonathan Lewis @ 8:13 am UTC May 2,2008

I’ve written several notes about hinting, such as:

In fact, by using at the “Select Category” list to the right, I see that I have (so far) tagged 26 different articles (and this will be the 27th) with the hints tag. So I’ve decided it was time that I made clear my basic guidelines on safe hinting, as follows:

  1. Don’t
  2. If you must use hints, then assume you’ve used them incorrectly.
  3. On every patch or upgrade to Oracle, assume every piece of hinted SQL is going to do the wrong thing - because of (2) above; you’ve been lucky so far, but the patch/upgrade lets you discover your mistake.
  4. Every time you apply some DDL to an object that appears in a piece of hinted SQL, assume that the hinted SQL is going to do the wrong thing - because of (2) above; you’ve been lucky so far, but the structural change lets you discover your mistake.

You will appreciate from these guidelines that I don’t really approve of using hints. The only reason that I leave them in place on a producton system is when I’m sure that there is no alternative safe mechanism for making the optimizer do what I want.

What I use them for on test systems is to check whether a particular execution plan is actually possible, and to track down bugs in the optimizer.

Finally, for the purposes of education, I use them to demonstrate execution plans without having craft data sets and set database parameters so make a plan appear ’spontaneously’.

Always be cautious about adding hints to production systems.

 

April 27, 2008

Manual Optimisation

Filed under: Execution plans, Performance, Tuning — Jonathan Lewis @ 5:21 pm UTC Apr 27,2008

[Forward to Manual Optimisation part 2]

Warning: The following note supplies an example of a mechanism that I believe to be safe in Oracle 8i, 9i, and 10g (except when hidden parameter _optimizer_ignore_hints is set to true) and is probably safe in 11g.

However, the SQL shown is expected to return data in a given order without a final “order by” clause, and it is Tom Kyte’s opinion  that it would be dangerous to use this code on a production system.  

I agree with Tom’s argument - to the degree that any hinted code or any code that depends on side-effects of the current mechanics should be treated with great caution and its use should be documented, kept to a minimum, and checked regularly.

The following question appeared in a recent post on the Oracle DBA Forum.

We have a developer here who thinks that DBAs can make things happen. His query is against a single table and his result set is around 300,000 records and he wants to sort this by two columns (transaction_id desc, id desc) where id is the primary key in the table. The query does not take a long time but the sort part is the problem. Is this doable or not. His query is:


select  *
from    event
where   log_id = 1
and     date_posted > trunc(sysdate-1)
order by
        transaction_id desc,
        id desc
;

Before you get too busy trying to speed up such a query, one of the important questions you have to ask in response is: “Why does anyone want to retrieve 300,000 rows from the database?”

There are valid reasons for doing so, of course. You may, for example, be driving a statistical analysis  package that needs to acquire large volumes of raw data (though the order by then looks a little suspect); but no-one reads 300,000 rows of a report, and if the front-end code is acquiring the data to summarise it then it’s probably better to summarise it in the database.

Let us assume, however, that (a) the front-end really does need to the get this data, (b) it does need to appear in sorted order, and (c) it is a relatively small extract from a much larger dataset. Is there any way to make this more efficiently.

The main problem we have to face is that we have a range-based predicate that gets a lot of data combined with an order by clause that uses a different set of columns from the predicate list. This makes it difficult to optimize the query through a simple index. (If we could change the second predicate to date_posted = trunc(sysdate-1) then an index on (log_id, date_posted, transaction_id desc, id desc) would allow the optimizer to do an index range scan, access the table, and return the data in order without sorting).

Note (2nd May) - at this point I managed to change from ‘date_posted >’ to ‘transaction_date >=’ without spotting it. I have left this change in place, given a conversation about it that appears in the comments.

We might be lucky, of course. If there is a strong correlation between log_id and transaction_date such that most of the data with log_id=1 also had transaction_date >= trunc(sysdate - 1), then we might get a reasonable response time by creating the index (log_id, transaction_id desc, id desc) - allowing Oracle to use an index range scan to access the data in the right order picking up a little more data (we hope) than we need, then discarding the (few, we hope) rows which fail the test on transaction_date.

Since the stated problem is one of sorting - rather than accessing - the data, we won’t consider the options for creating a ‘fat index’ (see Book Reviews for an explanation of this term).

So what does that leave us with ? We have been told that it is the sorting that is the problem - so possibly the number of columns in that “select *” is large, making the total volume of data (although “only” 300,000 rows) very large. Is there any way we could minimise the sorting if we can’t build a “perfect” index ?

The answer is yes - although it may require a little lateral thinking.

In this case, we will need a slightly smaller and simpler index - no descending columns - than the four column index shown above: (log_id, date_posted, transaction_id, id) but we’re not going to use it in the standard way. Instead we’re going to take a two-pass approach to the problem. We start with the following SQL:


select
	rowid	rid
from
	t1
where
	log_id = 1
and	transaction_date >= trunc(sysdate - 1)
order by
	transaction_id	desc,
	id		desc
;
 

This gets us the rowids of all the rows that our main query wants, and because of the order by clause in this query, if we visited the rows in the order of our result set, we would be visiting them in exactly the order that would satisfy our original query.

Although we are still performing a sort operation here, we are sorting the smallest possible set of values that we could get away with, and once we have the rowids in order we can use them to visit the table with no further sorting. So all we have to do is stick this initial query into a non-mergeable  inline view:


select
	/*+
		qb_name(main)
		leading (ptr@main t1@main)
		use_nl(@main t1@main)
		rowid(@main t1@main)
	*/
	t1.*
from
	(
	select
		/*+
			no_merge
			qb_name(rowids)
			no_eliminate_oby(@rowids)
			index(@rowids t1(log_id, transaction_date, transaction_id, id))
		*/
		rowid	rid
	from
		t1
	where
		log_id = 1
	and	transaction_date >= trunc(sysdate - 1)
	order by
		transaction_id	desc,
		id 		desc
	)	ptr,
	t1
where
	t1.rowid = ptr.rid
;

This query gets us the data we want, in the order we want, with the smallest possible sort. The cost/benefit analysis we have to do is simple: does the random access to the table that this execution path gives us result in less work than any other mechanism that gets the data and sorts entire rows. The answer to that question depends on the data volume and distribution.

You’ll notice the great stack of 10g hints - I have used this type of code on 8i and 9i in the past, but I’ve always put in a lot of hints to minimise the risk of Oracle doing something “clever” to spoil my plan. 10g introduces cost-based query transformation - and has far more options for unwrapping any clever tricks you introduce to SQL; so if you’re going to try messing about with “manual” optimisation you do need a lot of hints to block every new feature that might cause problems. (Highlighted following note 18 and its link to AskTom - see also “Rules for Hinting”)

The most important hint in this example is probably the no_eliminate_oby hint, which I have described in the past. Without this hint you may find that the optimizer decides that the order by adds no value to the inline view, and eliminates it - while still honouring the no_merge hint.

The execution plan you get from this query, running under 10.2.0.3 in my case, is as follows:


-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |  3400 |  1470K|  3420   (1)| 00:00:42 |
|   1 |  NESTED LOOPS               |       |  3400 |  1470K|  3420   (1)| 00:00:42 |
|   2 |   VIEW                      |       |  3400 | 40800 |    17   (6)| 00:00:01 |
|   3 |    SORT ORDER BY            |       |  3400 |   106K|    17   (6)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN        | T1_I1 |  3400 |   106K|    16   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY USER ROWID| T1    |     1 |   431 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("LOG_ID"=1 AND "TRANSACTION_DATE">=TRUNC(SYSDATE@!-1) AND
              "TRANSACTION_DATE" IS NOT NULL)

As you can see, the view operator tells us that we’ve created an ordered list of the inline view, and then, for each rowid in the list, accessed the t1 table by “user” rowid - just as we wanted.

So - when you can’t do what you want with simple SQL, remember that you can always decompose a query into steps to reduce the size of the ‘hard work’, then use an intermediate result set (possibly joined back to the original table, as in my example) to collect the rest of the data. It’s a technique I have used successfuly to suppy short-term fixes to performance issues that some of my clients have had on their web-based applications.

[Forward to Manual Optimisation part 2]

April 25, 2008

Cardinality Change

Filed under: CBO, Execution plans, Statistics, Troubleshooting — Jonathan Lewis @ 8:13 pm UTC Apr 25,2008

Here’s an interesting point that’s just been raised on from the Oracle-L list-server:

Metalink bug: 5483301.8

If a column has a frequency histogram and a query uses an equality predicate over the column looking for a value that does not exists in the histogram then the CBO was estimating a cardinality of 1. This could favor Nested Loops too much.

The fix introduces a notable change in behavior: instead of estimating a cardinality of 1 then with this fix CBO estimates a cardinality of (0.5 * the lowest cardinality found in the histogram). This can result in plan changes.

Funnily enough, I had a client who ran into a problem assoiated with this cardinality effect just a couple of weeks ago - although there “real” problem was that the automatica stats collection routine in 10g had created a histogram that wasn’t needed. So I was nearly pleased to see the change - except that (a) it spoils the solution I’ve previously used for this problem and (b) any change is likely to be bad from some people.

The latter problem appears as bug 6082745.8 - which says that if the patch for 5483301 causes you problems, you either have to drop the frequency histogram, or you could “unfix” the patch by setting parameter “_fix_control” to ‘5483301:off’

 

March 20, 2008

Humour

Filed under: Uncategorized — Jonathan Lewis @ 5:01 pm UTC Mar 20,2008

Tom Kyte and Pete Finnigan are both well known for talking about the need to use bind variables and the dangers of SQL Injection. This cartoon must have been written just for them.

ANSI SQL

Filed under: CBO, Infrastructure, Troubleshooting — Jonathan Lewis @ 4:26 pm UTC Mar 20,2008

Someone sent me an email a little while ago about a problem they were having with two databases that were using different execution plans for the same query. 

But the two databases were believed to be identical, and the optimizer was running rule-based in both cases, so it shouldn’t have been possible to get different execution plans.

Here’s a little catch (cut and paste from a session running under 9.2.0.6):


SQL> alter session set optimizer_mode = rule; 

Session altered. 

SQL> set autotrace traceonly explain
SQL> select t1.n2, t2.n2
  2  from t1 left join t1 t2 on t1.n1 = t2.n1
  3  ; 

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE (Cost=192 Card=450000 Bytes=7200000)
   1    0   HASH JOIN (OUTER) (Cost=192 Card=450000 Bytes=7200000)
   2    1     TABLE ACCESS (FULL) OF 'T1' (Cost=75 Card=150000 Bytes=1200000)
   3    1     INDEX (FAST FULL SCAN) OF 'T1_I1' (NON-UNIQUE) (Cost=25 Card=150000 Bytes=1200000)

Notice how line 0 of the plan shows that we are running RULE based. But every line has a cost, and the join method is a hash join - which is an option only available to the cost based optimizer.

The use of ANSI syntax has caused the RBO to hand off to the CBO. Once you’re running cost-based with no statistics, there are all sorts of little things that might be enough to cause unexpected behaviour in execution plan.

March 11, 2008

Everything Changes

Filed under: CBO, Execution plans, Statistics — Jonathan Lewis @ 8:59 pm UTC Mar 11,2008

Sometimes you look at a change in the optimizer arithmetic and think “Why did that take so long to appear?” Here’s an example that’s probably going to cause a lot of heartache over the next couple of years as people go through the upgrade process. I have a query that looks like this:

select
	padding
from
	t1
where
	n1	= 2
and	n2	= 2
;

Here are the execution plans from 10.1.0.4, 10.2.0.3, and 11.1.0.6 respectively:


Execution plan (10.1.0.4)
---------------------------------------------------------------------------
| Id  | Operation                   |  Name       | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |    10 |  2470 |    12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1          |    10 |  2470 |    12 |
|*  2 |   INDEX RANGE SCAN          | T1_I1       |    10 |       |     4 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"=2 AND "T1"."N2"=2)
.
.
Execution plan (10.2.0.3)
---------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    10 |  2470 |   168 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    10 |  2470 |   168 |
|*  2 |   INDEX RANGE SCAN          | T1_I1 |   200 |       |    25 |
---------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=2 AND "N2"=2)
.
.
Execution plan (11.1.0.6_)
---------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   200 | 49400 |   168 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |   200 | 49400 |   168 |
|*  2 |   INDEX RANGE SCAN          | T1_I1 |   200 |       |    25 |
---------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=2 AND "N2"=2)

To get the 10.2 and 11.1 plans, I actually had to hint the queries. (And I’ve cheated just a little bit - I had an extra constant, padding, column in the middle of the index to make it larger than the actual values suggest; but I’ve removed that column reference from the predicate list)

Notice how the cardinalities (rows) and costs change. I have 4,000 rows in the table, and the n1 and n2 columns have 20 distinct values each. However, the way I’ve defined the data means that n1 and n2 are always the same.

In 10.1, the optimizer does it’s arithmetic by considering the n1 and n2 values separately - so it decides there are 400 (20 * 20) possible distinct combinations of n1 and n2, so for both the index and table selectivity it decides that the cardinality will be 10 (4,000 / 400). This means the index scan will be small (cost = 4) and we will visit just a few blocks in the table (incremental cost = 8).

In 10.2, the optimizer uses the index distinct_keys to calculate the number of index entries scanned. There are 20 distinct combinations, so we visit 200 entries (4,000 / 20). This is a larger range scan (cost = 25) and we have to visit a lot more blocks in the table (incremental cost = 143). Unfortunately, when we get to the table the optimizer decides that the number of rows we find is derived by considering the number of distinct column values separately - so we are back to 10 (4,000 / (20 * 20)).

Finally, in 11.1, the optimizer calculates the index selectivity and the table selectivity from the number of distinct keys in the index; so not only are the index and table costs high, the estimate for the number of rows in the table is also 200.

In many cases the incremental enhancements to the optimizer will make no difference to most people - but occasionally the increased costs or the increased cardinalities will change an execution plan, possibly making it better, possibly making it worse.

It’s important to find the differences in simple examples like this, because it makes it so much easier to understand what’s changed in complex examples.

Footnote: I had forgotten that I previously published this example (without the 11g change) some time ago.

March 9, 2008

Cursor_sharing

Filed under: CBO, Execution plans, Hints, Troubleshooting — Jonathan Lewis @ 9:13 pm UTC Mar 9,2008

From time to time I’ve commented on the fact that setting cursor_sharing to force or similar may be a temporary workaround to bad coding practises, but that it can introduce problems, has a few associated bugs, and shouldn’t  be viewed as a guaranteed, or long-term, solution.

I don’t think I’ve ever published any specific examples, though, of the oddities that can appear when you enable the feature - so here’s a simple one to watch out for:


select
	{list of columns}
from	{table}
where	rownum = 1
;

What difference will setting cursor_sharing to force make for this query?

Here are the “before” and “after” execution plans in 10.2.0.3 (pulled out of memory using the dbms_xplan.display_cursor() procedure) for an example of this type of query. :


SQL_ID  16x8y4zkwqhtt, child number 0
-------------------------------------
select  /*+ not sharing */  small_vc from t1 where rownum = 1        

Plan hash value: 3836375644
-----------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     2 |
|*  1 |  COUNT STOPKEY     |      |       |       |       |
|   2 |   TABLE ACCESS FULL| T1   |     1 |    11 |     2 |
-----------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM=1)
.
.
.
SQL_ID  56rbgjy8h1yaa, child number 0
-------------------------------------
select  /*+ sharing */  small_vc from t1 where rownum = :"SYS_B_0"        

Plan hash value: 624922415
------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost  |
------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |    16 |
|   1 |  COUNT              |      |       |       |       |
|*  2 |   FILTER            |      |       |       |       |
|   3 |    TABLE ACCESS FULL| T1   |  3000 | 33000 |    16 |
------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(ROWNUM=:SYS_B_0)        

The question, of course, is how much difference that extra filter operation at line 2 makes to the performance. If you can’t guess from the plan, you can always re-run the queries with autotrace enabled to capture the execution statistics:


Statistics (cursor_sharing = exact)
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        419  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
.
.
.
Statistics (cursor_sharing = force)
———————————————————-
          1  recursive calls
          0  db block gets
         97  consistent gets
          0  physical reads
          0  redo size
        419  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed       

In the first run the optimizer could see that it had to stop after one row of the tablescan. But when the rownum target was forced into a bind variable, the optimizer produced a plan that made the entire tablescan happen with a check on every single row returned to see if its row number matched the input bind variable!

This anomaly is fixed in 11g - the filter operation disappears - but if you’ve got code like this in any earlier versions of Oracle, then it needs to be fixed. Funnily enough, changing the predicate from “rownum = 1″ to “rownum <= 1″ is sufficient to allow Oracle to get the right answer with the minimum of work.

There are a couple of other common anomalies that appear when you enable the feature, and I may find time to write about them in the future. In the interim you might want to investigate adding the hint /*+ cursor_sharing_exact */ (which appeared in 9i) to any queries that behave sufficiently badly when you fiddle with the cursor_sharing parameter - this gives you a localised way of turning sharing off.

Next Page »

Blog at WordPress.com.