Someone recently sent me a request about a piece of SQL they could not optimise. I don’t usually respond to private requests – it’s not an effective use of my time – but their example was something that pops up relatively frequently as a “bug” – so I thought I’d mention it here.
The SQL looked like this:
insert into tab3
select
        ...                     -- small result set
from
        tab1@dblink     t1      -- large data set
where
        t1.col1 in (
                select  ...
                from    tab2    -- small data set
        )
;
When it comes to setting the optimizer_mode parameter you often hear people say that first_rows_N (for one of the legal values of N) should be used for OLTP systems and all_rows should be used for decision support and data warehouse systems.
There is an element of truth in the statement – but it’s really a hangover from the early days of CBO, and remembrance of the old first_rows optimizer mode (** See footnote).
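For reference, these are the settings people are usually arguing about. The parameter can be set at the session (or system) level, or the same choice can be made per statement with a hint; the table and column names in the queries below are just placeholders:

alter session set optimizer_mode = first_rows_10;       -- the legal values of N are 1, 10, 100 and 1000
alter session set optimizer_mode = all_rows;

select /*+ first_rows(10) */ * from t1 where col1 = 99;
select /*+ all_rows       */ * from t1 where col1 = 99;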
It’s quite surprising that I still see people arguing about the fastest way to “count the rows in a table”; usually with suggestions that one or other of the following queries will be faster than the rest:
- select count(*) from tab;
- select count(1) from tab;
- select count(primary_key_column) from tab;
A question came up on the Oracle database forum a few months ago asking:
What are the benefits and the downside of using IOTs on 16k blocks? Would you recommend it?
I think the best response to the generic question about block sizing came from Greg Rahn in another thread on the forum:
If someone has to ask what block size they need, the answer is always 8KB.***
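For anyone who hasn't tried it, a non-default block size needs its own buffer cache before you can create the tablespace, so the setup looks roughly like the following sketch (the file name, sizes, and table definition are purely illustrative):

alter system set db_16k_cache_size = 64m;

create tablespace ts_16k
        datafile '/u01/oradata/ts_16k_01.dbf' size 512m
        blocksize 16k;

create table iot_demo (
        id              number,
        padding         varchar2(100),
        constraint iot_demo_pk primary key (id)
)
organization index
tablespace ts_16k;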
[Back to Manual Optimisation part 2]
This little series started from a note I wrote about manual optimisation where I took advantage of a sort operation in a non-mergeable view to produce sorted data from a final nested loop join without including an “order by” that would have produced a large sort operation.
In fact, as I showed in a follow-up post, this was taking a convenient pagination mechanism to an extreme – and you might decide (with good reason, as Tom Kyte did) that it was an extreme that should not be used.
I received an email a little while ago with an unusual problem. It said:
“One of the jobs which used to take more than one hour to complete is now completing in less than 10 minutes. Neither the application developer nor we (the DBAs) made *any* changes in the environment/code/database. I can’t work out why it’s got better!”
It’s not often that “going faster” is a problem – but there’s a very good reason for being worried about jobs that go faster for no apparent reason – one day your luck is going to run out and the jobs are going to go slower again – and people really notice when things slow down.
Here’s a summary of a question that appeared on the Oracle Forum some time ago:
I had been seeing frequent log file switching (resulting in “checkpoint not complete” reports) at night when some export dumps have to occur simultaneously, so I increased my redo log file sizes from 5MB to 10MB.
I now have a user who complains that the system is slow, and I see that the buffer cache hit ratio (BCHR) has dropped to about 90% from what was usually 95% or higher.
Can anyone tell me whether increasing the log file size could cause a performance decrease (and thus buffer cache hit ratio decrease)?
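As an aside, redo log files can’t be resized in place, so “increasing the log file size” generally means something like the following sketch (group numbers and file names are illustrative):

alter database add logfile group 4 ('/u01/oradata/redo04.log') size 10m;
alter database add logfile group 5 ('/u01/oradata/redo05.log') size 10m;
alter database add logfile group 6 ('/u01/oradata/redo06.log') size 10m;

-- switch and checkpoint until the old groups are no longer active, then drop them
alter system switch logfile;
alter system checkpoint;
alter database drop logfile group 1;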
I don’t really have anything to do with XMLDB (beyond the fact that it’s an application built in an Oracle database, of course, and subject to tuning and bugs just like any other application), so it was nice to get an email last night from Marco Gralike telling me that the next release of XMLDB was going to include a little enhancement I had suggested to solve a performance issue he was facing with a simple “count(*)” query.
My suggestion was to add a not null constraint to the hidden sys_nc_oid$ column (the column holding the object ID of an object table) so that the existing index on it could be used for the count. As I said in a posting on Oracle-L, I couldn’t think of any reason why this would be illegal – and now the constraint is (or will be) official.
Full details are on Marco’s blog.
Have you ever created an index on a column with a name like “last_update_date” – or maybe even a function-based index on “trunc(last_update_date)” ?
You can probably guess the purpose of the column from its name – but could you also guess what state that index is going to be in a few weeks after you’ve created it?
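In other words, something along these lines (table and index names invented for the example):

create index ord_upd_idx on orders(last_update_date);

-- or the function-based variant
create index ord_upd_trunc_idx on orders(trunc(last_update_date));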
A question about reporting data one page at a time came up on the Oracle Database Forum a couple of days ago – this variation on the “Top N” class of questions is becoming more common as more applications get web-based front-ends, but this example was a little more subtle than usual – so I spent a few minutes seeing if I could work out a quick answer, which I then posted to the forum.
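For background, the classic “Top N” pagination template that most of these questions start from looks something like this sketch (the table, columns, and bind variable names are invented for illustration):

select  *
from    (
        select  /*+ first_rows(20) */
                v.*, rownum rn
        from    (
                select  id, col1, col2
                from    t1
                order by
                        col1, id
                ) v
        where   rownum <= :upper_bound
        )
where   rn > :lower_bound
;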
A recent post on the OTN forum asked:
I was wondering if there is any fast method for updating 8 million records out of a 10 million row table? For example:
I have a customer table of 10m records, with columns cust_id, cust_num and cust_name.
I need to update 8m of the 10m records as follows:
update customer set cust_id=46 where cust_id=75;
The above statement will update 8m records, and cust_id is indexed.
[Forward to Part 2]
Towards the end of April, I published a note about manual optimisation, and mentioned in one of the comments (#19) that as part of the discussion of the (slightly suspect) mechanism I had introduced I would eventually get around to talking about sorted hash clusters. So I’ve finally managed to make a start.
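As a reminder of the basic syntax, a sorted hash cluster looks something like the following sketch, modelled on the example in the Oracle documentation (the names and sizing figures are illustrative only):

create cluster call_cluster (
        call_number     number,
        call_timestamp  number sort
)
hashkeys 10000
hash is  call_number
size     256;

create table call_detail (
        call_number     number,
        call_timestamp  number          sort,
        call_info       varchar2(30)
)
cluster call_cluster (call_number, call_timestamp);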
The following request appeared on the Oracle Forum a few days ago:
I have a select query
select col1,col2,col3 from table1 order by col1,col3
This table initially contained 4.5 million records and the select was returning the data in less than 2 minutes.
The query has been running for the last year without any issues, but since last week it has been taking more than 15 minutes to complete. There has been no change to the database or other components.
How can I find out the root cause of this issue? Is there any specific area I should check?
There are several performance problems that show up only when you start running concurrency tests, and sometimes you need to manage a very precise degree of synchronisation to demonstrate these problems repeatably in a test environment.
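One simple way to get that kind of synchronisation (this is just a sketch: the lock name is arbitrary, error handling is minimal, and you need execute privilege on dbms_lock) is to make the test sessions queue on a user-defined lock, so that the second session cannot proceed until the first session releases the lock at exactly the right moment:

-- Session 1: acquire the lock before putting the data into the "interesting" state
-- (note: allocate_unique issues a commit)
declare
        m_handle        varchar2(128);
begin
        dbms_lock.allocate_unique('SYNC_POINT', m_handle);
        if dbms_lock.request(m_handle, dbms_lock.x_mode, release_on_commit => false) != 0 then
                raise_application_error(-20001, 'Lock request failed');
        end if;
        --  ... set up the critical state, then call dbms_lock.release(m_handle) when ready
end;
/

-- Session 2: blocks on the request until session 1 releases the lock (or disconnects),
-- then runs its part of the test at exactly the right moment
declare
        m_handle        varchar2(128);
begin
        dbms_lock.allocate_unique('SYNC_POINT', m_handle);
        if dbms_lock.request(m_handle, dbms_lock.x_mode) != 0 then
                raise_application_error(-20001, 'Lock request failed');
        end if;
        --  ... the concurrent activity goes here ...
end;
/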
[Forward to Manual Optimisation part 3]
A few days ago I posted an example of SQL that could be used to reduce the impact of sorting a large volume of data by sorting the smallest possible subset of the data with its rowids, and then joining back to the original table by rowid.
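The shape of the query was roughly as follows (the table and column names are invented for illustration): sort just the rowids, ordered by the sort column, inside a non-mergeable inline view, then join back to the table by rowid so that the rows come back in the order the view supplies them:

select  /*+ leading(v) use_nl(t) */
        t.*
from    (
        select  /*+ no_merge */
                rowid           rid
        from    big_table
        order by
                sort_col
        ) v,
        big_table       t
where   t.rowid = v.rid
;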