In the previous note on local indexes I raised a couple of questions about the problems of different partitions holding different volumes of data, and supplied a script to build some sample data that produced the following values for blevel across the partitions of a list-partitioned table.
November 14, 2010
October 11, 2010
I recently came across a tidy solution to a common problem – how to minimise code maintenance in a procedure while maximising flexibility of the procedure. The task was fairly simple – create a ref cursor for a calling program to return data that (a) followed complex selection rules and (b) allowed the user to specify numerous types of input.
The principle was simple – the final ref cursor was driven by a list of (say) order ids – and the details to be returned about those orders required some fairly complex SQL to execute. To separate the complexity of constructing the list of columns from the complexity of identifying the required rows the developers had split the procedure into two stages: first, select the list of relevant order ids using one of several possible statements – the appropriate statement being derived from analysis of the inputs to the procedure; second, open a ref cursor using that list of order ids. In this way if a new set of rules for selection appeared the only new code needed was a new query to select the ids – the main body of code didn’t need to be modified and re-optimised.
October 7, 2010
In an article that I wrote about the /*+ driving_site */ hint a few months ago I pointed out that the hint was not supposed to work with “create table as select” (CTAS) and “insert as select”. One of the people commenting on the note mentioned pipelined functions as a workaround to this limitation – and I’ve finally got around to writing a note about the method.
The idea is simple. If you can write a distributed select statement that takes advantage of the /*+ driving_site */ hint to work efficiently, you can wrap the statement in a pl/sql cursor loop and stick that loop into a pipelined function to maximise the efficiency of create or insert as select. Here’s some sample code to demonstrate the principle:
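A minimal sketch of the approach (the table names, object type, and database link name are all invented for illustration – not the original sample code):

```sql
-- Hypothetical object and collection types to carry rows out of PL/SQL
create or replace type t_order_row as object (
        order_id        number,
        order_value     number
);
/

create or replace type t_order_tab as table of t_order_row;
/

-- The pipelined function runs the hinted distributed query - so the
-- driving_site hint is honoured - and pipes the rows back to the caller.
create or replace function get_remote_orders
return t_order_tab pipelined
as
begin
        for r in (
                select  /*+ driving_site(ord) */
                        ord.order_id, ord.order_value
                from    orders@remote_db        ord,
                        local_driver            flt
                where   ord.order_id = flt.order_id
        ) loop
                pipe row (t_order_row(r.order_id, r.order_value));
        end loop;
        return;
end;
/

-- CTAS (or insert as select) now queries the function rather than the
-- distributed tables, sidestepping the CTAS limitation on the hint:
create table orders_copy as
select * from table(get_remote_orders());
```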
September 30, 2010
Here’s a hidden threat in the optimizer strategy that may cause performance problems if you’re trying to operate a series of batch updates (or batch deletes).
In the past I’ve pointed out that a predicate like “rownum <= N” generally makes the optimizer use “first_rows(N)” optimisation methods – known in the code as first_k_rows optimisation.
This isn’t true for updates and deletes, as the following simple example indicates:
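The original example isn’t reproduced here, but the shape of the contrast (table and column names invented) is roughly this – the select is costed with first_k_rows, the update with the identical predicate is not:

```sql
-- Select: the "rownum <= 200" predicate prompts first_rows(200)
-- (first_k_rows) optimisation of the query.
select  *
from    t1
where   processed = 'N'
and     rownum <= 200;

-- Update: despite the identical predicate the statement is costed
-- with all_rows, which may produce a very different execution plan
-- for the batch job.
update  t1
set     processed = 'Y'
where   processed = 'N'
and     rownum <= 200;
```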
September 19, 2010
There’s a thread on OTN that talks about a particular deletion job taking increasing amounts of time each time it is run.
It looks like an example where some thought needs to go into index maintenance and I’ve contributed a few comments to the thread – so this is a lazy link so that I don’t have to repeat myself on the blog.
August 29, 2010
From time to time someone will post a question about query performance on the OTN database forum asking why one form of a query returns data almost immediately while another form of the query takes minutes to return the data.
Obviously there are all sorts of reasons – the optimizer is not perfect, and different transformations may take place that really do result in huge differences in the work done by two queries which return the same result set – but a very simple reason that can easily be overlooked is the front-end tool being used to run the query.
August 27, 2010
I was on a customer site recently where I needed to add a NOT NULL constraint to a table of 200 million rows – without taking any downtime. It’s not difficult (provided you are happy with a check constraint rather than a column definition).
alter table t1 add constraint t1_ck_colX_nn check (colX is not null) enable novalidate ;
The first step creates the constraint and enables it – but doesn’t validate it. This means that future data (and changes) will obey the constraint, but there may be illegal data already in the table that will not be checked. You will have an interruption to service doing this, as your session will wait to lock the table in share mode (mode 4) to add the constraint – so will be blocked by current update transactions, and will block new update transactions. In a typical OLTP system this should result in just a brief pause.
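The second step, when you are ready to check the existing data, is to validate the constraint (using the constraint name from the statement above). Validating an already-enabled constraint reads the table without blocking concurrent DML, so it can safely run during normal processing:

```sql
-- Step 2: check the existing 200M rows.  The constraint is already
-- enabled, so this scan does not block (and is not blocked by)
-- ordinary insert/update/delete activity.
alter table t1 modify constraint t1_ck_colX_nn validate ;
```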
August 24, 2010
A couple of years ago I wrote about a poster on the OTN db forum who was puzzled by the fact that when he started rebuilding tables they got bigger. (Bad luck, sometimes that’s what happens!)
A few days ago a related question appeared: I rebuilt some indexes and my query got slower. (Bad luck, sometimes that’s what happens – again!)
If you rebuild an index it’s physically different and its statistics are different. Plans can change and go slower because the index stats look sufficiently different; plans can stay the same and go slower because the index is physically different.
I’ve added a couple of comments to the thread – there may still be some further mileage in it.
August 22, 2010
I think anyone who has read Wolfgang Breitling’s material about the optimizer will be familiar with the concept of Cardinality Feedback and one particular detail: when Oracle gets a cardinality estimate of one for a “driving” table then there’s a good chance that the execution plan will go wrong. (That’s not a rule, by the way, just a fairly common observation after things have gone wrong.)
A recent note on OTN reminded me of a particular scenario where this specific problem can occur. It’s not particularly common, but it may hit people who are building data warehouses from multiple different sources. We start with an unlikely looking data set and very simple query:
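The original data set isn’t reproduced here, but one classic way to manufacture a cardinality estimate of one (offered purely as an illustration, not as the example from the note) is a pair of perfectly correlated columns:

```sql
-- Illustration only: two perfectly correlated columns.  Each predicate
-- individually selects 1% of the rows, so without extended statistics
-- the optimizer multiplies the selectivities and estimates
-- 10000 * 1/100 * 1/100 = 1 row - when 100 rows actually match.
create table t1 as
select
        mod(rownum, 100)        n1,
        mod(rownum, 100)        n2,
        rpad('x', 100)          padding
from    all_objects
where   rownum <= 10000;

-- (gather stats on t1, then check the estimate for:)
select  count(*)
from    t1
where   n1 = 0
and     n2 = 0;
```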
August 15, 2010
The final join mechanism in my “all joins are nested loop joins” argument is the Merge Join – a join mechanism that depends on both its row sources being pre-sorted on the join columns. (For a quick reference list of URLs to all three articles in turn, see: Joins.)
August 10, 2010
In the second note on my thesis that “all joins are nested loop joins with different startup costs” I want to look at hash joins, and I’ll start by going back to the execution plan I posted on “Joins – NLJ”. (For a quick reference list of URLs to all three articles in turn, see: Joins.)
July 30, 2010
Here’s an example of how you have to think about conflicts of interest when dealing with problems of scalability. It starts with a request (that I won’t give in detail, and was a little different from the shape I describe below) from a client for advice on how to make a query go faster.
Basic problem: the query runs about 20 times per second, returning a very small number of rows; it’s basically a very simple “union all” of three query blocks that access the same table in slightly different ways.
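The shape of the query (table and column names invented – the client’s query was a little different) was roughly:

```sql
-- Three slightly different access paths into the same table,
-- glued together with union all; each branch returns few rows.
select  small_vc from t1 where id1 = :b1
union all
select  small_vc from t1 where id2 = :b1
union all
select  small_vc from t1 where id3 = :b1
;
```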
June 29, 2010
From time to time I’ve warned people that subquery factoring should be used with a little care if all you’re trying to do is make a query more readable by extracting parts of the SQL into “factored subqueries” (or Common Table Expressions – CTEs – if you want to use the ANSI term for them). In principle, for example, the following two queries should produce the same execution plan:
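The pair of queries isn’t reproduced here, but a hypothetical example of the kind of equivalence in question (names invented) looks like this – the factored version may be transformed back to the inline form, or the optimizer may instead decide to materialize the CTE as a temporary result, changing the plan:

```sql
-- Version 1: subquery factoring (CTE)
with ord as (
        select  customer_id, sum(order_value) total
        from    orders
        group by customer_id
)
select  c.customer_name, o.total
from    customers c, ord o
where   o.customer_id = c.customer_id;

-- Version 2: the same logic written as an inline view
select  c.customer_name, o.total
from    customers c,
        (
        select  customer_id, sum(order_value) total
        from    orders
        group by customer_id
        )           o
where   o.customer_id = c.customer_id;
```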
June 27, 2010
The following question came up in an email conversation a little while ago:
Are you aware of any problems a large oltp site might have with running index coalesce during production hours, as opposed to doing index rebuilds in a maintenance window?
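For reference, the two operations being compared look like this (index name invented):

```sql
-- Coalesce: tidies adjacent sparse leaf blocks in place; a relatively
-- gentle online operation, but it generates undo and redo.
alter index t1_i1 coalesce;

-- Rebuild: recreates the index from scratch; even with the online
-- option this is typically scheduled in a maintenance window.
alter index t1_i1 rebuild online;
```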
June 2, 2010
The old chestnut of “optimal block size” came up on OTN again a few weeks ago, with someone asking for advice on how to do some testing to decide on the optimal block size for a database. The correct answer to this question is that you don’t: you assume you are going to use the default size for your platform and then think about whether there are any very specific jobs that your application does that might gain some sort of worthwhile benefit if you used a non-default size.
Nevertheless, the OP came back some time later with a few results which suggested that some of his tests showed that a 4KB block size gave significantly better performance than the same tests using 8KB and 16KB block sizes.
But there’s a problem with the conclusion. If you examine the results carefully, and think about what type of work must happen in the tests, you realise that this particular test was not about the blocksize – it was about the network and the client program. (I haven’t included a link to the posting where I explained this – it’s just a little later in the same thread. This is just to give you the option of working out why the test is wrong before you read my comments about it.)
Update 18th Aug 2010
The investigation continues – with the OP comparing the results of using a table with a single 2000 byte column to a table with many columns of a similar total size. Again, though, the anomaly in timing he is chasing seems to be about network traffic time, NOT about database block size.
(I’ve only sent one reply to this thread at the moment, but the OP has been good at supplying extra data in the past, so the discussion may evolve to produce further interesting information.)