Function-based indexes are wonderful things – but they don’t always work exactly as expected. Here’s an example of one such anomaly.
Imagine you have some type of “orders” table where most orders are in a “finished with” state, and you have a requirement to access the small number of orders in the “new” state. Here’s a sample data set to emulate this type of data requirement (created in 22.214.171.124, 1MB uniform extents, freelist management and 8KB blocks).
create table t1 (
        id              number          not null,
        status          varchar2(10)    not null,
        padding         varchar2(100)
);

insert into t1
select
        rownum,
        -- roughly 1 row in 1,000 in the 'NEW' state (column names
        -- and values are illustrative)
        case when mod(rownum,1000) = 0 then 'NEW' else 'DONE' end,
        rpad('x',100,'x')
from
        all_objects
where
        rownum <= 5000
;

commit;
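Given data of this shape, a function-based index can be built so that it contains entries only for the rare "NEW" rows; since rows where every indexed expression is NULL generate no index entries, the index stays tiny. A minimal sketch (column and index names are assumptions based on the description above):

```sql
-- index only the rare 'NEW' rows: for every other row the CASE
-- expression evaluates to NULL, so no index entry is created
create index t1_new on t1 (
        case status when 'NEW' then status end
);

-- the predicate must match the indexed expression exactly for the
-- optimizer to consider this index
select  *
from    t1
where   case status when 'NEW' then status end = 'NEW';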
Over the last year I’ve written a short collection of articles describing how a defect in the code for handling index leaf (block) node splits can cause a significant waste of space in indexes that are subject to a high degree of concurrent inserts. Finally I can announce some good news – Oracle has a patch for the code.
The problem is described in MOS (the support site formerly known as Metalink) under bug 8767925, which is reported as "fixed in version 12.1".
Backports may become available – I’ve already asked for one for 126.96.36.199 on AIX for one of my clients (but it has been a little slow in arriving) and another client should have put in a bid for a backport to 10.2.0.4 (also on AIX) in the last couple of days.
I’ve recently been writing about the index join mechanism and ways of emulating it. Those notes were originally inspired by an example of an index join that appeared on OTN a little while ago.
It was a plan that combined “bitmap/btree conversion” with the basic index join strategy so, with hindsight, it was an “obvious” and brilliant execution plan for a certain type of query. The query in the original posting was a simple select (with no predicates) against a huge table in a data warehouse – presumably extracting a small number of columns from a much wider row.
In an earlier article introducing the index join I raised a question that came up at the first ES2N virtual conference:
“If you hint an index hash join, is there any way of telling Oracle the order in which it should use the indexes?”
Consider the following example:
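A sketch of the kind of hinted query in question (table, column, and index names are hypothetical) might look like this:

```sql
-- request an index (hash) join of two indexes on t1; whether the
-- order of the index names in the hint influences the order in which
-- the optimizer uses them is exactly the question raised above
select
        /*+ index_join(t1 t1_i1 t1_i2) */
        id, n1
from
        t1
where
        id > 0
and     n1 > 0
;
```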
One of the less well known access paths available to the optimizer is the “index join” also known as the “index hash join” path. It’s an access path that can be used when the optimizer decides that it doesn’t need to visit a table to supply the select list because there are indexes on the table that, between them, hold all the required columns. A simple example might look something like the following:
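A minimal sketch of the setup (names hypothetical): two single-column indexes that, between them, cover the entire select list.

```sql
create table t1 as
select
        rownum                  id,
        mod(rownum,1000)        n1,
        rpad('x',100)           padding
from
        dual
connect by
        level <= 10000
;

create index t1_i1 on t1(id);
create index t1_i2 on t1(n1);

-- both selected columns are indexed, so the optimizer may join the
-- two indexes on rowid (typically a HASH JOIN of two index fast full
-- scans under a view named index$_join$_nnn) rather than visit t1
select  id, n1
from    t1
where   id > 0;
```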
I’ve often been heard to warn people of the accidents that can happen when they forget about the traps that appear when you start allowing columns to be NULL – but sometimes NULLs are good, especially when it helps Oracle understand where the important (e.g. not null) data might be.
An interesting example of this came up on OTN a few months ago where someone was testing the effects of changing a YES/NO column into a YES/NULL column (which is a nice idea because it allows you to create a very small index on the YESes, and avoid creating a histogram to tell the optimizer that the number of YESes is small).
They were a little puzzled, though, about why their tests showed Oracle using an index to find data in the YES/NO case, but not using the index in the YES/NULL case. I supplied a short explanation on the thread, and was planning to post a description on the blog, but someone on the thread supplied a link to AskTom where Tom Kyte had already answered the question, so I’m just going to leave you with a link to his explanation.
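For reference, the mechanism depends on the fact that rows where every indexed column is NULL create no index entries; a minimal sketch (table and column names hypothetical):

```sql
create table t2 as
select
        rownum  id,
        -- 'Y' for roughly 1 row in 1,000; NULL (rather than 'N')
        -- for all the rest
        case when mod(rownum,1000) = 0 then 'Y' end     flag
from
        dual
connect by
        level <= 10000
;

-- rows with a NULL flag generate no index entries, so this index
-- holds only the handful of 'Y' rows
create index t2_flag on t2(flag);

select  *
from    t2
where   flag = 'Y';
```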
Here’s an example of “creative SQL” that I wrote in response to a question on OTN about combining data from two indexes to optimise access to a table. It demonstrates the principle that you can treat an index as a special case of a table – allowing you to make a query go faster by referencing the same table more times.
Unfortunately you shouldn’t use this particular example in a production system because it relies on the data appearing in the right order without having an “order by” clause. This type of thing makes me really keen to have a hint that says something like: /*+ qb_name(my_driver) assume_ordered(@my_driver) */ so that you could tell the optimizer that it can assume that the rowset from a given query block will appear in the order of the final “order by” clause.
There’s a thread on OTN that talks about a particular deletion job taking increasing amounts of time each time it is run.
It looks like an example where some thought needs to go into index maintenance and I’ve contributed a few comments to the thread – so this is a lazy link so that I don’t have to repeat myself on the blog.
A couple of years ago I wrote about a poster on the OTN db forum who was puzzled by the fact that when he started rebuilding tables they got bigger. (Bad luck, sometimes that’s what happens!)
A few days ago a related question appeared: I rebuilt some indexes and my query got slower. (Bad luck, sometimes that’s what happens – again!)
If you rebuild an index it’s physically different and its statistics are different. Plans can change and go slower because the index stats look sufficiently different; plans can stay the same and go slower because the index is physically different.
I’ve added a couple of comments to the thread – there may still be some further mileage in it.
[Further reading on rebuilding indexes]
I’ve just been reminded of a thread on OTN which turned into a Q&A about index space usage and various related topics. On re-reading it, I decided it was too good to waste in the backwaters of OTN, so here’s a link to it.
Index Space Utilization.
Here’s an odd little bug (I think) in the optimizer that showed itself in a thread on the OTN database forum some time ago. The optimizer can choose an index which is NOT the cheapest index for satisfying a query against a single table. Here’s the demo – which I’ve run against 188.8.131.52 using an 8KB block size, ASSM and system allocated extent sizes:
Here’s an example of how you have to think about conflicts of interest when dealing with problems of scalability. It starts with a request (that I won’t give in detail, and was a little different from the shape I describe below) from a client for advice on how to make a query go faster.
Basic problem: the query runs about 20 times per second, returning a very small number of rows; it’s basically a very simple “union all” of three query blocks that access the same table in slightly different ways.
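Simplified, and with hypothetical column and bind names (the real request, as noted, was a little different), the shape was roughly:

```sql
-- three query blocks hitting the same table in slightly different
-- ways, concatenated with "union all"
select  id, status from orders where ref1 = :b1
union all
select  id, status from orders where ref2 = :b2
union all
select  id, status from orders where ref3 = :b3
;
```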
This note is part four of a four-part series, and covers index fragmentation. The whole series is as follows:
- Introduction – with links to parts 2 – 4
- Disk and Tablespace Fragmentation
- Table Fragmentation
- Index Fragmentation – this bit
4. Index “fragmentation”.
The multiple extent and ASSM “fragmentation” that I described in the previous article about table fragmentation applies equally well to indexes, of course, and matters in just the same way – i.e. hardly ever.
Some time ago I published a little script that showed you how to read an index treedump and summarise the leaf block usage in the logical order of the index leaf blocks – allowing you to see fairly easily if the index had any areas where the blocks were poorly filled.
Here’s another way of looking at the treedump – this time focusing on the branch blocks only. The posting is a little long, but mostly because I’ve reproduced the output from a reasonably large index so that you can see a full example from a production system.
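For anyone who wants to reproduce this, a treedump is generated by setting an event against the index’s object_id (index name hypothetical; the object_id below is a placeholder):

```sql
-- find the object_id of the index ...
select  object_id
from    user_objects
where   object_name = 'T1_I1';

-- ... then dump the index's logical structure to the session trace
-- file, substituting the object_id reported above for 12345
alter session set events 'immediate trace name treedump level 12345';
```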
The following question came up in an email conversation a little while ago:
Are you aware of any problems a large OLTP site might have with running index coalesce during production hours, as opposed to doing index rebuilds in a maintenance window?
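For context, the two options being compared look like this (index name hypothetical):

```sql
-- coalesce: tidies adjacent, sparsely filled leaf blocks in place;
-- a relatively gentle operation that can be run online
alter index t1_i1 coalesce;

-- rebuild: recreates the index from scratch; even with the ONLINE
-- option this is typically reserved for a maintenance window on a
-- busy OLTP system
alter index t1_i1 rebuild online;
```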