I’ve commented in the past about the strange stories you can find on the internet about how Oracle works and how sometimes, no matter how daft those stories seem, there might be something behind them. Here’s one such remark I came across a little while ago – published in two or three places this year:
“An index that enforces referential integrity cannot be rebuilt online.”
There are a couple of problems with this statement – first, of course, indexes don’t enforce referential integrity, though they may help to enforce uniqueness, and the so-called “foreign key” index may avoid a locking issue related to referential integrity: that’s splitting hairs a little bit, though, and we can probably guess what the author means by “indexes enforcing referential integrity”. (An example demonstrating the problem would have been useful, though – it would have saved me from writing this note, and it might save other people from jumping to the wrong conclusion and taking unsuitable action as a consequence.)
So here’s a simple test (run under 126.96.36.199):
For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.
It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while reading the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment on page 429 to the effect that:
In NOARCHIVELOG mode all bulk operations (such as INSERT, APPEND, index REBUILD and ALTER TABLE MOVE) are automatically nologging.
Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.
Another problem with high insert/delete rates appears with very small indexes.
If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance.
The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.
Footnote: the reason why a change from 1 to 2 is dramatic is because Oracle ignores the blevel in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. Then the cost of a nested loop access is “cost of single access multiplied by number of times you do it” – so the sudden appearance of a 2 in the formula gives an increment in cost of “2 * number of times you visit the table” if your small table is the second table in a nested loop join – and suddenly a nested loop becomes much more expensive without a real change in the data size.
Footnote 2: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a blevel that is usually 1 it probably won’t take more than a fraction of a second to rebuild the index and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the blevel is likely to be 2.
Footnote 3: For an example of the type of code that will adjust the blevel of an index see this URL. (Note, the example talks about copying stats from one place to another – but the principle is the same.)
Yesterday I introduced a little framework I use to avoid the traps inherent in writing PL/SQL loops when modelling a session that does lots of simple calls to the database. I decided to publish the framework because I had recently come across an example where a series of SQL statements gives a very different result from a single PL/SQL block.
I wrote a short note last week that linked to a thread on the Russian Oracle forum about indexing, and if you’ve followed the thread you will have seen a demonstration that seemed to be proving the point that there were cases where an index rebuild would be beneficial.
Of course it’s not difficult to come up with cases where index rebuilds should make a difference – but it’s harder to come up with demonstrations that look realistic, so I thought I’d review the example to explain why it doesn’t really work as a good example of why you might need to think about rebuilding some production index.
A couple of days ago I found several referrals coming in from a question about indexing on the Russian Oracle Forum. Reading the thread I found a pointer to a comment I’d written for the Oracle-L list server a couple of years ago about Advanced Queueing and why you might find that it was necessary to rebuild the IOTs (index organized tables) that support AQ.
The queue tables are, of course, a perfect example of what I call the “FIFO” index so it’s not a surprise that they might need special consideration. Rather than rewrite the whole note I’ll just link to it from here. (One of the notes in the rest of the Oracle-L thread also points to MOS document 271855.1 which describes the whys and hows of rebuilding AQ tables.)
There are many suggestions floating around the internet about identifying which Oracle indexes to rebuild. One of these involves running the validate (or analyze index validate) command on an index and checking the resulting figures in view index_stats to determine whether or not del_lf_rows exceeds some percentage (usually 20%) of lf_rows and rebuilding the index if so. Since this suggestion came up, yet again, in a recent OTN thread about Oracle index rebuilding I thought I’d write up a quick note highlighting one of the obvious flaws in the argument.
I’m not going to bother pointing out the threat inherent in the table-locking when you use the “validate index” command (or the “analyze index validate” equivalent) but I think it’s worth making some comments about the misunderstanding built into this policy. So let’s start by building some data, creating an index on it, then deleting 90% of the data:
A couple of years ago I wrote about a poster on the OTN db forum who was puzzled by the fact that when he started rebuilding tables they got bigger. (Bad luck, sometimes that’s what happens !)
A few days ago a related question appeared: I rebuilt some indexes and my query got slower. (Bad luck, sometimes that’s what happens – again!)
If you rebuild an index it’s physically different and its statistics are different. Plans can change and go slower because the index stats look sufficiently different; plans can stay the same and go slower because the index is physically different.
I’ve added a couple of comments to the thread – there may still be some further mileage in it.
[Further reading on rebuilding indexes]
This note is part four of a four-part series, and covers Index fragmentation. The whole series is as follows
- Introduction – with links to parts 2 – 4
- Disk and Tablespace Fragmentation
- Table Fragmentation
- Index Fragmentation – this bit
4. Index “fragmentation”.
The multiple extent and ASSM “fragmentation” that I described in the previous article about table fragmentation applies equally well to indexes, of course, and matters in just the same way – i.e. hardly ever.
The following question came up in an email conversation a little while ago:
Are you aware of any problems a large oltp site might have with running index coalesce during production hours, as opposed to doing index rebuilds in a maintenance window?
A question came up on OTN a little while ago about an oddity during an online index rebuild. The original question describes how a relatively small index (500MB) takes several minutes to rebuild, with a few minutes where users who are trying to update the table seem to be stuck on enqueue wait events, waiting for locks on the table to be released. But Oracle tells us that “online” rebuilds of indexes don’t cause long-lasting locks to appear – is this a bug, or a new myth ?
My response to the question is speculative but, I hope, a reasonable suggestion that fits the original description. The amount of time that the locks are held at either end of the rebuild are dependent on the level (and timing) of activity on the table.
In Oracle 10g, I have managed to cause long waits for other users – even when the process doing the index rebuild took fractions of a second to get its final lock – by generating a very large volume of change as the initial stage of the index rebuild was taking place.
Note: this was not the same problem as the process doing the rebuild having to wait for the final lock because another user had uncommitted transactions on the table.
While searching on Metalink for clues about an oddity relating to an index rebuild, I came across Bug 6767655 – reported in 10.2.0.3 and fixed in 11.2. The problem is described as follows:
When having two sessions running concurrently, one doing inserts to a partitioned table, and the other doing partition maintenance operations, there is a chance of the DML session core dumping in kauxs_do_journal(), i.e.:
ORA-7445: exception encountered: core dump [kauxs_do_journal()+21] [SIGSEGV] [Address not mapped to object] [0x2]  
The problem seems to be a timing issue – or at least there is a timing component involved. When doing the partition split, an ORA-54 is normally raised, however by running the split in a loop, it will eventually succeed (and thus causing problems for the session running the DML).
Here’s a little oddity that may be waiting to catch you out – but only if you like to create indexes with very long keys.
create table t1(
create index t1_i1 on t1(v1, v2);
alter index t1_i1 rebuild;
alter index t1_i1 rebuild online;
My key value is at the limit for an 8KB block size in Oracle 9i and later – which is roughly 80% of (block size - 190 bytes). In earlier versions of Oracle the limit was roughly half that (i.e. 40% rather than 80%).
Someone posted a very pertinent question about blocks splits and index rebuilds on the OTN Database forum yesterday covering a detail of the cost/benefit equation that I don’t think I’ve mentioned before.
Since I answered the question on the forum I’ve posted a link here to make it available to a wider audience.
(This is a strategy I may adopt more frequently in the future – there’s a lot of useful material of mine all over the internet, and I really ought to make sure I don’t spend time repeating myself when simple pointer would do).
[Updated May 2009: The forum item has since gone missing - so maybe linking to notes I've written in public groups is not so smart after all]. Fortunately I happen to have been saving copies of the thread as it progressed, and the little point that I wanted to highlight was just the following:
“When you create or rebuild an index Oracle does not honour the pctfree setting in the branch blocks. Since the branch blocks are effectively 100% packed, the very first leaf block split in each branch block (except the last one) will almost certainly result in a branch block split. This means a single row insert into an 8KB block could result in 16KB+ of redo for the leaf block plus 16KB+ for the branch block split.”
[Further reading on rebuilding indexes]
From time to time I see people asking about Oracle error ORA-01410: invalid rowid. It’s a problem I’ve responded to a few times in newsgroups and presentations, but I realised recently that despite mentioning it casually for several years I’ve never written a specific note about it – so here it is.