After describing how to deal most effectively – but only after approval from Oracle Support – with the problem of indexes wasting space on unnecessary ITL entries, I left you with a short list of “supportable” options for addressing the problem. In this note I’m going to outline a few pros and cons of each of those options. The list was as follows:
- Use a coalesce when necessary – possibly on a very regular basis
- Create the index as a reverse-key index
- Create the index as global hash partitioned index
- Get rid of (or change) the index
The command “alter index XXX coalesce” will make Oracle walk through the index leaf blocks in index order attempting to pack the data from two or more adjacent leaf blocks into a smaller number of leaf blocks using the current settings of initrans and pctfree to work out the space to use in each block. It won’t coalesce leaf blocks under different branch blocks, though, so can’t reduce the height of an index. The blocks that are emptied out by the coalesce are removed from the index structure and put on the freelist.
The coalesce command operates as a series of small isolated transactions (so could, in theory, increase the risk of Oracle error ORA-01555: snapshot too old – although in practise this is a little unlikely). The coalesce command also skips blocks holding uncommitted transactions.
On the plus side the coalesce command may not have to do much work, and it is an online command. On the minus side it does have to read the entire index and can generate quite a lot of undo and redo if the concurrency issue has affected blocks scattered all the way through the index (in this case, though, you may decide not to do anything about the waste space – or perhaps only fix it once every few weeks or months).
An alternative to the coalesce command is to rebuild the index – and you may want to do this anyway the first time you realise that you have an index suffering from this concurrency problem. But the rebuild command will lock the table for the duration of the rebuild unless you use the online option – in which case the rebuild has to do a tablescan and sort (as well as maintain, then apply, a journal of changes, and lock the table briefly at the start and end of processing). There are ways to minimise the resources used in a rebuild – but in general you probably don’t want to schedule a regular rebuild for this problem.
Update 27th April 2010: Timur Akhmadeev has written a nice article discussing the limitations of using the coalesce to address the problem of an exploding ITL.
In a reverse key index each column in the index has its bytes reversed, so key values that were originally adjacent (tend to) become randomly scattered throughout the index. As a side effect, you are likely to spread the pattern of inserts more randomly through the index, which means each block probably sees less concurrent action, and the ITL issue “accidentally” disappears as a side effect.
There are several undesirable side effects to reverse key indexes, though.
You’ve effectively randomised the key insertion point, so an index which used to show a nice time-dependent clustering pattern suddenly loses that pattern and may appear undesirable to the optimizer unless you use calls to the dbms_stats package to adjust the index statistics by supplying a nice clustering_factor.
If you were previously inserting, and subsequently using, data at the “top” (high-values) end of the index you could have been getting a very good buffering effect on that part of the index. By recreating the index with reversed keys you are now inserting keys randomly throughout the index, so you may have to allow more space in your db cache to keep the recently used index blocks buffered. Reversing an index could cause an increase in disc I/O, even if the data usage and execution plans don’t change.
Finally, the optimizer cannot apply range-based predicates to reverse key indexes. (Note – this is not the same as using index range scans on the index: if you have a multi-column index that has been reversed the optimizer can still do a range scan for an equality predicate such as “COL1 = constant” on the leading columns). As a result of this, some of your execution plans might change dramatically.
Global Hash Partitioned
This often looks like an easy winner. Create the index as a globally hash partitioned index, partitioned on the first column of the index. The number of partitions should take account of the degree of concurrency you have to deal with, and could easily be somewhere between 16 and 128 in extreme cases. (The number should always be a power of two because of the strategy that Oracle uses for hash partitioning).
This works because you introduce far more insertion points in the index for data items that looks similar. Where a single block of the index was being hit by (say) 20 concurrent inserts, those inserts are likely to be spread over many different blocks scattered across the N partitions.
There are downsides, of course. For a start the partitioning option is only available for Enterprise Edition, and it’s a licensed extra option, and it’s not cheap. On a technical note you’ve also introduced a problem similar to one of the problems with reverse key indexes. The hash-partitioned index is fine for predicates like “COLX = constant” – where the optimizer can pick a single hash partition and the run-time task can be efficient; but if you start using range-based predicates on the hash column then your query will have to work through every single index partition as it runs the query, and if you’ve created the index with a large number of partitions the overhead of the extra index probes may be significant if the query is supposed to be a light-weight (small number of rows) query.
Get rid of the index
This may not be the first thing that crosses your mind when you see the problem – but it is a possibility to be considered. The indexes that are most likely to run into the ITL explosion are “meaningless key” indexes (which you may need to keep and reverse) and the “timestamp” index, and I have seen cases where the latter type of index is simply a waste of space. When examining problem indexes always remember Cary Milsapp’s comment: “the fastest way to do something is to not do it at all”. Sometimes the best way to fix a troublesome index is to drop it.
The other question I left unanswered in the previous note in this series was: “Why haven’t I noticed the space-wasting phenomenon more often despite seeing indexes where it could have been happening.” There are several answers to this question, for example:
- I’ve been called in to address more significant problems, and the side effects of an index that was two or three times larger than it should have been simply didn’t show up in comparison to the other performance issue.
- The DBAs may have been doing regular (and usually pointless) index rebuilds so the issue is never visible when I’m on site
- (This is one I really like). I’ve seen the problem but attributed the symptoms to something that is often the underlying cause of the ITL issue – the time-based index with a slowly deleted tail (also known as the FIFO index or, in Tom Kyte’s terminology, the “sweeper”). The solution for one problem also fixes the other – so there have probably been times when I’ve killed two birds with one stone without realising it. If you want an example of this, take another look at this article on index analysis and ask yourself how much of the space wastage the article describes was due to deleting from the tail of the index, and how much of it was due to the ITL concurrency problem as users were inserting the data. I knew that the index was going to be a disaster area before I looked at it, so I didn’t look at it closely enough or I would probably have seen the ITL problem as well as the FIFO problem.
Indexes are fundamentally different from (heap) tables, and there are several interesting things that can happen to them because of the difference. Sometimes you need to look at them very closely before you understand how to make best use of them and how to avoid the performance problems that they can introduce.
Footnote: Just in case you haven’t come to this article by way of the earlier articles in the series I should mention for completeness that in earlier versions of Oracle you could avoid the ITL problem by setting maxtrans. The need for workarounds to the problem only become necessary in 10g where Oracle started to ignore maxtrans.