Some time ago I wrote a note on index efficiency that referenced, in one of the comments, an article on my old website about how you could use existing statistics to estimate the “correct” size of an index.
A little while ago, while applying the concept on a client site and implementing a mechanism that would allow the DBAs to run the same check as regularly as needed, I discovered a couple of flaws in the original code – so I had to spend a little time correcting it, and I’ve finally published the latest version here.
The basic concept is unchanged:
The volume of user data stored in a B-tree index – the space needed for the basic column values – can be found from the column statistics in views user_tab_columns and user_tables (or their ‘dba_’ or ‘all_’ equivalents).
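As a minimal sketch of that first step – assuming a hypothetical index called T1_I1, and glossing over details that the full script has to handle – you could sum avg_col_len over the index’s columns; note that avg_col_len already includes the one-byte column length that also precedes each column value in an index entry:

rem
rem     Sketch only: volume of column data an index has to hold,
rem     derived from dictionary statistics. The index name T1_I1
rem     is a placeholder for this illustration.
rem

select
        sum((t.num_rows - nvl(c.num_nulls,0)) * c.avg_col_len)  data_bytes
from
        user_ind_columns        ic,
        user_tab_columns        c,
        user_tables             t
where
        ic.index_name   = 'T1_I1'
and     c.table_name    = ic.table_name
and     c.column_name   = ic.column_name
and     t.table_name    = ic.table_name
;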
The number of entries in the index is given by the num_rows column of the view user_indexes (or equivalent) – and for each index entry we need to allow for some well-known “overheads”, which the sketch following this list pulls together:
Rowid – 6 bytes for a unique index, 7 for a non-unique index (the extra byte is the length byte that appears because the rowid is then stored as if it were an extra column), 10 (or 11) for a global index on a partitioned table.
Lock byte – one byte.
Flag byte – one byte.
Row directory entry – two bytes.
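Putting the column data and the per-entry overheads together, a minimal sketch of the whole estimate might look like the following. The index name T1_I1, the 8KB block size, the figure of roughly 190 bytes of fixed block overhead, and the 90% usable-space allowance are all illustrative assumptions (and the sketch ignores the partitioned-table case); the published script works these details out properly:

rem
rem     Sketch only: estimated leaf block count vs. the actual
rem     leaf_blocks statistic. All the constants below (8192-byte
rem     blocks, ~192 bytes block overhead, 90% usable space) are
rem     assumptions for the illustration.
rem

select
        i.index_name,
        round(
                (   sum((t.num_rows - nvl(c.num_nulls,0)) * c.avg_col_len)
                  + i.num_rows * (case i.uniqueness
                                        when 'UNIQUE' then 6    -- rowid held in the entry header
                                        else               7    -- rowid stored as an extra column
                                  end
                                  + 1 + 1 + 2                   -- lock byte, flag byte, row directory
                                 )
                ) / (0.9 * (8192 - 192))                        -- assumed usable space per leaf block
        )       est_leaf_blocks,
        i.leaf_blocks
from
        user_indexes            i,
        user_ind_columns        ic,
        user_tab_columns        c,
        user_tables             t
where
        i.index_name    = 'T1_I1'
and     ic.index_name   = i.index_name
and     c.table_name    = ic.table_name
and     c.column_name   = ic.column_name
and     t.table_name    = i.table_name
group by
        i.index_name, i.num_rows, i.uniqueness, i.leaf_blocks
;

Comparing est_leaf_blocks with the actual leaf_blocks figure is the quickest way to spot an index that is using far more space than its content needs.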
There are a few approximations and rounding errors built into these assumptions, of course, but if the index statistics are reasonably up to date we can get a fairly accurate estimate of the expected size for most indexes. There are several other notes included with the script – do read them before you try using it so that you have an idea of how the results are derived and what they mean.
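Since the estimate is only as good as the statistics behind it, it’s worth a quick look at last_analyzed before trusting the numbers – again, just a sketch:

rem
rem     Sketch only: check how fresh the statistics are before
rem     relying on any size estimate derived from them.
rem

select
        i.index_name,
        i.last_analyzed         index_analyzed,
        t.last_analyzed         table_analyzed
from
        user_indexes    i,
        user_tables     t
where
        t.table_name = i.table_name
order by
        i.last_analyzed
;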
One important point: if you do find an index that seems to be much bigger than it should be, don’t rebuild it or coalesce it until you’ve given yourself a little time to think about why it got into that state, and what else you might infer from the information available. (Some scripts to help with this decision are in the Script Catalogue.)