A recent question on the OTN database forum asked:
I have a table with a,b,c,d,e,f,g,h,i,j,k. columns and I have an index on (a,b) columns. There is a sql statement now with “where a= ?” and we are wondering if it could also be good to add a single index on just (a).
Does it help at all? Does it help in some cases?
This is one of those questions where the answer for a perfectly designed and managed system could easily contradict the pragmatic answer for a live system in its current state. That may mean you have to do the wrong thing in the short term while working (possibly very slowly) towards the right thing. I gave the following (slightly edited) answer on the forum:
The basic answer is that you do NOT need the single column index if you have the two-column index.
The complex answer is that you may have to spend some time and effort ensuring that the two-column index is used in all cases where it would have been appropriate to use the single column index. This may simple mean ensuring the clustering_factor of the index is adjusted suitably so that the optimizer “likes” the index enough it may mean you (also) have to modify some code to include the cluster_by_rowid hint (when you’re at 12c) so that you don’t suffer a performance impact at run-time.
Key factors to consider: the two-column index will be physically larger than the single column index – this will increase the (optimizer’s estimated) cost of using it; the clustering_factor of the two-column index will almost certainly be larger than the clustering_factor of the single column index – this will also increase the (optimizer’s estimated) cost of using it.
These two points are echoed at run-time: the two column index will be bigger so you will have to do more work (though probably not very much more work) to read the relevant rowids and, if you walk the two-column index in order for a given value of the first column, you will visit the table blocks in a different order compared to the order of visits from the single column index – this may result in the query actually doing noticeably more work at run-time.
The change in the index leaf_block count is often insignificant (especially if, as per your example, the number of rows required – hence blocks visited in the table – is large); the impact of the clustering_factor can make a dramatic difference to the cost calculations; but you can often work around this. In 126.96.36.199, particularly, you can use the dbms_stats.set_table_prefs() call to set the ‘table_cached_blocks’ parameter for a table so that all its indexes look more desirable to the optimizer.
Bottom line: you don’t need the single column index but if you currently have it and want to drop it the human effort required to ensure that it can be dropped without side effects may make you decide to keep it anyway, especially if it doesn’t seem to be causing any concurrency or other performance overheads. If you don’t have it yet, then you shouldn’t need to create it – though you might have to do some work to make sure that the optimizer takes full advantage of the two-column index.
Since I’m on the topic, I’ll add that the same arguments apply to a pair of indexes like (a, b, c) and (a, b); if you’ve got the longer index you shouldn’t need the shorter one; however, because the shorter index is a multi-column index, you might find that it’s beneficial to create a column group on that column combination so that the optimizer doesn’t lose information about the number of distinct values for the combination when you drop the index.