This is a note that I wrote for the Northern California Oracle User Group a few months ago. It was published in the February issue of the magazine in the section “Ask the Oracles”, an interesting and innovative section that I first contributed to in August 2006 with a note on hints.
The topic under discussion in the February 2008 issue was actually”Best and Worst Practices”. These were my thoughts – limited to 600 words:
It’s quite hard to say anything sensible or inspiring about a topic like this. Many of the specifics we call best practices are extremely obvious “common sense” but sometimes circumstances conspire to make it impossible to follow them. Many of the worst practices are things that we should obviously avoid, but again circumstances (often pressures of time) may make them unavoidable.
The worst generic practice I know of, then, is applying a fix to a problem without understanding why that fix might work.
I have a simple approach to problem-solving; it involves three steps:
- What is the problem?
- Why will my solution fix the problem?
- Where am I going to pay for implementing this solution?
The last question will sometimes tell us that we can’t afford to implement the “perfect” solution – perhaps we want to change a heap table to an index-organized table (IOT), but can’t because we have too much code that has done something a little exotic with traditional rowids and needs to be rewritten to deal with the “urowid” used for IOTs.
But before we worry about such side effects, overheads and implementation costs, we need to be confident that our solution really is addressing the root cause; because if it isn’t we may spend time and effort implementing a change which seems to fix our problem temporarily – until things go wrong again.
Take a simple example. A query takes an unreasonable amount of time. You check the execution plan and decide the problem might go away if Oracle used a particular index. So you rebuild the index and the query runs much more quickly. Is your job finished?
No doubt the first thing you do is check the execution plan to see that it changed to use the index the way you expected. But does that prove that the performance improvement came from the change in plan – is it possible that your index rebuild used a table-scan that resulted in the target table being cached somewhere (in the SAN cache, for example) so that accesses to the table during your test were much quicker than they would normally be?
Even if the improvement was due to the change in execution plan, do you know why the plan changed? Was it because the rebuild packed the index giving you a smaller leaf block count (or height even), or was it because index rebuilds automatically compute statistics in your version of Oracle and a small change in the statistics (distinct_keys or leaf_blocks being the most likely) made the difference. Will the new plan survive the next statistics collection – even if the index doesn’t start growing immediately after the rebuild?
If you don’t check, you may end up performing a regular, yet redundant and potentially dangerous, rebuild of this index; and when, exactly, does it need to be rebuilt anyway?
What if the change was due to a change in the distinct_keys, there may be “random” occasions when the rebuild strategy just doesn’t work; or a day may come when the rebuild will no longer work because even the freshly rebuilt index has grown past a critical number of distinct_keys or leaf_blocks.
Whenever you make a change that’s supposed to fix a problem, try to capture the before and after information (e.g. statistics, execution plans, work done, number and type of waits). If there’s any doubt in your mind about the root cause, never be afraid to document what you’ve done, and the reasons why you did it – it may save you a lot of time in the future when your “fix” turned out to have been just a temporary lucky coincidence.