Here’s a deadlock graph the appeared on Oracle-L and OTN a couple of days ago.
Resource Name process session holds waits process session holds waits
TX-001a0002-0002a0fe 196 197 X 166 1835 S
TM-0000c800-00000000 166 1835 SX 196 197 SX SSX
It’s a little unusual because instead of the common TX mode 6 (eXclusive) crossover we have one TX and one TM lock, the TX wait is for mode 4 (S) and the TM wait is for a conversion from 3 (SX) to 5 (SSX).
A few days ago I got an email about a problem with a system that would use a bitmap index to satisfy a query but wouldn’t use the equivalent B-tree index – and the DBA wanted to make the switch because he wanted to downgrade to Standard Edition from Enterprise Edition.
In outline there was a table with a mapping column defined as varchar2(N) holding a string of 1′s and 0′s representing a bit mask. Typically each map value had about 800 rows associated with it and the users’ queries were all about matching bits using the utl_raw.bitand() and utl_raw.bit_or() functions.
My response was that the only surprise was that Oracle was using the bitmap index, not that it wasn’t using the B-tree index as it seemed that the only way the index could be used was with an index fast full scan. I was curious, so I said I’d take a look at the query, the object definitions the plan, and the 10053 trace file if the DBA cared to send them to me.
It turned out that I was correct – the index fast full scan was the plan used by the bitmap index because the queries were of the form:
from tableX t0
utl_raw.bit_and(t0.mapping, '0000.....001') = '0000.....001'
But, as the DBA pointed out, Oracle didn’t even consider this plan when he changed the bitmap to a B-tree. Why not ? For the same old reason that Oracle often surprises people by ignoring indexes – the column was not declared as NOT NULL, which means there could be rows in the table that are not in the B-tree index, so the index was not a valid target for comparison. (In this case the human eye can see that this is irrelevant, but the optimizer is blindly following a heurisitc – or rule – at this point.)
Key point: Oracle’s standard B-tree indexes do not hold index entries that are completely null. Bitmap indexes (and cluster indexes) do have entries for the nulls.
Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.
Another problem with high insert/delete rates appears with very small indexes.
If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance.
The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.
Footnote: the reason why a change from 1 to 2 is dramatic is because Oracle ignores the blevel in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. Then the cost of a nested loop access is “cost of single access multiplied by number of times you do it” – so the sudden appearance of a 2 in the formula gives an increment in cost of “2 * number of times you visit the table” if your small table is the second table in a nested loop join – and suddenly a nested loop becomes much more expensive without a real change in the data size.
Footnote 2: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a blevel that is usually 1 it probably won’t take more than a fraction of a second to rebuild the index and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the blevel is likely to be 2.
Footnote 3: For an example of the type of code that will adjust the blevel of an index see this URL. (Note, the example talks about copying stats from one place to another – but the principle is the same.)
A question about partitioning came up on OTN a few days ago – but the really interesting part of the thread was about the choice of indexes, and how the choice of partitioning, combined with the data patterns, could make a big difference to the choice of indexes. I think the comments I made about indexes are worth seeing, so I’ve linked to the thread.
The following question appeared in a comment to an earlier posting on multi-column bitmap indexes and the inability of Oracle to create a bitmap index join when (to the human eye) the strategy was an obvious choice.
I have a query which is using 2 indexes both are bitmap indexes (sizes are 37 and 24 Mbs) and table size is 17gb. While i ran the following query which can very well get the index itself, it takes around 6-8 minutes and using pga around 3 gb.
could you please explain me why ?
I’ve said in the past that one of the best new features, in my view, in 11g was the appearance of proper virtual columns; and I’ve also been very keen on the new “approximate NDV” that makes it viable to collect stats with the “auto_sample_size”.
Who’d have guessed that if you put them both together, then ran a parallel stats collection it would break :(
The bug number Karen quotes (10013177.8) doesn’t (appear to) mention extended stats – but since virtual columns, function-based indexes, and extended stats share a number of implementation details I’d guess that they might be affected as well.
It’s about time I wrote a sequel to Mything in Action – and funnily enough it’s also about bitmap indexes. It starts with a note on the OTN database forum that prompted me to run up a quick test to examine something that turned out to be a limitation in the optimizer. The problem was that the optimizer didn’t do a “bitmap and” between two indexes when it was obviously a reasonable – possibly even good – idea. Here’s some sample code:
Here’s an interesting question from the OTN database forum:
“If I delete 90% of the rows from a table which has a few indexes, without rebuildling or coalescing indexes afterwards, will this improve the performance of index range scans ?”
The thing that makes it interesting is the scope it gives you for imagining reasons why the performance won’t change, or might get better, or could get worse. So how about it – can you think of an argument for each of the three possibilities ?
Being a very reserved British type of character I’m not really one to make a big fuss about advertising myself, which is why it’s taken me five years to realise that I ought to make it easy for people to find the free download of Chapter 5 (Clustering Factor) of Cost Based Oracle Fundamentals.
Apress changes the relevant URL from time to time, and I’ve just discovered that they’ve now bundled the pdf file of the chapter into this zip file.
The thing that prompted me to post this special note was that some time ago Mohamed Houri translated the chapter into French as a gesture of appreciation for the fact that I had written the book and Apress has given me permission to post the translation, which is this pdf file.
Yesterday I introduced a little framework I use to avoid the traps inherent in writing PL/SQL loops when modelling a session that does lots of simple calls to the database. I decided to publish the framework because I had recently come across an example where a series of SQL statements gives a very different result from a single PL/SQL block.
I wrote a short note last week that linked to a thread on the Russian Oracle forum about indexing, and if you’ve followed the thread you will have seen a demonstration that seemed to be proving the point that there were cases where an index rebuild would be beneficial.
Of course it’s not difficult to come up with cases where index rebuilds should make a difference – but it’s harder to come up with demonstrations that look realistic, so I thought I’d review the example to explain why it doesn’t really work as a good example of why you might need to think about rebuilding some production index.
A couple of days ago I found several referrals coming in from a question about indexing on the Russian Oracle Forum. Reading the thread I found a pointer to a comment I’d written for the Oracle-L list server a couple of years ago about Advanced Queueing and why you might find that it was necessary to rebuild the IOTs (index organized tables) that support AQ.
The queue tables are, of course, a perfect example of what I call the “FIFO” index so it’s not a surprise that they might need special consideration. Rather than rewrite the whole note I’ll just link to it from here. (One of the notes in the rest of the Oracle-L thread also points to MOS document 271855.1 which describes the whys and hows of rebuilding AQ tables.)
From time to time the question about whether local indexes on partitioned tables should be prefixed or non-prefixed appears on the Oracle forums and mailing lists.
It’s possible that I’m wrong – although no-one has come up with a counter-example to the statement I keep repeating – but the whole prefixed/non-prefixed thing for local indexes dates back to a limitation in the optimizer somewhere in the 8.0 time line where Oracle couldn’t do partition elimination on indexes properly but the overhead of the error it made could be dramatically reduced (in most cases) by sticking the partition key at the start of the index.
The guideline for local indexes are the same as they would be for a non-partitioned index on a non-partitioned heap table – the partitioning column(s) are just columns in the table that might be worth including in the index, and the order of the index columns is dictated by the important queries that have to access the table.
For further comments, there’s a note I wrote (which I’ve just been reminded of) on the OTN database forum that adds a little detail to this argument.
There are many suggestions floating around the internet about identifying which Oracle indexes to rebuild. One of these involves running the validate (or analyze index validate) command on an index and checking the resulting figures in view index_stats to determine whether or not del_lf_rows exceeds some percentage (usually 20%) of lf_rows and rebuilding the index if so. Since this suggestion came up, yet again, in a recent OTN thread about Oracle index rebuilding I thought I’d write up a quick note highlighting one of the obvious flaws in the argument.
I’m not going to bother pointing out the threat inherent in the table-locking when you use the “validate index” command (or the “analyze index validate” equivalent) but I think it’s worth making some comments about the misunderstanding built into this policy. So let’s start by building some data, creating an index on it, then deleting 90% of the data:
In a recent note I wrote about index joins I made a passing comment about limitations in the optimizer’s available strategies that might make you choose to write your code to emulate an index join through explicit SQL references.
Here are two SQL similar SQL statements (with execution plans) that demonstrate the initial problem – the first is just a restatement of the basic example I supplied in the first article: