Here’s a model of a little problem I came across recently. It’s something I wrote about many years ago, and I thought I’d seen a note on Metalink (probably Note 73167.1 – see comments 2 and 4 below) explaining that the issue had been addressed; but the problem is still there, even in 184.108.40.206.
We start with a little data set (and it’s my standard setup of 8KB blocks, LMTs, 1MB uniform extents, and no ASSM):
You might think from the title that this little note is going to be about the index hash join – you would be half right, it’s also about how the optimizer seems to make a complete hash of dealing with index hash joins.
Let’s set up a simple data set and a couple of indexes so that we can take a closer look:
In my previous post, I made the comment:
In general, if you have a three-column index that starts with the same columns in the same order as the two-column index then the three-column index will be bigger and have a higher clustering_factor.
So what scenarios can you come up with that fall outside the general case ?
Alternatively, what argument could you put forward that justifies the general claim ?
I’ll try to respond to comments on this post a little more quickly than the last one, but I still have quite a lot of other comments to catch up on.
Browsing a little history recently I came across a note I’d written about the new-style index hint. In that note I claimed that:
… the index has to start with the columns (product_group, id) in that order – with preference given to an exact match, otherwise using the lowest cost index that starts the right way.
On reading this statement I suddenly realised that I hadn’t actually proved (to myself, even) that if I had the indexes (product_group, id) and (product_group, id, other_col) then a two-column hint forced Oracle to use the two column index in all (legal) circumstances.
So, tonight’s quiz – are there any edge cases, and what easy ways can you think of to prove (or disprove) the claim for the general case.
Footnote: you don’t have to demonstrate the method, just a brief outline of the idea will be sufficient.
Here’s a follow-up to a post I did some time ago about estimating the size of an index before you create it. The note describes dbms_stats.create_index_cost() procedure, and how it depends on the results of a call to explain plan. A recent question on the OTN database forum highlighted a bug in explain plan, however, which I can demonstrate very easily. I’ll start with a small amount of data to demonstrate the basic content that is used to calculate the index cost.
Here’s a funny little optimizer bug – though one that seems to have been fixed by at least 10.2.0.3. It showed up earlier on today in a thread on the OTN database forum. We’ll start (in 220.127.116.11) with a little table and two indexes – one normal, the other descending.
This one’s so odd I nearly posted it as a “Quiz Night” – but decided that it would be friendlier simply to demonstrate it. Here’s a simple script to create a couple of identical tables. It’s using my standard environment but, apart from fiddling with optimizer settings, I doubt if there’s any reason why you need to worry too much about getting the environment exactly right.
Here are a few thoughts on dbms_stats – in particular the procedure gather_index_stats.
The procedure counts the number of used leaf blocks and the number of distinct keys using a count distinct operation, which means you get an expensive aggregation operation when you gather stats on a large index. It would be nice efficiency feature if Oracle changed the code to use the new Approximate NDV mechanism for these counts.
Inspired by Martin Widlake’s series on IOTs, I thought I’d throw out this little item. In the following, run against 10.2.0.3, tables t3 and t4 are index organized tables, in the same tablespace, with a primary key of (id1, id2) in that order.
A new form of index hint appeared in 10g – and it’s becoming more common to see it in production code; instead of naming indexes in index hints, we describe them. Consider the following hint (expressed in two ways, first as it appeared in the outline section of an execution plan, then cosmetically adjusted to look more like the way you would write it in your SQL):
INDEX(@"SEL$1" "PRD"@"SEL$1" ("PRODUCTS"."PRODUCT_GROUP" "PRODUCTS"."ID"))
index(@sel$1 prd@sel$1(product_group id))
Here’s one I keep forgetting – and spending 15 minutes trying to think of the answer before getting to the “deja vu” point again. I’ve finally decided that I’ve got to write the answer down because that will save me about 14 minutes the next time I forget.
Q. In a Statspack or AWR report there is a section titles “Segments by Row Lock Waits”. Why could an index be subject to a Row Lock Wait ?
A. Try inserting into a table from two different sessions (without committing) two rows with the same primary key. The second insert will wait on event enq: TX – row lock contention, and show up in v$lock with a lock request for a TX lock in mode 4. When you issue a commit or rollback on the first session, and the second statement errors or completes (depending on whether you commit or rollback the first session) it will increase the value for row lock waits in v$segstat (and v$segment_statistics) for the index by 1.
There are variations on the theme, of course, but the key feature is uniqueness with one session waiting for another session to commit or rollback on a conflicting value. This includes cases of foreign key constraint checking such as inserting a child for a parent that has been deleted but not committed (and there’s an interesting anomaly with that scenario which – in 10g, at least – reports more row lock waits on the parent PK than you might expect.)
Here’s a deadlock graph the appeared on Oracle-L and OTN a couple of days ago.
Resource Name process session holds waits process session holds waits
TX-001a0002-0002a0fe 196 197 X 166 1835 S
TM-0000c800-00000000 166 1835 SX 196 197 SX SSX
It’s a little unusual because instead of the common TX mode 6 (eXclusive) crossover we have one TX and one TM lock, the TX wait is for mode 4 (S) and the TM wait is for a conversion from 3 (SX) to 5 (SSX).
A few days ago I got an email about a problem with a system that would use a bitmap index to satisfy a query but wouldn’t use the equivalent B-tree index – and the DBA wanted to make the switch because he wanted to downgrade to Standard Edition from Enterprise Edition.
In outline there was a table with a mapping column defined as varchar2(N) holding a string of 1′s and 0′s representing a bit mask. Typically each map value had about 800 rows associated with it and the users’ queries were all about matching bits using the utl_raw.bitand() and utl_raw.bit_or() functions.
My response was that the only surprise was that Oracle was using the bitmap index, not that it wasn’t using the B-tree index as it seemed that the only way the index could be used was with an index fast full scan. I was curious, so I said I’d take a look at the query, the object definitions the plan, and the 10053 trace file if the DBA cared to send them to me.
It turned out that I was correct – the index fast full scan was the plan used by the bitmap index because the queries were of the form:
from tableX t0
utl_raw.bit_and(t0.mapping, '0000.....001') = '0000.....001'
But, as the DBA pointed out, Oracle didn’t even consider this plan when he changed the bitmap to a B-tree. Why not ? For the same old reason that Oracle often surprises people by ignoring indexes – the column was not declared as NOT NULL, which means there could be rows in the table that are not in the B-tree index, so the index was not a valid target for comparison. (In this case the human eye can see that this is irrelevant, but the optimizer is blindly following a heurisitc – or rule – at this point.)
Key point: Oracle’s standard B-tree indexes do not hold index entries that are completely null. Bitmap indexes (and cluster indexes) do have entries for the nulls.
Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.
Another problem with high insert/delete rates appears with very small indexes.
If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance.
The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.
Footnote: the reason why a change from 1 to 2 is dramatic is because Oracle ignores the blevel in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. Then the cost of a nested loop access is “cost of single access multiplied by number of times you do it” – so the sudden appearance of a 2 in the formula gives an increment in cost of “2 * number of times you visit the table” if your small table is the second table in a nested loop join – and suddenly a nested loop becomes much more expensive without a real change in the data size.
Footnote 2: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a blevel that is usually 1 it probably won’t take more than a fraction of a second to rebuild the index and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the blevel is likely to be 2.
Footnote 3: For an example of the type of code that will adjust the blevel of an index see this URL. (Note, the example talks about copying stats from one place to another – but the principle is the same.)
A question about partitioning came up on OTN a few days ago – but the really interesting part of the thread was about the choice of indexes, and how the choice of partitioning, combined with the data patterns, could make a big difference to the choice of indexes. I think the comments I made about indexes are worth seeing, so I’ve linked to the thread.