The Enkitec Extreme Exadata Expo (E4) event is over, but I still have plenty to say about the technology. The event was a great success, with plenty of interesting speakers and presentations. I was particularly keen to hear Frits Hoogland’s comments on Exadata and OLTP, Richard Foote on Indexes, and Maria Colgan’s comments on how Oracle is making changes to the optimizer to understand Exadata a little better.
All three presentations were interesting – but Maria’s was possiby the most important (and entertaining). In particular she told us about two patches for 220.127.116.11, one current and one that is yet to be released (unfortunately I forgot to take note of the patch numbers – ed: but they’ve been supplied by readers’ comment below).
When I arrived in Edinburgh for the UKOUG Scotland conference a little while ago Thomas Presslie, one of the organisers and chairman of the committee, asked me if I’d sign up on the “unconference” timetable to give a ten-minute talk on something. So I decided to use Hybrid Columnar Compression to make a general point about choosing and testing features. For those of you who missed this excellent conference, here’s a brief note of what I said.
For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.
It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while reading the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment on page 429 to the effect that:
In NOARCHIVELOG mode all bulk operations (such as INSERT, APPEND, index REBUILD and ALTER TABLE MOVE) are automatically nologging.
I was in a discussion recently about how to estimate the size of a bitmap index before you build it, and why it’s much harder to do this for bitmap indexes than it is for B-tree indexes. Here’s what I wrote in “Practical Oracle 8i”:
There are many little bits and pieces lurking in the Oracle code set that would be very useful if only you had had time to notice them. Here’s one that seems to be virtually unknown, yet does a wonderful job of eliminating calls to decode().
The nvl2() function takes three parameters, returning the third if the first is null and returning the second if the first is not null. This is convenient for all sorts of example where you might otherwise use an expression involving case or decode(), but most particularly it’s a nice little option if you want to create a function-based index that indexes only those rows where a column is null.
If you drop a unique or primary key constraint the index that supports it may be dropped at the same time – but this doesn’t always happen. Someone asked me recently if it was possible to tell whether or not an index would be dropped without having to find out the hard way by dropping the constraint. The answer is yes – after all, Oracle has to make a decision somehow, so if we can find out how it makes the decision we can predict the decision.
So here’s my best theory so far – along with the observations that led to it. First, run a trace while dropping a primary key constraint and see if this gives you any clues; on an instance running 10gR2 I noticed the following statement appearing in the trace file immediately after the delete from cdef$ (constraint definitions).
Here’s a model of a little problem I came across recently. It’s something I wrote about many years ago, and I thought I’d seen a note on Metalink (probably Note 73167.1 – see comments 2 and 4 below) explaining that the issue had been addressed; but the problem is still there, even in 18.104.22.168.
We start with a little data set (and it’s my standard setup of 8KB blocks, LMTs, 1MB uniform extents, and no ASSM):
You might think from the title that this little note is going to be about the index hash join – you would be half right, it’s also about how the optimizer seems to make a complete hash of dealing with index hash joins.
Let’s set up a simple data set and a couple of indexes so that we can take a closer look:
In my previous post, I made the comment:
In general, if you have a three-column index that starts with the same columns in the same order as the two-column index then the three-column index will be bigger and have a higher clustering_factor.
So what scenarios can you come up with that fall outside the general case ?
Alternatively, what argument could you put forward that justifies the general claim ?
I’ll try to respond to comments on this post a little more quickly than the last one, but I still have quite a lot of other comments to catch up on.
Browsing a little history recently I came across a note I’d written about the new-style index hint. In that note I claimed that:
… the index has to start with the columns (product_group, id) in that order – with preference given to an exact match, otherwise using the lowest cost index that starts the right way.
On reading this statement I suddenly realised that I hadn’t actually proved (to myself, even) that if I had the indexes (product_group, id) and (product_group, id, other_col) then a two-column hint forced Oracle to use the two column index in all (legal) circumstances.
So, tonight’s quiz – are there any edge cases, and what easy ways can you think of to prove (or disprove) the claim for the general case.
Footnote: you don’t have to demonstrate the method, just a brief outline of the idea will be sufficient.
Here’s a follow-up to a post I did some time ago about estimating the size of an index before you create it. The note describes dbms_stats.create_index_cost() procedure, and how it depends on the results of a call to explain plan. A recent question on the OTN database forum highlighted a bug in explain plan, however, which I can demonstrate very easily. I’ll start with a small amount of data to demonstrate the basic content that is used to calculate the index cost.
Here’s a funny little optimizer bug – though one that seems to have been fixed by at least 10.2.0.3. It showed up earlier on today in a thread on the OTN database forum. We’ll start (in 22.214.171.124) with a little table and two indexes – one normal, the other descending.
This one’s so odd I nearly posted it as a “Quiz Night” – but decided that it would be friendlier simply to demonstrate it. Here’s a simple script to create a couple of identical tables. It’s using my standard environment but, apart from fiddling with optimizer settings, I doubt if there’s any reason why you need to worry too much about getting the environment exactly right.
Here are a few thoughts on dbms_stats – in particular the procedure gather_index_stats.
The procedure counts the number of used leaf blocks and the number of distinct keys using a count distinct operation, which means you get an expensive aggregation operation when you gather stats on a large index. It would be nice efficiency feature if Oracle changed the code to use the new Approximate NDV mechanism for these counts.
Inspired by Martin Widlake’s series on IOTs, I thought I’d throw out this little item. In the following, run against 10.2.0.3, tables t3 and t4 are index organized tables, in the same tablespace, with a primary key of (id1, id2) in that order.