If you know anything about bitmap indexes you probably know that a single entry in a bitmap index takes the form (key_value, starting rowid, ending rowid, BBC compressed bit string). So an entry covers a single value for a column over a range of rowids in the table, and the string of bits for that (notional) range is reduce to a minimum by a compression mechanism that eliminate repeated zeros in multiples of 8.
So here’s a question – to which I don’t know the answer, although you may be surprised when you try to find it:
If you have a very large table and in one of its columns the first row and the last row (and no others) hold the value 0 (say) and you create a bitmap index on this column, what’s the largest number of rows you could have in the table before Oracle would HAVE to create two index entries in order to cover both rows ?
Follow-up question – once you start getting close to working out the answer, can you think of a way to provide an example without actually creating a table with that many rows in it ?
I’m afraid this is one of my bad puns again – an example of the optimizer making a real hash of the index hash join. I’m going to create a table with several indexes (some of them rather similar to each other) and execute a query that should do an index join between the obvious two indexes. To show how obvious the join should be I’m going to start with a couple of queries that show the cost of simple index fast full scans.
Here’s the data generating code:
Just throwing out a brief comment (one of my many draft notes that I don’t have time to complete) about the dbms_space package. You’re probably familiar with this package and how, for ASSM segments, it can give you a measure of the available space in the blocks in a data segment, reporting 6 possible states of the blocks below the high high water mark (HHWM) e.g.:
This question came up on the OTN database forum a couple of months ago: “Why doesn’t Oracle allow you to create globally partitioned bitmap indexes?” The obvius answer is “It just doesn’t, okay.” But it can be quite interesting to think of reasons why a particular mechanism might not have been implemented – sometimes the answer can give you an insight into how a feature has been implemented, or it might suggest cases where a feature might not work very well, it might give you some ideas on how to work around a particular limitation, and sometimes it might just help to pass the time on a short flight.
Here’s another of my “draft” notes that needs some expansion and, most importantly, proof.
I have a fact table with a status id column that shows a massive skew. But I also have a dimension table that holds the “status code” so (in theory, at least) I have to do a join from the statuses table to the facts table to find rows of a given status. Unfortunately the join hides the skew:
Oracle 12c has increased the maximum length of character-based columns to 32K bytes – don’t get too excited, they’re stored out of lines (so similar in cost to LOBs) and need some modification to the parameter file and data dictionary (starting the database in upgrade mode) before you can use them.
Richard Foote has a pair of articles on indexing such columns:
Be cautious about enabling this option and test carefully – there are going to be a number of side effects, and some of them may require a significant investment in time to resolve. The first one that came to my mind was that if you’ve created a function-based index on a pl/sql function that returns a varchar2() type and haven’t explicitly created the index on a substr() of the return value then the data type of the function’s return value will change from the current default of varchar2(4000) to varchar2(32767) – which means the index will become invalid and can’t be rebuilt or recreated.
Obviously you can redefine the index to include an explicit substr() call – but then you have to find all the code that was supposed to use the index and modify it accordingly.
Here’s one of those little details which I would have said just couldn’t be true – except it’s in the manuals, and the manuals happen to be right.
I wrote a note about the 12c “In-Memory” option some time ago on the OTN Database forum and thought I’d posted a link to it from the blog. If I have I can’t find it now so, to avoid losing it, here’s a copy of the comments I made:
It probably won’t surprise many people to hear me say that the decode() function can be a bit of a nuisance; and I’ll bet that quite a lot of people have had trouble occasionally trying to get function-based indexes that use this function to behave properly. So (to put it all together and support the general directives that case is probably a better choice than decode() and that the cast() operator is an important thing to learn) here’s an example of how function-based indexes don’t always allow you to work around bad design/code. (Note: this is a model of a problem I picked up at a client site, stripped to a minimum – you have to pretend that I’m not allowed to fix the problem by changing code).
A recent question on the Oracle-L list server described a problem with data coming in from SQL Server and an oddity with referential integrity failing on Oracle because (for example) a child row was in lower case while the parent was in upper.
This raised a few comments on how you might handle referential integrity while allowed case to differ. No doubt it’s been done before – by Tom Kyte if no-one else – but the first thought that crossed my mind was to use virtual columns:
Here’s a little note that came about after I tweeted an idle thought on Twitter yesterday
- 12c allows you to have multiple indexes on the same columns on a table, although only one of them is allowed to be visible at any one time – you can do the same with any recent versions of Oracle “almost”, and without the invisibility requirements. (Thanks to Jason Bucata for suggesting the critical detail on this one.)
- 12c allows you to have “partial” indexing on partitioned tables – you can do the same with earlier versions of Oracle “almost” but only if the indexes are local indexes or globally partitioned.
- 12c doesn’t officially allow you to create an index that is a bitmap in the past and a btree in the present (yet) – although you can almost do this in any recent versions of Oracle.
The clustering_factor is one of the most important numbers (if not the most important number) affecting the optimizer’s choice of execution plan – it’s the thing that has the most significant effect on the optimizer’s decision on whether to choose a table scan or an index, and on which index to choose.
One of the sad things about trying to keep on top of Oracle is that there are so many little things that could go wrong and take a long time to identify. In part this is why I try to accumulate test cases for all the oddities and anomalies I come across as I travel around the world – if I’ve spent the time recreating a problem I’ll probably remember it the next time I see the symptoms.
Just one of those little snippets about 12c that might help someone.
Further to an earlier post, online rebuild works in 12c even when the key is “too long”. The internal code has changed completely, and there is no sign of the problematic journal table that caused the problem in earlier versions.
In my last post I made a comment about how the optimizer will use the new format of the index hint to identify an index that is an exact match if it can, and any index that starts with the same columns (in the right order) if it can’t find an exact match. It’s fairly easy to demonstrate the behaviour in 11g by examining the 10053 (CBO) trace file generated by a simple, single table, query – in fact, this is probably a case that Doug Burns might want to cite as an example of how, sometimes, the 10053 is easy to interpret (in little patches):