In the past I have enthused mightily about the benefits of the approximate NDV mechanism and the benefit of using auto_sample_size to collect statistics in 11g; however, as so often happens with Oracle features, there’s a down-side or boundary condition, or edge case. I’ve already picked this up once as an addendum to an earlier blog note on virtual stats, which linked to an article on OTN describing how the time taken to collect stats on a table increased dramatically after the addition of an index – where the index had this definition:
create bitmap index i_s_rmp_eval_csc_msg_actions on
As you might guess from the column name, this is an index based on an XML column, which is stored as a CLOB.
In a similar vein, I showed you a few days ago an old example I had of indexing a CLOB column with a call to dbms_lob.getlength(). Both index examples suffer from the same problem – to support the index Oracle creates a hidden (virtual) column on the table that can be used to hold statistics about the values of the function; actual calculated values for the function call are stored in the index but not on the table itself – but it’s important that the optimizer has the statistics about the non-existent column values.
When I introduced Connor McDonald’s blog a few days ago, it was because we had exchanged a couple of email messages (through the Oak Table Network) about how to minimise the resource consumption when copying a load of data from one IOT to another of the same structure. His problem was the way in which the obvious way of copying the data resulted in a massive sort even though, in principle, it should not have been necessary to sort anything since the data could have been extracted in order by walking the existing IOT.
As a suggestion I referenced a comment I had made in the Addenda to Practical Oracle 8i about 12 years ago when I had first solved the problem of loading an IOT with minimal logging and no sorting. At the time I had been loading data from a sorted file into an empty table that was then going to be exchnaged into a partitioned IOT – but it crossed my mind that loading from a flat file and loading from a UNIX pipe were pretty much the same thing, so perhaps Connor could workaround his problem by making one session spool to a pipe while another session was reading it. In the end, he simply created a massive temporary tablespace, but I thought I’d modify a test script I wrote a few years ago to demonstrate my idea – and here it is:
In a recent question on OTN someone asked why Oracle had put some columns into the overflow segment of an IOT when they had specified that they should be in the main index section (the “IOT_TOP”) by using the including clause.
The answer is simple and devious; there’s a little trap hidden in the including clause. It tells Oracle which columns to include, but it gets applied only after Oracle has re-arranged the column ordering (internally) to put the primary key columns first. The OP had put the last column of the primary key AFTER the dozen columns in the table that he wanted in the index section, but Oracle moved that column to the fifth position in the internal table definition, so didn’t include the desired 10 extra columns.
It was good to see the answers to the last Quiz Night accumulating. The problem posed was simply this: I have two IOTs and I’ve inserted the same data into them with the same “insert as select” statement. Can you explain the cost of a particular query (and it’s the same for both tables) and extreme differences in work actually done. Here’s the query, the critical stats on the primary key indexes, the shared plan, and the critical execution statistic for running the plan.
That’s Index Organized Tables, of course. Searching back through my blog I find that I’ve only written one article about IOTs– although I’m very keen on taking advantage of them, and have made a few references to them in other articles. Rather than addressing this oversight myself, I thought I’d direct you to a series on IOTs by Martin Widlake.
Updated Feb 2014 with another worthwhile catalogue of articles
Richard Foote’s array of articles on IOTs
A question came up on the Oracle database forum a few months ago asking:
What are the benefits and the downside of using IOTs on 16k blocks? Would you recommend it?
I think the best response to the generic question about block sizing came from Greg Rahn in another thread on the forum:
If someone has to ask what block size they need. The answer is always 8KB.***
I’ve probably got a couple of comments about 10053 trace files lurking somewhere on this blog and on my website – and when I mention the 10053 I usually remember to say that it’s a last resort that I only use when I think there may be a bug that needs to be nailed.
So here’s a bit of a 10053 – which I only looked at because I thought it was going to show me a bug.