Here are a few thoughts on dbms_stats – in particular the procedure gather_index_stats.
The procedure counts the number of used leaf blocks and the number of distinct keys using a count distinct operation, which means you get an expensive aggregation operation when you gather stats on a large index. It would be nice efficiency feature if Oracle changed the code to use the new Approximate NDV mechanism for these counts.
A quick collation – and warning – for 11.2
Bottom line – be careful about what you do with system stats on 11.2
Footnote: the MOS link is a search string producing a list of references. I set it up like that because one of the articles referencing the bug is called “Things to consider before upgrade to 184.108.40.206” and it’s worth reading.
Addendum: one of the people on the two-day course I’ve just run in Berlin sent me a link for a quick note on how to set your own values for the system stats if you hit this bug. It’s actually quite a reasonable thing to do whether or not you hit the bug given the way that gathering the stats can produce unsuitable figures anyway: setting system stats. (I’ve also added their company blog to the links on the right, they have a number interesting items and post fairly regularly.)
Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.
Another problem with high insert/delete rates appears with very small indexes.
If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance.
The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.
Footnote: the reason why a change from 1 to 2 is dramatic is because Oracle ignores the blevel in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. Then the cost of a nested loop access is “cost of single access multiplied by number of times you do it” – so the sudden appearance of a 2 in the formula gives an increment in cost of “2 * number of times you visit the table” if your small table is the second table in a nested loop join – and suddenly a nested loop becomes much more expensive without a real change in the data size.
Footnote 2: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a blevel that is usually 1 it probably won’t take more than a fraction of a second to rebuild the index and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the blevel is likely to be 2.
Footnote 3: For an example of the type of code that will adjust the blevel of an index see this URL. (Note, the example talks about copying stats from one place to another – but the principle is the same.)
I’ve said in the past that one of the best new features, in my view, in 11g was the appearance of proper virtual columns; and I’ve also been very keen on the new “approximate NDV” that makes it viable to collect stats with the “auto_sample_size”.
Who’d have guessed that if you put them both together, then ran a parallel stats collection it would break :(
The bug number Karen quotes (10013177.8) doesn’t (appear to) mention extended stats – but since virtual columns, function-based indexes, and extended stats share a number of implementation details I’d guess that they might be affected as well.
A recent question on the OTN database forum:
Can any one please point to me a document or a way to calculate the average number of rows per block in oralce 10.2.0.3
One answer would be to collect stats and then approximate as block / avg_row_len – although you have to worry about things like row overheads, the row directory, and block overheads before you can be sure you’ve got it right. On top of this, the average might not be too helpful anyway. So here’s another (not necessarily fast) option that gives you more information about the blocks that have any rows in them (I picked the source$ table from a 10g system because source$ is often good for some extreme behaviour).
From time to time I see people asking how they can check how far their system has got in it’s call to collect stats. The question is often a little ambiguous – they know which call to dbms_stats they’ve used, so they know whether they’re trying to monitor stats collection against a single table (taking advantage of v$session_longops springs to mind) or against an entire schema.
Here’s one simple-minded approach that I whipped up a few years ago – it’s possible you could do better with the latest versions of dbms_stats. Its purpose is simply to give you some way of checking what work dbms_stats needs to do (in this example, for a schema), so that you can check how much has been done and how much remains. The basic method can be modified in various ways to match your tastes.
I wrote an article about system statistics / CPU Costing for Oracle magazine a few years ago – and last week I realised that I’ve never supplied a link to it in the notes and comments I’ve made about system statistics. So I’ve just run a search through the Oracle website trying to find it – and discovered that it’s no longer available. Apparently the editors have decided that any technical articles over a certain age should be withdrawn in case they are out of date and misleading. (Clearly they’ve read my blog on trust – I wish the people maintaining Metalink would do the same as the magazine editors – but they probably have a much larger volume to worry about).
However, I have discovered translations of the article in Russian, Korean and Chinese – so if you can read any of these languages, you might want to take a look at them before they disappear too.
If you want an original English version – dated April 2004, which is when I sent it in to Oracle Magazine, and before it underwent some editing – I’ve posted it as a pdf file.
[More on System Statistics]
In an earlier post on frequency histograms I described how Oracle creates an approximate histogram when dealing with character columns, and warned you that the strategy could lead to a couple of anomalies if you were unlucky. I’ve already published a note about one such anomaly that can occur with fairly long character strings, this note describes another anomaly that could appear in less extreme cases. Again, we start by constructing a data set.
In an earlier post on frequency histograms I described how Oracle creates an approximate histogram when dealing with character columns, and warned you that the strategy could lead to a couple of anomalies if you were unlucky. This note describes one such anomaly. We start with a slightly curious data set:
In an earlier note on interpreting the content of frequency histograms I made a throwaway comment about the extra complexity of interpreting frequency histograms on character-based columns. This note starts to examine some of the complications.
The driving problem behind character columns is that they can get quite large – up to 4,000 bytes – so the content of an “accurate histogram” could become quite large, and Oracle seems to have taken a strategic decision (at some point in history) to minimise this storage. As a result we can see an algorithm that works roughly as follows:
Here’s the output I get from querying dba_tab_histograms for a column that has been given a frequency histogram by a call to dbms_stats.gather_table_stats().
I find it convenient occasionally to “translate” a frequency histogram into a report of the underlying data (sometimes to see how closely the histogram matches the real data). To demonstrate the type of query I use I’ve created a data set with a small number of distinct values and generated a frequency histogram on the data set. This is what the data and histogram look like:
In case you don’t follow the link to Martin Widlake’s blog (see right) very often, he’s done a couple of recent posts on dba_tab_modifications that are worth reading.
(And I’ve just discovered the ‘gutter=”false”;’ option for the ‘sourcecode’ tag in one of the comments on Martin’s blog – and that’s also a helpful feature.)
If you’ve run the scripts from Analyze This, I hope you found that the query gave you two different execution plans. This is my output from the test (with a little cosmetic tidying):
Here’s a little script I wrote a few years ago to make a point about using the dbms_stats package. I’ve just re-run it on 10.2.0.3 to see if it still behaves the way it used to – and it does. If you want to be just a little bit baffled, set up a database with an 8KB blocks size, a tablespace that is locally managed, uniform extent size of 1MB, using freelist management, then run the script: