Here’s a model of a little problem I came across recently. It’s something I wrote about many years ago, and I thought I’d seen a note on Metalink (probably Note 73167.1 – see comments 2 and 4 below) explaining that the issue had been addressed; but the problem is still there, even in 126.96.36.199.
We start with a little data set (and it’s my standard setup of 8KB blocks, LMTs, 1MB uniform extents, and no ASSM):
It’s always worth browsing through the list of Oracle’s bug fixes each time a new release or patch comes out because it can give you clues about where to look for problems in your production release – and how to anticipate problems on the upgrade. This article is an example of a fix that I found while while looking at the note for 188.8.131.52 (MOS licence required for link) quite recently.
You might think from the title that this little note is going to be about the index hash join – you would be half right, it’s also about how the optimizer seems to make a complete hash of dealing with index hash joins.
Let’s set up a simple data set and a couple of indexes so that we can take a closer look:
A few days ago I published an example of the optimizer failing to handle an updateable join view because it didn’t recognise that a particular type of aggregate subquery would guarantee key-preservation. Here’s another example where the human eye can see key-preservation, but the optimizer can’t (even in 184.108.40.206). As usual we’ll start with some sample data – in this case two tables since I want to update from one table to the other.
A recent comment on a note I wrote some time ago about faking histograms asked about the calculations of selectivity in the latest versions of Oracle. As I read the question, I realised that I had originally supplied a formula for calculating cardinality, rather than selectivity, so I thought I’d supply a proper example.
We’ll start with a script to create some data and stats – and I’m going to start with a script I wrote in Jan 2001 (which is why it happens to use the analyze command rather than dbms_stats.gather_table_stats, even though this example comes from an instance of 220.127.116.11).
Here’s a simple update statement that identifies a few rows in a table then updates a column where a matching value can be derived from another table – it’s an example of an update by correlated subquery:
Last year I wrote a few articles for Simpletalk, a web service created by Red Gate for users of SQL Server. This year, Red Gate is setting up a similar service called “All things Oracle” (I’ve added a link in my blogroll) for Oracle users, and I’ve volunteered to write articles for them occasionally.
Some of the stuff they publish will be complete articles on their website, some will be short introductions with links to the authors’ own websites. My first article for them has just been posted – it’s an article that captures a couple of key points from the optimizer presentation I did at the UKOUG conference a couple of weeks ago.
Here are a few thoughts on dbms_stats – in particular the procedure gather_index_stats.
The procedure counts the number of used leaf blocks and the number of distinct keys using a count distinct operation, which means you get an expensive aggregation operation when you gather stats on a large index. It would be nice efficiency feature if Oracle changed the code to use the new Approximate NDV mechanism for these counts.
A quick collation – and warning – for 11.2
Bottom line – be careful about what you do with system stats on 11.2
Footnote: the MOS link is a search string producing a list of references. I set it up like that because one of the articles referencing the bug is called “Things to consider before upgrade to 18.104.22.168″ and it’s worth reading.
Addendum: one of the people on the two-day course I’ve just run in Berlin sent me a link for a quick note on how to set your own values for the system stats if you hit this bug. It’s actually quite a reasonable thing to do whether or not you hit the bug given the way that gathering the stats can produce unsuitable figures anyway: setting system stats. (I’ve also added their company blog to the links on the right, they have a number interesting items and post fairly regularly.)
Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.
Another problem with high insert/delete rates appears with very small indexes.
If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance.
The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.
Footnote: the reason why a change from 1 to 2 is dramatic is because Oracle ignores the blevel in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. Then the cost of a nested loop access is “cost of single access multiplied by number of times you do it” – so the sudden appearance of a 2 in the formula gives an increment in cost of “2 * number of times you visit the table” if your small table is the second table in a nested loop join – and suddenly a nested loop becomes much more expensive without a real change in the data size.
Footnote 2: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a blevel that is usually 1 it probably won’t take more than a fraction of a second to rebuild the index and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the blevel is likely to be 2.
Footnote 3: For an example of the type of code that will adjust the blevel of an index see this URL. (Note, the example talks about copying stats from one place to another – but the principle is the same.)
Being a very reserved British type of character I’m not really one to make a big fuss about advertising myself, which is why it’s taken me five years to realise that I ought to make it easy for people to find the free download of Chapter 5 (Clustering Factor) of Cost Based Oracle Fundamentals.
Apress changes the relevant URL from time to time, and I’ve just discovered that they’ve now bundled the pdf file of the chapter into this zip file.
The thing that prompted me to post this special note was that some time ago Mohamed Houri translated the chapter into French as a gesture of appreciation for the fact that I had written the book and Apress has given me permission to post the translation, which is this pdf file.
After publishing my note about emulating star transformations when there was something blocking the optimizer’s ability to do them automatically I ended up having an interesting email conversation with Mark Farnham about variations on the theme, concluding with a completely different (and possibly more intuitive) SQL statement to achieve the same aim.
A little while ago I published a note explaining how it was possible to find queries which ran faster if you manually de-coupled the index and table accesses. Here’s a further example that came up in discussion on a client site recently. The query looks something like this (at least in concept, although it was a little more complex, with some messy bits around the edges):
Browsing through the archive for the Oracle-L listserver a couple of days ago I came across this item dated Feb 2011 where the author was puzzled by Oracle’s choice of index for a query.
He was using 10.2.0.3, and running with the optimizer_mode set to first_rows – which you shouldn’t really be doing with that version of Oracle since Oracle Corp. told us about 10 years ago that “first_rows is avaiable only for backwards compatibility”.
When Oracle transforms a query it will sometimes report an internal view name in the execution plan – and I have seen people asking for help occasionally because they’ve tried to find the view name in dba_views and it hasn’t existed. So, just for reference, here’s a short list of the view names that might appear during query transformation but won’t exist in your system.
With each name I’ve put a brief comment of why they might appear: