In a recent note I wrote about index joins I made a passing comment about limitations in the optimizer’s available strategies that might make you choose to write your code to emulate an index join through explicit SQL references.
Here are two SQL similar SQL statements (with execution plans) that demonstrate the initial problem – the first is just a restatement of the basic example I supplied in the first article:
Despite the title and content, this article is more about thinking and testing than it is about list partitions as such. It’s just based on a little thought experiment with list partitioning.
I have a list-partitioned table, with no indexes, defined as follows:
I’ve spent the last week in Spain – sightseeing, rather than working – with a minimum amount of access to the Internet.
Inevitably I now have to work hard to catch up with my email. As a moment of light relief in an otherwise irritating chore I thought I’d respond to an emailed request for help. (Regular readers of the blog will know that I don’t usually respond to private email requests for solutions, but sometimes someone gets lucky.)
The question was basically this: why do I get different execution plans for the following two statements:
In an earlier post on frequency histograms I described how Oracle creates an approximate histogram when dealing with character columns, and warned you that the strategy could lead to a couple of anomalies if you were unlucky. I’ve already published a note about one such anomaly that can occur with fairly long character strings, this note describes another anomaly that could appear in less extreme cases. Again, we start by constructing a data set.
I’ve often been heard to warn people of the accidents that can happen when they forget about the traps that appear when you start allowing columns to be NULL – but sometimes NULLs are good, especially when it helps Oracle understand where the important (e.g. not null) data might be.
An interesting example of this came up on OTN a few months ago where someone was testing the effects of changing a YES/NO column into a YES/NULL column (which is a nice idea because it allows you to create a very small index on the YESes, and avoid creating a histogram to tell the optimizer that the number of YESes is small).
They were a little puzzled, though, about why their tests showed Oracle using an index to find data in the YES/NO case, but not using the index in the YES/NULL case. I supplied a short explanation on the thread, and was planning to post a description on the blog, but someone on the thread supplied a link to AskTom where Tom Kyte had already answered the question, so I’m just going to leave you with a link to his explanation.
Here’s an example of “creative SQL” that I wrote in response to a question on OTN about combining data from two indexes to optimise access to a table. It demonstrates the principle that you can treat an index as a special case of a table – allowing you to make a query go faster by referencing the same table more times.
Unfortunately you shouldn’t use this particular example in a production system because it relies on the data appearing in the right order without having an “order by” clause. This type of thing makes me really keen to have a hint that says something like: /*+ qb_name(my_driver) assume_ordered(@my_driver) */ so that you could tell the optimizer that it can assume that the rowset from a given query block will appear in the order of the final “order by” clause.
There’s a thread on OTN that talks about a particular deletion job taking increasing amounts of time each time it is run.
It looks like an example where some thought needs to go into index maintenance and I’ve contributed a few comments to the thread – so this is a lazy link so that I don’t have to repeat myself on the blog.
I’ve written before about the effects of subquery factoring (common table expressions – or CTEs) on the optimizer, and the way that the optimizer can “lose” some strategies when you start factoring out subquery expressions. Here’s another example I came across quite recently. It involved a join of about 15 tables so I’ve only extracted a few lines from the SQL and resulting execution plans.
We start with the original query, which had factored out an aggregate subquery then used it in place of an inline view:
I think anyone who has read Wolfgang Breitling’s material about the optimizer will be familiar with the concept of Cardinality Feedback and one particular detail that when Oracle gets a cardinality estimate of one for a “driving” table then there’s a good chance that the execution plan will go wrong. (That’s not rule, by the way, just a fairly common observation after things have gone wrong.)
A recent note on OTN reminded me of a particular scenario where this specific problem can occur. It’s not particularly common, but it may hit people who are building data warehouses from multiple different sources. We start with an unlikely looking data set and very simple query:
From time to time I’ve warned people that subquery factoring should be used with a little care if all you’re trying to do is make a query more readable by extracting parts of the SQL into “factored subqueries” (or Common Table Expressions – CTEs – if you want to use the ANSI term for them). In principle, for example, the following two queries should produce the same execution plan:
I’ve previously published a couple of notes (hereand here) about the driving_site() hint. The first note pointed out that the hint was deliberately ignored if you write a local CTAS or INSERT that did a remote query. I’ve just found another case where the hint is ignored – this time in a simple SELECT statement.
Try running an ordinary distributed query from the SYS account, and then try using the driving_site()hint to make it run at the remote site. When I tried this a few days ago I ended up wasting half an hour translating some SQL from ANSI to Oracle dialect because I thought that the ANSI was making Oracle transform the query in a way that lost the hint – then I discovered that both versions of the code worked correctly if I logged in as a different user.
I was running my queries between two databases using 126.96.36.199 – I won’t guarantee you get the same results on other versions, but it looks like SYS doesn’t honour the driving_site() hint. I can’t think of a robust argument why this should be the case, but if I were forced to do some vague hand-waving I’d probably mumble something about potential security loopholes.
Footnote: I should, of course, have mentioned that there are all sorts of things that behave in unexpected ways if you are logged on as SYS, and that you shouldn’t be logged on as SYS – especially in a production system.
[Further reading on "ignoring hints"]
In the latest Quiz Night, I asked how you could make a query more efficient by changing a two table join into a three table join – with the clue that my third table was a repeat of the first table. Gary Myers, in comment 4, provided the type of answer I was looking for. Sometimes it is more efficient to get a small amount of data from a table on a first pass then go back and get the rest of the data on a second pass – especially if the first pass is an ‘index only’ operation.
Here’s a simple piece of code demonstrating an irritating problem. I’ve created a table, a function-based index, collected stats (without histograms), and then run a query that should use that index – but doesn’t.
Here’s a little script I wrote a few years ago to make a point about using the dbms_stats package. I’ve just re-run it on 10.2.0.3 to see if it still behaves the way it used to – and it does. If you want to be just a little bit baffled, set up a database with an 8KB blocks size, a tablespace that is locally managed, uniform extent size of 1MB, using freelist management, then run the script:
If you run a query using first_rows_N optimisation you could run into a massive performance problem in cases where the optimizer thinks the complete result set is quite large when it is actually very small.
If both conditions are true the optimizer may choose a very resource-intensive execution path “expecting” to stop (or at least pause between fetches) after N rows – hoping to give the impression that it can respond very quickly – but find that the query has to run to completion because the N rows simply don’t exist.