This is possibly my longest title to date – I try to keep them short enough to fit the right hand column of the blog without wrapping – but I couldn’t think of a good way to shorten it (Personally I prefer to use the expression CTE – common table expression – over “factored subquery” or “subquery factoring” or “with subquery”, and that would have achieved my goal, but might not have meant anything to most people.)
If you haven’t come across them before, recursive CTEs appeared in 11.2, are in the ANSI standard, and are (probably) viewed by Oracle as the strategic replacement for “connect by” queries. Here, to get things started, is a simple (and silly) example:
Here’s a little script to demonstrate an observation about a missed opportunity for avoiding work that appeared in my email this morning (that’s morning Denver time):
I was involved in a thread on Oracle-L recently started with the question: “How many LIOs is too many LIOs”. Rather than rewrite the whole story, I’ve supplied a list of links to the contributions I made, in order – the final “answer” is actually the answer to a different question – but travels an interesting path to get there.#
I’ve got a script to emulate the requirement so that people can see for themselves the bug that I mention in post 15; I’ll try to add a couple of notes to it and publish it some time, but for the moment I’ll just remind myself that it’s called (slightly counter-intuitively: no_sort_problem.sql)
Here’s an example I saw a few months ago of the confusion caused by NULL. As the owner of the problem put it: the first query, run from SQL*Plus for testing purposes, takes no time to complete; but when “put into a pl/sql cursor” (as shown in the second query) it takes ages to complete.
How not to write subqueries:
I’ve written notes about the different join mechanisms in the past – but such things are always worth revisiting, so here’s an accumulated bundle of comments about hash joins.
A hash join takes two inputs that (in most of the Oracle literature) are referred to as the “build table” and the “probe table”. These rowsources may be extracts from real tables or indexes, or might be result sets from previous joins. Oracle uses the “build table” to build a hash table in memory, consuming and using the rowsource in a single call; it then consumes the “probe table” one row at a time, probing the in-memory hash table to find a match.
Access to the hash table is made efficient by use of a hashing function applied to the join columns – rows with the same value on the join column end up hashing to the same place in the hash table. It is possible for different input values to produce the same hash value (a hash collision) so Oracle still has to check the actual values once it has identified “probable” joins in the hash table. Because the comparison is based on a hashing mechanism, hash joins can only be used for join predicates that are equality predicates.
There are a couple of posts on the blog describing problems with updateable join views or, to be more precise, join views which were key-preserved but which the optimizer did not recognize as key-preserved. Both scenarios are addressed in 12c:
I’ve written a few notes about anomalies in subquery factoring (with subquery) in the past, principally making a fuss about the fact that moving an inline view into a “with subquery” can cause a plan to change even when the internal code moves the subquery back in line. With the arrival of 12c one of my first sets of tests was to rerun all the examples to see how many of them had been addressed. I hadn’t written about as many examples as I had thought, and some of them had been fixed before 12c, but here are few references to a couple of outstanding items that I thought worth a mention:
[Further reading on “subquery factoring”]
There have been a couple of nice posts about the “Top N” (or First N / Next N) syntax that has appeared in 12c, here and here, for example. I particularly like the first set of examples because they include some execution plans that give you a good idea of what’s going on under the covers. “Under the covers” is important, because if you don’t actually have a large data set to test on you might not realise what impact a “Top N” query might have on a production data set.
An interesting little problem appeared on the Oracle-L mailing list earlier on this week – a query ran fairly quickly when statistics hadn’t been collected on the tables, but then ran rather slowly after stats collection even though the plan hadn’t changed, and the tkprof results were there to prove the point. Here are the two outputs (edited slightly for width – the original showed three sets of row stats, the 1st, avg and max, but since the query had only been run once the three columns showed the same results in each case):
Possibly an item to file under “philosophy”, but a question came up in the seminar I was presenting today that prompted me to blog (very briefly) about why I manage to be so good at avoiding errors and inventing workarounds to problems. You probably know that you may see an execution plan change when you add a foreign key constraint to a table – but today someone in the class asked me if this would still work if the constraint were to be disabled. The immediate response that sprang to my mind was “surely not” – but the second thought was that I didn’t really know the answer and would have to check; and the third thought was that maybe it wouldn’t if disabled, but what about novalidate; and the fourth thought was whether the setting for query_rewrite_integrity would make a difference; and the fifth thought was to wonder if there were other conditions that mattered.
So hey-ho for the weekend, when I have to set up a test case for a query that changes plan when I add a foreign key constraint – and then try (at least) eight different combinations of circumstances to check what it really takes to make the change happen.
You might have expected the following query ought to run reasonably efficiently, after all it seems to be targeted very accurately at precisely the few rows of information I’m interested in:
owner = 'TEST_USER'
and table_name = 'TEST_COMP'
and subpartition_name = 'P_MAX_D'
I have a small collection of postings where I’ve described anomalies or limitations in subquery factoring (the “with subquery”, or Common Table Expression (CTE) to give it the official ANSI name). Here’s another example of Oracle’s code not behaving consistently. You may recognise the basic query from yesterday’s example of logical tuning – so I won’t reprint the code to generate the data sets. This examples in this note were created on 126.96.36.199 – we start with a simple query and its execution plan:
Here’s a model of a problem I solved quite recently at a client site. The client’s query was much more complex and the volume of data much larger, but this tiny, two table, example is sufficient to demonstrate the key principle. (Originally I thought I’d have to use three tables to model the problem, which is why you may find my choice of table names a little odd). I ran this example on 188.8.131.52 – which was the client version:
My note on “NOT IN” subqueries is one of the most popular on my blog, staying in the top 5 hits for the last five years – but it’s getting a bit old, so it’s about time I said something new about “NOT IN” – especially since the Null Aware Anti Join has been around such a long time. The example I want to talk about is, as so often, something that came up as a problem on a customer site. Here’s a bit of SQL to model the situation, which is currently running under Oracle 184.108.40.206: