Here’s a recent request from the OTN database forum – how do you make this query go faster (tkprof output supplied):
from a, b
where A.MARK IS NULL
and a.cntry_code = b.cntry_code and b.dir_code='XX' and b.numb_type='XXX'
and upper(Trim(replace(replace(replace(replace(replace(replace(replace(a.co_name,'*'),'&'),'-'),'/'),')'),'('),' '))) like
This is one of those posts where the investigation is left as an exercise – it’s not difficult, just something that will take a little time that I don’t have, and just might end up with me chasing half a dozen variations (so I’d rather not get sucked into looking too closely). It comes from an OTN question which ends up reporting this predicate:
WHERE ( LENGTH ( :b7) IS NULL OR
UPPER (TRIM (CODFSC)) = UPPER (TRIM ( :b8)) or
UPPER (TRIM (CODUIC)) = UPPER (TRIM ( :b9)))
AND STATE = 0;
The three bind variables all hold the same value; there is a function-based index on upper(trim(codfsc)), and another on upper(trim(coduic)). The execution plan for this query is a full tablescan, but if you eliminate the first predicate Oracle can do a concatenation of two index range scans. This variation doesn’t surprise me, the optimizer’s ability to introduce concatenation is limited; however, I did wonder whether some small variation in the SQL would allow the optimizer to get just a little more clever.
Would you get concatenation if you changed the first predicate to (:b7 is null); if not, would a similar query that didn’t depend on function-based indexes do concatenation; if not is there any rewrite of this query that could do a tablescan ONLY for the case where :b7 was null ?
Demonstrations of any levels of success can be left in the comments if anyone’s interested. To get a fixed font that preserves space start the code with “sourcecode” and end with “/sourcecode” (removing the quotation marks and replacing them with square brackets).
Here’s a little oddity I came across in 18.104.22.168 a few days ago – don’t worry too much about what the query is trying to do, or why it has been written the way I’ve done it, the only point I want to make is that I’ve got the same plan from two different strategies (according to the baseline/outline/hints), but the plans have a difference in cost.
How not to write subqueries:
It’s finally time to take a close look at the parallel versions of the execution plan I produced a little while ago for a four-table hash join. In this note I’ll examine the broadcast parallel distribution. First, here’s a list of the hints I’m going to use to get the effect I want:
leading(t4 t1 t2 t3)
full(t4) parallel(t4, 2)
use_hash(t1) swap_join_inputs(t1) pq_distribute(t1 none broadcast)
full(t1) parallel(t1, 2)
use_hash(t2) swap_join_inputs(t2) pq_distribute(t2 none broadcast)
full(t2) parallel(t2, 2)
use_hash(t3) swap_join_inputs(t3) pq_distribute(t3 none broadcast)
full(t3) parallel(t3, 2)
When you upgrade you often find that some little detail (of the optimizer) that didn’t receive a lot of attention in the “New Features” manuals introduces a few dramatic changes in execution plans. Here’s one example of a detail that is likely to catch a few unlucky people. We start with a very simple table which is just and id column with some padding, and then show the effect of a change in the handling of “constant subqueries”. Here’s my data set:
A recent question on the OTN database forum included an execution plan that prompted one reader to ask: “but where has the existence subquery gone?” Here’s the original question showing the query, and here’s the later response showing the plan that prompted the question.
There were three possible reasons why that question may have been posed:
Here’s the output I got from a 10.2.0.5 system after generating a stored outline on a query – then dropping the index that was referenced by the stored outline and creating an alternative index. Spot the problem:
Here’s an odd little optimizer glitch – probably irrelevant to most people, but an indication of the apparent randomness that appears as you combine features. I’ve created an example which is so tiny that the only explanation I can come up with the for optimizer not “behaving properly” is that I’ve found an undocumented restriction relating to a particular feature.
Here’s a lovely little example that just came up on the OTN database forum of how things break when features collide. It’s a bug (I haven’t looked for the number) that seems to be fixed in 22.214.171.124. All it takes is a deferrable foreign key and an outer join. I’ve changed the table and column names from the original, and limited the deferability to just the foreign key:
Since I’m going to write a couple of articles dissecting parallel execution plans, I thought I’d put up a reference post describing the set of tables I used to generate the plan from the previous post, and the query (with serial execution plan) that produced it. The setup is a simple star schema arrangement – which I’ve generated by created by creating three identical tables and then doing a Cartesian join across the three of them.
When you read an execution plan you’re probably trying to identify the steps that Oracle went through to acquire the final result set so that you can decide whether or not there is a more efficient way of getting the same result.
For a serial execution plan this typically means you have to identify the join order, join methods and access methods together with the point at which each predicate was applied. Generally this isn’t too difficult, although subquery pushing (and a couple of presentation bugs relating to other uses of scalar subqueries) can cause a little confusion; and the difference between join order and the order of operation can be slightly obscured when considering hash joins.
Parallel execution plans are harder, though, because you really need to understand the impact of the order of operation, distribution mechanisms chosen, and (in recent versions of Oracle) the timing of the generation and use of Bloom filters. The topic is stunningly large and easy to describe badly; it’s also going to be easy to make generalisations that turn out to be untrue or (at least) sufficiently incomplete as to be misleading. Rather than attempting to cover the topic in one note, I think I’m going to end up writing two or three.
Or – to be more accurate – real statistics on a virtual column.
This is one of the “10 top tips” that I came up with for my session with Maria Colgan at OOW13. A method of giving more information that might improve execution plans when you can’t change the code. I’ll start with a small data set including a virtual column (running 126.96.36.199), and a couple of problem queries:
(… as in “trick or treat”)
Here’s an important point I learned from Maria Colgan’s “10 tips” presentation on Tuesday of Open World. It comes in two steps – the bit that most people know, and an unexpected consequence:
- autotrace can give misleading execution plans for queries that use bind variables – because autotrace doesn’t peek
- if you run a query after testing it with autotrace, the plan generated by autotrace can be shared by the later execution
Here’s a simple script I’ll be using to demonstrate the behaviour:
Here’s a little quiz about Bloom filtering. There seem to be at least three different classes of query where Bloom filters can come into play – all involving hash joins: partition elimination, aggregate reduction on non-mergeable aggregate views, and parallelism.
This quiz is about parallel queries – and all you have to do is work out how many Bloom filters were used in the following two execution plans (produced by 188.8.131.52), and where they were used.