Here’s an oddity that appeared on the OTN database forum last night:
We have this query in our application which works fine in 9i but fails in 11gR2 (on Exadata) giving an “ORA-00937: not a single-group group function” error….
… The subquery is selecting a column and it doesn’t have a group by clause at all. I am not sure how is this even working in 9i. I always thought that on a simple query using an aggregate function (without any analytic functions / clause), we cannot select a column without having that column in the group by clause. So, how 11g behaves was not a surprise but surprised to see how 9i behaves. Can someone explain this behaviour?
The poster supplied the suspect query, and it certainly looked as if it should never have worked – but I took a guess that the optimizer was doing some sort of transformation that concealed the problem before the optimizer managed to see the error. The subquery was a little odd because it was doing something it didn’t need to do, and my was guess that the optimizer had recognised the option to simplify the query and the simplification had “accidentally” removed the error. This turned out to be correct, but my guess about exactly what had happened to hide the error was wrong.
One of the changes that appeared in 12c was a change to the BROADCAST distribution option for parallel execution. I mentioned this in a footnote to a longer article a couple of months ago; this note simply expands on that brief comment with an example. We’ll start with a simple two-table hash join – which I’ll first construct and demonstrate in 22.214.171.124:
A little while ago I highlighted a special case with the MINUS operator (that one of the commentators extended to include the INTERSECT operator) relating to the way the second subquery would take place even if the first subquery produced no rows. I’ve since had an email from an Oracle employee letting me know that the developers looked at this case and decided that it wasn’t feasible to address it because – taking a wider view point – if the query were to run parallel they would need a mechanism that allowed some synchronisation between slaves so that every slave could find out that none of the slaves had received any rows from the first subquery, and this was going to lead to hanging problems.
The email reminded me that there’s another issue of the same kind that I discovered several years ago – I thought I’d written it up, but maybe it was on a newsgroup or forum somewhere, I can’t find it on my blog or old website). The problem can be demonstrated by this example:
Common internet question: does the order of predicates in the where clause make a difference.
General answer: It shouldn’t, but sometimes it will thanks to defects in the optimizer.
There’s a nicely presented example on the OTN database forum where predicate order does matter (between 10.1.x.x and 126.96.36.199). Note particularly – there’s a script to recreate the issue; note, also, the significance of the predicate section of the execution plan.
It’s bug 6782665, fixed in 188.8.131.52
Recently appeared on Mos – “Bug 18219084 : DIFFERENT EXECUTION PLAN ACROSS RAC INSTANCES”
Now, I’m not going to claim that the following applies to this particular case – but it’s perfectly reasonable to expect to see different plans for the same query on RAC, and it’s perfectly possible for the two different plans to have amazingly different performance characteristics; and in this particular case I can see an obvious reason why the two nodes could have different plans.
Here’s the query reported in the bug:
Here’s a simple little demonstration of an enhancement to the optimizer in 12c that may result in some interesting changes in execution plans as cardinality estimates change from “guesses” to accurate estimates.
Here’s a little script to demonstrate an observation about a missed opportunity for avoiding work that appeared in my email this morning (that’s morning Denver time):
Here’s a recent request from the OTN database forum – how do you make this query go faster (tkprof output supplied):
from a, b
where A.MARK IS NULL
and a.cntry_code = b.cntry_code and b.dir_code='XX' and b.numb_type='XXX'
and upper(Trim(replace(replace(replace(replace(replace(replace(replace(a.co_name,'*'),'&'),'-'),'/'),')'),'('),' '))) like
This is one of those posts where the investigation is left as an exercise – it’s not difficult, just something that will take a little time that I don’t have, and just might end up with me chasing half a dozen variations (so I’d rather not get sucked into looking too closely). It comes from an OTN question which ends up reporting this predicate:
WHERE ( LENGTH ( :b7) IS NULL OR
UPPER (TRIM (CODFSC)) = UPPER (TRIM ( :b8)) or
UPPER (TRIM (CODUIC)) = UPPER (TRIM ( :b9)))
AND STATE = 0;
The three bind variables all hold the same value; there is a function-based index on upper(trim(codfsc)), and another on upper(trim(coduic)). The execution plan for this query is a full tablescan, but if you eliminate the first predicate Oracle can do a concatenation of two index range scans. This variation doesn’t surprise me, the optimizer’s ability to introduce concatenation is limited; however, I did wonder whether some small variation in the SQL would allow the optimizer to get just a little more clever.
Would you get concatenation if you changed the first predicate to (:b7 is null); if not, would a similar query that didn’t depend on function-based indexes do concatenation; if not is there any rewrite of this query that could do a tablescan ONLY for the case where :b7 was null ?
Demonstrations of any levels of success can be left in the comments if anyone’s interested. To get a fixed font that preserves space start the code with “sourcecode” and end with “/sourcecode” (removing the quotation marks and replacing them with square brackets).
Here’s a little oddity I came across in 184.108.40.206 a few days ago – don’t worry too much about what the query is trying to do, or why it has been written the way I’ve done it, the only point I want to make is that I’ve got the same plan from two different strategies (according to the baseline/outline/hints), but the plans have a difference in cost.
How not to write subqueries:
It’s finally time to take a close look at the parallel versions of the execution plan I produced a little while ago for a four-table hash join. In this note I’ll examine the broadcast parallel distribution. First, here’s a list of the hints I’m going to use to get the effect I want:
leading(t4 t1 t2 t3)
full(t4) parallel(t4, 2)
use_hash(t1) swap_join_inputs(t1) pq_distribute(t1 none broadcast)
full(t1) parallel(t1, 2)
use_hash(t2) swap_join_inputs(t2) pq_distribute(t2 none broadcast)
full(t2) parallel(t2, 2)
use_hash(t3) swap_join_inputs(t3) pq_distribute(t3 none broadcast)
full(t3) parallel(t3, 2)
When you upgrade you often find that some little detail (of the optimizer) that didn’t receive a lot of attention in the “New Features” manuals introduces a few dramatic changes in execution plans. Here’s one example of a detail that is likely to catch a few unlucky people. We start with a very simple table which is just and id column with some padding, and then show the effect of a change in the handling of “constant subqueries”. Here’s my data set:
A recent question on the OTN database forum included an execution plan that prompted one reader to ask: “but where has the existence subquery gone?” Here’s the original question showing the query, and here’s the later response showing the plan that prompted the question.
There were three possible reasons why that question may have been posed:
Here’s the output I got from a 10.2.0.5 system after generating a stored outline on a query – then dropping the index that was referenced by the stored outline and creating an alternative index. Spot the problem: