Oracle Scratchpad

March 7, 2014

Subquery Anomaly

Filed under: Bugs,CBO,Execution plans,Oracle,Upgrades — Jonathan Lewis @ 8:57 am GMT Mar 7,2014

Here’s an oddity that appeared on the OTN database forum last night:

We have this query in our application which works fine in 9i but fails in 11gR2 (on Exadata) giving an “ORA-00937: not a single-group group function” error….

… The subquery is selecting a column and it doesn’t have a group by clause at all. I am not sure how is this even working in 9i. I always thought that on a simple query using an aggregate function (without any analytic functions / clause), we cannot select a column without having that column in the group by clause. So, how 11g behaves was not a surprise but surprised to see how 9i behaves. Can someone explain this behaviour?

The poster supplied the suspect query, and it certainly looked as if it should never have worked – but I took a guess that the optimizer was doing some sort of transformation that concealed the problem before the optimizer managed to see the error. The subquery was a little odd because it was doing something it didn’t need to do, and my was guess that the optimizer had recognised the option to simplify the query and the simplification had “accidentally” removed the error. This turned out to be correct, but my guess about exactly what had happened to hide the error was wrong.

Having created a hypothesis I couldn’t resist checking it this morning, so here’s the test case (don’t pay any attention to the actual data I’ve generated, it was a cut-n-paste from a script that I had previously used for something completely different):

create table t1
as
select
	trunc((rownum-1)/15)	n1,
	trunc((rownum-1)/15)	n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000
;

create table t2
as
select
	mod(rownum,200)		n1,
	mod(rownum,200)		n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000
;

begin
 	dbms_stats.gather_table_stats(
 		user,
 		't1',
 		method_opt => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		user,
		't2',
		method_opt => 'for all columns size 1'
	);
end;
/

explain plan for
select
	/*+ qb_name(main) */
	*
from t1
where (n2,n1) in (
	select /*+
			qb_name(subq)
			unnest
		*/
		max(t2.n2), t2.n1
	from t2
	where t2.n1 = t1.n1
)
;

You’ll notice, of course, that I don’t have a group by clause at all, so the presence of the t2.n1 in the select list should lead to Oracle error: “ORA-00937: not a single-group group function”.

In versions from 8i to 11.1.0.7, this query could run, and its execution plan looked looked like this:


----------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |   200 | 45200 |    46 |
|*  1 |  HASH JOIN           |         |   200 | 45200 |    46 |
|   2 |   VIEW               | VW_SQ_1 |   200 |  7800 |    31 |
|   3 |    HASH GROUP BY     |         |   200 |  2400 |    31 |
|   4 |     TABLE ACCESS FULL| T2      |  3000 | 36000 |    14 |
|   5 |   TABLE ACCESS FULL  | T1      |  3000 |   547K|    14 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("N2"="MAX(T2.N2)" AND "N1"="N1" AND "ITEM_1"="T1"."N1")

Notice how the optimizer has produced an inline view (VW_SQ_1) from the subquery, using it to drive a hash join; notice how that inline view has an aggregation operation (HASH GROUP BY) in it. In effect the optimizer has rewritten my query like this:

select
	t1.*
from	(
		select
			distinct max(t2.n2) max_n2, t2.n1 item_1, t2.n1
		from	t2
		group by
			t2.n1
	)	vw_sq_1,
	t1
where
	t1.n2 = vw_sq_1.max_n2
and	t1.n1 = vw_sq_1.n1
and	t1.n1 = vw_sq_1.item_1
;

There’s a clue about why this succeeded in the 10053 trace file, which includes the lines:

"Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:   Passed validity checks.

Compared to the 11.2 lines:

Subquery Unnesting on query block MAIN (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block MAIN (#1).
SU:   Checking validity of unnesting subquery SUBQ (#2)
SU:     SU bypassed: Failed basic validity checks.
SU:   Validity checks failed.

Whatever check it was that Oracle introduced in 11.2 (maybe a check that the query block was inherently legal), unnesting failed – and if I add an /*+ no_unnest */ hint to the original subquery in the earlier versions of Oracle I get the expected ORA-00937.

The philosophical argument is left to the reader: was the original behaviour a bug, or is the new behaviour the bug ?

 

March 5, 2014

12c pq_replicate

Filed under: 12c,Exadata,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 5:39 pm GMT Mar 5,2014

Another day, another airport lounge – another quick note: one of the changes that appeared in 12c was a tweak to the “broadcast” distribution option of parallel queries. I mentioned this in a footnote to a longer article a couple of months ago; this note simply expands on that brief comment with an example. We’ll start with a simple two-table hash join – which I’ll first construct and demonstrate in 11.2.0.4:


create table t1
as
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 3000
)
select
	rownum				n1,
	lpad(rownum,6,'0')		small_vc,
	lpad(rownum,200,'0')		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1000
;

create table t2
as
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 3000
)
select
	1 + mod(rownum,10000)			n1,
	lpad(1 + mod(rownum,10000),6,'0')	small_vc,
	lpad(rownum,500,'0')			padding
from
	generator	v1,
	generator	v2
where
	rownum <= 20000 ;

-- collect stats, no histograms.

select
  	/*+
  		leading(t1 t2)
 		parallel(t1 2)
 		parallel(t2 2)
 		use_hash(t2)
 	*/
 	t1.padding,
 	t2.padding
from 	t1, t2
where	t2.n1 = t1.n1
and	t2.small_vc = t1.small_vc
;

-------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost  |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |  1000 |   707K|   135 |        |      |            |
|   1 |  PX COORDINATOR         |          |       |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10001 |  1000 |   707K|   135 |  Q1,01 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN            |          |  1000 |   707K|   135 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE          |          |  1000 |   207K|     4 |  Q1,01 | PCWP |            |
|   5 |      PX SEND BROADCAST  | :TQ10000 |  1000 |   207K|     4 |  Q1,00 | P->P | BROADCAST  |
|   6 |       PX BLOCK ITERATOR |          |  1000 |   207K|     4 |  Q1,00 | PCWC |            |
|   7 |        TABLE ACCESS FULL| T1       |  1000 |   207K|     4 |  Q1,00 | PCWP |            |
|   8 |     PX BLOCK ITERATOR   |          | 20000 |     9M|   131 |  Q1,01 | PCWC |            |
|   9 |      TABLE ACCESS FULL  | T2       | 20000 |     9M|   131 |  Q1,01 | PCWP |            |
-------------------------------------------------------------------------------------------------

In this plan slave set 2 scans table t1 in parallel and broadcasts the result set to slave set 1 (lines 5 – 7). The significance of the broadcast option is that each slave in slave set 2 sends all the rows it has read to every slave in slave set 1. For a fairly large table with a high degree of parallelism this could be a lot of inter-process communication; the total number of rows passing through the PX message pool is “DOP x number of row filtered from t1″.

After a slave in slave set 1 has receive the whole of the t1 result set it builds an in-memory hash table and starts scanning rowid ranges (PX BLOCK ITERATOR) from table t2, probing the in-memory hash table to effect the join (lines 3,4, 8,9). Since each slave has a copy of the whole result set from t1 it can scan any chunk of t2 and handle the contents locally. Moreover, because slave set 1 isn’t reading its second input from a virtual table it is able to write its output immediately the virtual table (:TQ10001) that feeds the query coordinator with the result (lines 1,2) – we don’t have to do a “hash join buffered” operation and buffer the entire second input before starting to execute the join.

So how does 12c change things. With the same starting data and query, here’s the execution plan:

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost  |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |  1000 |   707K|   135 |        |      |            |
|   1 |  PX COORDINATOR       |          |       |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM) | :TQ10000 |  1000 |   707K|   135 |  Q1,00 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN          |          |  1000 |   707K|   135 |  Q1,00 | PCWP |            |
|   4 |     TABLE ACCESS FULL | T1       |  1000 |   207K|     4 |  Q1,00 | PCWP |            |
|   5 |     PX BLOCK ITERATOR |          | 20000 |     9M|   131 |  Q1,00 | PCWC |            |
|   6 |      TABLE ACCESS FULL| T2       | 20000 |     9M|   131 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------

Notice, in particular, that we only have one virtual table (or table queue :TQ10000) rather than two – and that’s from a parallel query slave set to the query co-ordinator, parallel to serial; the query only uses one set of parallel query slaves. Until you run the query with rowsource execution statistics enabled and look at the output from v$pq_tqstat it’s not going to be immediately obvious what has happened, but we should see that somehow Oracle is no longer broadcasting the first table even though it’s still doing something in parallel with both tables.

The run-time statistics confirm that we’ve only used one set of slaves, and each slave in the slave set has scanned the whole of table t1. This means each slave can build the full hash table and then go on to read rowid ranges from table t2. We’ve managed to get the benefit of broadcasting t1 (every slave has the whole of t1 so we don’t have to scan and distribute the big table t2 through the PX message pool) but we haven’t had to clone it multiple times through the PX message pool.

Clearly there’s a trade-off here that Oracle Corp. has decided is worth considering. I’m guessing it’s biased towards Exadata where you might run queries with a very high degree of parallelism. In that case the overhead of task switching as large numbers of messages are passed around may (and this is pure supposition) be greater than the added cost of loading the table into the buffer cache (of each instance) and having each slave scan it from there. (Reminder – 11g introduced two “opposite” changed to tablescans: “serial direct reads” and “parallel in-memory scans”.)

There’s one little oddity in this replication – there’s a pair of hints: pq_replicate and no_pq_replicate to control the effect if you think the optimizer is making the wrong choice. I would have guessed that in my example the hint would read: /*+ pq_replicate(t1) */ as it’s table t1 that is read by every single slave. Strangely, though, this is what the outline section of the execution plan showed:


  /*+
      BEGIN_OUTLINE_DATA
      PQ_REPLICATE(@"SEL$1" "T2"@"SEL$1")
      PQ_DISTRIBUTE(@"SEL$1" "T2"@"SEL$1" BROADCAST NONE)
      USE_HASH(@"SEL$1" "T2"@"SEL$1")
      LEADING(@"SEL$1" "T1"@"SEL$1" "T2"@"SEL$1")
      FULL(@"SEL$1" "T2"@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      OPT_PARAM('_optimizer_cost_model' 'io')
      DB_VERSION('12.1.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Notice how the hint specifies table t2, not table t1 !

Footnote

Here’s a little anomaly,  and a generic warning about “optimizer_features_enable”: I found that if I used the hint /*+ optimizer_features_enable(‘11.2.0.4′) */ in 12c I could still get the pq_replicate() hint to work. Unfortunately there are a few places where the hint (or parameter) isn’t guaranteed to take the optimizer code backwards the full 100%.

February 28, 2014

Empty Hash

Filed under: Bugs,CBO,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 6:45 pm GMT Feb 28,2014

A little while ago I highlighted a special case with the MINUS operator (that one of the commentators extended to include the INTERSECT operator) relating to the way the second subquery would take place even if the first subquery produced no rows. I’ve since had an email from an Oracle employee letting me know that the developers looked at this case and decided that it wasn’t feasible to address it because – taking a wider view point – if the query were to run parallel they would need a mechanism that allowed some synchronisation between slaves so that every slave could find out that none of the slaves had received no rows from the first subquery, and this was going to lead to hanging problems.

The email reminded me that there’s another issue of the same kind that I discovered several years ago – I thought I’d written it up, but maybe it was on a newsgroup or forum somewhere, I can’t find it on my blog or old website). The problem can be demonstrated by this example:

(more…)

February 26, 2014

Predicate Order

Filed under: Bugs,CBO,Execution plans,Oracle — Jonathan Lewis @ 8:14 am GMT Feb 26,2014

Common internet question: does the order of predicates in the where clause make a difference.
General answer: It shouldn’t, but sometimes it will thanks to defects in the optimizer.

There’s a nicely presented example on the OTN database forum where predicate order does matter (between 10.1.x.x and 11.1.0.7). Note particularly – there’s a script to recreate the issue; note, also, the significance of the predicate section of the execution plan.
It’s bug 6782665, fixed in 11.2.0.1

February 10, 2014

RAC Plans

Filed under: Execution plans,Hints,Oracle,RAC,Troubleshooting — Jonathan Lewis @ 1:12 pm GMT Feb 10,2014

Recently appeared on Mos – “Bug 18219084 : DIFFERENT EXECUTION PLAN ACROSS RAC INSTANCES”

Now, I’m not going to claim that the following applies to this particular case – but it’s perfectly reasonable to expect to see different plans for the same query on RAC, and it’s perfectly possible for the two different plans to have amazingly different performance characteristics; and in this particular case I can see an obvious reason why the two nodes could have different plans.

Here’s the query reported in the bug:

(more…)

February 6, 2014

12c fixed subquery

Filed under: 12c,Execution plans,Oracle,subqueries — Jonathan Lewis @ 2:25 pm GMT Feb 6,2014

Here’s a simple little demonstration of an enhancement to the optimizer in 12c that may result in some interesting changes in execution plans as cardinality estimates change from “guesses” to accurate estimates.

(more…)

February 5, 2014

Minus

Filed under: Execution plans,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 5:42 pm GMT Feb 5,2014

Here’s a little script to demonstrate an observation about a missed opportunity for avoiding work that appeared in my email this morning (that’s morning Denver time):

(more…)

January 23, 2014

Optimisation

Filed under: Execution plans,Oracle,Performance — Jonathan Lewis @ 6:05 pm GMT Jan 23,2014

Here’s a recent request from the OTN database forum – how do you make this query go faster (tkprof output supplied):

 select a.rowid
   from  a, b
   where A.MARK IS NULL
     and a.cntry_code = b.cntry_code and b.dir_code='XX' and b.numb_type='XXX'
     and upper(Trim(replace(replace(replace(replace(replace(replace(replace(a.co_name,'*'),'&'),'-'),'/'),')'),'('),' '))) like
         upper(Trim(substr(replace(replace(replace(replace(replace(replace(replace(b.e_name,'*'),'&'),'-'),'/'),')'),'('),' '),1,25)))||'%';

(more…)

January 2, 2014

Conditional SQL – 4

Filed under: Conditional SQL,Execution plans,Oracle — Jonathan Lewis @ 6:14 pm GMT Jan 2,2014

This is one of those posts where the investigation is left as an exercise – it’s not difficult, just something that will take a little time that I don’t have, and just might end up with me chasing half a dozen variations (so I’d rather not get sucked into looking too closely). It comes from an OTN question which ends up reporting this predicate:

WHERE ( LENGTH ( :b7) IS NULL OR
         UPPER (TRIM (CODFSC)) = UPPER (TRIM ( :b8)) or
         UPPER (TRIM (CODUIC)) = UPPER (TRIM ( :b9)))
       AND STATE = 0;

The three bind variables all hold the same value; there is a function-based index on upper(trim(codfsc)), and another on upper(trim(coduic)). The execution plan for this query is a full tablescan, but if you eliminate the first predicate Oracle can do a concatenation of two index range scans. This variation doesn’t surprise me, the optimizer’s ability to introduce concatenation is limited; however, I did wonder whether some small variation in the SQL would allow the optimizer to get just a little more clever.

Would you get concatenation if you changed the first predicate to (:b7 is null); if not, would a similar query that didn’t depend on function-based indexes do concatenation; if not is there any rewrite of this query that could do a tablescan ONLY for the case where :b7 was null ?

Demonstrations of any levels of success can be left in the comments if anyone’s interested. To get a fixed font that preserves space start the code with “sourcecode” and end with “/sourcecode” (removing the quotation marks and replacing them with square brackets).

December 16, 2013

Unnest Oddity

Filed under: Execution plans,Hints,Oracle,subqueries — Jonathan Lewis @ 6:56 pm GMT Dec 16,2013

Here’s a little oddity I came across in 11.2.0.4 a few days ago – don’t worry too much about what the query is trying to do, or why it has been written the way I’ve done it, the only point I want to make is that I’ve got the same plan from two different strategies (according to the baseline/outline/hints), but the plans have a difference in cost.

(more…)

December 10, 2013

Subquery

Filed under: Oracle,subqueries,Tuning — Jonathan Lewis @ 6:26 pm GMT Dec 10,2013

How not to write subqueries:

(more…)

December 8, 2013

Parallel Execution – 3

Filed under: Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 10:09 pm GMT Dec 8,2013

It’s finally time to take a close look at the parallel versions of the execution plan I produced a little while ago for a four-table hash join. In this note I’ll examine the broadcast parallel distribution. First, here’s a list of the hints I’m going to use to get the effect I want:

	/*+
		leading(t4 t1 t2 t3)
		full(t4) parallel(t4, 2)
		use_hash(t1) swap_join_inputs(t1) pq_distribute(t1 none broadcast)
		full(t1) parallel(t1, 2)
		use_hash(t2) swap_join_inputs(t2) pq_distribute(t2 none broadcast)
		full(t2) parallel(t2, 2)
		use_hash(t3) swap_join_inputs(t3) pq_distribute(t3 none broadcast)
		full(t3) parallel(t3, 2)
		monitor
	*/

(more…)

12c Subqueries

Filed under: 12c,CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 11:32 am GMT Dec 8,2013

When you upgrade you often find that some little detail (of the optimizer) that didn’t receive a lot of attention in the “New Features” manuals introduces a few dramatic changes in execution plans. Here’s one example of a detail that is likely to catch a few unlucky people. We start with a very simple table which is just and id column with some padding, and then show the effect of a change in the handling of “constant subqueries”. Here’s my data set:

(more…)

November 29, 2013

Interesting Plan

Filed under: Execution plans,Oracle,subqueries — Jonathan Lewis @ 8:56 am GMT Nov 29,2013

A recent question on the OTN database forum included an execution plan that prompted one reader to ask: “but where has the existence subquery gone?” Here’s the original question showing the query, and here’s the later response showing the plan that prompted the question.

There were three possible reasons why that question may have been posed:

(more…)

November 4, 2013

Outline hassle

Filed under: CBO,Execution plans,Oracle — Jonathan Lewis @ 6:17 pm GMT Nov 4,2013

Here’s the output I got from a 10.2.0.5 system after generating a stored outline on a query – then dropping the index that was referenced by the stored outline and creating an alternative index. Spot the problem:

(more…)

« Previous PageNext Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,521 other followers