Oracle Scratchpad

April 4, 2014

NVL() change

Filed under: CBO,Oracle,Troubleshooting,Upgrades — Jonathan Lewis @ 6:10 pm BST Apr 4,2014

One of the problems of functions is that the optimizer generally doesn’t have any idea on how a predicate based on function(col) might affect the cardinality. However,  the optimizer group are constantly refining the algorithms to cover an increasing number of special cases more accurately. This is a good thing, of course – but it does mean that you might be unlucky on an upgrade where a better cardinality estimate leads to a less efficient execution plan. Consider for example the simple query (where d1 is column of type date):

select	*
from	t1
where	nvl(d1,to_date('01-01-1900','dd-mm-yyyy')) < sysdate

Now, there are many cases in many versions of Oracle, where the optimizer will appear to calculate the cardinality of

nvl(columnX,{constant}) operator {constant}

as if it were:

columnX is null or columnX operator {constant}

Unfortunately this doesn’t seem to be one of them – until you get to 11.2.something. Here’s a little demonstration code:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	case
		when rownum > 100 then null else sysdate - rownum
	end	d1
from
	generator	v1,
	generator	v2
where
	rownum <= 50000
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

set autotrace traceonly explain

prompt	query with NVL

select	*
from	t1
where	nvl(d1,to_date('01-01-1900','dd-mm-yyyy')) < sysdate
;

prompt	query with OR clause

select	*
from	t1
where	d1 is null or d1 < sysdate
;

If you run this code in 11.1.0.7 you get the following – with numeric variations for cost (which I’m interested not in at the moment):


query with NVL
==============
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  2500 |  2500 |    18  (39)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  2500 |  2500 |    18  (39)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NVL("D1",TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))<SYSDATE@!)

query with OR clause
====================
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 50000 | 50000 |    13  (16)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 50000 | 50000 |    13  (16)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1" IS NULL OR "D1"<SYSDATE@!)

Take note particularly of the difference in the estimated cardinality for the tablescans.

When you upgrade to 11.2.0.4 (possibly earlier – though there are some nvl() related patches that appeared only in 11.2.0.4), you get this:


query with NVL
==============
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 50000 | 50000 |    18  (39)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 50000 | 50000 |    18  (39)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(NVL("D1",TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))<SYSDATE@!)

query with OR clause
====================
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 49900 | 49900 |    13  (16)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 49900 | 49900 |    13  (16)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("D1" IS NULL OR "D1"<SYSDATE@!)

As you can see the estimate for the “NVL()” example is now correct – which means vastly different from the estimate in 11.1.0.7 which seemed to be using the standard “5% for range-based predicate on function(col)”.

It’s interesting to note that a (relatively) small error has crept in to the “OR” example – interestingly the size of the error is exactly the number of rows where d1 is not null (which looks like enough of a coincidence to be a bug – but maybe there’s a good rationale for it)

Conclusion

Yet again, a simple upgrade has the capacity to make a dramatic change to a cardinality estimate – which could mean a significant change to an execution plan and major change in performance. If you’ve read this note, though, you may recognise the pattern that is the driving feature behind the problem.

Footnote:

If you have access to versions 11.2.0.1 through 11.2.0.3 and find that the test data produces different cardinalities please publish the results in the comments – it would be nice to know exactly when this change appears.  (You need only show the body of the execution plans  with labels, not the entire output of the script).

 

March 7, 2014

Subquery Anomaly

Filed under: Bugs,CBO,Execution plans,Oracle,Upgrades — Jonathan Lewis @ 8:57 am BST Mar 7,2014

Here’s an oddity that appeared on the OTN database forum last night:

We have this query in our application which works fine in 9i but fails in 11gR2 (on Exadata) giving an “ORA-00937: not a single-group group function” error….

… The subquery is selecting a column and it doesn’t have a group by clause at all. I am not sure how is this even working in 9i. I always thought that on a simple query using an aggregate function (without any analytic functions / clause), we cannot select a column without having that column in the group by clause. So, how 11g behaves was not a surprise but surprised to see how 9i behaves. Can someone explain this behaviour?

The poster supplied the suspect query, and it certainly looked as if it should never have worked – but I took a guess that the optimizer was doing some sort of transformation that concealed the problem before the optimizer managed to see the error. The subquery was a little odd because it was doing something it didn’t need to do, and my was guess that the optimizer had recognised the option to simplify the query and the simplification had “accidentally” removed the error. This turned out to be correct, but my guess about exactly what had happened to hide the error was wrong.

Having created a hypothesis I couldn’t resist checking it this morning, so here’s the test case (don’t pay any attention to the actual data I’ve generated, it was a cut-n-paste from a script that I had previously used for something completely different):

create table t1
as
select
	trunc((rownum-1)/15)	n1,
	trunc((rownum-1)/15)	n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000
;

create table t2
as
select
	mod(rownum,200)		n1,
	mod(rownum,200)		n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000
;

begin
 	dbms_stats.gather_table_stats(
 		user,
 		't1',
 		method_opt => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		user,
		't2',
		method_opt => 'for all columns size 1'
	);
end;
/

explain plan for
select
	/*+ qb_name(main) */
	*
from t1
where (n2,n1) in (
	select /*+
			qb_name(subq)
			unnest
		*/
		max(t2.n2), t2.n1
	from t2
	where t2.n1 = t1.n1
)
;

You’ll notice, of course, that I don’t have a group by clause at all, so the presence of the t2.n1 in the select list should lead to Oracle error: “ORA-00937: not a single-group group function”.

In versions from 8i to 11.1.0.7, this query could run, and its execution plan looked looked like this:


----------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |   200 | 45200 |    46 |
|*  1 |  HASH JOIN           |         |   200 | 45200 |    46 |
|   2 |   VIEW               | VW_SQ_1 |   200 |  7800 |    31 |
|   3 |    HASH GROUP BY     |         |   200 |  2400 |    31 |
|   4 |     TABLE ACCESS FULL| T2      |  3000 | 36000 |    14 |
|   5 |   TABLE ACCESS FULL  | T1      |  3000 |   547K|    14 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("N2"="MAX(T2.N2)" AND "N1"="N1" AND "ITEM_1"="T1"."N1")

Notice how the optimizer has produced an inline view (VW_SQ_1) from the subquery, using it to drive a hash join; notice how that inline view has an aggregation operation (HASH GROUP BY) in it. In effect the optimizer has rewritten my query like this:

select
	t1.*
from	(
		select
			distinct max(t2.n2) max_n2, t2.n1 item_1, t2.n1
		from	t2
		group by
			t2.n1
	)	vw_sq_1,
	t1
where
	t1.n2 = vw_sq_1.max_n2
and	t1.n1 = vw_sq_1.n1
and	t1.n1 = vw_sq_1.item_1
;

There’s a clue about why this succeeded in the 10053 trace file, which includes the lines:

"Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:   Passed validity checks.

Compared to the 11.2 lines:

Subquery Unnesting on query block MAIN (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block MAIN (#1).
SU:   Checking validity of unnesting subquery SUBQ (#2)
SU:     SU bypassed: Failed basic validity checks.
SU:   Validity checks failed.

Whatever check it was that Oracle introduced in 11.2 (maybe a check that the query block was inherently legal), unnesting failed – and if I add an /*+ no_unnest */ hint to the original subquery in the earlier versions of Oracle I get the expected ORA-00937.

The philosophical argument is left to the reader: was the original behaviour a bug, or is the new behaviour the bug ?

 

February 28, 2014

Empty Hash

Filed under: Bugs,CBO,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 6:45 pm BST Feb 28,2014

A little while ago I highlighted a special case with the MINUS operator (that one of the commentators extended to include the INTERSECT operator) relating to the way the second subquery would take place even if the first subquery produced no rows. I’ve since had an email from an Oracle employee letting me know that the developers looked at this case and decided that it wasn’t feasible to address it because – taking a wider view point – if the query were to run parallel they would need a mechanism that allowed some synchronisation between slaves so that every slave could find out that none of the slaves had received no rows from the first subquery, and this was going to lead to hanging problems.

The email reminded me that there’s another issue of the same kind that I discovered several years ago – I thought I’d written it up, but maybe it was on a newsgroup or forum somewhere, I can’t find it on my blog or old website). The problem can be demonstrated by this example:

(more…)

February 26, 2014

Predicate Order

Filed under: Bugs,CBO,Execution plans,Oracle — Jonathan Lewis @ 8:14 am BST Feb 26,2014

Common internet question: does the order of predicates in the where clause make a difference.
General answer: It shouldn’t, but sometimes it will thanks to defects in the optimizer.

There’s a nicely presented example on the OTN database forum where predicate order does matter (between 10.1.x.x and 11.1.0.7). Note particularly – there’s a script to recreate the issue; note, also, the significance of the predicate section of the execution plan.
It’s bug 6782665, fixed in 11.2.0.1

January 3, 2014

Index Hash

Filed under: Bugs,CBO,Hints,Ignoring Hints,Index Joins,Indexing,Oracle — Jonathan Lewis @ 6:56 pm BST Jan 3,2014

I’m afraid this is one of my bad puns again – an example of the optimizer  making a real hash of the index hash join. I’m going to create a table with several indexes (some of them rather similar to each other) and execute a query that should do an index join between the obvious two indexes. To show how obvious the join should be I’m going to start with a couple of queries that show the cost of simple index fast full scans.

Here’s the data generating code:

(more…)

December 8, 2013

12c Subqueries

Filed under: 12c,CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 11:32 am BST Dec 8,2013

When you upgrade you often find that some little detail (of the optimizer) that didn’t receive a lot of attention in the “New Features” manuals introduces a few dramatic changes in execution plans. Here’s one example of a detail that is likely to catch a few unlucky people. We start with a very simple table which is just and id column with some padding, and then show the effect of a change in the handling of “constant subqueries”. Here’s my data set:

(more…)

November 4, 2013

Outline hassle

Filed under: CBO,Execution plans,Oracle — Jonathan Lewis @ 6:17 pm BST Nov 4,2013

Here’s the output I got from a 10.2.0.5 system after generating a stored outline on a query – then dropping the index that was referenced by the stored outline and creating an alternative index. Spot the problem:

(more…)

October 2, 2013

FBI decode

Filed under: Bugs,CBO,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 8:26 am BST Oct 2,2013

It probably won’t surprise many people to hear me say that the decode() function can be a bit of a nuisance; and I’ll bet that quite a lot of people have had trouble occasionally trying to get function-based indexes that use this function to behave properly. So (to put it all together and support the general directives that case is probably a better choice than decode() and that the cast() operator is an important thing to learn) here’s an example of how function-based indexes don’t always allow you to work around bad design/code. (Note: this is a model of a problem I picked up at a client site, stripped to a minimum – you have to pretend that I’m not allowed to fix the problem by changing code).

(more…)

September 27, 2013

Virtual Stats

Filed under: CBO,Execution plans,Oracle,Statistics — Jonathan Lewis @ 6:49 am BST Sep 27,2013

Or – to be more accurate – real statistics on a virtual column.

This is one of the “10 top tips” that I came up with for my session with Maria Colgan at OOW13. A method of giving more information that might improve execution plans when you can change the code. I’ll start with a small data set including a virtual column (running 11.1.0.7), and a couple of problem queries:
(more…)

September 25, 2013

Extended Stats

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 5:06 pm BST Sep 25,2013

Here’s a little demo cut-n-pasted from a session running Oracle 12.1.0.1 (it works on 11g, too). All it does is create a table by copying from a well-known table, gather extended stats on a column group, then show you the resulting column names by querying view user_tab_cols.

(more…)

September 7, 2013

Hash Joins

Filed under: CBO,Execution plans,Hints,Oracle,Tuning — Jonathan Lewis @ 12:53 pm BST Sep 7,2013

I’ve written notes about the different joins in the past – but such things are always worth revisiting, so here’s an accumulated bundle of comments about hash joins.

A hash join takes two inputs that (in most of the Oracle literature) are referred to as the “build table” and the “probe table”. These rowsources may be extracts from real tables, or indexes, or might be result sets from previous joins. Oracle uses the “build table” to build a hash table in memory, consuming and using the rowsource in a single call; it then consumes the “probe table” one row at a time, probing the in-memory hash table to find a match.  Access to the hash table is made efficient by use of a hashing function that has been used on the join columns – rows with the same value on the join column end up hashing to the same place in the hash table. It is possible for different input values to produce the same hash value (a hash collision) so Oracle still has to check the actual values once it has identified “probable” joins in the hash table. Because the comparison is based on a hash value, hash joins cannot be used for range-based comparisons.
(more…)

August 15, 2013

MV Refresh

Filed under: Bugs,CBO,Infrastructure,Oracle,Statistics — Jonathan Lewis @ 6:12 pm BST Aug 15,2013

Here’s a funny little problem I came across some time ago when setting up some materialized views. I have two tables, orders and order_lines, and I’ve set up materialized view logs for them that allow a join materialized view (called orders_join) to be fast refreshable. Watch what happens if I refresh this view just before gathering stats on the order_lines table.

(more…)

August 13, 2013

NVL()

Filed under: CBO,Conditional SQL,Execution plans,NULL,Oracle — Jonathan Lewis @ 7:14 am BST Aug 13,2013

Here’s a little detail about how the optimizer can handle the nvl() function that I hadn’t noticed before (and it goes back to at least 8i). This is running on 11.2.0.3, and table t1 is just all_objects where rownum <= 20000:

(more…)

August 5, 2013

Bloom Filter

Filed under: 12c,CBO,Execution plans,Oracle — Jonathan Lewis @ 9:22 pm BST Aug 5,2013

I’ve posted this note as a quick way of passing on an example prompted by a twitter conversation with Timur and Maria about Bloom filters:

The Bloom filter (capital B because it’s named after a person) is not supposed to appear in Oracle plans unless the query is executing in parallel but here’s an example which seems to use a serial Bloom filter.  Running in 11.2.0.3 and 12.1.0.1 (the results shown are the latter – the numbers are slightly different between versions):

(more…)

July 27, 2013

Clustering_factor

Filed under: CBO,Indexing,Oracle — Jonathan Lewis @ 7:21 am BST Jul 27,2013

The clustering_factor is one of the most important numbers (if not the most important number) affecting the optimizer’s choice of execution plan – it’s the thing that has the most significant effect on the optimizer’s decision on whether to choose a table scan or an index, and on which index to choose.

(more…)

Next Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,528 other followers