Oracle Scratchpad

June 8, 2012

Unique Fail

Filed under: Bugs,CBO,Execution plans,Oracle,Statistics — Jonathan Lewis @ 5:54 pm BST Jun 8,2012

As in – how come a unique (or primary key) index is predicted to return more than one row using a unique scan. Here’s and example (running on – but the same type of thing happens on newer versions):

create table t1
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
	rownum			id,
	rownum			n1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
	generator	v1,
	generator	v2
	rownum <= 1e4

create table t2
select	*
from	t1
	mod(id,10) = 0

-- do something about stats

alter table t1 add constraint t1_pk primary key (id) using index (
	create unique index t1_pk on t1(id)

	t2, t1
where between 100 and 200
and = t2.n1
and	t1.n1 is not null

| Id  | Operation                    | Name  | Rows  | Bytes | Cost  |
|   0 | SELECT STATEMENT             |       |    24 |  3048 |    16 |
|   1 |  NESTED LOOPS                |       |    24 |  3048 |    16 |
|*  2 |   TABLE ACCESS FULL          | T2    |    12 |  1428 |     4 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| T1    |     2 |    16 |     1 |
|*  4 |    INDEX UNIQUE SCAN         | T1_PK |     1 |       |       |

Predicate Information (identified by operation id):
   2 - filter("T2"."ID">=100 AND "T2"."ID"<=200)
   3 - filter("T1"."N1" IS NOT NULL)
   4 - access("T1"."ID"="T2"."N1")

Notice line 3 – the table access to t1 driven by line 4 (the unique scan of unique index t1_pk) which is predicted to return 2 rows. Clearly something has gone wrong with the optimizer since this is not a possible outcome at runtime.

There are probably a number of paths through the optimizer code that could result in unexpected results like this, and many of them are probably “emergent properties” rather than simple behaviour (for example, the combination of a subquery factoring clause that has been moved inline where it has then accepted a pushed join predicate might lead to an odd combination of calculations that doesn’t make sense). Before getting too excited, though, if you’re seeing an anomaly like this it’s worth taking a quick look at the statistics, you may simply have an unlucky variation in the timing and sample sizes used that happens to have produced a quirky bit of arithmetic.

Note in the example above the place where I commented: “do something about stats”. Normally that line would read “gather stats at this point”; in this case I did something a little bizarre to make sure that the critical table, index, and columns statistics were not self-consistent. In effect I managed to create some stats that said: “every distinct value for the id column will identify two rows – even though it’s got a unique constraint”. Naturally the optimizer arithmetic “believed” the statistics.

I don’t think the specific scenario I engineered in my example can actually happen in newer versions of Oracle – so I’m not going to tell you exactly what I did to force it – but do remember that odd effects can appear simply because the stats collection didn’t quite work the way you wanted.

In 11g, of course, you can test this hypothesis fairly simply by collecting pending stats, and then enabling them in your own session to see what effect they have, without running the risk of damaging public execution plans. It’s also possible that if you use the auto_sample_size in 11g with the approximate NDV feature enabled then this problem will be less likely to appear.


  1. I don’t see any DDL in your script to create that index (T1_PK)?

    Comment by Colin 't Hart — June 9, 2012 @ 2:25 pm BST Jun 9,2012 | Reply

  2. Reblogged this on lava kafle kathmandu nepal.

    Comment by lkafle — June 10, 2012 @ 3:30 am BST Jun 10,2012 | Reply

  3. > In 11g, of course, you can test this hypothesis fairly simply by collecting pending stats, and then enabling them in your own session to see what effect they have, without running the risk of damaging public execution plans

    Just a word of caution: At least up to not all DBMS_STATS call are “pending” statistics mode aware, which means that some statistics manipulations will still be immediately visible although PENDING statistics have been enabled for a particular table, More details on my post:


    Comment by Randolf Geist — June 10, 2012 @ 1:34 pm BST Jun 10,2012 | Reply

  4. Very interessting. It’s nice to see with a real exemple how much impact have the data statistics and system statistics on the execution plans.

    Comment by Guillaume Goulet-Vallières — June 13, 2012 @ 8:36 pm BST Jun 13,2012 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by