Oracle Scratchpad

February 25, 2014

FBI Skip Scan

Filed under: Bugs,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 6:45 pm GMT Feb 25,2014

A recent posting on the OTN database forum highlighted a bug (or defect, or limitation) in the way that the optimizer handles index skip scans with “function-based” indexes – it doesn’t do them. The defect has probably been around for a long time and demonstrates a common problem with testing Oracle – it’s very easy for errors in the slightly unusual cases to be missed; it also demonstrates a general principle that it can take some time for a (small) new feature to be applied consistently across the board.

The index definitions in the original posting included expressions like substr(nls_lower(colX), 1, 25), and it’s possible for all sorts of unexpected effects to appear when your code starts running into NLS ¬†settings, so I’ve created a much simpler example. Here’s my table definition, with three index definitions:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	mod(rownum - 1,10)		mod1,
	rownum				id1,
	1 + mod(rownum - 1,10)		mod2,
	1 + rownum 			id2,
	lpad(rownum,7)			small_vc,
	rpad('x',100)			padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5 
; 

create index t1_normal on t1(mod2, id2); 
create index t1_split  on t1(mod2, id1 + 1); 
create index t1_fbi    on t1(mod1 + 1, id1 + 1); 

begin
	dbms_stats.gather_table_stats( 
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

Note that I gathered stats after creating the indexes (generally you would be safe creating the indexes after the gathering the stats from 10g onwards) because I needed to gather stats on the virtual columns that support the function-based indexes. As you can see, mod2 = 1 + mod1, and id2 = 1 + id1, and I’ve created three indexes which (internally) are the same although they are defined in three different ways.

So here are three queries which select the same data although, again, the queries are not all exactly the same.¬†Since the queries are doing the same things with structures that are the same we might hope to see the optimizer using the same strategy for all three. In fact, to make things easier for the optimizer, I’ve even told it exactly what to do in two of the cases:


select	*
from	t1
where
	id2 = 50000
;

select	/*+ index_ss(t1(mod2)) */
	*
from	t1
where
	id1 + 1 = 50000
;

select	/*+ index_ss(t1 t1_fbi) */
	*
from	t1
where
	id1 + 1 = 50000
;

In the first query I’ve referenced only the second column of the “normal” index with a high precision query – expecting the optimizer to find the index skip scan path. In the second and third queries I’ve reference the (id1 + 1) index expression which appears as the second column in two different indexes; to help the optimizer along I’ve hinted the index skip scan for each of the indexes in turn.

If the first query can do an index skip scan then the second and third queries should be able to do the same because the physical structures and the statistics (and the actual stored and requested values) are all the same. Here are the three plans (pulled from memory in an 11.2.0.4 instance).


SQL_ID  bvjstz9xa3n7a, child number 0
-------------------------------------
select * from t1 where  id2 = 50000

Plan hash value: 2078113469

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |    12 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |     1 |   125 |    12   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | T1_NORMAL |     1 |       |    11   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID2"=50000)
       filter("ID2"=50000)

SQL_ID  95573qx3w62q2, child number 0
-------------------------------------
select /*+ index_ss(t1(mod2)) */  * from t1 where  id1 + 1 = 50000

Plan hash value: 1422030023

----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |   271 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1       |     1 |   130 |   271   (2)| 00:00:02 |
|*  2 |   INDEX FULL SCAN           | T1_SPLIT |     1 |       |   270   (2)| 00:00:02 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."SYS_NC00007$"=50000)
       filter("T1"."SYS_NC00007$"=50000)

SQL_ID  d4t48va4zrk1q, child number 0
-------------------------------------
select /*+ index_ss(t1 t1_fbi) */  * from t1 where  id1 + 1 = 50000

Plan hash value: 3151334857

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |   271 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |     1 |   130 |   271   (2)| 00:00:02 |
|*  2 |   INDEX FULL SCAN           | T1_FBI |     1 |       |   270   (2)| 00:00:02 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."SYS_NC00007$"=50000)
       filter("T1"."SYS_NC00007$"=50000)

The first query automatically does an index skip scan on the normal index.

The second and third queries calculate the cardinality correctly, reference the right “virtual” column correctly, and come close to obeying the hint by using the required index – and then spoil things by doing an index full scan instead of an index skip scan. Checking the 10053 trace file I found that the optimizer hadn’t even considered the possibility of a skip scan for the last two queries – it had jumped straight to the full scan. (If the optimizer had ignored the hint completely I would have said that this behaviour was simply a limitation of FBIs – but since the optimizer has got partway there I think it’s probably a bug.)

Once you’ve got this far, of course, you might wonder if you could work around the problem in 11g by using virtual columns – so that’s the next test.

drop index t1_split;
drop index t1_fbi;

alter table t1 add (mod_virtual generated always as (mod1 + 1) virtual);
alter table t1 add (id_virtual generated always as (id1 + 1) virtual);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

create index t1_virtual on t1(mod_virtual, id_virtual);

select
	/*+ index_ss(t1) */
	*
from	t1
where
	id_virtual = 50000

I started by dropping the two function-based indexes (because I’m about to create “proper” virtual columns which are identical to the hidden “virtual columns” that were supporting the function-based indexes – if I didn’t do this I’d hit Oracle error “ORA-54015: Duplicate column expression was specified”). Then I declared my virtual columns, collected stats on the whole table (I could have used method_opt=>’for all hidden columns size 1′) so that there would be stats on those columns, then created an index using the virtual columns and tried to get a skip scan on that index. Any bets ?

select  /*+ index_ss(t1) */  * from t1 where  id_virtual = 50000

Plan hash value: 4250443541

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |       |       |   271 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |   133 |   271   (2)| 00:00:02 |
|*  2 |   INDEX FULL SCAN           | T1_VIRTUAL |     1 |       |   270   (2)| 00:00:02 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID_VIRTUAL"=50000)
       filter("ID_VIRTUAL"=50000)

Bad luck – even though “proper” virtual columns can do wonders, they don’t get around this problem. Anyone who’s used multi-column function-based indexes or added virtual columns to indexes might want to look at their code and execution paths to see if there are opportunities for skip scans that the optimizer is failing to take. If there are you might like to raise an SR with Oracle – and point them to this blog as a repeatable test case.

I tried several searches for a relevant bug on MoS – unfortunately it was one of those occasions where the search result was either very small and didn’t have what I wanted, or was so large that I wasn’t going to check all the hits. If anyone does find a bug number, please tell us about it in the comments.

 

2 Comments »

  1. Dear Jonathan,

    Thank you for this excellent article. You pointed out very important subject. More articles about FBI please.

    Regards
    Charlie

    Comment by Charlie — February 26, 2014 @ 8:25 am GMT Feb 26,2014 | Reply

  2. Charlie,

    since FBI’s and virtual columns are brothers,

    http://marogel.wordpress.com/2014/02/20/yet-another-elementary-sql-bug/

    might be interesting for you,
    Though, in contrast to Jonathan, I could not reproduce that issue solely with FBI’s ( it seems, you need real virtual columns there )

    Comment by Matthias Rogel — February 26, 2014 @ 2:22 pm GMT Feb 26,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,422 other followers