Oracle Scratchpad

May 11, 2018

Skip Scan 3

Filed under: CBO,Index skip scan,Indexing,Oracle — Jonathan Lewis @ 2:26 pm BST May 11,2018

If you’ve come across any references to the “index skip scan” operation for execution plans you’ve probably got some idea that this can appear when the number of distinct values for the first column (or columns – since you can skip multiple columns) is small. If so, what do you make of this demonstration:


rem
rem     Script:         skip_scan_cunning.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem

begin
        dbms_stats.set_system_stats('MBRC',16);
        dbms_stats.set_system_stats('MREADTIM',10);
        dbms_stats.set_system_stats('SREADTIM',5);
        dbms_stats.set_system_stats('CPUSPEED',1000);
end;
/

create table t1
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        rownum                          id1,
        rownum                          id2,
        lpad(rownum,10,'0')             v1,
        lpad('x',150,'x')               padding
/*
        cast(rownum as number(8,0))                     id,
        cast(lpad(rownum,10,'0') as varchar2(10))       v1,
        cast(lpad('x',100,'x') as varchar2(100))        padding
*/
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(id1, id2);

begin
        dbms_stats.gather_table_stats(
                ownname     => user,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

For repeatability I’ve set some system statistics, but if you’ve left the system stats to default you should see the same effect. All I’ve done is create a table and an index on that table. The way I’ve defined the id1 and id2 columns means they could individually support unique constraints and the index clearly has 1 million distinct values for id1 in the million index entries. So what execution plan do you think I’m likely to get from the following simple query:


set serveroutput off
alter session set statistics_level = all;

prompt  =======
prompt  Default
prompt  =======

select  id 
from    t1
where   id2 = 999
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

You’re probably not expecting an index skip scan to appear, but given the title of this posting you may have a suspicion that it will; so here’s the plan I got running this test on 12.2.0.1:


SQL_ID  8r5xghdx1m3hn, child number 0
-------------------------------------
select id from t1 where id2 = 999

Plan hash value: 400488565

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |      1 |        |  2929 (100)|      1 |00:00:00.17 |    2932 |      5 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |      1 |      1 |  2929   (1)|      1 |00:00:00.17 |    2932 |      5 |
|*  2 |   INDEX SKIP SCAN                   | T1_I1 |      1 |      1 |  2928   (1)|      1 |00:00:00.17 |    2931 |      4 |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID2"=999)
       filter("ID2"=999)


So, an index skip scan doesn’t require a small number of distinct values for the first column of the index (unless you’re running a version older than 11.2.0.2 where a code change appeared that could be disabled by setting fix_control 9195582 off).

When the optimizer doesn’t do what you expect it’s always worth hinting the code to follow the plan you were expecting – so here’s the effect of hinting a full tablescan (which happened to do direct path reads):

SQL_ID  bxqwhsjwqfm7q, child number 0
-------------------------------------
select  /*+ full(t1) */  id from t1 where id2 = 999

Plan hash value: 3617692013

----------------------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |  3317 (100)|      1 |00:00:00.12 |   25652 |  25635 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |      1 |  3317   (3)|      1 |00:00:00.12 |   25652 |  25635 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID2"=999)

Note that the cost is actually more expensive than the cost of the indexed access path.  For reference you need to know that the blocks statistic for the table was 25,842 while the number of index leaf blocks was 2,922. The latter figure (combined with a couple of other details regarding the clustering_factor and undeclared uniqueness of the index) explains why the cost of the skip scan was only 2,928: the change that appeared in 11.2.0.2 limited the I/O cost of an index skip scan to the total number of leaf blocks in the index.  The tablescan cost (with my system stats) was basically dividing my table block count by 16 (to get the number of multi-block reads) and then doubling (because the multiblock read time is twice the single block read time).

As a quick demo of how older versions of Oracle would behave after setting “_fix_control”=’9195582:OFF’:


SQL_ID	bn0p9072w9vfc, child number 1
-------------------------------------
select	/*+ index_ss(t1) */  id from t1 where id2 = 999

Plan hash value: 400488565

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation			    | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |	A-Time	 | Buffers |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		    |	    |	   1 |	      |  1001K(100)|	  1 |00:00:00.13 |    2932 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |	   1 |	    1 |  1001K	(1)|	  1 |00:00:00.13 |    2932 |
|*  2 |   INDEX SKIP SCAN		    | T1_I1 |	   1 |	    1 |  1001K	(1)|	  1 |00:00:00.13 |    2931 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID2"=999)
       filter("ID2"=999)

The cost of the skip scan is now a little over 1,000,000 – corresponding (approximately) to the 1 million index probes that will have to take place. You’ll notice that the number of buffer visits recorded is 2931 for the index operation, though: this is the result of the run-time optimisation that keeps buffers pinned very aggressively for skip scan – you might expect to see a huge number of visits recorded as “buffer is pinned count”, but for some reason that doesn’t happen. The cost is essentially Oracle calculating (with pinned root and branch) the cost of “id1 = {constant} and id2 = 999” and multiplying by ndv(id1).

Footnote:

Ideally, of course, the optimizer ought to work out that an index fast full scan followed by a table access ought to have a lower cost (using multi-block reads rather than walking the index in leaf block order one block at a time (which is what this particular skip scan will have to do) – but that’s not (yet) an acceptable execution plan though it does now appear a plan for deleting data.

tl;dr

If you have an index that is very much smaller than the table you may find examples where the optimizer does what appears to be an insanely stupid index skip scan when you were expecting a tablescan or, possibly, some other less efficient index to be used. There is a rationale for this, but such a plan may be much more CPU and read intensive than it really ought to be.

 

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.