Oracle Scratchpad

February 9, 2017

Index bouncy scan

Filed under: Oracle,Performance,Tuning — Jonathan Lewis @ 1:05 pm GMT Feb 9,2017

There’s a thread running on OTN at present about deleting huge volumes of duplicated data from a table (to reduce it from 1.1 billion to about 22 million rows). The thread isn’t what I’m going to talk about, though, other than quoting some numbers from it to explain what this post is about.

An overview of the requirement suggests that a file of about 2.2 million rows is loaded into the table every week with (historically) no attempt to delete duplicates. As a file is loaded into the table every row gets the same timestamp, which is the sysdate at load time. I thought it would be useful to know how many different timestamps there were in the whole table.  (From an averaging viewpoint, 1.1 billion rows at 2.2 million rows per week suggests about 500 dates/files/weeks – or about 9.5 years – but since the table relates to “customer accounts” it seems likely that the file was originally smaller and has grown over time, which means the hiostory may be rather longer than that.)

Conveniently there is an index on the “input_user_date” column in the table so we might feel happy running a query that simply does:

        distinct input_user_date
order by

We might then refine the query to do a count(*) aggregate, or do some analytics to find any strange gaps in the timing of the weekly loads. However, all I’m really interested in is the number of dates because I’ve suggested we could de-duplicate the data by running a PL/SQL process that does a simple job for each date in turn, and I want to get an idea of how many times that job will run so that I can estimate how long the entire process might take.

The trouble with the basic query is that the table is (as you probably noticed) rather large, and so is the index. If we assume 8 bytes (which includes the length byte) for a date, 7 bytes for the rowid, 4 bytes overhead, and 100% packing we get about 420 index entries per leaf blocks, so with 1.1 billion entries the index is about 2.6 million leaf blocks. If the index had been built with compression (which means you’d only be recording a date once per leaf block) it would still be about 1.6 million leaf blocks. Fortunately we wouldn’t have to do much “real” sorting to report just a list of distinct values, or even the count(*) for each date, if we made Oracle use an index full scan – but it’s still a lot of work to read 1.6 million blocks (possibly using single block reads) and do even something as simple as a running count as you go. So I whipped up a quick and dirty bit of PL/SQL to do the job.

        m_d1 date := to_date('01-Jan-0001');
        m_d2 date := to_date('01-Jan-0001');
        m_ct number := 0;
                        input_user_date > m_d1

                exit when m_d2 is null;

                m_ct := m_ct + 1;
                dbms_output.put_line('Count: ' || m_ct || '  Date: ' || m_d2);
                m_d1 := m_d2;

        end loop;

The code assumes that the input_user_date hasn’t gone back to a silly date in the past to represent a “null date” (which shouldn’t exist anyway; if you want to use code like this but have a problem with a special “low-value” then you would probably be safest adding a prequel SQL that selects the min(columnX) where columnX is not null to get the starting value instead of using the a constant as I have done.

The execution path for the SQL statement should be an index-only: “index range scan (min/max)” which typically requires only 3 or 4 logical I/Os to find the relevant item for each date (which compares well with the estimated 2,200,000 / 420 = 5,238 leaf blocks we would otherwise have to scan through for each date). Here’s the path you should see:

| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT             |       |       |       |     3 (100)|          |
|   1 |  SORT AGGREGATE              |       |     1 |     8 |            |          |
|   2 |   FIRST ROW                  |       |     1 |     8 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| CA_I1 |     1 |     8 |     3   (0)| 00:00:01 |

Predicate Information (identified by operation id):
   3 - access("INPUT_USER_DATE">:B1)

I did build a little data set as a proof of concept – and produced a wonderful example of how the scale and the preceding events makes a difference that requires you to look very closely at what has happened. I used a table t1 in my example with a column d1, but apart from the change in names the PL/SQL block was as above.Here’s the code I used to create the data and prepare for the test:

create table t1 nologging
        trunc(sysdate) + trunc((rownum - 1)/100) d1,
        rpad('x',100)   padding
        rownum <= 50000

execute dbms_stats.gather_table_stats(user,'t1')
alter table t1 modify d1 not null;

create index t1_i1 on t1(d1) nologging pctfree 95

select index_name, leaf_blocks from user_indexes;

alter system flush buffer_cache;

alter session set events '10046 trace name context forever, level 8';

My data set has 500 dates with 100 rows per date, and the pctfree setting for the index gives me an average of about 8 leaf blocks per date (for a total of 4,167 leaf blocks). It’s only a small index so I’m expecting to see just 2 or 3 LIOs per date, and a total of about 500 physical reads (one per date plus a handful for reading branch blocks). Here’s the output from the running tkprof against the trace file:

 T1 WHERE D1 > :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    501      0.00       0.01          0          0          0           0
Fetch      501      0.08       0.18       4093       1669          0         501
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1003      0.09       0.19       4093       1669          0         501

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 62     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=3 pr=64 pw=0 time=9131 us)
         1          1          1   FIRST ROW  (cr=3 pr=64 pw=0 time=9106 us cost=3 size=8 card=1)
         1          1          1    INDEX RANGE SCAN (MIN/MAX) T1_I1 (cr=3 pr=64 pw=0 time=9089 us cost=3 size=8 card=1)(object id 252520)

I’ve done a physical read of virtually every single block in the index; but I have done only 3 buffer gets per date – doing fewer buffer gets than physical reads.

I’ve been caught by two optimisations (which turned out to be “pessimisations” in my test): I’ve flushed the buffer cache, so the Oracle runtime engine has decided to consider “warming up” the cache by reading extra blocks from any popular-looking objects that I’m accessing, and the optimizer may have given the run-time engine enough information to allow it to recognise that this index is subject to range scans and could therefore be a suitable object to use while warming up. As you can see from the following extracts from session events and session activity stats – we’ve done a load of multiblock reads through the index.

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
db file sequential read                               1           0           0.03        .031           6
db file scattered read                              136           0          13.54        .100           1

Name                                                                     Value
----                                                                     -----
physical reads                                                           4,095
physical reads cache                                                     4,095
physical read IO requests                                                  137
physical reads cache prefetch                                            3,958
physical reads prefetch warmup                                           3,958

This isn’t likely to happen, of course, in the production system where we’ll be starting with a fully loaded cache and the leaf blocks we need are (logically) spaced apart by several thousand intervening blocks.


I can’t remember who first brought this strategy to my attention – though I’m fairly sure it was one of my Russian colleagues, who has blogged about ways to work around what is effectively a limitation of the “index skip scan”. Apologies to the originator, and if you recognise your work here please add a comment with URL below.


  1. Jonathan, I think this is what you were referring to in the footnote:
    Clever indeed!

    Comment by Oren Nakdimon (@DBoriented) — February 9, 2017 @ 1:39 pm GMT Feb 9,2017 | Reply

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Powered by