Oracle Scratchpad

January 20, 2020

Index Engineering

Filed under: Indexing,Oracle,Tuning — Jonathan Lewis @ 4:53 pm GMT Jan 20,2020

This is a case study based on a question that appeared on the Oracle Developer Community forum a few days ago.

What I’m aiming to present in this note is the pattern of thinking that you should adopt in cases like this. The final suggestion in this note isn’t necessarily the best answer to the question posed (at the time of writing the OP hadn’t supplied enough information to allow anyone to come up with a best solution), but the point of the exercise is to talk about the journey and (perhaps) remind you of some of the extreme engineering you can do with indexes.

The (massaged) problem statement is as follows:

I have a table of more than 200 million rows that is used for inserts, updates and queries. I have a query on this table and want to know what index I could create to speed up the query.

The supplied definition of the table was not consistent with the names used in the query, so I’ve had to do a little editing, but table, current indexes, and query were as follows:

rem
rem     Script:         extreme_indexing.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2020
rem

create table tbl (
        r_id                    varchar2(30) not null,
        c_id                    number,
        n_id                    varchar2(40),
        created_by              varchar2(30) not null,
        last_modified_by        varchar2(30),
        c_status                char(1),
        a_action                char(1),
        r_creation_dt           timestamp(6),
        cnt                     number(38)
)
;

create        index tbl_1 on tbl(cnt, r_creation_dt, c_id, a_action, last_modified_by);  
create        index tbl_2 on tbl(cnt, c_status, r_creation_dt);  
create bitmap index tbl_3 on tbl(c_status); 

select
        /*+ index(tbl) */
        c_id,
        a_action,
        cnt,
        last_modified_by
from
        tbl
where
        c_status in(
            'N',
            'F'
        )
and     cnt <= 5 -- > comment to avoid wordpress format issue
and     r_creation_dt is not null
group by
        cnt,
        r_creation_dt,
        c_id,
        a_action,
        last_modified_by,
        c_status
order by
        r_creation_dt
fetch 
        first 1000 rows only
;


The first thing to point out is the bitmap index tbl_i3 is almost certainly a bad idea – bitmaps and transactional activity do not mix. It seems quite likely that the OP in this case had read one of the many Internet notes that makes the “not totally wrong” but very misleading statement “bitmap indexes are good when you have a small number of distinct values”, and appled the principle to a column that looks like a “status” column holding only a few distisnct values.

Having got that error out of the way we can start to think about the query.  It’s using the (fairly new) “Fetch first N rows” syntax, which means we may have to find a lot of data and sort it before returning a subset: performance issues can be very deceptive in cases like this because we might want a small result set but have to do a large amount of work to get it.

In this case we’re after the first 1,000 rows – which makes you think that maybe there will be a lot of data satisfying the query. So we have two targets to meet to optimise the query:

  • acquire the data we need as efficiently as possible
  • post-process the data we acquire to derive the 1,000 rows as efficiently as possible

The query is just a single table access – which means we’re either going to do a full tablescan or find a good indexed access path, we don’t have to worry about join strategies.  So the first thing to consider is the volume (and scatter) of data that matches the predicates. If there’s only a “small” amount of data where “c_status in (‘N’,’F’) and cnt <= 5” then an index on – or starting with – (c_status, cnt) may be very helpful. (Note how I’ve specified the column with the equality predicate first – that’s part of a generic strategy for creating multi-column indexes.)

This, though, raises several questions that need to be answered:

  • How small is “small” ? In the context of 200 million rows, 100,000 is small; but if you had to visit 100,000 different blocks in the table and do 100,000 real single block reads from disc that might still be a very bad thing.
  • How many rows have status ‘N’, how many have status ‘F’, how many have cnt <= 5 ? Maybe a really tiny number of rows have cnt<=5 and lots have c_status in (‘N’,’F’) which could make this a case where ignoring the generic column-ordering strategy would be very effective.  Maybe the number of rows satisfying the individual conditions is high but the number satisfying the combination is very low.
  • Is this the ONLY combination of c_status and cnt that is of interest, or (for example) was 5 just the number that was picked as an example,  Would different c_status values be of interest, would some required combinations of c_status and cnt have to use completley different execution paths for the best performance.

I’m going to make some decisions in order to proceed – they may be totally wrong as far as the OP is concerned – so remember that this note is just for discussion purposes. Let’s assume that the common query is always exactly as stated. Perhaps it’s a query that runs every few minutes to clear up some outstanding work with the expectation that new rows matching the query keep appearing while older rows are processed, change status, and disappear from the result set. Let’s also assume that the result set is always “small”, and that it’s small because ‘N’ and ‘F’ are rare (even if the total number of rows with cnt <= 5 is large).

With these assumptions we could start by creating an index on (c_status, cnt), which gets us to exactly the rows we want from the table with no “throwaway” after visiting the table. Here’s the excution plan if that’s our choice of index (running on 12.2.0.1, and with an index() hint to force the use of the index when necessary):

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |        |      1 |        |   1000 |00:00:00.03 |    1573 |     34 |       |       |          |
|*  1 |  VIEW                           |        |      1 |   1000 |   1000 |00:00:00.03 |    1573 |     34 |       |       |          |
|*  2 |   WINDOW NOSORT STOPKEY         |        |      1 |   6451 |   1000 |00:00:00.03 |    1573 |     34 |   219K|   219K|          |
|   3 |    SORT GROUP BY                |        |      1 |   6451 |   1001 |00:00:00.03 |    1573 |     34 |  1186K|   567K| 1054K (0)|
|   4 |     INLIST ITERATOR             |        |      1 |        |  13142 |00:00:00.02 |    1573 |     34 |       |       |          |
|*  5 |      TABLE ACCESS BY INDEX ROWID| TBL    |      2 |  13743 |  13142 |00:00:00.02 |    1573 |     34 |       |       |          |
|*  6 |       INDEX RANGE SCAN          | TBL_I1 |      2 |  13743 |  13142 |00:00:00.01 |      33 |     34 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=1000)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "R_CREATION_DT")<=1000)
   5 - filter("R_CREATION_DT" IS NOT NULL)
   6 - access((("C_STATUS"='F' OR "C_STATUS"='N')) AND "CNT"<=5)

I’ve enabled rowsource_execution_statistics (alter session set statistics_level = all) and pulled my execution plan from memory. As you can see from the A-rows for the index range scan and table access by index rowid, I’ve identified and acquired exactly the rows from the table that might be relevant (all 13,142 of them), then I’ve done a sort group by of all that data, sorting in a way that means the rows will be produced in exactly the order I need for the windowing function that Oracle will use to select the 1,000 rows I want.

If you’re curious, here (courtesy of dbms_utility.expand_sql_text() but cosmetically enhanced) is the transformed SQL that was actually optimised and executed:

SELECT 
        A1.C_ID C_ID,A1.A_ACTION A_ACTION,A1.CNT CNT,A1.LAST_MODIFIED_BY LAST_MODIFIED_BY 
FROM  (
        SELECT 
                /*+ INDEX (A2) */ 
                A2.C_ID C_ID,
                A2.A_ACTION A_ACTION,
                A2.CNT CNT,
                A2.LAST_MODIFIED_BY LAST_ MODIFIED_BY,
                A2.R_CREATION_DT rowlimit_$_0,
                ROW_NUMBER() OVER ( ORDER BY A2.R_CREATION_DT) rowlimit_$$_rownumber 
        FROM 
                TEST_USER.TBL A2 
        WHERE 
                (A2.C_STATUS='N' OR A2.C_STATUS='F') 
        AND     A2.CNT<=5 
        AND     A2.R_CREATION_DT IS NOT NULL 
        GROUP BY 
                A2.CNT,A2.R_CREATION_DT,A2.C_ID,A2.A_ACTION,A2.LAST_MODIFIED_BY,A2.C_STATUS
        ) A1 
WHERE 
        A1.rowlimit_$$_rownumber<=1000 
ORDER BY 
        A1.rowlimit_$_0

There are three main drawbacks to this choice of index.

  • I’ve acquired all the rows in the table that match the predicate even though I only really needed a subset
  • I’ve done a massive sort
  • I’ve created an index that includes every row in the table

Remember that the OP has a table of 200M rows, and we are assuming (pretending) that only a very small fraction of them match the initial predicates. Creating an index on 200M rows because we’re interested in only a few tens of thousands is wasteful of space and (given we have a “status” column) probably wasteful of processing resources as the status moves through several values. So I’m going to address that issue first. Let’s create a “function-based” index that ignores most of the data, and change the code to take advantage of that index – but since this is 12c, let’s do it by adding a virtual column and indexing that column.


alter table tbl add nf_r_creation_dt invisible 
        generated always as (
                case
                        when c_status in ('N','F') and cnt <= 5
                        then r_creation_dt
                end
        ) virtual
/

create index tbl_i2 on tbl(nf_r_creation_dt)
/

I’ve introduced an invisible virtual column called nf_r_creation_dt (nf_ for status N/F) which uses a CASE expression matching the original predicate to return the r_creation_dt for rows that match and null for all the other (ca. 200M) rows. So when I create an index on the column the only entries in the index are for rows that I might want to see.

I have to edit the SQL to match – which simply means changing every appearance of r_creation_dt to nf_r_creation_dt, and eliminating the original predicate giving the following text and execution plan:


select
        /*+ index(tbl) */
        c_id,
        a_action,
        cnt,
        last_modified_by
from
        tbl
where
        nf_r_creation_dt is not null
group by
        nf_r_creation_dt,
        cnt,
        c_id,
        a_action,
        last_modified_by,
        c_status
order by
        nf_r_creation_dt
fetch 
        first 1000 rows only    -- 1,000 rows in the original
/

---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |      1 |        |   1000 |00:00:00.02 |   13139 |     35 |       |       |          |
|*  1 |  VIEW                          |        |      1 |   1000 |   1000 |00:00:00.02 |   13139 |     35 |       |       |          |
|*  2 |   WINDOW NOSORT STOPKEY        |        |      1 |     48 |   1000 |00:00:00.02 |   13139 |     35 | 73728 | 73728 |          |
|   3 |    SORT GROUP BY               |        |      1 |     48 |   1001 |00:00:00.02 |   13139 |     35 |  1116K|   556K|  991K (0)|
|   4 |     TABLE ACCESS BY INDEX ROWID| TBL    |      1 |   2500 |  13142 |00:00:00.02 |   13139 |     35 |       |       |          |
|*  5 |      INDEX FULL SCAN           | TBL_I2 |      1 |  13142 |  13142 |00:00:00.01 |      36 |     35 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=1000)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "NF_R_CREATION_DT")<=1000)
   5 - filter("NF_R_CREATION_DT" IS NOT NULL)

The plan shows an index full scan on the new index. Since the index holds only those rows that might be interesting this isn’t a threat. However we still have to visit all the matching rows in the table – and that might result in more random I/O than we like. So the next step in enhancing performance is to consider adding all the columns we want to the index. There’s a little problem with that: if we add the columns as they are we will go back to having an index entry for every single row in the table so we need to use the same CASE mechanism to create more virtual columns:

alter table tbl add nf_c_status invisible 
        generated always as (
                case
                        when c_status in ('N','F') and cnt <= 5
                        then c_status
                end
        ) virtual
/

alter table tbl add nf_last_modified_by invisible 
        generated always as (
                case
                        when c_status in ('N','F') and cnt <= 5
                        then last_modified_by
                end
        ) virtual
/

alter table tbl add nf_a_action invisible 
        generated always as (
                case
                        when c_status in ('N','F') and cnt <= 5
                        then a_action
                end
        ) virtual
/

alter table tbl add nf_c_id invisible 
        generated always as (
                case
                        when c_status in ('N','F') and cnt <= 5
                        then c_id
                end
        ) virtual
/

alter table tbl add nf_cnt invisible 
        generated always as (
                case
                        when c_status in ('N','F') and cnt <= 5
                        then cnt
                end
        ) virtual
/

create index tbl_i3 on tbl(
        nf_r_creation_dt,
        nf_cnt,
        nf_c_id,
        nf_a_action,
        nf_last_modified_by,
        nf_c_status
)
;

It looks like a bit of a pain to go through all this rigmarole to get all those columns that are null most of the time but echo the original values when the rows match our original predicate; and then we have to modify the query to match:


select
        /*+ index(tbl) */
        nf_c_id,
        nf_a_action,
        nf_cnt,
        nf_last_modified_by
from
        tbl
where
        nf_r_creation_dt is not null
group by
        nf_r_creation_dt,
        nf_cnt,
        nf_c_id,
        nf_a_action,
        nf_last_modified_by,
        nf_c_status
order by
        nf_r_creation_dt
fetch 
        first 1000 rows only    -- 1,000 rows in the original
/

But the big payoff comes from the execution plan:


----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |        |      1 |        |   1000 |00:00:00.01 |      74 |     12 |
|*  1 |  VIEW                  |        |      1 |   1000 |   1000 |00:00:00.01 |      74 |     12 |
|*  2 |   WINDOW NOSORT STOPKEY|        |      1 |   2500 |   1000 |00:00:00.01 |      74 |     12 |
|   3 |    SORT GROUP BY NOSORT|        |      1 |   2500 |   1001 |00:00:00.01 |      74 |     12 |
|*  4 |     INDEX FULL SCAN    | TBL_I3 |      1 |   2500 |   1003 |00:00:00.01 |      74 |     12 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=1000)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "NF_R_CREATION_DT")<=1000)
   4 - filter("NF_R_CREATION_DT" IS NOT NULL)

Notice how the SORT GROUP BY operation is a NOSORT, and the WINDOW operation is both NOSORT and STOPKEY ?

We’ve got the smallest index possible that only gets modified as rows move into, or out of, the interesting state, and when we run the query Oracle does a full scan of the index maintaining “running totals” but stop as soon as it’s aggregated enough results.

tl;dr

For very special cases it’s really amazing what you can (sometimes) do – if you can modify the code – with carefully engineered indexes to minimise the work done by a query AND the work done maintaining the infrastructure needed for that query. Virtual columns are a fantastic aid, especially now that 12c allows them to be invisible.

7 Comments »

  1. Hi

    If You are using PL/SQL and are using update set row command, You must set virtual columns to invisible as a workaround.

    But be aware: ORA-7445 [qcsjRmCol] When Creating View From Base Table With Invisible Virtual Columns (Doc ID 2337257.1)

    lh

    Comment by Anonymous — January 21, 2020 @ 10:35 am GMT Jan 21,2020 | Reply

    • Thanks for the comment.

      It’s useful to know that there’s a reference to the bug that views on tables with inivisble columns can produce. It’s a shame that the note isn’t a bit clearer – the title suggests the error happens on the “create view”, and there’s a hint in the text that it’s a problem that affects either RAC or distributed systems. I’ve added a comment to the document asking for clarificaiton, since I’ve just (single instance, not distributed) created, described, and queried a view on the above table with no problems.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — January 21, 2020 @ 12:02 pm GMT Jan 21,2020 | Reply

  2. Hi Jonathan,

    Since we are using FETCH FIRST 1000 rows only.. if we create the index as follows:

    CREATE INDEX “VISHNU”.”IDX” ON “VISHNU”.”TBL” (“R_CREATION_DT”, “CNT”, “C_ID”, “A_ACTION”, “LAST_MODIFIED_BY”, “C_STATUS”);

    the plan does report INDEX FULL SCAN but it in reality it doesn’t read all the index leaf blocks:

    -------------------------------------------------------------------------------
    | Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |      |  1000 | 72000 |    11   (0)| 00:00:01 |
    |*  1 |  VIEW                  |      |  1000 | 72000 |    11   (0)| 00:00:01 |
    |*  2 |   WINDOW NOSORT STOPKEY|      |  1001 | 53053 |    11   (0)| 00:00:01 |
    |   3 |    SORT GROUP BY NOSORT|      |  1001 | 53053 |    11   (0)| 00:00:01 |
    |*  4 |     INDEX FULL SCAN    | IDX  |  1001 | 53053 |    11   (0)| 00:00:01 |
    -------------------------------------------------------------------------------
    
    
    

    Thanks
    Vishnu

    Comment by Vishnu Potukanuma — January 21, 2020 @ 11:33 am GMT Jan 21,2020 | Reply

    • Vishnu,

      Thanks for the comment.

      The plan shows a full scan because that’s the strategy chosen. However, because the index is defined in exactly the best possible way to meet the requirements of the query Oracle can walk through the index in order creating running totals (in the “nosort” sort group by) passing them up as it does so to the window operation. Since the aggregates are arriving in the right order the window operation can operate without sorting, so it can apply a stop as soon as it has received enough rows from its child operation – hence the full scan stopping early.

      The two disadvantages to this approach are:

      a) you have indexed every row in the 200M rows table with a 6-column index, and at least two of those columns are likely to be subject to several changes: that’s a lot of space and a lot of work for something that might be a very focused requirement

      b) If the creation date of the data is spread of (say) 7 years but the only rows in state N or F were rows created in the last 6 weeks then you have to walk through something like 98% of the index before you find a single row.

      Your strategy may be perfect for your pattern of data and the number of variations of this query that you use – and still be be a total disaster for someone with a completely different pattern and usage.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — January 21, 2020 @ 12:16 pm GMT Jan 21,2020 | Reply

  3. Hi Jonathan,
    I think it is quite impossible to get the following plan given the predicates, group by and order by in the above statement.

    -------------------------------------------------------------------------------
    | Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |      |    99 |  4554 |     3   (0)| 00:00:01 |
    |*  1 |  COUNT STOPKEY         |      |       |       |            |          |
    |   2 |   VIEW                 |      |   101 |  4646 |     3   (0)| 00:00:01 |
    |   3 |    SORT GROUP BY NOSORT|      |   101 |  5353 |     3   (0)| 00:00:01 |
    |*  4 |     INDEX RANGE SCAN   | IDX  |   101 |  5353 |     3   (0)| 00:00:01 |
    -------------------------------------------------------------------------------
    
    

    Thanks,
    Vishnu

    Comment by Vishnu Potukanuma — January 21, 2020 @ 4:15 pm GMT Jan 21,2020 | Reply

    • Vishnu,

      That’s correct. Because you want to order by the r_creation_date that column has to be the first in the index if you want the NOSORT – but there’s no predicate on the column. If you were prepared to change the report to (say) report the N’s before the F’s you could create an index starting with (c_status , r_creation_date) and do a UNION ALL query and Oracle could then get quite clever.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — January 21, 2020 @ 4:29 pm GMT Jan 21,2020 | Reply

  4. Thanks for clarifying this.. Jonathan.

    Thanks,
    Vishnu

    Comment by Vishnu Potukanuma — January 21, 2020 @ 5:09 pm GMT Jan 21,2020 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.