Oracle Scratchpad

January 31, 2019

Descending Problem

Filed under: Execution plans,Indexing,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 3:34 pm GMT Jan 31,2019

I’ve written in the past about oddities with descending indexes ( here, here, and here, for example) but I’ve just come across a case where I may have to introduce a descending index that really shouldn’t need to exist. As so often happens it’s at the boundary where two Oracle features collide. I have a table that handles data for a large number of customers, who record a reasonable number of transactions per year, and I have a query that displays the most recent transactions for a customer. Conveniently the table is partitioned by hash on the customer ID, and I have an index that starts with the customer_id and transaction_date columns. So here’s my query or, to be a little more accurate, the client’s query – simplified and camouflaged:


select  /*+ gather_plan_statistics */
        *
from    (
             select
                    v1.*,
                    rownum rn
             from   (
                             select   /*
                                         no_eliminate_oby
                                         index_rs_desc(t1 (customer_id, transaction_date))
                                      */
                                      t1.*
                             from     t1
                             where    customer_id = 50
                             and      transaction_date >= to_date('1900-01-01','yyyy-mm-dd')
                             order by transaction_date DESC
                ) v1
                where  rownum <= 10 -- > comment to avoid WordPress format issue
         )
where    rn >= 1
;

You’ll notice some hinting – the /*+ gather_plan_statistics */ will allow me to report the rowsource execution stats when I pull the plan from memory, and the hints in the inline view (which I’ve commented out in the above) will force a particular execution plan – walking through the index on (company_id, transaction_date) in descending order.

If I create t1 as a simple (non-partitioned) heap table I get the following plan unhinted (I’ve had to edit a “less than or equal to” symbol to avoid a WordPress format issue):

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |       |      1 |        |    14 (100)|     10 |00:00:00.01 |      14 |
|*  1 |  VIEW                           |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  2 |   COUNT STOPKEY                 |       |      1 |        |            |     10 |00:00:00.01 |      14 |
|   3 |    VIEW                         |       |      1 |     10 |    14   (0)|     10 |00:00:00.01 |      14 |
|   4 |     TABLE ACCESS BY INDEX ROWID | T1    |      1 |    340 |    14   (0)|     10 |00:00:00.01 |      14 |
|*  5 |      INDEX RANGE SCAN DESCENDING| T1_I1 |      1 |     10 |     3   (0)|     10 |00:00:00.01 |       4 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   5 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Notice the descending range scan of the index – just as I wanted it – the minimal number of buffer visits, and only 10 rows (and rowids) examined from the table. But what happens if I recreate t1 as a hash-partitioned table with local index – here’s the new plan, again without hinting the SQL:


----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |       |      1 |        |   207 (100)|     10 |00:00:00.01 |     138 |       |       |          |
|*  1 |  VIEW                                          |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  2 |   COUNT STOPKEY                                |       |      1 |        |            |     10 |00:00:00.01 |     138 |       |       |          |
|   3 |    VIEW                                        |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |       |       |          |
|*  4 |     SORT ORDER BY STOPKEY                      |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |     138 |  2048 |  2048 | 2048  (0)|
|   5 |      PARTITION HASH SINGLE                     |       |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|   6 |       TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1    |      1 |    340 |   206   (0)|    340 |00:00:00.01 |     138 |       |       |          |
|*  7 |        INDEX RANGE SCAN                        | T1_I1 |      1 |    340 |     4   (0)|    340 |00:00:00.01 |       3 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM. LE. 10)
   4 - filter(ROWNUM .LE. 10)
   7 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE">=TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TRANSACTION_DATE" IS NOT NULL)

Even though the optimizer has recognised that is will be visiting a single partition through a local index it has not chosen a descending index range scan, though it has used the appropriate index; so it’s fetched all the relevant rows from the table in the wrong order then sorted them discarding all but the top 10. We’ve done 138 buffer visits (which would turn into disk I/Os, and far more of them, in the production system).

Does this mean that the optimizer can’t use the descending index when the table is partitioned – or that somehow the costing has gone wrong. Here’s plan with the hints in place to see what happens when we demand a descending range scan:


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |      1 |        |   207 (100)|     10 |00:00:00.01 |       8 |
|*  1 |  VIEW                                 |       |      1 |     10 |   207   (1)|     10 |00:00:00.01 |       8 |
|*  2 |   COUNT STOPKEY                       |       |      1 |        |            |     10 |00:00:00.01 |       8 |
|   3 |    VIEW                               |       |      1 |    340 |   207   (1)|     10 |00:00:00.01 |       8 |
|   4 |     PARTITION HASH SINGLE             |       |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|   5 |      TABLE ACCESS BY LOCAL INDEX ROWID| T1    |      1 |    340 |   206   (0)|     10 |00:00:00.01 |       8 |
|*  6 |       INDEX RANGE SCAN DESCENDING     | T1_I1 |      1 |    340 |     4   (0)|     16 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN">=1)
   2 - filter(ROWNUM .LE. 10)
   6 - access("CUSTOMER_ID"=50 AND "TRANSACTION_DATE" IS NOT NULL AND "TRANSACTION_DATE">=TO_DATE('
              1900-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The optimizer is happy to oblige with the descending range scan – we can see that we’ve visited only 8 buffers, and fetched only 10 rows from the table. The cost, however, hasn’t made any allowance for the limited range scan. Check back to the plan for the simple (non-partitioned) table and you’ll see that the optimizer did allow for the reduced range scan. So the problem here is a costing one – we have to hint the index range scan if we want Oracle limit the work it does.

You might notice, by the way that the number of rowids returned in the index range scan descending operation is 16 rather than 10 – a little variation that didn’t show up when the table wasn’t partitioned. I don’t know why this happened, but when I changed the requirement to 20 rows the range scan returned 31 rowids, when I changed it to 34 rows the range scan returned 46 rows, and a request for 47 rows returned 61 index rowids – you can see the pattern, the number of rowids returned by the index range scan seems to be 1 + 15*N.

Footnote:

If you want to avoid hinting the code (or adding an SQL patch) you need only re-create the index with the transaction_date column declared as descending (“desc”), at which point the optimizer automatically chooses the correct strategy and the run-time engine returns exactly 10 rowids and doesn’t need to do any sorting. But who wants to create a descending index when they don’t really need it !

If you want to reproduce the experiments, here’s the script to create my test data.


rem
rem     Script:         pt_ind_desc_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem     Purpose:        
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1 (
        customer_id,
        transaction_date,
        small_vc,
        padding 
)
partition by hash(customer_id) partitions 4
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        mod(rownum,128)                         customer_id,
        (trunc(sysdate) - 1e6) + rownum         transaction_date,
        lpad(rownum,10,'0')                     v1,
        lpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(customer_id, transaction_date) 
local 
nologging
;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

I’ve run this test on 12.1.0.2, 12.2.0.1, and 18.3.0.0 – the behaviour is the same in all three versions.

Update (1st Feb 2019)

As the client reminded me after reading the post, it’s worth pointing out that for more complex SQL you still have to worry about the errors in the cardinality and cost calculations that could easily push the optimizer into the wrong join order and/or join method – whether you choose to hint the ascending index or create a descending index.  Getting the plan you want for this type of “pagination” query can be a messy process.

2 Comments »

  1. Hi,

    I also had some strange observations with interactions between partially descending indexes, hints, ORDER BY ASC/DESC and plans. It was further obfuscated when looking at the plan outline. Like adding ASC/DESC where there were none in the hint, switching it around compared to the hint (for apparantly no reason) or in contradiction to the ORDER BY ASC/DESC. Maybe you can gain some additional perspective by looking at those for your testcase?
    Also weird : The outline hints are (always?) rewritten to the column-name syntax (from the index-name syntax). I’m somewhat worried that this could result in unintended behaviour in edge cases but of course mostly it should be better like being proof against index renaming.

    regards,

    Comment by Racer I. — February 1, 2019 @ 7:22 am GMT Feb 1,2019 | Reply

    • Thanks for the comment. So far I have 6 scripts with “descending_bug” in the name – the anomalies with descending indexes are many and varied. The outline information was pretty triv ial in this very simple case, so I didn’t bother to say anything about it – the default behaviour was simply a hint to “index_rs_asc()” by column name until I addedt the index_rs_desc() hint, which then appeared in the outline in column name form.

      Your comment reminded me to revisit a blog note I wrote a few years ago about the “new” method of identifying indexes in hints ( https://jonathanlewis.wordpress.com/2011/11/24/index-hints/ ). I left an open question at the end and I ought to re-run a couple of tests to see if anything has changed. Essentially, though, the hint will (should) make Oracle choose the index that has the same columns in the same order if it exists or the lowest cost index that starts with the same columns in the same order if it doesn’t.

      I may have to retest whether indexes with switched column orders are now considered, but I don’t think they were the last time I tested.

      Comment by Jonathan Lewis — February 1, 2019 @ 7:59 am GMT Feb 1,2019 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.