Oracle Scratchpad

March 23, 2017

min/max Upgrade

Filed under: Bugs,CBO,Execution plans,Indexing,Oracle,Troubleshooting — Jonathan Lewis @ 8:53 am BST Mar 23,2017

A question came up on the OTN database forum a little while ago about a very simple query that was taking different execution paths on two databases with the same table and index definitions and similar data. In one database the plan used the “index full scan (min/max)” operation while the other database used a brute force “index fast full scan” operation.

In most circumstances the starting point to address a question like this is to check whether some configuration details, or some statistics, or the values used in the query are sufficiently different to result in a significant change in costs; and the first simple procedure you can follow is to hint each database to use the plan from the opposite database to see if this produces any clues about the difference – it’s a good idea when doing this test to use one of the more verbose formatting options for the call to dbms_xplan.

In this case, though, the OP discovered a note on MoS reporting exactly the problem he was seeing:

Doc ID 2144428.1: Optimizer Picking Wrong ‘INDEX FAST FULL SCAN’ Plan vs Correct ‘INDEX FULL SCAN (MIN/MAX)’

which referred to

Bug 22662807: OPTIMIZER PICKING INDEX FFS CAN INSTEAD OF MIN/MAX

Conveniently the document suggested a few workarounds:

  • alter session set optimizer_features_enable = ‘11.2.0.3’;
  • alter session set “_fix_control” = ‘13430622:off’;
  • delete object stats [Ed: so that dynamic sampling takes place … maybe a /*+ dynamic_sampling(alias level) */ hint would suffice].

Of the three options my preference would (at least in the short term) be the _fix_control one. Specifically, from the v$system_fix_control view, we can see that it addresses the problem very precisely with the description: “index min/max cardinality estimate fix for filter predicates”.

The example in the bug note showed a very simple statement (even more simple than the OP’s query which was only a single table query anyway), so I thought I’d build a model and run a few tests to see what was going on. Luckily, before I’d started work, one of the other members of the Oak Table network sent an email to the list asking if anyone knew how the optimizer was costing an example he’d constructed – and I’ve finally got around to looking at his example, and here’s the model and answer(s), starting with the data set:


rem
rem     Script:         test_min_max.sql
rem     Dated:          March 2017
rem
rem     Last tested
rem             12.1.0.2
rem             11.2.0.4
rem             11.2.0.3
rem

create table min_max_test nologging
as
with ids as (
        select /*+ Materialize */ rownum  id from dual connect by rownum <= 50000 -- > comment to protect formatting
),
line_nrs as (
        select /*+ Materialize */  rownum line_nr from dual connect by rownum <= 20 -- > comment to protect formatting
)
select
        id, line_nr ,rpad(' ', 800, '*') data
from
        line_nrs, ids
order by
        line_nr, id
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'min_max_test',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index mmt_ln_id on min_max_test (line_nr, id) nologging;
create index mmt_id    on min_max_test (id)          nologging;

The table has two critical columns: each id has 20 line_nr values associated with it, but the way the data was generated means that the line numbers for a given id are scattered across 20 separate table blocks.

There are two indexes – one on the id which will allow us to find all the rows for a given id as efficiently as possible, and one (slightly odd-looking in this context) that would allow us to find a specific row for a given line_nr and id very efficiently. Two things about these indexes – in a live application they should both be compressed on the first (only, in the case of index mmt_id) column, and secondly the necessity of the mmt_id index is questionable and it might be an index you could drop if you reversed the order of the columns in mmt_ln_id. The thing about these indexes, though, is that they allow us to demonstrate a problem. So let’s query the data – twice, hinting each index in turn:


variable b1 number;
exec :b1 := 50000;

set serveroutput off

select
        /*+ index(t(id)) */
        min(line_nr)
from
        min_max_test t
where
        id = :b1
;

select * from table(dbms_xplan.display_cursor);

select
        /*+ index(t(line_nr, id)) */
        min(line_nr)
from
        min_max_test t
where
        id = :b1
;

select * from table(dbms_xplan.display_cursor);

It’s fairly safe to make a prediction about the execution plan and cost of the first query – it’s likely to be a range scan that accesses a couple of branch blocks, a leaf block and 20 separate table blocks followed by a “sort aggregate” – with a cost of about 23.

It’s a little harder to make a prediction about the second query. The optimizer could infer that the min(line_nr) has to be close to the left hand section of the index, and could note that the number of rows in the table is the same as the product of the number of distinct values of the two separate columns, and it might note that the id column is evenly distributed (no histogram) across the data, so it might “guess” that it need only range scan all the entries for the first line_nr to find the appropriate id. So perhaps the optimizer will use the index min/max range scan with a cost that is roughly 2 branch blocks plus total leaf blocks / 20 (since there are 20 distinct values for line_nr); maybe it would divide the leaf block estimate by two because “on average” – i.e. for repeated random selections of value for id – it would have to scan half the leaf blocks. There were 2,618 leaf blocks in my index, so the cost should be close to either 133 or 68.

Here are the two plans – range scan first, min/max second:


select  /*+ index(t(id)) */  min(line_nr) from  min_max_test t where id = :b1
-----------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |              |       |       |    23 (100)|          |
|   1 |  SORT AGGREGATE                      |              |     1 |     8 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| MIN_MAX_TEST |    20 |   160 |    23   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | MMT_ID       |    20 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("ID"=:B1)

select  /*+ index(t(line_nr, id)) */  min(line_nr) from  min_max_test t where  id = :b1
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |    22 (100)|          |
|   1 |  SORT AGGREGATE             |           |     1 |     8 |            |          |
|   2 |   FIRST ROW                 |           |     1 |     8 |    22   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| MMT_LN_ID |     1 |     8 |    22   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("ID"=:B1)

Spot on with the estimate for the simple range scan – but what did we do wrong with the estimate for the min/max scan ? You might notice in the first example the “table access by rowid batched” and realise that this is running on 12c. Here’s the plan if I get if I set the optimizer_features_enable back to 11.2.0.3 before running the second query again:


select  /*+ index(t(line_nr, id)) */  min(line_nr) from  min_max_test t where  id = :b1
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |   136 (100)|          |
|   1 |  SORT AGGREGATE             |           |     1 |     8 |            |          |
|   2 |   FIRST ROW                 |           |     1 |     8 |   136   (1)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| MMT_LN_ID |     1 |     8 |   136   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("ID"=:B1)

Using the 11.2.0.3 optimizer model the plan has a cost that’s very close to our prediction – we’ll see why there’s a slight difference in a moment. If we set the optimizer_features_enable to 11.2.0.4 the cost drops back to 22. So for our example 11.2.0.3 will use the simple “index range scan” and an upgrade to 11.2.0.4 (or higher) will switch to the “index full scan (min/max)”. If you look at the OTN posting the impact of the change in costing is exactly the other way around – 11.2.0.3 uses the min/max path, 11.2.0.4 uses the simple index range scan.

The techy bit

You really don’t need to know this – experimenting with the optimizer_features_enable (or _fix_control) will give you plans that show you all the numbers you need to see to check whether or not you’ve run into this particular problem – but if you’re interested here’s a little bit from the two 10053 trace files. We need only look at a few critical lines. From the 11.2.0.3 costing for the min/max scan:


Index Stats::
  Index: MMT_ID  Col#: 1
  LVLS: 2  #LB: 2202  #DK: 50000  LB/K: 1.00  DB/K: 20.00  CLUF: 1000000.00  NRW: 1000000.00
  Index: MMT_LN_ID  Col#: 2 1
  LVLS: 2  #LB: 2618  #DK: 1000000  LB/K: 1.00  DB/K: 1.00  CLUF: 125000.00  NRW: 1000000.00

SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for MIN_MAX_TEST[T]
  Column (#1): ID(NUMBER)
    AvgLen: 5 NDV: 50536 Nulls: 0 Density: 0.000020 Min: 1.000000 Max: 50000.000000
  Table: MIN_MAX_TEST  Alias: T
    Card: Original: 1000000.000000  Rounded: 20  Computed: 19.787874  Non Adjusted: 19.787874

 ****** Costing Index MMT_LN_ID
  Access Path: index (Min/Max)
    Index: MMT_LN_ID
    resc_io: 135.000000  resc_cpu: 961594
    ix_sel: 1.000000  ix_sel_with_filters: 1.9788e-05
    Cost: 135.697679  Resp: 135.697679  Degree: 1

I was running 12.1.0.2 so there were a few extra bits and pieces that I’ve deleted (mostly about SQL Plan Directives and in-memory). Critically we can see that the stats collection has a small error for the ID column – 50,536 distinct values (NDV) instead of exactly 50,000. This seems to have given us a cost for the expected index range of: 2 (blevel) + ceiling(2618 (leaf blocks) * 50536 / 1000000) = 2 + ceil(132.3) = 135, to which we add a bit for the CPU and get to 136. (Q.E.D.)

Then we switch to costing for 11.2.0.4:


SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for MIN_MAX_TEST[T]
  Column (#1): ID(NUMBER)
    AvgLen: 5 NDV: 50536 Nulls: 0 Density: 0.000020 Min: 1.000000 Max: 50000.000000
  Table: MIN_MAX_TEST  Alias: T
    Card: Original: 1000000.000000  Rounded: 20  Computed: 19.787874  Non Adjusted: 19.787874

 ****** Costing Index MMT_LN_ID
  Access Path: index (Min/Max)
    Index: MMT_LN_ID
    resc_io: 21.787874  resc_cpu: 156872
    ix_sel: 1.000000  ix_sel_with_filters: 1.9788e-05
    Cost: 22.324608  Resp: 22.324608  Degree: 1

We still have the small error in the number of distinct values for id, so the estimated number of rows that we need to access from the table for a given id (before “aggregating” to find its minimum line_nr) is 19.787874 (Computed: / Non Adjusted:) rather than exactly 20. Notice, then, that the cost of using the index is 19.787874 + 2 which looks suspiciously like adding the blevel to the number of table blocks to get a cost and forgetting that we might have to kiss a lot of frogs before we find the prince. Basically, in this example at least, it looks like the costing algorithm has NOTHING to do with the mechanics of what actually has to happen at run-time.

Footnote

This is only an initial probe into what’s going on with the min/max scan; there are plenty more patterns of data that would need to be tested before we could have any confidence that we had produced a generic model of how the optimizer does its calculations – the only thing to note so far is that there IS a big change as  you move from 11.2.0.3 to later versions: the case on OTN showed the min/max scan disappearing on the upgrade, the example above shows the min/max disappearing on the downgrade – either change could be bad news for parts of a production system.

There are a couple of related bugs that might also be worth reviewing.

  • Bug 11834402 : CBO CHOOSES A SLOW INDEX FULL SCAN OVER A MUCH FASTER INDEX RANGE SCAN
  • Bug 13430622 : INDEX SCAN IN VERY SLOW FOR ONE PREDICATE AND FAST FOR OTHERS

There is a note, though that this last bug was fixed in 12.1

Footnote 2

When experimenting, one idea to pursue as the models get more complex and you’re using indexes with more than two columns is to test whether the presence of carefully chosen column group statistics might make a difference to the optimizer’s estimates of cardinality (hence cost) of the min/max scan.

3 Comments »

  1. Dear Sir,

    I ran your examples and obtained the same results with one exception. The plan with optimizer features 12.1.0.2 consistently shows the cost of 2624 (100). Here is the whole plan:
    ”’

    select         /*+ index(t(line_nr, id)) */         min(line_nr) from
          min_max_test t where         id = :b1
    
    -----------------------------------------------------------------------------------------
    | Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |           |       |       |  2624 (100)|          |
    |   1 |  SORT AGGREGATE             |           |     1 |     8 |            |          |
    |   2 |   FIRST ROW                 |           |     1 |     8 |  2624   (1)| 00:00:01 |
    |*  3 |    INDEX FULL SCAN (MIN/MAX)| MMT_LN_ID |     1 |     8 |  2624   (1)| 00:00:01 |
    -----------------------------------------------------------------------------------------
    '''
    

    I checked optimizer trace files and at least in the parts presented in your post they show basically the same content. I wonder where this high cost in my plan comes from.

    Regards,

    January

    Comment by January — March 23, 2017 @ 6:17 pm BST Mar 23,2017 | Reply

    • January,

      Thanks for the comment.

      The cost is suspiciously similar to something we might expect for a complete “index full scan” – i.e. just a little bit larger than the number of leaf blocks in the index. I don’t know why you are seeing this. Can you check to see if you have any patches applied – maybe you’ve already got the patch that I referenced near the end of the article installed; or maybe there are some related fix-controls set.

      Do you see a section in the “costing index mmt_ln_id” part which costs for another index access path ?

      Comment by Jonathan Lewis — March 23, 2017 @ 9:04 pm BST Mar 23,2017 | Reply

      • Excuse me for the late reply but I could not find the time to do it before today. I checked for the things you asked for. I do not have any patches applied, it is just a clean install of 12.1.0.2.0. In trace files I find this line: fix 13430622 = enabled. Here are the sections “costing index mmt_ln_id” for both plans:


        ****** Costing Index MMT_LN_ID
        SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SCAN
        SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_FILTER
        Access Path: index (Min/Max)
        Index: MMT_LN_ID
        resc_io: 135.000000 resc_cpu: 961594
        ix_sel: 1.000000 ix_sel_with_filters: 1.9788e-05
        Cost: 135.215122 Resp: 135.215122 Degree: 1
        Best:: AccessPath: IndexRange
        Index: MMT_LN_ID
        Cost: 135.215122 Degree: 1 Resp: 135.215122 Card: 1.000000 Bytes: 0.000000

        ****** Costing Index MMT_LN_ID
        SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_SCAN
        SPD: Return code in qosdDSDirSetup: NOCTX, estType = INDEX_FILTER
        Access Path: index (Min/Max)
        Index: MMT_LN_ID
        resc_io: 21.787874 resc_cpu: 156872
        ix_sel: 1.000000 ix_sel_with_filters: 1.9788e-05
        Cost: 21.982730 Resp: 21.982730 Degree: 1
        Best:: AccessPath: IndexRange
        Index: MMT_LN_ID
        Cost: 21.982730 Degree: 1 Resp: 21.982730 Card: 1.000000 Bytes: 0.000000

        In trace file it shows the cost of nearly 22 but in the plan it shows 2624.

        Regards,

        January

        Comment by January — March 28, 2017 @ 3:16 pm BST Mar 28,2017 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Powered by WordPress.com.