Oracle Scratchpad

August 14, 2017

Join Elimination Bug

Filed under: Bugs,CBO,Oracle — Jonathan Lewis @ 11:59 am BST Aug 14,2017

A few years ago a bug relating to join elimination showed up in a comment to a post I’d done about the need to keep on testing and learining. The bug was visible in version 11.2.0.2 and, with a script to replay it, I’d found that it had disappeared by 11.2.0.4.

Today I had a reason to rediscover the script, and decided to test it against 12.2.0.1 – and found that the bug was still present.

Here’s the model:


rem     Script:         join_eliminate_bug_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2012

drop table child purge;
drop table parent purge;

create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
        deferrable initially immediate
)
;

create table child(
        id_p    number(4)       
                constraint chi_fk_par
                references parent,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id)
)
;

insert into parent values (1,'Smith');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Jack');
insert into child values(2,2,'Jill');

commit;

begin
        dbms_stats.gather_table_stats(user,'child');
        dbms_stats.gather_table_stats(user,'parent');
end;
/

set serveroutput off

select
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);

The setup is just to show you the correct results with join elimination taking place. Here’s the output from the query and the actual execution plan:

      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally
         2          1 Jack
         2          2 Jill

4 rows selected.


PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  1whubydgj8w0s, child number 0
-------------------------------------
select  chi.* from  child chi,  parent par where  par.id = chi.id_p

Plan hash value: 2406669797

-----------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |    11 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    48 |    11 |
-----------------------------------------------------------

On a single column join, with referential integrity in place, and no columns other than the primary key involved, the optimizer eliminates table parent from the query. But if I now defer the primary key constraint on parent and duplicate every row (which ought to duplicate the query result), watch what happens with the query:


set constraint par_pk deferred;

insert into parent (id,name) values (1,'Smith');
insert into parent (id,name) values (2,'Jones');

alter system flush shared_pool;

select
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);


      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally
         2          1 Jack
         2          2 Jill

4 rows selected.


PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  1whubydgj8w0s, child number 0
-------------------------------------
select  chi.* from  child chi,  parent par where  par.id = chi.id_p

Plan hash value: 2406669797

-----------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |    11 |
|   1 |  TABLE ACCESS FULL| CHILD |     4 |    48 |    11 |
-----------------------------------------------------------

I get the same plan, so I get the same results – and notice that I flushed the shared pool before repeating the query so I haven’t fooled Oracle into reusing the wrong plan by accident – it’s a whole new freshly optimized plan.

Just to show what ought to happen here’s the last bit of the test case:


select  /*+ no_eliminate_join(@sel$1 par@sel$1) */
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);


      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally
         1          1 Simon
         1          2 Sally
         2          1 Jack
         2          2 Jill
         2          1 Jack
         2          2 Jill

8 rows selected.


PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  5p8sp7k8b0fgq, child number 0
-------------------------------------
select /*+ no_eliminate_join(@sel$1 par@sel$1) */  chi.* from  child
chi,  parent par where  par.id = chi.id_p

Plan hash value: 65982890

-----------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |       |       |     5 |
|   1 |  NESTED LOOPS                |        |     4 |    60 |     5 |
|   2 |   NESTED LOOPS               |        |     4 |    60 |     5 |
|   3 |    INDEX FULL SCAN           | PAR_PK |     2 |     6 |     1 |
|*  4 |    INDEX RANGE SCAN          | CHI_PK |     2 |       |     1 |
|   5 |   TABLE ACCESS BY INDEX ROWID| CHILD  |     2 |    24 |     2 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("PAR"."ID"="CHI"."ID_P")


I ran this test on 11.2.0.4 – and then repeated it on 12.2.0.1: the bug is still present (although I thought I’d seen a MoS note saying it had been fixed in 12.1).

It’s always a little dangerous playing around with deferrable constraints – my view is that you should keep the interval of deferment as short as possible and don’t try to use it for doing anything other than correcting known data errors. At present if you have code that defers constraints and runs non-trivial queries afterwards you might want that code to start with an “alter session” to set the hidden parameter _optimizer_join_elimination_enabled to false (after checking with Oracle support, of course).

July 7, 2017

OFE

Filed under: CBO,Oracle,Upgrades — Jonathan Lewis @ 1:14 pm BST Jul 7,2017

The title is a well-known shorthand for parameter optimizer_features_enable and it has been the topic of a recent blog post by Mike Dietrich in which he decries the practice of switching the parameter back to an older version on an upgrade (even though, as he points out, Oracle support has been known to recommend it and the manuals describe – though not with 100% accuracy – why you might do so).

I am one of the people who will suggest that on the upgrade a client should consider setting the optimizer_features_enable to the version just left behind as a strategy for getting to a newer version of the base code while minimising the threat of plan instability, so I’m going to play devil’s advocate in this case even though, as we shall see, I am nearly 100% in favour of Mike’s complaint.

The first point, of course, is full disclosure to the client.  Eventually they will have to set the parameter to the current database version, all they’re doing to trying to spread out the workload of addressing a perceived threat. Moreover, they are only minimising the threat, not eliminating it. Setting the parameter has the effect of changing the state of a long list of parameters and “fix controls” – but there’s no guarantee that it will reverse out all the code changes between the two versions. One hopes it won’t reverse out a bug-fix (though Mike quotes a MoS note where exactly that problem appears); more significantly it might not reverse out a clever code optimisation that (in a few unlucky cases) happens to make the SQL run more slowly even when a new transformation is not involved. What you’re hoping for when you set this parameter is that the number of places in your application where you get an unlucky change in performance is much smaller than it would be if you didn’t set the parameter.

The second point is that you really want to have the minimum impact possible while doing expending as little human effort as possible. To this end it’s better to think in terms of setting the parameter for specific users (via a logon trigger), or specific sessions (e.g. batch runs), or specific statements (through a hint or SQL Patch). It may take a couple of test runs to spot the critical classes of statements that point you at the right granularity of implementation, but the more of your SQL that runs at the newer optimizer level the better.

If you’re going to aim for minimum impact, though, and if you’ve got the time to do some broad-brush testing it’s worth going back to my comment that this parameter is a big switch for a number of parameters and fix controls. Perhaps you will be able to spot which new feature, or which fix control is the one thing that needs to be changed – in the short-term – for your system.  Again, statement level is preferable to session level, which is preferable to user level, which is preferable to system level.

The thought of adding a controlling parameter as hint to a statement will probably have some people thinking about creating SQL baselines rather than adding hints to code – and if it’s 3rd party code then an SQL Baseline may be the necessary strategy. Bear in mind that a common advisory for upgrades is “create SQL Baselines for all your SQL first” – it wouldn’t have been me that said it, though!  So here’s something to consider in the light of the whole yes/no argument about optimizer_features_enable, what does a baseline look like ? Here, taken from an 11g database is the critical content of a baseline for “select user from dual”:


IGNORE_OPTIM_EMBEDDED_HINTS
ALL_ROWS
DB_VERSION('11.2.0.4')
OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
OUTLINE_LEAF(@"SEL$1")

It’s a set of hints – including the optimizer_features_enable() hint.Using SQL Baselines to stabilise your code on the upgrade leaves you exposed (in principle) to exactly the problem in the MoS notes that Mike cited as his example of the parameter undoing a bug-fix and producing wrong results. That, by the way, is why I’m not worried by Mike’s example: if the parameter re-introduces a bug then you would have been running with the bug – or probably a workaround to the bug – anyway (Unless, say, you upgraded from 11.2.0.1 to 11.2.0.4 and decided to set the parameter to 11.2.0.3.1 – but that’s not a strategy compatible with the idea of using the parameter for stability with minimum short-term change.)

The second MoS note that Mike cites is really the one that states – emphasis is mine – a realistic view of the parameter (though I’d view the restriction to Oracle Global Support is a legal cop-out rather than a pure technology requirement):

Modifying the OPTIMIZER_FEATURES_ENABLE parameter generally is strongly discouraged and should only be used as a short term measure at the suggestion of Oracle Global Support.

The follow-up comment is, to my mind, a bit hand-wavy:

By reducing the OPTIMIZER_FEATURES_ENABLE level, new optimizer features are disabled. This has serious potential for negatively affecting performance generally by eliminating the possibility of choosing better plans that are only available with features enabled within the higher revision levels.

Arguing the case against setting the parameter because of the potential for affecting performance negatively – when you’re doing it so that nothing changes – is about as valid as the argument for setting it because of the potential for affecting performance negatively in a tiny percentage of plans that use new features when it’s a very bad idea.

Bottom line: whether or not you set the parameter you’re likely to hit a few edge cases where performance suffers; the less time you have for proper testing in advance the more likely you are to feel the need to set the parameter – but if you start heading in that direction think about using the time you do have available to minimise the scope, or even getting down to the detail of which ACTUAL feature is the problem feature that needs to be disabled for your system.

Footnote

If you want to check which parameters and fix controls change as you set the optimizer_features_enable you could mess around with the dynamic performance views. Alternatively you could take advantage of the optimizer trace – it’s one of the easy things that the 10053 offers.  Enable the trace, optimize a simple statement, then check the trace file for the bit about optimizer parameters – the section you need from 12c trace will be as follows:


***************************************
PARAMETERS USED BY THE OPTIMIZER
********************************

... Some 1,700 lines

***************************************
Column Usage Monitoring is ON: tracking level = 1
***************************************

In my case I connected to SQL*Plus, enabled the trace and executed “select 1 from dual”; then I reconnected, set the trace again, set the optimizer_features_enable back to 11.2.0.4 (I was on 12.1.0.2 at the time) and executed “select 2 from dual”. Then I deleted everything but the relevant section from the two trace files. One of the joys of Unix is that you can then run commands like the following:

sdiff  -s  or32_ora_2098.trc  or32_ora_2110.trc  |  expand  >ofe_diff.txt

That’s “side by side comparison, showing only the differences, expand tab marks out to spaces”. Here’s the result (with one blank line inserted between the parameters and the fix controls):

                                                              > optimizer_features_enable           = 11.2.0.4
                                                              > _fix_control_key                    = -1750344682
optimizer_features_enable           = 12.1.0.2                <
_optimizer_undo_cost_change         = 12.1.0.2                | _optimizer_undo_cost_change         = 11.2.0.4
_fix_control_key                    = 0                       <
_optimizer_cube_join_enabled        = true                    | _optimizer_cube_join_enabled        = false
_optimizer_hybrid_fpwj_enabled      = true                    | _optimizer_hybrid_fpwj_enabled      = false
_px_replication_enabled             = true                    | _px_replication_enabled             = false
_optimizer_partial_join_eval        = true                    | _optimizer_partial_join_eval        = false
_px_concurrent                      = true                    | _px_concurrent                      = false
_px_object_sampling_enabled         = true                    | _px_object_sampling_enabled         = false
_optimizer_unnest_scalar_sq         = true                    | _optimizer_unnest_scalar_sq         = false
_px_filter_parallelized             = true                    | _px_filter_parallelized             = false
_px_filter_skew_handling            = true                    | _px_filter_skew_handling            = false
_optimizer_multi_table_outerjoin    = true                    | _optimizer_multi_table_outerjoin    = false
_px_groupby_pushdown                = force                   | _px_groupby_pushdown                = choose
_optimizer_ansi_join_lateral_enhance = true                   | _optimizer_ansi_join_lateral_enhance = false
_px_parallelize_expression          = true                    | _px_parallelize_expression          = false
_optimizer_ansi_rearchitecture      = true                    | _optimizer_ansi_rearchitecture      = false
_optimizer_gather_stats_on_load     = true                    | _optimizer_gather_stats_on_load     = false
_px_adaptive_dist_method            = choose                  | _px_adaptive_dist_method            = off
_optimizer_batch_table_access_by_rowid = true                 | _optimizer_batch_table_access_by_rowid = false
_px_wif_dfo_declumping              = choose                  | _px_wif_dfo_declumping              = off
_px_wif_extend_distribution_keys    = true                    | _px_wif_extend_distribution_keys    = false
_px_join_skew_handling              = true                    | _px_join_skew_handling              = false
_px_partial_rollup_pushdown         = adaptive                | _px_partial_rollup_pushdown         = off
_px_single_server_enabled           = true                    | _px_single_server_enabled           = false
_optimizer_dsdir_usage_control      = 126                     | _optimizer_dsdir_usage_control      = 0
_px_cpu_autodop_enabled             = true                    | _px_cpu_autodop_enabled             = false
_optimizer_use_gtt_session_stats    = true                    | _optimizer_use_gtt_session_stats    = false
_optimizer_adaptive_plans           = true                    | _optimizer_adaptive_plans           = false
_optimizer_proc_rate_level          = basic                   | _optimizer_proc_rate_level          = off
_adaptive_window_consolidator_enabled = true                  | _adaptive_window_consolidator_enabled = false
_optimizer_strans_adaptive_pruning  = true                    | _optimizer_strans_adaptive_pruning  = false
_optimizer_null_accepting_semijoin  = true                    | _optimizer_null_accepting_semijoin  = false
_optimizer_cluster_by_rowid         = true                    | _optimizer_cluster_by_rowid         = false
_optimizer_cluster_by_rowid_control = 129                     | _optimizer_cluster_by_rowid_control = 3
_distinct_agg_optimization_gsets    = choose                  | _distinct_agg_optimization_gsets    = off
_gby_vector_aggregation_enabled     = true                    | _gby_vector_aggregation_enabled     = false
_optimizer_vector_transformation    = true                    | _optimizer_vector_transformation    = false
_optimizer_aggr_groupby_elim        = true                    | _optimizer_aggr_groupby_elim        = false
_optimizer_reduce_groupby_key       = true                    | _optimizer_reduce_groupby_key       = false
_optimizer_cluster_by_rowid_batched = true                    | _optimizer_cluster_by_rowid_batched = false
_optimizer_inmemory_table_expansion = true                    | _optimizer_inmemory_table_expansion = false
_optimizer_inmemory_gen_pushable_preds = true                 | _optimizer_inmemory_gen_pushable_preds = false
_optimizer_inmemory_autodop         = true                    | _optimizer_inmemory_autodop         = false
_optimizer_inmemory_access_path     = true                    | _optimizer_inmemory_access_path     = false
_optimizer_inmemory_bloom_filter    = true                    | _optimizer_inmemory_bloom_filter    = false
_optimizer_nlj_hj_adaptive_join     = true                    | _optimizer_nlj_hj_adaptive_join     = false
_px_external_table_default_stats    = true                    | _px_external_table_default_stats    = false
_optimizer_inmemory_minmax_pruning  = true                    | _optimizer_inmemory_minmax_pruning  = false
_optimizer_inmemory_cluster_aware_dop = true                  | _optimizer_inmemory_cluster_aware_dop = false

    fix  9898249 = enabled                                    |     fix  9898249 = disabled
    fix 10004943 = enabled                                    |     fix 10004943 = disabled
    fix  9554026 = enabled                                    |     fix  9554026 = disabled
    fix  9593547 = enabled                                    |     fix  9593547 = disabled
    fix  9833381 = enabled                                    |     fix  9833381 = disabled
    fix 10106423 = enabled                                    |     fix 10106423 = disabled
    fix 10175079 = enabled                                    |     fix 10175079 = disabled
    fix 10236566 = enabled                                    |     fix 10236566 = disabled
    fix  9721228 = enabled                                    |     fix  9721228 = disabled
    fix  9929609 = enabled                                    |     fix  9929609 = disabled
    fix 10182672 = enabled                                    |     fix 10182672 = disabled
    fix  9832338 = enabled                                    |     fix  9832338 = disabled
    fix 11668189 = enabled                                    |     fix 11668189 = disabled
    fix 11940126 = enabled                                    |     fix 11940126 = disabled
    fix 12390139 = enabled                                    |     fix 12390139 = disabled
    fix 11744016 = enabled                                    |     fix 11744016 = disabled
    fix 10216738 = enabled                                    |     fix 10216738 = disabled
    fix 12563419 = enabled                                    |     fix 12563419 = disabled
    fix 12535474 = enabled                                    |     fix 12535474 = disabled
    fix 12561635 = enabled                                    |     fix 12561635 = disabled
    fix 12569245 = enabled                                    |     fix 12569245 = disabled
    fix 12569300 = enabled                                    |     fix 12569300 = disabled
    fix 12569316 = enabled                                    |     fix 12569316 = disabled
    fix 12569321 = enabled                                    |     fix 12569321 = disabled
    fix  9002958 = enabled                                    |     fix  9002958 = disabled
    fix 12810427 = enabled                                    |     fix 12810427 = disabled
    fix 12914055 = enabled                                    |     fix 12914055 = disabled
    fix 12978495 = enabled                                    |     fix 12978495 = disabled
    fix 13110511 = enabled                                    |     fix 13110511 = disabled
    fix 13345888 = enabled                                    |     fix 13345888 = disabled
    fix 13396096 = enabled                                    |     fix 13396096 = disabled
    fix 12999577 = enabled                                    |     fix 12999577 = disabled
    fix 12954320 = enabled                                    |     fix 12954320 = disabled
    fix 13036910 = enabled                                    |     fix 13036910 = disabled
    fix 12648629 = enabled                                    |     fix 12648629 = disabled
    fix 13704977 = enabled                                    |     fix 13704977 = disabled
    fix 11843466 = enabled                                    |     fix 11843466 = disabled
    fix 13909909 = enabled                                    |     fix 13909909 = disabled
    fix 12856200 = enabled                                    |     fix 12856200 = disabled
    fix  9852856 = enabled                                    |     fix  9852856 = disabled
    fix 14033181 = enabled                                    |     fix 14033181 = disabled
    fix 13836796 = enabled                                    |     fix 13836796 = disabled
    fix 13699643 = enabled                                    |     fix 13699643 = disabled
    fix 13735304 = enabled                                    |     fix 13735304 = disabled
    fix 14464068 = enabled                                    |     fix 14464068 = disabled
    fix 13448445 = enabled                                    |     fix 13448445 = disabled
    fix  9114915 = enabled                                    |     fix  9114915 = disabled
    fix 13109345 = enabled                                    |     fix 13109345 = disabled
    fix 14605040 = enabled                                    |     fix 14605040 = disabled
    fix 14633570 = enabled                                    |     fix 14633570 = disabled
    fix 13573073 = enabled                                    |     fix 13573073 = disabled
    fix 16237969 = enabled                                    |     fix 16237969 = disabled
    fix 13994546 = enabled                                    |     fix 13994546 = disabled
    fix 14750443 = enabled                                    |     fix 14750443 = disabled
    fix 14552075 = enabled                                    |     fix 14552075 = disabled
    fix 16324844 = enabled                                    |     fix 16324844 = disabled
    fix 13583529 = enabled                                    |     fix 13583529 = disabled
    fix 14565911 = enabled                                    |     fix 14565911 = disabled
    fix 16368002 = enabled                                    |     fix 16368002 = disabled
    fix 16077770 = enabled                                    |     fix 16077770 = disabled
    fix 11814337 = enabled                                    |     fix 11814337 = disabled
    fix 14764840 = enabled                                    |     fix 14764840 = disabled
    fix 16555865 = enabled                                    |     fix 16555865 = disabled
    fix 16625151 = enabled                                    |     fix 16625151 = disabled
    fix 16609749 = enabled                                    |     fix 16609749 = disabled
    fix 16751246 = enabled                                    |     fix 16751246 = disabled
    fix 16749025 = enabled                                    |     fix 16749025 = disabled
    fix 16750067 = enabled                                    |     fix 16750067 = disabled
    fix 15899648 = enabled                                    |     fix 15899648 = disabled
    fix 16690013 = enabled                                    |     fix 16690013 = disabled
    fix 16544878 = enabled                                    |     fix 16544878 = disabled
    fix 16725982 = enabled                                    |     fix 16725982 = disabled
    fix 14648222 = enabled                                    |     fix 14648222 = disabled
    fix 16507317 = enabled                                    |     fix 16507317 = disabled
    fix 16837274 = enabled                                    |     fix 16837274 = disabled
    fix 14085520 = enabled                                    |     fix 14085520 = disabled
    fix 16713081 = enabled                                    |     fix 16713081 = disabled
    fix 14703295 = enabled                                    |     fix 14703295 = disabled
    fix 16908409 = enabled                                    |     fix 16908409 = disabled
    fix 16212250 = enabled                                    |     fix 16212250 = disabled
    fix 17087729 = enabled                                    |     fix 17087729 = disabled
    fix 17088819 = enabled                                    |     fix 17088819 = disabled
    fix 13848786 = enabled                                    |     fix 13848786 = disabled
    fix 13522189 = enabled                                    |     fix 13522189 = disabled
    fix 16796185 = enabled                                    |     fix 16796185 = disabled
    fix 15950252 = enabled                                    |     fix 15950252 = disabled
    fix 16976121 = enabled                                    |     fix 16976121 = disabled
    fix 16582322 = enabled                                    |     fix 16582322 = disabled
    fix 16712213 = enabled                                    |     fix 16712213 = disabled
    fix 17382690 = enabled                                    |     fix 17382690 = disabled
    fix 14846352 = enabled                                    |     fix 14846352 = disabled
    fix 16516751 = enabled                                    |     fix 16516751 = disabled
    fix  8611462 = enabled                                    |     fix  8611462 = disabled
    fix 14062749 = enabled                                    |     fix 14062749 = disabled
    fix 16346018 = enabled                                    |     fix 16346018 = disabled
    fix 12977599 = enabled                                    |     fix 12977599 = disabled
    fix 14191778 = enabled                                    |     fix 14191778 = disabled
    fix 15939321 = enabled                                    |     fix 15939321 = disabled
    fix 17543180 = enabled                                    |     fix 17543180 = disabled
    fix 17301564 = enabled                                    |     fix 17301564 = disabled
    fix 12373708 = enabled                                    |     fix 12373708 = disabled
    fix 17397506 = enabled                                    |     fix 17397506 = disabled
    fix 14558315 = enabled                                    |     fix 14558315 = disabled
    fix 16615686 = enabled                                    |     fix 16615686 = disabled
    fix 16622801 = enabled                                    |     fix 16622801 = disabled
    fix 16954950 = enabled                                    |     fix 16954950 = disabled
    fix 17728161 = enabled                                    |     fix 17728161 = disabled
    fix 17760375 = enabled                                    |     fix 17760375 = disabled
    fix 17640863 = enabled                                    |     fix 17640863 = disabled
    fix 17716301 = enabled                                    |     fix 17716301 = disabled
    fix 17597748 = enabled                                    |     fix 17597748 = disabled
    fix 17303359 = enabled                                    |     fix 17303359 = disabled
    fix 16673868 = enabled                                    |     fix 16673868 = disabled
    fix 17800514 = enabled                                    |     fix 17800514 = disabled
    fix 14826303 = enabled                                    |     fix 14826303 = disabled
    fix 17663076 = enabled                                    |     fix 17663076 = disabled
    fix 17760755 = enabled                                    |     fix 17760755 = disabled
    fix 17997159 = enabled                                    |     fix 17997159 = disabled
    fix 14733442 = enabled                                    |     fix 14733442 = disabled
    fix 17781659 = enabled                                    |     fix 17781659 = disabled
    fix 17526569 = enabled                                    |     fix 17526569 = disabled
    fix 17760686 = enabled                                    |     fix 17760686 = disabled
    fix 17696414 = enabled                                    |     fix 17696414 = disabled
    fix 18116777 = enabled                                    |     fix 18116777 = disabled
    fix 16052625 = enabled                                    |     fix 16052625 = disabled
    fix 18091750 = enabled                                    |     fix 18091750 = disabled
    fix 17572606 = enabled                                    |     fix 17572606 = disabled
    fix 18196576 = enabled                                    |     fix 18196576 = disabled
    fix 17736165 = enabled                                    |     fix 17736165 = disabled
    fix 16434021 = enabled                                    |     fix 16434021 = disabled
    fix 18035463 = enabled                                    |     fix 18035463 = disabled
    fix 18011820 = enabled                                    |     fix 18011820 = disabled
    fix 16405740 = enabled                                    |     fix 16405740 = disabled
    fix 18365267 = enabled                                    |     fix 18365267 = disabled
    fix 17863980 = enabled                                    |     fix 17863980 = disabled
    fix 18398980 = enabled                                    |     fix 18398980 = disabled
    fix 18304693 = enabled                                    |     fix 18304693 = disabled
    fix 18508675 = enabled                                    |     fix 18508675 = disabled
    fix 18456944 = enabled                                    |     fix 18456944 = disabled
    fix 17908541 = enabled                                    |     fix 17908541 = disabled
    fix 18467455 = enabled                                    |     fix 18467455 = disabled
    fix 16033838 = enabled                                    |     fix 16033838 = disabled
    fix 16809786 = enabled                                    |     fix 16809786 = disabled
    fix 18425876 = enabled                                    |     fix 18425876 = disabled
    fix 18461984 = enabled                                    |     fix 18461984 = disabled
    fix 17023040 = enabled                                    |     fix 17023040 = disabled
    fix 14776289 = enabled                                    |     fix 14776289 = disabled

That’s 50 parameter differences, and 147 fix controls. Quite a lot of fixes between the two versions.

If you’re coming to the upgrade a couple of years late then you might want to consider using the new version number and list of parameters you generate as the criteria as a search for bugs in MoS. You might even find that simply running your eye down the list of parameters gives you a clue about a type of execution plan that you’ve never seen in the older version.

 

May 25, 2017

Parallelism

Filed under: 12c,CBO,Hints,Ignoring Hints,Oracle — Jonathan Lewis @ 3:48 pm BST May 25,2017

Headline – if you don’t want to read the note – the /*+ parallel(N) */ hint doesn’t mean a query will use parallel execution, even if there are enough parallel execution server processes to make it possible. The parallel(N) hint tells the optimizer to consider the cost of using parallel execution for each path that it examines, but ultimately the optimizer will still take the lowest cost path (bar the odd few special cases) and that path could turn out to be a serial path.

The likelihood of parallelism appearing for a given query changes across versions of Oracle so you can be fooled into thinking you’re seeing bugs as you test new versions but it’s (almost certainly) the same old rule being applied in different circumstances. Here’s an example – which I’ll start off on 11.2.0.4:


create table t1
segment creation immediate
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                          id,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 ; create index t1_i1 on t1(id); begin dbms_stats.gather_table_stats( ownname => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

set autotrace traceonly explain

select
        count(v1)
from    t1
where   id = 10
;

select
        /*+ parallel(4) */
        count(v1)
from    t1
where   id = 10
;

select
        /*+ parallel(4) full(t1) */
        count(v1)
from    t1
where   id = 10
;

set autotrace off

I haven’t declare the index to be unique, but it clearly could be; and it’s obvious that with 1M rows and about 120M of table a parallel full scan is probably a bad idea to acquire one row (even if you’re running Exadata!). So what do we get for the three plans – I’ll skip the predicate section – when we want to collect one row.


Base plan - unhinted
--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |    16 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |       |     1 |    16 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |     1 |    16 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Hinted parallel(4)
--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |    16 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |       |     1 |    16 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |     1 |    16 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Hinted parallel(4) and full(t1)
----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |    16 |   606   (2)| 00:00:02 |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |    16 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |    16 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |    16 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          |     1 |    16 |   606   (2)| 00:00:02 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| T1       |     1 |    16 |   606   (2)| 00:00:02 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------

In 11.2.0.4 the optimizer did consider the parallel hint when it appeared on its own – but it has compared the parallel(4) cost of 606 with the serial index cost of 4 and chosen the indexed access path. This is not a case of ignoring the hint, it’s an example of being fooled if you don’t know how the hint is really supposed to work.

But here’s an interesting change that appeared in 12.2 – this time just the plan with the parallel(4) hint on its own:


---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |          |     1 |    16 |     4   (0)| 00:00:01 |        |      |            |
|   1 |  SORT AGGREGATE                         |          |     1 |    16 |            |          |        |      |            |
|   2 |   PX COORDINATOR                        |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)                  | :TQ10001 |     1 |    16 |            |          |  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE                      |          |     1 |    16 |            |          |  Q1,01 | PCWP |            |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| T1       |     1 |    16 |     4   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   6 |       PX RECEIVE                        |          |     1 |       |     3   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   7 |        PX SEND HASH (BLOCK ADDRESS)     | :TQ10000 |     1 |       |     3   (0)| 00:00:01 |  Q1,00 | S->P | HASH (BLOCK|
|   8 |         PX SELECTOR                     |          |       |       |            |          |  Q1,00 | SCWC |            |
|*  9 |          INDEX RANGE SCAN               | T1_I1    |     1 |       |     3   (0)| 00:00:01 |  Q1,00 | SCWP |            |
---------------------------------------------------------------------------------------------------------------------------------

You get a parallel execution plan – although it starts with a serial index range scan which is operated for the new (12c) PX Selector operator that allocates a serial operation to one of the parallel execution slaves – which, approximately, is why the indexed access cost doesn’t change in this example – rather than running it through the query coordinator (QC). The serial range scan does a hash distribution (hashed by block address of the rowids it finds to avoid collisions between parallel execution slave as they do their table accesses.

This is just one cute little trick that makes it worth looking at the upgrade to 12c – this new path is likely to be of benefit to people who had to create global (as opposed to globally partitioned) indexes on partitioned tables.

This note was prompted by a recent twitter comment by Timur Akhmadeev followed in short order by an OTN posting that added further confusion to the problem by running Siebel – which is just one of several 3rd party products that love to configure optimizer parameters with non-standard values like: optimizer_index_cost_adj = 1, or optimizer_mode = first_rows_10. (At the last update I’ve seen on the thread, there seemed to be some other reason why parallelism was being blocked.)

Footnote

In a follow-up tweet, Timue directed me to the 11.2 SQL Language Reference manual – specifically a section on the Parallel Hint, asking if this was an example of a documentation bug.

The trouble with the manuals is that sometimes they are obviously wrong, sometimes they are wrong but it’s not obvious they are wrong, sometimes they omit important information, and sometimes they are badly written and, most specfically, the writing can be ambiguous.

Here’s an extract we could consider:

For PARALLEL, if you specify integer, then that degree of parallelism will be used for the statement.

But my example above shows a “parallel({integer})” hint where we didn’t use that degree of parallelism for the statement.

However the next two sentences read as follows:

If you omit integer, then the database computes the degree of parallelism. All the access paths that can use parallelism will use the specified or computed degree of parallelism.

So what if the optimizer uses the degree of parallelism while calculating the lowest cost plan and ends up with a serial plan ? How comfortable would you feel saying that Oracle has “used the degree of parallelism for the statement”. Or would you say that the first sentence means Oracle isn’t allowed to use a serial plan even if it finds one when doing the arithmetic with the appropriate degree of parallelism.

My call is that this is one of those ambiguous cases – the manual should say something more like:

For PARALLEL, if you specify integer, then that degree of parallelism will be used by the optimizer while calculating the best execution  plan for the statement.

Even then I’m not sure that that’s a complete statement of how the hint works because when you have a full set of system statistics, or have used the dbms_resource_manager.calibrate_io mechanism to tell Oracle about the I/O capacity of the system the optimizer may do some working that says something like: “the hint says degree 64, but the stats say the maximum effective degree will be 38 so I’ll calculate using 38” (This type of thing happens with the older usage of the parallel hint with manual parallelism – I haven’t examined what happens with an automatic policy and the newer option for the hint.)

 

April 14, 2017

Character selectivity

Filed under: CBO,Oracle — Jonathan Lewis @ 12:40 pm BST Apr 14,2017

A recent OTN posting asked how the optimizer dealt with “like” predicates for character types quoting the DDL and a query that I had published some time ago in a presentation I had done with Kyle Hailey. I thought that I had already given a detailed answer somewhere on my blog (or even in the presentation) but found that I couldn’t track down the necessary working, so here’s a repeat of the question and a full explanation of the working.

The query is very simple, and the optimizer’s arithmetic takes an “obvious” strategy in the arithmetic. Here’s the sample query, with the equiavalent query that we can use to do the calculation:


select * from t1 where alpha_06 like 'mm%';

select * from t1 where alpha_06 >= 'mm' and alpha_06 < 'mn';

Ignoring the possible pain of the EBCDIC character set and multi-byte national-language character sets with “strange” collation orders, it should be reasonably easy to see that ‘mn’ is the first string in alphabetical order that fails to match ‘mm%’. With that thought in mind we can apply the standard arithmetic for range-based predicates assuming, to stick with the easy example, that there are no histograms involved. For a range closed at one end and and open at the other the selectivity is:


( ( 'mn' - 'mm') / (high_value - low_value) ) + 1/num_distinct

The tricky bits, of course, are how you subtract ‘mm’ from ‘mn’ and how you use the values stored in the low_value and high_value columns of view user_tab_cols. So let’s generate the orginal data set and see where we go (running on 12c, and eliminating redundant bits from the original presentation):


rem
rem     Script:         selectivity_like_char.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sep 2013
rem

execute dbms_random.seed(0)

create table t1 nologging as
with generator as (
        select rownum id
        from dual
        connect by rownum <= 1000
)
select
        cast(dbms_random.string('l',6) as char(6))      alpha_06
from
        generator,
        generator
where
        rownum <= 1e6 -- > comment to avoid WordPress formatting issue
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1')

column low_value  format a32
column high_value format a32

select
        column_name,
        num_distinct,
        density,
        low_value,
        high_value
from
        user_tab_cols
where
        table_name = 'T1'
order by
        column_name
;

select min(alpha_06), max(alpha_06) from t1;

set autotrace traceonly explain

select
        *
from
        t1
where
        alpha_06 like 'mm%'
;

set autotrace off

It will probably take a couple of minutes to generate the data – it’s 1M random strings, lower-case, 6 characters fixed – and will take up about 12MB of space. Here are the results from the stats and min/max queries, with the execution plan for the query we are testing:


COLUMN_NAME          NUM_DISTINCT    DENSITY LOW_VALUE                  HIGH_VALUE
-------------------- ------------ ---------- -------------------------- --------------------------
ALPHA_06                  1000000    .000001 616161616E72               7A7A7A78747A


MIN(AL MAX(AL
------ ------
aaaanr zzzxtz


Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   157 |  1099 |   265  (20)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   157 |  1099 |   265  (20)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("ALPHA_06" LIKE 'mm%')


Given that there are power(26,6) = 308,915,776 different combinations available for lower-case strings of 6 charactgers it’s not too surprising that Oracle generated 1M different strings, nor is it particularly surprising that the lowest value string started with ‘aaa’ and the highest with ‘zzz’.

So how do we get 157 as the cardinality for the query or, to put it another way, how do we get 0.000157 as the selectivity of the predicate. We need to refer to a note I wrote a few years ago to help us on our way (with a little caveat due to a change that appeared in 11.2.0.4) – what number would Oracle use to represent ‘mm’ and the other three strings we need to work with ?

According to the rules supplied (and adjusted in later versions) we have to:

  1. pad the strings with ASCII nulls (zeros) up to 15 bytes
  2. treat the results as a hexadecimal number and convert to decimal
  3. round off the last 21 decimal digits

We can model this in SQL with a statement like:


SQL> column dec_value format 999,999,999,999,999,999,999,999,999,999,999,999
SQL> select round(to_number(utl_raw.cast_to_raw(rpad('aaaanr',15,chr(0))),'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'),-21) dec_val from dual;

DEC_VAL
------------------------------------------------
505,627,904,294,763,000,000,000,000,000,000,000

1 row selected.

As an alternative, or possibly a cross-check, I created a table with a varchar2(6) column, inserted the four values I was interested in and created a histogram of 4 buckets on the column (there’s a suitable little demo at this URL) and got the following endpoint values:


ENDPOINT_NUMBER                                   ENDPOINT_VALUE
--------------- ------------------------------------------------
              1  505,627,904,294,763,000,000,000,000,000,000,000
              2  568,171,140,227,094,000,000,000,000,000,000,000
              3  568,191,422,636,698,000,000,000,000,000,000,000
              4  635,944,373,827,734,000,000,000,000,000,000,000

Once we’ve got these numbers we can slot them into the standard formula (not forgetting the 1/1,000,000 for the closed end of the predicate) – and to save typing I’m going to factor out 10^21 across the board in the division:

Selectivity = (568,191,422,636,698 – 568,171,140,227,094) / (635,944,373,827,734 – 505,627,904,294,763) + 1/1,000,000

Selectivity = 20,282,409,604 / 130,316,469,532,971 + 1/1,000,000

Selectivity = 0.00015564 + 0.000001 = 0.00015664

From which the cardinality = (selectivity * num_rows) = 156.64, which rounds up to 157. Q.E.D.

March 27, 2017

Index out of range

Filed under: CBO,Indexing,Oracle,Troubleshooting — Jonathan Lewis @ 8:42 am BST Mar 27,2017

I’ve waxed lyrical in the past about creating suitable column group statistics whenever you drop an index because even when the optimizer doesn’t use an index in its execution path it might have used the number of distinct keys of the index (user_indexes.distinct_keys) in its estimates of cardinality.

I’ve also highlighted various warnings (here (with several follow-on URLs) and here) about when the optimizer declines to use column group statistics. One of those cases is when a predicate on one of the columns goes “out of  range” – i.e. is below the column low_value or above the column high_value. Last night it suddenly crossed my mind that if we drop an index and replace it with a column group we might see an example of inconsistent behaviour: what happens when the index exists but the predicate is out of range – would you find that dropping the index and replacing it with a column group would give you different cardinality estimates for out of range predicates ?

Here’s the demonstration of what happened when I created a simple test on 12.1.0.2:


rem
rem     Script:         index_v_colgrp.sql
rem     Author:         Jonathan Lewis
rem
rem     Last tested
rem             12.1.0.2
rem

create table t1
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                          id,
        mod(rownum-1,100)               n1,
        mod(rownum-1,100)               n2,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format problem
;

begin dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index t1_i1 on t1(n1, n2);

set autotrace traceonly explain

I’ve created a table with 1M rows, where n1 and n2 are perfectly correlated – there are 100 distinct pairs of values (ranging from (0,0) to (99,99)). Now with autotrace enabled I’ll execute three queries – two with an index on the table of which one will be the baseline plan for predicates that are “in-range” and the other will take the predicates out of range, and the third after I’ve dropped the index and substituted a matching column group to see what I get for the “out of range” plan. The plans may produce different paths as the index disappears, of course, but what we’re only interested in the cardinality estimates in this experiment.

Here’s the code to run the three queries:


select  padding
from    t1
where
        n1 = 50
and     n2 = 50
;

select  padding
from    t1
where
        n1 = 110
and     n2 = 110
;

drop index t1_i1;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for columns (n1, n2) size 1'
        );
end;
/

select  padding
from    t1
where
        n1 = 110
and     n2 = 110
;

And the three execution plans:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  1044K|  2142   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  1044K|  2142   (4)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N1"=50 AND "N2"=50)


---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |    79 |  8453 |    83   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    79 |  8453 |    83   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |    79 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=110 AND "N2"=110)


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    79 |  8453 |  2142   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    79 |  8453 |  2142   (4)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N1"=110 AND "N2"=110)

In summary:

  • With the index in place and the predicates in range the optimizer used user_indexes.distinct_keys to calculate cardinality.
  • With the index in place and the predicates (or just one of them, in fact) out of range the optimizer used the individual column selectivities with linear decay.
  • With a column group instead of an index the optimizer behaved exactly as it used to with the index in place.

So my concern that substituting column groups for indexes was unfounded – the optimizer was being silly (legal disclaimer: that’s just my opinion) with indexes, and the silly (ditto) behaviour with column groups hasn’t changed anything.

I’ll have to go back a couple of versions of Oracle to repeat these tests – maybe this behaviour with user_indexes.distinct_keys in place is relatively recent, but it’s another reason why execution plans may change suddenly and badly as time passes when “nothing changed”.

 

March 23, 2017

min/max Upgrade

Filed under: Bugs,CBO,Execution plans,Indexing,Oracle,Troubleshooting — Jonathan Lewis @ 8:53 am BST Mar 23,2017

A question came up on the OTN database forum a little while ago about a very simple query that was taking different execution paths on two databases with the same table and index definitions and similar data. In one database the plan used the “index full scan (min/max)” operation while the other database used a brute force “index fast full scan” operation.

In most circumstances the starting point to address a question like this is to check whether some configuration details, or some statistics, or the values used in the query are sufficiently different to result in a significant change in costs; and the first simple procedure you can follow is to hint each database to use the plan from the opposite database to see if this produces any clues about the difference – it’s a good idea when doing this test to use one of the more verbose formatting options for the call to dbms_xplan.

In this case, though, the OP discovered a note on MoS reporting exactly the problem he was seeing:

Doc ID 2144428.1: Optimizer Picking Wrong ‘INDEX FAST FULL SCAN’ Plan vs Correct ‘INDEX FULL SCAN (MIN/MAX)’

which referred to

Bug 22662807: OPTIMIZER PICKING INDEX FFS CAN INSTEAD OF MIN/MAX

Conveniently the document suggested a few workarounds:

  • alter session set optimizer_features_enable = ‘11.2.0.3’;
  • alter session set “_fix_control” = ‘13430622:off’;
  • delete object stats [Ed: so that dynamic sampling takes place … maybe a /*+ dynamic_sampling(alias level) */ hint would suffice].

Of the three options my preference would (at least in the short term) be the _fix_control one. Specifically, from the v$system_fix_control view, we can see that it addresses the problem very precisely with the description: “index min/max cardinality estimate fix for filter predicates”.

The example in the bug note showed a very simple statement (even more simple than the OP’s query which was only a single table query anyway), so I thought I’d build a model and run a few tests to see what was going on. Luckily, before I’d started work, one of the other members of the Oak Table network sent an email to the list asking if anyone knew how the optimizer was costing an example he’d constructed – and I’ve finally got around to looking at his example, and here’s the model and answer(s), starting with the data set:


rem
rem     Script:         test_min_max.sql
rem     Dated:          March 2017
rem
rem     Last tested
rem             12.1.0.2
rem             11.2.0.4
rem             11.2.0.3
rem

create table min_max_test nologging
as
with ids as (
        select /*+ Materialize */ rownum  id from dual connect by rownum <= 50000 -- > comment to protect formatting
),
line_nrs as (
        select /*+ Materialize */  rownum line_nr from dual connect by rownum <= 20 -- > comment to protect formatting
)
select
        id, line_nr ,rpad(' ', 800, '*') data
from
        line_nrs, ids
order by
        line_nr, id
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'min_max_test',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index mmt_ln_id on min_max_test (line_nr, id) nologging;
create index mmt_id    on min_max_test (id)          nologging;

The table has two critical columns: each id has 20 line_nr values associated with it, but the way the data was generated means that the line numbers for a given id are scattered across 20 separate table blocks.

There are two indexes – one on the id which will allow us to find all the rows for a given id as efficiently as possible, and one (slightly odd-looking in this context) that would allow us to find a specific row for a given line_nr and id very efficiently. Two things about these indexes – in a live application they should both be compressed on the first (only, in the case of index mmt_id) column, and secondly the necessity of the mmt_id index is questionable and it might be an index you could drop if you reversed the order of the columns in mmt_ln_id. The thing about these indexes, though, is that they allow us to demonstrate a problem. So let’s query the data – twice, hinting each index in turn:


variable b1 number;
exec :b1 := 50000;

set serveroutput off

select
        /*+ index(t(id)) */
        min(line_nr)
from
        min_max_test t
where
        id = :b1
;

select * from table(dbms_xplan.display_cursor);

select
        /*+ index(t(line_nr, id)) */
        min(line_nr)
from
        min_max_test t
where
        id = :b1
;

select * from table(dbms_xplan.display_cursor);

It’s fairly safe to make a prediction about the execution plan and cost of the first query – it’s likely to be a range scan that accesses a couple of branch blocks, a leaf block and 20 separate table blocks followed by a “sort aggregate” – with a cost of about 23.

It’s a little harder to make a prediction about the second query. The optimizer could infer that the min(line_nr) has to be close to the left hand section of the index, and could note that the number of rows in the table is the same as the product of the number of distinct values of the two separate columns, and it might note that the id column is evenly distributed (no histogram) across the data, so it might “guess” that it need only range scan all the entries for the first line_nr to find the appropriate id. So perhaps the optimizer will use the index min/max range scan with a cost that is roughly 2 branch blocks plus total leaf blocks / 20 (since there are 20 distinct values for line_nr); maybe it would divide the leaf block estimate by two because “on average” – i.e. for repeated random selections of value for id – it would have to scan half the leaf blocks. There were 2,618 leaf blocks in my index, so the cost should be close to either 133 or 68.

Here are the two plans – range scan first, min/max second:


select  /*+ index(t(id)) */  min(line_nr) from  min_max_test t where id = :b1
-----------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |              |       |       |    23 (100)|          |
|   1 |  SORT AGGREGATE                      |              |     1 |     8 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| MIN_MAX_TEST |    20 |   160 |    23   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | MMT_ID       |    20 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("ID"=:B1)

select  /*+ index(t(line_nr, id)) */  min(line_nr) from  min_max_test t where  id = :b1
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |    22 (100)|          |
|   1 |  SORT AGGREGATE             |           |     1 |     8 |            |          |
|   2 |   FIRST ROW                 |           |     1 |     8 |    22   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| MMT_LN_ID |     1 |     8 |    22   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("ID"=:B1)

Spot on with the estimate for the simple range scan – but what did we do wrong with the estimate for the min/max scan ? You might notice in the first example the “table access by rowid batched” and realise that this is running on 12c. Here’s the plan if I get if I set the optimizer_features_enable back to 11.2.0.3 before running the second query again:


select  /*+ index(t(line_nr, id)) */  min(line_nr) from  min_max_test t where  id = :b1
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |   136 (100)|          |
|   1 |  SORT AGGREGATE             |           |     1 |     8 |            |          |
|   2 |   FIRST ROW                 |           |     1 |     8 |   136   (1)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| MMT_LN_ID |     1 |     8 |   136   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("ID"=:B1)

Using the 11.2.0.3 optimizer model the plan has a cost that’s very close to our prediction – we’ll see why there’s a slight difference in a moment. If we set the optimizer_features_enable to 11.2.0.4 the cost drops back to 22. So for our example 11.2.0.3 will use the simple “index range scan” and an upgrade to 11.2.0.4 (or higher) will switch to the “index full scan (min/max)”. If you look at the OTN posting the impact of the change in costing is exactly the other way around – 11.2.0.3 uses the min/max path, 11.2.0.4 uses the simple index range scan.

The techy bit

You really don’t need to know this – experimenting with the optimizer_features_enable (or _fix_control) will give you plans that show you all the numbers you need to see to check whether or not you’ve run into this particular problem – but if you’re interested here’s a little bit from the two 10053 trace files. We need only look at a few critical lines. From the 11.2.0.3 costing for the min/max scan:


Index Stats::
  Index: MMT_ID  Col#: 1
  LVLS: 2  #LB: 2202  #DK: 50000  LB/K: 1.00  DB/K: 20.00  CLUF: 1000000.00  NRW: 1000000.00
  Index: MMT_LN_ID  Col#: 2 1
  LVLS: 2  #LB: 2618  #DK: 1000000  LB/K: 1.00  DB/K: 1.00  CLUF: 125000.00  NRW: 1000000.00

SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for MIN_MAX_TEST[T]
  Column (#1): ID(NUMBER)
    AvgLen: 5 NDV: 50536 Nulls: 0 Density: 0.000020 Min: 1.000000 Max: 50000.000000
  Table: MIN_MAX_TEST  Alias: T
    Card: Original: 1000000.000000  Rounded: 20  Computed: 19.787874  Non Adjusted: 19.787874

 ****** Costing Index MMT_LN_ID
  Access Path: index (Min/Max)
    Index: MMT_LN_ID
    resc_io: 135.000000  resc_cpu: 961594
    ix_sel: 1.000000  ix_sel_with_filters: 1.9788e-05
    Cost: 135.697679  Resp: 135.697679  Degree: 1

I was running 12.1.0.2 so there were a few extra bits and pieces that I’ve deleted (mostly about SQL Plan Directives and in-memory). Critically we can see that the stats collection has a small error for the ID column – 50,536 distinct values (NDV) instead of exactly 50,000. This seems to have given us a cost for the expected index range of: 2 (blevel) + ceiling(2618 (leaf blocks) * 50536 / 1000000) = 2 + ceil(132.3) = 135, to which we add a bit for the CPU and get to 136. (Q.E.D.)

Then we switch to costing for 11.2.0.4:


SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for MIN_MAX_TEST[T]
  Column (#1): ID(NUMBER)
    AvgLen: 5 NDV: 50536 Nulls: 0 Density: 0.000020 Min: 1.000000 Max: 50000.000000
  Table: MIN_MAX_TEST  Alias: T
    Card: Original: 1000000.000000  Rounded: 20  Computed: 19.787874  Non Adjusted: 19.787874

 ****** Costing Index MMT_LN_ID
  Access Path: index (Min/Max)
    Index: MMT_LN_ID
    resc_io: 21.787874  resc_cpu: 156872
    ix_sel: 1.000000  ix_sel_with_filters: 1.9788e-05
    Cost: 22.324608  Resp: 22.324608  Degree: 1

We still have the small error in the number of distinct values for id, so the estimated number of rows that we need to access from the table for a given id (before “aggregating” to find its minimum line_nr) is 19.787874 (Computed: / Non Adjusted:) rather than exactly 20. Notice, then, that the cost of using the index is 19.787874 + 2 which looks suspiciously like adding the blevel to the number of table blocks to get a cost and forgetting that we might have to kiss a lot of frogs before we find the prince. Basically, in this example at least, it looks like the costing algorithm has NOTHING to do with the mechanics of what actually has to happen at run-time.

Footnote

This is only an initial probe into what’s going on with the min/max scan; there are plenty more patterns of data that would need to be tested before we could have any confidence that we had produced a generic model of how the optimizer does its calculations – the only thing to note so far is that there IS a big change as  you move from 11.2.0.3 to later versions: the case on OTN showed the min/max scan disappearing on the upgrade, the example above shows the min/max disappearing on the downgrade – either change could be bad news for parts of a production system.

There are a couple of related bugs that might also be worth reviewing.

  • Bug 11834402 : CBO CHOOSES A SLOW INDEX FULL SCAN OVER A MUCH FASTER INDEX RANGE SCAN
  • Bug 13430622 : INDEX SCAN IN VERY SLOW FOR ONE PREDICATE AND FAST FOR OTHERS

There is a note, though that this last bug was fixed in 12.1

Footnote 2

When experimenting, one idea to pursue as the models get more complex and you’re using indexes with more than two columns is to test whether the presence of carefully chosen column group statistics might make a difference to the optimizer’s estimates of cardinality (hence cost) of the min/max scan.

March 9, 2017

Join Elimination

Filed under: CBO,Execution plans,Infrastructure,Oracle — Jonathan Lewis @ 6:39 pm BST Mar 9,2017

A question has just appeared on OTN describing a problem where code that works in 11g doesn’t work in 12c (exact versions not specified). The code in question is a C-based wrapper for some SQL, and the problem is a buffer overflow problem. The query supplied is as follows:


select T1.C1 from T1, T2 where T1.C1 = T2.D1;

The problem is that this works in 11g where the receiving (C) variable is declared as

char myBuffer [31];

but it doesn’t work in 12c unless the receiving variable is declared as:

char myBuffer [51];

There’s an important bit of background information that might be giving us a clue about what’s happened (although what I’m about to describe isn’t actually the problem unless the SQL provided is a simplified version of the problem SQL that is expected to display the problem). Column C1 is defined as char(30) and column D1 is defined as char(50). Here’s some sample code showing why you might need a buffer of 50+1 bytes to hold something that ought to be 30+1 bytes long. (This may be nothing to do with the anomaly described in the original posting – it’s just something I thought of when I first saw the question.)


rem     Script:         join_elimination_oddity.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2017

create table t1(
        c30     char(30) primary key
);

create table t2(
        d50     char(50) references t1
);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

explain plan for
select
        t1.c30
from
        t1, t2
where
        t1.c30 = t2.d50
;

select * from table(dbms_xplan.display(null,null,'projection'));

So we’re selecting c30 – the 30 byte character column – from t1; what do we actually get ? Here’s the plan with the projection:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    52 |     1   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |     1 |    52 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T2"."D50" IS NOT NULL)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - "T2"."D50"[CHARACTER,50]

Table t1 has been eliminated and the projected column is the “equivalent” column from t2 – which is too long for the expected output. To work around this problem you can disable join elimination either by parameter (_optimizer_join_elimination_enabled=false) or by hinting /*+ no_eliminate_join(t1) */ in which case the plan (with my data) became a nested loop join from t2 to t1 with column c1 projected as expected.

Footnote:

Two things to note about my demonstration

  • If you’re going to create a referential integrity constraint between columns they do need to be of exactly the same type. (If you want to get technical you might say “in the same domain” – but Oracle doesn’t do domains.)
  • This extremely simple case demonstrates the problem in 11.2.0.4 as well as 12.1.0.2. Possibly a more complex query could be produced where (thanks to limitations in query transformations) 11g doesn’t spot the option for join elimination while 12c does; alternatively, a very simple two-column example in 11g won’t do join elimination while a two-column example in 12.2 can (though it doesn’t always) – so upgrading to 12.2 MIGHT cause more people to see this anomaly appearing.

Update (14/3/17)

A follow-up to the original posting has identified source of the problem. Rather than a side effect of join elimination the anomaly has appeared because of the “partial join evaluation” mechanism introduced in 12c. There is a hint to disable the mechanism /*+ no_partial_join(alias) */ or – if you can’t get hints into place for all the necessary code – it would be possible to disable the feature completely through the use of a hidden parameter “_optimizer_partial_join_eval”=false.  The usual rule of checking with Oracle support before setting hidden parameter applies, of course.

 

March 1, 2017

I don’t know (yet)

Filed under: CBO,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 12:53 pm BST Mar 1,2017

Here’s a question that came to mind while reading a recent question on the OTN database forum. It’s a question to which I don’t know the answer and, at present, I don’t really want to bother modelling at present – although if I were on a customer site and this looked like a likely explanation for a performance anomaly it’s the sort of thing I would create a model for.

If I have a query that runs parallel and does a “hash join buffered” (see, for example, this URL), it’s possible that the processes creating the build table will manage to create the build table in memory without spilling to disc but then find themselves unable to keep the incoming probe table data in memory and spill it to disc before re-reading it to do the join and forward the results to the parent process.

Here’s the plan from the URL above, showing rowsource execution stats:


------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Starts | E-Time   |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem |  O/1/M   | Max-Tmp |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |      1 |          |        |      |            |  70101 |00:00:00.17 |      12 |      0 |      0 |       |       |          |         |
|   1 |  PX COORDINATOR         |          |      1 |          |        |      |            |  70101 |00:00:00.17 |      12 |      0 |      0 |       |       |          |         |
|   2 |   PX SEND QC (RANDOM)   | :TQ10002 |      0 | 00:00:01 |  Q1,02 | P->S | QC (RAND)  |      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|*  3 |    HASH JOIN BUFFERED   |          |      2 | 00:00:01 |  Q1,02 | PCWP |            |  70101 |00:00:00.28 |       0 |    310 |    310 |  5952K|  1953K|     2/0/0|    2048 |
|   4 |     PX RECEIVE          |          |      2 | 00:00:01 |  Q1,02 | PCWP |            |  70101 |00:00:00.08 |       0 |      0 |      0 |       |       |          |         |
|   5 |      PX SEND HASH       | :TQ10000 |      0 | 00:00:01 |  Q1,00 | P->P | HASH       |      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|   6 |       PX BLOCK ITERATOR |          |      2 | 00:00:01 |  Q1,00 | PCWC |            |  70101 |00:00:00.05 |    1031 |   1005 |      0 |       |       |          |         |
|*  7 |        TABLE ACCESS FULL| T1       |     26 | 00:00:01 |  Q1,00 | PCWP |            |  70101 |00:00:00.02 |    1031 |   1005 |      0 |       |       |          |         |
|   8 |     PX RECEIVE          |          |      2 | 00:00:01 |  Q1,02 | PCWP |            |  70102 |00:00:00.06 |       0 |      0 |      0 |       |       |          |         |
|   9 |      PX SEND HASH       | :TQ10001 |      0 | 00:00:01 |  Q1,01 | P->P | HASH       |      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  10 |       PX BLOCK ITERATOR |          |      2 | 00:00:01 |  Q1,01 | PCWC |            |  70102 |00:00:00.04 |    1031 |   1005 |      0 |       |       |          |         |
|* 11 |        TABLE ACCESS FULL| T2       |     26 | 00:00:01 |  Q1,01 | PCWP |            |  70102 |00:00:00.01 |    1031 |   1005 |      0 |       |       |          |         |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Note that we have a “hash join buffered” at operation 3, which means incoming data from t2 (the probe table) was buffered before being joined; but we can see 310 writes and reads to temporary space due to that operation. That’s actually t2 spilling from the buffer to disc and then being read back from disc after the t2 scan (at operation 11) completes.

If this pattern of activity appears how is it reflected in the statistics?

Because the build table was completely built in memory you could argue for an “optimal workarea execution”; but because the probe table was temporarily spilled to disc you could argue for a “one-pass workarea operation”. In this case I was able to say that the writes and reads for operation 3 were the t2 table spilling to disc  because (a) I understand (mostly) how parallel hash joins work, (b) I had trace files I could check for timing, and (c) the O/1/M column of the output show 2 optimal operations and no one-pass operations.

The question to which I don’t (yet) know the answer is this: is the I/O caused by this spill to disc captured in v$pgastat under the stats:

NAME                                              VALUE UNIT
-------------------------------------------- ---------- ------------
extra bytes read/written                              0 bytes
cache hit percentage                                100 percent

If so, then you end up with questions like the one on OTN:

How come OEM reports my PGA cache Hit Ratio at 76% when I’ve got so much more memory than seems necessary ?

With the follow-up comment:

All my workarea executions were optimal

Footnote

Though I haven’t examined it recently, I remember noting in the past that the buffer allocation for the incoming probe table was typically a small fraction (often 1/8th or 1/16th) of the size of the buffer allocated for the build table. This may help to explain why you could end up doing a lot of I/O without exhausting the available memory.

I’m probably going to resurrect the example from the linked blog note and check the effects before I publish this post.

 

February 28, 2017

Cost is Time (again)

Filed under: CBO,Execution plans,Oracle — Jonathan Lewis @ 11:19 am BST Feb 28,2017

The hoary old question about lower cost queries running faster or slower that higher cost queries has appeared once again on the OTN database forum. It’s one I’ve addressed numerous times in the past – including on this blog – but the Internet being what it is the signal keeps getting swamped by the noise. This time around a couple of “new” thoughts crossed my mind when reading the question.

There is a Time column on the standard forms of the execution plan output, and the description of this column is available in the manuals and has been for years (here’s a definition from v$sql_plan from 10gR2, for example):

Elapsed time (in seconds) of the operation as estimated by the optimizer’s cost-based approach. For statements that use the rule-based approach, this column is null.

So the first question is this: why are people looking at the cost when they’re asking about the time ? The second question arises from the bit in brackets (parentheses): the time is given in seconds – so how accurate do you think the optimizer’s estimates of ANYTHING are when the best estimate the optimizer will give you for run-time has a granularity of a second ?

Of course there’s a further observation I could make (which only echoes the first question):  I don’t think I’ve ever seen anyone come up with the question: “Will a query with a lower value for Time run faster or slower than a query with a higher value for Time?”

Bottom Line:

Cost is supposed to be a measure of resource usage (per execution of each operation) and should therefore be a measure of time – but the model fails in many ways so when a plan clearly doesn’t meet reasonable expectations for performance you can (often) use the Cost column as an indicator of where the model has failed and this may give you some clues of how to address the problem.

It is unfortunate that before you can recognise when a particular Cost figure is bad you usually need to know something about the data content, the data distribution  pattern, the run-time caching effects, and the way the optimizer does its arithmetic.

 

October 24, 2016

Anniversary OICA

Filed under: CBO,Oracle,Performance,Statistics,Troubleshooting — Jonathan Lewis @ 1:00 pm BST Oct 24,2016

Happy anniversary to me!

On this day 10 years ago I published the first article in my blog. It was about the parameter optimizer_index_cost_adj (hence OICA), a parameter that has been a  source of many performance problems and baffled DBAs over the years and, if you read my first blog posting and follow the links, a parameter that should almost certainly be left untouched.

It seems appropriate to mention it today because I recently found a blog posting (dated 3rd May 2013) on the official Oracle Blogs where the director for Primavera advises setting this parameter to 1 (and the optimizer_index_caching parameter to 90) for the Primavera P6 OLTP (PMDB) database. The recommendation is followed by a fairly typical “don’t blame me” warning, viz: “As with any changes that affect query optimization, it is paramount to TEST, TEST and TEST again. At least these settings are easily adjusted or change back to the original value”.

Here’s a thought, though: setting the optimizer_index_cost_adj to the extreme value 1 is a catastrophic change so don’t suggest it unless you are extremely confident that it’s almost certain to be the right thing to do. If you’re confident that it’s a good idea to reduce the parameter to a much smaller value than the default then suggest a range of values that varies from “ideal if it works, but high risk” to “low risk and mostly helpful”. Maybe a suggestion like: “Primavera P6 OLTP (PMDB) tends to work best with this parameter set to a value in the range of 1 to 15” would be a more appropriate comment from someone in a position of authority.

Here’s another thought: if you work for Oracle you could always contact the optimizer group to present them with your argument for the strategy and see what they think about it. Then you can include their opinion when you offer your suggestion.

For what it’s worth, here’s my opinion: as a general rule you shouldn’t be working around performance issues by fiddling with the optimizer_index_cost_adj; as a specific directive do not set it to 1. If you want to encourage Oracle to be enthusiastic about indexes in general then adjust the system statistics (preferably with a degree of truth). If you need to persuade Oracle that particular indexes are highly desirable than you can use dbms_stats.set_index_stats() to adjust the clustering_factor (and avg_data_blocks_per_key) of those indexes. If you are running 11.2.0.4 or later then you can use dbms_stats.set_table_prefs() to set the “table_cached_blocks” parameter for tables where you think Oracle should be particularly keen on using indexes but isn’t; and if your queries are suffering from bad cardinality estimates because of a pattern of multi-column filter predicates create some column group (extended) statistics.

Why am I so firmly set against setting the optimizer_index_cost_adj to 1 ? Because it doesn’t tell Oracle to “use indexes instead of doing tablescans”, it tells Oracle that every index is just about as good as every other index for almost any query. Here’s a pdf file of an article (formerly published on DBAZine and then on my old website) I wrote over twelve years ago explaining the issue. Various links in the article no longer work, and the data pattern was generated to display the problem in 8i and 9i and you would need to modify the data to display the same effect in newer versions of Oracle – but the principle remains the same.

If you would like to see a slightly newer example of how the parameter causes problems. Here’s a thread dated April 2012 from the OTN database forum where a SYS-recursive query caused a performance problem because the parameter was set 1.

 

August 31, 2016

Parallel_index hint

Filed under: Bugs,CBO,Oracle,Parallel Execution — Jonathan Lewis @ 1:28 pm BST Aug 31,2016

Prompted by a recent OTN posting I’ve dug out from my library the following demonstration of an anomalty with the parallel_index() hint. This note is a warning about  how little we understand hints and what they’re supposed to mean, and how we can be caught out by an upgrade. We’ll start with a data set which, to match a comment made in the origina posting rather than being a necessity for the demonstration, has an index that I’ve manipulated to be larger than the underlying table:


rem
rem     Script:         parallel_index_hint_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          December 2005
rem

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum,1e4)         modded,
        lpad(rownum,10,'0')     v1,
        lpad('x',30,'x')        padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index t1_i1 on t1(modded) pctfree 75;
alter table t1 modify modded not null;

Your figures may vary slightly if you try to recreate this model, but according to my stats blocks=7876 for the table and leaf_blocks=8054 for the index. Now I’m going to generate the execution plans for a simple query – count(modded) from the table – with a varying selection of hints.


select /*+ index_ffs(t1 t1_i1) */ count(modded) from t1;

select /*+ parallel_index(t1 t1_i1 20) */ count(modded) from t1;

select /*+ index_ffs(t1 t1_i1) parallel_index(t1 t1_i1 20) */ count(modded) from t1;

select /*+ parallel_index(t1 t1_i1 20) parallel(t1 10) */ count(modded) from t1;

Pause for thought

  • which of the four statements will have a plan that uses an index fast full scan ?
  • which of the four statements will have a plan that indicates parallel execution ?
  • which of the four statements will indicate a parallel index fast full scan ?
  • why are two of the plans going to be identical but with different costs ?

The most interesting bit of this note is in the last question because it’s also the answer to a more subtle “why didn’t Oracle do what I thought it should” question. Here are the four plans I got from an instance of 11.2.0.4 (with a little labelling to remind us about the hinting):


==================================================
index_ffs() on it's own - get index fast full scan
==================================================

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |  1043   (4)| 00:00:06 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_I1 |  1000K|  1043   (4)| 00:00:06 |
-----------------------------------------------------------------------

=====================================================
parallel_index() on it's own
Get serial tablescan which is cheaper than serial FFS
=====================================================

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |  1027   (4)| 00:00:06 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |  1000K|  1027   (4)| 00:00:06 |
-------------------------------------------------------------------

========================================================
parallel_index() with index_ffs()
Get parallel fast full scan - at same cost as SERIAL FFS
========================================================

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |     1 |  1043   (4)| 00:00:06 |        |      |            |
|   1 |  SORT AGGREGATE           |          |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR          |          |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000 |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE        |          |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR    |          |  1000K|  1043   (4)| 00:00:06 |  Q1,00 | PCWC |            |
|   6 |       INDEX FAST FULL SCAN| T1_I1    |  1000K|  1043   (4)| 00:00:06 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------

===============================================
parallel_index() with parallel()
Get parallel fast full scan - costed correctly.
(Not costed at the degree given for table).
===============================================

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |     1 |    58   (4)| 00:00:01 |        |      |            |
|   1 |  SORT AGGREGATE           |          |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR          |          |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)    | :TQ10000 |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE        |          |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR    |          |  1000K|    58   (4)| 00:00:01 |  Q1,00 | PCWC |            |
|   6 |       INDEX FAST FULL SCAN| T1_I1    |  1000K|    58   (4)| 00:00:01 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------

First plan – we supplied the index_ffs() hint, it’s valid: so Oracle did an index fast full scan. The serial cost was 1,043.

Second plan – we supplied the parallel_index() hint: but the optimizer apparently didn’t consider the cost of a parallel index fast full scan (which ought to have cost something like (1043/20)/0.9 = 58; instead it chose the serial tablescan at a cost of 1027 (cheaper than a serial index fast full scan because of the way I had defined a very large index).

Third plan – we supplied the parallel_index() hint with an explicit index_ffs() hint: both were legal so the optimizer obeyed the hints and produced a parallel index fast full scan (which is what we might have hoped would happen automatically for the second query). But the cost of the query is 1,043 – the cost of the serial index fast full scan.

Final plan – we didn’t hint an index_ffs() we hinted parallel() and parallel_index(): “strangely” Oracle has selected the parallel index fast full scan – and this time the cost is 58 (which, as I pointed out above, is (1043/20)/0.9, the value it’s “supposed” to be). It looks as if Oracle uses the parallel scaling factor on the index only if the table is ALSO hinted (or defined) to be treated to parallel execution.

Note, by the way, that I’ve hinted the table parallel 10 so that the cost of the parallel tablescan (ca. (1027/10)/0.9 = 114) will be larger than the cost of the parallel index fast full scan. Normally you’re likely to have the same degree hinted for table and index, and the index is likely to be smaller than the table which means that if you see this anomaly in production it’s more likely to be a case of Oracle using a parallel tablescan when you’re expecting a parallel index fast full scan. Oracle will have compared the serial cost of an index fast full scan against the parallel cost of the table.

Footnote:

When I ran the script on 12c there was one important change. The cost of the “correctly costed” index fast full scan came out at 103 rather than 58. Oracle has used the degree from the parallel() hint for the tablescan to calculate the cost of the parallel index fast full scan. If you are unlucky enough to have some code that has conflicting degrees hinted (or defined) across tables and indexes then you may see some plans change because parallel index fast full scans suddenly change their cost.

Addendum (prompted by Randolf Geist’s comment): The plan in 12c also had a Note: “Degree of Parallelism is 4 because of table property”. This was a little odd because the degree of the table was 1 and the hint said 10, but the arithmetic of the tablescan had ( as indicated) used a DOP of 4 – despite reporting Degree=0 in the 10053 trace file. The only way I could get the number 4 was to look at the CPU parameters – bearing in mind the critical parallel parameters:

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     2
parallel_threads_per_cpu             integer     2

parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_degree_limit                string      CPU
parallel_degree_policy               string      MANUAL
parallel_io_cap_enabled              boolean     FALSE

The anomaly of “degree 10” appearing in the costing for the index fast full scan was echoed in the 10053 trace: the file reported “Degree: 20”, then showed the cost being divided by 10.

At run-time, though, the query DID run parallel 20 when hinted with both the parallel() and parallel_index() hints, and ran parallel 10 (while costing and repeating the note about parallel 4) when modified to have only the /*+ parallel(t1 10) */ hint.

August 16, 2016

Month End

Filed under: audit,CBO,Non-technical — Jonathan Lewis @ 1:04 pm BST Aug 16,2016

A question about parallel query and cardinality estimates appeared on OTN a little while ago that prompted me to write this note about helping the optimizer do the best job with the least effort.  (A critical point in the correct answer to the original question is that parallel query may lead to “unexpected” dynamic sampling, which can make a huge difference to the choice of execution plans, but that’s another matter.)

The initial cardinality error in the plan came from the following predicate on a “Date dimension” table:


      AR_DATE.CALENDAR_DT   = AR_DATE.MONTH_END_DT 
AND   AR_DATE.CALENDAR_DT  >= ADD_MONTHS(TRUNC(SYSDATE,'fmyy'), -36) 
AND   AR_DATE.MONTH_END_DT >= ADD_MONTHS(TRUNC(SYSDATE,'fmyy'), -36)

In the parallel plan the estimated number of rows on a full tablescan of the table was 742, while on the serial plan the same tablescan produced a cardinality of 1. You will appreciate that having an estimate of 1 (or less) that is nearly three orders of magnitude wrong is likely to lead to a very bad execution plan.

My first thought when I saw this was (based on a purely intuitive interpretation): “there’s one day every month that’s the last day of the month and we’re looking at roughly that last 36 months so we might predict a cardinality of about 36”. That’s still a long way off the 742 estimate and 1,044 actual for the parallel query, but it’s a warning flag that the serial estimate is probably an important error – it’s also an example of the very simple “sanity checking” mental exercises that can accompany almost any execution plan analysis.

My second thought (which happened to be wrong, and would only have been right some time well before version 10.2.0.5) was that the optimizer would treat the add_months() expressions as unknown values and assign a selectivity of 5% to each of the predicates, reducing the combined selectivity to 1/400th of the selectivity it gave to the first predicate. In fact the optimizer evaluates the expressions and would have used the normal (required range / total range) calculation for those two predicates.

It’s the first predicate that I want to examine, though – how does the optimizer calculate a selectivity for it ? Here’s some code to generate sample data to use for testing.


rem
rem     Script:         month_end.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table t1
nologging
as
select
        rownum                                                   id,
        to_date('01-Jan-2010','dd-mon-yyyy') + rownum - 1       calendar_date,
        add_months(
                trunc(to_date('01-Jan-2010','dd-mon-yyyy') + rownum - 1 ,'MM' ),
                1
        ) - 1                                                   month_end_date
from
        dual
connect by
        level <= trunc(sysdate) - to_date('01-jan_2010','dd-mon-yyyy') + 1
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1')

This clunky bit of code gives me consecutive dates from 1st Jan 2010 up to “today” with the month_end_date column holding the month end date corresponding to the row’s calendar_date. So now we can check what the optimizer makes of the predciate calendar_date = month_end_date:


set autotrace on explain

select count(*) from t1 where calendar_date = month_end_date;

  COUNT(*)
----------
        79

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    16 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    16 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |    16 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CALENDAR_DATE"="MONTH_END_DATE")

Looking at operation 2 we can see that, in effect, the optimizer has considered two independent predicates “calendar_date = {unknown}” and “month_end_date = {unknown}” and taken the lower of the two selectivities – which means the cardinality estimate is 1 because the calendar_date column is unique across this table.

There are various ways to work around problems like this. One of the simplest would be to tell Oracle to sample this table with the (table-level) hint /*+ dynamic_sampling(t1 1) */; in fact, since this predicate is effectively treated as an examination of two predicates the (cursor-level) hint /*+ dynamic_sampling(4) */ would also cause sampling to take place – note that level 4 or higher is required to trigger sampling for “multiple” predicates on a single table. As a general guideline we always try to minimise the risk of side effects so if this problem were embedded in a much larger query I would prefer the table-level hint over the cursor-level hint.

There are other options, though, that would allow you to bypass sampling – provided you can modify the SQL. The script I used to create this table also included the following statement:


alter table t1 add (
        date_offset1 generated always as (calendar_date - month_end_date) virtual,
        date_flag generated always as (case when calendar_date - month_end_date = 0 then 'Y' end) virtual
);

In 12c I would declare these virtual columns to be invisible to avoid problems with any SQL that didn’t use explicit column lists. For demonstration purposes I’ve set up two options – I can find the rows I want with one of two obvious predicates:

    date_offset1 = 0
    date_flag = 'Y'

In fact there’s a third predicate I could use that doesn’t need to know about the virtual columns:

    calendar_date - month_end_date = 0

Unfortunately I can’t arbitrarily swap the order of the two dates in the last predicate, and the optimizer won’t spot that it is also equivalent to “calendar_date = month_end_date”. Here are a few execution plans – for which the only significant bit is the cardinality estimate of the full tablescans:


select count(*) from t1 where date_flag = 'Y';

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     2 |     4  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     2 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    79 |   158 |     4  (25)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DATE_FLAG"='Y')



select count(*) from t1 where date_offset1 = 0;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    78 |   312 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DATE_OFFSET1"=0)



select count(*) from t1 where calendar_date - month_end_date = 0;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    78 |   312 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."DATE_OFFSET1"=0)

It’s interesting to note that the optimizer has transformed the last predicate into the equivalent virtual column expression to do the arithmetic. You might also note that the date_flag option is slightly more accurate, but that’s because it’s based on an expression which is null for the rows we don’t want while the date_offset1 column has a value for every row and a little bit of arithmetical rounding comes into play. You might also note that there’s a small cost difference – which I’d ascribe to the CPU cost that the optimizer has added for the CASE expression being applied on top of the simple date arithmetic.

Of course, whatever else you might play around with when working around a cardinality problem like this, I think the strategic aim for a data warehouse system would be to get a REAL flag column on the table and populate it at data loading time if month-end dates played an important part in the proceedings – though I have to say that the virtual flag column is one I do rather fancy.

 

July 26, 2016

Lost Concatenation

Filed under: Bugs,CBO,Oracle — Jonathan Lewis @ 10:46 am BST Jul 26,2016

This note models one feature of a problem that came up at a client site recently from a system running 12.1.0.2 – a possible bug in the way the optimizer handles a multi-column in-list that can lead to extremely bad cardinality estimates.

The original query was a simple three table join which produced a bad plan with extremely bad cardinality estimates; there was, however, a type-mismatch in one of the predicates (of the form “varchar_col = numeric”), and when this design flaw was addressed the plan changed dramatically and produced good cardinality estimates. The analysis of the plan, 10053 trace, and 10046 trace files done in-house suggested that the problem might relate in some way to an error in the handling of SQL Plan Directives to estimate cardinalities.

This was one of my “solve it in a couple of hours over the internet” assignments and I’d been sent a sample of the original query with the 10046 and 10053 trace files, and a modified version of the query that bypassed the problem, again including the 10046 and 10053 trace files, with a request to explain the problem and produce a simple test case to pass to Oracle support.

The first thing I noticed was that there was something very strange about the execution plan. Here’s the query and plan in from my simplified model, showing the same anomaly:


select  /*+ no_expand */
        count(*)
from    t1, t2
where
        t2.shipment_order_id = t1.order_id
and     (t1.id, t2.v1) in ( (5000, 98), (5000, 99))
;

-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |       |       |       |   331 (100)|          |
|   1 |  SORT AGGREGATE                       |       |     1 |    19 |       |            |          |
|*  2 |   HASH JOIN                           |       |     1 |    19 |  2056K|   331   (5)| 00:00:01 |
|   3 |    TABLE ACCESS FULL                  | T2    |   100K|   878K|       |   219   (3)| 00:00:01 |
|   4 |    TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   100K|   976K|       |     2   (0)| 00:00:01 |
|   5 |     BITMAP CONVERSION TO ROWIDS       |       |       |       |       |            |          |
|   6 |      BITMAP OR                        |       |       |       |       |            |          |
|   7 |       BITMAP CONVERSION FROM ROWIDS   |       |       |       |       |            |          |
|*  8 |        INDEX RANGE SCAN               | T1_PK |       |       |       |     1   (0)| 00:00:01 |
|   9 |       BITMAP CONVERSION FROM ROWIDS   |       |       |       |       |            |          |
|* 10 |        INDEX RANGE SCAN               | T1_PK |       |       |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."SHIPMENT_ORDER_ID"="T1"."ORDER_ID")
       filter((
                  (TO_NUMBER("T2"."V1")=98 AND "T1"."ID"=5000) 
               OR (TO_NUMBER("T2"."V1")=99 AND "T1"."ID"=5000)
       ))
   8 - access("T1"."ID"=5000)
  10 - access("T1"."ID"=5000)

Before going on I meed to remind you that this is modelling a production problem. I had to use a hint to block a transformation that the optimizer wanted to do with my data set and statistics, I’ve got a deliberate type-mismatch in the data definitions, and there’s a simple rewrite of the SQL that would ensure that Oracle does something completely different).

The thing that caught my eye was the use of the bitmap transformation (operations 5,7,9) using exactly the same index range scan twice (operations 8,10). Furthermore, though not visible in the plan, the index in question was (as the name suggests) the primary key index on the table and it was a single column index – and “primary key = constant” should produce an “index unique scan” not a range scan.

Once you’ve added in the fact that operations 8 and 10 are the same “primary key = constant” predicates, you can also pick up on the fact that the cardinality calculation for the table access to table t1 can’t possibly produce more than one row – but it’s reporting a cardinality estimate of 100K rows (which happens to be the number of rows in the table.)

As a final point, you can see that there are no “Notes” about Dynamic Statistics or SQL Directives – this particular issue is not caused by anything to do with 12c sampling. In fact, having created the model, I ran it on 11.2.0.4 and got the same strange bitmap conversion and cardinality estimate. In the case of the client, the first pass the optimizer took went through exactly the same sort of process and produced a plan which was (probably) appropriate for a query where the driving table was going to produce (in their case) an estimated 4 million rows – but not appropriate for the actual 1 row that should have been identified.

In my example, if I allowed concatenation (i.e. removed the no_expand hint) I got the following plan:


------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |       |       |       |     8 (100)|          |
|   1 |  SORT AGGREGATE                        |       |     1 |    19 |            |          |
|   2 |   CONCATENATION                        |       |       |       |            |          |
|   3 |    NESTED LOOPS                        |       |     1 |    19 |     4   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID        | T1    |     1 |    10 |     2   (0)| 00:00:01 |
|*  5 |      INDEX UNIQUE SCAN                 | T1_PK |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2    |     1 |     9 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | T2_I1 |     1 |       |     1   (0)| 00:00:01 |
|   8 |    NESTED LOOPS                        |       |     1 |    19 |     4   (0)| 00:00:01 |
|   9 |     TABLE ACCESS BY INDEX ROWID        | T1    |     1 |    10 |     2   (0)| 00:00:01 |
|* 10 |      INDEX UNIQUE SCAN                 | T1_PK |     1 |       |     1   (0)| 00:00:01 |
|* 11 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2    |     1 |     9 |     2   (0)| 00:00:01 |
|* 12 |      INDEX RANGE SCAN                  | T2_I1 |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T1"."ID"=5000)
   6 - filter(TO_NUMBER("T2"."V1")=99)
   7 - access("T2"."SHIPMENT_ORDER_ID"="T1"."ORDER_ID")
  10 - access("T1"."ID"=5000)
  11 - filter((TO_NUMBER("T2"."V1")=98 AND (LNNVL(TO_NUMBER("T2"."V1")=99) OR
              LNNVL("T1"."ID"=5000))))
  12 - access("T2"."SHIPMENT_ORDER_ID"="T1"."ORDER_ID")

This is a much more appropriate plan – and similar to the type of plan the client saw when they eliminated the type-mismatch problem (I got a completely different plan when I used character values ’98’ and ’99’ in the in-list or when I used a numeric column with numeric literals).

Examining my 10053 trace file I found the following:

  • In the BASE STATISTICAL INFORMATION, the optimizer had picked up column statistics about the order_id column, but not about the id column in the in-list – this explained why the cardinality estimate was 100K, Oracle had “lost” the predicate.
  • In the “SINGLE TABLE ACCESS PATH”, the optimizer had acquired the statistics about the id column and calculated the cost of using the t1_pk index to access the table for a single key (AllEqUnique), then calculated the cost of doing a bitmap conversion twice (remember we have two entries in the in-list – it looks like the optimizer has “rediscovered” the predicate). But it had still kept the table cardinality of 4M.

After coming up with a bad plan thanks to this basic cardinality error, the 10053 trace file for the client’s query then went on to consider or-expansion (concatenation). Looking at this part of their trace file I could see that the BASE STATISTICAL INFORMATION now included the columns relevant to the in-list and the SINGLE TABLE ACCESS PATH cardinalities were suitable. Moreover when we got to the GENERAL PLANS the join to the second table in the join order showed a very sensible cost and cardinality – unfortunately, having been sensible up to that point, the optimizer then decided that an SQL Plan Directive should be used to generate a dynamic sampling query to check the join cardinality and the generated query again “lost” the in-list predicate, resulting in a “corrected” cardinality estimate of 6M instead of a correct cardinality estimate of 1. As usual, this massive over-estimate resulted in Oracle picking the wrong join method with a huge cost for the final join in the client’s query – so the optimizer discarded the or-expansion transformation and ran with the bad bitmap/hash join plan.

Bottom line for the client – we may have seen the same “lose the predicate” bug appearing in two different ways, or we may have seen two different “lose the predicate” bugs – either way a massive over-estimate due to “lost” predicates during cardinality calculations resulted in Oracle picking a very bad plan.

Footnote:

If you want to do further testing on the model, here’s the code to generate the data:


create table t1
nologging
as
with generator as (
        select  rownum id
        from    dual
        connect by
                level <= 1e4
)
select
        rownum                                  id,
        rownum                                  order_id,
        rpad('x',100)                           padding
from
        generator, generator
where
        rownum <= 1e5
;

execute dbms_stats.gather_table_stats(user,'t1')

alter table t1 modify order_id not null;
alter table t1 add constraint t1_pk primary key(id);


create table t2
nologging
as
with generator as (
        select  rownum id
        from    dual
        connect by
                level <= 1e4
)
select
        rownum                                  shipment_order_id,
        mod(rownum-1,1000)                      n1,
        cast(mod(rownum-1,1000) as varchar2(6)) v1,
        rpad('x',100)                           padding
from
        generator, generator
where
        rownum <= 1e5
;

execute dbms_stats.gather_table_stats(user,'t2')

alter table t2 modify shipment_order_id not null;
create index t2_i1 on t2(shipment_order_id);

The interesting question now is WHY does Oracle lose the predicate – unfortunately my model may be too simplistic to allow us to work that out, but it might be sufficient to make it easy for an Oracle developer to see what’s going on and how best to address it. There is one bug on MoS (23343961) that might be related in some way, but I wasn’t convinced that the description was really close enough.

Update

This issue is now recorded on MoS as: Bug 24350407 : WRONG CARDINALITY ESTIMATION IN PRESENCE OF BITMAP OR

 

July 7, 2016

Invisible Bug

Filed under: 12c,Bugs,CBO,Indexing,Oracle — Jonathan Lewis @ 5:27 pm BST Jul 7,2016

At this Wednesday’s Oracle Midlands event someone asked me if Oracle would use the statistics on invisible indexes for the index sanity check. I answered that there had been a bug in the very early days of invisible indexes when the distinct_key statistic on the index could be used even though the index itself would not be considered as a candidate in the plan (and the invisible index is still used to avoid foreign key locking – even in 12c – it’s only supposed to be invisible to the optimizer).

The bug was fixed quite a long time ago – but a comment on the “Index Sanity” article has introduced me to a related bug that is still present in 11.2.0.4 where the presence of an invisible index can affect an execution plan. Here’s a little model (run under 11.2.0.4) to demonstrate:

rem
rem     Script:         invisible_index_bug.sql
rem     Author:         Jonathan Lewis
rem

execute dbms_random.seed(0)

drop table t2;
drop table t1;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        cast(rownum as number(8,0))                     id,
        cast(mod(rownum,1000) as number(8,0))           n1,
        cast(lpad(rownum,10,'0') as varchar2(10))       v1,
        cast(lpad('x',100,'x') as varchar2(100))        padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table t2
as
select
        rownum id,
        trunc(dbms_random.value(0,10000)) n1
from
        dual
connect by
        level <= 100
;
begin 
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

column n1 new_value m_n1
select n1 from t2 where id = 50;
clear columns

set autotrace traceonly explain

select
        t1.*
from
        t1, t2
where
        t2.n1 = &m_n1
;

create unique index t2_i1 on t2(n1)
-- invisible
;

select
        t1.*
from
        t1, t2
where
        t2.n1 = &m_n1
;

set autotrace off

All I’ve done is create a couple of tables then do a join that we might expect to see executed as a cartesian merge join; at one point I was going to make the data more complicated and include a join condition, but decided to keep things small and simple so it’s a silly example but it is sufficient to make the point. The funny little bit about selecting an n1 value from t2 was also in anticipation of a more complex example but it does, at least, ensure I query for a value that is in range.

Here are the two execution plans from 11.2.0.4 – the key feature is that the plan changes after the invisible index is created:


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000K|   119M|  2263   (3)| 00:00:12 |
|   1 |  MERGE JOIN CARTESIAN|      |  1000K|   119M|  2263   (3)| 00:00:12 |
|*  2 |   TABLE ACCESS FULL  | T2   |     1 |     4 |     2   (0)| 00:00:01 |
|   3 |   BUFFER SORT        |      |  1000K|   115M|  2261   (3)| 00:00:12 |
|   4 |    TABLE ACCESS FULL | T1   |  1000K|   115M|  2261   (3)| 00:00:12 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T2"."N1"=5308)


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  1000K|   119M|  2263   (3)| 00:00:12 |
|   1 |  NESTED LOOPS      |      |  1000K|   119M|  2263   (3)| 00:00:12 |
|*  2 |   TABLE ACCESS FULL| T2   |     1 |     4 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   |  1000K|   115M|  2261   (3)| 00:00:12 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T2"."N1"=5308)

Notice how the plan without the invisible index does a “sort” (actually a “buffer sort” so it’s just copying the data into local memory), while the plan with the not quite invisible enough index in place gets away with just a full tablescan. This is bug 16564891, associated with base bug 16544878.

The bug notes say “fixed in 12.2”, but in Oracle 12.1.0.2 the first plan appears in both cases, and we have to make the index visible to get the second plan. (Take note of the need for the “negative” test to prove the point; the fact that the same plan appears for both cases doesn’t, by itself, prove that the bug was fixed, we have to show that the plan would have changed if the bug had still been present).

I believe the problem isn’t the problem of Oracle using the statistics when it shouldn’t; the change appears because in 11g Oracle incorrectly allows itself to see the uniqueness of the index and infer that table t2 is a “single row” table. In 12c the optimizer calculates that there will probably be only one row but that doesn’t stop it choosing the merge join cartesian as the “insurance bet” against having to do more than one tablescan of the t1 table. We can see this difference in the 10053 trace files, the 11g file has an entry for the “Single Table Access Path” for t2 that reads:

1-ROW TABLES:  T2[T2]#0

If you read the bug note for bug 16564891 you’ll see that it has a more realistic example of the problem – and it may give you some idea of where you might run into the bug. In general I don’t think many people are likely to come across the problem since it revolves around uniqueness, which is rather an important property, and there can’t be many occasions when someone decides to add (or test dropping) a unique index. Given that the example in the bug looks like “add a unique index to a dimension table that’s joining to a fact table” that may be a good pointer to where you’re most likely to run into the problem — when you’re trying to enforce data correctness in a data warehouse.

 

June 28, 2016

Index Sanity

Filed under: CBO,extended stats,Indexing,Oracle,Statistics — Jonathan Lewis @ 8:43 am BST Jun 28,2016

By popular demand (well, one person emailed me to ask for it) I’m going to publish the source code for a little demo I’ve been giving since the beginning of the millennium – it concerns indexes and the potential side effects that you can get when you drop an index that you’re “not using”. I think I’ve mentioned the effect several times in the history of this blog, but I can’t find an explicit piece of demo code, so here it is – starting at the conclusion – as a cut and paste from an SQL*Plus session running against an 11g instance:


SQL> set autotrace traceonly explain
select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
 11  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1184213596

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   484 | 64856 |   227   (2)| 00:00:02 |
|*  1 |  HASH JOIN          |      |   484 | 64856 |   227   (2)| 00:00:02 |
|*  2 |   HASH JOIN         |      |   484 | 57596 |    14   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T2   |    20 |  1160 |     4   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T1   |   484 | 29524 |    10   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL | T3   |  5000 | 75000 |   213   (2)| 00:00:02 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T3"."ID"="T1"."ID1")
   2 - access("T2"."ID1"="T1"."ID1" AND "T2"."IND_PAD"="T1"."IND_PAD"
              AND "T2"."ID2"="T1"."ID2")
   4 - filter("T1"."N1"<=50 AND "T1"."N1">=40)

SQL> drop index t2_i1;

Index dropped.

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
 11  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2290830436

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    52 |  6968 |    67   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                |       |    52 |  6968 |    67   (2)| 00:00:01 |
|   2 |   NESTED LOOPS               |       |    52 |  6968 |    67   (2)| 00:00:01 |
|*  3 |    HASH JOIN                 |       |    52 |  6188 |    14   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL        | T2    |    20 |  1160 |     4   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL        | T1    |   484 | 29524 |    10   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN         | T3_PK |     1 |       |     0   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID| T3    |     1 |    15 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T2"."ID1"="T1"."ID1" AND "T2"."IND_PAD"="T1"."IND_PAD" AND
              "T2"."ID2"="T1"."ID2")
   5 - filter("T1"."N1"<=50 AND "T1"."N1">=40)
   6 - access("T3"."ID"="T1"."ID1")

Starting from the top – I’ve enabled autotrace which, technically, could mean that the plans are not the ones I’d see at run-time, but you can take my word for it that in 11g they are the run-time plans; then I’ve supplied a query that produces a plan with 3 full tablescans, two hash joins, and no index usage at all.

You’ll notice at operation 3 of the plan that table t2 is very small – only 20 rows selected, with no predicates that could have filtered that result down from a large table (take my word for it the stats have just been collected) so, as the ancient mythology would have it, we don’t really need an index on that table (a quick check tells me that the index wasn’t there to enforce uniqueness). Immediately after the first execution plan you can see that I’ve dropped an index called t2_i1 – trust me that IS the index on table t2.

We “run” the original query again, it gets re-optimised (and there’s no question of cardinality feedback or any other feature coming into play) and we get a different plan.

Dropping, or adding, a multi-column index to a table could change execution plans – even if the index is not present in the plan.

The reason for this is the “index sanity check”. When the optimizer is doing its cardinality estimates, if it see equality conditions on the set of columns that make up an index it can use the distinct_keys statistic from the index in the calculation rather than using the standard calculation of multiplying together the num_distinct of the separate columns. In earlier versions of Oracle there were some restrictions about uniqueness, but the limitations were removed in 11.1.0.7.

In my case there were 10 distinct values for id1, just one value for ind_pad, and 20 distinct values for id2 – but a total of only 20 distinct values for the combination. With an index in place on the combination the optimizer used the value 20 in its calculation, in the absence of the index it used the value 200 – that factor of 10 led to a drop in the join cardinality estimate from 484 rows to 52 rows – at which point the optimizer calculations made the next step in the plan change from a hash join to a nested loop join.

If you want to reproduce the demo, here’s the full script – the data isn’t a realistic data set, and I’ve had to use various non-standard settings to make the script as repeatable as possible – I’ve built the data set in a tablespace using an 8KB block size, 1MB uniform extents and manual (freelist) segment space management.


rem
rem     Script:         index_sanity.sql
rem     Author:         Jonathan Lewis
rem

drop table t3;
drop table t2;
drop table t1;

execute dbms_random.seed(0);

begin   
        begin           execute immediate 'purge recyclebin';
        exception       when others then null;
        end; 

        begin
                dbms_stats.set_system_stats('MBRC',16);
                dbms_stats.set_system_stats('MREADTIM',10);
                dbms_stats.set_system_stats('SREADTIM',5);
                dbms_stats.set_system_stats('CPUSPEED',1000);
        exception
                when others then null;
        end;

end;
/

create table t1
as
select
        mod(rownum,10)          id1,
        mod(rownum,20)          id2,
        rpad('x',40,'x')        ind_pad,
        mod(rownum,100)         n1,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',50)            padding
from
        all_objects
where
        rownum  <= 4000
;

create table t2 
pctfree 99
pctused 1
as
select
        mod(rownum,10)          id1,
        mod(rownum,20)          id2,
        rpad('x',40,'x')        ind_pad,
        mod(rownum,100)         n1, 
        lpad(rownum,10,'0')     small_vc,
        rpad('x',200)           padding
from
        all_objects
where
        rownum <= 20
;

create table t3
pctfree 95
pctused 1
as
select
        rownum          id,
        rpad(rownum,10) small_vc,
        rpad('x',100)   padding
from
        all_objects
where
        rownum <= 5000
;
begin
        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T1',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T2',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T3',
                method_opt => 'for all columns size 1'
        );

end;
/

create        index t1_i1 on t1(id1, ind_pad, id2) pctfree 91;
create        index t2_i1 on t2(id1, ind_pad, id2) pctfree 91;
alter table t3 add constraint t3_pk primary key (id);

set autotrace traceonly explain

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
;

-- alter index t1_i1 invisible;
-- alter index t2_i1 invisible;

drop index t1_i1;
-- drop index t2_i1;

accept X prompt "Press return to coninue"

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
;

set autotrace off

You’ll notice from the commented lines in the above that the effect appears whether you drop the index or make it invisible, also that there’s a similar index on the t1 table that matches the index on the t2 table – I could get the effect from dropping or making invisible either index.

There is a saving grace in 11g – if I do drop, or make invisible, one of these indexes I can protect myself against the statistical effect by create a column group on the same set of columns, and the num_distinct from the column group would serve the same purpose as the distinct_keys from the index.

Next Page »

Powered by WordPress.com.