That’s SQL Tuning Sets, parameter optimizer_features_enable, and SQL Plan Management.
There’s a recent post on OTN describing an issue with SQL Tuning Sets to enforce plan stability when upgrading from 10.2.0.3 to 11.2.0.3 – it doesn’t always work. Here’s a very simple model to demonstrate the type of thing that can happen (the tables are cloned from a completely different demo, so don’t ask why I picked the data they hold):
rem rem Script: ofe_anomaly.sql rem Author: Jonathan Lewis rem Dated: Feb 2012 rem create table t1 as select trunc((rownum-1)/15) n1, trunc((rownum-1)/15) n2, rpad(rownum,180) v1 from all_objects where rownum <= 3000 -- > comment to avoid wordpress format issue ; create table t2 as select mod(rownum,200) n1, mod(rownum,200) n2, rpad(rownum,180) v1 from all_objects where rownum <= 3000 -- > comment to avoid wordpress format issue ; create index t1_i1 on t1(n1); create index t2_i1 on t2(n1); execute dbms_stats.gather_table_stats(user,'t1') execute dbms_stats.gather_table_stats(user,'t2') alter session set optimizer_features_enable = '10.2.0.3'; explain plan for select /*+ ordered use_nl(t1) index(t1) */ t2.n1, t1.n2 from t2,t1 where t2.n2 = 45 and t2.n1 = t1.n1 ; set pagesize 60 set linesize 132 set trimspool on select * from table(dbms_xplan.display(null,null,'outline'));
You’ll notice I’ve included a directive to set the optimizer_features_enable back to 10.2.0.3. If I run this test on both 10.2.0.3 (real) and 11.2.0.3 (with ofe set) I get the same plans but slightly different Outline Data
From 10.2.0.3 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 225 | 3600 | 35 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 120 | 2 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 225 | 3600 | 35 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL | T2 | 15 | 120 | 5 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Outline Data ------------- /*+ BEGIN_OUTLINE_DATA USE_NL(@"SEL$1" "T1"@"SEL$1") LEADING(@"SEL$1" "T2"@"SEL$1" "T1"@"SEL$1") INDEX(@"SEL$1" "T1"@"SEL$1" ("T1"."N1")) FULL(@"SEL$1" "T2"@"SEL$1") OUTLINE_LEAF(@"SEL$1") ALL_ROWS OPTIMIZER_FEATURES_ENABLE('10.2.0.3') IGNORE_OPTIM_EMBEDDED_HINTS END_OUTLINE_DATA */ From 11.2.0.3 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 225 | 3600 | 54 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 120 | 2 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 225 | 3600 | 54 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL | T2 | 15 | 120 | 24 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Outline Data ------------- /*+ BEGIN_OUTLINE_DATA NLJ_PREFETCH(@"SEL$1" "T1"@"SEL$1") USE_NL(@"SEL$1" "T1"@"SEL$1") LEADING(@"SEL$1" "T2"@"SEL$1" "T1"@"SEL$1") INDEX(@"SEL$1" "T1"@"SEL$1" ("T1"."N1")) FULL(@"SEL$1" "T2"@"SEL$1") OUTLINE_LEAF(@"SEL$1") ALL_ROWS DB_VERSION('11.2.0.3') OPTIMIZER_FEATURES_ENABLE('10.2.0.3') IGNORE_OPTIM_EMBEDDED_HINTS END_OUTLINE_DATA */
Apart from the appearance of the db_version() hint in 11.2.0.3 the most important difference in the Outline Data section is the nlj_prefetch() hint. 11g introduced a new mechanism for nested loop joins called NLJ Batching, at the same time introducing two new hints to allow the optimizer to specify which mechanism a plan should use, prefetch (nlj_prefetch()) or batching (nlj_batching()). Since 10g only does prefetching it doesn’t have (or need) a hint to specify the mechanism.
The Outline Data section from a plan is basically what Oracle stores as an SQL Plan Baseline – so if I use the approved method to turn the 10g outline data above into the 11g SQL Plan Baseline what’s going to happen to the execution plan when I run the query in the default 11g environment? It’s easy to demonstrate (at least in this case) by simply cutting and pasting the entire 10g outline into the original SQL statement and generating its plan under 11g; here’s the result:
-------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 225 | 3600 | 54 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 225 | 3600 | 54 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL | T2 | 15 | 120 | 24 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | 120 | 2 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------- Outline Data ------------- /*+ BEGIN_OUTLINE_DATA USE_NL(@"SEL$1" "T1"@"SEL$1") LEADING(@"SEL$1" "T2"@"SEL$1" "T1"@"SEL$1") INDEX_RS_ASC(@"SEL$1" "T1"@"SEL$1" ("T1"."N1")) FULL(@"SEL$1" "T2"@"SEL$1") OUTLINE_LEAF(@"SEL$1") ALL_ROWS DB_VERSION('11.2.0.3') OPTIMIZER_FEATURES_ENABLE('10.2.0.3') IGNORE_OPTIM_EMBEDDED_HINTS END_OUTLINE_DATA */
The plan has changed. I was slightly surprised to see in this case that the optimizer used neither NLJ prefetching nor NLJ batching, it went all the way back to the traditional nested loop mechanism.
If I had actually captured the original outline hints as an SQL Baseline the optimizer would have found the baseline in the data dictionary, generated this new plan from it, discovered that the plan hash value for this plan didn’t match the plan hash value for the stored plan, and re-optimized the query from scratch – potentially producing a totally different execution plan.
In my example 10g and 11g both wanted to use the nlj_prefetch mechanism when pushed into the nested loop join. 10g, of course, doesn’t have a relevant hint, so I got lucky that 11g wanted to do what 10g had done. In the case of the OP on the OTN forum 11g had decided that it preferred nlj_batching over nlj_prefetch when attempting to apply the baseline, so the plan hash values didn’t match and the optimizer became free to choose a completely different plan.
Without looking very carefully through all the hints available to 11g I can’t decide whether there are other similar cases to worry about, but if you see 11g ignoring SQL Plan Baselines that have been generated from 10g your next step should be to look for operations where 11g has two options (and perhaps two hints) for performing a given operation where 10g has only one option as that may be where the problem lies.
Footnotes:
In the case of the OP’s example, a possible workaround could start by setting the hidden parameter “_nlj_batching_enabled”=0; obviously this shouldn’t be done on a production system without approval of Oracle Support, and it’s never a desirable strategy to change a global parameter to fix a local problem so I’d prefer to set the parameter in a session and generate a new SQL Plan Baseline that would then (probably) include either the nlj_prefetch() hint or maybe it would turn out to be the no_nlj_batching() hint.
To see what plan (and hints) the SQL Plan Baseline would have generated the OP enabled SPM tracing using the new event mechanism, in this case:
alter session set events 'trace[RDBMS.SQL_Plan_Management.*]'; -- run, or explain the query here alter session set events 'trace[RDBMS.SQL_Plan_Management.*] off';
I’m not an optimizer expert, but this particular case might be covered by bug 14009271.
Comment by Roderick Manalac (@rmanalac) — February 13, 2013 @ 10:46 pm GMT Feb 13,2013 |
Roderick,
Thanks for that – it looks like an exact match.
Comment by Jonathan Lewis — February 14, 2013 @ 7:45 am GMT Feb 14,2013 |
Thanks for this blog post Jonathan. It’s explained the issue we’re having very succinctly. I am indeed the “OP” described in this blog.
Comment by Ian Williams — February 19, 2013 @ 1:28 am GMT Feb 19,2013 |
[…] you’ve hinted some SQL and got it working the safe thing to do, in 11g, is to check the outline section of the actual execution plan to see if you’ve missed any important hints and then, if you […]
Pingback by How to hint | Oracle Scratchpad — May 28, 2013 @ 5:26 pm BST May 28,2013 |
[…] TABLE ACCESS BY INDEX ROWID operation 301K times. The double NESTED LOOP (known as the 11g NLJ_BATCHING) is driving here an outer row source of 301K rows (NESTED LOOP operation at line 2) which starts […]
Pingback by A precise index | Mohamed Houri’s Oracle Notes — March 17, 2015 @ 7:50 pm GMT Mar 17,2015 |