Oracle Scratchpad

March 17, 2016

Hinting

Filed under: Hints,Ignoring Hints,Oracle,Upgrades — Jonathan Lewis @ 1:10 pm BST Mar 17,2016

A posting on the OTN database forum a few days ago demonstrated an important problem with hinting – especially (though it didn’t come up in the thread) in the face of upgrades. A simple query needed a couple of hints to produce the correct plan, but a slight change to the query seemed to result in Oracle ignoring the hints. The optimizer doesn’t ignore hints, of course, but there are many reasons why it might appear to, so I created a little demonstration of the problem – starting with the following data set:

rem
rem     Script:  OTN_DAG.sql
rem     Author:  J.P.Lewis
rem     Dated:   March 2016
rem

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
from
        generator       g1,
        generator       g2
where
        rownum <= 24000
;

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc((rownum-1)/15)    n1,
        trunc((rownum-1)/15)    n2,
        rpad(rownum,180)        v1
from    generator
where
        rownum <= 3000
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

(Ignore the silliness of the way I’ve created the data, it’s a consequence of using my standard template).
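
A quick sanity check on the data shape, offered as an illustrative sketch rather than as part of the original script. Given the definitions above it should report 24,000 rows in t1 and 3,000 rows in t2, 200 distinct values of n2 in each table, and 120 and 15 rows respectively for n1 = 15. The column aliases are arbitrary names chosen for readability.

```sql
-- One summary row per table: total rows, distinct n2 values,
-- and the number of rows that survive the n1 = 15 filter.
select
        't1'                                    table_name,
        count(*)                                row_count,
        count(distinct n2)                      distinct_n2,
        sum(case when n1 = 15 then 1 end)       rows_n1_15
from    t1
union all
select
        't2',
        count(*),
        count(distinct n2),
        sum(case when n1 = 15 then 1 end)
from    t2
;
```

The last column corresponds to the Rows estimates for the two tablescans in the plans that follow.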

For every row in t2 there are 8 rows in t1, so when I join t1 to t2 on n2 it would obviously be sensible for the resulting hash join to use the t2 (smaller) data set as the build table and the t1 data set as the probe table, but I’m going to pretend that the optimizer is making an error and needs to be hinted to use t1 as the build table and t2 as the probe. Here’s a query, and execution plan, from 11.2.0.4:

explain plan for
select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        count(t1.n2)
from
        t1, t2
where
        t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    16 |    97   (3)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    16 |            |          |
|*  2 |   HASH JOIN         |      |    20 |   320 |    97   (3)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   120 |   960 |    85   (3)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."N2"="T1"."N2")
   3 - filter("T1"."N1"=15)
   4 - filter("T2"."N1"=15)

As you can see, the optimizer has obeyed my hinting – the join order is t1 -> t2, I’ve used a hash join to join t2, and Oracle hasn’t swapped the join inputs despite the fact that the t1 data set is larger than the t2 data set (960 bytes vs. 120 bytes), which should have persuaded it to swap. (Technically, the leading() hint seems to block the swap of the first two tables anyway – see the “Special Case” section at this URL – but I’ve included the no_swap_join_inputs() hint anyway to make the point explicit.)

So now, instead of just counting n2, we’ll modify the query to count the number of distinct values for n2:


explain plan for
select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        count(distinct t1.n2) 
from
        t1, t2
where
        t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

select * from table(dbms_xplan.display(null,null,'outline alias'));

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |          |     1 |    13 |            |          |
|   2 |   VIEW                | VW_DAG_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |          |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |          |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T2       |    15 |   120 |    12   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T1       |   120 |   960 |    85   (3)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T2"."N1"=15)
   6 - filter("T1"."N1"=15)

Check operations 5 and 6 – Oracle has swapped the join inputs: t2 (the obvious choice) is now the build table. Has Oracle ignored the hint? (Answer: No).
If you look at operation 2 you can see that Oracle has generated an internal view called VW_DAG_0 – this is an example of the “Distinct Aggregate” transformation taking place. It seems to be a pointless exercise in this case, and the 10053 trace file seems to indicate that it’s a heuristic transformation rather than a cost-based transformation (i.e. the optimizer does it because it can, not because it’s cheaper). Oracle has transformed the SQL to the following (to which I have applied a little cosmetic tidying):


SELECT  /*+ LEADING (T1) */
        COUNT(VW_DAG_0.ITEM_1) "COUNT(DISTINCTT1.N2)"
FROM    (
        SELECT  T1.N2 ITEM_1
        FROM    TEST_USER.T2 T2,TEST_USER.T1 T1
        WHERE   T2.N2=T1.N2
        AND     T1.N1=15
        AND     T2.N1=15
        GROUP BY
                T1.N2
        ) VW_DAG_0

Notice how the use_hash() and no_swap_join_inputs() hints have disappeared. I am slightly surprised that the leading() hint is still visible; I would have expected all three to stay or all three to disappear. Regardless of that, though, the single remaining hint references an object that does not exist in the query block where the hint has been placed. The original hint has not been “ignored”, it has become irrelevant. (I’ll be coming back to an odd little detail about this transformed query a little later on, but for the moment I’m going to pursue the problem of making the optimizer do what we want.)
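
For anyone wanting to look at the 10053 trace for themselves, the following is a minimal sketch of one common way to produce it. The tracefile_identifier value is an arbitrary label chosen for illustration, and the v$diag_info query assumes 11g or later. The transformed ("unparsed") text should appear in the trace file towards the end of the optimizer's work, though the exact layout varies by version.

```sql
-- Tag the trace file name so it is easy to find (label is arbitrary).
alter session set tracefile_identifier = 'otn_dag_10053';

-- Switch on the optimizer trace for this session.
alter session set events '10053 trace name context forever, level 1';

-- explain plan forces an optimization, so the trace is always written.
explain plan for
select
        /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */
        count(distinct t1.n2)
from
        t1, t2
where
        t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

-- Switch the trace off again.
alter session set events '10053 trace name context off';

-- Report where the trace file was written (11g and later).
select  value
from    v$diag_info
where   name = 'Default Trace File'
;
```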

We have three strategies we could pursue at this point. We could tell the optimizer that we don’t want it to do the transformation; we could work out the query block name of the query block that holds t1 and t2 after the transformation and direct the hints into that query block; or we could tell Oracle to pretend it was using an older version of the optimizer because that Distinct Aggregate transformation only appeared in 11.2.0.1.

You’ll notice that I used the ‘alias’ formatting command in my call to dbms_xplan.display() – this is the queryblock / alias section of the output:


Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$C33C846D
   2 - SEL$5771D262 / VW_DAG_0@SEL$C33C846D
   3 - SEL$5771D262
   5 - SEL$5771D262 / T1@SEL$1
   6 - SEL$5771D262 / T2@SEL$1

Strategy A says try adding the hint: /*+ no_transform_distinct_agg(@sel$1) */
Strategy B says try using the hints: /*+ leading(@SEL$5771D262 t1@sel$1 t2@sel$1) use_hash(@SEL$5771D262 t2@sel$1) no_swap_join_inputs(@SEL$5771D262 t2@sel$1) */
Strategy C says try adding the hint: /*+ optimizer_features_enable('11.1.0.7') */
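
Written out in full against the count(distinct) query, the three strategies might look like the sketch below. It assumes that strategies A and C keep the original three hints and add one more, while strategy B replaces them with the fully qualified versions. The query block name SEL$5771D262 is taken from the alias section above and could differ on another version or for a differently written query.

```sql
-- Strategy A: block the Distinct Aggregate transformation.
explain plan for
select
        /*+
            leading(t1) use_hash(t2) no_swap_join_inputs(t2)
            no_transform_distinct_agg(@sel$1)
        */
        count(distinct t1.n2)
from    t1, t2
where   t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

-- Strategy B: allow the transformation, but address the hints
-- to the query block that holds t1 and t2 after it has happened.
explain plan for
select
        /*+
            leading(@SEL$5771D262 t1@sel$1 t2@sel$1)
            use_hash(@SEL$5771D262 t2@sel$1)
            no_swap_join_inputs(@SEL$5771D262 t2@sel$1)
        */
        count(distinct t1.n2)
from    t1, t2
where   t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;

-- Strategy C: optimize as if running a version that predates
-- the transformation.
explain plan for
select
        /*+
            leading(t1) use_hash(t2) no_swap_join_inputs(t2)
            optimizer_features_enable('11.1.0.7')
        */
        count(distinct t1.n2)
from    t1, t2
where   t2.n2 = t1.n2
and     t1.n1 = 15
and     t2.n1 = 15
;
```

After each explain plan, the same call to dbms_xplan.display(null,null,'outline alias') used earlier will show the resulting plan.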

Strategies A and C (stopping the transformation) produce the following plan:


----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    16 |    98   (4)| 00:00:01 |
|   1 |  SORT GROUP BY      |      |     1 |    16 |            |          |
|*  2 |   HASH JOIN         |      |    20 |   320 |    98   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |   120 |   960 |    85   (3)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."N2"="T1"."N2")
   3 - filter("T1"."N1"=15)
   4 - filter("T2"."N1"=15)

Strategy B (allowing the transformation, but addressing the hints to the generated query block) produces this plan:


----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |          |     1 |    13 |            |          |
|   2 |   VIEW                | VW_DAG_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |          |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |          |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T1       |   120 |   960 |    85   (3)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T2       |    15 |   120 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T1"."N1"=15)
   6 - filter("T2"."N1"=15)

All three strategies have produced plans that use t1, the larger data set, as the build table. It’s hard to resist asking whether one of the three is the best; it’s hard to say, but I think I’d favour the no_transform_distinct_agg() hint because it’s precisely targeted – it avoids the brute-force thuggishness of reverting to an old optimizer version, and it avoids the (possible) fragility of needing to know a very precise query block name which (possibly) might change for some reason if the query were to be modified very slightly. The argument, of course, comes from the perspective of a friendly consultant who visits for a couple of days, gets a bit clever with your SQL, then walks away leaving you to worry about whether you understand why your SQL now works the way it does.

Upgrades

My opening comment was about the difficulty of hinting across upgrades. Imagine you had been running this count(distinct) query in 10.2.0.5, and after some experimentation had found that you got the path you needed by adding the hints: /*+ leading(t1 t2) full(t1) use_hash(t2) no_swap_join_inputs(t2) full(t2) */. This is a careful and thorough piece of hinting (and it does work, of course, in 10.2.0.5).

When the big day for upgrading to 11.2 arrives (just in time for Oracle to end extended support, possibly) you find that this query changes its execution plan. And this is NOT a rare occurrence. I’ve said it before, and I’ll keep saying it: hinting – especially with “micro-management” hints – is undesirable in a production system. You probably haven’t done it right, and even if the hints are (broadly speaking) perfect in the current version they may be pushed out of context by a new feature in the next version. If you’ve hinted your code you have to check every single hinted statement to make sure the hints still have the same effect after the upgrade.
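
As a rough starting point for that check, something like the following sketch could list the hinted statements currently in the library cache for one schema. It is only an assumption about how you might begin an inventory: v$sql.sql_text holds only the start of each statement, so hints further into a long statement will be missed; the query will also report itself; and statements that have aged out of the cache (or live only in application source) won't appear at all.

```sql
-- List cursors whose (truncated) text contains either style of hint comment.
select
        sql_id,
        child_number,
        plan_hash_value,
        substr(sql_text,1,80)   sql_text_start
from
        v$sql
where
        (sql_text like '%/*+%' or sql_text like '%--+%')
and     parsing_schema_name = user
order by
        sql_id, child_number
;
```

Recording the plan_hash_value for each statement before the upgrade gives you something to compare against afterwards.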

This is why I produce the sound-bite (which Maria Colgan nicked): “if you can hint it, baseline it”. If you had generated a baseline (or outline) from a query with these hints in 10g, Oracle would have included the /*+ optimizer_features_enable('10.2.0.5') */ hint with the functional hints, and the upgrade wouldn’t have produced a different plan.

Technically, of course, you could have remembered to add the hint to your production code – but in many cases Oracle introduces far more hints in an SQL Baseline than you might want to put into your code; and by using the SQL Baseline approach you’ve given yourself the option to get rid of the “hidden hinting” in a future version of Oracle by dropping the baseline rather than rewriting the code and (perhaps) recompiling the application.
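
A minimal sketch of the baseline approach on 11g or later, assuming the hinted statement has already been executed so that its cursor is still in the library cache. The substitution variables m_sql_id and m_plan_hash_value are hypothetical placeholders for values you would look up in v$sql; on 10g the equivalent mechanism would be a stored outline rather than an SQL Plan Baseline. The calls need suitable privileges (for example to administer SQL management objects and to query dba_sql_plan_baselines).

```sql
set serveroutput on

declare
        m_loaded        pls_integer;
begin
        -- Capture the plan of an existing cursor as an SQL Plan Baseline.
        m_loaded := dbms_spm.load_plans_from_cursor_cache(
                sql_id          => '&m_sql_id',
                plan_hash_value => &m_plan_hash_value
        );
        dbms_output.put_line('Plans loaded: ' || m_loaded);
end;
/

-- Check what has been captured.
select
        sql_handle, plan_name, origin, enabled, accepted
from
        dba_sql_plan_baselines
order by
        created
;
```

Dropping the baseline later (dbms_spm.drop_sql_plan_baseline) removes the “hidden hinting” without any change to the application code.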

Inevitably there are cases where setting optimizer_features_enable backwards doesn’t rescue you from a new plan – there are probably a few cases where the internal code forgets to check the value and bypass some subroutines; more significantly, there are cases where one version of Oracle will give you an efficient plan because of an optimizer bug, and setting the version backwards won’t re-introduce that bug.

Footnote

I said I’d come back to the “unparsed” query that the optimizer generated from the original count(distinct) statement and the way it left the leading(t1) hint in place but lost the use_hash(t2) and no_swap_join_inputs(t2). I got curious about how Oracle would optimize that query if I supplied it from SQL*Plus – and this is the plan I got:


explain plan for
SELECT  /*+ LEADING (T1) */
        COUNT(VW_DAG_0.ITEM_1) "COUNT(DISTINCTT1.N2)"
FROM    (
        SELECT  T1.N2 ITEM_1
        FROM    TEST_USER.T2 T2,TEST_USER.T1 T1
        WHERE   T2.N2=T1.N2
        AND     T1.N1=15
        AND     T2.N1=15
        GROUP BY
                T1.N2
        ) VW_DAG_0
;

-----------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |    13 |    98   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE       |           |     1 |    13 |            |          |
|   2 |   VIEW                | VM_NWVW_0 |    20 |   260 |    98   (4)| 00:00:01 |
|   3 |    HASH GROUP BY      |           |    20 |   320 |    98   (4)| 00:00:01 |
|*  4 |     HASH JOIN         |           |    20 |   320 |    97   (3)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T1        |   120 |   960 |    85   (3)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL| T2        |    15 |   120 |    12   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."N2"="T1"."N2")
   5 - filter("T1"."N1"=15)
   6 - filter("T2"."N1"=15)

Oracle has managed to do a transformation to this statement that it didn’t do when it first generated the statement – too much recursion, perhaps – and that floating leading(t1) hint has been squeezed back into action by a view-merging step in the optimization that got the hint back into a query block that actually contained t1 and t2!  At this point I feel like quoting cod-philosophy from the Dune trilogy: “Just when you think you understand …”

 

4 Comments »

  1. Thanks for a great post, Jonathan.
    I think one of the biggest sources of many issues with hints is the documentation, which has incorrect examples. It’s been this way for years, unfortunately.

    Comment by Timur Akhmadeev — March 18, 2016 @ 3:53 pm BST Mar 18,2016 | Reply

  2. Jonathan,

    In your first query you used the hint /*+ leading(t1) use_hash(t2) no_swap_join_inputs(t2) */.
    If you didn’t say no_swap_join_inputs and Oracle chose to swap them would that effectively nullify the effect of the “leading” hint. I’m having trouble understanding how these 2 hints affect each other. I remember you saying elsewhere and with examples that the “leading” hint is always obeyed by the optimizer.

    Thanks,

    Richard

    Comment by rbrieck — September 22, 2016 @ 10:18 pm BST Sep 22,2016 | Reply

    • Richard,

      The leading hint tells Oracle which join order (or orders if the hint specifies only a subset of the tables) the optimizer should consider for a query block. If the hint has been constructed correctly and the join order is legally possible Oracle will use that join order.

      Within a join order, the optimizer can choose at each hash join which rowsource should be the build table and which the probe table; the “default” position is that the next table in the join order should be the probe table and the currently joined set should be the build table – this choice can be enforced with the no_swap_join_inputs() hint or over-ridden with the swap_join_inputs() hint.

      For example, if Oracle has joined tables a, b, and c, and the next table in the join order is d then the optimizer can evaluate “(a,b,c) join (d)” or “(d) join (a,b,c)”. Consequently if you see an execution plan that looks like this:

      hash join
          tablescan full A
          tablescan full B
      

      you can’t (without further information) tell whether the optimizer used the join order A, B, or used the join order B, A and swapped the join inputs.

      Comment by Jonathan Lewis — September 24, 2016 @ 10:44 am BST Sep 24,2016 | Reply

  3. Thanks Jonathan. That clears it up for me. I really appreciate you taking the time to explain it.

    Richard

    Comment by rbrieck — September 26, 2016 @ 8:40 pm BST Sep 26,2016 | Reply

