Oracle Scratchpad

January 14, 2015

Bind Effects

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 1:24 pm GMT Jan 14,2015

A couple of days ago I highlighted an optimizer anomaly caused by the presence of an index with a descending column. This was a minor (unrelated) detail that appeared in a problem on OTN where the optimizer was using an index FULL scan when someone was expecting to see an index RANGE scan. My earlier posting supplies the SQL to create the table and indexes I used to model the problem – and in this posting I’ll explain the problem and answer the central question.

Here’s the query and execution plan (from 11.2.0.x) as supplied by the OP – the odd appearance of the sys_op_descend() function calls is the minor detail that I explained in the previous post, but that’s not really relevant to the question of why Oracle is using an index full scan rather than an index range scan. The /*+ first_rows */ hint isn’t something you should be using but it was in the OP’s query, so I’ve included it in my model:


select /*+ FIRST_ROWS gather_plan_statistics scanned */ count(1)      FROM  XXX
where  (((((COL1 = '003' and COL2 >= '20150120') and COL3 >= '00000000') and COL4>= '000000000000' )
or ((COL1 = '003' and COL2 >= '20150120') and COL3> '00000000' )) or (COL1= '003' and COL2> '20150120'))
order by COL1,COL2,COL3,COL4  

Plan hash value: 919851669  

---------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |      1 |        |  18533 |00:01:47.04 |    156K |  70286 |
|   1 |  TABLE ACCESS BY INDEX ROWID| XXX    |      1 |  7886K |  18533 |00:01:47.04 |    156K |  70286 |
|*  2 |   INDEX FULL SCAN           | XXXXPK |      1 |  7886K |  18533 |00:01:30.36 |    131K |  61153 |
---------------------------------------------------------------------------------------------------------  

Predicate Information (identified by operation id):
---------------------------------------------------
  2 - filter((("COL2">:B2 AND "COL1"=:B1 AND
              SYS_OP_DESCEND("COL2")<SYS_OP_DESCEND(:B2)) OR ("COL1"=:B1 AND "COL2">=:B2
              AND "COL3">:B3 AND SYS_OP_DESCEND("COL2")<=SYS_OP_DESCEND(:B2)) OR
              ("COL1"=:B1 AND "COL2">=:B2 AND "COL3">=:B3 AND "COL4">=:B4 AND
              SYS_OP_DESCEND("COL2")<=SYS_OP_DESCEND(:B2))))  

If you look closely you’ll see that the OP has NOT supplied the output from a call to dbms_xplan.display_cursor() – the column and table names are highly suspect (but that’s allowable cosmetic change for confidentiality reasons) the giveaway is that the SQL statement uses literals but the execution plan is using bind variables (which are of the form B{number}, suggesting that the real SQL is embedded in PL/SQL with PL/SQL variables being used to supply values): the bind variables make a difference.

Let’s go back to my model to demonstrate the problem. Here’s a query with the same predicate structure as the problem query (with several pairs of brackets eliminated to improve readability) showing the actual run-time plan (from 11.2.0.4) when using literals:


select
        /*+ first_rows */
        *
from t1
where
        (C1 = 'DE' and C2 >  'AB')
or      (C1 = 'DE' and C2 >= 'AB' and C3 > 'AA' )
or      (C1 = 'DE' and C2 >= 'AB' and C3 >= 'AA' and C4 >= 'BB')
order by
        C1, C2, C3, C4
;

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1      |    21 |  2478 |     4  (25)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IASC |    21 |       |     3  (34)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1"='DE')
       filter(((SYS_OP_DESCEND("C2")<SYS_OP_DESCEND('AB') AND "C2">'AB') OR
              (SYS_OP_DESCEND("C2")<=SYS_OP_DESCEND('AB') AND "C3">'AA' AND "C2">='AB') OR
              (SYS_OP_DESCEND("C2")<=SYS_OP_DESCEND('AB') AND "C4">='BB' AND "C2">='AB' AND
              "C3">='AA')))

As you can see, the optimizer has managed to “factor out” the predicate C1 = ‘DE’ from the three disjuncts and has then used it as an access() predicate for an index range scan. Now let’s see what the code and plan look like if we replace the four values by four bind variables:


variable B1 char(2)
variable B2 char(2)
variable B3 char(2)
variable B4 char(2)

begin
        :b1 := 'DE';
        :b2 := 'AB';
        :b3 := 'AA';
        :b4 := 'BB';
end;
/

select
        /*+ first_rows */
        *
from t1
where
        (C1 = :B1  and C2 >  :B2 )
or      (C1 = :B1  and C2 >= :B2 and C3 >  :B3 )
or      (C1 = :B1  and C2 >= :B2 and C3 >= :B3 and C4 >= :B4)
order by C1, C2, C3, C4
;

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |    31 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1      |   437 | 51566 |    31   (4)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | T1_IASC |   437 |       |    27   (4)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((("C1"=:B1 AND "C2">:B2) OR ("C1"=:B1 AND "C3">:B3 AND
              "C2">=:B2) OR ("C1"=:B1 AND "C4">=:B4 AND "C2">=:B2 AND "C3">=:B3)))

The optimizer hasn’t factored out the common expression C1 = :B1. The reason for this, I think, is that though WE know that :B1 is supposed to be the same thing in all three occurrences the optimizer isn’t able to assume that that’s the case; in principle :B1 could be the place holder for 3 different values – so the optimizer plays safe and optimizes for that case. This leaves it with three options: Full tablescan with filter predicates, index full scan with filter predicates, three-part concatenation with index range scans in all three parts. The combination of the /*+ first_rows */ hint and the “order by” clause which matches the t1_1asc index has left the optimizer choosing the index full scan path – presumably to avoid the need to collect all the rows and sort them before returning the first row.

Given our understanding of the cause of the problem we now have a clue about how we might make the query more efficient – we have to eliminate the repetition of (at least) the :B1 bind variable. In fact we can get some extra mileage by modifying the repetition of the :B2 bind variable. Here’s a rewrite that may help:


select
        /*+ first_rows */
        *
from t1
where
        (C1 = :B1 and C2 >= :B2)
and     (
             C2 > :B2
         or  C3 > :B3
         or (C3 >= :B3 and C4 > :B4)
        )
order by C1, C2, C3, C4
;

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |       |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1      |   148 | 17464 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IASC |   148 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C1"=:B1 AND "C2">=:B2 AND "C2" IS NOT NULL)
       filter(("C2">:B2 OR "C3">:B3 OR ("C4">:B4 AND "C3">=:B3)))

I’ve factored out as much of the C1 and C2 predicates as I can – and the optimizer has used the resulting conditions as the access() predicate on the index (adding in a “not null” predicate on C2 that looks redundant to me – in fact the index was on the primary key in the original, but I hadn’t included that constraint in my model). You’ll notice, by the way, that the cardinality is now 148; compare this with the previous cardinality of 437 and you might (without bothering to look closely as the 10053 trace) do some hand-waving around the fact that 437 = (approximately) 148 * 3, which fits the idea that the optimizer was treating the three :B1 appearances as if they were three different possible values accessing three sets of data.

Miscellaneous.

This isn’t the end of the story; there are always more complications and surprises in store as you look further into the detail. For example, on the upgrade to 12c the execution plan for the query with bind variables was the same (ignoring the sys_op_descend() functions) as the query using literals – the optimizer managed to factor out the C1 predicate: does this mean SQL*Plus got smarter about telling the optimizer about the bind variables, or does it mean the optimizer got smarter about something that SQL*Plus has been doing all along ?

This change might make you think that the optimizer is supposed to assume that bind variables of the same name represent the same thing – but that’s not correct, and it’s easy to show; here’s a trivial example (accessing the same table with a query that, for my data, identifies the first row):


declare
        m_id number := 1;
        m_c1 char(2) := 'BV';
        m_c2 char(2) := 'GF';
        m_n number := 0;
begin
        execute immediate
                'SELECT /*+ FIND THIS */ COUNT(*) FROM T1 WHERE ID = :B1 AND C1 = :B1 AND C2 = :B1'
                into m_n
                using m_id, m_c1, m_c2
        ;
end;
/

select sql_id, sql_text from V$sql where sql_text like 'SELECT%FIND THIS%';

SQL_ID        SQL_TEXT
------------- ----------------------------------------------------------------------------------
9px3nuv54maxp SELECT /*+ FIND THIS */ COUNT(*) FROM T1 WHERE ID = :B1 AND C1 = :B1 AND C2 = :B1

If you were looking at the contents of v$sql, or a trace file, or an AWR report, you might easily be fooled into thinking that this was a query where the same value had been used three times – when we know that it wasn’t.

So, as we upgrade from 11g to 12c my model of the original problem suggests that the problem is going to go away – but, actually, I don’t really know why that’s the case (yet). On the other hand, I have at least recognised a pattern that the 11g optimizer currently has a problem with, and I have a method for helping the optimizer to be a little more efficient.

 

7 Comments »

  1. Maybe in 12c the code at that point can differentiate between bind by position and bind by name. EXECUTE IMMEDIATE does bind by position, but SQL*Plus might be doing bind by name (not sure though). If the optimizer was aware of that it could assume that the three :B1 are actually the same value in your example.

    Comment by Randolf Geist — January 14, 2015 @ 6:34 pm GMT Jan 14,2015 | Reply

    • Randolf,

      Thanks for the comment. I always appreciate suggestions that may throw light on unresolved questions. It certainly sounds viable since SQL*Plus is (used to be described as) just a C program using (IIRC, and if the terminology hasn’t changed since I last wrote some) type 4 code to handle the dynamic SQL.

      It might be easy to check the idea by simply tracing the SQL with bind capture enabled, or even using the “binds” option for dbms_stats.display_cursor(), to see if there’s any difference in the output between 11g and 12c. I’ll try to find some time later on this week.

      Comment by Jonathan Lewis — January 15, 2015 @ 7:54 am GMT Jan 15,2015 | Reply

      • If I run the sample query in question within an anonymous PL/SQL block using OPEN / FETCH / CLOSE and bind by position 12c reverts to the INDEX FULL SCAN and understands that the three :B1 can potentially hold different values. I can also see that SQL*Plus uses bind by name in a 10046 trace with binds enabled. So it looks like my assumption above is true and 12c can differentiate the two cases bind by name / position when evaluating this, but 11.2.0.4 can’t.

        Interestingly I would say the DBMS_XPLAN.DISPLAY_CURSOR “Peeked binds” section is misleading because it tells me for the “bind by position” case that the (different) :B1’s would all resort to the first one, which in my understanding is not true – I’ve setup a case where three different values were passed for the different :B1s by bind by position, confirmed by the 10046 trace output.

        Randolf

        Comment by Randolf Geist — January 15, 2015 @ 12:53 pm GMT Jan 15,2015 | Reply

        • Randolf,

          I hadn’t done the explicit open/fetch/close – and I don’t intend to try testing dbms_sql ;) – but I agree (especially about how display_cursor is misleading).

          The bit that surprised me was that 11.2.0.4 and 12.1.0.2 both showed the same BINDS# output in the 10046 trace for the SQL example, viz: apparent bind by name with “no oacdef” for the repeated bind names, but still optimised as if the repetitions could hold different values.

          Comment by Jonathan Lewis — January 15, 2015 @ 2:08 pm GMT Jan 15,2015


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.