Oracle Scratchpad

September 9, 2014

Quiz Night

Filed under: Execution plans,Indexing,Oracle — Jonathan Lewis @ 6:46 pm GMT Sep 9,2014

I have a table with several indexes on it, and I have two versions of a query that I might run against that table. Examine them carefully, then come up with some plausible reason why it’s possible (with no intervening DDL, DML, stats collection, parameter fiddling etc., etc., etc.) for the second form of the query to be inherently more efficient than the first.


select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 between 1 and 3
;

prompt  ===========
prompt  Split query
prompt  ===========

select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 = 1
or      bit_1 > 1 and bit_1 <= 3
;

Update / Answers

I avoided giving any details about the data and indexes in this example as I wanted to allow free rein to readers’ imagination  – and I haven’t been disappointed with the resulting suggestions. The general principles of allowing more options to the optimizer, effects of partitioning, and effects of skew are all worth considering when the optimizer CAN’T use an execution path that you think makes sense.  (Note: I didn’t make it clear in my original question, but I wasn’t looking for cases where you could get a better path by hinting (or profiling) I was after cases where Oracle literally could not do what you wanted.)

The specific strategy I was thinking of when I posed the question was based on a follow-up to some experiments I had done with the cluster_by_rowid() hint. and (there was a little hint in the “several indexes” and more particularly the column name “bit_1”) I was looking at a data warehouse table with a number of bitmap indexes. So here’s the execution plan for the first version of the query  when there’s a simple bitmap index on bit_1.


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   600 | 18000 |    96 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   600 | 18000 |    96 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|*  3 |    BITMAP INDEX RANGE SCAN   | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("BIT_1">=1 AND "BIT_1"<=3)

And here’s the plan for the second query:


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   560 | 16800 |    91 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   560 | 16800 |    91 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|   3 |    BITMAP OR                 |         |       |       |       |
|*  4 |     BITMAP INDEX SINGLE VALUE| BT1     |       |       |       |
|   5 |     BITMAP MERGE             |         |       |       |       |
|*  6 |      BITMAP INDEX RANGE SCAN | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("BIT_1"=1)
   6 - access("BIT_1">1 AND "BIT_1"<=3)

Clearly the second plan is more complex than the first – moreover the added complexity had resulted in the optimizer getting a different cardinality estimate – but, with my data set, there’s a potential efficiency gain. Notice how lines 5 and 6 show a bitmap range scan followed by a bitmap merge: to do the merge Oracle has to “superimpose” the bitmaps for the different key values in the range scan to produce a single bitmap that it can then OR with the bitmap for bit_1 = 1 (“bitmap merge” is effectively the same as “bitmap or” except all the bitmaps come from the same index). The result of this is that when we convert to rowids the rowids are in table order. You can see the consequences in the ordering of the result set or, more importantly for my demo, in the autotrace statistics:


For the original query:
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        604  consistent gets
          0  physical reads
          0  redo size
      27153  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed


For the modified query
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        218  consistent gets
          0  physical reads
          0  redo size
      26714  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed

Note, particularly, the change in the number of consistent gets. Each table block I visited held two or three rows that I needed, in the first query I visit the data in order of (bit_1, rowid) and get each table block 3 time; in the second case I visit the data in order of rowid and only get each table block once (with a “buffer is pinned count” for subsequent rows from the same block).

Here’s the starting output from each query, I’ve added the rowid to the original select statements so that you can see the block ordering:


Original query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         1         12 12         AAAmeCAAFAAAAECAAB
         1         22 22         AAAmeCAAFAAAAEDAAB
         1         32 32         AAAmeCAAFAAAAEEAAB
         1         42 42         AAAmeCAAFAAAAEFAAB
         1         52 52         AAAmeCAAFAAAAEGAAB

Modified query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         2          3 3          AAAmeCAAFAAAAEBAAC
         3          4 4          AAAmeCAAFAAAAEBAAD
         1         12 12         AAAmeCAAFAAAAECAAB
         2         13 13         AAAmeCAAFAAAAECAAC
         3         14 14         AAAmeCAAFAAAAECAAD
         1         22 22         AAAmeCAAFAAAAEDAAB
         2         23 23         AAAmeCAAFAAAAEDAAC
         3         24 24         AAAmeCAAFAAAAEDAAD

By rewriting the query I’ve managed to force a “cluster by rowid” on the data access. Of course, the simpler solution would be to add the /*+ cluster_by_rowid() */ hint to the original query – but it doesn’t work for bitmap indexes, and when I found that it worked for B-tree indexes the next test I did was to try a single bitmap index, which resulted in my writing this note.

Footnote: I don’t really expect Oracle Corp. to modify their code to make the hint work with bitmaps, after all it’s only relevant in the special case of using a bitmap index with a range scan and no subsequent bitmap AND/OR/MINUS operations where it would be needed – and you’re not really expected to use a single bitmap index to access a table, we engineer bitmaps to take advantage of combinations.

30 Comments »

  1. There’s a CHECK constraint on that table, CHECK(bit_1 = 1), which gets successfully injected into the (or-expanded) exec-plan of the second query, resulting in a FALSE predicate in one of the expansions.
    And injection of the constraint into the first query won’t result into a ‘merge’ with the between-predicate.

    Comment by Toon Koppelaars — September 9, 2014 @ 6:52 pm GMT Sep 9,2014 | Reply

    • I’m not sure I’d give lots of marks for plausibility for a constraint that bit_1 = 1 and a query query that expected other values ;)

      However, it’s an interesting thought that small modifications to the query that took advantage of known constraints MIGHT allow the optimizer to pick a better path. I played around with this briefly after seeing your comment, and in 11.2.0.4 it looked like a UNION ALL was more effective than an OR condition for helping the optimizer. I don’t think it was programmed to notice that the two conditions were non-overlapping.

      Comment by Jonathan Lewis — September 10, 2014 @ 10:31 am GMT Sep 10,2014 | Reply

  2. I would expect the concatenation of the two interim result sets from the second query to lower the overall ‘cost’ of the operation.

    Comment by dfitzjarrell — September 9, 2014 @ 7:02 pm GMT Sep 9,2014 | Reply

    • In general I don’t think I would expect that, unless the two separate parts could choose different access paths from the path chosen for the simpler predicate – and there may be some circumstances where this is the case (see later comments).

      On a side note, I think I have written a blog about a case where the optimizer’s cost calculation of “colX between k1 and k3” was higher than the calculated cost of “colX between k1 and k2 or (colX > k2 and colX <= k3) – but this was on an older version of Oracle and related to date conversions and "unknown values".

      Comment by Jonathan Lewis — September 10, 2014 @ 10:36 am GMT Sep 10,2014 | Reply

  3. Is the table partitioned?

    Comment by savvinov — September 9, 2014 @ 7:39 pm GMT Sep 9,2014 | Reply

    • In the example I was thinking of it wasn’t – but with the “Table Expansion” mechanisms of Oracle 11, and the partial indexing of 12c it’s a thought to pursue.
      There may be cases where a rewrite could take advantage of either feature when the basic query fails to – and you might want to apply a UNION ALL approach to the same type of problem in 10g.
      Franck Pachot demonstrated the 12c effect on twitter: https://twitter.com/FranckPachot/status/509433389863018497

      Comment by Jonathan Lewis — September 10, 2014 @ 10:42 am GMT Sep 10,2014 | Reply

  4. It is bit table it can only have value 0 or 1. so second query will use index and work first:). whereas cordiality of first query will be always 3*no of row which will make it use full table scan i guess.

    Comment by Hansie — September 9, 2014 @ 7:57 pm GMT Sep 9,2014 | Reply

    • There are a couple of anomalies about range-based predicates going out-of-range, but without building a model and running the test (I’m thinking a small number of 1’s, lots of zero’s and a histogram) I don’t think I could predict exactly what would happen. Maybe someone will test the idea to see.

      Comment by Jonathan Lewis — September 10, 2014 @ 10:48 am GMT Sep 10,2014 | Reply

  5. If you will have just ONE index on bit_1 column, then, perhaps both queries might perform equally efficiently. However, the reason that you have several indexes, and more importantly you have split the BETWEEN condition more EXPLICITLY in the second query, therefore the optimizer can find an optimal path depending on which index it chooses out of those several indexes.

    Regards,
    Lalit Kumar B

    Comment by Lalit Kumar B — September 9, 2014 @ 8:08 pm GMT Sep 9,2014 | Reply

    • Your hypothesis is a starting point, but I think that the “plausibility” requirement means you have to come up with a slightly more concrete description of the indexing that might allow you suggestion to be viable.

      Comment by Jonathan Lewis — September 10, 2014 @ 10:50 am GMT Sep 10,2014 | Reply

      • Jonathan,

        Firstly, good to see you back with regular posts after your vacation :-)

        Regarding the test case to prove what I explained in words, I would wait sometime to see further replies and then I would post my test case.

        I really appreciate your initiative of quiz, very nice :-) With the basic subject areas that you are picking for new posts, I am getting new thoughts for my blog posts.

        Comment by Lalit Kumar B — September 10, 2014 @ 2:56 pm GMT Sep 10,2014 | Reply

  6. > Examine them carefully, then come up with
    > some plausible reason why it’s possible

    First query: None of the indexes have bit_1 on the leading edge ultimately resulting in the CBO opting for a full table scan as the execution plan with lowest cost.

    Second query: An index with bit_1 as the second column in a composite key exists, and the predicate bit_1=1 is sufficiently selective on the data distribution of bit_1 to tip the CBO in favour of choosing an index skip scan (resulting in a more efficient/lower cost query).

    Comment by Michael D O'Shea — September 9, 2014 @ 8:34 pm GMT Sep 9,2014 | Reply

    • I think you may have to expand on this a little bit – what happens about the data for the >1 <=3 part of the query ? Are you thinking that the split might allow Oracle to chose two different indexes, perhaps because of clustering_factors and skewed data with a histogram highlighting a skew on the 1 value ?

      Comment by Jonathan Lewis — September 10, 2014 @ 11:09 am GMT Sep 10,2014 | Reply

      • Hi Jonathan, my first thoughts when you posted the quiz query was that it was just an Oracle bug.

        Subsequent and less immediate thoughts sent me down a line of thinking of what SQL predicates and data distribution of bit_1 could result in the CBO determining that an index skip scan would be more appropriate (than a full table scan). So grasping for an explanation based on the paucity of detail regarding indexes etc, my line of thinking was that the optimiser sees bit_1=1 in the SQL, determines current table stats aren’t good enough, dynamic sampling is performed, a better representation of bit_1 data distribution is now known, and that an index skip scan on an index with bit_1 not as the leading edge is chosen as lower cost.

        I look forward to you documenting the real explanation !

        Comment by Michael D O'Shea — September 10, 2014 @ 11:26 am GMT Sep 10,2014 | Reply

  7. Hi Jonathan,

    It is the ability of the optimizer to eliminate a predicate before optimizing that might make the second query more efficient than the first one.
    If the optimizer knows that,for example, bit_1 is always equal to 1 then it can infer that the predicate bit_1 = 1 evaluates to true which allows
    the elimination of the second predicate (or bit_1 > 1 and bit_1 <= 3)

    That's said I played with the example below and I couldn't reproduce what I claimed above

    drop table bit_tab;
    
    create table bit_tab (id number, bit_1 number , small_vc varchar2(30));
    
    alter table bit_tab modify bit_1 not null;
    
    alter table bit_tab add constraint bit_tab_bit_1_ck check (bit_1 = 1);
    
    create index ind_bit_1 on bit_tab(bit_1);
    
    insert into bit_tab values (100, 1, 'xxxx');
    
    commit;
    
    explain plan for
    select
            bit_1, id, small_vc
    from
            bit_tab
    where
            bit_1 between 1 and 3
    ;
    
    select * from table (dbms_xplan.display);
    
    -------------------------------------------------------------------------------------------------
    | Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                    |           |     1 |    43 |     1   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| BIT_TAB   |     1 |    43 |     1   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN                  | IND_BIT_1 |     1 |       |     1   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - access("BIT_1">=1 AND "BIT_1"<=3)
     
    explain plan for
    select
            bit_1, id, small_vc
    from
            bit_tab
    where
            bit_1 = 1
    or     (bit_1 > 1 and bit_1 <= 3)
    ;
    
    --------------------------------------------------------------------------------------------------
    | Id  | Operation                            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                     |           |     2 |    86 |     1   (0)| 00:00:01 |
    |   1 |  CONCATENATION                       |           |       |       |            |          |
    |   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| BIT_TAB   |     1 |    43 |     0   (0)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN                  | IND_BIT_1 |     1 |       |     0   (0)| 00:00:01 |
    |   4 |   TABLE ACCESS BY INDEX ROWID BATCHED| BIT_TAB   |     1 |    43 |     1   (0)| 00:00:01 |
    |*  5 |    INDEX RANGE SCAN                  | IND_BIT_1 |     1 |       |     1   (0)| 00:00:01 |
    --------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       3 - access("BIT_1">1 AND "BIT_1"<=3)
       5 - access("BIT_1"=1)
           filter(LNNVL("BIT_1">1) OR LNNVL("BIT_1"<=3))
    

    Best regards
    Mohamed

    Comment by hourim — September 9, 2014 @ 8:52 pm GMT Sep 9,2014 | Reply

    • Mohamed,

      I was interested to see that Oracle produced a concatenation automatically for that query – I wasn’t expecting that as I read through your code.
      I think you’re example is a demonstration of the suggestion Toon made above – and there may be variations where a difference does appear.

      Comment by Jonathan Lewis — September 10, 2014 @ 11:16 am GMT Sep 10,2014 | Reply

  8. After ruling out all the possible things you said don’t intervene, the only thing I can think of may affect the result is the data distribution. I’ve created a similar table to the one you use in

    https://jonathanlewis.wordpress.com/2012/03/09/index-upgrades/

    if you create the same table ordered by n1, both queries work pretty much the same.

    Comment by Bertran Saragusti — September 9, 2014 @ 9:28 pm GMT Sep 9,2014 | Reply

  9. The second query requires less table access to get the data than the first query. This may be a result of the BITMAP OR operation in the second query.

    Comment by Brian — September 9, 2014 @ 10:51 pm GMT Sep 9,2014 | Reply

  10. I think it’s possible for the optimizer to come up with a concatenation of a full scan of a single partition and an index range scan of another partition. So if we have one partition with bit_1 = 1, and another partition with a wide range of values for bit_1 (with just a small % of rows between 1 and 3), that would be a good optimization. Unfortunately I don’t have an opportunity to test this theory atm, will try later if I have time

    Comment by savvinov — September 10, 2014 @ 6:13 am GMT Sep 10,2014 | Reply

  11. We think we have one table and several indexes.And we can consider based on data distribution
    first query(also second how?) use INDEX ACCESS PATH(range).But rewriting query with second form I thimk it allows optimizer
    to examine more “additional” access paths than first form.
    For example (we dont know exactly which type indexes there are),to gettiong rowid before scanning bit_tab table rdbms shuld be scan index via INDEX RANGE(bitmap or b*tree) scan for first query.
    But second form of query it can be get “less” rowids before scanning table using another operations like BITMAP MERGE(after BITMAP OR).
    As result for second query it do “less” I/O so cost will be “cheap” than first query.

    Comment by Chinar Aliyev — September 10, 2014 @ 6:57 am GMT Sep 10,2014 | Reply

    • Good general principle – if the optimizer can’t find a good path with the SQL in this form is there another form that let’s it find a better path.
      Rewrites MAY make alternative paths available, and that’s what’s happened here – and the OR/MERGE is exactly it.

      Comment by Jonathan Lewis — September 10, 2014 @ 11:21 am GMT Sep 10,2014 | Reply

  12. For my test both seems to be equally efficient.

    SQL> select /*+ gather_plan_statistics */
      2          bit_1, id, small_vc
      3  from
      4          bit_tab
      5  where
      6          bit_1 between 1 and 3
      7  ;
    
         BIT_1         ID SMALL_VC
    ---------- ---------- ------------------------------
             1       2460 xxxxxxxxx
             2        765 xxxxxxxxx
             3       2695 xxxxxxxxx
    
    SQL&gt; select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------------------------
    SQL_ID  ak1m6y4w26atk, child number 0
    -------------------------------------
    select /*+ gather_plan_statistics */         bit_1, id, small_vc from
          bit_tab where         bit_1 between 1 and 3
    
    Plan hash value: 2508708471
    
    ---------------------------------------------------------------------------------------------------
    | Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    ---------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT            |           |      1 |        |      3 |00:00:00.01 |       5 |
    |   1 |  TABLE ACCESS BY INDEX ROWID| BIT_TAB   |      1 |      3 |      3 |00:00:00.01 |       5 |
    |*  2 |   INDEX RANGE SCAN          | IND_BIT_1 |      1 |      3 |      3 |00:00:00.01 |       3 |
    ---------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("BIT_1"&gt;=1 AND "BIT_1" select /*+ gather_plan_statistics */
      2          bit_1, id, small_vc
      3  from
      4          bit_tab
      5  where
      6          bit_1 = 1
      7  or      bit_1 &gt; 1 and bit_1  <= 3
      8  ;
    
    SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------------------------
    SQL_ID  4v60vspruaqf1, child number 0
    -------------------------------------
    select /*+ gather_plan_statistics */         bit_1, id, small_vc from
          bit_tab where         bit_1 = 1 or      bit_1 &gt; 1 and bit_1 1 AND "BIT_1"
    

    Comment by vipankumarsharma — September 10, 2014 @ 7:43 am GMT Sep 10,2014 | Reply

  13. Still messed up, probably because of mistaking “greater than” and “less than” with tags… trying again:

    create table bit_tab
    (
      bit_1 number,
      id  number,
      small_vc number
    )
    partition by  list (bit_1)
    (
      partition p1 values (1),
      partition p_rest values (default)
    );
    
    insert into bit_tab
    select 1, level, null
    from dual
    connect by level &amp;lt= 1e5;
    
    commit;
    
    insert into bit_tab
    select level, 1e5+level, null
    from dual
    connect by level &amp;lt= 1e5;
    
    create index i$bit_tab on bit_tab(bit_1);
    
    exec dbms_stats.gather_table_stats(user, 'BIT_TAB', estimate_percent=&amp;gt100);
    
    explain plan for
    select bit_1, id, small_vc
    from bit_tab
    where bit_1 = 1
    or bit_1 &amp;gt 1 and bit_1 &amp;lt= 3;
    
    
    select * from table(dbms_xplan.display);
    
    Plan hash value: 3167729278
     
    -----------------------------------------------------------------------------------------------------------------
    | Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
    -----------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                    |           |   100K|   781K|    75   (4)| 00:00:01 |       |       |
    |   1 |  CONCATENATION                      |           |       |       |            |          |       |       |
    |   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| BIT_TAB   |     2 |    20 |     4   (0)| 00:00:01 |     2 |     2 |
    |*  3 |    INDEX RANGE SCAN                 | I$BIT_TAB |     4 |       |     3   (0)| 00:00:01 |       |       |
    |   4 |   PARTITION LIST SINGLE             |           |   100K|   781K|    71   (5)| 00:00:01 |   KEY |   KEY |
    |*  5 |    TABLE ACCESS FULL                | BIT_TAB   |   100K|   781K|    71   (5)| 00:00:01 |     1 |     1 |
    -----------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       3 - access("BIT_1"&amp;gt1 AND "BIT_1"&amp;lt=3)
       5 - filter("BIT_1"=1 AND (LNNVL("BIT_1"&amp;lt=3) OR LNNVL("BIT_1"&amp;gt1)))
       
    explain plan for 
    select bit_1, id, small_vc
    from bit_tab
    where bit_1 between 1 and 3;
    
    select * from table(dbms_xplan.display);
    
    Plan hash value: 2513941102
     
    ---------------------------------------------------------------------------------------------------
    | Id  | Operation               | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
    ---------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT        |         | 99244 |   872K|   138   (3)| 00:00:02 |       |       |
    |   1 |  PARTITION LIST ITERATOR|         | 99244 |   872K|   138   (3)| 00:00:02 |   KEY |   KEY |
    |*  2 |   TABLE ACCESS FULL     | BIT_TAB | 99244 |   872K|   138   (3)| 00:00:02 |   KEY |   KEY |
    ---------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - filter("BIT_1"&amp;lt=3 AND "BIT_1"&amp;gt=1)
    

    Comment by savvinov — September 10, 2014 @ 8:38 am GMT Sep 10,2014 | Reply

    • Nice example.
      If you have partitions with skewed data distributions you could find that the optimizer picks one “average” path for all the partitions when catering to individual partitions with different paths would be much more efficient. the rewrite gave the optimizer an opportunity to find that alternative.

      (I edited this, and deleted the previous two mangled attempts)

      Comment by Jonathan Lewis — September 10, 2014 @ 11:25 am GMT Sep 10,2014 | Reply

  14. Jonathan,

    in theory I could imagine a situation in which a concatenation of an index access that is only very efficient for the case (bit_1 = 1) and an index access that is only efficient for (bit_1 > 1 and bit_1 <= 3) could be more efficient than using the same index for both cases. Maybe if there is an index on bit_1 with a very good clustering for bit_1 = 1, but very poor clustering for the other values – and another index with the good clustering for case 2 and the bad clustering for case 1. But that said I did not manage to create a fitting example immediately.

    Another option would be the combination of FBIs (excluding the irrelevant entries from the index), but that would afford a change in the query as far as I know (maybe that would be a use case for the combination of multiple filtered indexes as some other RDBMS use them).

    Regards

    Martin

    Comment by Martin Preiss — September 10, 2014 @ 11:35 am GMT Sep 10,2014 | Reply

    • Martin,

      Something like:
      Index (col1, col2)
      Index (col1, col3)

      col2 = 0 for all rows where col1 = 1 — it’s a “new row” value — but randomly generated for all other values of col1.
      col3 = 0 for all rows where col1 != 1 but randomly generated for col1 = 1

      Assume rows for each of col1 = 1, col1 = 2 and col1 = 3 are fairly well clustered – but (col1, col2) looks like a poor index for col1 in (2,3) and (col1, col3) looks like a poor index for col1 = 1
      With suitable data patterns there’s a saving in buffer visits that could be worth making by picking the right index for the right values of col1.

      (Of course, you shouldn’t really have two indexes starting the same way ,…)

      Comment by Jonathan Lewis — September 10, 2014 @ 11:57 am GMT Sep 10,2014 | Reply

  15. Running an example using various selectivities on the bit_1 column provides the following output and plans (this on 11.2.0.3 with single column indexes on the bit_1, bit_2 and small_vc columns):

    
    SQL&gt; 
    SQL&gt; select
      2  	     bit_1, id, small_vc
      3  from
      4  	     bit_tab
      5  where
      6  	     bit_1 between 1 and 3
      7  ;
    
         BIT_1         ID SMALL_VC                                                                                                                        
    ---------- ---------- --------------------                                                                                                            
             1          3 Rec1                                                                                                                            
             2          6 Rec2                                                                                                                            
             3          9 Rec3                                                                                                                            
    
    
    Execution Plan
    ----------------------------------------------------------                                                                                            
    Plan hash value: 3478641842                                                                                                                           
                                                                                                                                                          
    --------------------------------------------------------------------------------------------                                                          
    | Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
    --------------------------------------------------------------------------------------------                                                          
    |   0 | SELECT STATEMENT            |              |     3 |   114 |     3   (0)| 00:00:01 |                                                          
    |   1 |  TABLE ACCESS BY INDEX ROWID| BIT_TAB      |     3 |   114 |     3   (0)| 00:00:01 |                                                          
    |*  2 |   INDEX RANGE SCAN          | BIT_TAB_IDX1 |     3 |       |     2   (0)| 00:00:01 |                                                          
    --------------------------------------------------------------------------------------------                                                          
                                                                                                                                                          
    Predicate Information (identified by operation id):                                                                                                   
    ---------------------------------------------------                                                                                                   
                                                                                                                                                          
       2 - access("BIT_1"&gt;=1 AND "BIT_1" 
    SQL&gt; prompt  ===========
    ===========
    SQL&gt; prompt  Split query
    Split query
    SQL&gt; prompt  ===========
    ===========
    SQL&gt; 
    SQL&gt; select
      2  	     bit_1, id, small_vc
      3  from
      4  	     bit_tab
      5  where
      6  	     bit_1 = 1
      7  or      bit_1 &gt; 1 and bit_1 1 AND "BIT_1"1) OR LNNVL("BIT_1" 
    SQL&gt; 
    SQL&gt; select
      2  	     bit_1, id, small_vc
      3  from
      4  	     bit_tab
      5  where
      6  	     bit_1 between 1 and 3
      7  ;
    
         BIT_1         ID SMALL_VC                                                                                                                        
    ---------- ---------- --------------------                                                                                                            
             1          3 Rec1                                                                                                                            
             2          6 Rec2                                                                                                                            
             3          9 Rec3                                                                                                                            
             1         54 Rec18                                                                                                                           
             2         57 Rec19                                                                                                                           
             3         60 Rec20                                                                                                                           
    ...
             2     219255 Rec73085                                                                                                                        
             3     219258 Rec73086                                                                                                                        
             1     219303 Rec73101                                                                                                                        
             2     219306 Rec73102                                                                                                                        
             3     219309 Rec73103                                                                                                                        
    
    17649 rows selected.
    
    
    Execution Plan
    ----------------------------------------------------------                                                                                            
    Plan hash value: 12981459                                                                                                                             
                                                                                                                                                          
    -----------------------------------------------------------------------------                                                                         
    | Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |                                                                         
    -----------------------------------------------------------------------------                                                                         
    |   0 | SELECT STATEMENT  |         | 21084 |   782K|   103   (1)| 00:00:01 |                                                                         
    |*  1 |  TABLE ACCESS FULL| BIT_TAB | 21084 |   782K|   103   (1)| 00:00:01 |                                                                         
    -----------------------------------------------------------------------------                                                                         
                                                                                                                                                          
    Predicate Information (identified by operation id):                                                                                                   
    ---------------------------------------------------                                                                                                   
                                                                                                                                                          
       1 - filter("BIT_1"&gt;=1 AND "BIT_1" 
    SQL&gt; prompt  ===========
    ===========
    SQL&gt; prompt  Split query
    Split query
    SQL&gt; prompt  ===========
    ===========
    SQL&gt; 
    SQL&gt; select
      2  	     bit_1, id, small_vc
      3  from
      4  	     bit_tab
      5  where
      6  	     bit_1 = 1
      7  or      bit_1 &gt; 1 and bit_1 1 AND "BIT_1" 
    SQL&gt; 
    SQL&gt; select
      2  	     bit_1, id, small_vc
      3  from
      4  	     bit_tab
      5  where
      6  	     bit_1 between 1 and 3
      7  ;
    
         BIT_1         ID SMALL_VC                                                                                                                        
    ---------- ---------- --------------------                                                                                                            
             1          3 Rec1                                                                                                                            
             1        534 Rec178                                                                                                                          
             1       3189 Rec1063                                                                                                                         
    ...
             3     299493 Rec99831                                                                                                                        
    
    1695 rows selected.
    
    
    Execution Plan
    ----------------------------------------------------------                                                                                            
    Plan hash value: 3478641842                                                                                                                           
                                                                                                                                                          
    --------------------------------------------------------------------------------------------                                                          
    | Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |                                                          
    --------------------------------------------------------------------------------------------                                                          
    |   0 | SELECT STATEMENT            |              |  1695 | 64410 |    16   (0)| 00:00:01 |                                                          
    |   1 |  TABLE ACCESS BY INDEX ROWID| BIT_TAB      |  1695 | 64410 |    16   (0)| 00:00:01 |                                                          
    |*  2 |   INDEX RANGE SCAN          | BIT_TAB_IDX1 |  1695 |       |     8   (0)| 00:00:01 |                                                          
    --------------------------------------------------------------------------------------------                                                          
                                                                                                                                                          
    Predicate Information (identified by operation id):                                                                                                   
    ---------------------------------------------------                                                                                                   
                                                                                                                                                          
       2 - access("BIT_1"&gt;=1 AND "BIT_1" 
    SQL&gt; prompt  ===========
    ===========
    SQL&gt; prompt  Split query
    Split query
    SQL&gt; prompt  ===========
    ===========
    SQL&gt; 
    SQL&gt; select
      2  	     bit_1, id, small_vc
      3  from
      4  	     bit_tab
      5  where
      6  	     bit_1 = 1
      7  or      bit_1 &gt; 1 and bit_1 1 AND "BIT_1"1) OR LNNVL("BIT_1" 
    
    

    Notice in two of my three cases CONCATENATION appears to provide a lower-cost ‘solution’. Of course this isn’t a rigorous test so other plans may be generated when the bit_1 selectivity changes.

    Comment by dfitzjarrell — September 10, 2014 @ 4:51 pm GMT Sep 10,2014 | Reply

    • David,

      It looks like lots of your example got lost because of “less than / greater than” issues.
      If it’s easy to re-enter, then use the ‘sourcecode’ tag to protect it. The word goes in square brackets without the single quote marks. Open with sourcecode, close with /sourcecode (i.e. preceded by slash).

      Playing these games can pick out some very odd quirks in the optimizer arithmetic – but what we’re really interested in is the actual work done particularly (in this case) when the strategy seems to be virtually unchanged, viz: it’s really just doing an index range scan in a funny way.

      Comment by Jonathan Lewis — September 10, 2014 @ 6:24 pm GMT Sep 10,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.