Oracle Scratchpad

January 28, 2016

Bitmap Efficiency

Filed under: bitmaps,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 1:02 pm GMT Jan 28,2016

An interesting observation came up on the Oracle-L list server a few days ago that demonstrated how clever the Oracle software is at minimising run-time work, and how easy it is to think you know what an execution plan means when you haven’t actually thought through the details – and the details might make a difference to performance.

The original question was about a very large table with several bitmap indexes, and an anomaly that appeared as a query changed its execution plan.  Here are the critical sections from the plans (extracted from memory with rowsource execution statistics enabled):

--------------------------------------------------------------------------------------------------------
|  Id |Operation                        | Name       | Starts | E-Rows | A-Rows |     A-Time | Buffers |
--------------------------------------------------------------------------------------------------------
|   6 |    TABLE ACCESS BY INDEX ROWID  |       FACT |      1 |      1 |     24 |00:00:00.01 |      31 |
|   7 |     BITMAP CONVERSION TO ROWIDS |            |      1 |        |     24 |00:00:00.01 |       7 |
|   8 |      BITMAP AND                 |            |      1 |        |      1 |00:00:00.01 |       7 |
|*  9 |       BITMAP INDEX SINGLE VALUE |     FACT_0 |      1 |        |      1 |00:00:00.01 |       3 |
|* 10 |       BITMAP INDEX SINGLE VALUE |  FACT_DIM1 |      1 |        |      4 |00:00:00.01 |       4 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
     9 - access("FACT"."C0"=243001)
    10 - access("FACT"."C1"="DIMENSION1"."ID")


-------------------------------------------------------------------------------------------------------
|  Id | Operation                      | Name       | Starts | E-Rows | A-Rows |     A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
|   7 |    BITMAP CONVERSION TO ROWIDS |            |      5 |        |      8 |00:00:00.01 |     119 |
|   8 |     BITMAP AND                 |            |      5 |        |      1 |00:00:00.01 |     119 |
|*  9 |      BITMAP INDEX SINGLE VALUE |  FACT_DIM1 |      5 |        |     20 |00:00:00.01 |      28 |
|* 10 |      BITMAP INDEX SINGLE VALUE |  FACT_DIM2 |      5 |        |    140 |00:00:00.01 |      78 |
|* 11 |      BITMAP INDEX SINGLE VALUE |     FACT_0 |      5 |        |      5 |00:00:00.01 |      13 |
|  12 |   TABLE ACCESS BY INDEX ROWID  |       FACT |      8 |      1 |      8 |00:00:00.01 |       8 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
     9 - access("FACT"."C1"="DIMENSION1"."ID")
    10 - access("FACT"."C2"="DIMENSION2"."ID")
    11 - access("FACT"."C0"=243001)

The first plan shows the steps leading to a single access (Starts = 1) to the FACT table after combining two bitmap indexes; the second shows the second child of a nested loop join where Oracle has combined three bitmaps indexes to access the FACT table – operation 7 (and its descendants) execute 5 times in this case. I’ve included the related parts of the predicate section so that you can see that the predicates at operations 9 and 10 of the first plan are the same as the predicates at operations 9 and 11 of the second plan.

So here’s the question – if one access to fact_dim1 requires 4 buffer visits, why does it take 28 buffer visits to do the same thing 5 times (and it is with the same value every time); conversely if one access to fact_0 requires 3 buffer visits, why do 5 visits to do the same thing take only 13 buffer visits. (Note: the arithmetic is made a little more obscure by the way in which index branch blocks may be pinned during nested loop joins.)

Then there’s a further question – not visible in the plan – the A-Rows in the “BITMAP INDEX SINGLE VALUE” operation is the number of bitmap sections in the rowsource, and we can see that the key values for index fact_dim2 have a significant number of bitmap chunks for a single key (5 executions returned 140 bitmap chunks). This scale, though, is true of all three indexes – in fact a follow-up email pointed out that a typical key value in EVERY ONE of the three indexes consisted of about 100 bitmap chunks, so why can’t we see those hundreds in the execution plan ?

So this is where we’re at: we have an execution plan where we haven’t visited all the bitmap chunks for a bitmap key, and the order in which the bitmap indexes are used in the plan seems to have some effect on the choice of leaf-blocks you visit when accessing the chunks. So (a) could a change in the order of indexes make a significant difference to the number of bitmap chunks you visit and the resulting performance, and (b) is there a way to control the order in which you visit the indexes. That’s where the note starts to get a bit technical – if you don’t want to read any more the answers are: (a) yes but probably not significantly and (b) yes.

Demo

To investigate what goes on inside a “BITMAP AND” I created a table with two bitmap indexes and used a very large setting for pctfree for the indexes so that they had to be stored with a large number of bitmap chunks per key. Here’s the code that I used, with some results from an instance of 12.1.0.2:


create table people
nologging
as
with generator as (
        select  --+ materialize 
                rownum id 
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum-1, 1e2)      id_town_home,
        trunc((rownum-1)/1e4)   id_town_work,
        rpad('x',10,'x')        small_vc,
        rpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;
begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'PEOPLE',
                method_opt       => 'for all columns size 1'
        );
end;
/

create bitmap index pe_home on people(id_town_home) nologging pctfree 95;
create bitmap index pe_work on people(id_town_work) nologging pctfree 95;

select
        index_name, distinct_keys, num_rows, leaf_blocks, avg_leaf_blocks_per_key
from
        user_indexes
where
        table_name = 'PEOPLE'
order by
        index_name
;


INDEX_NAME           DISTINCT_KEYS   NUM_ROWS LEAF_BLOCKS AVG_LEAF_BLOCKS_PER_KEY
-------------------- ------------- ---------- ----------- -----------------------
PE_HOME                        100      30399       15200                     152
PE_WORK                        100       1800         907                       9

As you can see I’ve generated two columns (id_town_home, id_town_work) with 100 distinct values and 10,000 rows each, but with very different data distributions – the rows for any given value for id_town_home are uniformly spread across the entire table, every hundredth row; while the rows for any given value of id_town_work are very tightly clustered as a group of 10,000 consecutive rows. As a consequence the index entry (bitmap string) for a typical key value for id_town_home is enormous and has to be broken into 304 chunks spread across 152 leaf blocks (2 index entries per leaf block), while the index entry for a typical key value for id_town_work is much shorter, but still requires 18 chunks spread across 9 leaf blocks.

So what will I see if I run the following query, and force it to use a BITMAP AND of the two indexes, in the two different orders:

select
        /*+ index_combine(pe) */
        max(small_vc)
from
        people pe
where
        id_town_home = 50
and     id_town_work = 50
;

Based on a very simple interpretation of the typical execution plan and using the index stats shown above we might expect to see roughly A-Rows = 18 with 9 buffer gets (plus a few more for segment headers and branch blocks) on the id_town_work index and A-Rows = 304 with 152 buffer gets on the id_town_home index to allow Oracle to generate and compare the two bit strings – but here are the two plans with their execution stats, generated in 12.1.0.2, and each run after flushing the buffer cache:

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |      1 |        |      1 |00:00:00.01 |     118 |    117 |
|   1 |  SORT AGGREGATE                      |         |      1 |      1 |      1 |00:00:00.01 |     118 |    117 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| PEOPLE  |      1 |    100 |    100 |00:00:00.01 |     118 |    117 |
|   3 |    BITMAP CONVERSION TO ROWIDS       |         |      1 |        |    100 |00:00:00.01 |      18 |     17 |
|   4 |     BITMAP AND                       |         |      1 |        |      1 |00:00:00.01 |      18 |     17 |
|*  5 |      BITMAP INDEX SINGLE VALUE       | PE_WORK |      1 |        |     18 |00:00:00.01 |      14 |     13 |
|*  6 |      BITMAP INDEX SINGLE VALUE       | PE_HOME |      1 |        |      4 |00:00:00.01 |       4 |      4 |
-------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |      1 |        |      1 |00:00:00.01 |     122 |    120 |
|   1 |  SORT AGGREGATE                      |         |      1 |      1 |      1 |00:00:00.01 |     122 |    120 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| PEOPLE  |      1 |    100 |    100 |00:00:00.01 |     122 |    120 |
|   3 |    BITMAP CONVERSION TO ROWIDS       |         |      1 |        |    100 |00:00:00.01 |      22 |     20 |
|   4 |     BITMAP AND                       |         |      1 |        |      1 |00:00:00.01 |      22 |     20 |
|*  5 |      BITMAP INDEX SINGLE VALUE       | PE_HOME |      1 |        |      5 |00:00:00.01 |       8 |      7 |
|*  6 |      BITMAP INDEX SINGLE VALUE       | PE_WORK |      1 |        |     18 |00:00:00.01 |      14 |     13 |
-------------------------------------------------------------------------------------------------------------------

We have NOT touched anything like the entire bit-string for the id_town_home index – a bit-string that spans 152 leaf blocks! Clearly Oracle is doing something clever to minimise the work, and it’s so clever that switching the order of these two extremely different indexes in the plan has made virtually no difference to the work done. Obviously I can’t tell you exactly what the code is doing, but I think I can produce a reasonable guess about what’s going on.

The pe_work index has the smaller number of leaf blocks per key, which makes it the better starting choice for the AND in this case, so the optimizer’s default starting action was to pick the first couple of chunks of that index key value; and Oracle immediately sees that the first rowid that it could possibly need in its result set is roughly in the middle of the table – remember that the “key” columns of a bitmap index are (real_key, first_rowid_of chunk, last_rowid_of_chunk, compressed_bitstring).

Since it now knows the lowest possible rowid that it could need Oracle can now probe the pe_home index by (id_town_home=50, {target_rowid}) – which will let it go to a bitmap index chunk that’s roughly in the middle of the full range of 152. Then Oracle can expand the bitstrings from the chunks it has, reading new chunks as needed from each of the indexes until the 18 chunks / 9 leaf block from the pe_work index have been used up (and that range would have aligned with just two or three chunks from the pe_home index) at which point Oracle can see there’s no more rows in the table that could match both predicates and it doesn’t need to read the next 75 chunks of the pe_home index.

Conversely, when I forced Oracle to use the (inappropriate) pe_home index first, it read the first couple of chunks, then read the first couple of chunks of the pe_work index, at which point it discovered that it didn’t need any of the pe_home index prior to (roughly) chunk 75, so it jumped straight to the right chunk to align with pe_work and carried on from there. That’s why the forced, less efficient, plan that visited pe_home first visited just a couple more leaf blocks than the plan the optimizer selected for itself.

Bottom line on performance (tl;dr) – Oracle is sufficiently smart about checking the start and end ranges on bitmap indexes (rather then arbitrarily expanding the entire bitmap for each key) that even for very large bitmap index entries it will probably only access a couple of “redundant” leaf blocks per index even if it picks the worst possible order for using the indexes. You’re far more likely to notice Oracle picking the wrong indexes (because you know the data better) than you are to spot it using the right indexes in the wrong order – and given that bitmap indexes tend to be relatively small and well buffered (compared to the tables), and given the relatively large number of rows we pick by random I/O from fact tables, a little extra work in the bitmap indexes is unlikely to make a significant difference to the performance of most queries.

Closing fact: in the unlikely circumstances that you do spot the special case where it will make a difference (and it will probably be a difference in CPU usage) then you can dictate the order of the indexes with the undocumented bitmap_tree() hint.  I may get round to writing up the variations one day but, for this simple case, the index_combine() hint that I used to force the BITMAP AND turned into the following bitmap_tree() hint in the outline:

bitmap_tree(@sel$1 pe@sel$1 and((people.id_town_work) (people.id_town_home)))

bitmap_tree( @query_block     table_name@query_block     and( ({first index definition}) ({second index definition}) ) )

Obviously not suitable to throw into production code casually – check with Oracle support if you think it’s really necessary – but if you wanted to reverse the order of index usage in this case you could just swap the order of the index definitions. If you thought there was a third index that should be used you could include its definition (note that it’s table_name.column_name – the index definition – in the brackets).

My reference: bitmap_control_02.sql

1 Comment »

  1. The rowid-based pruning sounds rather similar to a trick I’ve used from time to time when trying to speed up semijoin-using queries: Give the min and max join values separately where the CBO can recognize for sure what range of values it won’t need to go outside of, which helps with e.g. partition elimination. (This was a bigger deal in 10g; the 11g partition elimination supposedly got smarter IIRC.)

    WITH q AS (...)
    SELECT *
    FROM mytab t
    WHERE t.lookup_key IN (SELECT q.lookup_key FROM q)
      AND t.lookup_key >= (SELECT min(q.lookup_key) FROM q)
      AND t.lookup_key <= (SELECT max(q.lookup_key) FROM q)
    

    The extra subqueries are logically redundant but I’ve found them useful once in a while for performance.

    Comment by Jason Bucata — February 3, 2016 @ 6:42 pm GMT Feb 3,2016 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.