Oracle Scratchpad

June 13, 2016

Bitmap Counts

Filed under: bitmaps,Indexing,Oracle — Jonathan Lewis @ 12:40 pm GMT Jun 13,2016

A question came up on the Oracle-L list server a few days ago about a query whose plan showed several bitmap operations. The problem was that the A-Rows column reported by a call to dbms_xplan.display_cursor() was showing numbers that semed to be far too small. In fact the query was producing a parallel execution plan, so the “actuals” for the parallel server operations were reporting zeros because the OP had used the “allstats last” formatting option rather than just “allstats” – but the numbers were still far too small even after this error had been corrected.

This was a detail I’d not noticed before but there was an obvious(footnote 1) guess to explain why this apparent anomaly had appeared, viz: A-Rows counts rows in the rowsource for a table, index entries for a B-tree rowsource, and strings of bits when producing bitmaps. But even the most obvious guess should be checked so here’s some code that will  (at least) corroborate the hypothesis:


rem     Script:         bitmap_counts.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum as number(8,0)                             id,
        mod(rownum - 1,2) as number(8,0)                  n2,
        mod(rownum - 1,100) as number(8,0)                n100,
        lpad(rownum,10,'0') as varchar2(10)               v1,
        lpad('x',100,'x') as varchar2(100)                padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
; 
create bitmap index t1_b2 on t1(n2) nologging;
create bitmap index t1_b100 on t1(n100) nologging;

begin 
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    =>'T1',
                method_opt => 'for all columns size 1'
        );
end;
/

I’ve created a table that includes two columns that I’ve indexed with bitmap indexes. Because of the size of the table and the pattern of the data each distinct value for the two columns will require multiple bitmap index entries in the bitmap index.

The multiple bitmap chunks are important but before I comment further, here’s what the program does after creating the data:


select  index_name, distinct_keys, num_rows, clustering_factor
from    user_indexes
where   table_name = 'T1'
order by
        index_name
;

alter session set statistics_level = all;
set serveroutput off

select
        count(*)
from    t1
where   n2 = 0
and     n100 between 20 and 29
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

The first query reports some index stats – to confirm my comment about multiple index entries per key – the second query is the one that’s going to give me a useful plan. Let’s just check the results of the two queries first – the index stats are the only ones needing any comment:


INDEX_NAME           DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
-------------------- ------------- ---------- -----------------
T1_B100                        100        800               800
T1_B2                            2         94                94


  COUNT(*)
----------
     50000

Index t1_b100 reports 800 index entries – the bitmap chunk for each value had to be split into 8 rowid ranges; we’re interested in 10 key values from this index.

Index t1_b2 shows 94 index entries – the bitmap for each value had to be split into 47 rowid ranges: we’re interested in one key value from this index.

And this is what the plan shows:


-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |      1 |        |      1 |00:00:00.01 |      69 |       |       |          |
|   1 |  SORT AGGREGATE              |         |      1 |      1 |      1 |00:00:00.01 |      69 |       |       |          |
|   2 |   BITMAP CONVERSION COUNT    |         |      1 |  55455 |      2 |00:00:00.01 |      69 |       |       |          |
|   3 |    BITMAP AND                |         |      1 |        |      2 |00:00:00.01 |      69 |       |       |          |
|*  4 |     BITMAP INDEX SINGLE VALUE| T1_B2   |      1 |        |     47 |00:00:00.01 |      25 |       |       |          |
|   5 |     BITMAP MERGE             |         |      1 |        |      2 |00:00:00.01 |      44 |  1024K|   512K|  291K (0)|
|*  6 |      BITMAP INDEX RANGE SCAN | T1_B100 |      1 |        |     80 |00:00:00.01 |      44 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("N2"=0)
   6 - access("N100">=20 AND "N100"<=29)


I can’t explain why the in-memory manipulation of the the bitstrings apparently produces two bitstrings at operations 3 and 5, but given the index stats we can understand that the 47 “rows” reported for operation 4 allow for one of the two key values in the index and the 80 “rows” reported for operation 6 allows for 10 of the key values in the index.   Q.E.D.

Bonus commentary

If you want to see table row counts (or their equivalent) appearing in a bitmap plan you’ll need to run a query that does a “bitmap conversion to rowids”, e.g.:


select
        count(n100)
from    t1
where   n2 = 0
and     n100 between 20 and 29
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |      1 |        |      1 |00:00:05.04 |      67 |       |       |          |
|   1 |  SORT AGGREGATE                |                  |      1 |      1 |      1 |00:00:05.04 |      67 |       |       |          |
|*  2 |   VIEW                         | index$_join$_001 |      1 |  55455 |  50000 |00:00:04.94 |      67 |       |       |          |
|*  3 |    HASH JOIN                   |                  |      1 |        |  50000 |00:00:04.75 |      67 |    25M|  4150K|   27M (0)|
|   4 |     BITMAP CONVERSION TO ROWIDS|                  |      1 |  55455 |    500K|00:00:00.94 |      25 |       |       |          |
|*  5 |      BITMAP INDEX SINGLE VALUE | T1_B2            |      1 |        |     47 |00:00:00.01 |      25 |       |       |          |
|   6 |     BITMAP CONVERSION TO ROWIDS|                  |      1 |  55455 |    100K|00:00:00.19 |      42 |       |       |          |
|*  7 |      BITMAP INDEX RANGE SCAN   | T1_B100          |      1 |        |     80 |00:00:00.01 |      42 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("N100"<=29 AND "N2"=0 AND "N100">=20))
   3 - access(ROWID=ROWID)
   5 - access("N2"=0)
   7 - access("N100">=20 AND "N100"<=29)


Note how 80 “rows” at operation 7 turn into 100,000 rows at operation 6, and 47 “rows” at opration 5 turn into 500,000 rows at operation 4.

I’ve tested 11.2.0.4 and 12.1.0.2 with this code and they both behave the same way. Interestingly the final query (count with index hash join) took about 0.12 seconds to run with rowsource execution statistics disabled, but roughly 5 seconds with statistics enabled – and the extra time was all in the hash join.

Footnote 1:

“Obvious”: provided you’ve been working with the relevent bits of the Oracle software for several years(footnote 2) or have been reading the right books; even then something that’s “obvious” isn’t necessarily correct. ‘Oh, obvious,’ said Granny [Weatherwax]. ‘I’ll grant you it’s obvious. Trouble is, just because things are obvious doesn’t mean they’re true.’ – Wyrd Sisters: Terry Pratchett.

Footnote 2:

‘Oh, it’s largely intuitive, Archchancellor,’ said Ponder.  ‘Obviously you have to spend a lot of time learning it first, though.’ – Hogfather: Terry Pratchett

 

2 Comments »

  1. Hi Jonathan:

    1.

    In my testing, when I traced the select statement, the # of fetch statements did equal the number of
    A-Rows.

    For the BITMAPINDEX SINGLE VALUE:

    A total of 46 qerbxFetch statements:

    Example: qerbxFetch(7f443fcc8560): record: srid=01cef333.0000, erid=01cef4bf.0037, data(3521)=[ce…]

    For the BITMAP INDEX RANGE SCAN:

    A total of 80 qerbxRFetch(7f443fccb3a8): kdifxs statements:

    Example: qerbxRFetch(7f443fccb3a8): kdifxs: srid=01cef5f5.0000, erid=01cefee4.002f, data(3515)=[03…],

    2.

    Also, most of the time, there were 2 bitstrings for each 8k block.

    Example:

    qerbxFetch(7f443fcc8560): record: srid=01cef019.0000, erid=01cef1a5.0037, data(3521)=[ce…]
    qerbxFetch(7f443fcc8560): record: srid=01cef1a6.0000, erid=01cef332.0037, data(3521)=[ce…]

    Comment by Brian — June 14, 2016 @ 5:26 pm GMT Jun 14,2016 | Reply

  2. Brian,

    Thanks for the comment –

    1) you might find it interesting to apply your technique to the type of example I’ve described in this post.

    2) Yes – for very large bitstrings Oracle seems to limit an index entry to slightly less than 50% of the block size. I don’t know why, but the same used to be true of B-tree indexes, though the limit has been increased in recent versions for most (but not quite all) aspects of B-trees – maybe the bitmap limitation is another edge case that could be changed in the future.

    Comment by Jonathan Lewis — June 17, 2016 @ 9:48 am GMT Jun 17,2016 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.