Index Sizing

May 17, 2012

Index Sizing

Filed under: Indexing,Infrastructure,Oracle — Jonathan Lewis @ 8:53 am BST May 17,2012

I was in a discussion recently about how to estimate the size of a bitmap index before you build it, and why it’s much harder to do this for bitmap indexes than it is for B-tree indexes. Here’s what I wrote in “Practical Oracle 8i”:

An interesting feature of bitmap indexes is that it is rather hard to predict how large the index segment will be. The size of a b-tree index is based very closely on the number of rows and the typical size of the entries in the index column. The size of a bitmap index is dictated by a fairly small number of bit-strings which may have been compressed to some degree depending upon the number of consecutive 1’s and 0’s. To pick an extreme example, imagine a table of one million rows that has one column that may contain one of eight values ‘A’ to ‘H’ say, which has been generated in one of of the two following extreme patterns:

All the rows for a given value appear together, so scanning down the table we get 125,000 rows with ‘A’ followed by 125,000 rows of ‘B’ and so on.
The rows cycle through the values in turn, so scanning down the table we get ‘A’,’B’. . . ‘H’ repeated 125,000 times.

What will the bitmap indexes look like in the two cases?

For the first example, the basic map for the ‘A’ value will be 125,000 one-bits, followed by 875,000 zero bits – which will be trimmed off. Splitting the 125,000 bits into bytes and adding the necessary overhead of about 12% we get an entry for the ‘A’ rows of 18K. A similar argument applies for each of the values ‘B’ to ‘H’, so we get a total index size of around 8 x 18K – giving 156K.

For the second example, the basic map for the ‘A’ value will be a one followed by 7 zeros, repeated 125,000 times. There is no chance of compression here, so the ‘A’ entry will start at 125,000 bytes. Adding the overhead this goes up to 140K, and repeating the argument for the values ‘B’ to ‘H’ we get a total index of 1.12 MB.

This wild variation in size looks like a threat, but to put this into perspective, a standard B-tree index on this column would run to about 12 Mb irrespective of the pattern of the data. It would probably take about ten times as long to build as well.

I wrote up a test case to recreate this model some time ago, so here it is with the results from an instance of 11.1.0.7:

rem
rem     Script:         bitmap_size.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2012
rem

create table t1
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual 
        connect by 
                level <= 10000
)
select
        chr(65 + mod(rownum-1,8))               bit_scattered,
        chr(65+trunc((rownum-1)/125000))        bit_clustered
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create bitmap index t1_b1_scattered on t1(bit_scattered);
create bitmap index t1_b1_clustered on t1(bit_clustered);

select
        index_name, leaf_blocks, 8 * leaf_blocks KB
from
        user_indexes
where
        table_name = 'T1'

;

set doc off
doc

Results from 11.1.0.7
---------------------

INDEX_NAME           LEAF_BLOCKS         KB
-------------------- ----------- ----------
T1_B1_SCATTERED              164       1312
T1_B1_CLUSTERED               24        192

2 rows selected.
#

So no big change since 8i.

If you modify the code to create B-tree indexes you’ll find they are 14MB each if you don’t use compression, 12MB if you do.

Comments (2)

2 Comments »

[…] Jonathan Lewis was in a discussion recently about how to estimate the size of a bitmap index before you build it, and why it’s much harder to do this for bitmap indexes than it is for B-tree indexes. […]

Pingback by Log Buffer #272, A Carnival of the Vanities for DBAs | The Pythian Blog — May 18, 2012 @ 7:01 am BST May 18,2012 | Reply
[…] Bitmap sizing (May 2012): Why it’s hard to predict the size of a bitmap index […]

Pingback by Indexing Catalogue | Oracle Scratchpad — August 8, 2022 @ 9:55 am BST Aug 8,2022 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome. Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Oracle Scratchpad

May 17, 2012

Index Sizing

2 Comments »

Comments and related questions are welcome. Cancel reply

Click on the motto below

Everything Changes Eventually

Search this blog

Categories

Special Links

Recent Posts

Recent Comments

Popular articles

Popular References

Archives

Blogroll

Posts by RSS

Oracle Scratchpad

May 17, 2012

Index Sizing

Share this:

Related

2 Comments »

Comments and related questions are welcome. Cancel reply

Click on the motto below

Everything Changes Eventually

Search this blog

Categories

Special Links

Recent Posts

Recent Comments

Popular articles

Popular References

Archives

Blogroll

Posts by RSS