Oracle Scratchpad

April 23, 2015

Golden Oldies

Filed under: bitmaps,Indexing,Oracle — Jonathan Lewis @ 8:45 am GMT Apr 23,2015

I’ve just been motivated to resurrect a couple of articles I wrote for DBAZine about 12 years ago on the topic of bitmap indexes. All three links point to Word 97 documents which I posted on my old website in September 2003. Despite their age they’re still surprisingly good.

Update: 26th April 2015

Prompted by my reply to comment #2 below to look at what I said about bitmap indexes in Practical Oracle 8i (published 15 years ago), and found this gem:

An interesting feature of bitmap indexes is that it is rather hard to predict how large the index segment will be. The size of a B-tree index is based very closely on the number of rows and the typical size of the entries in the index column. The size of a bitmap index is dictated by a fairly small number of bit-strings which may have been compressed to some degree depending upon the number of consecutive 1’s and 0’s.

To pick an extreme example, imagine a table of one million rows that has one column that may contain one of eight values ‘A’ to ‘H’ say, which has been generated in one of of the two following extreme patterns:

  • All the rows for a given value appear together, so scanning down the table we get 125,000 rows with ‘A’ followed by 125,000 rows of ‘B’ and so on.
  • The rows cycle through the values in turn, so scanning down the table we get ‘A’,’B’. . . ‘H’ repeated 125,000 times.

What will the bitmap indexes look like in the two cases case?

For the first example, the basic map for the ‘A’ value will be 125,000 one-bits, followed by 875,000 zero bits – which will be trimmed off. Splitting the 125,000 bits into bytes and adding the necessary overhead of about 12% we get an entry of the ‘A’ rows of 18K. A similar argument applies for each of the values ‘B’ to ‘H’, so we get a total index size of around 8 x 18K – giving 156K.

For the second example, the basic map for the ‘A’ value will be a one followed by 7 zeros, repeated 125,000 times. There is no chance of compression here, so the ‘A’ entry will start at 125,000 bytes. Adding the overhead this goes up to 140K, and repeating the argument for the values ‘B’ to ‘H’ we get a total index of 1.12 MB.

This wild variation in size looks like a threat, but to put this into perspective, a standard B-tree index on this column would run to about 12 Mb irrespective of the pattern of the data. It would probably take about ten times as long to build as well.

As we can see, the size of a bitmap index can be affected dramatically by the packing of the column it depends upon as well as the number of different possible values the column can hold and the number of rows in the table. The compression that is applied before the index is stored, and the amazing variation in the resulting index does mean that the number of different values allowed in the column can be much larger than you might first expect. In fact it is often better to think of bitmap indexes in terms of how many occurrences of each value there are, rather than in terms of how many different values exist. Viewing the issue from this direction, a bitmap is often better than a B-tree when each value occurs more than a few hundred times in the table (but see the note below following the description of bitmap index entries).

 

5 Comments »

  1. Hello Jonathan,

    You wrote: “In the bitmap index, every entry consists of a set of flags, a lock byte, and (in this case) four columns of data. The four columns are in fact the indexed value, a pair of rowids and a stream of bits. The pair of rowids identifies a contiguous section of the table, and the stream of bits is encoded to tell us which rows in that range of rowids hold that value.”
    especially “The pair of rowids identifies a contiguous section of the table”

    So, if we access table through B-tree index and bitmap index using indexed_field='{CONSTANT}’, does Oracle has a chance to make less visits of table block from bitmap index entry, which contains “rowids identifies a contiguous section of the table”, then from B-tree index?
    Assume all other is the same: we use the same column to index, and so on. (just thought experiment)

    Comment by Yuri — April 23, 2015 @ 10:28 am GMT Apr 23,2015 | Reply

    • Yuri,

      I think the answer to your question is no.

      Each entry in a (non-unique) B-tree index is made of the pair (key, rowid), which means that the index entries are sorted in rowid order, which means rows with the same key in the same block are visited one after the other, in just the same way they would be with a bitmap index.

      Comment by Jonathan Lewis — April 25, 2015 @ 11:16 am GMT Apr 25,2015 | Reply

  2. Hello Jonathan,

    Some times ago I’ve done several Internet search to find another “golden oldie” of yours, when you are explaining the expected number of buffere gets for some typical sql statement s(Index equality search,index range, full scan etc) , but I could not find it anywhere. Maybe it’s even on your blog, but I have probably searched with wrong keywords. Can you help me?

    Thanks

    Comment by jastreb70 — April 24, 2015 @ 10:45 am GMT Apr 24,2015 | Reply

    • I may have written something about that in Practical Oracle 8i; and I’ve also talked about it in various presentations I’ve given; it’s also something I may have written about on comp.databases.oracle.server when the Usenet newsgroups existed; but I don’t think I ever published an online document about it.

      Comment by Jonathan Lewis — April 25, 2015 @ 11:27 am GMT Apr 25,2015 | Reply

  3. Such modesty :P

    Comment by Daniel Stolf — April 24, 2015 @ 3:13 pm GMT Apr 24,2015 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.