Oracle Scratchpad

December 9, 2013

Bitmap join indexes

Filed under: Indexing,Oracle,Statistics — Jonathan Lewis @ 6:01 pm GMT Dec 9,2013

Here’s another of my “draft” notes that needs some expansion and, most importantly, proof.

I have a fact table with a status id column that shows a massive skew. But I also have a dimension table that holds the “status code” so (in theory, at least) I have to do a join from the statuses table to the facts table to find rows of a given status. Unfortunately the join hides the skew:

select  f.*
from    facts f, statuses s
where   s.code = 'C'
and     f.status_id = s.status_id

The optimizer knows that the status_id column on the facts table has a highly skewed distribution and will create a histogram on it, but it can’t know which status code corresponds to which status_id so the histogram doesn’t help in calculating the join cardinality.

Will a bitmap join index help ? Answer – NO.

A bitmap index will produce a virtual column on the facts table, but you can’t collect stats on the virtual column. In fact, the optimizer won’t use the stats even if you program them into existence with calls to dbms_stats.set_column_stats(); the only thing the optimizer will use is the number of distinct keys in the index.

For the optimizer to collect the stats the Oracle developers would have to come up with a new way to collect column stats by reading an index on that column – but it would be useful.

Possible workaround – don’t do this join, rewrite the query with an IN subquery with a /*+ precompute_subquery */ hint to make the optimizer generate an IN-list of literals at optimisation time.


  1. Is rewriting the code an option? If yes, you could select status_id for the given code in advance. Then include the selected value in the where-clause and avoid the join.

    Comment by Todor Botev — December 10, 2013 @ 6:12 pm GMT Dec 10,2013 | Reply

  2. I have seen this issue during my investigation about star transformation feature (thanks Jonathan about the nice articles, which helped me a lot !). I ended up with using “faked” statistics on the dimension attribute columns, which represents the fact table’s skew into the dimension column’s histogram. In this way, without touching the code (which is the case with pre-build BI tools), I was able to “pass” the right cardinality estimation to the Optimizer.

    Comment by Edward Hayrabedian — December 11, 2013 @ 2:02 pm GMT Dec 11,2013 | Reply

  3. […] that f(500) will be evaluated in the same way that we can trigger in-list calculation with the precompute_subquery […]

    Pingback by CBO catchup | Oracle Scratchpad — August 7, 2015 @ 1:10 pm GMT Aug 7,2015 | Reply

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at


Get every new post delivered to your Inbox.

Join 5,698 other followers