Here’s another of my “draft” notes that needs some exapansion and, most importantly, proof.
I have a fact table with a “status id” column that shows a massive skew. Unfortunately I have a dimension table that holds the “status code”, so (in theory, at least) I have to do a join from the statuses table to the fact table to find rows of a given status. Unfortunately the join hides the skew:
select f.* from facts f, statuses s where s.code = 'C' and f.status_id = s.status_id ;
The optimizer knows that the status_id column on the facts table has a highly skewed distribution and will create a histogram on it, but it can’t know which status code corresponds to which status_id, so the histogram doesn’t help in calculating the join cardinality.
Will a bitmap join index help ? Answer – NO.
It creates a virtual column on the facts table, but you can’t collect stats on the virtual column. The optimizer won’t use the stats even if you create them, all it will use is the number of distinct keys in the index.
For the optimizer to collect the stats the Oracle developers would have to come up with a new way to collect column stats by reading an index on that column – but it would be useful.
Possible workaround – don’t do this join, rewrite the query with an IN subquery with a /*+ precompute_subquery */ hint to generate an IN-list at optimisation time.