Oracle Scratchpad

May 18, 2018

Bitmap Join Indexes

Filed under: bitmaps,CBO,Execution plans,Indexing,Oracle,Statistics — Jonathan Lewis @ 2:29 pm BST May 18,2018

I’ve been prompted by a recent question on the ODC database forum to revisit a note I wrote nearly five years ago about bitmap join indexes and their failure to help with join cardinalities. At the time I made a couple of unsupported claims and suggestions without supplying any justification or proof. Today’s article finally fills that gap.

The problem is this – I have a column which exhibits an extreme skew in its data distribution, but it’s in a “fact” table where most columns are meaningless ids and I have to join to a dimension table on its primary key to translate an id into a name. While there is a histogram on the column in the fact table the information in the histogram ceases to help if I do the join to the dimension and query by name, and the presence of a bitmap join index doesn’t make any difference. Let’s see this in action – some of the code follows a different pattern and format from my usual style because I started by copying and editing the example supplied in the database forum:


rem
rem     Script:         bitmap_join_4.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem
rem     Last tested 
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem
rem     Notes:
rem     Bitmap join indexes generate virtual columns on the fact table
rem     but you can't get stats on those columns - which means if the
rem     data is skewed you can have a histogram on the raw column but
rem     you don't have a histogram on the bitmap virtual column.
rem

drop table t1;
drop table dim_table;

create table dim_table (type_code number, object_type varchar2(10));

insert into dim_table values (1,'TABLE');
insert into dim_table values (2,'INDEX');
insert into dim_table values (3,'VIEW');
insert into dim_table values (4,'SYNONYM');
insert into dim_table values (5,'OTHER');

alter table dim_table add constraint dim_table_pk primary key (type_code) using index;

exec dbms_stats.gather_table_stats(user,'dim_table',cascade=>true);

create table t1 
nologging
as 
select 
        object_id, object_name, 
        decode(object_type, 'TABLE',1,'INDEX',2,'VIEW',3,'SYNONYM',4,5) type_code 
from 
        all_objects
where
        rownum <= 50000 -- > comment to bypass wordpress format issue
;

insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;


create  bitmap index t1_b1 on t1(dt.object_type)
from    t1, dim_table dt
where   t1.type_code = dt.type_code
;

exec dbms_stats.gather_table_stats(null, 't1', cascade=>true, method_opt=>'for all columns size 254');


select
        dt.object_type, count(*)
from
        t1, dim_table  dt
where
        t1.type_code   = dt.type_code
group by
        dt.object_type
order by
        dt.object_type
;

I’ve started with a dimension table that lists 5 type codes and has a primary key on that type code; then I’ve used all_objects to generate a table of 400,000 rows using those type codes, and I’ve created a bitmap join index on the fact (t1) table based on the dimension (dim_table) table column. By choice the distribution of the five codes is massively skewed so after gathering stats (including histograms on all columns) for the table I’ve produced a simple aggregate report of the data showing how many rows there are of each type – by name. Here are the results – with the execution plan from 12.1.0.2 showing the benefit of the “group by placement” transformation:


OBJECT_TYP   COUNT(*)
---------- ----------
INDEX           12960
OTHER          150376
SYNONYM        177368
TABLE           12592
VIEW            46704

5 rows selected.

-------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost  |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |       |       |   735 |
|   1 |  SORT GROUP BY        |           |     5 |   125 |   735 |
|*  2 |   HASH JOIN           |           |     5 |   125 |   720 |
|   3 |    VIEW               | VW_GBF_7  |     5 |    80 |   717 |
|   4 |     HASH GROUP BY     |           |     5 |    15 |   717 |
|   5 |      TABLE ACCESS FULL| T1        |   400K|  1171K|   315 |
|   6 |    TABLE ACCESS FULL  | DIM_TABLE |     5 |    45 |     2 |
-------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ITEM_1"="DT"."TYPE_CODE")

Having established the basic result we can now examine some execution plans to see how well the optimizer is estimating cardinality for queries relating to that skewed distribution. I’m going to generate the execution plans for a simple select of all the rows of type ‘TABLE’ – first by code, then by name, showing the execution plan of each query:


explain plan for
select  t1.object_id
from
        t1
where
        t1.type_code = 1
;

select * from table(dbms_xplan.display(null,null,'outline'));


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 12592 |    98K|   281   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 12592 |    98K|   281   (8)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T1"."TYPE_CODE"=1)

Thanks to the histogram I generated on the type_code table the optimizer’s estimate of the number of rows is very accurate. So how well does the optimizer handle the join statistics:


prompt  =============
prompt  Unhinted join
prompt  =============

explain plan for
select  t1.object_id
from
        t1, dim_table  dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE'
;

select * from table(dbms_xplan.display(null,null,'outline'));

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 80000 |  1328K|   287  (10)| 00:00:01 |
|*  1 |  HASH JOIN         |           | 80000 |  1328K|   287  (10)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| DIM_TABLE |     1 |     9 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1        |   400K|  3125K|   277   (7)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."TYPE_CODE"="DT"."TYPE_CODE")
   2 - filter("DT"."OBJECT_TYPE"='TABLE')

Taking the default execution path the optimizer’s estimate of rows identified by type name is 80,000 – which is one fifth of the total number of rows. Oracle knows that the type_code is skewed in t1, but at compile time doesn’t have any idea which type_code corresponds to type ‘TABLE’, so it’s basically using the number of distinct values to dictate the estimate.

We could try hinting the query to make sure it uses the bitmap join index – just in case this somehow helps the optimizer (and we’ll see in a moment why we might have this hope, and why it is forlorn):


prompt  ===================
prompt  Hinted index access
prompt  ===================

explain plan for
select 
        /*+ index(t1 t1_b1) */
        t1.object_id
from
        t1, dim_table dt
where
        t1.type_code   = dt.type_code 
and     dt.object_type = 'TABLE'
;

select * from table(dbms_xplan.display(null,null,'outline'));

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       | 80000 |   625K|   687   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    | 80000 |   625K|   687   (1)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS       |       |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE        | T1_B1 |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."SYS_NC00004$"='TABLE')

The plan tells us that the optimizer now realises that it doesn’t need to reference the dimension table at all – all the information it needs is in the t1 table and its bitmap join index – but it still comes up with an estimate of 80,000 for the number of rows. The predicate section tells us what to do next – it identifies a system-generated column, which is the virtual column underlying the bitmap join index: let’s see what the stats on that column look like:


select
        column_name, histogram, num_buckets, num_distinct, num_nulls, sample_size
from
        user_tab_cols
where
        table_name = 'T1'
order by
        column_id
;


COLUMN_NAME          HISTOGRAM       NUM_BUCKETS NUM_DISTINCT  NUM_NULLS SAMPLE_SIZE
-------------------- --------------- ----------- ------------ ---------- -----------
OBJECT_ID            HYBRID                  254        50388          0        5559
OBJECT_NAME          HYBRID                  254        29224          0        5560
TYPE_CODE            FREQUENCY                 5            5          0      400000
SYS_NC00004$         NONE

4 rows selected.

There are no stats on the virtual column – and Oracle won’t try to collect any, and even if you write some in (using dbms_stats.set_column_stats) it won’t use them for the query. The optimizer seems to be coded to use the number of distinct keys from the index in this case.

 

Workaround

It’s very disappointing that there seems to be no official way to work around this problem – but Oracle has their own (undocumented) solution to the problem that comes into play with OLAP – the hint /*+ precompute_subquery() */. It’s possible to tell the optimizer to execute certain types of subquery as the first stage of optimising a query, then changing the query to take advantage of the resulting data:


explain plan for
select
        /*+
                qb_name(main)
                precompute_subquery(@subq)
        */
        t1.object_id
from
        t1
where
        t1.type_code in (
                select
                        /*+
                                qb_name(subq)
                        */
                        dt.type_code
                from    dim_table dt
                where   dt.object_type = 'TABLE'
        )
;

select * from table(dbms_xplan.display(null,null,'outline'));

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 12592 |    98K|   281   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 12592 |    98K|   281   (8)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T1"."TYPE_CODE"=1)

Oracle hasn’t optimized the query I wrote, instead it has executed the subquery, derived a (very short, in this case) list of values, then optimized and executed the query I first wrote using the constant(s) returned by the subquery. And you can’t see the original subquery in the execution plan. Of course, with the literal values in place, the cardinality estimate is now correct.

It’s such a pity that this hint is undocumented, and one that you shouldn’t use in production.

 

5 Comments »

  1. Hi Jonathan,

    Thanks for the blog and explaining it nicely with all the details.

    – Aswath

    Comment by Aswath Rao — May 19, 2018 @ 3:45 am BST May 19,2018 | Reply

  2. Hello Jonathan,

    Since a bitmap join index somehow resembles a table, maybe Oracle should allow for a histogram to be created on the bitmap join index itself,
    similar to the histogram on the table.
    Such a construct probably could have helped.

    Maybe a workable idea for a future version ?

    Thanks a lot about the info on the “secret hint” :)

    I know that many Oracle gurus strongly recommend not to use hints, but once they exist, I think that all of them should have been fully
    documented, at least for our better understanding of the internals.

    Best Regards,
    Iudith Mentzel

    Comment by Iudith Mentzel — May 20, 2018 @ 4:59 pm BST May 20,2018 | Reply

    • Iudith,

      I think the idea of histograms on indexes is nice – for all types of indexes, not just bitmap join indexes. The code to do the gather shouldn’t be too difficult – though having said that Oracle still hasn’t changed the simple b-tree index code to use the approximate_ndv method to get an accurate distinct_keys and leaf_blocks value. The difficulty, perhaps, is making sure that “index histograms” are used consistently by the optimizer. At present it does a reasonable job with column groups (and other virtual column mechanisms) which is why I think I’d prefer seeing a special “gather_index_stats()” procedure for bitmap join indexes that allowed the optimizer to populate and use the virtual column.

      (Actually I’d quite like to see a parameter to gather_index_stats() that allowed me to request that an index should have stats collected on all its leading subsets so that I didn’t have to create lots of virtual columns by hand.)

      Comment by Jonathan Lewis — May 24, 2018 @ 6:50 pm BST May 24,2018 | Reply

  3. […] several weeks ago Jonathan Lewis wrote an article about a join cardinality estimation problem and how bitmap join index (BJI) could have helped to […]

    Pingback by Join cardinality, filter predicates and bitmap join indexes | Chinar Aliyev`s blog — June 25, 2018 @ 12:25 pm BST Jun 25,2018 | Reply

  4. […] In my presentation to the UKOUG SIG yesterday “Struggling with Statistics – part 2” I described a problem that I wrote about a few months ago: when you join a fact table with a massively skewed distribution on one of the surrogate key columns to a dimension holding the unique list of keys and descriptions a query against a description “loses” the skew. Here’s an demo of the problem that’s a little simpler than the one in the previous article. […]

    Pingback by Hacking for Skew | Oracle Scratchpad — September 28, 2018 @ 1:24 pm BST Sep 28,2018 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: