Oracle Scratchpad

June 16, 2014

Bitmap Nulls

Filed under: bitmaps,Indexing,Infrastructure,NULL,Oracle — Jonathan Lewis @ 9:08 am BST Jun 16,2014

It’s fairly well known that in Oracle B-tree indexes on heap tables don’t hold entries where all the indexed columns are all null, but that bitmap indexes will hold such entries and execution plans can for predicates like “column is null” can use bitmap indexes. Here’s a little test case to demonstrate the point (I ran this last on 12.1.0.1):


create table t1 (val number, n1 number, padding varchar2(100));

insert into t1
select
	decode(rownum,1,to_number(null),rownum), rownum, rpad('x',100)
from
	all_objects
where
	rownum <= 1000
;

insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;

commit;

create bitmap index t1_b1 on t1(val);

execute dbms_stats.gather_table_stats(user,'t1');

set autotrace traceonly explain

select * from t1 where val is null;

set autotrace off

The execution plan for this query, in the system I happened to be using, looked like this:


Execution Plan
----------------------------------------------------------
Plan hash value: 1201576309

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |    32 |  3488 |     8   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    32 |  3488 |     8   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS       |       |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE        | T1_B1 |       |       |            |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("VAL" IS NULL)

Note that the predicate section shows us that we used the “column is null” predicate as an access predicate into the index.

Of course, this is a silly little example – I’ve only published it to make a point and to act as a reference case if you ever need to support a claim. Normally we don’t expect a single bitmap index to be a useful way to access a table, we tend to use combinations of bitmap indexes to combine a number of predicates so that we can minimise the trips to a table as efficiently as possible. (And we certainly DON’T create a bitmap index on an OLTP system because it lets us access NULLs by index — OLTP and bitmaps don’t get on well together.)

If you do a symbolic block dump, by the way, you’ll find that the NULL is represented by the special “length byte” of 0xFF with no following data.

10 Comments »

  1. Hi Jonathan,

    While trying Jonathan Lewis test case on Bitmap Nulls on my 10g version (10.2.0.1),
    the plan i have got is:

    ----------------------------------------------------------
    Plan hash value: 132275637
    
    ----------------------------------------------------------------------------------------
    | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |         |     1 |   108 |     1   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID | T1      |     1 |   108 |     1   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |            |          |
    |*  3 |    BITMAP INDEX SINGLE VALUE | T1_BMI1 |       |       |            |          |
    ----------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("N1" IS NULL)
    

    Here table access is TABLE ACCESS BY INDEX ROWID instead of TABLE ACCESS BY INDEX ROWID
    BATCHED, which might be an improvement in 12c for table access, but surprising thing for
    me is the Rows value as 1 instead of 16 (as i have ommited your last insert). And this
    plan was for the first run of that script. Below are the actual values in table for nulls

    JAGDEEP@css>select * from t1 where n1 is null;
    
            N1         N2 V1
    ---------- ---------- -
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
    
            N1         N2 V1
    ---------- ---------- -
                        1 *
                        1 *
                        1 *
                        1 *
                        1 *
    

    Then i thought of checking the plan with dbms_xplan.display_cursor() appended at the end of
    script and ran it. And below are the plans from the same

    ----------------------------------------------------------
    Plan hash value: 132275637
    
    ----------------------------------------------------------------------------------------
    | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |         |    16 |  1728 |     5   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID | T1      |    16 |  1728 |     5   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |            |          |
    |*  3 |    BITMAP INDEX SINGLE VALUE | T1_BMI1 |       |       |            |          |
    ----------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("N1" IS NULL)
    
    PLAN_TABLE_OUTPUT
    -----------------------------------------------------------------------------------------
    SQL_ID  dhch6wbtr2a0r, child number 0
    -------------------------------------
    select * from t1 where n1 is null
    
    Plan hash value: 132275637
    
    ----------------------------------------------------------------------------------------
    | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |         |       |       |     5 (100)|          |
    |   1 |  TABLE ACCESS BY INDEX ROWID | T1      |    16 |  1728 |     5   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |            |          |
    |*  3 |    BITMAP INDEX SINGLE VALUE | T1_BMI1 |       |       |            |          |
    ----------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("N1" IS NULL)
    

    Here Oracle has got the Rows value 16 as right and it has got for the both the execution
    (autotrace and dbms_xplan.display_cursor(). And for surity that i exited/relogin to get
    a different session and got the below plans

    ----------------------------------------------------------
    Plan hash value: 132275637
    
    ----------------------------------------------------------------------------------------
    | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |         |    25 |  2700 |     7   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID | T1      |    25 |  2700 |     7   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |            |          |
    |*  3 |    BITMAP INDEX SINGLE VALUE | T1_BMI1 |       |       |            |          |
    ----------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("N1" IS NULL)
    
    
    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------------------
    SQL_ID  dhch6wbtr2a0r, child number 0
    -------------------------------------
    select * from t1 where n1 is null
    
    Plan hash value: 132275637
    
    ----------------------------------------------------------------------------------------
    | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |         |       |       |     7 (100)|          |
    |   1 |  TABLE ACCESS BY INDEX ROWID | T1      |    25 |  2700 |     7   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |            |          |
    |*  3 |    BITMAP INDEX SINGLE VALUE | T1_BMI1 |       |       |            |          |
    ----------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("N1" IS NULL)
    

    And this time again Oracle has Rows value wrong. Can you please throw some light on how
    it has got it right second time within the same session and while re-login it got them
    wrong again

    Regards,

    Comment by jagdeepsangwan — June 17, 2014 @ 7:06 am BST Jun 17,2014 | Reply

    • Jagdeepsangwan,

      10.2.0.1 was notorious for having lots of little bugs – so that doesn’t help. You’ve also not done a cut-n-paste on my code, so there’s always the possibility of a copying error – for example, perhaps your call to dbms_stats() collected stats on the wrong table. And we don’t know what your call to dbms_xplan() looked like so there may be a clue there.

      Your results MIGHT be consistent with failing to gather stats on the object and Oracle using dynamic sampling, although I would have expected autotrace and display_cursor() to show a note reporting this fact – but maybe that didn’t happen in 10.2.0.1

      Comment by Jonathan Lewis — June 17, 2014 @ 8:02 am BST Jun 17,2014 | Reply

  2. Hi Jonathan,

    Here below the script i created from yours one

    drop table t1 purge;
    create table t1(
    n1	number
    , n2	number
    , v1	varchar2(100)
    )
    ;
    
    insert into t1
    select
    	decode(rownum
    		, 1
    		, to_number(null)
    		, rownum
    	)
    	, rownum
    	, rpad('*', 100)
    from
    	all_objects
    where
    	rownum <= 1e4
    ;
    
    insert into t1
    select * from t1;
    
    insert into t1
    select * from t1;
    
    insert into t1
    select * from t1;
    
    insert into t1
    select * from t1;
    
    commit;
    
    create bitmap index t1_bmi1 on t1(n1);
    
    begin
    	dbms_stats.gather_table_stats(
    		user
    		, 't1'
    		, method_opt => 'for all columns size 1'
    	);
    end;
    /
    
    set autotrace traceonly explain
    
    select * from t1 where n1 is null;
    
    set autotrace off
    
    select * from t1 where n1 is null;
    
    select * from table(dbms_xplan.display_cursor());
    

    Regards,

    Comment by jagdeepsangwan — June 17, 2014 @ 8:10 am BST Jun 17,2014 | Reply

    • Your script generates 160,000 rows with 10,000 different values.
      Your stats collection (since it’s 10g) has sampled the data – you got a lucky sample on one of the three runs of your script.
      You should have checked the stats on the table after seeing a result you weren’t expecting.

      I was using 12c (so using approximate NDV on 100% of the data), with only 1,000 values (which means perfect stats every time).

      Comment by Jonathan Lewis — June 17, 2014 @ 8:24 am BST Jun 17,2014 | Reply

  3. Hi Jonathan,

    I copy-n-paste your script also, and just to mention it is missing a semi colon after first insert-select. And the plan is as following:

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3605824436
    
    --------------------------------------------------------------------------------------
    | Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |       |    23 |  2484 |     6   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID | T1    |    23 |  2484 |     6   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS|       |       |       |            |          |
    |*  3 |    BITMAP INDEX SINGLE VALUE | T1_B1 |       |       |            |          |
    --------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("VAL" IS NULL)
    

    Regards,

    Comment by jagdeepsangwan — June 17, 2014 @ 8:29 am BST Jun 17,2014 | Reply

    • Thank you, now corrected.

      And now you can look at the statistics on the table to see if they were computed or sampled, and check the stored value for num_nulls for the column and num_rows for the table.

      Comment by Jonathan Lewis — June 17, 2014 @ 8:35 am BST Jun 17,2014 | Reply

      • I think the stats were sampled as the following are the values of num_nulls and num_rows are as following:

        JAGDEEP@css>select num_rows from user_tables where table_name = 'T1';
        
          NUM_ROWS
        ----------
             32115
        
        JAGDEEP@css>select column_name,
          2  num_nulls from user_tab_columns where table_name = 'T1';
        
        COLUMN_NAME                     NUM_NULLS
        ------------------------------ ----------
        VAL                                    23
        N1                                      0
        PADDING                                 0
        

        Regards,

        Comment by jagdeepsangwan — June 17, 2014 @ 9:03 am BST Jun 17,2014 | Reply

        • After i changed the gathering stats proc call and added estimate_percent => 100 it is giving the correct Rows and feedback and num_nulls and num_rows are also correct.
          Regards,

          Comment by jagdeepsangwan — June 17, 2014 @ 9:12 am BST Jun 17,2014

        • You can also check column sample_size from user_tables and user_tab_columns which, in principle, may have different sample sizes for different columns – particularly when there are histograms being gathered. I’m guessing that you’ve probably got sample sizes of around 5,500

          Comment by Jonathan Lewis — June 17, 2014 @ 3:35 pm BST Jun 17,2014

  4. And i have taken a note of your suggestion to check the stats whenever i see something unexpected

    thanks for that

    Regards,

    Comment by jagdeepsangwan — June 17, 2014 @ 8:34 am BST Jun 17,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,909 other followers