Oracle Scratchpad

January 19, 2012

Quiz Night

Filed under: Indexing,Infrastructure,Oracle — Jonathan Lewis @ 8:51 am BST Jan 19,2012

In my previous post, I made the comment:

In general, if you have a three-column index that starts with the same columns in the same order as the two-column index then the three-column index will be bigger and have a higher clustering_factor.

So what scenarios can you come up with that fall outside the general case ?
Alternatively, what argument could you put forward that justifies the general claim ?

I’ll try to respond to comments on this post a little more quickly than the last one, but I still have quite a lot of other comments to catch up on.

23 Comments »

  1. We can get higher CF if will use large value for pctfree for two-columns index or create it within tablespace with smaller block size.

    Comment by Valentin Nikotin — January 19, 2012 @ 9:06 am BST Jan 19,2012 | Reply

    • i.e. index will be bigger, CF does’t depend on index storing.

      Comment by Valentin Nikotin — January 19, 2012 @ 9:30 am BST Jan 19,2012 | Reply

    • Also three-column index may be rebuilt recently.

      Comment by Valentin Nikotin — January 19, 2012 @ 9:34 am BST Jan 19,2012 | Reply

    • An example how to get smaller values for CF with 3-col index:
      
      Connected to:
      Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
      With the Partitioning, OLAP, Data Mining and Real Application Testing options
      
      SQL> create table t1 (a int, b int, c int, pad varchar2(3000) default lpad('x',3000,'x'));
      
      Table created.
      
      SQL> insert into t1(a, b, c) values(2, 2, 2);
      
      1 row created.
      
      SQL> insert into t1(a, b, c) values(null, null, null);
      
      1 row created.
      
      SQL> insert into t1(a, b, c) values(1, 1, 0);
      
      1 row created.
      
      SQL> insert into t1(a, b, c) values(2, 2, 1);
      
      1 row created.
      
      SQL> commit;
      
      Commit complete.
      
      SQL> create index i1 on t1 (a, b, c);
      
      Index created.
      
      SQL> create index i2 on t1 (a, b);
      
      Index created.
      
      SQL> select i.index_name,
        2         i.clustering_factor,
        3         i.leaf_blocks
        4  from user_indexes i where i.index_name in ('I1','I2');
      
      INDEX_NAME                     CLUSTERING_FACTOR LEAF_BLOCKS
      ------------------------------ ----------------- -----------
      I1                                             2           1
      I2                                             3           1

      Comment by Valentin Nikotin — January 19, 2012 @ 12:41 pm BST Jan 19,2012 | Reply

      • Valentin,

        Another good point – it is possible to find edge cases (particularly with a small number of index entries) where the order of data arrival can have a surprising impact. In your case the effect is also largely dependent on the great length of the table rows.

        Comment by Jonathan Lewis — January 26, 2012 @ 10:32 pm BST Jan 26,2012 | Reply

  2. Hi Jonathan,
    if you use index key compression on the three-column index, it can be smaller and the clustering_factor can vary.

    Regards
    Stefan

    Comment by Stefan — January 19, 2012 @ 11:18 am BST Jan 19,2012 | Reply

    • Clustering_factor does not depend on the way index is stored but only on indexed columns.

      Comment by Valentin Nikotin — January 19, 2012 @ 12:58 pm BST Jan 19,2012 | Reply

      • Hello Valentin,
        sorry i misinterpret it. You are right, that the clustering_factor depends on how the data of the table is sorted in relation to the index and this does not change in case of index key compression.

        Regards
        Stefan

        Comment by Stefan — January 19, 2012 @ 1:34 pm BST Jan 19,2012 | Reply

    • Stefan,
      As Valentin has pointed out, compression won’t affect the clustering_factor, but it will affect the size (number of leaf blocks) in the index – and that does have some impact on the cost of using the index; so if you have different compression on the indexes you can get counter-intuitive results.

      Comment by Jonathan Lewis — January 26, 2012 @ 10:34 pm BST Jan 26,2012 | Reply

  3. Well, if you have a lot of ties for the second column and the third column in context of the key path is well correlated with the physical order of rows in blocks, then I suppose you could get a better cluster factor. (I haven’t tested this – yet.) If Oracle is already ordering by file, block, and row within ties across possibly many leaf blocks spanning the tie, then this wouldn’t help [and points to a possible enhancement if they aren't already doing it.] If it does help, then possibly the reduction in rowid reference key size for partially repeated values could, I suppose, win a trade-off and the extra bit of key length required for the third column. I guess I know what I’ll be doing the next little while…

    I suppose too that the third column’s key could consistently fit in the leaf block space that won’t fit another full row, then the index would be the same size, not bigger.

    I’m not sure why you claim the 3rd key makes the cluster factor worse. Even if Oracle already sorts ties within the second key, the third key could represent an equally good cluster factor, right? Is this just a quibble on the <= versus < boundary? Probably.

    This test will delay the start of my reading your new book, which has arrived.

    Comment by Mark Farnham — January 19, 2012 @ 2:37 pm BST Jan 19,2012 | Reply

  4. If we have index on column A then within the group defined by any fixed value a1 all index entries will be ordered by rowid, thus when we are going through the leaf rows we get minimal possible number of changes of relevant table blocks.

    If we add additional column B into the index, it may change the order of rowid when we are going through the leaf rows, thus the number of changes of relevant table blocks may also change (increase from the minimum).

    So far so good, but when we jump to the next group defined by a2 (…<a1<a2<…), we can get one additional change of table block for 1-col index as compared with 2-col index in the following situation:
    the table has rows with A = a1 and A = a2 in the same block (B1) and also rows with A = a2 in the previous (order by rowid) block (B2),
    but column B might help to order the index entries within group defined by a2 so that they will go first from the block B1 and then from the block B2 for A = a2.
    Thus we will get one less change for the 2-col index as compared with 1-col. Same is for 2- and 3- columns indexes.
    The simplest situation (for 2-col and 3-col indexes) is presented in the previous comment.

    Comment by Valentin Nikotin — January 19, 2012 @ 6:16 pm BST Jan 19,2012 | Reply

    • Valentin,
      I think your first couple of statements summarise the informal argument about why we generally assume a two-column index will have a lower clustering_factor than a three-column index – the inclusion of the rowid in the index entry is very important, so if we have six rows with the same (two-column) key, three rows in each of two blocks (call them B1, B2) the block component of the index entries would be for (B1, B1, B1, B2, B2, B2). Adding an extra column to the index could then very easily change the order of visiting the rows to (B1, B2, B1, B2, B1, B2).

      On the other hand, as you point out, there are patterns of data where an odd synchronsiation of values and block locations could contradict the general intuition.

      Comment by Jonathan Lewis — January 26, 2012 @ 11:10 pm BST Jan 26,2012 | Reply

  5. Jonathan,

    for a bitmap index the number of columns should have no impact on the clustering factor – as it is equal to num_rows. In a b*tree index the clustering factor should remain unchanged if the third column is a constant (or null). Maybe the value of these indexes would be limited…

    Comment by Martin Preiss — January 19, 2012 @ 6:33 pm BST Jan 19,2012 | Reply

    • for the sake of completeness: the CF would be the same, but the three-column indexes would be bigger in both cases

      Comment by Martin Preiss — January 19, 2012 @ 6:36 pm BST Jan 19,2012 | Reply

    • Martin,
      I wasn’t thinking about bitmap indexes when I posed the question, but my first thought is that if (‘x’,’y’) is a key value in the two-column index then it may have one index entry; if you add a third column you may have N index entries like (‘x’,’y’,’a’), (‘x’,’y’,’b’) – with a corresponding increase in the clustering_factor. This won’t change the cost of using the index, of course, since the clustering_factor of a bitmap index isn’t used in the cost.

      Comment by Jonathan Lewis — January 26, 2012 @ 10:40 pm BST Jan 26,2012 | Reply

  6. Following stronger statement is true. For many “not very unique” indexes we can create a function (falsely deterministic) such that if you create the FBI with the columns from the original index and this function will be after them, the clustering factor of the resulting index will be smaller than the original.
    See simple example below.

    create or replace procedure decrease_cf
    (
      i_table_owner varchar2,
      i_table_name varchar2,
      i_index_owner varchar2,
      i_index_name varchar2
    ) authid current_user is
      l_pk_cols varchar2(32767);
      l_pk_cols_arg varchar2(32767);
      l_pk_cols_pred varchar2(32767);
      l_cols varchar2(32767);
      l_cols_pred varchar2(32767);
      l_tbl_name_cf varchar2(30) := substr(i_table_name, 1, 23)||'_4INDCF';
      l_ind_bno varchar2(30) := substr(l_tbl_name_cf,1,26)||'_BNO';
      l_fnc_name_cf varchar2(30) := 'FNC_'||substr(i_table_name, 1, 19)||'_4INDCF';
      l_ind_name_cf varchar2(30) := substr(i_index_name, 1, 27)||'_CF';
      l_cnt number;
      l_stmt_ctas varchar2(32767) := q'{
        create table "$tbl_name_cf$" as
        with T as
        (
          select dbms_rowid.rowid_block_number(rowid) bno,
                 dense_rank()over(order by $col_list$) rn,
                 min(dbms_rowid.rowid_block_number(rowid))over(partition by $col_list$) minbno,
                 max(dbms_rowid.rowid_block_number(rowid))over(partition by $col_list$) maxbno,
                 $col_list$
          from "$tbl_owner$"."$tbl_name$"
          group by dbms_rowid.rowid_block_number(rowid), $col_list$
        )
        select $col_list$, bno
        from t t1
        where minbno <> bno
          and exists (select 1 from t t2 where t2.maxbno = t1.bno and t2.rn = t1.rn - 1)
      }';
      
      l_stmt_fnc varchar2(32767) := q'{
        create or replace function "$fnc_name_cf$" ($pk_cols_arg$) return number deterministic is
          l_res number;
        begin
          select 1 - count(*)
          into l_res
          from "$tbl_owner$"."$tbl_name$" t
             , "$tbl_name_cf$" tc
          where tc.bno = dbms_rowid.rowid_block_number(t.rowid)
                $pk_cols_pred$      
                $cols_pred$;      
          return l_res;
        end;
      }';
      
    begin
      select rtrim(listagg('"'||cc.column_name||'",'
                          )within group(order by cc.position), ','),
             rtrim(listagg('"'||cc.column_name||'" '||
                           '"'||c.owner||'".'||
                           '"'||c.table_name||'".'||
                           '"'||cc.column_name||'"%type,'
                          )within group(order by cc.position), ','),
             listagg('and t."'||cc.column_name||'" = "$fnc_name_cf$"."'||cc.column_name||'" '
                    )within group(order by cc.position)
      into l_pk_cols, l_pk_cols_arg, l_pk_cols_pred
      from all_constraints c
         , all_cons_columns cc
      where c.owner = cc.owner
        and c.table_name = cc.table_name
        and c.constraint_type = 'P'
        and c.owner = i_table_owner
        and c.table_name = i_table_name;
      if l_pk_cols is null then
        dbms_output.put_line('table hasn''t pk');
        return;
      end if;
      --dbms_output.put_line('l_pk_cols = '||l_pk_cols);
      --dbms_output.put_line('l_pk_cols_arg = '||l_pk_cols_arg);
      --dbms_output.put_line('l_pk_cols_pred = '||l_pk_cols_pred);
      
      select rtrim(listagg('"'||ic.column_name||'",')within group(order by ic.column_position), ','),
             listagg('and t."'||ic.column_name||'" = tc."'||ic.column_name||'" '
                    )within group(order by ic.column_position)
      into l_cols, l_cols_pred
      from all_ind_columns ic
      where ic.index_owner = i_index_owner
        and ic.index_name = i_index_name
        and ic.table_owner = i_table_owner
        and ic.table_name = i_table_name;
      if l_cols is null then
        dbms_output.put_line('index not found');
        return;
      end if;
      --dbms_output.put_line('l_cols = '||l_cols);
      --dbms_output.put_line('l_cols_pred = '||l_cols_pred);
      
      for rec in (select 1 from user_indexes where index_name = l_ind_name_cf) loop
        execute immediate 'drop index "'||l_ind_name_cf||'"';
        dbms_output.put_line('index '||l_ind_name_cf ||' is droped');
      end loop;
      for rec in (select 1 from user_procedures where procedure_name = l_fnc_name_cf) loop
        execute immediate 'drop function  "'||l_fnc_name_cf||'"';
        dbms_output.put_line('index '||l_fnc_name_cf ||' is droped');
      end loop;
      for rec in (select 1 from user_tables where table_name = l_tbl_name_cf) loop
        execute immediate 'drop table "'||l_tbl_name_cf||'" purge';
        dbms_output.put_line('table '||l_tbl_name_cf ||' is droped');
      end loop;
      
      l_stmt_ctas := replace (l_stmt_ctas, '$tbl_name_cf$', l_tbl_name_cf);
      l_stmt_ctas := replace (l_stmt_ctas, '$tbl_owner$', i_table_owner);
      l_stmt_ctas := replace (l_stmt_ctas, '$tbl_name$', i_table_name);
      l_stmt_ctas := replace (l_stmt_ctas, '$col_list$', l_cols);  
      --dbms_output.put_line(l_stmt_ctas);
      execute immediate l_stmt_ctas;  
      execute immediate 'create index "'||l_ind_bno||'" on "'||l_tbl_name_cf||'"(bno)';
      
      execute immediate 'select count(*) from "'||l_tbl_name_cf||'"' into l_cnt;
      -- dbms_output.put_line('table '||l_tbl_name_cf ||'is created with '||l_cnt||' rows');
      if l_cnt = 0 then
        dbms_output.put_line('index can''t be improved');
        return;
      end if;
      
      l_stmt_fnc := replace (l_stmt_fnc, '$pk_cols_arg$', l_pk_cols_arg);
      l_stmt_fnc := replace (l_stmt_fnc, '$tbl_owner$', i_table_owner);
      l_stmt_fnc := replace (l_stmt_fnc, '$tbl_name$', i_table_name);
      l_stmt_fnc := replace (l_stmt_fnc, '$tbl_name_cf$', l_tbl_name_cf);
      l_stmt_fnc := replace (l_stmt_fnc, '$pk_cols_pred$', l_pk_cols_pred);
      l_stmt_fnc := replace (l_stmt_fnc, '$cols_pred$', l_cols_pred);
      l_stmt_fnc := replace (l_stmt_fnc, '$fnc_name_cf$', l_fnc_name_cf);
      --dbms_output.put_line(l_stmt_fnc);
      execute immediate l_stmt_fnc;
      
      execute immediate 'create index "'||l_ind_name_cf||'" on "'
                        ||i_table_owner||'"."'||i_table_name||'" ('
                        ||l_cols||', "'||l_fnc_name_cf||'"('||l_pk_cols||'))';         
         
      dbms_stats.gather_table_stats('"'||i_table_owner||'"', '"'||i_table_name||'"', 
                                    estimate_percent => 100, cascade => true);
     
      for rec in (select i.owner, i.index_name, i.clustering_factor 
                  from all_indexes i 
                  where i.index_name in (l_ind_name_cf, i_index_name)) loop
        dbms_output.put_line(rpad(rec.owner||'.'||rec.index_name, 62, ' ')||rec.clustering_factor);
      end loop;
    
    end;
    /
    SQL> create table t_test_cf(id primary key, a) as
      2  select level, mod(dbms_random.random, 1000)
      3  from dual connect by level <= 1e4;
    
    Table created.
    
    SQL> create index i_test_cf on t_test_cf(a);
    
    Index created.
    
    SQL> exec decrease_cf(user, 'T_TEST_CF', user, 'I_TEST_CF')
    TEST_USER.I_TEST_CF                                           8716
    TEST_USER.I_TEST_CF_CF                                        8357
    
    PL/SQL procedure successfully completed.

    Comment by Valentin Nikotin — January 19, 2012 @ 11:06 pm BST Jan 19,2012 | Reply

    • –°orrect typos and slightly improve the function

      create or replace procedure decrease_cf
      (
        i_table_owner varchar2,
        i_table_name varchar2,
        i_index_owner varchar2,
        i_index_name varchar2
      ) authid current_user is
        l_pk_cols varchar2(32767);
        l_pk_cols_arg varchar2(32767);
        l_pk_cols_pred varchar2(32767);
        l_cols varchar2(32767);
        l_cols_pred varchar2(32767);
        l_tbl_name_cf varchar2(30) := substr(i_index_name, 1, 23)||'_4INDCF';
        l_ind_bno varchar2(30) := substr(l_tbl_name_cf,1,26)||'_BNO';
        l_fnc_name_cf varchar2(30) := 'FNC_'||substr(i_index_name, 1, 19)||'_4INDCF';
        l_ind_name_cf varchar2(30) := substr(i_index_name, 1, 27)||'_CF';
        l_stmt_ind_cf varchar2(32767);
        l_cnt number;
        l_stmt_ctas varchar2(32767) := q'{
          create table "$tbl_name_cf$" as
          with T as
          (
            select dbms_rowid.rowid_block_number(rowid) bno,
                   dense_rank()over(order by $col_list$) rn,
                   min(dbms_rowid.rowid_block_number(rowid))over(partition by $col_list$) minbno,
                   dense_rank()over(partition by $col_list$ order by dbms_rowid.rowid_block_number(rowid) desc) rn2,
                   $col_list$
            from "$tbl_owner$"."$tbl_name$"
            group by dbms_rowid.rowid_block_number(rowid), $col_list$
          )
          , T1 as
          (
            select rn, max(decode(rn2, 2, bno)) maxbno2, max(decode(rn2, 1, bno)) maxbno1
            from T
            group by rn
          )
          , REQ(rn, bno, maxbno) as
          (
            select rn, null bno, maxbno1 maxbno
            from T1
            where rn = 1
            union all
            select T1.rn, T.bno, decode(T.bno, T1.maxbno1, T1.maxbno2, T1.maxbno1) maxbno
            from REQ r
            join T1 on T1.rn = r.rn + 1
            left join T on T.rn = T1.rn and T.bno = r.maxbno and T.bno <> T.minbno          
          )
          select $col_list$, bno
          from T
          where exists (select 1 from REQ where REQ.rn = T.rn and REQ.bno = T.bno)
        }';
        
        l_stmt_fnc varchar2(32767) := q'{
          create or replace function "$fnc_name_cf$" ($pk_cols_arg$) return number deterministic is
            l_res number;
          begin
            select 1 - count(*)
            into l_res
            from "$tbl_owner$"."$tbl_name$" t
               , "$tbl_name_cf$" tc
            where tc.bno = dbms_rowid.rowid_block_number(t.rowid)
                  $pk_cols_pred$      
                  $cols_pred$;      
            return l_res;
          end;
        }';
        
      begin
        select rtrim(listagg('"'||cc.column_name||'",'
                            )within group(order by cc.position), ','),
               rtrim(listagg('"'||cc.column_name||'" '||
                             '"'||c.owner||'".'||
                             '"'||c.table_name||'".'||
                             '"'||cc.column_name||'"%type,'
                            )within group(order by cc.position), ','),
               listagg('and t."'||cc.column_name||'" = "$fnc_name_cf$"."'||cc.column_name||'" '
                      )within group(order by cc.position)
        into l_pk_cols, l_pk_cols_arg, l_pk_cols_pred
        from all_constraints c
           , all_cons_columns cc
        where c.owner = cc.owner
          and c.table_name = cc.table_name
          and c.constraint_type = 'P'
          and c.owner = i_table_owner
          and c.table_name = i_table_name;
        if l_pk_cols is null then
          dbms_output.put_line('table hasn''t pk');
          return;
        end if;
        --dbms_output.put_line('l_pk_cols = '||l_pk_cols);
        --dbms_output.put_line('l_pk_cols_arg = '||l_pk_cols_arg);
        --dbms_output.put_line('l_pk_cols_pred = '||l_pk_cols_pred);
        
        select rtrim(listagg('"'||ic.column_name||'",')within group(order by ic.column_position), ','),
               listagg('and t."'||ic.column_name||'" = tc."'||ic.column_name||'" '
                      )within group(order by ic.column_position)
        into l_cols, l_cols_pred
        from all_ind_columns ic
        where ic.index_owner = i_index_owner
          and ic.index_name = i_index_name
          and ic.table_owner = i_table_owner
          and ic.table_name = i_table_name;
        if l_cols is null then
          dbms_output.put_line('index not found');
          return;
        end if;
        --dbms_output.put_line('l_cols = '||l_cols);
        --dbms_output.put_line('l_cols_pred = '||l_cols_pred);
        
        for rec in (select 1 from user_indexes where index_name = l_ind_name_cf) loop
          execute immediate 'drop index "'||l_ind_name_cf||'"';
          dbms_output.put_line('index '||l_ind_name_cf ||' is droped');
        end loop;
        for rec in (select 1 from user_procedures where procedure_name = l_fnc_name_cf) loop
          execute immediate 'drop function  "'||l_fnc_name_cf||'"';
          dbms_output.put_line('index '||l_fnc_name_cf ||' is droped');
        end loop;
        for rec in (select 1 from user_tables where table_name = l_tbl_name_cf) loop
          execute immediate 'drop table "'||l_tbl_name_cf||'" purge';
          dbms_output.put_line('table '||l_tbl_name_cf ||' is droped');
        end loop;
        
        l_stmt_ctas := replace (l_stmt_ctas, '$tbl_name_cf$', l_tbl_name_cf);
        l_stmt_ctas := replace (l_stmt_ctas, '$tbl_owner$', i_table_owner);
        l_stmt_ctas := replace (l_stmt_ctas, '$tbl_name$', i_table_name);
        l_stmt_ctas := replace (l_stmt_ctas, '$col_list$', l_cols);  
        --dbms_output.put_line(substr(l_stmt_ctas, 1, 1000));
        --dbms_output.put_line(substr(l_stmt_ctas, 1001));
        --return;
        execute immediate l_stmt_ctas;  
        execute immediate 'create index "'||l_ind_bno||'" on "'||l_tbl_name_cf||'"(bno)';
        
        execute immediate 'select count(*) from "'||l_tbl_name_cf||'"' into l_cnt;
        dbms_output.put_line('table '||l_tbl_name_cf ||' is created with '||l_cnt||' rows');
        if l_cnt = 0 then
          dbms_output.put_line('index can''t be improved');
          return;
        end if;
        
        l_stmt_fnc := replace (l_stmt_fnc, '$pk_cols_arg$', l_pk_cols_arg);
        l_stmt_fnc := replace (l_stmt_fnc, '$tbl_owner$', i_table_owner);
        l_stmt_fnc := replace (l_stmt_fnc, '$tbl_name$', i_table_name);
        l_stmt_fnc := replace (l_stmt_fnc, '$tbl_name_cf$', l_tbl_name_cf);
        l_stmt_fnc := replace (l_stmt_fnc, '$pk_cols_pred$', l_pk_cols_pred);
        l_stmt_fnc := replace (l_stmt_fnc, '$cols_pred$', l_cols_pred);
        l_stmt_fnc := replace (l_stmt_fnc, '$fnc_name_cf$', l_fnc_name_cf);
        --dbms_output.put_line(l_stmt_fnc);
        execute immediate l_stmt_fnc;
        
        l_stmt_ind_cf := 'create index "'||l_ind_name_cf||'" on "'
                          ||i_table_owner||'"."'||i_table_name||'" ('
                          ||l_cols||', "'||l_fnc_name_cf||'"('||l_pk_cols||'))';
        
        execute immediate l_stmt_ind_cf;
        
        dbms_output.put_line('index created:');
        dbms_output.put_line(l_stmt_ind_cf);
           
        dbms_stats.gather_table_stats('"'||i_table_owner||'"', '"'||i_table_name||'"', 
                                      estimate_percent => 100, cascade => true);
       
        for rec in (select i.owner, i.index_name, i.clustering_factor 
                    from all_indexes i 
                    where i.index_name in (l_ind_name_cf, i_index_name)) loop
          dbms_output.put_line(rpad(rec.owner||'.'||rec.index_name, 62, ' ')||rec.clustering_factor);
        end loop;
      
      end;
      /

      Comment by Valentin Nikotin — January 20, 2012 @ 6:54 am BST Jan 20,2012 | Reply

  7. For a bulk loaded table the CF for the 2-column table should be at least as good as the CF of the 3-column table.
    If the data is not bulk loaded and the third column increases for each insert (for example insertion date) I would expect a better CF for the 3-column table.

    Comment by Michael vR — January 20, 2012 @ 9:11 am BST Jan 20,2012 | Reply

  8. Guess it’s all a question of math. 2 columns index, combination of two values. 3 Columns index, combination of 3 values, taken by three. There goes your clustering factor…the last column increases the number of combinations possible. Of course compression will save something,but compression is used (correct me if i’m wrong) for the prefix, not the sufix, meaning:
    in a two column index, you can compress the first column
    in a three column index (depending on the density of the second column) you can compress the first two columns.

    Still, that will not be the deciding factor.
    Fact is: third column value has to be stored in the index. Clustering factor increases due to the number of combinations.
    Maybe i’m wrong, but that’s how i see it.
    Alex

    Comment by Alexandru Ersenie — January 24, 2012 @ 11:24 am BST Jan 24,2012 | Reply

  9. Jonathan,
    I’m working on a test case scenario (looks tricky!)
    I posted something recently

    http://srivenukadiyala.wordpress.com/2011/12/25/optimizer-might-ignore-a-more-suitable-superset-composite-index/

    regards
    srivenu

    Comment by srivenu kadiyala — January 25, 2012 @ 11:16 am BST Jan 25,2012 | Reply

    • Srivenu,

      Your blog highlights an interesting side-effect of the flaw in the optimizer’s model when it comes to using the clustering_factor. (It also fit quite nicely with a little posting I did recently about enhancing index statistics to include figures about every prefix combination.)

      Comment by Jonathan Lewis — January 26, 2012 @ 11:00 pm BST Jan 26,2012 | Reply

  10. I could think of a test case, where a 2 column global index on a partitioned table could have a higher clustering factor than a super set 3 column local index. (I dont have access to an oracle database right now to test it out).
    srivenu

    Comment by srivenu kadiyala — January 25, 2012 @ 12:59 pm BST Jan 25,2012 | Reply

    • Srivenu,
      The mixture of local and global indexes is certainly a case where the side effects on clustering_factor can be very counter-intuitive.

      On a completely different tack, creating a hash-partitioned index on a non-partitioned table could also result in less contention for popular blocks on insertion, which can avoid a bug that causes indexes to become much bigger than they need to. There’s lots of room for discusion when looking at partitioned indexes.

      Comment by Jonathan Lewis — January 26, 2012 @ 10:56 pm BST Jan 26,2012 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,985 other followers