Oracle Scratchpad

February 28, 2010

Index Efficiency 2

Filed under: Indexing,Troubleshooting — Jonathan Lewis @ 3:39 pm BST Feb 28,2010

Some time ago I wrote a note on index efficiency that referenced, in one of the comments, an article on my old website about how you could use existing statistics to estimate the “correct” size of an index.

A little while ago, while applying the concept on a client site and implementing a mechanism that would allow the DBAs to do the same check as regularly as needed, I discovered a couple of flaws in the original code – so I had to spend a little time correcting it,and I’ve finally published the latest version here.

The basic concept is unchanged:

The volume of user data stored in a B-tree index – the space needed for the basic column values – can be found from the column statistics in views user_tab_columns and user_tables (or their ‘dba_’ or ‘all_’ equivalents).

The number of entries in the index is given in the view user_indexes (or equivalent) – and for each index entry we need to allow for some well-known “overheads”:

  • Rowid – 6 bytes for a unique index, 7 for a non-unique, 10 (or 11) for a global index on a partitioned table.
  • lock byte – one byte
  • flag byte – one byte
  • row directory entry – two bytes

There are a couple of defects and rounding errors built into these assumptions, of course, but if the index statistics are reasonably up to date we can get a fairly accurate estimate of the expected size for most indexes. There are several other notes included with the script – do read them before you try using it so that you have an idea of how the results are derived and what they mean .

One important point: if you do find an index that seems to be much  bigger than it should be, don’t rebuild it or coalesce it until you’ve given yourself a little time to think about why it got into that state, and what else you might infer from the information available. (Some scripts to help with this decision are in the Script Catalogue.)

11 Comments »

  1. Apologies for moving this thread into the realms of actual issues with indexes, it seems like an interesting diversion (to me) but feel free to remove if misplaced.

    We have a series of perhaps ten materialized views in a 10.2.0.3 database which are refreshed frequently (between five and fifteen minutes) via the automatic dbms_refresh.refresh job created implicitly when the materialized view was created.

    The materialized view storage tables have associated indexes, both tables and indexes are in the same tablespace, LMT with initial 64K, allocation type ‘SYSTEM’, segment_space_management ‘AUTO’.

    The materialized view storage tables appear relatively constant in size (reflecting the slowly changing data they contain) at around 10-30M or so.

    However the associated indexes do not remain constant in size, instead they appear to continually acquire extents and (presumably) because of the allocation type the extent size increases through 1M, 8M, 64M, resulting in indexes of up to 4G (before they are detected by free space alerting mechanisms). Rebuilding these indexes returns them to an expected size (a fraction of the table size).

    Is this as simple as a misinformed combination of mv refresh / storage parameters?

    Comment by padders — March 1, 2010 @ 1:43 pm BST Mar 1,2010 | Reply

    • Padders,

      This looks like it may be one of the bugs associated with ASSM and the difficulty of choosing a moment to update the bitmaps to show that a block is free.

      Any updates which do a massive delete followed by inserts run a risk of making indexes bigger than they need to be. This is because Oracle cannot re-use the space in an index block until the transaction has committed – even for rows inserted by the same transaction (except for one special case). The same thing happens with tables, in fact, but the mechanics are different and the effect is much less noticeable. (See http://jonathanlewis.wordpress.com/2009/05/20/lunchtime-quiz/ for more details.)

      Since you’re on 10g with materialized views, I’d guess that you are doing atomic refreshes, which means Oracle deletes everything from the MVs and their indexes, then re-inserts, then commits – which fits a critical pattern.

      You could experiment by putting a couple of your indexes into a tablespace using freelist management to see the problem “magically” disappears. Alternatively just schedule a “coalesce” on all the problem indexes immediately after the refresh.

      Comment by Jonathan Lewis — March 2, 2010 @ 8:46 pm BST Mar 2,2010 | Reply

  2. Hi,

    Did you compare your estimation of index size with the one Oracle provides when doing explain plan on the create index statement ?

    Here is an example to get estimated size for all current user indexes:

    begin
     for r in ( select index_name from user_indexes ) loop
      execute immediate 'explain plan set statement_id='''||r.index_name||''' for '||
      dbms_metadata.get_ddl('INDEX',r.index_name);
     end loop;
    end;
    /
    select object_owner,object_name
     ,cast(extractvalue(xmltype(other_xml),'/other_xml/info[@type="index_size"]') as number) estimated_index_size 
    from plan_table p where id=1;
    

    Regards,
    Franck.

    Comment by Franck Pachot — March 1, 2010 @ 8:36 pm BST Mar 1,2010 | Reply

  3. hi Jonathan,

    nice post ,just out of curiousity do you mind sharing what was the flaws in the original script and what are the defects we should take note in the calculation

    Thanks

    Comment by josh — March 2, 2010 @ 6:42 am BST Mar 2,2010 | Reply

    • Josh,

      The main flaws are in the notes in the script. The biggest, perhaps, is that it doesn’t cope with compression at all. There’s also a problem with rounding errors and approximations when you have a lot of nulls in the columns that make up the index. It’s very easy to get an approximation which is much larger than the final index would actually be.

      Comment by Jonathan Lewis — March 2, 2010 @ 8:53 pm BST Mar 2,2010 | Reply

  4. [...] Efficiency 3 Filed under: Uncategorized — Jonathan Lewis @ 9:20 pm UTC Mar 3,2010 My last posting about indexes was an introduction to a pl/sql script that estimated the sizes that your indexes would be [...]

    Pingback by Index Efficiency 3 « Oracle Scratchpad — March 3, 2010 @ 9:21 pm BST Mar 3,2010 | Reply

  5. [...] 12-How to estimate index size with a simple script when stats are available? Jonathan Lewis-Index Efficiency-2 [...]

    Pingback by Blogroll Report 26/02/2010 – 05/03/2010 « Coskan’s Approach to Oracle — March 25, 2010 @ 1:49 pm BST Mar 25,2010 | Reply

  6. Hi Jonathan,
    We are upgrading from 9.2.0.6 to 11gR2 and I was assigned the task of redesigning the index rebuild process. As you have mention at several places across your blog, I work for a company where IT managers force the DBA to have regular index rebuilds. At present the indexes are chosen as per what oracle states %deleted >=30 and height>3.

    It takes us approx 8 hours!! to complete the ‘validate structur’e for all indexes (on a daily clone).

    Your script caught my attention and I’m trying to use it ti identify worthy candidates for rebuild. On this regard, could you please elaborate on the relevance of the “scaling factor” used?

    I’m also trying to decide on what to rebuild and what all to coalesce. Any inputs you have will be great!

    Jithin

    Comment by Jithin Sarath — September 1, 2010 @ 11:48 am BST Sep 1,2010 | Reply

    • Jonathan,
      Apologies. I now understand the use of “scaling factor”. It’s been a very long week and I just could not concentrate. Spending some time with the family seems to have helped and I was now casually going thro the code – and it hit me on the face :).

      However, the doubt on coalesce still stands though.

      Comment by Jithin Sarath — September 1, 2010 @ 2:24 pm BST Sep 1,2010 | Reply

      • Jithin,

        I’ve added a link to the Script Catalogue to the end of the article. There are two scripts there that can help you identify indexes where a coalesce would be sufficient; both do a lot of work since they read the index one block at a time, but they don’t cause any locking. One of them generates a statistical report of leaf block usage, the other “draws a picture” of the index leaf level.

        Comment by Jonathan Lewis — September 1, 2010 @ 5:01 pm BST Sep 1,2010 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,087 other followers