Oracle Scratchpad

August 7, 2009

Index Fragmentation

Filed under: fragmentation,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 6:07 pm BST Aug 7,2009

Here’s a thought for the weekend:

When people talk about “index fragmentation”, what do they mean, and why do they care ?

I often see email or forum posts from people claiming that their indexes are fragmented and need to be rebuilt – but they rarely explain (even when asked) what they mean by “fragmented”, and how they have measured the “fragmentation”, and why they think they have evidence that the index needs to be rebuilt.

So if you are accustomed to talking about indexes being “fragmented”, would you let me know what you mean, and how you measure “fragmentation”. (I can think of three or four interpretations for the term – but I’m interested to hear from people who actually use it.)

[Further reading on Fragmentation]

20 Comments »

  1. I think most people just mean that there’s too much deleted entries, or that it’s too big compared to the number of entries. How they measure it is beyond me.

    I know that when I say a TABLE is fragmented, I mean that it’s actual size is way bigger than avg_row_len*numrows.

    Thanks bye ;)

    Comment by Daniel Stolf — August 7, 2009 @ 7:36 pm BST Aug 7,2009 | Reply

  2. I’m not using this term ;) Here is how I can interpret “fragmented index”:
    1) there are many leaf blocks with “holes” which would never be reused
    2) many index segment extents are under-utilized, i.e. there big number of unused free blocks
    3) indexed data values are in fragments, i.e. data is sparsed when ordered, with points of very high and low density

    Comment by Timur Akhmadeev — August 7, 2009 @ 8:06 pm BST Aug 7,2009 | Reply

  3. I agree with Daniel.
    Sometime ago it was a formula with DEL_LF_ROWS and LF_ROWS and IIRC index could be called fragmented if it was more then 20 % of delete rows comparing to LF_ROWS.
    Now I have found in Metalink note 30405.1 that

    “As a rule of thumb if 10-15% of the table data changes, then you should
    consider rebuilding the index.”

    regards,
    Marcin

    Comment by Marcin Przepiorowski — August 7, 2009 @ 8:06 pm BST Aug 7,2009 | Reply

    • Marcin,

      From time to time I use the feedback links on Metalink notes to point out errors and omissions – but somehow there’s always another note waiting to crawl out of the woodwork.

      I can’t help noticing that the note you’ve quoted was: “Checked for relevance on 13-Jun-2007″, and yet the “Related Documents” are both Oracle 8 manuals, and when you follow the links to the two metalink notes referenced you find:

      Note 33343.1: “In Oracle V6 the validate index command …” from a note “Checked for relevance on 23-Jun-2008″.

      Note 1019722.6: Date Created 16-Jul-1996, for versions 7.3 to 11.1 – but hasn’t caught up with the structure of index_stats for 9i.

      Comment by Jonathan Lewis — August 10, 2009 @ 10:07 pm BST Aug 10,2009 | Reply

      • Jonathan,

        Thanks for your feedback. That’s true that note itself is very old but what do you think about using ratio between DEL_LF_ROWS and LF_ROWS columns as a indicator to rebuild a index ?

        Richard Foote has written in his blog http://richardfoote.wordpress.com/2008/02/08/index-rebuild-vs-coalesce-vs-shrink-space-pigs-3-different-ones/ that index coalesce will be more efficient from resource usage point of view but he was writing about deleting about 10 % of rows from table.

        So in that case if coalesce can be more efficient Metalink rule of thumb is useless, but what about deleting about 50 % of blocks ? Of course it’s depend as always in Oracle but I’m just looking for something which can be used a trigger to take a look on particular index.

        ps.
        Thanks God there is still google and thefreedictionary.com to translate all English idioms ;)

        regards,
        Marcin Przepiorowski

        Comment by Marcin Przepiorowski — August 11, 2009 @ 8:08 am BST Aug 11,2009 | Reply

        • Marcin,
          Sorry about the “crawling out of the woodwork” idiom – when writing quickly I don’t always remember to edit for EFL (English as a foreign language) readers.

          I think the del_lf_rows vs. lf_rows comparison might be of some use to a new DBA (or visiting consultant) who is trying to identify interesting spots in a badly documented system – but if you have a process that has a pattern of deletes and inserts that makes that ratio useful, then you should have a proper coding strategy embedded in the routine that does the deletes anyway. (And the “validate index” command locks the table anyway, so isn’t appropriate to a lot of systems these days).

          So – you need a strategy to get from ignorance to familiarity, and something this code (which has an error that I will correct eventually) is a “brute force” way of discovering interesting items – but once you know which indexes are possible threats, you shouldn’t need to use the method ever again.

          Comment by Jonathan Lewis — August 11, 2009 @ 8:21 pm BST Aug 11,2009

        • Jonathan,

          Thanks once again for you answer.
          My point of view about del_lf_rows vs lf_rows is very close to your description
          and your alternative method to estimate a proper index size is as always brilliant.

          ps.
          It is a good way to learn English by EFL readers to read native blogs – so don’t edit your posts ;)

          regards,

          Comment by Marcin Przepiorowski — August 12, 2009 @ 8:11 am BST Aug 12,2009

  4. I’ve often wondered if people asking such things have sql server in mind, so I googled for fragmented index sql server. It seems that it is common enough there to have magical DBA incantations to deal with it, varying by version. So I guess it must be a problem.

    Comment by joel garry — August 7, 2009 @ 9:28 pm BST Aug 7,2009 | Reply

    • In sql server it is assumed to be a problem when using the index for the equivilent of range scans esspecially where results are to be ordered on the indexed values. Th eproblem I have is I don’t have the toolset to prove/disprove that there is actually a performance problem with fragmented indexes in sql server. A lot of DBA’s will rebuild/defragment sql server indexes simply because its recomended best practice.

      Comment by Chris_c — August 10, 2009 @ 12:40 pm BST Aug 10,2009 | Reply

  5. Here is my understanding:
    Index Fragmentation is free and available space within the index structure which can not be used for future inserts.
    This fragmented space can be cleaned up by coalesce or rebuild.
    I have used COALESCED option (can be done online with no additional space required) only and not REBUILD on indexes to reclaim free space.
    For identifying candidates we can use analyze index index_name validate structure;– run during planned maintenance window if production object and see how much percentage of the allocated index space is actually used in the index_stats view.
    Locally Managed Tablespaces should be used to avoid fragmentation in segments.
    Keep up the good work. You are great!
    Regards
    Rakesh

    Comment by Rakesh — August 10, 2009 @ 2:41 am BST Aug 10,2009 | Reply

    • Rakesh,

      I’ve just realised that you’ve used the “fragmented” word about tablespaces and segments – would you care to try your hand at giving a detailed description about what you mean by fragmentation in that context, why you think it matters, how you would detect the problem, and what symptoms you would expect to see that prove that it’s a problem.

      Comment by Jonathan Lewis — August 11, 2009 @ 8:23 pm BST Aug 11,2009 | Reply

      • Jonathan,

        I mentioned “Locally Managed Tablespaces should be used to avoid fragmentation in segments.” :

        By using the Dictionary managed tablespaces and not correctly considering the extent sizes, chances of fragmentation are more in the tablespaces.
        Oracle recommends use of locally managed tablespaces and Auto-allocate Extent Management/ Uniform Extent Management to avoid fragmentation.

        Also, index segments get fragmented due to repeated manipulations of the balanced tree structures and incorrect setting of storage parameters like PCTFREE/ PCTUSED.

        Below are some of the documents helped in my understanding of the same:

        1. http://www.oracle.com/technology/products/manageability/database/pdf/ow04/1241_minhas.pdf
        2. http://www.oracle.com/technology/deploy/availability/pdf/defrag.pdf
        3. metalink note 186826.1 Various Aspects of Fragmentation

        Regards
        Rakesh

        Comment by Rakesh — August 12, 2009 @ 4:28 am BST Aug 12,2009 | Reply

        • Rakesh,

          I hope you won’t think I’m picking on you – but your contribution is useful because it demonstrates how sometimes a concept gets detached from any real meaning.

          In your original comment, you said of “index fragmentation” that it was:
          “free and available space within the index structure which can not be used for future inserts”
          That’s a pretty good description – except: (a) all space with the index (leaf) structure is always available, so “and available” is redundant; (b) all space within the index (leaf) structure can be used for future inserts; and (c) the failure to re-use that space is a feature of the application, not of the index.

          So, for indexes, I like to see people stop talking about “fragmentation” and start saying things like “I have an index with (a lot of) available space that my application doesn’t want to use.”

          Your comments about “repeated manipulation of the balanced tree structure” look like a quote and the comments about pctused (it’s ignored for indexes) and pctfree (it applies only when you create/coalesce/rebuild the index) are particularly irrelevant. Note that every single insert or delete from an index may manipulate the tree structure.

          As far as “fragmentation in segments” is concerned, you’ve now switched this to “…fragmentation [are more] in the tablespaces.” But you haven’t explained what you mean by fragmentation and why it matters. (Given the articles you list, I think I know what you have in mind – and again I’d rather not see the “F” word used as a label for it.)

          Funnily enough I discussed “Tablespace Fragmentation” in a presentation I did in November 1996.

          Comment by Jonathan Lewis — August 12, 2009 @ 6:21 am BST Aug 12,2009

  6. This article seems to say it all: http://www.jlcomp.demon.co.uk/14_index_rebuild_i.html

    ;)

    Comment by Nigel — August 10, 2009 @ 8:24 am BST Aug 10,2009 | Reply

  7. I really wasn’t interested in what person A thinks person B means when person B says “my index is fragmented”. I wanted a few people (person Bs) who actually do say it to explain what they mean when they say it and, if possible, to explain how they detect the phenomenon.

    Comment by Jonathan Lewis — August 10, 2009 @ 10:10 pm BST Aug 10,2009 | Reply

  8. [...] “Here’s a thought for the weekend . . . ” When people talk about ‘index fragmentation’, what do they mean, and why do they care?  . . . ”would you let me know [...]

    Pingback by Log Buffer #158: a Carnival of the Vanities for DBAs | Pythian Group Blog — August 14, 2009 @ 5:47 pm BST Aug 14,2009 | Reply

  9. [...] Jonathan Lewis sobre Index Fragmentation, en SAP hay que tener cuidado que hay gente que te manda reorganizar un indice a la [...]

    Pingback by Entradas de Oracle semana 33 « Gruñidos sobre Oracle y SAP — August 16, 2009 @ 6:43 pm BST Aug 16,2009 | Reply

  10. I don’t like “fragmentation” in discussions about indices; there are enough arguments in previous posts.
    But as there is a question about “fragmentation” per se, the 1st time I was confronted with it (in my PC-life) was fragmentation on an old DOS (I guess it was 3.0) FAT filesystem. In this context, “fragmentation” was the level of non-continuous physical blocks belonging to a file.
    It’s hard to find equivalences in oracle structures: sequential vs. scattered reads (which could be a measurement) are affected by block size, segments and blocks in the buffer cache (at least), and as they are limited by segment-boundaries, it doesn’t matter if segments are in sequence or not. (Ok, on a single disk, it does matter, but I hardly can tell from within oracle DB only).
    Maybe this can be used against the misuse of “fragmentation” in discussions.

    Comment by Martin Berger — August 26, 2009 @ 6:38 pm BST Aug 26,2009 | Reply

    • Martin,

      If I gave prizes for comments I think I’d give a prize for this one.

      When I posted the original note I wanted people to think about what they meant by talking about an index being “fragmented” – and then to realise that (as you suggest) it’s a very bad term to use about indexes.

      Rakesh made the most important point – though with some errors: sometimes an index accumulates free space that is unlikely to be reused (at least in the near future) and results in queries doing more work than is really necessary. If this free space can be eliminated safely and cost-effectively then it is worth taking some time to notice the effect and do something about it.

      Using an unsuitable and misleading term like “fragmented” stops people from thinking carefully about what the problem is, why it happens, and what to do about it.

      Comment by Jonathan Lewis — August 26, 2009 @ 7:14 pm BST Aug 26,2009 | Reply

  11. [...] For ideas from other readers, you might also like to read the comments supplied in response to the question I posed last year: “What do you mean by index fragmentation ?”  [...]

    Pingback by Fragmentation 4 « Oracle Scratchpad — July 22, 2010 @ 7:02 pm BST Jul 22,2010 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,505 other followers