This note started life as a nutshell until I realised that it was going to be more of a coconut than a hazelnut and decided to turn it into a short series instead. I should manage to post four parts over the next two weeks.
The implication of the word “fragmentation” is that something is broken into pieces, but it also carries an emotional overtone that suggests it’s lots of little pieces. In an Oracle context you need to consider what you mean by “pieces”, the granularity of the pieces, and the possible impact on performance. Since it’s possible to talk about fragmentation at the (logical) disk level, the file level, the tablespace level, the segment level, the extent level, and the block level, it’s necessary to think very clearly about what you’re trying to say when you make a comment like “my tablespace is fragmented” or “my index is fragmented”.
Let’s start with an example: I have created a new tablespace and moved a table into it. When I check dba_extents the table has 100 extents. Clearly it is “fragmented” in the basic sense of the word since it is made of 100 different pieces. On the other hand, because the table was the first thing I created in the tablespace, I can see that all the extents are adjacent – so you could say the table is “logically fragmented” but “physically contiguous”.
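To make the distinction concrete, here is a minimal sketch in Python of the adjacency check described above. It assumes you have already fetched file_id, block_id and blocks for the table from dba_extents; the `Extent` type, the `count_physical_pieces` function and the sample figures (file 5, starting block 128, 8 blocks per extent) are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    file_id: int   # data file holding the extent
    block_id: int  # first block of the extent within that file
    blocks: int    # length of the extent in blocks


def count_physical_pieces(extents):
    """Count runs of extents that sit end to end in the same file."""
    ordered = sorted(extents, key=lambda e: (e.file_id, e.block_id))
    pieces = 0
    prev = None
    for ext in ordered:
        adjacent = (
            prev is not None
            and ext.file_id == prev.file_id
            and ext.block_id == prev.block_id + prev.blocks
        )
        if not adjacent:
            pieces += 1  # a gap or a change of file starts a new piece
        prev = ext
    return pieces


# Hypothetical data: 100 extents of 8 blocks each, laid end to end in file 5.
table_extents = [Extent(5, 128 + 8 * i, 8) for i in range(100)]

print(len(table_extents), count_physical_pieces(table_extents))  # prints: 100 1
```

The two numbers are the two senses of the word: 100 logical pieces, but only one physical piece.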
Does this example of fragmentation have any impact on the performance of your system? Since most I/O done by Oracle operates at the block level (we read data blocks into the db cache, we write data blocks to files), and the location of the block within any particular extent is irrelevant, the answer is probably no. But there are times when we try to read multiple adjacent blocks with a single read request (tablescans and index fast full scans); does it matter that our “physically contiguous” table is “logically fragmented” into lots of extents?
What if the extents are (say) only 64KB each: does this limit the size of the “db file multiblock read” requests that we will be making, or can those reads cross extent boundaries? What if the tablespace is made up of two (or more) files so that the extents generally “round-robin” between files: does this affect the way the reads can operate? What if we try to do a parallel tablescan: are the restrictions on “direct path reads” different? If you’re running a data warehouse that spends a lot of its time doing this type of operation then these are just some of the questions you need to answer. (See, for example, a note I wrote three years ago about some of the anomalies of I/O sizes when running parallel query, and a related enhancement in 11g described by Christian Antognini a couple of years later.)
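To see why the extent-boundary question matters, here is a rough piece of arithmetic in Python. It does not say which behaviour Oracle actually follows; it only compares the two possibilities. The figures are assumptions for illustration: an 8KB block size (so a 64KB extent is 8 blocks), a maximum multiblock read of 128 blocks, and 100 extents laid end to end in a single file. The `reads_needed` function is hypothetical.

```python
import math


def reads_needed(extent_blocks, extent_count, max_read_blocks, can_cross_extents):
    """Estimate the number of multiblock read requests for a full scan."""
    if can_cross_extents:
        # Reads are limited only by the maximum read size.
        return math.ceil(extent_blocks * extent_count / max_read_blocks)
    # Each read has to stop at the end of the extent it started in.
    return extent_count * math.ceil(extent_blocks / max_read_blocks)


# Assumed figures: 8-block (64KB) extents, 100 of them, 128-block maximum read.
print(reads_needed(8, 100, 128, can_cross_extents=True))   # prints: 7
print(reads_needed(8, 100, 128, can_cross_extents=False))  # prints: 100
```

Under these assumptions the same 800 blocks take either 7 read requests or 100, which is why the answer is worth finding out for a system that does a lot of tablescans.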
It’s only after you start to think clearly about what you mean by “fragmentation” that you can begin to understand the possible problems that it can cause and the reasons why it may, or may not, have an impact on your system. In part two I’ll make some comments about the way you should think about fragmentation at the disk level and the tablespace level.