Index too big

March 25, 2010

Index too big

Filed under: Infrastructure,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 8:50 pm GMT Mar 25,2010

I thought I’d posted this a couple of years ago – but maybe it was something I put on the Oracle database forum in response to a question. If it was on the forum the same (or similar) question has recently appeared. “How come my index is so big when there’s no data in the table ?”

Of course, the amount of space currently allocated to an object may have nothing to do with the current amount of data stored in the object – after all, you may have executed a “delete tableX” command moments ago – but it’s worth being aware of a little trap that may result in an apparent sizing anomaly all the time. To get things going, consider the following code fragment that creates an indexed table, then does a little data processing of a style that I sometimes see in overnight batch processes:


rem
rem     Script:         index_too_big.sql
rem     Dated:          March 2010
rem     Author:         Jonathan Lewis
rem

create table t1 (v1 varchar2(400));
create index i1 on t1(v1);

begin
        for i in 1..1000 loop
                insert into t1 values (lpad(i,400));
                delete from t1;
        end loop;
        commit;
end;
/

Assume this is in a database with an 8KB blocksize, and the default tablespace is locally managed with uniform 1MB extents using freelist management rather than ASSM (automatic segment space management).

Question

Roughly how many blocks would you expect to be below the table and index highwater marks at the end of execution ?

Answer

Plenty of good answers in the comments below. Here’s a summary of the key points.

Oracle cannot reuse the space made available from deletes until after the commit; however tables and indexes are handled differently.
Tables need only keep a “stub” to represent the row that has been marked for deletion, but indexes have to keep the entire index entry.

As a result, the amount of space “held” until the commit will be much larger in this case for the index than it is for the table.

All 1,000 row stubs will fit into a single 8KB block quite easily, so the table never grows beyond one block. Various people have done arithmetic for the index – and an approximation is good enough. Each entry is about 400 bytes (plus a bit) so we will get about 20 to a leaf block, which means we need about 50 leaf blocks, so we need about 50 branch entries. The branch entries will also be about 400 bytes, so 20 per branch block – so we need three branch blocks and a root block above them.

As we bump the highwater mark on the table and index, though, we may get a few more blocks than we finally use – so the total below the highwater mark may be a little higher. (The purpose of the little puzzle was to highlight the differences between table and index deletion, so I wasn’t planning to mention that originally – but Charles Hooper picked it up in his comment.)

I’ll just close with a couple of trace file dumps from a 9.2.0.8 system after running the code:

Segment header block of the table.
Note “blocks in seg header’s freelists = 1”, “blocks below hwm = 1”. There is one block in the table, and it’s still not full despite having 1,000 row directory entries in it – so it’s still on the freelist.

Start dump data blocks tsn: 12 file#: 11 minblk 9 maxblk 9
buffer tsn: 12 rdba: 0x02c00009 (11/9)
scn: 0x0000.03e52147 seq: 0x01 flg: 0x00 tail: 0x21471001
frmt: 0x02 chkval: 0x0000 type: 0x10=DATA SEGMENT HEADER - UNLIMITED
Extent Control Header
-----------------------------------------------------------------
Extent Header:: spare1: 0 spare2: 0 #extents: 1 #blocks: 127
last map 0x00000000 #maps: 0 offset: 4128
Highwater:: 0x02c0000b ext#: 0 blk#: 1 ext size: 127
#blocks in seg. hdr's freelists: 1
#blocks below: 1
mapblk 0x00000000 offset: 0
Unlocked
Map Header:: next 0x00000000 #extents: 1 obj#: 45209 flag: 0x40000000
Extent Map
-----------------------------------------------------------------
0x02c0000a length: 127

nfl = 1, nfb = 1 typ = 1 nxf = 0 ccnt = 1
SEG LST:: flg: USED lhd: 0x02c0000a ltl: 0x02c0000a
End dump data blocks tsn: 12 file#: 11 minblk 9 maxblk 9

Partial segment header block of the index.
Note that the HWM is at block 63, but we appear to have 53 blocks on the freelist (that’s the empty leaf blocks after the commit, but not the branch blocks). I can’t quite make the numbers add up the way I expected them to, and that’s a detail I may have to look into some day.

Start dump data blocks tsn: 12 file#: 11 minblk 137 maxblk 137
buffer tsn: 12 rdba: 0x02c00089 (11/137)
scn: 0x0000.03e52566 seq: 0x01 flg: 0x00 tail: 0x25661001
frmt: 0x02 chkval: 0x0000 type: 0x10=DATA SEGMENT HEADER - UNLIMITED
Extent Control Header
-----------------------------------------------------------------
Extent Header:: spare1: 0 spare2: 0 #extents: 1 #blocks: 127
last map 0x00000000 #maps: 0 offset: 4128
Highwater:: 0x02c000c9 ext#: 0 blk#: 63 ext size: 127
#blocks in seg. hdr's freelists: 53
#blocks below: 63
mapblk 0x00000000 offset: 0

Treedump of the index – with the “leaf:” lines deleted.
Note the root and three branch blocks; and the 53 leaf blocks (20 + 20 + 13) in total listed by the branches.

----- begin tree dump
branch: 0x2c0008a 46137482 (0: nrow: 3, level: 2)
branch: 0x2c000a5 46137509 (-1: nrow: 20, level: 1)
branch: 0x2c000a6 46137510 (0: nrow: 20, level: 1)
branch: 0x2c000bb 46137531 (1: nrow: 13, level: 1)

Supplementary Question – for those with nothing better to do this weekend

Each time I delete a row from the table its “delete flag” is set. At some point, though, Oracle wants to re-use the space taken by the row and leave just a “stub” to show where the row should be if I roll back. When, and therefore how often, in my test case does Oracle remove the actual rows and replace them with stubs ? (See “Heap Block Compress” for answers.)

Comments (20)

20 Comments »

OK, I’m up for being embarrassed (For those readers in the UK, I feel like Alan Davies on BBC’s “QI”).
I’m going to guess one block in the table and…1000/18 blocks for the index – 55 leaf blocks and one branch block.
Why? I know… Not, let me correct that, I believe, that space in tables is reused pretty aggressively – traditionally if a block went below pctused then the block would be flagged as available for more data in the freelists, so I do know that generally speaking space in table blocks is reused. So the first block is constantly having it’s one record removed, oracle knows it can reuse space and so it does so for the table. It never needs to move onto the next block.
Indexes, the space does not get reused, so the new index value will get inserted further up the block. The old record that was deleted just gets marked somehow as no longer valid. Thing is, I am not sure if this is true of the current block… So I am guessing.
If you were deleting enough data in one go to fill more than one block (even if it was only one and a tiny bit blocks) I would be pretty confident that the freed blocks would not be reused and each delete would leave one whole block left empty each cycle.
Why divide 8k blocks by 18 and not 20, as 20*400bytes is pretty much 8K? Overhead in the index structure using up some space.

In the real world I’d nip off to my test box and try it, but I have not had time to do that yet.

I await correction…

Comment by mwidlake — March 25, 2010 @ 9:22 pm GMT Mar 25,2010 | Reply
- Martin,
  
  Close enough – and with a reasonable explanation of why.
  
  Comment by Jonathan Lewis — March 26, 2010 @ 9:38 pm GMT Mar 26,2010 | Reply
I won’t give the answer away as it’s an interesting question to get people thinking.

However, a follow-up question to get people pondering more on this would be to ask what difference would it make if the commit statement were to be inside the loop, rather than outside the loop at the end.

It’s pretty well the same work after all, the same amount of data being inserted and deleted …

Comment by Richard Foote — March 26, 2010 @ 1:00 am GMT Mar 26,2010 | Reply
- Another interesting question would be what difference would it make if the “commit” outside the loop was a “rollback” – not very realistic, I know, but interesting nevertheless (block splits happen in an autonomous transaction).
  Or if the “commit” was outside the PL/SQL block, albeit this should only play a role in a concurrency situation, if I remember correctly (call freelists).
  
  Comment by Flado — March 26, 2010 @ 1:07 pm GMT Mar 26,2010 | Reply
Question: roughly how many blocks would you expect to be below the table and index high-water marks at the end of execution ?
1 block below HWM in table; around 56 blocks in index (how? assuming default PCTFREE of 10 for index block, each index entry will be 400 bytes(data)+6 bytes(rowid)+some storage for index maintenance ~ 408 bytes; which means around 18 index entries per block. As all DMLs are within single transaction, index space will not be reused so 1000 rows will result in index consuming 55 leaf blocks plus a root block)
what difference would it make if the commit statement were to be inside the loop, rather than outside the loop at the end.
No difference for table size so 1 block below HWM for table. But for index the space will be reused as commit inside loop converts original single transaction into 1000 transactions. As index blocks can be reused across transactions, the end result will be around 19 (18 leaf blocks + 1 root) index blocks below HWM (i.e. index will not grow and have same size as that before the process)

p.s. Disclaimer:- This is being written on a Friday Morning, without referring to any of Richard’s articles or using “Google” skills :), so forgive me for mistakes.

Comment by Narendra — March 26, 2010 @ 8:52 am GMT Mar 26,2010 | Reply
- >assuming default PCTFREE of 10 for index block
  PCTFREE applies to CREATE INDEX only.
  
  Comment by Timur Akhmadeev — March 26, 2010 @ 10:04 am GMT Mar 26,2010 | Reply
  - Glad, I added my “Disclaimer”. :)
    But seriously, I should have remembered about PCTFREE behaviour for index.
    
    Comment by Narendra — March 26, 2010 @ 2:32 pm GMT Mar 26,2010 | Reply
Taking a guess here. 5 blocks for the table below the high watermark, 4 of which will not have been utilized. The number of blocks below the high watermark for the index is not so easy to guess – while there is a sequential number driving the entries, the fact that the sequential numbers are stored in a VARCHAR2 means that some of the block splits could be 90-10 splits while others are 50-50 splits. In this case LPAD is used rather than RPAD, so the alphanumeric order will be the same as the numeric order, so we should be seeing 90-10 index block splits in all cases.

Could the number of blocks be dependent on whether or not a 32 bit or 64 bit platform is used (assuming 9i or earlier)? What about the Oracle release? What about if someone had adjusted the _BUMP_HIGHWATER_MARK_COUNT parameter? Would it make a difference if the index were declared as unique?

Comment by Charles Hooper — March 26, 2010 @ 10:35 am GMT Mar 26,2010 | Reply
- Charles,
  
  You’ve raised a couple of interesting points beyond the answers given by Martin and Narenda.
  
  Definitely worth mentioning that their arithmetic was valid because of the so-called “90/10” splits cause by the monotonic increasing values. I don’t think 32/64 bit would make a difference, nor the version (so long as it was 6 or above)
  
  The comment about _bump_highwater_mark_count is worth a mention. People do fiddle with it occasionally, of course, but the reason for mentioning it is that there could be some blocks that are on the index free list because the HWM has been recently bumped (by its default of 5) which aren’t in the index yet.
  
  In fact, the table is a special case for the HWM – when you start inserting data into a segment it allocates blocks one at a time for the first few before the 5 block “bump” takes over; so in this case there were no extra empty blocks on the free list. (I think this behaviour is just a hangover from the old days when keeping space to a minimum was more important that pausing briefly to allocate a new block).
  
  Comment by Jonathan Lewis — March 26, 2010 @ 10:00 pm GMT Mar 26,2010 | Reply
I would agree with Narenda up to the calculation of the number of index blocks, where I would add to his total the number of branch blocks created, as the index will not fit in a single block. That is the limit of my knowledge though, as I have no idea as to how many leaf block entries will fit in a single branch block. If I had to guess though, I would think the number of index entries in a leaf block would not be an unreasonable number as an estimate of leaf block entries per branch block. So on this rash assumption, I would estimate 55/18 branch blocks, that is 3 say. So my total index blocks would be 56 + 3 = 59.
Finally, I would estimate that there should be one block for the leaf blocks if the commit was placed inside the loop as once a leaf block was filled with deleted entries it would be returned to the free list, and so there would be one block left at the end of the loop containing the last few deleted index entries from the loop. I do not know how many Branch blocks there will be, but I assume it will be one for the very first index entry insertion, as there will never be a second leaf block created, there will be no further brach block entries. So, in Richard’s case I think the Index blocks will be 1 leaf block, 1 branch block and 1 index root block giving 3 blocks.
I will probably wish I had not posted this entry, I await my logic and lack of understanding being torn apart!

Comment by tonysleight — March 26, 2010 @ 10:55 am GMT Mar 26,2010 | Reply
- Tony,
  
  Good to think about the number of branch blocks and the dependency on the size of the entry that has to be in the branch block. In my example the way I defined the index entry requires very large entries in the branch blocks too – Oracle tries to trim branch block entries to the smallest size that will allow it to distinguish the first entry in one leaf block from the largest possible entry in the previous leaf block, but my entries match over almost their entire length.
  
  Comment by Jonathan Lewis — March 26, 2010 @ 10:03 pm GMT Mar 26,2010 | Reply
oh dear – so much more reading for me to do ;o(

Comment by kent — March 30, 2010 @ 11:53 am BST Mar 30,2010 | Reply
Hi Jonathan,

I am still curious about the answers to Richard Foote’s question and Flado’s.
– commit outside the loop
– rollback instead of commit

Regards Hans-Peter

Comment by Hans-Peter — March 30, 2010 @ 5:10 pm BST Mar 30,2010 | Reply
Hans-Peter,

Richard’s question was “what if the commit was inside the loop”. In this case each time you insert the next entry, the previous entry will be wiped out of the index and the space will re-used. If you dump the table block, though, you will see that Oracle stacks the incoming rows into the block’s free space, even though it will keep recycling the block’s row directory so that it never grows and only the last inserted/deleted row is identified and the “heap block compress” (see pingback) happens in the same way.

Flado’s question was “what if you do a rollback outside the loop”. In this case the rollback would actually happen, and the table’s row directory would be reduced to a minimum and the row directories of the index leaf blocks would be emptied, but the leaf blocks would still be in the index structure and the root block would still have entries pointing to the leaf blocks.

Comment by Jonathan Lewis — March 31, 2010 @ 10:14 pm BST Mar 31,2010 | Reply
[…] 10-Difference between deletion from index and table Jonathan Lewis-Index too big […]

Pingback by 19/03 /2010 – 26/03/2010 « Coskan’s Approach to Oracle — May 3, 2010 @ 2:40 am BST May 3,2010 | Reply
[…] files — Jonathan Lewis @ 7:24 pm UTC Mar 30,2010 In a recent note showing how an index could become much larger than the underlying table because of the different ways that Oracle handles deletion from table and index blocks, I pointed […]

Pingback by heap block compress « Oracle Scratchpad — June 29, 2010 @ 9:50 am BST Jun 29,2010 | Reply
[…] – it cannot immediately reuse the space, it has to wait until after the commit. (I posted an example demonstrating this difference a few months ago.) Another critical difference between tables and indexes is that an index […]

Pingback by Fragmentation 4 « Oracle Scratchpad — July 22, 2010 @ 7:01 pm BST Jul 22,2010 | Reply
I`ve done the following:

analyze index i1 validate structure;

And receved those numbers

LF_ROWS,DEL_LF_ROWS from index_stats;

LF_ROWS DEL_LF_ROWS
———- ———–
1000 1000

Could you please explain, what they mean. Especially remembering your description for LF_ROWS in book Practical Oracle8i: “Number of current rows in the index (excluding deleted rows)”

Oracle 11.2.0.2

Comment by Yuri — August 3, 2011 @ 7:15 am BST Aug 3,2011 | Reply
- Yuri,
  
  I think the first thing to say is that you’ve found an error in the book, the lf_rows figures include the del_lf_rows figures. The book was based on 8.1.5, but I don’t think that the calls to index_stats would have changed from 8.1.5 to 8.1.7.4 which is the earliest version I can get my hands on. (I’ve added an item to my list of errata.)
  
  Your stats mean: there are 1,000 entries visible in the row-directories of the leaf blocks (lf_rows=1,000), but every entry has its Deleted flag set (del_lf_rows = 1,000).
  
  Comment by Jonathan Lewis — August 3, 2011 @ 9:17 am BST Aug 3,2011 | Reply
[…] nature of the objects and, as a side effect of the delete/insert cycle, you’re likely to see the indexes grow to roughly twice their optimal size and you may see the statistic “recursive aborts on index […]

Pingback by 12c MView refresh | Oracle Scratchpad — February 15, 2022 @ 12:14 pm GMT Feb 15,2022 | Reply