Jump to update for new format details
Here’s a mechanism for examining indexes in some detail if you think that something odd may be going on inside them. It’s a feature that I first decribed in Practical Oracle 8i, although the book doesn’t mention a problem with it that I subsequently discovered that means I always test it carefully on a small index before I use it on a large one. Here’s a demonstration – cut from an SQL*Plus session on 10.2.0.3:
SQL> select object_id 2 from user_objects 3 where object_name = 'ORD_PLACED' 4 ; OBJECT_ID ---------- 100466 1 row selected. SQL> alter session set events 'immediate trace name treedump level 100466'; Session altered.
The treedump dumps a trace file with a summary of the index (or index partition) whose object_id (not data_object_id) you have specified in the level. In general you get one line in the trace file for each block in the index – but there are some versions of Oracle where you get a full symbolic dump for every leaf block, hence the need to check with a small index before trying it on a large one.
The dump will look something like the following (which is the start of a dump of the index sys.i_source1):
branch: 0x40035a 4195162 (0: nrow: 3, level: 2) branch: 0x40e2c0 4252352 (-1: nrow: 312, level: 1) leaf: 0x40035b 4195163 (-1: nrow: 240 rrow: 240) leaf: 0x40215f 4202847 (0: nrow: 218 rrow: 218) leaf: 0x40035c 4195164 (1: nrow: 444 rrow: 444) leaf: 0x40035d 4195165 (2: nrow: 444 rrow: 444) ... etc ...
To generate the dump, Oracle walks the tree structure in order one block at a time. The hexadecimal and decimal numbers starting each line are the block address expressed in two different ways.
The first line of the dump shows the root block. In this case the index has blevel=2 (which means a height of 3), so a walk down the index will visit the root, one level 1 branch, and then hit a leaf block. The root block currently has three pointers (nrow = 3) to branch blocks at level 1.
Note 1 – The root block is normally no different in format from a branch block, but when an index consists of nothing but the root block the root block is formatted as a leaf block not a branch block.
Note 2 – whatever block splits happen, the root block is always kept at the same location – the block after the segment header block and any freelist group blocks.
Indented one step (three characters) line 2 shows you the first branch block of the level below – the first of the level 1 branch blocks in this case. In brackets you see that this is entry -1, has 312 pointers to blocks at the next level and is a level 1 branch block. The significance of level 1 is that the entries will be pointing to leaf blocks.
It may seem a little odd that Oracle starts at the “-1th” branch block: computer programs usually start counting at zero or one, but Oracle has a rationale for starting at -1 in branch blocks. Most of the entries in a branch block are pairs of the form (partial key, block address) – and these are listed in the block’s row directory – but there is one special entry (called the “leftmost child – kdxbrlmc”) which omits the partial key and gets stored in a separate place outside the row directory. Oracle omits the key to save a little space and infers the value from the block’s parent (in the branch level above). Since the row directory starts counting at row 0, the leftmost child has to be labelled “-1”.
I haven’t shown it, but 312 lines further down the trace file, you’d find branch block zero, and the other two branch blocks would be a similar sort of distance further down again.
The third line of the trace file is the first leaf block (and again we start counting from -1). Note that there is no “level” in the leaf block. Instead you get two entries for rows: the number of current rows (rrow) in the block and the current size (nrow) of the block’srow directory (see glossary). The row directory may hold references to rows that have been deleted, committed, but not yet cleaned out of the structure, which is why the dump shows two pieces of information.
The part of the index shown in the example is completely clean and tidy (rrow = nrow in every block) but something a little odd has happened at the start of the index as we can see evidence of a 50/50 leaf block split at the low end due to inserts after the index was intially filled – note how the leaf block addresses are nearly in sequence (035b, 215f, 035c, 035d) but a completely different block number appears between the first and third blocks to hold the overflow as the first block split.
For comparison purposes, here’s a section of the index produced during one run of my “index explosion” code:
branch: 0x140118a 20976010 (0: nrow: 551, level: 1) leaf: 0x140118b 20976011 (-1: nrow: 409 rrow: 409) leaf: 0x140118c 20976012 (0: nrow: 214 rrow: 214) leaf: 0x140118d 20976013 (1: nrow: 214 rrow: 214) leaf: 0x140118e 20976014 (2: nrow: 213 rrow: 213) leaf: 0x1401190 20976016 (3: nrow: 206 rrow: 206) leaf: 0x140118f 20976015 (4: nrow: 327 rrow: 327) leaf: 0x1401191 20976017 (5: nrow: 169 rrow: 169) leaf: 0x1401192 20976018 (6: nrow: 169 rrow: 169) leaf: 0x1401194 20976020 (7: nrow: 159 rrow: 159) leaf: 0x1401193 20976019 (8: nrow: 169 rrow: 169) leaf: 0x1401196 20976022 (9: nrow: 160 rrow: 160) leaf: 0x1401195 20976021 (10: nrow: 327 rrow: 327) ... leaf: 0x140152d 20976941 (544: nrow: 141 rrow: 141) leaf: 0x140152c 20976940 (545: nrow: 146 rrow: 146) leaf: 0x140152f 20976943 (546: nrow: 140 rrow: 140) leaf: 0x140152e 20976942 (547: nrow: 282 rrow: 282) leaf: 0x1401530 20976944 (548: nrow: 282 rrow: 282) leaf: 0x1401531 20976945 (549: nrow: 138 rrow: 138)
We have an index with blevel = 1, and 551 leaf blocks (numbered -1 to 549).
The first leaf block is 100% packed and underwent a “90/10” split.
The next 4 leaf blocks show clear evidence of 50/50 splits (each holds roughly half of the maximum 409 rows in the first block). The next leaf block suggests that a bit of a catastrophe hit the block and its ITL grew sharply – this block is 100% “full” (a fact that I ascertained by doing a block dump), but can only fit 327 entries because its ITL has grown to about 70 entries.
The size of the ITL doesn’t change over the next few blocks, but we can see a few 50/50 splits followed by a 90/10 split – and so it goes on. By the time we get to the end of the list, we’ve had another glitch, and a “full” block now holds only 282 rows, and blocks showing 50/50 splits hold about 140 rows. In this test run, the ITLs grew to a worst case of about 112 entries – not a complete disaster but certainly bad news.
The treedump isn’t detailed enough to tell you everything that’s happened to your index – but it is relatively short (one line per block) and gives you some good clues about what’s been going on. And the presence of the decimal form of the block address makes it easy to dump individual blocks if you want to investigate further, for example:
SQL> select 2 dbms_utility.data_block_address_file(20976011) file#, 3 dbms_utility.data_block_address_block(20976011) block# 4 from 5 dual 6 / FILE# BLOCK# ---------- ---------- 5 4491 1 row selected. SQL> alter system dump datafile 5 block 4491; System altered.
Footnote:
In some versions of Oracle you get a full block dump of every leaf block – which makes it very expensive to do a treedump.
In versions of Oracle prior to 9.2.0.6 there is a bug which means your session will crash with a “numeric overflow” error if you try to dump an index that has been created automatically to support a unique or primary key constraint. This makes it hard to dump index organized tables(IOTs). This is because bit 12 (4096) been set in ind$.property and the code isn’t expecting such a large number. (Technically it is possible to update the property column directly; it works, but I wouldn’t do it on the production system – but maybe on a backup you’re about to discard.)
Update Dec 2010
:
See this comment on a related post): Each line of a treedump takes about 75 bytes, so you need to be able to generate a tracefile of around (75 * leaf_blocks / 1048576)MB. Make sure you check parameter max_dump_file_size before you start.
Update Aug 2022
At some point between 11g and 19c the format of the tree dump changed to add a little extra information about the leaf blocks and re-arrange the way the “maximim/current” figures were displayed.
Old format
branch: 0x1400983 20973955 (0: nrow: 15, level: 2) branch: 0x1400c73 20974707 (-1: nrow: 238, level: 1) leaf: 0x1400984 20973956 (-1: nrow: 186 rrow: 47) leaf: 0x1401e6a 20979306 (0: nrow: 186 rrow: 43) leaf: 0x1401412 20976658 (1: nrow: 191 rrow: 53)
New format
branch: 0x9000104 150995204 (0: nrow: 15, level: 2) branch: 0x9000322 150995746 (-1: nrow: 238, level: 1) leaf: 0x900017f 150995327 (-1: row:186.47 avs:3904) leaf: 0x9001534 151000372 (0: row:186.43 avs:3904) leaf: 0x9000b6a 150997866 (1: row:191.53 avs:3794)
Note how the leaf blocks now show the available space (avs:), and the nrow: N rrow: M have been collapsed to just row: N.M
The figures came from a test script run against 11.2.0.4 and 19.11.0.0 that did a large “random insert” followed by a bulk delete – so the data processing seems to have behaved consistently across those two versions to produce the (logically) identical results.
Hi Jonathan:
How did you caluclate the ITL number in these cases?
-this block is 100% “full”, but can only fit 327 entries because its ITL has grown to about 70 entries.
– In this test run, the ITLs grew to a worst case of about 112 entries
Thank you
Kumar
Comment by Kumar — August 18, 2009 @ 4:21 am BST Aug 18,2009 |
Two options –
Comment by Jonathan Lewis — August 18, 2009 @ 4:58 pm BST Aug 18,2009 |
Could you give a little more detail on the leftmost child branch block. From another JL article,
So this is a special case when “there is only room for a single entry in each leaf block?”
If that’s the only time, then this is more of handling a boundary condition than an optimization, no?
Comment by Mark Brady — August 18, 2009 @ 5:57 pm BST Aug 18,2009 |
Mark,
I’ve edited your comment to clarify that quote from my the article website (a 45KB word document).
It’s certainly a nice idea to consider, but history suggests otherwise. I can’t remember exact versions, but there was a time when the longest key was limited to about 40% of the block size (as mentioned here), and this particular trick was in place even in those days.
Comment by Jonathan Lewis — August 18, 2009 @ 8:01 pm BST Aug 18,2009 |
[…] 3-How to read treedump for what happened in indexes ? Jonathan Lewis – Treedump […]
Pingback by Blogroll Report 14/08/2009 – 21/08/2009 « Coskan’s Approach to Oracle — September 8, 2009 @ 12:07 pm BST Sep 8,2009 |
[…] (are you likely to have hot spots in the index, or a few very popular values) . Dump a few blocks (see the end of this note) from one of the higher numbered extents in the index to see if they have a lot of ITL entries, or […]
Pingback by Index Efficiency 3 « Oracle Scratchpad — March 3, 2010 @ 9:21 pm GMT Mar 3,2010 |
[…] start with an article on the treedump command, which I published a little while ago. This dumps a tracefile describing your index, one […]
Pingback by Treedump – 2 « Oracle Scratchpad — March 7, 2010 @ 6:34 pm GMT Mar 7,2010 |
Hi Jonathan,
I have a remark about dbms_utility.data_block_address_file(20976011) file#. If if I’m not mistaken, the treedump gives the relative DBA (data block address) of the leaves. But block dump takes the absolute file#.
So in your example dbms_utility.data_block_address_file(20976011) gives the rfile# that we need to convert into file# (from v$datafile – knowing the index segment’s tablespace) before using it for block dump. They were probably equals in your example.
Plase correct me if I’m wrong.
Regards,
Franck.
Comment by Franck Pachot — April 29, 2010 @ 7:19 am BST Apr 29,2010 |
Franck,
Thanks for that, it’s a very good point, and one that I had completely overlooked. I think you’re probably correct (it certainly makes sense), but I haven’t actually checked but I will remember to verify it one day.
I think the difference won’t be visible to many DBA’s, though; my memory of the change in file number formats is that Oracle rigged their implementation so that the relative file number and absolute file number were the same until you had about 1,024 files (possibly 1,022 if you want to be precise) – and there is an event you can set to break this connection so that you can test without having to create a thousand files.
Comment by Jonathan Lewis — May 2, 2010 @ 11:37 am BST May 2,2010 |
Hi, In fact I have tested it as I had to treedump (using our example) on a database that has file# != rfile#, and that’s how I discovered it.
It is not only the number of datafiles that makes them different: when you use transportable tablespaces, the relative file number do not change (or that would involve visiting all data blocks in the tablespace, and all rowids that reference a block within that tablespace…)
Comment by franckpachot — May 6, 2010 @ 12:21 pm BST May 6,2010 |
Franck,
Thanks for that comment – excellent point about the transportable tablespace as well, that’s another of those things that’s easy to forget.
I put in a “!=” on your comment, I’m guessing that the HTML parser took out a “less than, greater than” that you’d had there.
Comment by Jonathan Lewis — May 6, 2010 @ 6:20 pm BST May 6,2010 |
[…] treedump : A useful (though sometimes expensive) option for looking into index structures […]
Pingback by Index ITLs « Oracle Scratchpad — June 5, 2010 @ 8:58 am BST Jun 5,2010 |
Jonathan,
Few questions to understand the treedump…
1.I checked first 30K rows and last 1k rows(rows for leaf blocks) in my treedump and i found that ‘nrow’ is same as ‘rrow’ everywhere. How should i interpret it ?
2.snap of my treedump
Am i right in interpretation that,
there are 14 branch blocks at level 4 (just below root block),118 branch blocks at level-3,112 branch blocks are level-2 and 187 branch blocks at level-1.
First branch block(i.e. ‘-1’) at level-1 points to 205 leaf blocks…and so on…???
Comment by Bhavik Desai — December 6, 2010 @ 3:05 pm GMT Dec 6,2010 |
Some clarification.
The “root” block is just another type of branch block – so the first line is the root block which holds 14 entries which point to 14 blocks below it. The root block is at level 4 (blevel = 4, height = 5 for your index).
The first block pointed to by the root block is at level three, it holds 118 entries; there are 13 other branch blocks at this level
The first block pointed to by the first level 3 branch block is at level 2, it holds 112 entries; there are 117 branch blocks at this level pointed to by the same parent level 3. We could “guesstimate” that there are roughly 13 * 118 other blocks at level 2 (assuming each of the 14 level 3 branch blocks all hold the same number of pointers.
etc. etc. etc.
Comment by Jonathan Lewis — December 7, 2010 @ 7:22 am GMT Dec 7,2010 |
[…] snapshots of v$mystat, calls to “alter system flush buffer_cache”, and event 10200; the treedump can also be very helpful for identifying block […]
Pingback by Quiz Night « Oracle Scratchpad — January 23, 2011 @ 6:00 pm GMT Jan 23,2011 |
Oracle Index Rebuild – A Corner Case…
Regular index rebuilds in Oracle are a vital necessity – this old myth has been proven wrong long ago. But here is the story about an index where at first we did not understand why a rebuild reduced the size so heavily.After a blooper involving a DROP …
Trackback by BlogNotes to Myself — September 6, 2012 @ 8:20 pm BST Sep 6,2012 |
[…] big the LOBindex is. The first step I took was to check the object_id of the LOBindex and then do a tree dump (which showed 66 leaf blocks) and then I checked the segment header block and dumped that with the […]
Pingback by Filtering LOBs | Oracle Scratchpad — May 28, 2018 @ 8:26 am BST May 28,2018 |
[…] exactly fill two leaf blocks. I’ve captured the object_id of the index because I want to do a “treedump” of the index before and after inserting the next […]
Pingback by Index splits | Oracle Scratchpad — October 29, 2018 @ 1:28 pm GMT Oct 29,2018 |
[…] done a treedump of the secondary index and picked a leaf block address from the treedump and dumped that leaf […]
Pingback by IOT Bug | Oracle Scratchpad — December 16, 2019 @ 3:58 pm GMT Dec 16,2019 |
[…] and the height (and blevel) of the index; then I find it’s object_id so that I can do a treedump of […]
Pingback by OERI 6051 | Oracle Scratchpad — January 9, 2022 @ 9:27 pm GMT Jan 9,2022 |
[…] il_01 then the script completes very quickly (complaining, of course, about the attempt to do a treedump with a null level (ORA-49100: Failed to process event statement). Assuming you’re using an 8KB […]
Pingback by Index ITL Limit | Oracle Scratchpad — February 18, 2022 @ 5:02 pm GMT Feb 18,2022 |
[…] get a complete picture we’ll have to do some work with the dbms_space package, the index treedump command, dumping redo, and we also ought to take a look at v$rollstat and v$enqueue_stat. and […]
Pingback by drop partition | Oracle Scratchpad — August 5, 2022 @ 8:24 pm BST Aug 5,2022 |
[…] in each leaf block have been deleted. So that we know what the tests start from I’ve done a treedump of the index before and after the delete (and included a pause in the script to allow you to find a […]
Pingback by Shrinking indexes | Oracle Scratchpad — September 2, 2022 @ 7:22 pm BST Sep 2,2022 |