A couple of days ago I wrote about some things I’d like to see in index statistics, which means changing dbms_stats.gather_index_stats(). Here’s an idea for dbms_stats.gather_table_stats(). I’d like to see the number of chained and migrated rows counted separately in dba_tables when we run the packaged procedure. The optimizer will use information about chained or migrated rows, but the information is only collected if you use the analyze command (and even then the two figures are summed into the single value chain_cnt) .
I have to say that it does look as if the information that Oracle would need to count (and distinguish between) chained rows and migrated rows is already available in the blocks. If we check a bit of a block dump we can see that various flags seem to tell us everything that the code would need to know. Here’s an extract from a table block dump, showing the start of three rows (technically row pieces) in the block.
tab 0, row 0, @0x1c79 tl: 9 fb: --H----- lb: 0x2 cc: 0 nrid: 0x0180000b.0 tab 0, row 1, @0x1952 tl: 807 fb: --H-FL-- lb: 0x0 cc: 4 col 0:  78 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ... tab 0, row 2, @0x162b tl: 807 fb: --H-FL-- lb: 0x0 cc: 4 col 0:  78 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ...
Notice particularly the “flag byte (fb)”, and the presence of one row reporting a “next rowid (nrid)”.
Amongst other things, the flag byte tells us what the row pieces represent.
H – this is a row header
F – this is the first row piece of a row
L – this is the last row piece of a row
These three flag values start us in the right direction. If we have a simple migrated row then we will have a head piece (in the original row position), pointing to the complete row. The head piece will have an H flag byte, and an nrid pointing to the actual row; when we get to the actual row its flag byte will show only the FL bits, and there will be an hrid (head rowid) associated with it, pointing back to the original location.
If we have a simple chained row, the first section will have the H and F bits set, and an nrid entry; any “middle” rowpieces will have an nrid, but no flag bits set, and the last row piece will have the L bit set, and no nrid or hrid.
If a row has migrated, and then chained after migrating, the first piece will have the H bit set, and an nrid entry; the next piece will have the F bit set and both an hrid and nrid; any middle pieces will have an nrid but no flag bits set; and the last piece will have the L bit set, but no nrid or hrid.
There may be other alternatives, of course, e.g. a row that first chains on an update, leaving a few columns behind, and then migrates those columns out of the initial location; then there may be other variations when rows shrink and start to “unchain” or “migrate back (partially)”; and I haven’t even started to think about rows with more than 254 columns which have “intra-block” chaining – i.e. several row pieces in the same block. But looking at the flags and information about head/next row pieces, it looks as if a full scan of a table could identify the number of chained rows and the number of migrated rows separately. If it has an H only it has migrated (but may also have chained), if it’s got an HF it has chained (only). Any rows with no flag setting or L don’t add much value unless we want to work out an average number of pieces per chained row. Any rows with FL are the migrated rows. Any rows with F only are rows which have migrated and then chained.
Inevitably there could be accounting errors on sampling, though. We might find 10 rows with H only, but only 7 rows with FL; what assumption should we make about the three outstanding rows ? Might they be just migrated, or could they be migrated and chained ? In view of the room for error, it might be sensible to implement the counts only as part of the approximate NDV mechanism that only runs if you choose the auto sample size option, and always seems to do a full segment scan.