Oracle Scratchpad

December 16, 2011

I Wish

Filed under: Oracle,Statistics,Wishlist — Jonathan Lewis @ 6:31 pm BST Dec 16,2011

A couple of days ago I wrote about some things I’d like to see in index statistics, which means changing dbms_stats.gather_index_stats(). Here’s an idea for dbms_stats.gather_table_stats(). I’d like to see the number of chained and migrated rows counted separately in dba_tables when we run the packaged procedure. The optimizer will use information about chained or migrated rows, but the information is only collected if you use the analyze command (and even then the two figures are summed into the single value chain_cnt).
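For anyone who wants to see the gap for themselves, here’s a minimal sketch (the table name t1 is my own example): dbms_stats leaves chain_cnt untouched, while analyze fills it in with the combined figure.

```sql
-- Gathering stats with the package: chain_cnt is not collected
execute dbms_stats.gather_table_stats(user, 'T1')
select chain_cnt from user_tables where table_name = 'T1';

-- The analyze command does populate chain_cnt, but chained and
-- migrated rows are summed into the one figure
analyze table t1 compute statistics;
select chain_cnt from user_tables where table_name = 'T1';
```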

I have to say that it does look as if the information that Oracle would need to count (and distinguish between) chained rows and migrated rows is already available in the blocks. If we check a bit of a block dump we can see that various flags seem to tell us everything that the code would need to know. Here’s an extract from a table block dump, showing the start of three rows (technically row pieces) in the block.

tab 0, row 0, @0x1c79
tl: 9 fb: --H----- lb: 0x2  cc: 0
nrid:  0x0180000b.0

tab 0, row 1, @0x1952
tl: 807 fb: --H-FL-- lb: 0x0  cc: 4
col  0: [200]
 78 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 ...

tab 0, row 2, @0x162b
tl: 807 fb: --H-FL-- lb: 0x0  cc: 4
col  0: [200]
 78 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 ...

Notice particularly the “flag byte (fb)”, and the presence of one row reporting a “next rowid (nrid)”.
Amongst other things, the flag byte tells us what the row pieces represent.

H – this is a row header
F – this is the first row piece of a row
L – this is the last row piece of a row

These three flag values start us in the right direction. If we have a simple migrated row then we will have a head piece (in the original row position), pointing to the complete row. The head piece will have an H flag byte, and an nrid pointing to the actual row; when we get to the actual row its flag byte will show only the FL bits, and there will be an hrid (head rowid) associated with it, pointing back to the original location.

If we have a simple chained row, the first section will have the H and F bits set, and an nrid entry; any “middle” rowpieces will have an nrid, but no flag bits set, and the last row piece will have the L bit set, and no nrid or hrid.

If a row has migrated, and then chained after migrating, the first piece will have the H bit set, and an nrid entry; the next piece will have the F bit set and both an hrid and nrid; any middle pieces will have an nrid but no flag bits set; and the last piece will have the L bit set, but no nrid or hrid.

There may be other alternatives, of course: a row that first chains on an update, leaving a few columns behind, and then migrates those columns out of the initial location; there may be further variations when rows shrink and start to “unchain” or “migrate back (partially)”; and I haven’t even started to think about rows with more than 254 columns, which have “intra-block” chaining – i.e. several row pieces in the same block. But looking at the flags and the information about head/next row pieces, it looks as if a full scan of a table could count the number of chained rows and the number of migrated rows separately. If a piece has H only, the row has migrated (but may also have chained); if it has H and F, the row has chained (only). Pieces with no flag bits set, or with L only, don’t add much value unless we want to work out an average number of pieces per chained row. Pieces with F and L are the bodies of migrated rows, and pieces with F only belong to rows which have migrated and then chained.
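As a thought experiment, the classification in the paragraph above could be sketched as a tiny routine (this is purely my own illustration of the logic, not Oracle code; the fb strings copy the format of the block dump):

```python
def classify(fb):
    """Classify a row piece by its flag byte string, e.g. '--H-FL--'.

    Follows the rules in the text:
      H only      -> head piece of a migrated row (may also have chained)
      H+F (no L)  -> first piece of a chained (only) row
      F+L (no H)  -> the complete body of a migrated row
      F only      -> first real piece of a row that migrated, then chained
      L only      -> last piece of a chained row
      no bits     -> middle piece of a chained row
      H+F+L       -> a normal, intact row
    """
    h, f, l = 'H' in fb, 'F' in fb, 'L' in fb
    if h and f and l:
        return 'intact'
    if h and f:
        return 'chained only (first piece)'
    if h:
        return 'migrated (head piece)'
    if f and l:
        return 'migrated (row body)'
    if f:
        return 'migrated then chained (first piece)'
    if l:
        return 'chained (last piece)'
    return 'chained (middle piece)'

# The three pieces from the block dump above:
print(classify('--H-----'))   # -> migrated (head piece)
print(classify('--H-FL--'))   # -> intact
print(classify('--H-FL--'))   # -> intact
```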

Inevitably there could be accounting errors on sampling, though. We might find 10 rows with H only, but only 7 rows with FL; what assumption should we make about the three outstanding rows? Might they be just migrated, or could they be migrated and chained? In view of the room for error, it might be sensible to implement the counts only as part of the approximate NDV mechanism, which runs only if you choose the auto sample size option and always seems to do a full segment scan.
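To put some (hypothetical) numbers on the problem, a toy tally shows why a sample can only bound the answer:

```python
# Hypothetical sampled counts of row pieces:
h_only = 10   # head pieces of migrated rows (migrated, possibly then chained)
fl_only = 7   # complete bodies of simply-migrated rows

# Each FL piece accounts for one simply-migrated row; the remaining
# head pieces could be either simply migrated (body not sampled) or
# migrated-then-chained, so we can only bound the answer:
min_migrated_only = fl_only            # at least 7 simply-migrated rows
max_migrated_only = h_only             # at most 10
unresolved = h_only - fl_only          # 3 rows we can't classify
print(min_migrated_only, max_migrated_only, unresolved)
```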

Related articles.

Another difference between analyze and dbms_stats
Anomalies with counting continued row activity

8 Comments »

  1. This is an interesting topic. I wonder if Oracle might start storing the migrated-to address for single-piece rows in indexes. I wonder if Oracle might start storing (perhaps at user index definition, and/or with the system figuring out when it made sense) up to some n block addresses in the index for multi-piece rows. The behavior of fetching “the rest of the columns” when adaptive direct read kicks in is interesting. Monitoring column usage, and possibly allowing physical reordering of columns so the most-used columns are in the first piece, might be useful. I think these things tend to get lower priority because there is a tendency to think that people who use more than 254 columns in a table, or allow much row migration to take place, should be glad it works at all.

    I’m certainly glad you brought this all up. In the era of big data, it is the sort of remediation the database engine could do in response to defects that have accreted to the point where something needs rebuilding. Scratchpad has scratched the surface of this topic in a very important way. It could become a whole area of research and expertise.

    Comment by Mark W. Farnham — December 17, 2011 @ 12:38 am BST Dec 17,2011 | Reply

    • Mark,

      Every time I start to think about the options I start to worry about how easy it would be to do it wrong; there are so many considerations of timing and overheads, effects of migrating backwards or migrating to a third location – not to mention thoughts of data that ought to be well-clustered ending up badly scattered because of migration. I think I’d rather see a little more education of how to anticipate data growth, rather than an undercover mechanism that tries to work around the problems. (Think cursor_sharing!)

      Comment by Jonathan Lewis — December 19, 2011 @ 12:23 pm BST Dec 19,2011 | Reply

  2. I have a table with 978 columns and 1 row in it. As expected, it has stored the row as 4 row pieces (as the trailing columns are not null).

    Here is the dump of the block.

    block_row_dump:
    tab 0, row 0, @0x3e1b
    tl: 357 fb: --H-F--- lb: 0x0  cc: 255
    nrid:  0x1e0174e4.1
    tab 0, row 1, @0x3c18
    tl: 515 fb: -------- lb: 0x0  cc: 255
    nrid:  0x1e0174e4.2
    tab 0, row 2, @0x3ab4
    tl: 356 fb: -------- lb: 0x0  cc: 255
    nrid:  0x1e0174e4.3
    tab 0, row 3, @0x2fe9
    tl: 2763 fb: -----L-- lb: 0x0  cc: 213
    end_of_block_dump

    But when I access this table it says it has visited 2 blocks and 8 row pieces.
    Not sure why it is visiting 2 blocks when it got all 4 row pieces in 1 block, and 8 row pieces when it has only 4.

    table scan blocks gotten                                                     2
    table scan rows gotten                                                       8
    table scans (short tables)                                                   1
    

    Comment by raova — March 23, 2012 @ 5:20 pm BST Mar 23,2012 | Reply

    • Raova,
      I repeated your experiment using an 8KB block size (just in case it was an anomaly due to the 16KB block size) and I get the same result – however the behaviour was dependent on the structure of the table.

      It looks as if you rebuilt the table before reporting this test – is that the case? Your row-piece ordering is something I could reproduce only through ‘alter table XXX move’.

      Comment by Jonathan Lewis — March 31, 2012 @ 7:03 am BST Mar 31,2012 | Reply

  3. Any chance this made it into 12c? We have chained rows, intra-block chaining and migrated rows, and would love to have an easy way of detecting them. Currently we monitor sessions for “table fetch continued row” and report on the SQL to analyze the source later. Once identified, we use “analyze table list chained rows” to identify the culprits and attempt to guesstimate which category they fall in. If we guess they are migrated we then delete and re-insert the rows as part of a batched-up operation.

    Comment by Kenny — February 25, 2014 @ 12:58 pm BST Feb 25,2014 | Reply

    • Kenny,

      I haven’t noticed any enhancements in this area.

      On a potentially threatening note, though: 11.2.0.4 and 12.1 have a new parameter “_rowlen_for_chaining_threshold” which looks as if it may allow “longish” rows to chain rather than migrate – and that may confuse the issue even more.

      Comment by Jonathan Lewis — February 25, 2014 @ 6:09 pm BST Feb 25,2014 | Reply

      • Thanks, I’ll have to look into that parameter. It sounds like that might be good for rows whose first columns are accessed more frequently than the later ones; that would save a get in those cases. I’d really like to see an option to disable migration and incur the index maintenance costs instead. The extra reads every time we access the row are much more costly to us than if the original update would just delete all the index entries and re-insert them with a new rowid (or just update the rowid).

        Kenny

        Comment by Kenny — February 25, 2014 @ 6:23 pm BST Feb 25,2014 | Reply

    • When this is a chronic problem (it often happens when a skeleton of a row is inserted and other columns are born null and filled in later), it *may* be useful to insert “born as” values of the average length of the final values. For numeric columns you may also have to add a padding column that you update so that the sum of the vsize of the two columns stays constant. There is definitely a large engineering and update overhead in doing this to minimize row migration, but it is probably competitive with delete/insert pairs, which are the alternative. When this technique fits well with the column content and life cycle of rows in a table, it *may* also allow you to use a very small pctfree and still avoid migrated rows. Notice that using the default values stored in the dictionary is *not* useful for this purpose. Good luck. This is not something to undertake lightly, but it is very effective in some cases.

      Comment by rsiz — February 25, 2014 @ 6:52 pm BST Feb 25,2014 | Reply

