Oracle Scratchpad

January 14, 2014

Single block reads

Filed under: Infrastructure, Oracle, Performance — Jonathan Lewis @ 6:52 pm GMT Jan 14,2014

When a “cache read” tablescan (or index fast full scan) takes place we generally expect to see waits on “db file scattered read” as Oracle performs multi-block reads to do the scan. But we all know that Oracle will skip over blocks that are already in the cache and can therefore end up doing multi-block reads of many different sizes, even down to the point where it does single block reads (waiting for “db file sequential read”).
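
If you want to watch the mix of waits while such a scan is running, a simple (if crude) check is to query the scanning session's cumulative wait events before and after the scan. A minimal sketch; the SID (123 here) is a placeholder for the scanning session:

    select  event, total_waits, time_waited_micro
    from    v$session_event
    where   sid = 123
    and     event in ('db file scattered read', 'db file sequential read')
    order by
            event
    ;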

A quirky little question came up on OTN a little while ago: “for a large table we expect multiblock reads to be positioned at the end of the LRU for prompt re-use; but when Oracle does a single block read as part of a tablescan does it go to the end of the LRU (because it’s part of a large tablescan) or does it go to the mid-point of the LRU (because it’s a single block read)?”

The description of how blocks are treated in a tablescan has been simplified, of course, but the question is still valid – so what’s the answer, and how (without going into an extreme level of detail) would you demonstrate it?

 

8 Comments

  1. I think the simplest way is to set db_file_multiblock_read_count=1, run a full table scan with “_serial_direct_read” = never, and check x$bh, especially the tch, lru_flag and flag columns. With a table larger than 25% of the buffer cache we see the same TCH=0, flag=524288 whether the blocks arrived via db file scattered read or db file sequential read.
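
     A minimal sketch of that test, assuming a table big_table owned by test_user that is larger than 25% of the buffer cache (x$bh has to be queried as SYS):

         alter session set db_file_multiblock_read_count = 1;
         alter session set "_serial_direct_read" = never;

         select count(*) from big_table;

         -- as SYS: check the buffer headers for the table's blocks
         select  tch, lru_flag, flag, count(*)
         from    x$bh
         where   obj = (
                     select  data_object_id
                     from    dba_objects
                     where   owner = 'TEST_USER'
                     and     object_name = 'BIG_TABLE'
                 )
         group by
                 tch, lru_flag, flag
         ;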

    Comment by Vyacheslav Rasskazov — January 15, 2014 @ 4:45 am GMT Jan 15,2014

    • Vyacheslav,

      That looks good to me.

      If anyone needs to learn about the significance of the 25% and the tch columns (which are probably the important parts) there’s my blog item about them.

      And if anyone’s wondering about the “flag = 524288”, there’s an article on my blog that describes it, showing that bit 19 (dec 524,288) is defined as: “only_sequential_access” i.e. only ever read by a tablescan. (It’s a pity the person defining the flag didn’t talk to the person defining the db file waits.)
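
      As a quick sanity check on the arithmetic: bit 19 is 2^19 = 524,288, so a bitand() against x$bh.flag (again as SYS) shows which buffers are flagged for “only sequential access”:

          select  decode(bitand(flag, power(2,19)), 0, 'NO', 'YES')  only_seq_access,
                  count(*)
          from    x$bh
          group by
                  decode(bitand(flag, power(2,19)), 0, 'NO', 'YES')
          ;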

      One point that’s worth making about Vyacheslav’s test is this: if I had found tch=1 for those blocks I wouldn’t have been confident that I had reproduced the general case; it’s possible that setting db_file_multiblock_read_count to 1 triggered some other code path that changed the caching. On the other hand I would be reasonably confident that tch=0 was a result to be trusted (though that is, partly, confirmation bias coming into play).

      In my own tests I avoided setting any parameters to special values.

      Comment by Jonathan Lewis — January 15, 2014 @ 8:16 am GMT Jan 15,2014

      • Jonathan,

        I would also be curious whether setting db_file_multiblock_read_count to 1 changes some code path. For me a slightly safer way would be (see the sketch below):
        1) creating a table of, say, 1000 blocks
        2) capturing a rowid for every distinct dba
        3) reading every second block into the buffer cache (maybe several times)
        4) to be sure, disabling serial direct reads
        5) performing the FTS and checking the x$bh view (mostly the tch column)
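
        A sketch of steps 2 to 5 of that plan, assuming a suitably sized table t1 (the name, the three passes, and the full() hint are placeholders):

            declare
                type rowid_tab is table of rowid;
                l_rowids    rowid_tab;
                l_row       t1%rowtype;
            begin
                -- step 2: one rowid per distinct dba (file + block)
                select  min(rowid)
                bulk collect into l_rowids
                from    t1
                group by
                        dbms_rowid.rowid_relative_fno(rowid),
                        dbms_rowid.rowid_block_number(rowid)
                order by
                        min(rowid)
                ;
                -- step 3: read every second block into the cache, several times over
                for pass in 1 .. 3 loop
                    for i in 1 .. l_rowids.count loop
                        if mod(i,2) = 0 then
                            select * into l_row from t1 where rowid = l_rowids(i);
                        end if;
                    end loop;
                end loop;
            end;
            /

            alter session set "_serial_direct_read" = never;    -- step 4
            select /*+ full(t1) */ count(*) from t1;            -- step 5
            -- then, as SYS, check x$bh.tch for t1's data object id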

        Comment by Pavol Babel — January 15, 2014 @ 9:27 pm GMT Jan 15,2014

        • Pavol,

          That’s almost exactly what I did.

          I chose a table size that was between the 10% and 25% limits, with two rows per block; and I left 4 seconds between the indexed accesses I used to prime the cache (to make sure that the TCH was incremented on each range scan). I didn’t even have to disable direct reads explicitly, presumably because half the table was cached and that had enough of an effect on the calculations to make Oracle choose cached reads.
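
          For anyone reproducing the priming step, a sketch of the idea; t1, index t1_i1, and the id range are placeholders, and the 4-second pause matters because a buffer’s touch count is incremented at most roughly once every 3 seconds:

              begin
                  for pass in 1 .. 3 loop
                      for r in (
                          select  /*+ index(t1 t1_i1) */  padding
                          from    t1
                          where   id between 1 and 5000  -- roughly half the table
                      ) loop
                          null;
                      end loop;
                      dbms_lock.sleep(4);   -- allow tch to increment on the next pass
                  end loop;
              end;
              /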

          Comment by Jonathan Lewis — January 15, 2014 @ 9:33 pm GMT Jan 15,2014

  2. Maybe a little bit off-topic, but I would like to ask whether I’m the only one who started to hate buffered full scans after the release of 10g. Reusing the same buffers for tables larger than 2% of the cache was by far the best strategy in 9i.
    I have seen several reporting applications performing many FTS (the same tables are read several times for different reports). When dealing with FTS of tables between 20% and 25% of the cache they consume a large portion of the buffer cache (which makes the few reports using index scans very slow); moreover, parallel reads of the same table are slowed by “read by other session” or CBC latch waits. When you have an FTS of the same table, with 75% of its data in the SGA, in two different sessions, it is very often more effective to perform direct reads in both sessions because of the “read by other session” waits.
    Of course there is a dark side to serial direct reads, since delayed block cleanout is never made “permanent” (the cleaned-out copies are not written back).
    I think the 10g buffered-read behaviour makes sense only for a huge buffer cache (bigger than 256GB) on a database of a few terabytes, and at that scale you start having issues with NUMA affinity etc.

    Comment by Pavol Babel — January 16, 2014 @ 8:14 am GMT Jan 16,2014

    • Pavol,

      I think there are likely to be several different views on this that depend very much on the circumstances of what’s happening, and the mix of table sizes compared to the 2, 10, and 25% limits.

      I’ve seen some big problems with direct path serial reads because the same table was being scanned by several processes running different reports where caching the table would have helped; I’ve seen the opposite symptom where cached scans (as you say) keep flushing data being used “too slowly” by other reports.

      Arguably sessions aren’t slowed down by “read by other session”, they’re accelerated because they wait less time for a read (that they would have been doing anyway) to complete. Allowing 25% of the cache for large tablescans, though, does seem a little over the top – but it shouldn’t make any difference to the “cache buffers chains” latching which is dictated by the block address, not by the buffer address.

      I think Oracle’s in a no-win situation here – it’s got an “average” strategy that’s trying to avoid the worst excesses of bad decisions – so there are bound to be ways in which people can look at their systems and Oracle’s implementation and see “obvious” ways that Oracle could do things better.

      Comment by Jonathan Lewis — January 16, 2014 @ 8:52 am GMT Jan 16,2014

  3. I’ve seen this behaviour seriously penalising tablescan performance in an environment where, instead of one scattered read of N blocks, I got several smaller scattered reads; the elapsed time for each of the smaller chunks was no shorter than that of an N-block read. It seemed that the split happened because some blocks “in the middle” were found in the cache.
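
     If you have the Diagnostics Pack, one way to see that fragmentation is from ASH: for “db file scattered read” the p3 parameter records the number of blocks in the read request, so the distribution of p3 shows how the scan’s reads were broken up. A sketch:

         select  p3  blocks_per_read,
                 count(*)
         from    v$active_session_history
         where   event = 'db file scattered read'
         group by
                 p3
         order by
                 p3
         ;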

    Comment by MatteoP — January 20, 2014 @ 4:57 pm GMT Jan 20,2014

    • Thanks for the comment.

      It’s probably a fairly common occurrence, though perhaps not often highly noticeable. It’s easy to imagine that the “serial direct path tablescan” was introduced because of complaints from people reporting problems like yours.

      Comment by Jonathan Lewis — January 21, 2014 @ 9:57 pm GMT Jan 21,2014

