Oracle Scratchpad

July 19, 2008

Block Sizes

Filed under: Block Size,Infrastructure,Troubleshooting — Jonathan Lewis @ 7:56 pm BST Jul 19,2008

There is a recent thread on the OTN forum [ed Jan 2009: the link is temporarily broken] about block sizing and the impact it may have on performance. The thread contains some interesting insights into the problems of testing and trouble-shooting. Unfortunately the thread has accumulated so much extraneous material that it has become hard to separate the wheat from the chaff, so I thought I’d make it easy to follow a couple of useful discussions that ran through the thread by supplying a sequence of URLs that you can visit in turn.

[Ed Jan 2009]
According to this comment from Justin Kestelyn, extremely long messages were causing performance problems in the current version of the forum software and have been temporarily archived to a separate location – and this thread contained some very long messages. Since it may be some time before it reappears on the forum I’ve posted a copy made just before it disappeared so that (until the original reappears) you can read the interesting bits here. Warning: the file is a PDF of about 3MB – it will open in a separate window so that you can read the file and my comments about it at the same time. When I view the file it runs to 218 pages, so I’ve added page references after the links in the blog note (the links, of course, do not currently work).

(When first created the document was 233 pages long, but following a request from Don Burleson I removed a couple of entries that quoted material from his website, then following a request from Janet Burleson I removed all entries made by Don Burleson, so some of the page references may still be a little inaccurate.)

Item 1: It’s hard to build a good test case.

This item starts with user sp009 being the first to construct and supply a demonstration (end of page 6: sp009) to support (or refute) some of the comments that had been flying around.

Unfortunately there was an important flaw (top of page 8: Lewis) in his example – he wasn’t running his queries, he was simply looking at execution plans, and there was a configuration error that would have spoiled the tests anyway.

So sp009 created the test again giving us two postings – one to report some details about configuration, data sizes and execution plans (top of page 12: sp009), and the other containing the tkprof summaries (end of page 15: sp009) for the test queries.

Unfortunately the tests still had a configuration mismatch (page 20: Lewis) which obscured the source of the variation in timing between the tests, and the “wait” information had not been captured.
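For context, “wait” information of the kind mentioned here is usually captured by enabling extended SQL trace before running the test queries. A minimal sketch of the standard approach (the trace file name and location depend on the instance configuration):

```sql
-- Enable extended SQL trace for the current session; level 8 of
-- event 10046 includes wait events in the trace file.
alter session set events '10046 trace name context forever, level 8';

-- ... run the test queries here ...

-- Switch tracing off again; the resulting trace file can then be
-- summarised with tkprof to show elapsed times and wait totals.
alter session set events '10046 trace name context off';
```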

Following through sp009 recreated the tests (top of page 33: sp009) and, since he was using 10g, allowed Oracle to pick a multiblock read count rather than specifying one.

Many of the figures from this test showed that the “infrastructure” of the test was now appropriate; but a number of anomalies (top of page 39: Lewis) showed up. For example, the tests of different block sizes showed a time difference of 3.22 seconds over 24.39 seconds, but the unaccounted time in the tests was three or four times the size of this difference.

At this point sp009 dropped out of the test cycle but Greg Rahn had also been supplying us with some test results and produced a test case (end of page 55: Rahn) modelled on the example created by sp009 but using a more realistic data set. These results showed much better internal consistency than the final set from sp009. Even so, the time difference between Greg’s results (0.06 seconds over 24 seconds) was less than the accounting error (0.1 seconds) for a given block size.

Almost inevitably there was room to argue that Greg Rahn’s test was not appropriate for the general case (end of page 56: Lewis), largely because he was running 11g and Oracle chose to use the new ‘serial direct reads’ feature, which means it bypassed the (cache buffers chains) latch activity that can be a significant fraction of the CPU expense of logical I/O.

Building good test cases is very difficult. You constantly have to ask yourself if you’re constructing a valid model of the problem you are investigating; you always have to think about what other reasons there might be for the results you are seeing; and you have to make sure you can explain any transient anomalies in the results. Peer review helps enormously – the omissions, or assumptions, that you’ve made may stand out clearly to someone who doesn’t start with your preconceptions when they examine your test case.

Item 2: Tests highlighting bugs (no links)
It has been pointed out a few times in the Oracle Forum and elsewhere on the Internet that if you set the db_file_multiblock_read_count to zero in 10g Oracle will actually use the value one – which is not usually a good value for “multi-block” reads.

BUT – as a consequence of various tests being run and the results being published on the thread by people using different versions of 10gR2, it transpired that this behaviour was evidence of a bug (number 5768025) fixed in 10.2.0.4.
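A sketch of the kind of check being run by contributors to the thread (behaviour varies with patch level, as the bug fix in 10.2.0.4 shows – this is illustrative, not a recommendation to set the parameter to zero):

```sql
-- On affected 10gR2 versions (before the 10.2.0.4 fix for bug 5768025)
-- setting the parameter to zero results in an effective value of 1,
-- i.e. single-block reads for "multi-block" read operations.
alter session set db_file_multiblock_read_count = 0;

-- Check what value the session is now reporting:
select value
from   v$parameter
where  name = 'db_file_multiblock_read_count';
```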

Item 3: When should testing stop?

It can be difficult deciding when to implement the best fix that you can come up with, when it’s possible that just “a little more time” will allow you to identify the real problem.

Steve Karam introduced a case (top of page 144: Karam) through a link to his blog where moving a table from a tablespace using a 16KB block size to a tablespace with a 4KB block size made an enormous difference to performance. (Starting with just a simple export/import to the new block size, the batch run dropped from 45 minutes to 2 minutes 20 seconds.)

Steve’s motive for trying this strategy was that there was another system already running with a 4KB block size that used the same code and completed quickly. It’s a good idea (especially when pushed for immediate results) to start by adopting the strategy of: “make this one look like the one that works and see what happens”.

Steve’s report, unsurprisingly, met with some disbelief – along the lines of “if it looks too good to be true, it is too good to be true” – and a couple of suggestions about why the performance might have changed.

But Steve had already covered the options suggested in some of the other tests he had run. Then he published some extracts of trace files taken from two runs of the job (top of page 148: Karam), one on a database using 4KB blocks, and one on a database using 16KB blocks.

The results suggested that the problem (top of page 149: Lewis) was not directly associated with the block size, although it might have been triggered by an unexpected side effect (for which read bug) of using 16KB blocks.

But at this point you may think that you’ve already addressed the root cause; and even if you don’t think this, you might look at what you’ve achieved so far and take the view: “I’ve made the problem go away; I don’t really know why it went away; but how much more time can I afford to spend understanding what’s really happening?” The other time-dependent question you should ask is: “If this really is the solution, how long is it going to take to implement on production?”

In this case the 4KB block size made the problem go away – if you were the production DBA would you move the whole database to a 4KB blocksize, or run with one table in a special tablespace and move other tables as other batch jobs showed the same symptoms? What if the problem came back – in the 4KB block size – perhaps in several months’ time when the data sizes are larger? It’s not an easy call to make.

Being a little curious, I tried to emulate Steve’s result by creating a test based on the brief description he had supplied. All I had to do was create a two-column table, then update every row … nothing dramatic happened. Then a couple of days later I had a flash of insight, and decided to modify the code so that my update changed a column from null to a non-null value (end of page 172: Lewis).

The problem seems to be a bug that you’ll notice only if you combine ASSM with larger block sizes and an unsuitable setting of pctfree in a table. And just to finish it off, Mark Williams tracked down (top of page 178: Williams) a possible reference to the bug on Metalink.
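A sketch of the kind of test described – the table name, column definitions, row count and tablespace name are all my own illustrative assumptions, not taken from the original test case:

```sql
-- A two-column table, created in an ASSM tablespace using a 16KB block
-- size (tablespace name assumed), with a default-style pctfree setting.
create table t1 (
    id      number,
    padding varchar2(100)
)
pctfree 10
tablespace assm_16k;

-- Populate the table, leaving the second column NULL in every row.
insert into t1 (id, padding)
select rownum, null
from   dual
connect by level <= 100000;

commit;

-- The critical step: update every row from NULL to a non-NULL value,
-- forcing each row to grow and stressing the pctfree/ASSM combination.
update t1 set padding = rpad('x', 100, 'x');
commit;
```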

One of the difficulties with solving problems with Oracle is that sometimes you don’t know whether you’ve really solved a problem or merely postponed it for a while. It helps if you know Oracle so well that you can say both:

  • “that behaviour is not reasonable”
  • “that approach should not have been a solution”

It helps if you can give yourself a reasonable explanation of why a solution should work. If you can’t do that then maybe your solution isn’t really addressing the root cause. Even then, given the time restrictions imposed on most production DBAs, a temporary workaround may be better than nothing.

Final Comment

If you have to ask the question “What block size should I use?” the answer is probably 8KB.

There are cases where you may get a small benefit from picking the “perfect” block size for your system – but for most of the systems I’ve seen, there’s usually been a lot more benefit to be gained from addressing the big performance problems before trying to play games with changing block sizes.

If you want to see a more thorough investigation of how things can change (and the scale of that change) as you use different block sizes, Charles Hooper has a long series of demonstrations running through the entire thread.

10 Comments »

  1. Great synopsis and distillation Jonathan, will help no end source the light through all the heat.

    Comment by SeánMacGC — July 21, 2008 @ 11:08 am BST Jul 21,2008 | Reply

  2. It’s good to see things haven’t changed much from the days of Oracle 7.3.x when the block-size discussion was a hot topic for Unix platforms. I can remember briefly discussing this topic with Jonathan (back in 1997, UK, Surrey).

    Comment by Tom Jurcic — July 22, 2008 @ 4:32 am BST Jul 22,2008 | Reply

  3. What’s “shocking” is the fact that, many times, we build a table with no modification on the PCTFREE: we use standard values and “hope for the best”.

    Nothing new: old problems still around! :)

    Recently I worked on a project where many tables were “read only” and the fact table was a fat-quasi-only-insert one.

    We chose to set a very low PCTFREE (about 1-2%) and suggested a db-block size of 8K. Ehm…standard db_file_multiblock_read_count for us.

    In production this became PCTFREE 10% and db-block 32K! :|

    Time for a “reorg”?

    Comment by lascoltodelvenerdi — July 23, 2008 @ 9:16 am BST Jul 23,2008 | Reply

  4. Very good summary indeed, Jonathan! Everybody embarking on Oracle should probably read this.

    For those lured to this posting by the “block size” keyword, Richard Foote has some excellent blog entries about whether to separate indexes to tablespaces with different block sizes, e.g. http://richardfoote.wordpress.com/2008/03/26/store-indexes-in-a-larger-block-tablespace-height-reduction-12-myth-five-foot-one/

    Comment by Robert Klemme — August 8, 2008 @ 7:08 am BST Aug 8,2008 | Reply

  5. […] was from a post I wrote on the Oracle Forum: Database – General. I recommend that you read Jonathan Lewis’ summarization of the thread instead of reading all 671 posts (as of today). You will spend much less time and get […]

    Pingback by Automatic DB_FILE_MULTIBLOCK_READ_COUNT | Structured Data — August 14, 2008 @ 12:02 pm BST Aug 14,2008 | Reply

  6. […] how to implement them efficiently. Ready to get deep into the internals? Jonathan Lewis has a good writeup on block sizes and how they can affect your application. For the dbas who work with Oracle RAC, […]

    Pingback by Log Buffer #107: A Carnival of the Vanities for DBAs — February 20, 2009 @ 9:15 pm GMT Feb 20,2009 | Reply

  7. A thread that you might want to comment on ?

    http://forums.oracle.com/forums/thread.jspa?threadID=900977&tstart=0

    Apparently, Indexes in a 16K block tablespace are very large. But we haven’t determined why.

    Comment by Hemant K Chitale — May 18, 2009 @ 9:11 am BST May 18,2009 | Reply

  8. […] Summary of an OTN forums thread – what was likely the longest thread ever on the topic of block sizes (and very likely multiple block sizes in the same database) from June 2008.  The message thread was too large to be supported on the new OTN software due to performance reasons.  Fortunately, Jonathan Lewis obtained a copy of the thread content in a PDF file: https://jonathanlewis.wordpress.com/2008/07/19/block-sizes/ […]

    Pingback by Faulty Quotes 5 – Block Sizes « Charles Hooper's Oracle Notes — January 31, 2010 @ 6:04 am GMT Jan 31,2010 | Reply

  9. Nice explanation…

    Thanks,

    Rafi.

    Comment by Rafi — April 15, 2010 @ 6:22 am BST Apr 15,2010 | Reply

  10. […] (also the technical editor of this book) reported fixing the problem, and not the book author (reference  reference2  reference3  […]

    Pingback by Book Review: Oracle Tuning: The Definitive Reference Second Edition « Charles Hooper's Oracle Notes — November 28, 2010 @ 5:18 pm GMT Nov 28,2010 | Reply



Comments and related questions are welcome.
