Oracle Scratchpad

August 3, 2017

Block Sizes

Filed under: Block Size,Infrastructure,Oracle — Jonathan Lewis @ 7:31 pm BST Aug 3,2017

Here’s an old draft based on a very old tpc-c test that I’ve only just rediscovered. There are a couple of brief comments at the end that I’ve added just before publishing, otherwise this is just as I wrote it years ago.

Several years ago I copied some results from a couple of TPC-C reports of what were then, if I recall correctly, the top performance results for the current Oracle version. I made the copy May 2010, but never got around to writing up a note commenting on them – but the comment I have to make is pretty timeless so here it is, only seven years late [update: now 10 years, presumably I wrote these notes in 2017]

One of the reports was for a Sun SPARC Enterprise T5440 Server Cluster running RAC on version 11.1.0.7 with 12 instances, the other for an HP Integrity Superdome-Itanium2/1.6GHz/24MB iL3 running single instance 10.2.0.3. The results aren’t intended as any sort of reflection on current Sun and HP equipment, or current performance; I ‘ve published them together to highlight one simple idea. How do you answer someone who asks questions like: how big should my cache size be, should I use a keep or recycle cache, should I use a different block size for my indexes ?

Take a look at these two sets of figures from (we assume) the best experts in their particular platform when they are trying to address exactly the same performance problem.

 Sun Sparc  HP Superdome
 db_2k_cache_size  60G
 db_[4k_]cache_size  2G  1G  (default cache)
 db_[8K_]cache_size 65G (default cache)  10G
 db_16K_cache_size  95G  476G
 db_keep_cache_size  210G  1220G
 db_recycle_cache_size  40G  16G

Clearly the HP has about 1783G allocated to the total buffer cache compared to “only” 415G on the Sun – a factor of just over 4 difference – so it’s not surprising that the values for the individual cache are rather different. What is significant, though, is that the values are not consistently different.

Why isn’t the Sun using the 2K at all – surely if it’s a good idea to use very small blocks for something it’s a good idea for both platforms. In fact it was a common claim in some quarters that small blocks were a really good idea for RAC and it was the Sun that was running RAC.  (Of course that doesn’t really matter if you can rig your system to avoid cross-instance traffic.)

Why is the default cache on the Sun 6 times the size of the 8K cache on the HP rather than being in scale at roughly 1/4; alternatively why is the default cache on the Sun 65 times the size of the default cache on the HP (remember some tablespaces have to match the default block size).

Similarly the recycle caches show the same reversal in their relative sizes, Why ?

Why, since the 16K block size has been seen as “quite nice” has neither platform gone to the even nicer 32K ? One of the loudly touted myths of the time was that bigger blocks were good for indexes – but neither test uses the largest possible block size.  (Actually, I have to note that I think the HP implementation didn’t allow for 32KB Oracle blocks, but I may be wrong about that.)

The purpose of this post isn’t to question exactly why the two configurations are significantly different for the same task – the purpose is to point out that picking a block size, or set of block sizes, or use of the keep and recycle caches, isn’t a trivial task with a knee-jerk right answer that you can get by posting a question on OTN or reading a couple of brief articles on a web-site.

tl;dr

If you have to ask for help choosing the best block size – use the default for your platform, and don’t mess with the keep and recycle unless you’re prepared to pay an expert to spend a little time helping you, and then spend quite a lot of time doing the tests that they should have suggested.

 

Addendum July 2020

It’s a very long time since I read any of the tpc-c reports – they tend to run to dozens of pages of description of exactly what hardware and software has been used, what scripts were used to populate tables, what parameters were set, and so on. One thing I do remember from reading a few of them is that they weren’t realistic attempts to implement a working system – they were only attempts to take a “realistic” data set but get it into the database in a way that would allow the tpc-c test to run as quickly as possible.

For example –

if you use a 1TB disc to store 100GB of data you can position the data on the fastest part of the disc

if you create a critical table as a single table hash cluster you can access all the rows for a single key value very quickly if you pick a block size that will hold exactly all the rows for one key. And when you notice that the table can store all the data for a key in a 2KB block size you minimise the pressure on the buffer cache by putting that table into its own 2KB buffer cache. The fact that you have to predefine the size of the table/cluster doesn’t matter if the tpc-c test doesn’t worry about what happens “tomorrow” as more data arrives. (Note: I recall that in one 10g or 11g tpc-c test Oracle used a “future” feature of range partitioned hash clusters – which finally appeared in 12c)

 

 

 

 

June 15, 2011

Block size

Filed under: Block Size,Infrastructure,Oracle — Jonathan Lewis @ 5:55 pm BST Jun 15,2011

I knew that failing to have a db_Nk_cache_size setting for your database could cause a statement to crash when it tried to address an object (or tablespace) using a non-standard block size, reporting errors like:

    ORA-29339: tablespace block size 16384 does not match configured block sizes”
    ORA-00379: no free buffers available in buffer pool DEFAULT for block size 16K

Here’s an interesting variation on the theme, reported in a note on the OTN database forum. Note particularly the ORA-603 and ORA-604 that wrap the ORA-379; and that the user states that the problem cache is the standard block size for the database. Unfortunately we never saw a resolution to this thread – perhaps it was simply a case of a cache that was too small when the database got very busy.

Footnote: a database can fail to open if it needs to do recovery in a tablespace for which there is no buffer set. Of course this is only likely to happen if you’re running with an init.ora file and have created a non-standard cache with ‘alter system’ calls while the database was previously up. Here’s an extract from an alert log showing the type of report you get:

Fri May 20 17:58:38 2011
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 2 processes
Started redo scan
Completed redo scan
 374 redo blocks read, 98 data blocks need recovery
Fri May 20 17:58:40 2011
Slave exiting with ORA-379 exception
Errors in file c:\oracle\diag\rdbms\d11g\d11g\trace\d11g_p000_2056.trc:
ORA-00379: no free buffers available in buffer pool  for block size 16K
Aborting crash recovery due to slave death, attempting serial crash recovery
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 374 redo blocks read, 98 data blocks need recovery
Aborting crash recovery due to error 379
Errors in file c:\oracle\diag\rdbms\d11g\d11g\trace\d11g_ora_3536.trc:
ORA-00379: no free buffers available in buffer pool  for block size 16K
ORA-379 signalled during: ALTER DATABASE OPEN...

September 25, 2009

Bugs

Filed under: ASSM,Block Size,Bugs,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 6:03 pm BST Sep 25,2009

There are times as I browse through Metalink when I see descriptions of bugs that make me wonder how on earth anyone managed to find them. There are bugs which are so bizarre in their combination of prerequisites that you might think they’d never,ever,  show up. Here’s one that got a mention on OTN some time back.

Problem: an update on a simple, unindexed, table takes 90 minutes if the table is in a tablespace using a 16KB block size; but closer to 90 seconds if the table is in a tablespace using a 4KB block size. The effect is totally reproducible.
(more…)

May 21, 2009

Row Directory

Filed under: Block Size,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 8:52 am BST May 21,2009

Yesterday I asked the question: how many row entries can you create in an 8KB block with pctfree 0. It brought out some interesting observations – including one I hadn’t thought of relating to a small difference between ASSM and freelist management.

I suspect, however, that most people didn’t quite realise the significance of the wording: “row entries”– I  wasn’t asking about “rows”. So here’s a little script you can run after you create the table, followed by the result of dumping the first block in the table.
(more…)

March 22, 2009

Block size – again

Filed under: Block Size,Infrastructure,Performance,trace files,Troubleshooting,undo — Jonathan Lewis @ 7:09 pm GMT Mar 22,2009

Here’s a little oddity that I came across at a client site recently.

The client called me in because they were having problems with Oracle error “ORA-01555: snapshot too old” appearing in a particular task after a few thousand seconds (typically 5,000 to 6,000) even though they had set the undo_retention to 14,400 seconds and had a huge undo tablespace running with autoextend enabled on the data files.

(more…)

October 28, 2008

IOTs and blocksize

Filed under: Block Size,Infrastructure,IOT,Oracle,Performance,Tuning — Jonathan Lewis @ 7:17 pm GMT Oct 28,2008

A question came up on the Oracle database forum a few months ago asking:

What are the benefits and the downside of using IOTs on 16k blocks? Would you recommend it?

I think the best response to the generic question about block sizing came from Greg Rahn in another thread on the forum:

If someone has to ask what block size they need. The answer is always 8KB.[1]

(more…)

July 19, 2008

Block Sizes

Filed under: Block Size,Infrastructure,Troubleshooting — Jonathan Lewis @ 7:56 pm BST Jul 19,2008

There is a recent thread on the OTN forum [ed Jan 2009: the link is temporarily broken] about block sizing and the impact it may have on performance. The thread contains some interesting insights into the problems of testing and trouble-shooting. Unfortunately the thread has accumulated so much extraneous material that it has become hard to separate the wheat from the chaff, so I thought I’d make it easy to follow a couple of useful discussions that ran through the thread by supplying a sequence of URLs that you can visit in turn.

[Ed Jan 2009]
According to this comment from Justin Kestelyn, extremely long messages were causing performance problems in the current version of the forum software and have have been temporarily archived to a separate location – and this thread contained some very long messages. Since it may be some time before it reappears on the forum I’ve posted a copy made just before it disappeared so that (until the original reappears) you can read the interesting bits here. Warning the file is a pdf of about 3MB – the file will open in a separate window so that you can read the file and my comments about the file at the same time. When I view the file it runs to 218 pages, so I’ve added
page references after the links in the blog note (the links, of course, do not currently work).

(When first created the document was 233 pages long, but following a request from Don Burleson I removed a couple of entries that quoted material from his website, then following a request from Janet Burleson I removed all entries made by Don Burleson, so some of the page references may still be a little inaccurate.)

(more…)

July 22, 2007

Combinations

Filed under: Block Size,Infrastructure,Troubleshooting — Jonathan Lewis @ 7:48 pm BST Jul 22,2007

A comment I always make in my seminar about Oracle technology is that the best way to find bugs is to combine two different bits of the technology.  It’s easy to say, of course, but how can you possibly guess which combinations are going to produce bugs.

Here’s a really surprising one:  Bug no: 3469992 – fixed in 10.2 (Mar 2006) so probably no longer relevant to most people.

 
Rediscovery Information: 
        If flashback query fails with ORA-600[15201] and db_block_size is  
        bigger than 8K, it is due to this problem.    

Workaround: 
        using smaller block_size 
 

How on earth are you supposed to guess that flashback query and blocksizes would have anything to do with each other ? After all, flashback query is just read-consistency from a different perspective – it couldn’t possibly go wrong !

Website Powered by WordPress.com.