Oracle Scratchpad

January 29, 2007

Small Partitions

Filed under: Infrastructure,Partitioning,Performance — Jonathan Lewis @ 8:09 pm GMT Jan 29,2007

Here’s a little example of how technology catches you out.

You have a table of 1,000,000 blocks in your system and from time to time it is necessary to run a tablescan through it. It’s not nice, but it only happens once a day, and it’s not worth building an index to create an alternative access path. On the plus side, since the table is much larger than 2% of the number of buffers in the cache, the scans don’t cause much damage to the cache because Oracle performs large scans by constantly recycling a few buffers from the tail-end of the LRU (least recently used) list.

[Update: according to a note written by Dion Cho (see comments), that "few" buffers is likely to be 25% of the default cache in any recent version of Oracle; the 2% has changed its meaning dramatically since 8i and I haven't been keeping up.]
[Update again: Some notes I wrote a couple of years later about small tables and the variation in treatment depending on size.]

Then one day you decide to treat yourself by licensing the partitioning option, and you partition this big table into 128 separate pieces. The next time the tablescan runs, it thrashes your buffer cache to death, because each partition is now only 8,000 blocks long, which just happens to be a little less than that critical 2% – so the blocks are loaded to the mid-point rather than the end-point of the LRU list. Bad luck – the critical 2% relates to the size of the individual segments, not to the size of the table.
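Before partitioning, a quick sanity check along these lines would have flagged the problem. The sketch below compares each partition's block count against 2% of the default buffer pool; the table name BIG_TABLE is hypothetical, and the 2% cut-off is an illustrative assumption rather than a guaranteed formula (as the discussion below shows, the real threshold is version-dependent):

```sql
-- Illustrative sketch only: BIG_TABLE is a hypothetical table, and the
-- flat 2% cut-off is an assumption - the real "small table" threshold
-- varies by version. The blocks column comes from optimizer statistics.
select tp.partition_name,
       tp.blocks,
       case
         when tp.blocks < (select 0.02 * buffers
                           from   v$buffer_pool
                           where  name = 'DEFAULT')
         then 'small - may load to LRU mid-point'
         else 'large - recycled from LRU tail'
       end as likely_treatment
from   user_tab_partitions tp
where  tp.table_name = 'BIG_TABLE';
```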

And another thing:

The fact that a “small” table essentially means one that is less than 2% of the size of the buffer cache is fairly well known. But is that 2% of:

  • The blocks in the standard default cache
  • The number of blocks in the keep, recycle and standard default caches combined
  • The number of blocks in all the default caches combined – by number of blocks
  • The combined memory size of all the default caches, divided by the size of the default block
  • The number of blocks in all the caches combined – by number of blocks
  • The combined memory size of all the caches, divided by the size of the default block
  • Other

And if you are using ASMM (automatic SGA memory management), does the 2% vary as the buffer cache changes, or does it fix itself at the size set by the hidden parameter __db_cache_size when the database starts up, or at some figure dictated by the setting of the sga_target parameter?
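One way to start answering these questions empirically is to read the relevant hidden parameters directly. The query below is the commonly circulated form of the x$ join; it requires a SYS connection, and the x$ structures are undocumented and may vary by version:

```sql
-- Requires SYSDBA. x$ksppi / x$ksppcv are undocumented internals;
-- column names and behaviour may differ across versions.
select n.ksppinm  as parameter,
       v.ksppstvl as value
from   x$ksppi n, x$ksppcv v
where  n.indx = v.indx
and    n.ksppinm in ('_small_table_threshold', '__db_cache_size');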

It’s amazing how many little details you have to check when you start looking a little closely – and one day I may have to look that closely, but tonight’s not the night.


  1. hello Jonathan,
    so, that begs the question: is there any way to either force data loading into a specific part of the LRU, or tweak the 2% m(tr)agical number? ;-) Or perhaps it is simply better to build bigger partitions – but then again, one day you’ll do a full partition scan, so the bigger the partition the more time it will take… always these darn tradeoffs…

    Which begs another question… how did you determine the LRU load location? ;-) I love ’em scripts!

    Comment by cos — January 29, 2007 @ 8:23 pm GMT Jan 29,2007 | Reply

  2. Cos,
    You can specify a table to be NOCACHE (although that may be a deprecated option nowadays, I’d have to check), and there is a NOCACHE hint. There is also a hidden parameter which, off the top of my head, is _small_table_threshold, to specify the size at which a table is no longer considered “small”.
    The mid-point/end-point strategies are documented in one of the standard manuals. I checked what was happening by the effects, not by peeking into memory. Simply count how many blocks from an object were in the buffer after the scan – with a carefully sized cache and set of partitions to make it easy:

    select state, objd, count(*) from v$bh group by state, objd;

    You have to be a bit careful if you have lots of “free” buffers around, though, as this introduces a special (hence misleading) case.
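    A variation of the query above maps the counts back to object names, which makes the results easier to read (this assumes access to dba_objects; v$bh.objd corresponds to data_object_id):

    ```sql
    -- Map buffer counts to object names. Assumes SELECT access on
    -- dba_objects; objd in v$bh matches data_object_id, not object_id.
    select o.object_name,
           b.state,
           count(*) as buffers
    from   v$bh b, dba_objects o
    where  b.objd = o.data_object_id
    group  by o.object_name, b.state
    order  by buffers desc;
    ```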

    Comment by Jonathan Lewis — January 29, 2007 @ 8:36 pm GMT Jan 29,2007 | Reply

  3. “bit careful if you have lots of “free” buffers around”

    ahh…. same “index clustering factor” analogy somewhat, but at the memory level, I take it ;-)

    well, if I hang around your blogsite much longer, I will have memorized all the v$ and gv$ views by heart… ;-)

    Which begs a question about waits and wait events: how much of your consulting time, as an approximate percentage, do you spend looking at wait events and such? I know Cary advocates this a lot, if I’m not mistaken, as a de-facto procedure… but I think it’s overkill in many scenarios.


    Comment by cos — January 29, 2007 @ 10:25 pm GMT Jan 29,2007 | Reply

  4. Whether or not the table is partitioned, one might consider setting “_serial_direct_reads”=true to avoid buffer cache issues entirely. Setting a low degree of parallelism might be more robust, if that option is available. The checkpoint ought not to be an issue for an occasional query.
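    The two alternatives mentioned here might look like this in practice (the hidden parameter is undocumented, so treat this as a test-system sketch only, and note that big_table is a hypothetical name):

    ```sql
    -- Option 1: force direct-path reads for serial full scans
    -- (undocumented parameter - bypasses the buffer cache entirely).
    alter session set "_serial_direct_reads" = true;

    -- Option 2: a low degree of parallelism, which also tends to use
    -- direct reads and so avoids flooding the cache.
    select /*+ parallel(t, 2) */ count(*) from big_table t;
    ```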

    Comment by David Aldridge — January 30, 2007 @ 12:29 am GMT Jan 30,2007 | Reply

  5. alrighty then!
    I’m officially starting my list of undocumented Oracle parameters and v$/gv$ views.
    Documented and supported; undocumented but supported; undocumented and unsupported. Any other categories, such as undocumented, unsupported and highly unstable – like the __* double-underscore ones? ;-)

    Comment by Cos — January 30, 2007 @ 12:37 am GMT Jan 30,2007 | Reply

  6. This is why all these “auto everything” options are so dangerous and why I point blank refuse to use them.

    Folks think it’s just a matter of “set and forget”, helped by liberal dousings of the “we don’t need no dbas” nonsense. Unfortunately, maths are not something that goes away at the click of a button. And cache apportioning and its use patterns are definitely the domain of maths.

    Cache mechanisms and use patterns are something that one wants to keep under strict control. This essentially means: use one set of values and LOCK it in. Measure the effects, deduce a better config, make the change and LOCK it in. Rinse and repeat until diminishing returns creep in. Review once a year or whenever there is a known change of use pattern. Monitor for such changes.

    Basic feedback loop engineering principles really, which seem to have been forgotten by the folks who write this “auto” stuff. There is much more to feedback than this, but here is not the place to talk about it.

    Thanks for the alert, Jonathan: much appreciated.

    Comment by Noons — January 30, 2007 @ 3:37 am GMT Jan 30,2007 | Reply

  7. @Jonathan

    I’m thinking about the fact that we can define tablespaces (and caches) with different block sizes.

    As these are strictly related, I hope that a table is deemed “small” with reference to the tablespace where it “lives”.

    So I would vote for
    “The number of blocks in the keep, recycle and standard default caches based on the (related) tablespace block size”

    I remember (I read it on your website or in your book) that access to tablespaces with a non-default block size is “divided” by the standard one… so the same could apply to the cache…

    Comment by Antonio — January 30, 2007 @ 8:00 am GMT Jan 30,2007 | Reply

  8. Excellent

    Comment by Mirjana — January 30, 2007 @ 9:47 am GMT Jan 30,2007 | Reply

  9. Cos, you missed “documented but unsupported” ;-)

    David, I’d be cautious about _serial_direct_reads, but the parallel option is viable: provided you realise it will (at least) double the impact on your disks – albeit for a shorter period of time.

    Noons: I’m in favour of the same approach – although the cost/benefit analysis for trivial systems often comes out in favour of throwing excess hardware at it and using “fire and forget”.
    Question: Should one say “Maths are not something that …”, or “Maths is not something that …”
    Answer: Don’t ask me, I teach Maths, not English. :-)

    Antonio, facts cannot be derived by hoping or by voting – a fact that people from all shades of the political spectrum find hard to appreciate. At some stage, hard science needs to be applied without pre-conceptions.

    Comment by Jonathan Lewis — January 30, 2007 @ 12:05 pm GMT Jan 30,2007 | Reply

  10. @Jonathan

    You’ll get better results if you know what you want from the start.

    That’s true even for “hard science”! :-)

    Mine was not a pre-conception, mine was a hypothesis… well, sort of! :-)

    Comment by Antonio — January 30, 2007 @ 1:02 pm GMT Jan 30,2007 | Reply

  11. Antonio, I think I’d prefer the statement: “you waste less time if you know where to look”, but the down-side to that is that you tend to look for positive re-inforcement and forget to look for (a) the contra-indications and (b) the alternative solutions.
    I knew what you meant, I was just taking advantage of the words you used to make a serious point.

    Comment by Jonathan Lewis — January 30, 2007 @ 4:15 pm GMT Jan 30,2007 | Reply

  12. Us ‘mericans would say “Math is not something that…” ;-) What’s “Maths”? ;-)

    Comment by Mark Bobak — January 31, 2007 @ 6:21 am GMT Jan 31,2007 | Reply

  13. “positive re-inforcement” ≈ ‘Confirmation Bias’, which is always something to be very mindful of, regardless of the sphere.

    Comment by SeanMacGC — January 31, 2007 @ 8:48 am GMT Jan 31,2007 | Reply

  14. What’s “Maths”?

    “Maths” is somewhat of a colloquialism or archaism… it is/was used fondly, as a figure of speech, in some countries in Europe. I have not heard the expression in a long while… but it’s a cute expression, nonetheless. In a more real sense, it means algebra, calculus, trigonometry, etc… the many “maths” sciences ;-)

    Comment by cos — January 31, 2007 @ 9:03 pm GMT Jan 31,2007 | Reply

  15. “maths” or “math” – I think that regardless of the supposed “age” of the term, the final effect is the same. IOW, one ignores either of them at a very high cost.
    Oh, and yes: in Australia, it’s called Maths. Or Mathematics, if you insist on a precise term.

    Comment by Noons — February 1, 2007 @ 3:59 am GMT Feb 1,2007 | Reply

  16. This is a typical example of when partitioning is a bad choice, regardless of partition size.

    Comment by Andjelko Miovcic — April 29, 2013 @ 8:59 pm GMT Apr 29,2013 | Reply
