This is part 2 of an article on the KEEP cache. If you haven’t got here from part one you should read that first for an explanation of the STATE and CUR columns of the output.
Here’s a little code to demonstrate some of the problems with setting a KEEP cache – I’ve set up a 16MB cache, which gives me 1,996 buffers of 8KB in 10.2.0.3, and then created a table that doesn’t quite fill that cache. The table is 1,900 data blocks plus one block for the segment header (I’ve used freelist management to make the test as predictable as possible, and fixed the pctfree to get one row per block).
create table t1 pctfree 90 pctused 10 storage (buffer_pool keep) as with generator as ( select --+ materialize rownum id from dual connect by rownum <= 10000 ) select rownum id, lpad(rownum,10,'0') small_vc, rpad('x',1000) padding from generator v1, generator v2 where rownum <= 1900 ; alter system flush buffer_cache; -- scan the table to load it into memory select /*+ full(t1) */ count(small_vc) from t1 where id > 0 ; -- check the content of x$bh from another connection -- update every fifth row (380 in total) update t1 set small_vc = upper(small_vc) where mod(id,5) = 0 ; -- check the content of x$bh from another connection
The query I ran under the SYS account was this:
select bpd.bp_blksz, bpd.bp_name, wds.set_id, bh.state, cur, ct from ( select set_ds, state, bitand(flag, power(2,13)) cur, count(*) ct from x$bh group by set_ds, state, bitand(flag, power(2,13)) ) bh, x$kcbwds wds, x$kcbwbpd bpd where wds.addr = bh.set_ds and bpd.bp_lo_sid <= wds.set_id and bpd.bp_hi_sid >= wds.set_id and bpd.bp_size != 0 order by bpd.bp_blksz, bpd.bp_name, wds.set_id, bh.state, bh.cur ;
In my test case this produced two sets of figures, one for the DEFAULT cache, and one for the KEEP cache but I’ve only copied out the results from the KEEP cache, first after the initial tablescan, then after the update that affected 380 blocks:
BP_BLKSZ BP_NAME SET_ID STATE CUR CT ---------- ------- ---------- ----- ----- ---------- 8192 KEEP 1 0 0 95 1 0 1901 ******* ********** ---------- sum 1996 BP_BLKSZ BP_NAME SET_ID STATE CUR CT ---------- ------- ---------- ----- ----- ---------- 8192 KEEP 1 1 0 1462 1 8192 380 3 0 323 ******* ********** ---------- sum 1996
In the first output you see the 1901 buffers holding blocks from the table (1,900 data plus one segment header), with the remaining 95 buffers still “free” (state 0). The table blocks are all shown as XCUR (state 1, exclusive current)
In the second output you see 380 buffers holding blocks with state ‘XCUR’ with bit 13 of the flag column set, i.e. “gotten in current mode”. These are the 380 blocks that have been updated – but there are also 323 blocks shown as CR (state 3, “only valid for consistent read”). A detailed check of the file# and dbablk for these buffers shows that they are clones of (most of) the 380 blocks in the XCUR buffers.
Do a bit of arithmetic – we have 1462 blocks left from the original tablescan, plus 380 blocks in CUR mode (of which there are 323 clones) for a total of 1,842 blocks – which means that 59 blocks from the table are no longer in the cache. As we clone blocks we can lose some of the blocks we want to KEEP.
Unfortunately for us, Oracle has not given any preferential treatment to buffers which hold blocks in the XCUR state – any buffer which reaches the end of LRU chain and hasn’t been accessed since it was first loaded will be dumped so that the buffer can be used to create a clone (but see footnote). This means that a constant stream of inserts, updates, deletes, and queries could result in lots of clones being created in your KEEP cache, pushing out the data you want to keep.
If you want to size your KEEP cache to minimise this effect, you probably need to start by making it somewhat larger than the objects it is supposed to KEEP, and then checking to see how many clones you have in the cache – because that will give you an idea of how many extra buffers you need to stop the clones from pushing out the important data.
When I wrote and ran the test cases in this note the client was running Oracle 10.2 – while writing up my notes I happened to run the test on 126.96.36.199 (still using freelists rather than ASSM) and got the following output from my scan of the KEEP cache:
BP_BLKSZ BP_NAME SET_ID STATE CUR CT ---------- ------- ---------- ----- ----- ---------- 8192 KEEP 1 1 0 1901 3 0 91 ******* ********** ---------- sum 1992
Apart from the fact that you get slightly fewer buffers per granule in 11g (the x$bh structure has become slightly larger – and x$bh is a segmented array where each segment shares the granule with the buffers it points to) you can see that we only have 91 clones in the KEEP cache, and apparently we’ve managed to update our 380 blocks without changing their flag to “gotten in current mode”. Doing an update is, of course, just one way of making clones appear – but perhaps 11g will generally have more success in keeping current versions of blocks in memory than earlier versions.
There is , unfortunately, a very special feature to this test case – it’s using a single tablescan to update the table. So having said in part 1 that I was going to write a two-part article, I’ve got to this point, done a few more tests, and decided I need to write part three as well. Stay tuned.
Footnote: Technically there are a couple of circumstances where Oracle will bypass the buffer and walk along the LRU chain looking for another block – but I’ve excluded them from this demonstration.