A comment on one of my early blogs about the 12c in-memory database option asked how Oracle would deal with read-consistency. I came up with a couple of comments outlining the sort of thing I would look for in a solution, and this note is an outline on how I started to tackle the question – with a couple of the subsequent observations. The data is (nearly) the same as the data I generated for my previous article on the in-memory database (and I’m running 220.127.116.11, of course):
create table t1 nologging as select * from all_objects where rownum <= 50000 ; insert /*+ append */ into t1 select * from t1; commit; insert /*+ append */ into t1 select * from t1; commit; insert /*+ append */ into t1 select * from t1; commit; begin dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1'); end; / alter table t1 inmemory priority high memcompress for query low inmemory memcompress for query high (object_type) ;
In this case I’ve made the inmemory priority high and I haven’t set any column to “no inmemory” although I have made one column different from the rest (v$_im_column_level doesn’t get populated unless there is some variation across columns). I have to say I couldn’t get very consistent behaviour in terms of when the data finally got into memory with this table creation – possibly something to do with using “alter table” rather than “create table” – but a second “alter table t1 inmemory;” seemed to do the trick if Oracle was playing hard to get.
Once I’d checked that the table was in memory I collected performance figures from v$mystat and v$session_event for the following query:
select /* Test Run */ last_ddl_time from t1 where t1.created > trunc(sysdate) and t1.object_type = 'TABLE' and t1.subobject_name is not null ;
Once I was satisfied that the in-memory option was working correctly, I went through the following steps:
- Session 1: set transaction read only;
- Session 1: run the query and collect performance figures
- Session 2: do several small, committed, updates, modifying a total of 30 or 40 random rows
- Session 2: Flush the buffer cache – so that we can see future block acquisition
- Session 1: re-run the query and collect performance figures – compare and contrast
The effect of the “set transaction read only;” was to force the session to do some extra work in the second execution of the query to make the data read-consistent back to the start of the “transaction”. The results were as follows (don’t forget that some of the numbers will relate to the action of collecting the performance figures):
First execution =============== Name Value ---- ----- Requests to/from client 4 opened cursors cumulative 5 user calls 6 recursive calls 3 session logical reads 6,680 DB time 1 non-idle wait count 4 consistent gets 3 consistent gets from cache 3 consistent gets pin 3 consistent gets pin (fastpath) 3 logical read bytes from cache 24,576 calls to kcmgcs 7 calls to get snapshot scn: kcmgss 1 table scans (long tables) 1 table scans (IM) 1 IM scan CUs memcompress for query low 1 session logical reads - IM 6,677 IM scan bytes in-memory 5,155,309 IM scan bytes uncompressed 45,896,824 IM scan CUs columns theoretical max 18 IM scan rows 399,984 IM scan rows optimized 399,984 IM scan CUs split pieces 1 IM scan CUs predicates received 3 IM scan CUs predicates applied 3 IM scan CUs predicates optimized 1 IM scan CUs pruned 1 IM scan segments minmax eligible 1 session cursor cache hits 5 workarea executions - optimal 1 parse count (total) 4 execute count 5 bytes sent via SQL*Net to client 1,150 Event Waits Time_outs Csec Avg Csec Max Csec ----- ----- --------- ---- -------- -------- SQL*Net message to client 9 0 0.00 .000 0 SQL*Net message from client 9 0 0.44 .049 8,408 Second Execution ================ Name Value ---- ----- Requests to/from client 4 opened cursors cumulative 5 user calls 6 recursive calls 3 session logical reads 6,728 DB time 1 non-idle wait count 35 enqueue requests 2 enqueue releases 2 physical read total IO requests 29 physical read total multi block requests 24 physical read total bytes 6,987,776 cell physical IO interconnect bytes 6,987,776 consistent gets 92 consistent gets from cache 92 consistent gets pin 44 consistent gets pin (fastpath) 5 consistent gets examination 48 logical read bytes from cache 753,664 physical reads 853 physical reads cache 853 physical read IO requests 29 physical read bytes 6,987,776 consistent changes 48 free buffer requested 894 CR blocks created 41 physical reads cache prefetch 824 physical reads prefetch warmup 713 shared hash latch upgrades - no wait 43 calls to kcmgcs 7 calls to get snapshot scn: kcmgss 1 file io wait time 3,861 data blocks consistent reads - undo records applied 48 rollbacks only - consistent read gets 41 table scans (long tables) 1 table scans (IM) 1 table scan rows gotten 2,803 table scan blocks gotten 41 IM scan CUs memcompress for query low 1 session logical reads - IM 6,636 IM scan bytes in-memory 5,155,309 IM scan bytes uncompressed 45,896,824 IM scan CUs columns theoretical max 18 IM scan rows 399,984 IM scan rows optimized 399,984 IM scan rows cache 48 IM scan blocks cache 41 IM scan CUs split pieces 1 IM scan CUs predicates received 3 IM scan CUs predicates applied 3 IM scan CUs predicates optimized 1 IM scan CUs pruned 1 IM scan segments minmax eligible 1 session cursor cache hits 5 workarea executions - optimal 1 parse count (total) 4 execute count 5 bytes sent via SQL*Net to client 1,150 bytes received via SQL*Net from client 1,772 SQL*Net roundtrips to/from client 4 Event Waits Time_outs Csec Avg Csec Max Csec ----- ----- --------- ---- -------- -------- Disk file operations I/O 2 0 0.01 .003 0 db file sequential read 5 0 0.01 .001 0 db file scattered read 24 0 0.38 .016 0 SQL*Net message to client 10 0 0.01 .001 0 SQL*Net message from client 10 0 0.76 .076 8,408
There’s quite a lot of stats which probably aren’t interesting – and there’s one detail that is important but doesn’t appear (at least not clearly) and that’s the fact that the table in question had about 6,800 blocks below its highwater mark.
So, what do the stats tell us? The most obvious change, of course, is that we had to do some physical reads to get a result set: 24 multiblock reads and 5 single block reads (the latter from the undo tablespace). This is echoed in the session stats as 853 “physical reads cache” from 29 “physical read IO requests”. We can then see the specific read-consistency work (in two ways – with a third close approximation):
consistent changes 48 CR blocks created 41 data blocks consistent reads - undo records applied 48 rollbacks only - consistent read gets 41 IM scan rows cache 48 IM scan blocks cache 41
We applied 48 undo change vectors to fix up 41 blocks to the correct point in time and used them to read 48 rows – the last pair of figures won’t necessarily match the first two pairs, but they do give us a measure of how much data we had to acquire from the cache when trying to do an in-memory scan.
The number 41 actually appears a couple more times: it’s “table scan blocks gotten” (which might seem a little odd since we got far more than 41 blocks by multiblock reads – but we only really wanted 41), and it’s also the change (downwards) in “session logical reads – IM”. Even when Oracle does a pure in-memory query it calculates the number of blocks it would have been reading and reports that number as “session logical reads” and “session logical reads – IM” – so there’s another way to get confused about buffer visits and another statistic to cross-check when you’re trying to work out how to calculate “the buffer hit ratio” ;)
After the first read the scattered reads all seemed to be 32 blocks of “intermittent” tablescan – perhaps this is a measure of the number of blocks that are compressed into a single in-memory chunk (for query low), but perhaps it’s a side effect of the “physical reads prefetch warmup” that Oracle may do when the cache has a lot of empty space. I’ll leave it as an exercise to the reader to refine the test (or think of a different test) to determine whether it’s the former or latter; it’s quite important to find this out because if Oracle is tracking change at the “in-memory chunk” rather than at the block level then a small amount of high-precision change to an in-memory table could result in a relatively large amount of “redundant” I/O as a long-running query tried to stay read-consistent.