This note is about a feature of LOBs that I first desribed in “Practial Oracle 8i” but have yet to see used in real life. It’s a description of how efficient Oracle can be, which I’ll start with a description of, and selection from, a table:
create table test_lobs ( id number(5), bytes number(38), text_content clob ) lob (text_content) store as text_lob( disable storage in row cache ) ; -- insert a row SQL> desc test_lobs Name Null? Type ----------------------- -------- ---------------- ID NUMBER(5) BYTES NUMBER(38) TEXT_CONTENT CLOB SQL> select id, bytes, dbms_lob.getlength(text_content) from test_lobs; ID BYTES DBMS_LOB.GETLENGTH(TEXT_CONTENT) ---------- ---------- -------------------------------- 1 365025 365025 1 row selected.
I’ve got a table with a single CLOB column holding a single row. The size of the single CLOB is roughly 365KB (or about 45 blocks of 8KB). Old hands who have had to suffer LONG columns will recognise the trick of recording the size of a LONG as a separate column in the table; it’s a strategy that isn’t really necessary with LOBs but old coding habits die hard. It’s quite hard to find details of how much space has been used in a LOB segment (the space_usage procedure in the dbms_space package doesn’t allow you to examine LOBSEGMENTs), but I did a coupld of block dumps to check on this LOBSEGMENT and it had allocated 46 blocks on the first insert.
So here’s the clever bit – how big will the LOBSEGMENT grow when I update that one CLOB ?
It’s common knowledge (to users of LOBs) that the undo mechanism Oracle has for LOBs is simply to leave the old LOB in place and create a new one – so the intial response to the question might be to guess that the LOBSEGMENT will grow to roughly double the size. But it doesn’t have to be like that, at least, not if you update the LOB the way I happen to have done, which is like this:
declare m_length integer; m_lob clob; begin select text_content, dbms_lob.getlength(text_content) into m_lob, m_length from test_lobs where id = 1 for update ; dbms_output.put_line('Lob size: ' || m_length); dbms_lob.write( lob_loc => m_lob, amount => 17, offset => 1, buffer => 'This is an update' ); commit; end; /
My code very specifically changes only the first 17 bytes of the LOB. So how much does Oracle have to do to effect this change ? The LOB-handling mechanisms are smart enough to work out that only the first (of 45) blocks in the LOB need to be changed, so Oracle need only add one block to the segment and write the new version of the first LOB block to that one block. (In fact the segment – which was in a tablespace using freelist management – grew by the “standard” 5 blocks from which Oracle selected just one block to add to the LOB.)
So how does Oracle keep track of the whole LOB if it can change it one piece at a time ? This is where the (notionally invisible and you don’t need to know about it) LOBINDEX comes into play. Oracle maintains an index keyed by (LOB_ID, chunk_number) *** pointing to all the chunks of a LOB in order, so when you update a single chunk Oracle simply creates an updated copy of the chunk and changes the appropriate index entry to point to the new chunk. So here’s an image representing our one LOB value just after we’ve created it and before we’ve updated:
And then we “modify” the first chunk – which means we have to add a chunk (which in this case is a single block) to the segment, create a new version of the first chunk, modify the index to point to the new block, and add an index entr – keyed by time-stamp – to the end of the index to point to the old chunk; something like this:
Now, when we run a query to select the LOB, Oracle will follow the index entries in order and pick up the new chunk from the end of the LOBSEGMENT. But the LOBINDEX is protected by undo in the standard fashion, so if another long-running query that started before our update needs to see the old version of the LOB it will create a read-consistent copy of the relevant index leaf block– which means that from its perspective the index will automatically be pointing to the correct LOB chunk.
The index is actually quite an odd one because it serves two functions; apart from pointing to current lobs by chunk number, it also points to “previous” chunks by timestamp (specifically the number of seconds between Midnight of 1st Jan 1970 and the time at which the chunk was “overwritten”). This makes it easy for Oracle to deal with the retention interval for LOBs – any time it needs space in the LOBSEGMENT it need only find the minimum timestamp value in the index and compare it with “sysdate – retention” to see if there are any chunks available for re-use.
To sum up – when you update LOBs, and it’s most beneficial if you have an application which doees piece-wise updates, you leave a trail of old chunks in in the LOBSEGMENT. The version of the LOB you see is dictated by the version of the index that you generate when you request a copy of the LOB at a given SCN.
*** Footnote: My description of the LOBINDEX was an approximation. Each index entry carries a fixed size “payload” listing up to eight lob chunks; so the (LOB_ID, chunk_number) index entries in a LOBINDEX may point to every 8th chunk in the LOB. The significance of the “fixed size” payload is that the payload can be modified in place if the pointer to a LOB chunk has to be changed – and this minimises disruption of the index (at a cost of some wasted space).