Oracle Scratchpad

March 22, 2013

LOB Update

Filed under: Infrastructure,LOBs,Oracle — Jonathan Lewis @ 10:36 pm GMT Mar 22,2013

This note is about a feature of LOBs that I first described in “Practical Oracle 8i” but have yet to see used in real life. It’s a description of how efficient Oracle can be when updating a LOB, and I’ll start with the definition of, and a selection from, a table:

create table test_lobs (
	id              number(5),
	bytes           number(38),
	text_content    clob
)
lob (text_content) store as text_lob(
	disable storage in row
	cache
)
;

-- insert a row

SQL> desc test_lobs
 Name                    Null?    Type
 ----------------------- -------- ----------------
 ID                               NUMBER(5)
 BYTES                            NUMBER(38)
 TEXT_CONTENT                     CLOB

SQL> select id, bytes, dbms_lob.getlength(text_content) from test_lobs;

        ID      BYTES DBMS_LOB.GETLENGTH(TEXT_CONTENT)
---------- ---------- --------------------------------
         1     365025                           365025

1 row selected.

I’ve got a table with a single CLOB column holding a single row. The size of the single CLOB is roughly 365KB (or about 45 blocks of 8KB). Old hands who have had to suffer LONG columns will recognise the trick of recording the size of a LONG as a separate column in the table; it’s a strategy that isn’t really necessary with LOBs but old coding habits die hard. It’s quite hard to find details of how much space has been used in a LOB segment (the space_usage procedure in the dbms_space package doesn’t allow you to examine LOBSEGMENTs), but I did a couple of block dumps to check on this LOBSEGMENT and it had allocated 46 blocks on the first insert.
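As a rough check of the “about 45 blocks” figure, the arithmetic can be sketched in Python. This assumes every byte of an 8KB block is available to the LOB, which slightly overstates the usable space, and the variable names are purely illustrative.

```python
import math

LOB_BYTES = 365025        # length reported by dbms_lob.getlength() above
BLOCK_BYTES = 8 * 1024    # assumed: 8KB blocks, ignoring any per-block overhead

# Number of whole blocks needed to hold the LOB data
blocks_needed = math.ceil(LOB_BYTES / BLOCK_BYTES)
print(blocks_needed)      # 45
```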

So here’s the clever bit – how big will the LOBSEGMENT grow when I update that one CLOB ?

It’s common knowledge (to users of LOBs) that the undo mechanism Oracle has for LOBs is simply to leave the old LOB in place and create a new one – so the initial response to the question might be to guess that the LOBSEGMENT will grow to roughly double the size. But it doesn’t have to be like that, at least, not if you update the LOB the way I happen to have done, which is like this:

declare

	m_length	integer;
	m_lob		clob;

begin

	select
		text_content,
		dbms_lob.getlength(text_content)
	into	m_lob, m_length
	from
		test_lobs
	where
		id = 1
	for update
	;

	dbms_output.put_line('Lob size: ' || m_length);

	dbms_lob.write(
		lob_loc	=> m_lob,
		amount	=> 17,
		offset	=> 1,
		buffer	=> 'This is an update'
	);

	commit;

end;
/

My code very specifically changes only the first 17 bytes of the LOB. So how much does Oracle have to do to effect this change ? The LOB-handling mechanisms are smart enough to work out that only the first of the 45 blocks in the LOB needs to be changed, so Oracle need only add one block to the segment and write the new version of the first LOB block to that one block. (In fact the segment – which was in a tablespace using freelist management – grew by the “standard” 5 blocks from which Oracle selected just one block to add to the LOB.)
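To see why a 17-byte write at offset 1 touches only one block, here is a small Python sketch that maps a write onto chunk numbers. It is an illustration only: the function name is hypothetical, chunks are numbered from 0 for convenience, and the chunk size is assumed to be a full 8KB with no overhead.

```python
def chunks_touched(offset, amount, chunk_bytes=8 * 1024):
    """Return the 0-based chunk numbers covered by a write of `amount` bytes
    starting at 1-based position `offset` (dbms_lob.write counts from 1)."""
    if offset < 1 or amount < 1:
        raise ValueError("offset and amount must be positive")
    first = (offset - 1) // chunk_bytes
    last = (offset - 1 + amount - 1) // chunk_bytes
    return list(range(first, last + 1))

print(chunks_touched(offset=1, amount=17))      # [0]
print(chunks_touched(offset=8190, amount=17))   # [0, 1]
```

A write that straddles a chunk boundary, like the second call, would need a new copy of two chunks rather than one.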

So how does Oracle keep track of the whole LOB if it can change it one piece at a time ? This is where the (notionally invisible and you don’t need to know about it) LOBINDEX comes into play. Oracle maintains an index keyed by (LOB_ID, chunk_number) *** pointing to all the chunks of a LOB in order, so when you update a single chunk Oracle simply creates an updated copy of the chunk and changes the appropriate index entry to point to the new chunk. So here’s an image representing our one LOB value just after we’ve created it and before we’ve updated:

[Image: lob_1]

And then we “modify” the first chunk – which means we have to add a chunk (which in this case is a single block) to the segment, create a new version of the first chunk, modify the index to point to the new block, and add an index entry – keyed by time-stamp – to the end of the index to point to the old chunk; something like this:

[Image: lob_2]

Now, when we run a query to select the LOB, Oracle will follow the index entries in order and pick up the new chunk from the end of the LOBSEGMENT. But the LOBINDEX is protected by undo in the standard fashion, so if another long-running query that started before our update needs to see the old version of the LOB it will create a read-consistent copy of the relevant index leaf block – which means that from its perspective the index will automatically be pointing to the correct LOB chunk.
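The mechanism described above can be mimicked with a toy model in Python. This is a sketch of the idea, not of Oracle's implementation: the class and method names are hypothetical, the timestamp is an arbitrary number, and a copy of the index taken before the update stands in for the read-consistent leaf block that Oracle would rebuild from undo.

```python
import copy

class ToyLobSegment:
    """Toy model: blocks live in a list, the 'index' maps chunk number -> block."""

    def __init__(self, chunks):
        self.blocks = list(chunks)                        # LOBSEGMENT contents
        self.index = {n: n for n in range(len(chunks))}   # current chunk -> block address
        self.old = []                                     # (timestamp, block address) of old chunks

    def update_chunk(self, chunk_no, new_data, now):
        """Write a new copy of one chunk and repoint the index at it."""
        self.blocks.append(new_data)                      # the old block is left in place
        self.old.append((now, self.index[chunk_no]))      # remember the old chunk by timestamp
        self.index[chunk_no] = len(self.blocks) - 1

    def snapshot(self):
        """Stand-in for a read-consistent copy of the index."""
        return copy.deepcopy(self.index)

    def read(self, index=None):
        """Follow the index entries in chunk order and return the chunk contents."""
        index = self.index if index is None else index
        return [self.blocks[index[n]] for n in sorted(index)]


seg = ToyLobSegment(["chunk-%d v1" % n for n in range(45)])
before = seg.snapshot()              # a query that started before the update
seg.update_chunk(0, "chunk-0 v2", now=1000)

print(len(seg.blocks))               # 46: one block added to the segment
print(seg.read()[0])                 # chunk-0 v2
print(seg.read(before)[0])           # chunk-0 v1
```

The other 44 chunks are shared between the two versions; only the index entry for the first chunk differs.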

The index is actually quite an odd one because it serves two functions; apart from pointing to current lobs by chunk number, it also points to “previous” chunks by timestamp (specifically the number of seconds between Midnight of 1st Jan 1970 and the time at which the chunk was “overwritten”). This makes it easy for Oracle to deal with the retention interval for LOBs – any time it needs space in the LOBSEGMENT it need only find the minimum timestamp value in the index and compare it with “sysdate – retention” to see if there are any chunks available for re-use.
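The retention check can be sketched in the same spirit. This Python fragment assumes the old chunks are held oldest-first as (timestamp, block address) pairs, with timestamps in seconds since 1st Jan 1970; the function name and the use of a heap are illustrative choices, as is treating a chunk exactly at the cutoff as reusable.

```python
import heapq

def reusable_chunks(old_chunks, now, retention_seconds):
    """Pop and return the block addresses of old chunks whose timestamp is
    at or before (now - retention_seconds). `old_chunks` is a heap of
    (overwritten_at, block_address) tuples, so the minimum is always first."""
    cutoff = now - retention_seconds
    freed = []
    while old_chunks and old_chunks[0][0] <= cutoff:
        freed.append(heapq.heappop(old_chunks)[1])
    return freed

old = [(1000, 0), (1500, 7), (4000, 12)]
heapq.heapify(old)
print(reusable_chunks(old, now=4600, retention_seconds=900))   # [0, 7]
```

Only the minimum timestamp ever needs to be examined: if the oldest chunk is still inside the retention interval, nothing else can be reused either.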

To sum up: when you update LOBs you leave a trail of old chunks in the LOBSEGMENT, and the mechanism is most beneficial if you have an application which does piece-wise updates. The version of the LOB you see is dictated by the version of the index that you generate when you request a copy of the LOB at a given SCN.

*** Footnote: My description of the LOBINDEX was an approximation. Each index entry carries a fixed size “payload” listing up to eight lob chunks; so the (LOB_ID, chunk_number) index entries in a LOBINDEX may point to every 8th chunk in the LOB. The significance of the “fixed size” payload is that the payload can be modified in place if the pointer to a LOB chunk has to be changed – and this minimises disruption of the index (at a cost of some wasted space).
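The fixed-size payload in the footnote can also be pictured with a short Python sketch. The layout here is an assumption made for illustration: chunks are numbered from 0, each entry is keyed by the number of its first chunk, and unused slots are shown as None.

```python
SLOTS_PER_ENTRY = 8   # "up to eight lob chunks" per index entry

def locate(chunk_no):
    """Return (key_chunk_number, slot) for a chunk in this toy layout."""
    return (chunk_no // SLOTS_PER_ENTRY) * SLOTS_PER_ENTRY, chunk_no % SLOTS_PER_ENTRY

def build_index(lob_id, block_addresses):
    """Group block addresses into fixed-size payloads keyed by (lob_id, first chunk)."""
    index = {}
    for chunk_no, addr in enumerate(block_addresses):
        key_chunk, slot = locate(chunk_no)
        payload = index.setdefault((lob_id, key_chunk), [None] * SLOTS_PER_ENTRY)
        payload[slot] = addr
    return index

index = build_index(lob_id=1, block_addresses=range(100, 145))   # 45 chunks
key_chunk, slot = locate(0)
index[(1, key_chunk)][slot] = 145      # repoint chunk 0 in place; no entry added or removed

print(len(index))                      # 6
print(index[(1, 40)])                  # [140, 141, 142, 143, 144, None, None, None]
```

The last entry shows the wasted space mentioned in the footnote: three of its eight slots are empty, but every payload stays the same size, so a pointer can be overwritten without disturbing the index.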

 

3 Comments »

  1. […] features, the other key component of the Delphix server is the Delphix file-system (DxFS). I wrote a little note a few days ago to describe how Oracle can handle “partial” updates to LOB values […]

    Pingback by Delphix Overview | Oracle Scratchpad, April 4, 2013 @ 9:04 pm BST

  2. Jonathan,

    You said (in passing): “Old hands who have had to suffer LONG columns will recognise the trick of recording the size of a LONG as a separate column in the table; it’s a strategy that isn’t really necessary with LOBs but old coding habits die hard.”

    I am wondering, what is (was) this trick good for?

    Cheers,
    Flado

    Comment by Vladimir Andreev, November 2, 2015 @ 2:56 pm GMT

    • Flado,

      It dates back to the time when you had little choice about handling longs other than dynamically allocating memory to hold them. Being able to find the length before you allocated memory gave you some options for how you handled the retrieval.

      Comment by Jonathan Lewis, November 2, 2015 @ 3:06 pm GMT

