One of the nice things about declaring your (basicfile) LOBs as “enable storage in row” is that the block addresses of the first 12 chunks will be listed in the row and won’t use the LOB index, so if your LOBs are larger than 3960 bytes but otherwise rather small the LOB index will hold only the timestamp entries for deleted LOBs. This makes it just a little easier to pick out the information you need when things behave strangely, so in this installment of my series I’m going to take about an example with with storage enabled in row.
I’m going to demonstrate an issue that is causing a major problem. First I’m going to build a table with a LOB column with multiple (4) freepools – because that’s what you do to handle concurrency – then I’m going to start 4 sessions (carefully checking that I have one associated with each free pool) and do a few thousand inserts with commits from each session. The size of the LOB value I insert will be 20KB so it will be “indexed” in the row but stored out of the row taking 3 LOB blocks.
Once I’ve got the data in place I’m going to use three of the sessions to delete three quarters of the rows from the table then use a call to the dbms_space package to show you that the segment contains virtually no free space. I’ve engineered the code so that it will take just three more rows in the table to fill the available free space and force Oracle either to allocate a new extent or to start reclaiming some of the delete reusable LOB space – and I’m going to run that insert from the session that DIDN’T delete any rows.
I’ve been running these tests on 220.127.116.11, but get similar behaviour on 12c.
create table t1( id number constraint t1_pk primary key, c1 clob ) lob (c1) store as basicfile text_lob( enable storage in row chunk 8k retention nocache logging freepools 4 tablespace test_8k_assm ) ; declare m_v1 varchar2(32767) := rpad('X',20000,'X'); begin for i in 0..0 loop insert into t1 values (i, m_v1); commit; end loop; end; / truncate table t1 ;
You’ll notice I’ve used the retention keyword. Before I built the LOB I set my undo_retention to 10 seconds so that the space from deleted LOBs should become available for reuse very quickly. The name of the tablespace I’ve used for the LOB is a clue that I’m using an 8KB block size and ASSM (the latter is a requirement of the retention option).
Here’s the code to check which freepool (0 to 3) a session will be associated with (this isn’t documented, but seems to be correct);
select mod(pid,4) from v$process where addr = ( select paddr from v$session where sid = ( select sid from v$mystat where rownum = 1 ) ) ;
So I can keep starting sessions and running that query until I’ve got a session covering each freepool. (The first time I tried this I had to start 7 sessions before I got all 4 freepools covered). Now I can run the following from all 4 sessions concurrently:
define m_loop_counter = 12027 lock table t1 in row share mode; commit; declare m_v1 varchar2(32767) := rpad('x',20000,'x'); begin for i in 1..&m_loop_counter loop insert into t1 values (s1.nextval, m_v1); commit; end loop; end; /
The purpose of the lock table command is to ensure that all 4 processes start running simultaneously. From a fifth session I execute a “lock table t1 in exclusive mode” before starting the other four sessions running, so they all queue on the exclusive lock; then I commit from the fifth session and away we go. The whole thing took about 30 seconds to run. The rather random-looking value 12,027 was a careful choice to ensure that the last extent in the segment had just a few blocks left – and I used my “dbms_space_use.sql” script to check this, getting the following output:
==== ASSM ==== Unformatted : 7 / 57,344 Freespace 1 ( 0 - 25% free) : 0 / 0 Freespace 2 ( 25 - 50% free) : 0 / 0 Freespace 3 ( 50 - 75% free) : 0 / 0 Freespace 4 ( 75 - 100% free) : 0 / 0 Full : 144,324 / ############ PL/SQL procedure successfully completed. ======= Generic ======= Segment Total blocks: 145536 Object Unused blocks: 0
I’ve got 7 “unformatted” blocks in the segment – though in fact these might be “formatted but free” from the perspective of the LOB code.
After going to sessions 0, 1, and 3 and deleting 12,000 rows from each in turn (and committing, leaving a total of 12,108 rows in the table) the report doesn’t change: I haven’t made any space free I’ve simply flagged it in the LOB index as “reusable”. So now we go to session 2 and run the following code 3 times – with “set timing on”:
SQL> l 1 declare 2 m_v1 varchar2(32767) := rpad('x',20000,'x'); 3 begin 4 for i in 1..1 loop 5 insert into t1 values (s1.nextval, m_v1); 6 commit; 7 end loop; 8* end;
The first run took 0.02 seconds – and the unformatted count dropped to 4
The second run took 0.01 seconds – and the unformatted count dropped to 1
The third run took 10.74 seconds, of which 9 seconds was CPU. The session generated 500,000 redo entries totalling 100MB of redo from 1 million db block changes after doing 8.4 million logical I/Os, issuing 108,000 enqueue (lock) requests and running 108,000 index range scans. The report of space usage ended up looking like this:
Unformatted : 108,125 / 885,760,000 Freespace 1 ( 0 - 25% free) : 0 / 0 Freespace 2 ( 25 - 50% free) : 0 / 0 Freespace 3 ( 50 - 75% free) : 0 / 0 Freespace 4 ( 75 - 100% free) : 0 / 0 Full : 36,333 / 297,639,936 PL/SQL procedure successfully completed. ======= Generic ======= Segment Total blocks: 145664 Object Unused blocks: 0
My session has cleared every single piece of re-usable space from the LOB and made it free (unformatted) before allocating space for its one LOB. (That’s going to hurt when the client has 2 million LOBs on the reusable list and isn’t running everything on SSDs – which is why I’m working on this problem).
If you’re wondering why it takes so much redo and so many buffer visits to free 36,000 LOBs this (roughly) is what Oracle does to free up one reusable LOB of 3 blocks – which corresponds to a single index entry carrying three block ids:
- Find the lowest index entry in the freepool, pin the index leaf block
- Identify the last block in the list of 3
- Lock the relevant L1 space management block for the segment and set relevant “bit” to “unformatted”
- Delete the index entry
- Re-insert the index entry with one block id removed
- Commit and unlock the L1 bitmap block
- Repeat delete/insert the cycle for 2nd block id
- Repeat the cycle for 3rd (or 1st since we’re going backwards) block id – but don’t re-insert the index entry
Oracle reclaims one block (chunk) at a time. And that’s a bit of a clue to a possible workaround because event 44951 gets mentioned a couple of times in MoS and on the internet as a workaround to a particular problem of HW enqueue waits for LOBS. MoS note 740075.1 tells us:
When using Automatic Segment Space Management (ASSM), and the fix for Bug 6376915 has been applied in your database (Included in 10.2.0.4 +) it is possible to adjust the number of chunks that are cleaned up when the chunk cleanup operation is required.
This can be enabled by setting event 44951 to a value between 1 and 1024 (default is 1). With the value between 1 and 1024 setting the number of chunks to be cleaned up each time a chunk reclaimation operation occurs. This can therefore reduce the number of requests for the High Watermark Enqueue.
Other notes explain that by default only one chunk is cleaned up at a time – which is exactly the behaviour I’m seeing. So what happens when I bounce the database with this event set at level 5 (an arbitrary choice, but larger than the LOBs I’ve been inserting) in the parameter file and repeat the experiment ? On the first attempt it made no difference, but then I changed the experiment slightly and started again. Initially I had done my first re-insert from the one session that hadn’t deleted any rows – which made it an extreme boundary condition; on the second attempt I deleted two rows from the session that had not yet deleted any data (and waited for the retention time to elapse) before doing the inserts from that session.
Deleting two rows would put 6 blocks (in two index entries) onto my re-usable list, so I was starting the inserts with 7 free blocks, 6 reusable blocks and the event set to level 5. Here’s what I saw as I inserted rows one by one.
- Insert one row: “Unformatted” blocks went up to 9: I had freed 5 of the reusable blocks then used 3 of them for my lob (7 + 5 – 3 = 9)
- Insert one row: “Unformatted” blocks went down to 7: I had freed the last reusable block then used 3 blocks for my lob (9 + 1 – 3 = 7)
- Insert one row: “Unformatted” blocks went down to 4
- Insert one row: “Unformatted” blocks went down to 1
- Insert one row: Oracle cleared all the reusable space (11 seconds, 500MB redo), then added an extent (!) to the segment and used 2 of its blocks for part of the new LOB.
So the event isn’t really connected with my problem – though it adds some efficiency to the processing – and my “boundary condition” is one that’s likely to occur fairly frequently if you’ve got a basicfile LOB defined with multiple freepools. Fortunately it’s probably going to require two pre-conditions before it’s a big problem: first that you’re handling a large number of LOBs and second that your pattern of inserting and deleting is not symmetric – it’s when you use a large number of concurrent sessions for small batches of inserts but a single session for large bulk deletes that all hell can break loose shortly after a delete.
As with many other features of Oracle, skew plays a part in making things break. If you’re doing lots of inserts and deletes of basicfile lobs make sure the mechanisms you use for inserting and deleting look similar: in particular similar numbers of processes to do similar amounts of work for both operations.
P.S. It gets worse.
P.P.S. Don’t even start to think that you can work around this by using securefiles.
P.P.P.S. I got an hint from one test that if a reusable LOB is exactly the same size as the LOB being inserted then Oracle very cleverly takes the entry index entry and rewrites it to be the LOB index entry rather than freeing (and then potentially using) the space it identifies.