Oracle Scratchpad

March 18, 2011

ASSM ouch!

Filed under: ASSM,Infrastructure,Oracle — Jonathan Lewis @ 6:20 pm GMT Mar 18,2011

Here’s a nasty little surprise I got last week while investigating an oddity with stats collection. I wanted to create a table in an ASSM tablespace and populate it from two or three separate sessions simultaneously so that I could get some “sparseness” in the data load. So I created a table and ran up 17 concurrent sessions to insert a few rows each. Because I wanted to know where the rows were going I got every session to dump the bitmap space management block at the start of the segment – the results were surprising.

I was using 11.1.0.6, with a locally managed tablespace with 1MB uniform extents, 8KB blocks, and ASSM. Here’s the basic code:

create table t1 (
	n1	number,
	v1	varchar2(100)
)
tablespace test_8k_assm
;

--
--  17 sessions then synchronised to do the following
--

begin
	for i in 1..3 loop
		insert into t1 values(i, rpad('x',100));
		commit;
	end loop;
end;
/

alter system flush buffer_cache;
execute dump_seg('t1',1,'table',0)

(The dump_seg() procedure is just a simple bit of code I wrote to dump selected blocks from a named segment.)
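
The procedure isn’t published in the post, so the signature and internals below are my assumptions – a minimal sketch of the idea, matching the call shown above: find the first block of the segment through dba_extents, then dump i_blocks blocks starting i_offset blocks into the segment.

create or replace procedure dump_seg(
	i_segment	in varchar2,
	i_blocks	in number	default 1,
	i_type		in varchar2	default 'table',
	i_offset	in number	default 0
) as
	m_file_id	number;
	m_block_id	number;
begin
	-- locate the first block of the segment's first extent
	select	file_id, block_id + i_offset
	into	m_file_id, m_block_id
	from	dba_extents
	where	owner		= user
	and	segment_name	= upper(i_segment)
	and	segment_type	= upper(i_type)
	and	extent_id	= 0
	;

	-- dump each requested block to the session trace file
	for i in 0 .. i_blocks - 1 loop
		execute immediate
			'alter system dump datafile ' || m_file_id ||
			' block ' || (m_block_id + i);
	end loop;
end;
/
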
The results I saw in the bitmap block were so bizarre that I then ran the following query:

select
	dbms_rowid.rowid_block_number(rowid) as block_id,
	count(*)
from
	t1
group by
	dbms_rowid.rowid_block_number(rowid)
order by
	block_id
;

Remember, I was using 8KB blocks and 1MB uniform extents – which means 128 blocks per extent – and I had started with a completely new, clean tablespace. So how many blocks do you think I had allocated to the table? Here are the results of my query:

  BLOCK_ID   COUNT(*)
---------- ----------
        40          3
        41          3
        43          3
       106          3
       110          3
       112          3
       165          3
       173          3
       236          3
       303          3
       433          3
       464          3
       541          3
       555          3
       563          3
       626          3
      1000          3

What you see here is a table which has managed to allocate several extents – despite the fact that the 17 sessions could have inserted their data into 17 separate blocks in the first extent of the table. The table had grown to nine extents – although it hadn’t even put data into all of them – and according to dbms_stats.gather_table_stats() there were 1,088 blocks in the table!
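
You can confirm the extent allocation directly; this query uses only the standard dba_extents view, with nothing assumed:

select
	extent_id, file_id, block_id, blocks
from
	dba_extents
where
	owner		= user
and	segment_name	= 'T1'
order by
	extent_id
;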

This looks pretty disastrous for ASSM – but I do have to say that flushing the buffer cache (17 times) could have introduced an extreme pinning problem that produced this surprising result. When I removed the flush and dump lines from my test the “realistic” results showed 17 blocks used in the first extent – so don’t panic if you’re using ASSM in a highly concurrent system; but if you have a couple of tables that seem to be much bigger than expected, maybe you now know what to look for.
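
If you want to repeat the check on one of your own tables without dumping blocks, two simple queries are enough (substitute your own table name):

select
	count(distinct dbms_rowid.rowid_block_number(rowid))	used_blocks
from
	t1
;

select
	extents, blocks
from
	user_segments
where
	segment_name = 'T1'
;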

9 Comments »

  1. Seeing the extent boundaries might be illustrative. If I recall correctly, though, through some patch level, if a session has to wait for a free block while inserting (when another session’s insert attempt has already requested a new extent), then instead of waiting for the new extent the other session already requested, the second session asks for a new extent as well. So if your sequence of alter flush and dump seg is pinning the extent I could see that happening. I wonder what happens if you toss a lock sleep in so that all the inserts complete before the first alter.
    Concurrency-driven excessive extent allocation drove one of my customers temporarily to distraction, but that was driven by many concurrent sessions inserting and a legitimately full extent, not an artifact of flushing. Nightly pre-allocation of a day’s worth of extents is one solution to that problem until the relevant patch is in place.

    Comment by Mark W. Farnham — March 18, 2011 @ 8:11 pm GMT Mar 18,2011 | Reply
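
    A sketch of the nightly pre-allocation Mark describes – the table name and the number of extents are assumptions, but the “allocate extent” syntax is standard:

    begin
        for i in 1..24 loop        -- however many extents cover a day's inserts
            execute immediate 'alter table t1 allocate extent';
        end loop;
    end;
    /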

  2. Hi Jonathan. You might want to test on a patched version – 11.1.0.6 is a VERY buggy version. Well over 1,000 bugs were fixed by 11.1.0.7, and many more by 11.1.0.7.5.

    I’d be more concerned if it still appeared in 11.1.0.7.5 or in 11.2.0.2 – which is less buggy than 11.1.0.7.x.

    Comment by Kirk Brocas — March 18, 2011 @ 8:18 pm GMT Mar 18,2011 | Reply

    • Kirk,

      That’s the beauty of declaring the version number and supplying the demonstration code. Anyone who thinks it might be a problem can repeat the test on their own platform and check the results. (And then they might add a comment about versions which are fixed or still broken.)

      I’ve just repeated the test on an instance running 11.2.0.2 on 32-bit Windows – and the anomaly doesn’t appear, although it does show an interesting difference in ASSM behaviour that I might look into and write about some other time.

      It only took 15 minutes to re-create the test including copying the text from the blog and creating a suitable tablespace.

      Comment by Jonathan Lewis — March 20, 2011 @ 10:10 am GMT Mar 20,2011 | Reply

  3. Extreme pinning problem? Could you elaborate a bit?
    Thanks!

    Comment by Flado — March 18, 2011 @ 9:08 pm GMT Mar 18,2011 | Reply

    • Flado,

      This anomaly wasn’t something I was interested in at the time, and was also an artifact of a very unusual bit of processing, so I haven’t spent any time looking into the details. The MOS reference that Tom Roach supplied may give you a better idea of the type of thing I was guessing about when I made the comment about pinning.

      In brief, though: to flush the buffer cache DBWR has to gain exclusive access to dirty blocks temporarily to copy them to disc, and it does this through a buffer pin (there’s a reference in the Glossary, which also points to a rather dated article on my old website). I’d guess that if I’d been tracing this code I would have seen buffer busy waits – which is basically the wait event you see while waiting for someone else to release an exclusive pin – and maybe some write complete waits.

      It’s possible that sessions were creating new extents because they needed access to a bitmap block to find out where to do their inserts – and DBWR was denying them access to the existing block because it had it pinned for writing.

      Comment by Jonathan Lewis — March 20, 2011 @ 10:26 am GMT Mar 20,2011 | Reply
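
    If you want to look for the waits Jonathan guesses at while reproducing the test, the system-wide totals are visible in a standard dynamic performance view – nothing here is assumed beyond picking the two event names:

    select
        event, total_waits, time_waited
    from
        v$system_event
    where
        event in ('buffer busy waits', 'write complete waits')
    ;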

  4. Jonathan,

    Take a look at bug 6635214

    Comment by Tom — March 18, 2011 @ 9:26 pm GMT Mar 18,2011 | Reply

  5. Hello Jonathan,
    Reading this post only now, after hitting a similar problem on our 11gR2 RAC (on Solaris) with ASSM-managed tablespaces (under ASM too).
    This table of ours gets touched by DML for every transaction from the application (24×7 availability and online merchant transactions), and it contains two varray LOBs. The typical row count is around 300K, with an average row length of 1,300, but the table grew to 180GB+ even though the row count did not change much. Continuous inserts and deletes of the previous day’s records keep the volume very consistent.
    The extent map is bizarre, as with any ASSM segment. At times I am seeing 64MB extents being allocated. If I work out the row count per block, on average it is almost 1 – sometimes 3 to 4, but very rarely.
    Luckily the index is quite stable, and hence the application is not crashing due to timeouts.
    I tried SHRINK SPACE COMPACT and it took 16 hours to finish. After that, a subsequent SHRINK SPACE seemed to be doing the same thing, and I stopped it after more than 15 minutes.
    With such extent allocation, dbms_space shows me that the total of full blocks is around 21M – and that is the number I am trying to eliminate by shrinking: even if I conservatively assume 1 row per block, the total count of blocks should not exceed the total number of rows.
    However, it looks like I am going to end up in the same situation again some time after resizing.
    Is there a way to get out of this situation just for that table?
    Regards
    Kumar Ramalingam.

    Comment by Sankarakumar Ramalingam — August 18, 2014 @ 7:56 pm GMT Aug 18,2014 | Reply
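
    For reference, the shrink sequence Kumar describes looks like this – the table name is a placeholder, and row movement has to be enabled before a shrink is allowed:

    alter table big_table enable row movement;
    alter table big_table shrink space compact;    -- repack the rows, leave the HWM
    alter table big_table shrink space;            -- second pass also lowers the HWM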

    • Kumar,

      It’s always best to give the full version number when describing unusual behaviour – someone might recognise a bug associated with a particular version if the version number is there to prompt them.

      You’ve mentioned storing two varrays as LOBs – but you haven’t made it clear whether the excess space is in the LOB segments or in the table segment – the avg_row_len will include figures for LOBs that are in-row, but will only allow for a potential LOB locator for LOBs that are out of line; this makes it harder to work out what space is actually used in the table segment. There’s a script

      This does look like a bug, so you should raise it with Oracle; as a general thought it’s always a little difficult for Oracle to pick a moment to fix up the bitmap at the end of a transaction, so it’s possible that something about your delete/insert cycle is leaving blocks marked as full at the end of a transaction when they actually have space in them. You could move the table out of ASSM, of course; alternatively I think there’s a procedure for fixing segment bitmaps in the dbms_space_admin package, so you might need to rebuild the object once and then introduce a regular bitmap fixup (maybe every night) – the rebuild might have to use dbms_redefinition to work as close to online as possible.

      Comment by Jonathan Lewis — August 19, 2014 @ 8:29 am GMT Aug 19,2014 | Reply
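
      A minimal sketch of the dbms_redefinition approach Jonathan mentions – the table names are placeholders, and it assumes the table has a primary key so the package defaults apply:

      -- empty interim table with the same shape as the original
      create table big_table_interim as select * from big_table where rownum = 0;

      declare
          m_errors    pls_integer;
      begin
          dbms_redefinition.can_redef_table(user, 'BIG_TABLE');
          dbms_redefinition.start_redef_table(user, 'BIG_TABLE', 'BIG_TABLE_INTERIM');
          dbms_redefinition.copy_table_dependents(
              user, 'BIG_TABLE', 'BIG_TABLE_INTERIM',
              num_errors => m_errors
          );
          dbms_redefinition.finish_redef_table(user, 'BIG_TABLE', 'BIG_TABLE_INTERIM');
      end;
      /

      drop table big_table_interim purge;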

