Oracle Scratchpad

July 20, 2022

LOB space

Filed under: Infrastructure,LOBs,Oracle,Problem Solving — Jonathan Lewis @ 5:07 pm BST Jul 20,2022

Erratum: (With thanks to Bobby Durrett for highlighting the error – see comment 3 below): I’ve stated in this note that CLOBs are stored internally using a fixed-length two-byte characterset. This is only true if the “external” characterset is a variable length multi-byte characterset (such as AL32UTF8). It is not true of fixed-width charactersets (which, implicitly, covers all the single-byte character sets such as WE8ISO8859P1). See MOS document 257772.1 for further comments, and the Database Globalization Support Guide 19c Appendix A for a list identifying fixed and variable width character sets.

This note is still subject to checking and correction regarding the break point between in-row and out-of-row.

There’s a thread on the Oracle developer forum at present asking why calls to dbms_lob.getlength() and calls to dbms_space.space_usage() produce such different results for the storage used by a LOB column.

It’s a really good question to demonstrate two points. First that it’s hard to supply all the details that are needed when you ask a question; second that it’s hard to write well about even such a tiny topic. The source of the second problem answers the first problem – there’s a fairly large decision tree to consider:

  1. Is the LOB defined as enable or disable storage in row?
  2. Is it a BLOB or CLOB?
  3. Is compression enabled ?
  4. Is deduplication enabled ?
  5. Is it a basicfile or securefile LOB?
    1. Is there a chunk declaration (for basicfile LOBs)
    2. Is the LOB undo limit set by pctversion or retention
    3. Have multiple freepools been declared (for basicfile LOBs)

The answer to question 1 means the difference between a tiny LOB taking a few bytes in the table segment (enable) or a whole chunk in the LOB segment (disable).

The answer to question 2 is important for two points – (1) CLOBs are stored using a two-byte fixed width character set ([ed] if the session character is multi-byte variable length) , which means they may take much more space than you might be expecting, and (2) the getlength() function reports characters not bytes.

Questions 3 and 4 result in similar space estimation errors: getlength() will report the decompressed length for every (logical) copy of a LOB value, so summing it across the table could be over-reporting quite dramatically.

Question 5.1 and 5.2 ought to start with a check of whether the LOB has been stored in a tablespace using (“legacy”) freelist management or (“new”) automatic segment space management because that affects whether or not it’s even possible to use securefiles, and it affects whether or not basicfiles can use the retention option.

Question 5.1: This applies only to basicfile LOBs because securefile LOBs ignore the chunk parameter. The dbms space.space_usage() procedures have many “out” parameters with the word “blocks” in their names – but the code uses these parameters to report chunks, not blocks. So if you’ve set the LOB chunk size to something other than default the result will need to be scaled up to get the block count (or you could just use the “bytes” values and divide by the LOB segment’s block size).

Question 5.2: This gets messy, because pctversion is silently ignored by securefiles but used by basicfiles. On the other hand retention can be used for securefiles or basicfiles (assuming automatic undo management is enabled), but its usage with basicfiles doesn’t match its usage with securefiles. Moreover if you include both retention and pctversion in your table declaration Oracle raises error: ORA-32600: RETENTION and PCTVERSION cannot be used together for both basicfiles and securefiles (so Oracle is not quite ignoring pctversion for securefiles). (It may be worth mentioning that if you use export/import to upgrade your database you may find that this “spontaneously” changes a basicfile lob to a securefile lob.)

Question 5.3: freepools is another parameter that is silently ignored for securefiles but can have a significant impact on the way that basicfiles reuse space from old copies of LOB values hence on the amount of allocated but unused space.

Underneath all these, of course, is the question “which version of Oracle”, because that affects the default for the choice between securefile or basicfile (though no-one ought to be using a version of Oracle that still has basicfile as the default). There’s also a significant variation with version of when a blob goes “out of row”. [Ed Jan 2023: no, there isn’t]. And then there’s the question of which overloads of dbms_space.space_usage() are available and which is the correct one to be used.

I think you can appreciate from the above how easy it might be to answer a “simple” question like “why do these two numbers differ” if the question included all the relevant details, but very hard to give a useful answer if the question simply quoted the results of unspecified calls to a couple of procedures or queries.

Answering the question

This user is running 11.2.0.4, has a basicfile BLOB, stored in an ASSM tablespace, no compression, not deduplicated, with the default 8KB chunk size, and storage enabled in row enable. So:

  • it’s a BLOB so getlength() will report the byte count and we don’t have to cater for 2-byte characters
  • it’s set to enable storage in row, so we have to ignore BLOBs with lengths up to 3,898 bytes (for 11.2.0.4 – it would be 3,964 for basicfile blobs in 19c.)
  • it’s basicfile so will stored 8,132 bytes per lob segment block. (securefile would use 8,060 – these figures don’t seem to have changed between 11g and 19c).
  • it’s stored under ASSM so there will be an overhead of just under 1% for segment space metadata when the segment is very large (one bitmap block out of each 128).
  • we can infer that the chunk size is 8KB because we have a report that shows “Full blocks” x 8KB = “Full bytes” – so we could be “losing” an average of 4KB per LOB. (Note, all blocks in a LOB segment are Full or Unformatted, and the metadata blocks aren’t reported by the dbms_space.space_usage() procedures)
  • we might guess that the LOB is using the default pctversion 10, but if it’s using the retention then the undo_retention time can make a huge difference to the amount of space holding “deleted but retained” lobs.

So, for the OP, a first stab at the code to estimate the expected number of blocks in the LOB segment would be:

with blob_lengths as (
        select  dbms_lob.getlength(b1) blob_len
        from    t1
),
blobs_ool as (
        select  blob_len
        from    blob_lengths
        where   blob_len > 3964  -- Corrected Jan 2023 (11g boundary)
)
select
        sum(ceil(blob_len/8132)) blob_blocks
from
        blobs_ool
/

You’ll notice I’ve deliberately used the label “blob” everywhere in the hope that this will be a reminder that the code won’t apply to CLOB columns with their 2-byte fixed-width character set.

Starting with this figure we then need to consider the “undo” for LOBs so (assuming the OP has created a basicfile with the default pctversion (i.e. 10) allowing 10% of the total space to be old copies of the BLOB values) the number of blocks has to be multiplied by 10/9; then, since the metadata for the segment will take about 1% of the segment’s blocks, multiply by 101/100. So scale the query result up by 101/90 to get the “expected” storage requirement.

To compare this estimate with results from dbms_space.space_used() we need to pick the right overload of procedures; two of them are for securefile lob segments only, the third (initially the only option and the one we want now) was generically for ASSM segments.

I’ve published a sample of one of the securefile options here, and the generic ASSM code that we need is part of the script here. Our estimate (excluding the 1% metadata) should be a reasonable match for the “Full blocks” reported by this procedure call.

Since this is a basicfile LOB (and especially since it’s on 11.2.0.4) we may find a massive discrepancy due to some serious defects in the internal code which can be triggered by the way the LOB is used – in which case it’s worth reading the series I wrote 5 years ago modelling a problem (mostly about time spent, in fact) a client had with basicfile LOB deletion.

Summary Points

You shouldn’t be using basicfile LOBs on any recent version of Oracle

Check exactly when your type of LOB goes “out of row”, and remember the difference between Blobs and Clobs when it comes to using length() or getlength() on them.

When summing lengths remember to ignore LOBs that will (probably) be stored “in row”, and don’t forget that even a couple of bytes stored “out of row” require a full “chunk” (not “block”).

The “undo data” for a LOB value is simply the previous copy of the value, left in place. The total volume of “undo” can be limited by pctversion for basicfile lobs (but is silently ignored by securefiles). Both types of lob will obey retention but the only (implicit) option for basicfiles is “auto” which means “the same as undo_retention” and requires automatic undo management to be enabled. This setting for retention can result in a very large amount of old lob data being kept.

If you’re using basicfile LOBs (especially with older versions of Oracle) the mechanism for re-using expired LOB values has some defects that can result in catastrophic behaviour that introduce performance and space problems.

10 Comments »

  1. […] LOB Space (July 2022): considerations when comparing dbms_space.space_usage() with sum(dbms_lob.getlength()) […]

    Pingback by LOB Catalogue | Oracle Scratchpad — July 20, 2022 @ 5:15 pm BST Jul 20,2022 | Reply

  2. […] LOB Space (July 2022): considerations when comparing dbms_space.space_usage() with sum(dbms_lob.getlength()) […]

    Pingback by Troubleshooting catalogue | Oracle Scratchpad — July 20, 2022 @ 5:17 pm BST Jul 20,2022 | Reply

  3. You said “CLOBs are stored using a two-byte fixed width character set” but that is only true for certain character sets. With WE8MSWIN1252 CLOBs are stored using a one byte character set and with AL32UTF8 they are stored two bytes per character based on my tests and Oracle Doc ID 257772.1.

    Comment by bobbydurrettdba — January 30, 2023 @ 9:54 pm GMT Jan 30,2023 | Reply

    • Bobby,

      Thanks for highlighting the error. As I said “… it’s hard to write well about even such a tiny topic.”

      I’ve added a correction to the note.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — January 30, 2023 @ 11:24 pm GMT Jan 30,2023 | Reply

      • Could not agree more about how hard it is to write a correct blog post. I find it overwhelming. In my current situation it looks like the max size for an inline clob on my system is 3964 bytes instead of the 3898 that you list. My system is 11.2.0.2 HP Unix Itanium WE8MSWIN1252 character set. I hope to post my test scripts and output soon if I can figure out how to write it up. By the way, I found this post and the ones it links too very helpful in my current situation.

        Comment by bobbydurrettdba — January 30, 2023 @ 11:45 pm GMT Jan 30,2023 | Reply

        • Bobby,

          There is a note in the text that says: it’s set to enable storage in row, so we have to ignore BLOBs with lengths up to 3,898 bytes (for 11.2.0.4 – it would be 3,964 for basicfile blobs in 19c.)

          Since you’re using a Basicfile CLOB I’m now wondering if in 11.2.0.x there’s a difference in storage between Clob and Blob columns or whether I just made a mistake (though since I made a point of stating the difference that seems a little unlikely).

          Regards
          Jonathan Lewis

          Comment by Jonathan Lewis — January 31, 2023 @ 1:04 pm GMT Jan 31,2023

  4. For what its worth I tried to find the cutoff for inline blobs and I am getting 3964 bytes on 11.2 and 19c. I tried HP Unix and Linux for 11.2.

    Comment by bobbydurrettdba — January 31, 2023 @ 4:48 pm GMT Jan 31,2023 | Reply

    • Bobby,

      It’s taken a little while, but I’ve discovered why I made a mistake – so I’m going to have to correct three or four blog notes, and publish a new one. The “in-row LOBs” note had an elegant little test that showed when the “sys_op_opnsize()” of a lob column would change from (just under, sometimes) 4,000 bytes to a a very small value that I assumed was whatever Oracle thought the LOB Locator was.

      I didn’t do any blockdumps to confirm this assumption; but when I checked the block dumps this evening I found that Oracle (in 11.2.0.4) was giving me figures which were about 66 bytes larger than the actual length of the column dump – which included the LOB locator – that was actually being stored.

      Many thanks,
      Jonathan Lewis

      Comment by Jonathan Lewis — January 31, 2023 @ 7:46 pm GMT Jan 31,2023 | Reply

  5. […] present this is just a place holder to remind me to finish commenting on (and correcting) a mistake I made when I wrote a note about the number of bytes of data you could […]

    Pingback by Lob Space redux | Oracle Scratchpad — February 1, 2023 @ 1:57 pm GMT Feb 1,2023 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: