Oracle Scratchpad

February 11, 2016

Quiz Night

Filed under: 12c,Infrastructure,LOBs,Oracle,Statistics — Jonathan Lewis @ 3:28 pm BST Feb 11,2016

I was setting up a few tests on a copy of 12.1.0.2 recently when I made a mistake creating the table – I forgot to put in a couple of CAST() calls in the select list, so I just patched things up with a couple of “modify column” commands. Since I was planning to smash the table in all sorts of ways and it had taken me several minutes to create the data set (10 million rows) I decided to create a clean copy of the data so that I could just drop the original table and copy back the clean version – and after I’d done this I noticed something a little odd.

Here’s the code (cut down to just 10,000 rows), with a little output:

--
--      Script:  12c_vc32767_b.sql
--

create table t1 (
        id      not null,
        date_open,
        date_closed,
        deal_type,
        client_ref,
        small_vc,
        padding
)
nologging
as
select
        rownum                                          id,
        trunc(
                add_months(sysdate, -120) +
                        (rownum)* 3652 / 1e7
        )                                               date_open,
        trunc(
                add_months(
                        add_months(sysdate, -120) +
                                (rownum) * 3652 / 1e7,
                        12 * trunc(dbms_random.value(1,6))
                )
        )                                               date_closed,
        dbms_random.string('U',1)                       deal_type,
        dbms_random.string('U',4)                       client_ref,
        lpad(rownum,10)                                 vc_small,
        rpad('x',100,'x')                               vc_padding
from
        dual
connect by
        level <= 10000
;
alter table t1 modify deal_type varchar2(1);
alter table t1 modify client_ref varchar2(4);

create table t2 nologging as select * from t1;

begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                method_opt=>'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                user,
                't2',
                method_opt=>'for all columns size 1'
        );
end;
/

select table_name, num_rows, blocks, avg_row_len from user_tables;

TABLE_NAME             NUM_ROWS     BLOCKS AVG_ROW_LEN
-------------------- ---------- ---------- -----------
T1                        10000        304         139
T2                        10000        218         139

There’s no hidden code – this is the code I ran, and the tables both went into the same tablespace- yet one table has used about 30% more blocks than the other on a simple “create as select”, even though the row lengths and row counts are the same!

When I first did this test it just so happened that the only place in the database I had to make the t2 copy was in a tablespace using freelist management, and I’d created the original table in a tablespace using ASSM, so I wasted a little time examining the ASSM space management (bitmap) blocks before I realised what had happened.

Suggestions about what has happened, and how to confirm your hypothesis, invited in the comments.

Update (following morning)

The explanation is in the comments: I had enabled extended (32,767 byte) varchar2().  My calls to dbms_random.string() had returned character values, and the return type defaults to the maximum character length, which meant they were declared as varchar2(32767) and those are implemented as (unhackable) CLOBs in 12.1.0.2.

When I modified the table to change the varchar2(32767) to shorter lengths Oracle let me get away with it because none of the actual stored values at that time was longer than my target declaration – but it didn’t physically change the stored column values to “short” varchar2(), it left them as CLOBs, and a “short” CLOB includes a LOB Locator which starts out at about 30 bytes.

When I created t2 from t1 Oracle first copied the (short) column definitions, and then copied the data, reading and converting the CLOBs to normal varchar2() storage, and that explains why t2 needed fewer blocks.  The avg_row_len for the two tables matched because the code to gather stats simply applies the sys_op_opnsize() function to the varchar2() defined in t1, and doesn’t give you any indication about the LOB locator.

(If you use the extremely deprecated analyze command to gather stats on the two tables you’ll find that the avg_row_len of table t1 then allows for the LOB locator size.)

“Workaround:”

I should have used:  cast(dbms_random.string(‘U’,4) as varchar2(4)) and the column definition would have been a proper varchar2(4) from the start.  (As noted in a comment from Ivica Arsov below, substr(…, 1, 4) would also have worked.  Of course I should have used cast() for the numerics as well so that I could declare them as things like number(8,0) rather than the slightly dangerous “number”.

Note:  I’ve put “workaround” in quotes because it’s not really a workaround to a problem – it’s just an example of doing it right.

Footnote:

Here’s a symbolic dump of a single character (‘T’) being stored as a varchar2(32767):


LOB
Locator:
  Length:        84(31)
  Version:        1
  Byte Length:    1
  LobID: 00.00.00.01.00.00.01.6b.ba.d7
  Flags[ 0x01 0x0c 0x00 0x80 ]:
    Type: BLOB
    Storage: SecureFile
    Characterset Format: IMPLICIT
    Partitioned Table: No
    Options: ReadWrite
  SecureFile Header:
    Length:   11
    Old Flag: 0x48 [ DataInRow SecureFile ]
    Flag 0:   0x90 [ INODE Valid ]
    Layers:
      Lengths Array: INODE:5
      INODE:
        00 00 01 01 54

Note the 84(31) at line 3: this is a reminder of how big a LOB locator could get for an out of line LOB, compared to the current size of the LOB locator. Rows that hold out of line LOBs can actually be much longer than avg_row_len tells you.

Addendum

Another side effect of enabling extended varchar2() types is the surprise you get when you use error logging in SQL (i.e. syntax like: “insert into t1 select … log errors”). When you create the “clone” table for the log it changes every column from the original into a varchar2(32767) e.g.:

desc target
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 ID                                                                                NUMBER
 N1                                                                                NUMBER
 N2                                                                                NUMBER
 SMALL_VC                                                                          VARCHAR2(10)
 PADDING                                                                           VARCHAR2(100)


begin
        dbms_errlog.create_error_log('TARGET');
end;
/

desc err$_TARGET
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 ORA_ERR_NUMBER$                                                                   NUMBER
 ORA_ERR_MESG$                                                                     VARCHAR2(2000)
 ORA_ERR_ROWID$                                                                    ROWID
 ORA_ERR_OPTYP$                                                                    VARCHAR2(2)
 ORA_ERR_TAG$                                                                      VARCHAR2(2000)
 ID                                                                                VARCHAR2(32767)
 N1                                                                                VARCHAR2(32767)
 N2                                                                                VARCHAR2(32767)
 SMALL_VC                                                                          VARCHAR2(32767)
 PADDING                                                                           VARCHAR2(32767)

You probably hope that you’re not going to see many rows inserted into the error log table, so the extra space taken up in the table by error rows probably won’t be a problem – but the fact that you get two segments (the LOB and the LOBINDEX) created for every column in the original table might cause a problem – even if they are completely empty.

January 21, 2015

LOB Space

Filed under: ASSM,Infrastructure,LOBs,Oracle — Jonathan Lewis @ 1:26 pm BST Jan 21,2015

Following on from a recent “check the space” posting, here’s another case of the code not reporting what you thought it would, prompted by a question on the OTN database forum about a huge space discrepancy in LOBs.

There’s a fairly well-known package called dbms_space that can give you a fairly good idea of the space used by a segment stored in a tablespace that’s using automatic segment space management. But what can you think when a piece of code (written by Tom Kyte, no less) reports the following stats about your biggest LOB segment:


Unformatted Blocks .....................             107
FS1 Blocks (0-25)  .....................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               0
Full Blocks        .....................         859,438
Total Blocks............................       1,746,304
Total Bytes.............................  14,305,722,368

Of the available 1.7M blocks approximately 890,000 seem to have gone missing!

I tend to think that the first thing to do when puzzled by unexpected numbers is to check for patterns in the arithmetic. First (though not particularly interesting) the LOB segment seems to be using the standard 8KB blocksize: 1,746,304 * 8192 = 14,305,722,368; more interestingly, although only approximately true, the number of full blocks is pretty close to half the total blocks – does this give you a hint about doing a little test.


create table test_lobs (
        id              number(5),
        text_content    clob
)
lob (text_content) store as text_lob(
        disable storage in row
        chunk 32K
        tablespace test_8k_1m_assm
)
;

begin
        for i in 1..1000 loop
                insert into test_lobs values(
                        i, 'x'
                );
                commit;
        end loop;
end;
/

I’ve created a table with a LOB segment, storing LOBs out of row with a chunk size of 32KB in a tablespace which (using my naming convention) is locally managed, 8KB blocksize, uniform 1MB extents, using ASSM. So what do I see if I try to check the space usage through calls to the dbms_space package ? (There’s some sample code to do this in the comments of the blog I linked to earlier, but I’ve used some code of my own for the following – the first figure reported is the blocks, the second the bytes):


====
ASSM
====

Unformatted                   :      119 /    3,899,392
Freespace 1 (  0 -  25% free) :        0 /            0
Freespace 2 ( 25 -  50% free) :        0 /            0
Freespace 3 ( 50 -  75% free) :        0 /            0
Freespace 4 ( 75 - 100% free) :        0 /            0
Full                          :    1,000 /   32,768,000

=======
Generic
=======
Segment Total blocks: 4224
Object Unused blocks: 0

Apparently the segment has allocated 4,224 blocks, but we’ve only used 1,000 of them, with 119 unformatted and 3,105 “missing”; strangely, though, the 1,000 “Full blocks” are simultaneously reported as 32,768,000 bytes … and suddenly the light dawns. The dbms_space package is NOT counting blocks, it’s counting chunks; more specifically it’s counting “bits” in the bitmap space management blocks for the LOB segment and (I think I’ve written this somewhere, possibly as far back as Practical Oracle 8i) the bits in a LOB segment represent chunks, not blocks.

Conclusion:

The OP has set a 16KB chunksize with a 8KB block size. His numbers look fairly self-consistent: 1,746,304 –  (2 * 859,438) – 107 = 27,321; the difference is about 1.6% of the total allocation, which is in the right ballpark for the space management blocks, especially if the segment is in a tablespace using with 1MB uniform extents.

 

August 19, 2014

LOB Length

Filed under: Infrastructure,LOBs,Oracle,Performance — Jonathan Lewis @ 6:06 pm BST Aug 19,2014

It’s funny how you can make little savings in work all over the place in Oracle if you’re prepared to look a little closely at what’s going on. Here’s a quirky little example with LOBs and function calls that might just have some greater relevance in other situations. Here’s a little data set, and two queries that I might run against it:


create table tbl(
	c1      clob
)
lob (c1) store as c_lob(
	disable storage in row
	nocache nologging
)
;

begin
	for i in 1..128 loop
		insert into tbl values(rpad('x',4000));
		commit;
	end loop;
end;
/

commit;

-- collect stats, prime dictionary cache and library cache
-- run both queries twice and check stats on second run

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		/*+ no_merge */
		dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		rownum rn, dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

The question that you might ask yourselves when you see these queries is: will they do similar amounts of work. Of course, I wouldn’t be asking the question if the answer were yes. Despite the no_merge() hint, which you might think would have the same effect as the rownum approach, Oracle seems to execute the call to dbms_lob.getlength() twice for each row in the first query, but only once per row for the second query. Here are the stats (from autotrace) on the second run of the two queries when autotrace is enabled:


Statistics (for no_merge)
----------------------------------------------------------
         40  recursive calls
          0  db block gets
        271  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Statistics (for rownum)
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        131  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

As you can see, the consistent gets for the no_merge() approach is roughly double that for the rownum approach – and since we have 128 rows/LOBs in the table that looks suspiciously like 2 gets vs. 1 get per LOB depending on the approach – which suggests two calls to the function. This is further corroborated by the execution plans, and especially by the predicate sections (how often have I said “always check the predicates”) which show that the predicate has been pushed inside the view that’s been hinted to be non-mergeable, but it hasn’t been pushed inside the view that uses the rownum instantion trick:


Execution Plan for no_merge()
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    13 |            |          |
|   2 |   VIEW              |      |     6 |    78 |     2   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| TBL  |     6 |   522 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("DBMS_LOB"."GETLENGTH"("C1")>3960)

Execution Plan for rownum
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE      |      |     1 |    13 |            |          |
|*  2 |   VIEW               |      |   128 |  1664 |     2   (0)| 00:00:01 |
|   3 |    COUNT             |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| TBL  |   128 | 11136 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("LEN">3960)

My first thought on seeing this difference was to apply the /*+ no_push_pred */ hint to block predicate pushing – but then I remembered that the hint is about join predicate pushing and this is a simple filter push. A quick search of the hidden parameters, though, revealed this:

_optimizer_filter_pushdown : enable/disable filter predicate pushdown

Setting this parameter to false – either through a call to ‘alter session’ or through an /*+ opt_param( opt_param(‘_optimizer_filter_pushdown’ , ‘false’) */ hint – allowed the no_merge approach to produce the same plan and resource usage as the rownum approach. Of course, for a production system, I’d probably use the rownum approach rather than mess around with hidden parameters.

Footnote:

I don’t know why the code with the no_merge() approach reported 40 recursive calls (on its first execution with autotrace). A couple of variations on the experiment suggested that it had something to do with the number of rows (or consequential buffer visits) that survived the predicate call – for a sufficiently small number of rows the recursive call count happened to drop to zero; but the phenomenon needs further investigation.

Reference Script: lob_oddity.sql

May 29, 2014

Securefiles

Filed under: Bugs,Infrastructure,LOBs,Oracle — Jonathan Lewis @ 6:15 pm BST May 29,2014

A few weeks ago someone emailed me about a problem they had importing securefiles – it was very slow. Such things are never easy to address by email, of course, but there were three features to consider: (a) it was securefiles, (b) it was impdp, and (c) it was across a database link. If you read my blog regularly you’ll have seen me comment a few times that the easiest way to break Oracle is to mix a few features – so

  • securefiles and impdp (I know why LOBs generally appear to be slow to import, was it “LOBs” or specifically securefile LOBs)
  • securefiles and database links (db links are always slower than local actions – easy to do a comparative test)
  • impdp and database links (probably not, they’re supposed to work very well together in general – could do a local/remote comparison)
  • impdp with securefiles across a database link (easy enough to factor out the database link)

It was just a brief email, and I didn’t have an answer offhand, so I pointed out that there were a few bugs on MoS about impdp and LOBs and left it at that. A couple of days ago I got a follow-up email telling me that the problem was Bug 13609098 : IMPORTING SMALL SECUREFILE LOBS USING DATA PUMP IS SLOW.

There are two reasons for writing this note – the first, of course, is just to publicise the bug because I’ve seen three of four complaints over the Internet about slow imports with LOBs  and maybe a couple of those were actually “small securefile LOBs”; and then it’s possible that there are other people who haven’t even realised that their imports could be running faster.

The second reason, though, is to highlight a viewpoint that leaves me approaching Oracle features with extreme caution: this looks like the sort of bug that many people should have noticed, but the first reference is Jan 2012, and the earliest patch seems to be dated Oct 2013 – 22 months later! There could be various reasons for the long gap – but the one that always comes to my mind first in cases like this is: “are there so few people using ‘feature X’ that this bug stayed near the bottom of the todo list for a long time ?” – followed by the slightly less alarmist “maybe there are quite a lot of people, but very few have noticed” and “but the specific combination is, perhaps, just a little unlikely”. If there really are very few people using the feature then I’m not going to be keen to advise a client to take it on without doing an extremely careful set of tests – at scale – of everything they’re likely to do with the feature. I don’t want something to break after go-live and find that it take weeks to identify the root cause and months to fix.

In this particular case I’ll believe that the combination of Securefile LOBs (“large” objects) that were actually small and in large numbers is significant. I’m prepared to assume that the customer base using Securefiles is a reasonable size but the subset who hit this combination is a small fraction of the whole;  and that means I won’t be quite so paranoid about suggesting Securefiles as an option to a client – though I’d still insist on modelling any special cases that their requirements might highlight.

Footnote

The bug is fixed in 12.2 with several backports to 11.2.0.x for different platforms.

 

 

 

March 2, 2014

Auto Sample Size

Filed under: Function based indexes,Indexing,Infrastructure,IOT,LOBs,Oracle,Statistics — Jonathan Lewis @ 6:38 pm BST Mar 2,2014

In the past I have enthused mightily about the benefits of the approximate NDV mechanism and the benefit of using auto_sample_size to collect statistics in 11g; however, as so often happens with Oracle features, there’s a down-side or boundary condition, or edge case. I’ve already picked this up once as an addendum to an earlier blog note on virtual stats, which linked to an article on OTN describing how the time taken to collect stats on a table increased dramatically after the addition of an index – where the index had this definition:


create bitmap index i_s_rmp_eval_csc_msg_actions on
    s_rmp_evaluation_csc_message (
        decode(instr(xml_message_text,' '),0,0,1)
    )
;

As you might guess from the column name, this is an index based on an XML column, which is stored as a CLOB.

In a similar vein, I showed you a few days ago an old example I had of indexing a CLOB column with a call to dbms_lob.getlength(). Both index examples suffer from the same problem – to support the index Oracle creates a hidden (virtual) column on the table that can be used to hold statistics about the values of the function; actual calculated values for the function call are stored in the index but not on the table itself – but it’s important that the optimizer has the statistics about the non-existent column values.

(more…)

February 21, 2014

Indexing LOBs

Filed under: Function based indexes,Indexing,Infrastructure,LOBs,Oracle — Jonathan Lewis @ 6:42 pm BST Feb 21,2014

Many years ago, possibly when most sites were still using Oracle 8i, a possible solution to a particular customer problem was to create a function-based index on a CLOB column using the dbms_lob.getlength() function call. I can’t find the notes explaining why this was necessary (I usually have some sort of clue – such as the client name – in the script, but in this case all I had was a comment that “the manuals say you can’t do this, but it works provided you wrap the dbms_lob call inside a deterministic function”).

I never worked out why the dbms_lob.getlength() function wasn’t declared as deterministic – especially since it came complete with a most restrictive restricts_references pragma – so I had just assumed there was probably some good reason based on strange side effects when national language charactersets came into play. But here’s a little detail I noticed recently about the dbms_lob.getlength() function: it became deterministic in 11g, so if the client decided to implement my suggestion (which included the usual sorts of warnings) it’s now legal !

Footnote – the length() function has been deterministic and usable with LOBs for a long time, certainly since late 9i, but in 8i length(lob_col) will produce Oracle error “ORA-00932: inconsistent datatypes”

January 6, 2014

LOB changes

Filed under: Infrastructure,LOBs,Oracle,Troubleshooting — Jonathan Lewis @ 7:10 pm BST Jan 6,2014

It’s always useful to collect baseline information – especially when it helps you notice that the baseline has moved in a way that might explain the next performance problem you see. Here’s an example demonstrating the benefit.

I have a table with a LOB column that is stored out of line. Many years ago I decided that I wanted to compare how the redo generation varied as I change the LOB from cached to nocache (with nologging). So here was one of my simplest test scripts (stripped to a minimum):

(more…)

June 19, 2013

Wasted Space

Filed under: compression,fragmentation,Infrastructure,LOBs,Oracle — Jonathan Lewis @ 9:55 am BST Jun 19,2013

Here’s a little quiz: If I take the average row length of the rows in a table, multiply by the number of rows, and convert the result to the equivalent number of blocks, how can the total volume of data in the table be greater than the total number of blocks below the table high water mark ? I’ve got three tables in a schema, and they’re all in the same (8KB block, 1M uniform extent, locally managed) tablespace, but here’s a query, with results, showing their space utilisation – notice that I gather schema stats immediately before running my query:

(more…)

March 22, 2013

LOB Update

Filed under: Infrastructure,LOBs,Oracle — Jonathan Lewis @ 10:36 pm BST Mar 22,2013

This note is about a feature of LOBs that I first desribed in “Practial Oracle 8i” but have yet to see used in real life. It’s a description of how efficient Oracle can be, which I’ll start with a description of, and selection from, a table:
(more…)

August 27, 2012

Fragmentation ?

Filed under: fragmentation,Infrastructure,LOBs,Oracle — Jonathan Lewis @ 5:15 pm BST Aug 27,2012

Here’s a simple piece of SQL that could, in theory, compare the current size of  a table with the size it might be after a call to “alter table move” – and it’s followed by the results for a table that’s current in the database that I’m looking at:

select
	blocks, num_rows, avg_row_len, pct_free,
	ceil(num_rows * avg_row_len / (8000 * ((100 - pct_free)/100))) blocks_needed
from
	user_tables
where
	table_name = 'T1'
;

    BLOCKS   NUM_ROWS AVG_ROW_LEN   PCT_FREE BLOCKS_NEEDED
---------- ---------- ----------- ---------- -------------
        25       1000          22         10             4

(more…)

July 9, 2009

Concatenating LOBs

Filed under: Infrastructure,LOBs,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 6:24 pm BST Jul 9,2009

If you have to handle LOBs, it’s worth checking for “unusual” activity. Here’s an example of unexpected behaviour that I came across a couple of years ago.

The client had a table with data that had to be written to a flat file so that a number of other databases could import it using SQL*Loader. The table definition and the query to dump the data are shown below – note, particularly, the CLOB sitting in the middle of the table:

(more…)

November 19, 2008

Lateral LOBs

Filed under: Infrastructure,lateral view,LOBs,Oracle — Jonathan Lewis @ 10:20 pm BST Nov 19,2008

The following request appeared on the comp.databases.oracle.server newsgroup a few days ago:

Is it possible to create a view that will split single CLOB column into multiple chunks of VARCHAR2 something like this:

TABLE:
---------------------------
ID              INTEGER
DATA            CLOB

VIEW:
--------------------------------------------------
ID              INTEGER
CHUNK_ID        INTEGER
DATA            VARCHAR(1000 char)

(more…)

May 11, 2007

LOB sizing

Filed under: Infrastructure,LOBs,Oracle,Performance,Tuning — Jonathan Lewis @ 7:29 pm BST May 11,2007

Some time ago, I was asked to take a quick look at an application that had to handle a lot of LOBs. The LOB-specific part of the application was actually quite simple – contracts were stored as LOBs – but only for occasional visual reference; all the “structured” information from the contract was extracted and stored in relational tables. Some time after a contract had expired, the LOB could be deleted to reclaim space (in theory).  Historically, the client had purged a load of LOBs from time to time, but didn’t have a deliberate house-keeping task to do the job on a regular basis.

(more…)

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 6,206 other followers