Oracle Scratchpad

August 30, 2012

Surprises

Filed under: Bugs,Oracle,trace files,Troubleshooting — Jonathan Lewis @ 5:55 pm BST Aug 30,2012

I’ve given examples in the past of how you can be suprised by Oracle when a mechanism that has “always worked” in the past suddenly just doesn’t work because some unexpected internal coding mechanism hits a boundary condition. One example I gave of this was rebuilding indexes online – where a limitation on the key size of index organized tables made it impossible to do an online rebuild of an index on a heap table because of an ORA-01450 (maximum key length) error that was raised against the (index-organized) journal table that  Oracle creates internally to support the rebuild.

Slightly more recently the topic of the sys_op_map_nonnull() function came up in some comments about the ora_hash function – and I made a passing comment about the function appearing in materialized view refreshes and then never followed it up. Fortunately I got an email from Jared Still and Don Seiler a few days ago which contained a nice short demonstration of the problem so, with their permission, I’ve published it below.

We start by creating a table with a varchar2(4000) column – and in my case I can do this because my database character set is a single-byte character set – then build a materialized view log on the table:

rem
rem     Script:         fast_refresh_bug_2.sql
rem     Author:         Jared Still and Don Seiler
rem     Dated:          Aug 2012
rem

create table frbug
as
select
        t.owner,
        t.table_name,
        cast('x' as varchar2(4000)) data
from
        all_tables t
where
        rownum <= 2000 -- > comment to avoid wordpress format issue
;

alter table frbug modify data not null;

create index frbug_pk_idx on frbug(owner,table_name);
alter table frbug add constraint frbug_pk primary key (owner,table_name);

create materialized view log on frbug
with primary key, sequence, rowid
(data)
including new values
;

You’ll notice that the code creates a non-unique index to support the primary key constraint. I haven’t checked to see if this makes any difference to the outcome of the demonstration but anyone who wants to repeat the test can do this.

After creating the log the next piece of code creates an aggregate materialized view on a pre-built table.

create table frbug_mv
as
select
        owner,
        table_name,
        data,
        1 mvcount
from
        frbug
;

create index frbug_mv_pk_idx on frbug_mv(owner,table_name);
alter table frbug_mv add constraint frbug_mv_pk primary key (owner,table_name);

create materialized view frbug_mv
on prebuilt table
refresh fast
enable query rewrite
as
select
        owner, table_name, data, count(*) mvcount
from
        frbug
group by
        owner, table_name, data
;


The primary key constraint on the materialized view is also protected by a non-unique index but in this case this is a requirement of Oracle’s implementation – if you support the primary (or unique) constraint with a unique index then a fast refresh may, in general, end up generating a “duplicate key in index” error.

Finally we insert a row into the base table and call for a fast refresh:

define m_length = 4000

insert into frbug(owner, table_name, data)
values('JKSTILL','BIG_TABLE',rpad('is this too big?', &m_length,'X'))
;

commit;

exec dbms_mview.refresh('FRBUG_MV', method => 'F')

*
ERROR at line 1:
ORA-12008: error in materialized view refresh path
ORA-01706: user function result value was too large
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2254
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2460
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2429
ORA-06512: at line 1


You’ll note that in my insert statement I’ve used a substitution variable ( &m_length) to dictate the length of the string I insert for column data. If I set this to 4,000 the refresh fails; any smaller value and the refresh succeeds. (Note: the line numbers reported in the ORA-06512 errors will vary with version of Oracle, but the ORA-01706 error is always the same.)

You can find out what’s gone wrong if you re-run the refresh with sql_trace enabled. In the trace file you’ll find a statement like the following (which I’ve tidied up considerably):

MERGE INTO TEST_USER.FRBUG_MV SNA$
USING   (
        SELECT   /*+  OPT_ESTIMATE(QUERY_BLOCK MAX=1000)   */
                DLT$0.OWNER GB0, DLT$0.TABLE_NAME GB1, DLT$0.DATA GB2, SUM(1) D0
        FROM    (
                SELECT
                        CHARTOROWID(MAS$.M_ROW$$) RID$, MAS$.DATA, MAS$.TABLE_NAME, MAS$.OWNER
                FROM
                        TEST_USER.MLOG$_FRBUG MAS$
                WHERE
                        MAS$.SNAPTIME$$ > :1
                ) AS OF SNAPSHOT (:2)  DLT$0
        GROUP BY
                DLT$0.OWNER, DLT$0.TABLE_NAME, DLT$0.DATA
        )AV$
ON
        (
                SYS_OP_MAP_NONNULL(SNA$.OWNER)      = SYS_OP_MAP_NONNULL(AV$.GB0)
        AND     SYS_OP_MAP_NONNULL(SNA$.TABLE_NAME) = SYS_OP_MAP_NONNULL(AV$.GB1)
        AND     SYS_OP_MAP_NONNULL(SNA$.DATA)       = SYS_OP_MAP_NONNULL(AV$.GB2)
        )
WHEN MATCHED THEN
        UPDATE SET SNA$.MVCOUNT = SNA$.MVCOUNT+AV$.D0
WHEN NOT MATCHED THEN
        INSERT (SNA$.OWNER, SNA$.TABLE_NAME, SNA$.DATA,SNA$.MVCOUNT)
        VALUES (AV$.GB0, AV$.GB1, AV$.GB2, AV$.D0)
;

Spot the problem: it’s the use of the sys_op_map_nonnull() function. As it says in the comments I mentioned above, this function adds one byte to the existing value – so if the input is already 4,000 bytes long the output exceeds the legal limit and results in a ORA-01706. (Try to select sys_op_map_nonnull(rpad(‘x’,4000,’x’)) from dual if you want a simpler demonstration.)

The worrying thing about this particular example is that you could run a production system with these definitions and everything could work perfectly for ages, until the day that someone inserted data that used the full 4,000 byte length of the column declaration – and then your materialized view can’t be refreshed.

And here’s an annoying detail that applies in this case – I altered the data column to declare it as not null, which made the sys_op_map_nonnull() redundant (but is was called anyway). That’s a drawback of reusable code – sometimes you re-use it when perhaps you ought to “special case” it.

Footnote:

It is possible that Oracle has some code that decides whether or not to use the merge command to do the refresh – it’s possible, though I haven’t checked it, that Oracle could choose between merge and a “delete/insert” cycle; if you find that Oracle doesn’t crash when you try this test with the 4,000 byte character string then check the trace file to see whether Oracle has used a different strategy for the refresh.

Teaser:

For an even stranger behaviour relating to fast refresh materialized views, see next week’s exciting episode.

Update (Feb 2021)

The problem is still present in 19.3, though the stack produced has changes quite a bit:

ORA-12008: error in materialized view or zonemap refresh path
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3012
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2424
ORA-01706: user function result value was too large
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 88
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 253
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2405
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 2968
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3255
ORA-06512: at "SYS.DBMS_SNAPSHOT_KKXRCA", line 3287
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 16
ORA-06512: at line 1

Note, however, that we still see the ORA-12008 (though its text has changed to include the option of Zone maps), and the ORA-01706.

Update (August 2023)

The problem is still present in 23c (Free).

It’s also worth noting that if you’ve set max_string_size = extended then the test has to be modified to create a varchar2(32767) rather than varchar2(4000), and change the rpad() similarly.

 

11 Comments »

  1. There are one more funny thing with sys_op_map_nonnull – you can create table with RAW(4000) column and use it succesfully:

    Connected to Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 
    Connected as test_user
     
    SQL> 
    SQL> begin
      2    execute immediate 'create table t2001 as select sys_op_map_nonnull(cast(null as raw(2000))) xxx from dual';
      3    for i in 1 .. 1999 loop
      4      execute immediate 'create table t'||(2001+i)||' as select sys_op_map_nonnull(xxx) xxx from t'||(2000+i);
      5      execute immediate 'drop table t'||(2000+i);
      6    end loop;
      7  end;
      8  /
     
    PL/SQL procedure successfully completed
     
    SQL> select dbms_metadata.get_ddl('TABLE', 'T4000') from dual;
     
    DBMS_METADATA.GET_DDL('TABLE',
    --------------------------------------------------------------------------------
     
      CREATE TABLE "TEST_USER"."T4000" 
       (	"XXX" RAW(4000)
       ) SEGMENT CREATION IMMEDIATE 
      PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
      STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
      PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
      TABLESPACE "USERS" 
     
     
    
    SQL> declare
      2    raw4000 raw(4000) := hextoraw(lpad('0',8000,'0'));
      3  begin
      4    dbms_output.put_line(utl_raw.length(raw4000));
      5    insert into T4000 values (raw4000);
      6  end;
      7  /
     
    PL/SQL procedure successfully completed
     
    SQL> select utl_raw.length(xxx) from T4000;
     
    UTL_RAW.LENGTH(XXX)
    -------------------
                   2000
                   4000
    

    Comment by Valentin Nikotin — August 30, 2012 @ 7:52 pm BST Aug 30,2012 | Reply

    • nice one, Valentin !

      Comment by Sokrates — August 31, 2012 @ 8:31 am BST Aug 31,2012 | Reply

    • Valentin,

      Extraordinary – how do you manage to find such bizarre cases ?!

      Comment by Jonathan Lewis — August 31, 2012 @ 9:17 am BST Aug 31,2012 | Reply

    • Please give an example, as raw it limited to 2000 on 11.2.0.3 that I am testing on.

      Comment by Jared — August 31, 2012 @ 9:01 pm BST Aug 31,2012 | Reply

      • Jared,

        If you click on the “show source” icon in the comment the window expands to show the echoed output of a demonstration script. (You might want to add a “purge” to the drop table command.)

        I ran it, and ended up with a table with a column of type raw(4000), and a block dump showed that the internal length of the newly inserted row was actually stored as 4,000 bytes – and yes, a simple “create table” won’t allow you to get past raw(2000).

        Comment by Jonathan Lewis — August 31, 2012 @ 9:12 pm BST Aug 31,2012 | Reply

        • Thanks Jonathan, I didn’t see the source icon. Interesting stuff, off to play now.

          Comment by Jared — August 31, 2012 @ 10:38 pm BST Aug 31,2012

    • Valentin, this is a very interesting find.
      Questions that come to mind:
      Why does this work?
      Why does Oracle allow creating a 4k raw in with ‘execute immediate’, but not with a straight ‘create table’?
      I suspect a difference in code path is responsible, but it still seems Oracle is imposing a somewhat artificial limit on RAW columns when a straight ‘create table’ is used.

      Not that I am expecting anyone to answer these directly, these are just the things that come to mind.

      Here is a slight modification to the code to create the 4k RAW column.
      Its only advantage it is quite a bit faster to run than the loop.
      The trick is nested calls to sys_op_map_nonnull to add the extra 2 bytes.

      drop table t4k purge;
      
      declare
         r1 raw(2000);
         r2 raw(2000);
         v_sql varchar2(10000);
      begin
          select cast(rpad('0',2000,'0') as raw(2000)) into r1 from dual;
          select cast(rpad('0',1998,'0') as raw(1998)) into r2 from dual;
          v_sql := 'create table t4k as select sys_op_map_nonnull(sys_op_map_nonnull(''' || r1||r2 || ''')) xxx from dual';
          dbms_output.put_line(v_sql);
          execute immediate v_sql;
      end;
      /
      
      desc t4k
      

      Comment by Jared — September 1, 2012 @ 12:02 am BST Sep 1,2012 | Reply

      • Ok, commenting on my own post here.

        I traced both a standard create statement and the execute immediate create.

        The trace for execute immedate revealed this:

        PARSING IN CURSOR #47836159092840 len=4081 dep=1 uid=90 oct=1 lid=90 tim=1346458078205323 hv=2343315936 ad='aaa2dee8' sqlid='88qs6bf5usag0'
        create table t4k as select sys_op_map_nonnull(sys_op_map_nonnull('000000' # truncated for brevity
        END OF STMT
        ...
        EXEC #47836159092840:c=69989,e=85539,p=0,cr=177,cu=85,mis=0,r=1,dep=1,og=1,plh=2781518217,tim=1346458078290968
        STAT #47836159092840 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD AS SELECT  (cr=113 pr=0 pw=1 time=36322 us)'
        STAT #47836159092840 id=2 cnt=1 pid=1 pos=1 obj=0 op='FAST DUAL  (cr=0 pr=0 pw=0 time=7 us cost=2 size=0 card=1)'
        CLOSE #47836159092840:c=0,e=8,dep=1,type=0,tim=1346458078291140
        

        The LOAD AS keywords indicate a direct path load.
        Direct path load is what is used by CTAS, so here is a method to create the table directly as DDL:

        
        drop table t4k;
        
        create table t4k
        as
        with data as (
           select
              cast(rpad('0',2000,'0') as raw(2000)) r1
              , cast(rpad('0',2000,'0') as raw(2000)) r2
           from dual
        )
        select  r1||r2 xxx
        from data
        /
        
        desc t4k;
        
        

        Comment by Jared — September 1, 2012 @ 12:30 am BST Sep 1,2012 | Reply

        • Jared,

          Nicely done. So it’s not a specific anomaly due to the use of sys_op_map_nonnull(), it’s a generic feature of a derived raw() value in a CTAS.

          One last tiny, peripheral, touch – if you add ‘where rownum = 0’ to the create statement, table t4k still gets created with a column of type raw(4000), but with no data in it.

          Comment by Jonathan Lewis — September 1, 2012 @ 10:09 am BST Sep 1,2012

  2. Thanks Jonathan, I also noticed that r2 is not needed, so here is the more succinct DDL for those following this:

    
    create table t4k
    as
    with data as (
       select cast(rpad('0',2000,'0') as raw(2000)) r1 from dual
    )
    select  r1||r1 xxx
    from data
    where rownum = 0
    
    

    I also came to the conclusion that sys_op_map_nonnull() is a bit of a red herring, though I had not thought much past that yet as to whether this might be intentional or a bug.

    The explanation for the error “ORA-00910: specified length too long for its datatype” provides a clue that helps ferret out why this may be allowed.

    [jkstill]$ oerr ora 910
    00910, 00000, “specified length too long for its datatype”
    // *Cause: for datatypes CHAR and RAW, the length specified was > 2000;
    // otherwise, the length specified was > 4000.
    // *Action: use a shorter length or switch to a datatype permitting a
    // longer length such as a VARCHAR2, LONG CHAR, or LONG RAW

    Notice that the length restriction pertains to CHAR as well as RAW data types.

    Suspecting that that allowance for 4k data types might have been made for the data dictionary, I ran a query looking for CHAR and RAW data type columns with a length > 2000.

    Output is edited for brevity:

    
    select owner, table_name, column_name, data_type, data_length
    from dba_tab_columns
    where data_type in ('CHAR','RAW')
    and data_length > 2000
    order by 1,2,3
    /
    
    
    OWNER      TABLE NAME                     COLUMN     DATA_TYPE  DATA_LENGTH
    ---------- ------------------------------ ---------- ---------- -----------
    JKSTILL    T4000                          XXX        RAW               4000
    SYS        KU$_10_1_FHTABLE_VIEW          TSTZ_COLS  CHAR              4000
    ...
    SYS        KU$_TABLE_DATA_VIEW            TSTZ_COLS  CHAR              4000
    
    23 rows selected.
    
    

    In this 11.2.0.3 database all of the columns that exceed the 2k limit belong to SYS and are all CHAR(4000), with the exception of the test table.
    In addition, these are all views that are used to track columns that are of type TIMESTAMP WITH TIMEZONE.

    Taking a look at how the view KU$_IOTABLE_DATA_VIEW is created (found in OH/rdbms/admin/catmeta.sql) it is seen that it and all the other views of interest here are created from object types, and use an inline view that selects from a funtion to create the TSTZ_COLS column:

    
      (select sys.dbms_metadata_util.has_tstz_cols(t.obj#) from dual)
    
    

    A little test shows that uses the same method as seen in catmeta.sql to create a test view creates a view with a column of CHAR(4000).

    
    create or replace type char_test_t as object
    (
       tstz_test char(1)
    )
    /
    
    create or replace function f return char
    as
    begin
       return 'Y';
    end;
    /
    
    create or replace force view char_test of char_test_t
    with object OID(1)
    as select (select f from dual)
    from dual
    /
    
    desc char_test
    
    

    Comment by Jared — September 1, 2012 @ 5:25 pm BST Sep 1,2012 | Reply

  3. […] week I published some code that demonstrated how a fast refresh of an aggregate materialized view could fail because of the internal use of the sys_op_map_nonnull() function. In some ways it was an easy […]

    Pingback by Bugs | Oracle Scratchpad — June 15, 2016 @ 8:44 am BST Jun 15,2016 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.