Oracle Scratchpad

April 13, 2021

CSV

Filed under: Infrastructure,Oracle,Problem Solving — Jonathan Lewis @ 3:00 pm BST Apr 13,2021

An old question with a little extra twist came up on the Oracle Developer forum recently – someone wanted to export a table into a CSV file but wanted to do a character set conversion at the same time because various “non-printable” characters were getting corrupted.

Looking at the code they had supplied they were running from SQL*Plus and were trying to use a select statement to generate a script that, for each row in a table, produced a statement of the form:

insert into tableX({list of columns}) values({list of values});

This worked reasonably well when they didn’t attempt to get a UTF8 output, but when they modified their client O/S environment to make SQL*Plus produce the desired UTF8 output the operation was much slower, increasing from about 1 minute to more than half an hour. (This may simply be the overhead of SQL*Net translating each line of output from the server character set to the client character set.)

Since the system in question was running on 12.2 I suggested a change of strategy so that they could at least take advantage of a built-in CSV generator to see how this behaved. Oracle extended the “set markup” command in 12c to include a CSV option:

set markup csv on delimiter {character} quote on

If this behaves nicely with the user’s attempt to get the output in their chosen characterset then a change in strategy to generate a “proper” CSV file and reload it through an external table, rather than generating and executing a list of “insert ….” statements might do a lot less work. (Unfortunately they had a requirement to generate the insert statements rather than adopting this bulk strategy – but I wasn’t going to let that stop me finishing this note.)
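For anyone who hasn’t used the feature, the generating side of that strategy is very short – here’s a minimal sketch (the directory, file name, and table name are just placeholders, not tested against the OP’s system):

set markup csv on quote on
set heading off
set pagesize 0
set feedback off
set trimspool on

spool /mnt/working/markup_csv.lst

select * from t1;

spool off
set markup csv off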

One detail to note with this markup option is that, unlike the “delimiter” character, the “quote” character cannot be specified and will always be the double-quote (ASCII character 34). If this character appears in any of the reported strings Oracle will “escape” it by duplicating it. Here’s a little example to demonstrate the effects – first a little data with a basic output (running 12.2.0.1):

rem
rem     Script:         markup.csv.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sep 2020
rem

create table t1 (v1 varchar2(10), n1 number, n2 number, v2 varchar2(10));

insert into t1 values('abcdef',1,1,'qwer');
insert into t1 values('abc"def',2,null,'asdf');
insert into t1 values (null,3,3,null);

commit;

prompt  ==================
prompt  markup not yet set
prompt  ==================

select * from t1;


V1                 N1         N2 V2
---------- ---------- ---------- ----------
abcdef              1          1 qwer
abc"def             2            asdf
                    3          3

3 rows selected.

I’ve included one row with a double-quote in one of the columns, and a couple of rows with NULLs in various columns. Now I just switch markup to CSV using its default options:

prompt  ==================
prompt  Default CSV markup
prompt  ==================

set markup csv on
select * from t1;


"V1","N1","N2","V2"
"abcdef",1,1,"qwer"
"abc""def",2,,"asdf"
,3,3,

3 rows selected.


As you can see this has behaved nicely – every column is present (even when empty – judging by the number of delimiters), character strings have been quoted with the double-quote character, and the stored double-quote in the second row has been duplicated. The default option for CSV should be nearly perfect for loading through an external table – though we probably want to get rid of column headings, page breaks, feedback, etc.

Before creating a suitable external table, let’s play around with a couple of variations for the options:

prompt  ===========
prompt  delimiter =
prompt  ===========

set heading off
set pagesize 0
set feedback off

set markup csv on delimiter =
select * from t1;

"abcdef"=1=1="qwer"
"abc""def"=2=="asdf"
=3=3=

prompt  ================================
prompt  quote off (delimiter remembered)
prompt  ================================

set markup csv on quote off
select * from t1;

abcdef=1=1=qwer
abc"def=2==asdf
=3=3=

Now to read back one of the outputs – I’ll pick the default output for this (but with headers etc. off to leave just three lines of data). Here’s how I can define my external table to read back the values from the file:

create or replace directory ext_tab as '/mnt/working/';
create or replace directory ext_log as '/mnt/working/';

create table ext(
        v1 varchar2(10), 
        n1 number, 
        n2 number, 
        v2 varchar2(10)
)
organization external(
        type oracle_loader
        default directory ext_tab
        access parameters (
                records delimited by newline 
                discardfile     ext_log:'exttab_dis.txt'
                badfile         ext_log:'exttab_bad.txt'
                logfile         ext_log:'exttab_log.txt'
        fields 
                csv 
                with embedded
        missing field values are null
        )
        location ('markup_csv.lst')
)
reject limit 10
;

select * from ext;

V1                 N1         N2 V2
---------- ---------- ---------- ----------
abcdef              1          1 qwer
abc"def             2            asdf
                    3          3

3 rows selected.

I’ve created a couple of directories to get started – linking a physical location to a logical name that I can use in the external table definition.

The key details that allow Oracle to read the file correctly appear in the “fields” specification: it’s a pity that the output from the default CSV markup doesn’t become the default expectation for the external table CSV option, but I needed to use “with embedded” to handle the duplicated “double-quote” characters and then add “missing field values are null” to cope with a null column at the end of the line.

Summary

With a simple call to “set markup csv on” (and a handful of other set {formatting} commands) it’s very easy to turn a table into a flat CSV file that is comma separated and (double-)quote delimited.

It’s then very easy – but requires two non-default settings – to read this flat file back into an Oracle database using “fields CSV” as the basic formatting directive.

Footnote

I’ve kept the example very simple, but as soon as you start dealing with dates and timestamps – and even very large numbers – you’ll have to worry about formatting when generating the flat file (set nls_date_format, column n1 format 99999, etc.) to make sure you don’t lose precision on the output; and for dates and timestamps you’ll have to include the equivalent date_format options in the external table definition.
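To give a flavour of what I mean – an untested sketch, with format strings and the field name purely illustrative – the generating script might include settings like these, with a matching mask in the external table’s field list:

alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';
column n1 format 999999999999999

rem     and in the external table access parameters, something like:
rem             d1 char(19) date_format date mask "yyyy-mm-dd hh24:mi:ss"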

I’ve only demonstrated how to reload the file back into Oracle as an external table. If you wanted to use this markup option to prepare the data for an Excel spreadsheet all you have to do is spool to a file with the “.csv” extension and Excel will be able to open the output file and parse the data correctly into a multi-column display. (But see comment #1 below for a warning from Jason Bucata.)

February 1, 2021

data_default

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 4:50 pm GMT Feb 1,2021

Here’s a quirky little detail – probably totally irrelevant to virtually everyone – that came up in a question on the Oracle Developer Forum a couple of days ago. It concerns the way Oracle stores and records default values for columns, and it also does a hat-tip to the “commas at the start/end of the line” argument. Here’s a little script to create two identical tables:

create table t1 (
        t1 timestamp default '01-Jan-2021 12:00:00'
,       t2 timestamp default '01-Jan-2021 12:00:00'
,       t3 timestamp default '01-Jan-2021 12:00:00'
)
;


create table t2 (
        t1 timestamp default '01-Jan-2021 12:00:00',
        t2 timestamp default '01-Jan-2021 12:00:00',
        t3 timestamp default '01-Jan-2021 12:00:00')
;

Here’s a query to check that we’ve set the defaults correctly, followed by the output:

break on table_name skip 1
set linesize 180

spool default_length.lst

select
        table_name, column_name, default_length, data_default
from
        user_tab_columns
where
        table_name in ('T1', 'T2')
order by
        table_name,
        column_name
;

TABLE_NAME           COLUMN_NAME          DEFAULT_LENGTH DATA_DEFAULT
-------------------- -------------------- -------------- ----------------------------------
T1                   T1                               23 '01-Jan-2021 12:00:00'
                     T2                               23 '01-Jan-2021 12:00:00'
                     T3                               23 '01-Jan-2021 12:00:00'

T2                   T1                               22 '01-Jan-2021 12:00:00'
                     T2                               22 '01-Jan-2021 12:00:00'
                     T3                               22 '01-Jan-2021 12:00:00'

It would appear that we have the same default values set for the columns – but for table t1 the length of the default values is 23, while for table t2 it’s only 22. How strange – how do we investigate what’s going on?

A check of the view user_tab_columns tells us that data_default is a long column so we can’t dump() it, and we can’t substr() it. We could dump the relevant block from sys.col$, but rather than do that I’ll write a little PL/SQL block that reads the long into a PL/SQL varchar2() and outputs the last byte:

declare
        v1 varchar2(32);
begin
        for r in (
                select  table_name, column_name, default_length, data_default
                from    user_tab_columns
                where   table_name in ('T1','T2')
        ) loop
                v1 := r.data_default;
                dbms_output.put_line(
                        r.table_name || ' ' ||
                        r.column_name || ' ' ||
                        r.default_length || ' ' ||
                        ascii(substr(v1,r.default_length))
                );
        end loop;
end;
/

T1 T1 23 10
T1 T2 23 10
T1 T3 23 10
T2 T1 22 39
T2 T2 22 39
T2 T3 22 39

The last character of data_default for the t1 table (with length 23) is chr(10) – the line-feed, while the last character for the t2 table (with length 22) is chr(39) – the single-quote character.

The text stored in the data_default column is literally the text you supplied to Oracle (it’s not an expression that is stored and evaluated at table creation time); and the text that’s stored seems to be all the text that Oracle sees up to the point where a new token tells it to stop, and in the case of t1 that’s the comma after the line-feed (if you’re running on Windows you might see the length as 24 since DOS uses “carriage return – line feed” compared to the UNIX line-feed only).

Here’s another variant, just to emphasise the point, showing another table declaration and the associated output from the PL/SQL:

create table t3 (
        t1 timestamp default '01-Jan-2021 12:00:00'    
,       t2 timestamp default '01-Jan-2021 12:00:00'     ,       t3 timestamp default '01-Jan-2021 12:00:00'
)
;


T3 T1 23 10
T3 T2 27 32
T3 T3 23 10

In this case there are 5 spaces between the declaration of column t2 and the comma that separates it from the declaration of column t3. As you can see the default length is longer and the last stored byte is chr(32) – the space character.

Lagniappe

You could criticise me for not including a format string as part of my definition of the default value, so it should have been something like: to_timestamp('01-Jan-2021 12:00:00','dd-mon-yyyy hh24:mi:ss'). There is, however, one drawback to this – the expression is now 62 characters long (at least), which means the default value won’t be cached in the dictionary cache (v$rowcache) – and this might introduce a parsing overhead that you would prefer to avoid.
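If you want to check where you stand, a quick sketch (untested, table name invented) is to declare a column with the full expression and see what default_length Oracle reports:

create table t4 (
        t1 timestamp default to_timestamp('01-Jan-2021 12:00:00','dd-mon-yyyy hh24:mi:ss')
);

select  column_name, default_length
from    user_tab_columns
where   table_name = 'T4';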

January 18, 2021

Supplemental Defect

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 5:22 pm GMT Jan 18,2021

Here’s an anomaly that appeared in a question on the ODC recently about tweaking the output of dbms_metadata.get_ddl(). As the title suggests, though, this posting isn’t about the dbms_metadata package, it’s about supplemental logging and something that should not be possible and may have unfortunate side effects.

We start with a little script that creates a table, inserts some data, adds a couple of constraints, and then introduces some supplemental logging requirements. As the script header indicates I’ve only tested this on 19.3.0.0:

rem
rem     Script:         supplemental_defect.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2021
rem
rem     Last tested 
rem             19.3.0.0
rem

create table t1 (
        n1      number,
        n2      number,
        n3      number,
        n4      number
);

insert into t1 (n1, n2, n3, n4) values (1,2,3,4);
commit;

alter table t1 add constraint t1_pk primary key(n1, n2)
/

alter table t1 add constraint t1_uk unique(n3) using index (
        create index t1_uk on t1(n3, n4)
)
/

alter table t1 add supplemental log data (primary key, unique) columns
/

alter table t1 add supplemental log group t1_g1 (n1, n2) always
/

alter table t1 add supplemental log group t1_g2 (n1, n2) always
/

There’s nothing particularly special or complex about this code, and every statement runs successfully. You might notice that I’ve created two identical supplemental log groups, of course, and that’s clearly a mistake and I’d say that Oracle should have raised an error when I tried to create group t1_g2.

In fact my script is slightly worse than you might think at first sight because (n1, n2) is the primary key of the table, and I’ve added supplemental logging on the primary key already, so even supplemental log group t1_g1 really ought to raise an error as well.

Side note: you’ll notice that I’ve got a single-column unique constraint protected by a non-unique multi-column index, and part of my supplemental log data specifies unique columns. If you check the syntax this option can be expressed as “unique” or “unique indexes” – which leaves me wondering whether Oracle would treat my declaration as a command relating to n3 and n4, or just to n3.

So far all I’ve done is create a table in a way that’s left me with a couple of questions – let’s see what we get when we use dbms_metadata.get_ddl() to generate code to recreate the table:

begin
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'PRETTY',true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'SQLTERMINATOR',true);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'SEGMENT_ATTRIBUTES',false);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'STORAGE', false);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'TABLESPACE',false);
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'CONSTRAINTS',true);
end;
/

set long 20000
set heading off

select dbms_metadata.get_ddl('TABLE','T1') from dual;

================================================================

  CREATE TABLE "TEST_USER"."T1"
   (	"N1" NUMBER,
	"N2" NUMBER,
	"N3" NUMBER,
	"N4" NUMBER,
	 CONSTRAINT "T1_PK" PRIMARY KEY ("N1", "N2")
  USING INDEX  ENABLE,
	 CONSTRAINT "T1_UK" UNIQUE ("N3")
  USING INDEX  ENABLE,
	 SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS,
	 SUPPLEMENTAL LOG DATA (UNIQUE INDEX) COLUMNS,
	 SUPPLEMENTAL LOG GROUP "T1_G1" ("N1", "N2") ALWAYS,
	 SUPPLEMENTAL LOG GROUP "T1_G2" ("N1", "N2") ALWAYS
   ) ;

The “create table” statement includes 4 supplemental log clauses – notice how my original “two-part” clause for the primary key and unique constraints has been split into two – with the keyword “index” being added to the latter.

You might note that if you try to execute this statement to recreate the table you’re going to run into an immediate problem – Oracle will create a unique single-column index to protect the unique constraint – it doesn’t know that the constraint should be protected by a two-column non-unique index.

But there’s another problem that will protect you from the indexing issue. Here’s what I get when I try to execute the statement:

	 SUPPLEMENTAL LOG GROUP "T1_G1" ("N1", "N2") ALWAYS,
	                                *
ERROR at line 12:
ORA-02261: such unique or primary key already exists in the table

The funniest thing about this error comes when you look at the error message file ($ORACLE_HOME/rdbms/mesg/oraus.msg in my case), where we can find the “Cause” and “Action” that Oracle supplies for the error:

02261, 00000, "such unique or primary key already exists in the table"
// *Cause: Self-evident.
// *Action: Remove the extra key.

It’s not too difficult to guess why the error has appeared but “Self-evident” seems a little optimistic. In fact your first guess about the error may be wrong. Obviously I could simply delete the lines that create the t1_g1 and t1_g2 logs since they appear to be redundant copies of the (primary key) supplemental log – but what if I just delete the line that creates the (primary key) supplemental log? Oracle still raises the ORA-02261 error. It’s only when I delete the declaration of the primary key (and I can leave the declaration of the (primary key) supplemental log in place) that the error stops appearing and I can recreate the table … even if I’ve left the declarations of the two supplemental log groups on (n1, n2) in place.

On the plus side – although the export/import utilities expdp and impdp make use of the dbms_metadata package they don’t run into the same problem. After I’d exported, dropped and imported the t1 table with the sqlfile=xxx.sql option this is how impdp reported the code it had run to recreate the table (edited to remove the space management bits):

CREATE TABLE "TEST_USER"."T1" 
   (    "N1" NUMBER, 
        "N2" NUMBER, 
        "N3" NUMBER, 
        "N4" NUMBER
   ) 
;

ALTER TABLE "TEST_USER"."T1" ADD CONSTRAINT "T1_PK" PRIMARY KEY ("N1", "N2")
;
ALTER TABLE "TEST_USER"."T1" ADD CONSTRAINT "T1_UK" UNIQUE ("N3")
;
ALTER TABLE "TEST_USER"."T1" ADD SUPPLEMENTAL LOG DATA (UNIQUE INDEX) COLUMNS;
ALTER TABLE "TEST_USER"."T1" ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
ALTER TABLE "TEST_USER"."T1" ADD SUPPLEMENTAL LOG GROUP "T1_G1" ("N1", "N2") ALWAYS;
ALTER TABLE "TEST_USER"."T1" ADD SUPPLEMENTAL LOG GROUP "T1_G2" ("N1", "N2") ALWAYS;

The underlying code for expdp/impdp generates constraints separately from the table and then adds the constraints to the table – and it views supplemental logs as a variant on constraints (you can stop the supplemental log clauses appearing by using the dbms_metadata.set_transform_param() procedure to disable the generation of constraints) and adds them later.
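For completeness, a sketch of the call I mean – following the pattern of the earlier anonymous block – bearing in mind that it removes the primary key and unique constraint declarations from the output as well as the supplemental log clauses:

begin
        dbms_metadata.set_transform_param(dbms_metadata.session_transform,'CONSTRAINTS',false);
end;
/

select dbms_metadata.get_ddl('TABLE','T1') from dual;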

You might notice in passing that the index created by impdp to support the unique constraint is NOT the index originally specified. This is a very old problem – though the nature of the problem has varied with time – so watch out if you move data from a production system to a test system using export/import.

Summary

The internal code to handle supplemental logging allows you to create redundant supplemental log groups that will cause errors if you use dbms_metadata.get_ddl() in the most straightforward way to generate code to recreate the table.

The error message that appears if you haven’t allowed for the anomaly is a little counter-intuitive.

January 13, 2021

Check Constraints

Filed under: constraints,Oracle — Jonathan Lewis @ 3:17 pm GMT Jan 13,2021

This is a note I drafted in 2018 but never got around to publishing. It’s an odd quirk of behaviour that I discovered in 12.2.0.1 but I’ve just checked and it’s still present in 19.3.0.0.

Here’s a funny little thing that I found while checking some notes I had on adding constraints with minimum service interruption – a topic I last wrote about a couple of years ago [ed. now nearly 5 years ago]. This time around I did something a little different, and here’s a cut-n-paste from the first couple of steps of my test when I had previously used another session to delete a row without committing (table t1 is a table I created as select * from all_objects).

Note that the first SQL statement uses disable while the second uses enable:


SQL> alter table t1 add constraint c1 check(owner = upper(owner)) disable novalidate;
alter table t1 add constraint c1 check(owner = upper(owner)) disable novalidate
            *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

SQL> alter table t1 add constraint c1 check(owner = upper(owner)) enable novalidate;

At this point my session was hanging – and I find it a little surprising that the attempt to create the constraint disabled returns an immediate ORA-00054, while the attempt to create it enabled waits. A quick check of v$lock showed that my session was requesting a TX enqueue in mode 4 (transaction, share mode), waiting for the other session to commit or rollback.

In the following output from 12.1.0.2 my session is SID 16 and I’ve simply reported all the rows for the two sessions from v$lock:


       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK     CON_ID
---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ----------
        16 TX     327704      12790          0          4        169          0          0
           TX      65550       9613          6          0        169          0          0
           TM     192791          0          2          0        169          0          0
           OD     192791          0          4          0        169          0          0
           AE        133          0          4          0        579          0          0

       237 TX     327704      12790          6          0        466          1          0
           TM     192791          0          3          0        466          0          0
           AE        133          0          4          0        582          0          0

You’ll notice my session is holding an OD enqueue in mode 4 and a TM lock in mode 2 – the value 192791 is the object_id of the table in question. The OD lock is described in v$lock_type as “Lock to prevent concurrent online DDLs”.
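If you want to check those descriptions for yourself, the lock types are exposed in v$lock_type, e.g.:

select  type, name, description
from    v$lock_type
where   type in ('OD', 'TM', 'TX')
order by
        type
;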

It would appear, therefore, that we are stuck until the other session commits – so I hit ctrl-C to interrupt the wait, and then tried to add the constraint again, still without committing (or rolling back) the other session. Here’s the cut-n-paste from that sequence of events:

{hits ctrl-C}
alter table t1 add constraint c1 check(owner = upper(owner)) enable novalidate
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation

SQL> alter table t1 add constraint c1 check(owner = upper(owner)) enable novalidate;
alter table t1 add constraint c1 check(owner = upper(owner)) enable novalidate
                              *
ERROR at line 1:
ORA-02264: name already used by an existing constraint

I’ve interrupted the command and (according to Oracle) cancelled the current operation – but it then becomes apparent that I have successfully added the constraint anyway!

SQL> select constraint_name, constraint_type, search_condition from user_constraints where table_name = 'T1';

CONSTRAINT_NAME      C SEARCH_CONDITION
-------------------- - --------------------------------------------------------------------------------
SYS_C0018396         C "OWNER" IS NOT NULL
SYS_C0018397         C "OBJECT_NAME" IS NOT NULL
SYS_C0018398         C "OBJECT_ID" IS NOT NULL
SYS_C0018399         C "CREATED" IS NOT NULL
SYS_C0018400         C "LAST_DDL_TIME" IS NOT NULL
SYS_C0018401         C "NAMESPACE" IS NOT NULL
C1                   C owner = upper(owner)

And this is what happened when I switched to the other session – where I had still not committed or rolled back – and tried to execute an update:


SQL> update t1 set owner = lower(owner) where owner = 'SYSTEM' and rownum = 1;
update t1 set owner = lower(owner) where owner = 'SYSTEM' and rownum = 1
*
ERROR at line 1:
ORA-02290: check constraint (TEST_USER.C1) violated

So the constraint really is present and is visible to other sessions – even though the attempt to add it hung and had to be interrupted!

I can’t think of any reason why this might cause a problem in the real world – but it is an oddity that might have echoes in other cases where it matters.

Update (next day)

When I posted a link to this post on twitter one of the replies referred me to a very similar post by Oren Nakdimon with a very similar date to my 2018 draft, which makes me wonder if I had decided not to publish it because I’d done a Google search on the topic after I had written this note.

I’ve just done a search on ODC (formerly OTN) for possible references (date range around the end of 2018, with text “enable novalidate”) and found this question. So maybe we had seen the same question and done the same tests at the same time.


December 21, 2020

Why Why Why Why?

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 2:24 pm GMT Dec 21,2020

Here’s a little puzzle – and if you don’t see the answer almost immediately you do need to read this note. The question comes from a thread on the Oracle Groundbreakers’ Forum –

“Why I am getting 0020 instead of 2020 for below query?”

select to_char(to_date(sysdate,'dd-mm-yyyy'),'iyyy') from dual.

Instead of sysdate if I give date like ‘5-dec-2020’ it’s giving correct. Any ideas why iso value for sysdate year is coming as 0020?

There are many things that people do wrong with dates and many ways in which Oracle tries to help so I’ll start with a general-purpose reminder: Oracle supplies a “date” datatype, if you want to store dates, compare dates, or do date arithmetic make sure you are using the date datatype.

(Technically, of course, Oracle’s date datatype is actually a “date with time to nearest second” type, so there are a couple of details you need to remember to avoid running into surprises that are due to the presence of the time component – for example you might use a constraint like “check(date_col = trunc(date_col))” to enforce date-only values for a column.)

Sysdate is a date, and the function to_date() expects its first parameter to be a character string; so Oracle implicitly converts sysdate to a character type with the to_char() function before it does anything else and it uses the session’s nls_date_format parameter to supply the formatting string. On my instance this parameter has the value ‘DD-MON-RR’ (an option created in an attempt to work around the “Y2K” problem – which some of you will remember).

So sysdate was converted (on the day the question was asked) to the character string '10-DEC-20', and when the to_date() function call tried to convert this back to a date type using the explicitly supplied format dd-mm-yyyy Oracle used leading zeros to pad the 2-digit year to a 4-digit year which made the year 0020 which is exactly what the OP saw.
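You can make that hidden first step visible by writing the to_char() call explicitly – a sketch, assuming your session is still using the default ‘DD-MON-RR’ format:

select
        to_char(sysdate,'DD-MON-RR')                                            two_digit_year,
        to_char(to_date(to_char(sysdate,'DD-MON-RR'),'dd-mm-yyyy'),'iyyy')      iso_year
from
        dual;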

Solution

To correct this code, take out the call to to_date().

SQL> select to_char(to_date(sysdate,'dd-mm-yyyy'),'iyyy') from dual;

TO_C
----
0020

SQL> select to_char(sysdate,'iyyy') from dual;

TO_C
----
2020

If you can’t correct the code then you might be able to work around the error by setting the nls_date_format to a more appropriate value. In fact the nls_date_format is one of those parameters that you probably ought to change from its default value the moment you set up your database. It’s just rather scary to do so if you’re running a system that has been around for a few years and may (accidentally) include some code that depends on the default setting to get the right answers or best execution plans.

SQL> alter session set nls_date_format='dd-Mon-yyyy';
SQL> select to_char(to_date(sysdate,'dd-mm-yyyy'),'iyyy') from dual;

TO_C
----
2020

SQL>  alter session set nls_date_format='dd-Mon-rrrr';
SQL> select to_char(to_date(sysdate,'dd-mm-yyyy'),'iyyy') from dual;

TO_C
----
2020

SQL> alter session set nls_date_format='dd-Mon-yyyy';
SQL> select to_date('01-Dec-20','dd-mm-rrrr') from dual;

TO_DATE('01
-----------
01-Dec-2020

SQL> select to_date('01-Dec-50','dd-mm-rrrr') from dual;

TO_DATE('01
-----------
01-Dec-1950

I’ve included a couple of examples using the ‘rrrr’ (or ‘RRRR’) format for the year component. Oracle introduced the ‘RR’ format over 20 years ago as a best-guess workaround to the Y2K problem. The two-character ‘RR’ format means values between 50 and 99 imply the previous century, and values between 00 and 49 imply the current century (as shown in the last two simpler examples). It’s an option that should have been deprecated within a few years of its introduction and desupported soon after, but the terrible inertia of the IT juggernaut means it’s still around. If you’re worried about the impact of changing your nls_date_format from ‘dd-mon-RR’ to ‘dd-Mon-yyyy’ you may feel a little safer switching to ‘dd-Mon-RRRR’ – whatever you do, though, you’ll almost certainly find examples where the code misbehaves because of the side-effects of the change in formatting.

One tiny detail you might have noticed in the original posting is that the user tested their code with the literal value ‘5-Dec-2020’, and got the result they wanted even though the format they had used to convert from character to date was ‘dd-mm-yyyy’. Oracle tries quite hard to cope with date conversions, as Connor McDonald pointed out several years ago.

While we’re on the topic of conversion it’s worth revisiting my comment about the date type including a time component. I’ve often seen expressions like to_date(to_char(sysdate,'dd-mon-yyyy')) being used to ensure that a date that might include a time component is reduced to a “date-only” value (although that really means the time-component is “00:00:00”). If you need only the date component it’s far better to use trunc(date_expression) rather than this double type-conversion; there’s probably not a lot of saving in terms of simple CPU-usage, but (a) you might as well take it and (b) you might be able to give the optimizer a chance of getting a better cardinality estimate and hence a better execution plan.
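A quick sketch of the comparison (the explicit format string in the inner conversion is mine, just to keep the example self-contained – both expressions should return the date with a midnight time component):

alter session set nls_date_format = 'dd-Mon-yyyy hh24:mi:ss';

select
        sysdate                                                 now,
        trunc(sysdate)                                          date_only_1,
        to_date(to_char(sysdate,'dd-mon-yyyy'),'dd-mon-yyyy')   date_only_2
from
        dual;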

Summary

  • sysdate is a date type, don’t try to use to_date() on it.
  • to_char(date_expression) will use the nls_date_format value if you haven’t supplied an explicit format string so you should always include an explicitly chosen format expression in the call.
  • The nls_date_format defaults to a value that is bad on two counts: it expects a 2-digit year and uses RR rather than YY. You should be using four digits for the year, and the only argument for using RRRR is if you are temporarily in transition from RR to YYYY.

To misquote Napoleon (the pig, Animal Farm): “4 Y’s good, 2 Y’s bad”. And it’s not going to change to “4 Y’s good, 2 Y’s better”.

December 9, 2020

Edition-based Redefinition

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 12:00 pm GMT Dec 9,2020

This note is a list of links to the presentations that Bryn Llewellyn gave to the Swedish Oracle Users Group in a full-day seminar on EBR in 2017. I’ve had it as a draft note on my blog for some time, but have just revisited it and thought that others might like to be reminded about it.

There are 8 videos ranging from about 40 to 60 minutes each, and the last link is for the full playlist.

November 30, 2020

In-row LOBs

Filed under: Infrastructure,LOBs,Oracle — Jonathan Lewis @ 11:27 am GMT Nov 30,2020

If you’re like me there are probably lots of little things that you know but find terribly easy to forget. Here’s one of my bêtes noires – starting with a little quiz:

rem
rem     Script:         lob_in_row.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2020
rem     Purpose:
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem             11.2.0.4
rem

create table t1 (
        v1      varchar2(20),
        b1      clob,
        s1      clob
)
lob(b1) store as basicfile  b_lob,
lob(s1) store as securefile s_lob
;

insert into t1 values(
        rpad('x',20,'x'),
        rpad('x',20,'x'),
        rpad('x',20,'x')
)
/

commit;

execute dbms_stats.gather_table_stats(user,'t1');

select  column_name, avg_col_len
from    user_tab_columns
where   table_name = 'T1'
/

select  avg_row_len
from    user_tables
where   table_name = 'T1'
/

column rel_file_no new_value m_file_no
column block_no    new_value m_block_no

select
        dbms_rowid.rowid_relative_fno(rowid)    rel_file_no,
        dbms_rowid.rowid_block_number(rowid)    block_no
from
        t1
;

alter system flush buffer_cache;
alter system dump datafile &m_file_no block &m_block_no;

I’ve created a table with a varchar2() column, a basicfile CLOB, and a securefile CLOB, and inserted the same value into all three. By default the CLOB columns will be defined as ‘enable storage in row’ and since the inserted value is very short it will be stored “in row” for both these CLOBs. The question is:

Which column takes the most space in the data block?

To answer this question we need only gather stats and query user_tab_columns and/or user_tables – except the results may be a little surprising, so my script also uses the dbms_rowid package to find the file and block number where the row has been inserted, flushes the buffer cache to make sure that we don’t get confused by older versions of the block (you probably shouldn’t run this script on a big busy system) then dumps the block into the session trace file.

You’ll see why it’s necessary to look at the block dumps when I show you the results from gathering stats. The three sets of results below come from 11.2.0.4, 12.2.0.1, and 19.3.0.0 in that order. A key feature the databases have in common is that they were all created with a multi-byte character set:


11.2.0.4
========
COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                            21
B1                           123
S1                           117 

AVG_ROW_LEN
-----------
        261

12.2.0.1
========
COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                            21
B1                           171
S1                           165

AVG_ROW_LEN
-----------
        357

19.3.0.0
========
COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                            21
B1                            77
S1                            71

AVG_ROW_LEN
-----------
        169

There are three things that the three versions agree on.

  • First the row length is the sum of the column lengths.
  • Secondly the average length of the varchar2() column is 21 – remember that the avg_col_len includes the length byte(s) when we collect stats with the dbms_stats package.
  • Finally that the difference between the basicfile and securefile LOBs is 6.

But are the actual values reported for the LOB lengths in any way truthful, or are we seeing reporting errors (that vary with version)?

That’s why we need to look at the block dump, and this is what we find as the interesting bit of the dump – which is basically the same in all three cases:


block_row_dump:
tab 0, row 0, @0x1eec
tl: 172 fb: --H-FL-- lb: 0x1  cc: 3

col  0: [20]  78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78

col  1: [76]
 00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 02 75 f6 ff fb 00 38 09 00 00
 00 00 00 00 28 00 00 00 00 00 01 00 78 00 78 00 78 00 78 00 78 00 78 00 78
 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00
 78

col  2: [70]
 00 54 00 01 02 0c 80 80 00 02 00 00 00 01 00 02 75 f6 ff fc 00 32 48 90 00
 2c 00 00 28 01 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78
 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78

end_of_block_dump

When Oracle 19c reported the avg_col_len (and maybe the same would be true of 18c) it was correctly reporting the space taken up in the row by the entries for the two LOBs (and their overheads) that  I had inserted.  The earlier versions of Oracle are doing something a little odd.

A key thing you’ll notice, of course, is that the varchar2() column is storing my input string as one byte per character, but because I used a multibyte characterset for my database characterset in all three cases Oracle has stored the LOB equivalents of the string using a two-byte fixed-width characterset that has effectively doubled the storage (and then has to include the “Lob Locator” overhead – hence the 20 character string turning into 70+ bytes).

This highlights two important issues. First, trying to work out how to “tune” your table storage based on the avg_col_len / avg_row_len is a lot harder to do if you’ve got LOB columns in the table, and your method of estimating storage will have to change to suit the version of Oracle. Secondly, and much more importantly, if you’re thinking of changing a varchar2() column (such as “description” or “comments”) to a CLOB to allow for a small number of cases where people want to supply longer bits of text, then if you’re using a multibyte character set there are two awful consequences:

  • the storage requirement for the column will more than double
  • the LOB will go out of line when the in-row storage requirement exceeds 4,000 bytes – which means when the original string content exceeds something between 1,938 and 1,985 characters depending on your version of Oracle and whether you are using basicfile or securefile LOBs.  (And a “long” input string will exceed a single block in the LOB segment when it goes over roughly 4,000 characters.)

So if you want to handle “long comments” in multibyte charactersets you might want to consider making your code more complex so that up to 4,000 bytes you store the data as a varchar2(), and only use a CLOB column when the data goes over that limit.
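A possible shape for that strategy – purely a sketch, with invented names – keeps a varchar2() for the common case and a CLOB for the overflow, with a check constraint to make sure only one of the two columns is used in any row:

create table comments_hybrid (
        id              number          not null,
        short_comment   varchar2(4000),         -- used while the text fits in 4,000 bytes
        long_comment    clob,                   -- populated only for the rare long values
        constraint cmh_one_source check (
                short_comment is null or long_comment is null
        )
);

The application (or a view that coalesces the two columns) then hides the split from the rest of the code.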

Footnote

If you’re wondering  what all the overheads are for the in-row LOB you’ll find that the 11g block dump gives you some extra details (though the extra details disappeared from my 12c and 19c block dumps).  Here’s the content that appeared immediately after the col1 and col2 data respectively in the 11g block dump.

LOB
Locator:
  Length:        84(76)
  Version:        1
  Byte Length:    2
  LobID: 00.00.00.01.00.02.75.f6.ff.fb
  Flags[ 0x02 0x0c 0x80 0x00 ]:
    Type: CLOB
    Storage: BasicFile
    Enable Storage in Row
    Characterset Format: IMPLICIT
    Partitioned Table: No
    Options: VaringWidthReadWrite
  Inode:
    Size:     56
    Flag:     0x09 [ Valid DataInRow ]
    Future:   0x00 (should be '0x00')
    Blocks:   0
    Bytes:    40
    Version:  00000.0000000001
    Inline data[40]
Dump of memory from 0x00007FB4244BED8D to 0x00007FB4244BEDB5
7FB4244BED80                            00780001              [..x.]
7FB4244BED90 00780078 00780078 00780078 00780078  [x.x.x.x.x.x.x.x.]
        Repeat 1 times
7FB4244BEDB0 00780078 54004678                    [x.x.xF.T]



LOB
Locator:
  Length:        84(70)
  Version:        1
  Byte Length:    2
  LobID: 00.00.00.01.00.02.75.f6.ff.fc
  Flags[ 0x02 0x0c 0x80 0x80 ]:
    Type: CLOB
    Storage: SecureFile
    Characterset Format: IMPLICIT
    Partitioned Table: No
    Options: VaringWidthReadWrite
  SecureFile Header:
    Length:   50
    Old Flag: 0x48 [ DataInRow SecureFile ]
    Flag 0:   0x90 [ INODE Valid ]
    Layers:
      Lengths Array: INODE:44
      INODE:
        00 00 28 01 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78
        00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78
        00 78 00 78

Footnote 2

Oracle allows for an empty_clob() to populate a CLOB column with a zero length CLOB, and an empty_clob() is not the same as a null.

Many years ago, when I wrote “Practical Oracle 8i” I pointed out that if you were using a LOB column it would be better to use empty_[bc]lob() to represent an absence of data to avoid the need for checking the column in two different ways for “no data”.

There are two details to remember about this, though. First, empty_clob() takes 36 bytes of storage for a basicfile and 30 bytes for a securefile (reported as 37/31 in 19c, 135/131 in 12cR2 and 103/97 in 11gR2). You may decide that’s an expensive way of saying “nothing to see here”.

Secondly, while PL/SQL will allow you to compare a PL/SQL clob variable with empty_clob() you cannot use a simple “column = empty_clob()” predicate in SQL and will need something like:

where length(s1) = 0
where dbms_lob.getlength(s1) = 0
where dbms_lob.compare(s1, empty_clob()) = 0

As a final closing comment about the oddity of avg_col_len and LOBs – my first test to check the lengths reported for an empty_clob() also inserted NULL into the v1 column. In this special case (all the columns in the row were empty_clob() or null) 11g reported the avg_col_len of s1 and b1 as zero!

November 12, 2020

rowsets

Filed under: Execution plans,Infrastructure,Oracle,Performance — Jonathan Lewis @ 12:35 pm GMT Nov 12,2020

Here’s a little demonstration of the benefit of rowsets. It started with a very simple question that arrived in my in-box from Kaley Crum.

  • “Why does the query with the UNION ALL take so much longer than the query without the UNION ALL?”

Here are the two queries – surely they should take virtually the same amount of time.

select count(*)
from (select /*+ no_merge */
             object_id
      from drop_me);


select count(*)
from (select /*+ no_merge */
             object_id
      from drop_me
      --
      union all
      --
      select 1 from dual);

Look closely – the difference between the two queries is just the addition through a UNION ALL of “select 1 from dual”. The first query took about 3.5 seconds to complete (there were 70M rows in the table), and the second took nearly 69 seconds.

Here are the execution plans showing the rowsource execution statistics (and Query Block // Object Alias information) – first the baseline query:

SQL_ID  0ph1vfuuxkbqb, child number 0
-------------------------------------
select count(*) from (select /*+ no_merge */              object_id    
  from drop_me)
 
Plan hash value: 3609429292
 
--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name    | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |      1 |        | 26465 (100)|          |      1 |00:00:03.34 |   95896 |  95891 |
|   1 |  SORT AGGREGATE     |         |      1 |      1 |            |          |      1 |00:00:03.34 |   95896 |  95891 |
|   2 |   VIEW              |         |      1 |     70M| 26465   (2)| 00:00:02 |     70M|00:00:03.32 |   95896 |  95891 |
|   3 |    TABLE ACCESS FULL| DROP_ME |      1 |     70M| 26465   (2)| 00:00:02 |     70M|00:00:03.27 |   95896 |  95891 |
--------------------------------------------------------------------------------------------------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SEL$2 / from$_subquery$_001@SEL$1
   3 - SEL$2 / DROP_ME@SEL$2

And then the UNION ALL query:

SQL_ID  0chdajr28y0ub, child number 0
-------------------------------------
select count(*) from (select /*+ no_merge */              object_id    
  from drop_me       --       union all       --       select 1 from
dual)
 
Plan hash value: 3408528233
 
---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name    | Starts | E-Rows | Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |      1 |        | 26467 (100)|          |      1 |00:01:18.58 |   95896 |  95891 |
|   1 |  SORT AGGREGATE      |         |      1 |      1 |            |          |      1 |00:01:18.58 |   95896 |  95891 |
|   2 |   VIEW               |         |      1 |     70M| 26467   (2)| 00:00:02 |     70M|00:01:10.84 |   95896 |  95891 |
|   3 |    UNION-ALL         |         |      1 |        |            |          |     70M|00:00:53.13 |   95896 |  95891 |
|   4 |     TABLE ACCESS FULL| DROP_ME |      1 |     70M| 26465   (2)| 00:00:02 |     70M|00:00:19.28 |   95896 |  95891 |
|   5 |     FAST DUAL        |         |      1 |      1 |     2   (0)| 00:00:01 |      1 |00:00:00.01 |       0 |      0 |
---------------------------------------------------------------------------------------------------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - SET$1 / from$_subquery$_001@SEL$1
   3 - SET$1
   4 - SEL$2 / DROP_ME@SEL$2
   5 - SEL$3 / DUAL@SEL$3

This is not a trick question – I’ve got the SQL to create the data set and run the test; and I’ve run the test through 19.3.0.0 and 12.2.0.1 with results very similar to the above. Here’s a slightly modified version of the script – you’ll notice the column name has changed because I’ve used my usual “large data set” generator rather than scaling up a clone of all_objects:

rem
rem     Script:         rowset_kaley.sql
rem     Author:         Kaley Crum / JP Lewis
rem     Dated:          Oct 2020
rem
rem     Last tested 
rem             19.3.0.0
rem

create table drop_me
pctfree 0
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        rownum  n1
from
        generator
cross join
        generator
where rownum <= 7e7
;

alter session set statistics_level = all;

set timing on
set serveroutput off

prompt  ===========================================
prompt  Baseline: Simple query without a union all.
prompt  Runs in 3-4 seconds
prompt  ===========================================

select
         count(*)
from (select /*+ no_merge */
             n1
      from drop_me);

select *
from table(dbms_xplan.display_cursor(format => 'allstats last advanced'));

prompt  ==========================================
prompt  Add a UNION ALL for one record from dual.
Prompt  Runs in over a minute
prompt  ==========================================

pause Press return

select
         count(*)
from (select /*+ no_merge */
             n1
      from drop_me
      --
      union all
      --
      select 1 from dual);

select *
from table(dbms_xplan.display_cursor(format => 'allstats last advanced'));

There are two factors involved in the massive variation in performance. The first factor is one that the test case will report (though I didn’t show it in the output above); the second is that the test case has enabled rowsource execution statistics.

Here’s the big clue – from the Column Projection Information, which is one of the things that appears with the “advanced” format option (or when you add the “projection” format option) in the call to dbms_xplan.display_xxx()

First for the fast query:

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) COUNT(*)[22]
   2 - (rowset=1019)
   3 - (rowset=1019)

The rowsource passes 1,019 rows at a time from the tablescan operation to the view operation and from the view operation to the sort operation, for a total of roughly 70,000 calls for each of those two steps of the plan.

Compare this to the projection information for the slow UNION ALL query:

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) COUNT(*)[22]

I haven’t missed a bit in copying – this really is it. The rows move up the stack one at a time, not in a rowset array. That’s 70,000,000 subroutine calls for each of the two steps of the plan.

I’m sure most of us have heard the mantra “row by row is slow by slow” (or some variant on the theme). This is true all the way down to the internal levels of an execution plan.

Of course 70M calls vs. 70K calls shouldn’t really explain a difference of more than 60 seconds – but when the calls to the O/S for timing information for rowsource execution stats are repeated for each call (or even if it’s only a 1% sample of the calls) down the stack that’s where a lot of time can disappear.

On my laptop, running Oracle 19.3 in a VM, this is how my timing went:

  • Fast query: 1.94 seconds, dropping to 1.79 seconds when I disabled rowsource execution stats.
  • Slow query: 65.91 seconds, dropping to a far less astounding 3.26 seconds when I disabled rowsource execution stats.

So there really is a noticeable difference between row-by-row and array-processing but most of the difference in the original test came from using rowsource execution statistics to measure how much of a difference there would be.

It’s also worth mentioning that this is probably the most extreme case you could produce to show the difference – using the largest possible rowset size with the smallest possible rows when you want to maximise the gap between (internal) array processing and single row processing – and then processing a very large number of rows. [But isn’t ridiculously large numbers of rows what you do with Exadata?].

If you want further evidence that the difference is due to the rowset size you can always alter session set “_rowsets_enabled”=false; and watch the fast query slow down. It will take about half the time of the slow query as it only has to pass 70M rows up one step of the plan rather than the two steps that are in the UNION ALL plan.

In theory it looks as if you could also restrict the size of the rowset by setting _rowsets_max_rows or _rowsets_target_maxsize, but since the default value for the former is 256 the results above suggest that the settings might be ignored, and when I tried adjusting them at the session level nothing changed.

Another test you could run is to adjust (alter session) the parameter “_rowsource_statistics_sampfreq”. When I set this to 100 the time for the slow query dropped to about 21 seconds (and the fast query – with rowsets disabled – dropped to about 11 seconds).
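For reference, the session-level commands I was experimenting with look like this (they’re hidden parameters, so treat them strictly as test-system toys):

alter session set "_rowsets_enabled" = false;
alter session set "_rowsource_statistics_sampfreq" = 100;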

Footnote

It’s always easy to hit an accidental special case without realising it, so when you’re testing something it’s important to think about what’s special in the example. I came up with two possibilities in this case – dual is always going to be a very special case in any circumstances, and I’ve got a table in my query that is guaranteed to return exactly one row. So I repeated the test with:

  • a real table with one row and primary key in case the reference to dual was disabling rowsets
  • a real table with 100 rows in case this enabled a rowset size of 100

In neither case did Oracle start using array processing.

Footnote 2

One of the patterns I see occasionally follows the form of the slow query above. Some front-end tool executes a select statement that starts with a select {list of desired column headings} from dual so that the front-end code doesn’t have to be programmed to create headings as it receives the data.

Although the impact this would have on rowsets looks like a large percentage of the work done in the database in this case, in more realistic cases it would probably be irrelevant, until someone starts testing with the statistics_level set to all, or _rowsource_execution_statistics explicitly enabled.

There is a related effect, though, from the SQL Monitor feature (which activates automatically for a statement if it’s executing in parallel, or if it’s predicted to take more than 5 seconds to complete). When I added the /*+ monitor */ hint to the two queries (and didn’t fiddle with the rowsets parameter) the fast query averaged 2.26 seconds instead of 1.79 seconds, and the slow query averaged 3.51 seconds instead of 3.26 seconds. I have to say, though, that the variation between executions of the same statement was quite significant compared with the variation produced by adding or eliminating the hint.

November 10, 2020

Quiz time

Filed under: Infrastructure,Oracle,redo — Jonathan Lewis @ 9:28 am GMT Nov 10,2020

Here’s a fun little item that I’ve just rediscovered from 2016.

There’s a fairly commonly cited query that aggregates and tabulates the first_time column from the v$log_history view by hour of day, reporting the hour across the page and the date down the page. It often appears in the responses to the question “how big should my online redo log files be?” but I’m not going to supply the query here because it’s irrelevant; instead I’m going to show you a line of the output it produced (on a real production system) then ask the important question that might prompt you to decide to resize your log files:

DAY        00  01  02  03  04  05  06  07  08  09  10  11  12  13  14  15  16  17  18  19  20  21  22  23

--------- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---

06-MAY-16   4   3   4   5   3   2   0   1   3   4   1   5  12   8   2   1   7   3   0   1   5   5   2   4


The important question is this:

  • when was most time recorded for events relating to log file switches?

I’ll supply the answer some time tomorrow. The only comment I’ll make in advance is that there’s a huge history of variation in the way you (and Oracle) might view redo log file switches, and it would be easy for two highly competent DBAs to have dramatically different view points if all you gave them was a set of numbers and no context.

Answer

There is no way you could have deduced the right answer from the line of numbers above, and it’s gratifying to see that all the comments understood the rules of this “no-win” game. The correct answer to the question was

The point at which the most significant redo waits appeared was from 20:58 to 21:02 when there were 10 log file switches in just over 3 minutes.

There are plenty of arguments that you might put forward to justify guessing some other time period, and I think they were all explained (or, rather, explained away) in the comments.

One person volunteered to highlight the “obvious” answer – which is the 12:00 – 13:00 interval which recorded the single highest number. Clearly there may have been a bit of a peak in the rate of redo generation around that interval (from 11:30 to 13:30, say) but that doesn’t mean the load on the system was high enough to cause any waits.

There are a couple of places where the count shows zero, and that might be because there was no redo generated, but it might be because the log writer got stuck on (e.g.) a CI enqueue wait and everything else stopped in a wait for logfile switch.

Then, of course, if the actual workload peak had occurred just a few minutes later another of the intervals would have been recording 12 switches leaving the less experienced DBA a choice of time periods to investigate.

There are two problems with this report: first there’s the fundamental understanding – the rate at which redo is generated doesn’t tell you anything about how much time is lost due to contention related to redo. Secondly the report has an inherent scoping issue – as one of the comments put it – there’s a big difference between 12 logfile switches in 5 minutes and 12 logfile switches in 60 minutes. (Almost a prescient comment since the right answer was 10 logfile switches in just over 3 minutes – without showing 10 logfile switches in an hour).

As with so many of the “popular” monitoring queries that have been around for the last quarter of a century – the only use of a report like this is to give you a little history so that you can look at the last couple of weeks and wonder what happened around 9:00 last night that was different from usual, because that spike hasn’t been there before.

October 8, 2020

Direct Path

Filed under: Infrastructure,Oracle,Performance,Troubleshooting,Tuning — Jonathan Lewis @ 12:29 pm BST Oct 8,2020

This is a little addendum to a note I wrote a couple of days ago about serial direct path reads and KO (fast object checkpoint) enqueue waits.

The original note was prompted by a problem where someone had set the hidden parameter “_serial_direct_read” to ‘always’ because they were running 11g and wanted some “insert as select” statements to use direct path reads on the select portion, and 11g wasn’t co-operating.

Serial direct path reads were introduced as a possibility in (at least) the 8.1.7.4 timeline, but the parameter was set to false until 11gR2 where it changed to auto. (Legal values are: false, true, never, auto, always.)

In 11.2, though, even though a simple select statement could use serial direct path reads for segment scans, Oracle would not use the mechanism for “insert as select”.

This note is just a little piece of code to demonstrate the point.  Run it on 11g and (unless your buffer cache is large enough to make the test table “small”) Oracle will use direct path reads on the select, but scattered reads to cache for the insert. Upgrade to 12.1 and Oracle will use direct path reads on both.

rem
rem     Script:         serial_fail.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2020
rem

create table t1
as
select
        ao.*
from
        all_objects     ao,
        (select rownum from dual connect by level <= 16) mult
/

create table t2
as
select  *
from    t1
where   rownum = 0
/

alter system flush buffer_cache;

prompt  =============
prompt  Simple Select
prompt  =============

execute snap_events.start_snap
select * from t1 where object_id = 98765;
execute snap_events.end_snap

prompt  ================
prompt  Insert as select
prompt  ================

execute snap_events.start_snap
insert into t2
select * from t1 where object_id = 98765;
execute snap_events.end_snap

prompt  =====================
prompt  Insert as select with
prompt  _serial_direct=always
prompt  =====================

alter session set "_serial_direct_read"=always;

execute snap_events.start_snap
insert /* serial direct */ into t2
select * from t1 where object_id = 98765;
execute snap_events.end_snap

alter session set "_serial_direct_read"=auto;

The calls to the snap_events package are there to report the change in v$session_event for my session during the SQL.

You’ll notice I’ve included three main SQL statements rather than two – the third statement (2nd execution of the insert) is to demonstrate that it is possible to get direct path reads on the insert by setting the hidden parameter to ‘always’.

One detail to remember when testing this particular feature (and the same guideline applies to some other features): the “direct / not direct” decision becomes an attribute of the cursor, not an attribute of the execution plan. This is why I’ve added a comment to the 2nd insert; if I hadn’t done so Oracle would have reused the (identical text) cursor from the first insert, which would have resulted in scattered reads being used instead of direct path reads. This distinction between cursor and plan also explains why there is no hint that will allow you to force direct path reads for a specific query (not even the infamous opt_param() hint).
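
If you want to check that the comment really has produced a separate cursor – and hence a fresh decision about direct path reads – you could query v$sql after running the script; a minimal sketch, assuming the table names used above:

select
        sql_id, child_number, executions, substr(sql_text,1,60) sql_text
from
        v$sql
where
        sql_text like 'insert%into t2%select%from t1%'
order by
        sql_id, child_number
;

You should see two different sql_ids – one for each version of the insert – rather than one sql_id with two child cursors.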

Here are the three sets of results from a system running 11.2.0.4:

=============
Simple Select
=============

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
db file sequential read                               1           0           0.10        .100           4
direct path read                                    114           0          20.86        .183           6
SQL*Net message to client                             4           0           0.00        .000           0
SQL*Net message from client                           4           0           0.11        .028     174,435

================
Insert as select
================

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
db file sequential read                              22           0           0.60        .027           4
db file scattered read                              130           0          35.97        .277           5
SQL*Net message to client                             4           0           0.01        .002           0
SQL*Net message from client                           4           0           0.10        .025     174,435

=====================
Insert as select with
_serial_direct=always
=====================

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
direct path read                                    108           0          17.44        .161           6
SQL*Net message to client                             4           0           0.00        .000           0
SQL*Net message from client                           4           0           0.09        .022     174,435


Note the db file scattered read waits in the middle test. If you re-run the test on 12.1.0.x (or later) you’ll find that the middle set of results will change to direct path read waits.

For reference, this limitation is covered by MOS note 13250070.8: Bug 13250070 – Enh: Serial direct reads not working in DML. The actual bug note is not public.

Footnote (a couple of hours later):

Responding to the tweet publicising this note, Roger MacNicol has told us that the bug note says only that direct path reads had been restricted unnecessarily and that the restriction has been removed.

August 28, 2020

FK on delete

Filed under: constraints,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 8:51 am BST Aug 28,2020

This is part 1 of a short reference note about the work load created by referential integrity constraints when you delete from a parent table. It was prompted by a question on the Oracle Developer Community forum about how to handle a very large delete from a table which (a) included a column of type CLOB and (b) had 9 child tables.

The 9 referential integrity constraints were declared with “on delete cascade”, but the delete was taking too long even though all the related child data had been deleted before the parent delete. In outline the process was designed to operate in batches as follows:

  • populate global temporary table with a small set of IDs
  • delete from 9 child tables where parent_id in (select id from gtt)
  • delete from parent where id in (select id from gtt);

The process was running very slowly. At first sight – and without examining any trace files or workload statistics – the “obvious” guess would be that this was something to do with the CLOBs – but a CLOB delete is usually a “logical” delete, i.e. it sets a flag amd shouldn’t really be doing a lot of work actually deleting and freeing space, and it doesn’t generate undo for the CLOB itself. Examination of the session statistics showed that the problem was with the work that Oracle was doing to handle the referential integrity – even though all related rows had been deleted in advance of the parent delete.

A highly suggestive statistic from the session stats (v$sesstat for the session) for a controlled test that deleted 1,170 parent rows was the “execute count”, which was 10,892.  Why would you execute that many statements when all you’re doing is a simple “delete from parent where id in (select id from gtt)”? Check the arithmetic: 1,170 * 9 = 10,530, which is a fairly good indicator that every row deleted from the parent results in 9 statements being executed to delete from the 9 child tables.
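
If you want to check this sort of arithmetic on your own system, here’s a minimal sketch of a query to report the statistic for your current session – run it just before and just after the parent delete and take the difference:

select
        sn.name, st.value
from
        v$statname      sn,
        v$sesstat       st
where
        st.sid        = to_number(sys_context('userenv','sid'))
and     st.statistic# = sn.statistic#
and     sn.name       = 'execute count'
;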

Foreign Key options

Oracle is a little restrictive in how it allows you to define foreign key constraints – MySQL, for example, allows you to specify the action the database should take on update or delete of a parent row, and it allows 5 possible actions, so you have:


on update [restrict | cascade | set null | no action | set default]
on delete [restrict | cascade | set null | no action | set default]

In comparison Oracle only implements “on delete”, and the only options it allows are “cascade”, “set null”, and “no action” (which is the default, though you can’t actually specify it).

Side note: “No Action” and “Restrict” are very similar – if you try to delete a parent for which a child row exists then the attempt will raise an error. The difference between the two operations is that “no action” will delete the parent then rollback when it finds the child while “restrict” will check to see if the child exists before attempting to delete the parent. (See footnote 1 for detailed examination of how Oracle handles “no action”).
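
For reference, the three variants you can declare in Oracle look like this – a sketch using the parent/child names from the demo that follows; note that you can’t spell out “no action”, you get it simply by omitting the on delete clause (and, of course, you would only create one of these at a time):

alter table child add constraint chi_fk_par
        foreign key(n1) references parent               -- "no action" (the default)
;

alter table child add constraint chi_fk_par
        foreign key(n1) references parent
        on delete cascade                               -- delete child rows with the parent
;

alter table child add constraint chi_fk_par
        foreign key(n1) references parent
        on delete set null                              -- null the child's FK column(s)
;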

As an initial investigation of the work that Oracle does to handle referential integrity and deletes I want to examine one simple but extreme example. I’m going to create a parent/child pair of tables with “on delete no action”, and delete one parent row. The thing that makes the demonstration “extreme”  is that I will first create, and then delete, a large number of rows for that parent row before deleting the parent. Here’s the initial data setup:

rem
rem     Script:         fk_lock_stress.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2012
rem
rem     Last tested 
rem             19.3.0.0
rem

create table child
as
with generator as (
        select
                rownum id 
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid wordpress format issue
)
select
        trunc((rownum-1)/1200)  n1,
        lpad('x',80)            idx_padding,
        lpad(rownum,8,'0')      small_vc
from
        generator       v1,
        generator       v2
where
        rownum <= 6e4 -- > comment to avoid wordpress format issue
;

create index child_ix on child(n1, idx_padding) pctfree 95;

create table parent as
select
        id,
        lpad(rownum,8,'0')      small_vc,
        lpad('x',80)            padding
from    (
        select 
                distinct n1             id
        from
                child
        )
;

alter table parent add constraint par_pk primary key(id);
alter table child add constraint chi_fk_par foreign key(n1) references parent;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'PARENT',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'CHILD',
                method_opt       => 'for all columns size 1'
        );
end;
/


select  index_name, num_rows, distinct_keys, leaf_blocks, avg_leaf_blocks_per_key 
from    user_indexes
where   table_name in ('PARENT','CHILD')
;

select  object_id, object_name
from    user_objects
order by
        object_id
;

delete from child where n1 = 10;
commit;
delete from child where n1 = 20;
commit;
delete from child where n1 = 30;
commit;
delete from child where n1 = 40;
commit;

execute dbms_stats.gather_table_stats(user, 'child', cascade=>true)

alter system flush buffer_cache;

For each parent row there are 1,200 child rows, and I’ve given the “foreign key” index on child a pctfree of 95% which has resulted in the index needing roughly 600 leaf blocks per key. (When I first created this test my value for x$kcbbf (buffer pins) was 500 and I wanted to see what would happen if I needed to pin more buffers than could fit in the array).

After creating the index and gathering stats I’ve deleted all the rows for 4 of the possible parent values from the child table because I want to see how much work it takes to delete a parent for which there are no longer any child rows, as follows:

execute snap_events.start_snap
execute snap_my_stats.start_snap

alter session set events '10046 trace name context forever, level 8';
alter session set tracefile_identifier = 'del';

delete from parent where id = 10;
commit;

delete from parent where id = 40;
commit;

alter session set tracefile_identifier = '';
alter session set events '10046 trace name context off';

execute snap_my_stats.end_snap
execute snap_events.end_snap

The “snap_my_stats / snap_events” packages are just my usual capture of workload information over the interval. I’ve enabled extended tracing at level 8 (waits) so that I can see if Oracle runs any interesting recursive SQL and, since I’ve flushed the buffer cache, this will also let me see which data blocks Oracle has to read from the database.

Here are the key results from a test run on 19.3.0.0:


Statistic                                                Value
---------                                                -----
physical reads                                           1,225

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
db file sequential read                           1,225           0           0.86        .001           1
events in waitclass Other                            11           0           0.01        .001          73

Is it a coincidence that I’ve done 1,225 single block reads when the number of leaf blocks per key in the child index is a little over 600 and I’ve just deleted two parent key values? (Spoiler: NO).

When we apply tkprof to the trace file we find the following summary of the first delete:


SQL ID: 0u6t174agkq27 Plan Hash: 3366423708

delete from parent
where
 id = 10


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          0          1           0
Execute      1      0.00       0.02        608          1        609           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.03        608          1        610           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 107
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  DELETE  PARENT (cr=1 pr=608 pw=0 time=23591 us starts=1)
         1          1          1   INDEX UNIQUE SCAN PAR_PK (cr=1 pr=1 pw=0 time=60 us starts=1 cost=0 size=3 card=1)(object id 94905)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                       608        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
********************************************************************************

The “Row Source Operation” statistics tell us that it has taken 608 single block reads to access a single row from the parent table by unique index and delete it. So let’s look at the content of the trace file – looking for any WAIT line that is a “db file sequential read” wait relating to this cursor. (I started by looking for the statement in the trace file so that I could extract its cursor number before using the following grep command):


grep "#139773970345352" or19_ora_17541_del.trc | grep "WAIT.*db file sequential read" | sed "s/^.*obj#=//" | sed "s/ .*$//" | sort | uniq -c | sort -n

      2 0
    604 94903
      1 94904
      1 94905

If you’re wondering which objects these numbers correspond to (though you may have made a reasonable guess by now), here are the results from my earlier query against user_objects:


 OBJECT_ID OBJECT_NAME
---------- --------------------
     94902 CHILD
     94903 CHILD_IX
     94904 PARENT
     94905 PAR_PK

I visit one index block in the parent index, then one block in the parent table, then 604 blocks in the child index to check that there are no current, or potential, index entries in each leaf block. I’ve done that with single block reads, walking the child index in order, and I do it without showing any SQL that might help me spot that it’s happening.

Exactly the same mechanism then comes into play for deleting parent id = 40: effectively Oracle does a “silent” index range scan of the child index where n1 = 40 to make sure that there are no current child rows.

When I first created this test it was because I wanted to see if Oracle would pin all the buffers holding the current image of the relevant child index leaf blocks in exclusive mode as it walked through the index – this seemed to be the obvious way to ensure that no other session could insert a child row into a leaf block that had already been checked – but when I enabled pin tracing I found 600+ pairs of “Acquire/Release”. This may help to explain Oracle’s choice of “No Action” (rather than “Restrict”): if the parent row is locked and marked for deletion then there is no need to keep the child leaf blocks pinned exclusively as any attempt by another session to introduce a new child would first check the parent and discover that it was marked for deletion and wait for the deleting transaction to commit or rollback.

There is a gap in this hypothesis, though, which shows up when there is no appropriate index on the table. (Which includes the case where the “foreign key” index has been created as a bitmap instead of a B-tree, or where there’s a B-tree index that includes the foreign key column(s) but not at the start of the index definition, or if the foreign key columns are at the start of the index but declared as descending).

If there is no appropriate “foreign key” index there are two changes in the activity. First, the child table will temporarily be locked in mode 4 (share mode) or mode 5 (share row exclusive mode) for the duration of the referential integrity check. Secondly you will see the text of the recursive SQL statement that does the check; here, for example, is the SQL extracted from a 19.3 trace file in a few variants of the original test:


select /*+ all_rows */ count(1)
from
 "TEST_USER"."CHILD" where "N1" = :1

Oracle’s treatment of this statement is cost-based, and in my case, depending on the choice of index, level of compression, and settings for system statistics I found that I could get any one of the following plans:


Bitmap index
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=3 pr=2 pw=0 time=139 us starts=1)
         0          0          0   BITMAP CONVERSION COUNT (cr=3 pr=2 pw=0 time=132 us starts=1 cost=2 size=3600 card=1200)
         0          0          0    BITMAP INDEX RANGE SCAN CHILD_IX (cr=3 pr=2 pw=0 time=128 us starts=1)(object id 95098)

No index at all
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=826 pr=412 pw=0 time=4880 us starts=1)
         0          0          0   TABLE ACCESS FULL CHILD (cr=826 pr=412 pw=0 time=4870 us starts=1 cost=114 size=3600 card=1200)

Index on (idx_padding, n1)
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=826 pr=412 pw=0 time=5601 us starts=1)
         0          0          0   TABLE ACCESS FULL CHILD (cr=826 pr=412 pw=0 time=5591 us starts=1 cost=114 size=3600 card=1200)

Index on (idx_padding, n1) compress 1 (and tweaking MBRC)
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  SORT AGGREGATE (cr=128 pr=126 pw=0 time=2467 us starts=1)
         0          0          0   INDEX SKIP SCAN CHILD_IX (cr=128 pr=126 pw=0 time=2460 us starts=1 cost=122 size=3600 card=1200)(object id 95111)

You’ll note that I have two cases where the strategy chosen for the query seems to be a close match to the operation performed with the “proper foreign key index” in place – moreover the number of blocks read for the tablescan strategy is less than the number of blocks read when we had the “foreign key” index in place. So what’s the difference that makes it possible for Oracle to avoid locking the table in mode 4/5 when we have that index?
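
If you want to see the difference in the locking for yourself you could watch the TM lock on the child table from a second session while the parent delete is in progress (or while the deleting session is paused before its commit). A minimal sketch, assuming the second session is connected as the table owner and has the privileges to query v$lock:

select
        l.sid, l.type, l.id1, l.lmode, l.request
from
        v$lock  l
where
        l.type = 'TM'
and     l.id1  = (
                select  object_id
                from    user_objects
                where   object_name = 'CHILD'
        )
;

Without a suitable “foreign key” index you should be able to catch the child table’s TM lock (briefly) at lmode 4 or 5.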

These results, by the way, explain why Mikhail Velikikh (in comment #1) can say that large numbers of buffer gets and disk reads for the parent delete are NOT an indication of a missing foreign key index … when the index is missing the large numbers are correctly reported against the SQL that checks the child, not against the parent delete.

Summary

Even if you’ve deleted all the child data before attempting to delete a parent row, and YOU know that there’s no child data left, when ORACLE is told to delete the parent row it still has to check that there’s no child data, and in a large scale delete (particularly where there may be many child rows per parent) you may end up seeing a surprising amount of I/O on single block reads of the “foreign key” indexes on any child tables.

This means that for a very large “house-keeping” or archiving delete with a limited time-window you may want to take special steps, such as rebuilding “foreign key” indexes after deleting large numbers of child rows and before deleting the parent. You may even want to disable/drop the foreign key constraint before the parent delete and re-enable it afterwards, if you think you can avoid getting any bad data into the tables while the job is running.

In the next article: on delete cascade, on delete set null.

Footnote 1

To demonstrate Oracle’s behaviour for the default “no action” foreign key constraint when you attempt to delete a parent row for which child data exists (which will result in an Oracle error of the form: “ORA-02292: integrity constraint (TEST_USER.CHI_FK_PAR) violated – child record found”) you need only set up a small amount of data, attempt the delete and then dump the redo for the transaction – if nothing else is going on you’ll find the following set of redo change vectors (taken from a trace file generated in 19.3, using the command “grep OP {filename}”):

CHANGE #1 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00084 OBJ:94828 SCN:0x00000b860f303fa4 SEQ:2 OP:11.3 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:35 AFN:17 DBA:0x04400110 OBJ:4294967295 SCN:0x00000b860f303f98 SEQ:2 OP:5.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #3 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c0009c OBJ:94829 SCN:0x00000b860f303fa4 SEQ:2 OP:10.4 ENC:0 RBL:0 FLG:0x0000
CHANGE #4 CON_ID:3 TYP:0 CLS:36 AFN:17 DBA:0x04404564 OBJ:4294967295 SCN:0x00000b860f303f98 SEQ:3 OP:5.1 ENC:0 RBL:0 FLG:0x0000
CHANGE #5 CON_ID:3 TYP:0 CLS:36 AFN:17 DBA:0x04404564 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:1 OP:5.1 ENC:0 RBL:0 FLG:0x0000

CHANGE #1 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c0009c OBJ:94829 SCN:0x00000b860f304062 SEQ:1 OP:10.5 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:36 AFN:17 DBA:0x04404564 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:2 OP:5.6 ENC:0 RBL:0 FLG:0x0000

CHANGE #1 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00084 OBJ:94828 SCN:0x00000b860f304062 SEQ:1 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:35 AFN:17 DBA:0x04400110 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:1 OP:5.11 ENC:0 RBL:0 FLG:0x0000

CHANGE #1 CON_ID:3 TYP:0 CLS:35 AFN:17 DBA:0x04400110 OBJ:4294967295 SCN:0x00000b860f304062 SEQ:2 OP:5.4 ENC:0 RBL:0 FLG:0x0000

11.3  delete row piece
5.2   start transaction
10.4  delete leaf row    (ed: this is for the primary key index)
5.1   create undo record for table block
5.1   create undo record for index leaf block

10.5  restore leaf row during rollback
5.6   mark index undo as applied

11.2  insert rowpiece
5.11  mark table undo as applied

5.4   commit;

So Oracle deletes the parent (and discovers the child), then rolls back the parent delete.
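
If you want to reproduce the dump for yourself, here’s a minimal sketch of the capture step – re-using the parent/child tables from the main test and picking a parent id that still has child rows (the value 50 below is just an arbitrary example):

column current_scn new_value start_scn
select to_char(current_scn) current_scn from v$database;

delete from parent where id = 50;       -- fails with ORA-02292 because child rows still exist
commit;

column current_scn new_value end_scn
select to_char(current_scn) current_scn from v$database;

alter session set tracefile_identifier = 'fk_no_action';
alter system dump redo scn min &start_scn scn max &end_scn;

Then run the grep OP command against the resulting trace file as described above.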

 

 

August 24, 2020

Flashback Bug

Filed under: 18c,Bugs,Oracle,redo,Troubleshooting — Jonathan Lewis @ 9:34 am BST Aug 24,2020

Here’s a problem with the “flashback versions” technology that showed up at the end of last week. There’s a thread about it on the Oracle Developer community forum, and a chain of tweets that was my initial response to a twitter alert about it that Daniel Stein posted.

The problem appears somewhere in the 18c timeline – it doesn’t seem to be present in 12.2.0.1 – so if you’re running any versions 18+ here’s a modified version of the test case supplied by Daniel Stein to demonstrate the issue.

rem
rem     Script:         flashback_bug.sql
rem     Author:         Jonathan Lewis / Daniel Stein
rem     Dated:          Aug 2020
rem
rem     Last tested 
rem             19.3.0.0        Wrong result
rem             12.2.0.1        Correct result
rem

create table t1 (n1 number(4)) 
segment creation immediate
;

prompt  =============================================================
prompt  Sleeping 10 seconds after create table to avoid subsequent
prompt  ORA-01466: unable to read data - table definition has changed
prompt  =============================================================


execute dbms_lock.sleep(10)

prompt  =========
prompt  Start SCN
prompt  =========

column current_scn new_value scn_vorher2 
select to_char(current_scn,'9999999999999999') current_scn from V$DATABASE;

insert into t1(n1) values (1);
insert into t1(n1) values (2);
insert into t1(n1) values (3);
insert into t1(n1) values (4);
insert into t1(n1) values (5);
insert into t1(n1) values (6);
insert into t1(n1) values (7);
insert into t1(n1) values (8);
insert into t1(n1) values (9);
insert into t1(n1) values (10);
insert into t1(n1) values (11);
insert into t1(n1) values (12);

delete from t1 where n1 in (2, 5, 8, 11);

commit;

prompt  =======
prompt  End SCN
prompt  =======

column current_scn new_value scn_nachher
select to_char(current_scn,'9999999999999999') current_scn from V$DATABASE;

prompt  =============
prompt  Version Query 
prompt  =============

column VERSIONS_STARTSCN        format 9999999999999999 heading "Start SCN"
column VERSIONS_ENDSCN          format 9999999999999999 heading "End SCN"
column VERSIONS_OPERATION       format A9               heading "Operation"
column SCN_RANGE                format A35              heading "SCN Range"

select 
        n1, rowid, 
        versions_operation, versions_startscn, versions_endscn,
        &scn_vorher2||' and '||&scn_nachher scn_range
from
        t1 versions between scn &scn_vorher2 and &scn_nachher
order by 
        rowid, versions_startscn, versions_operation, versions_endscn
;

prompt  ==================
prompt  Current Table data
prompt  ==================

select n1,rowid from t1;


alter system dump redo scn min &scn_vorher2 scn max &scn_nachher;

I’ve created a table then inserted a few rows and deleted some of them in the same transaction. I’ve captured the database SCN at the start and end of the transaction and then tried to run a “versions between” query across that range of SCNs. Here are the results from the “versions between” query and the final query of the current contents of the table on a test on a database running 19.3:

        N1 ROWID              Operation         Start SCN           End SCN SCN Range
---------- ------------------ --------- ----------------- ----------------- -----------------------------------
         1 AAAXFOAATAAAACDAAA I            12670408139684                   12670408139679 and 12670408139687
         3 AAAXFOAATAAAACDAAC I            12670408139684                   12670408139679 and 12670408139687
         4 AAAXFOAATAAAACDAAD I            12670408139684                   12670408139679 and 12670408139687
         5 AAAXFOAATAAAACDAAE I            12670408139684                   12670408139679 and 12670408139687
         6 AAAXFOAATAAAACDAAF I            12670408139684                   12670408139679 and 12670408139687
         7 AAAXFOAATAAAACDAAG I            12670408139684                   12670408139679 and 12670408139687
         8 AAAXFOAATAAAACDAAH I            12670408139684                   12670408139679 and 12670408139687
         9 AAAXFOAATAAAACDAAI I            12670408139684                   12670408139679 and 12670408139687
        10 AAAXFOAATAAAACDAAJ I            12670408139684                   12670408139679 and 12670408139687
        11 AAAXFOAATAAAACDAAK I            12670408139684                   12670408139679 and 12670408139687

10 rows selected.

        N1 ROWID
---------- ------------------
         1 AAAXFOAATAAAACDAAA
         3 AAAXFOAATAAAACDAAC
         4 AAAXFOAATAAAACDAAD
         6 AAAXFOAATAAAACDAAF
         7 AAAXFOAATAAAACDAAG
         9 AAAXFOAATAAAACDAAI
        10 AAAXFOAATAAAACDAAJ
        12 AAAXFOAATAAAACDAAL

8 rows selected.

There’s a trap in this demonstration. If you follow the link to the forum thread you’ll find that Daniel Stein says there’s a problem with Oracle 19.6 and upwards (but not 12.2) – then various other people test on their versions and find that 19.7, 18.5, and 19.3 are broken. That last one is what made me run the test on the 19.3 I had at hand – and I got the right result, unlike the results above.

If you look at the output from this test run you’ll see that the versioning query says that in the SCN range I’ve inserted 10 rows, and that rows 2 and 12 have never been inserted. Given the SQL I’d run I was expecting to see 16 rows in the output, 12 of them saying I’d inserted values 1 through 12, and then 4 saying I’d deleted 2, 5, 8 and 11 – but (when it works correctly) Oracle manages to combine the insert and delete in the same transaction to cancel the entries. So the versioning query ought to be showing exactly the 8 rows that appear in the current data query.

So I had two puzzles:

  • What’s gone wrong for everyone else ?
  • Why hasn’t it gone wrong for me ?

To answer the second question first: I took a look at my instance parameters and found that I had set statistics_level to all. Clearly this ought to have nothing to do with an error in flashback version queries, but I changed it to typical anyway and re-ran the test: and got the (now expected) wrong results. If you read the twitter thread you’ll find that Oracle support had had an SR opened on this problem for a few days but had been unable to reproduce the problem – perhaps they, too, had a system with statistics_level set to all.

Getting back to the first question: what’s gone wrong. I couldn’t think of a good reason why setting the statistics_level should make a difference, so I took the view that the versioning query is running through the redo log to find all the changes that took place in the range, so maybe the answer will appear in the redo. That’s why the last line in the demo above is a command to dump the redo generated between the two SCNs.

Short-circuiting the details of what I looked at I’m just going to show you the layer 11 (table) redo change vectors for my DML, which I extracted from the trace file using the command grep “OP:11” {filename}.

First the set when statistics_level = all:


CHANGE #3 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1b SEQ:2 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1b SEQ:3 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:1 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:2 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:3 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:4 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:5 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:6 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:7 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:8 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:9 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:10 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #3 CON_ID:3 TYP:2 CLS:1 AFN:10 DBA:0x01009c57 OBJ:720 SCN:0x00000b860f2d3cb8 SEQ:1 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1c SEQ:11 OP:11.3 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1d SEQ:1 OP:11.3 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1d SEQ:2 OP:11.3 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1d SEQ:3 OP:11.3 ENC:0 RBL:0 FLG:0x0000

What you can see are 13 op codes “11.2” – which is “insert row piece”. Most of these are for OBJ 94544, which is the object_id for my table, and one of them is for OBJ 720, which is a table owned by sys called exp_head$ (which is something to do with “expression tracking”). After the inserts there are 4 op codes “11.3”, which is “delete row piece”. So there’s no obvious error or complication in the redo.

And now when we set statistics_level to typical and re-run the script (which means we get a new OBJ number):

CHANGE #3 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8a SEQ:2 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8a SEQ:3 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:1 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:2 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:3 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:4 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:5 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:6 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:7 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:8 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:9 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:10 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #3 CON_ID:3 TYP:2 CLS:1 AFN:10 DBA:0x01009c57 OBJ:720 SCN:0x00000b860f2d3d1d SEQ:1 OP:11.2 ENC:0 RBL:0 FLG:0x0000
CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:11 OP:11.12 ENC:0 RBL:0 FLG:0x0000

In this case we get 13 “insert row piece” Op Codes, and then a single “11.12” which is “delete multiple rows”. So there IS a difference in the redo, and something about that difference is probably behind the error coming and going.

Taking a closer look at the 11.12 record and at one of the 11.3 redo change vectors, this is what we see:


CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94545 SCN:0x00000b860f2d3d8b SEQ:11 OP:11.12 ENC:0 RBL:0 FLG:0x0000
KTB Redo
op: 0x02  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C  uba: 0x04400a24.0672.3c
KDO Op code: QMD row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x04c00083  hdba: 0x04c00082
itli: 1  ispac: 0  maxfr: 4858
tabn: 0 lock: 1 nrow: 4
slot[0]: 1
slot[1]: 4
slot[2]: 7
slot[3]: 10


CHANGE #2 CON_ID:3 TYP:0 CLS:1 AFN:19 DBA:0x04c00083 OBJ:94544 SCN:0x00000b860f2d3d1d SEQ:3 OP:11.3 ENC:0 RBL:0 FLG:0x0000
KTB Redo
op: 0x02  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: C  uba: 0x04400467.0595.2c
KDO Op code: DRP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x04c00083  hdba: 0x04c00082
itli: 1  ispac: 0  maxfr: 4858
tabn: 0 slot: 10(0xa)

The 11.12 Op Code shows us a list of 4 entries in the block’s “row directory” for “table 0”. The 11.3 Op Code shows us just one – it’s the last of the four deletes so it’s reporting entry 10 in the “row directory”; the previous three Op Code 11.3 vectors were reporting slots 1, 4, and 7 respectively.

Obviously we can’t draw any firm conclusions about what’s going wrong, we simply have a few observations that might give us some ideas of the type of error; and after a few more experiments I decided that the code handling flashback versioning was mis-interpreting the “delete multiple rows” record. It consistently seemed to delete the first row identified by the slot[0] entry, then jump straight to the last slot but add one to that index entry before attempting to apply the vector.

Footnote

I’ve used dbms_lock.sleep() to pause for 10 seconds after creating the table. The sleep() procedure was copied into the dbms_session package in recent versions of Oracle so that end-user code could access it without having to receive privileges on the rather more dangerous dbms_lock package.
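
So if you’re on a version where it’s available (18c onwards, I believe) and don’t have execute privilege on dbms_lock, the equivalent call would simply be:

execute dbms_session.sleep(10)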

I’ve referenced the statistics_level parameter as the one affecting the generation of the “delete multiple row” redo vector. In fact you can get the same effect by setting the hidden parameter “_rowsource_execution_statistics” to true. Setting statistics_level to all has the effect of enabling rowsource execution statistics so this isn’t surprising.
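
In SQL*Plus that would be something like the following (hidden parameters need the double-quotes):

alter session set "_rowsource_execution_statistics" = true;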

I pointed out that the flashback versioning code seemed to “add 1” to the final slot[n] identifier before producing the flashback output. Think about what that means if you change the demonstration to delete the LAST row in the list of rows inserted. Here’s what the versioning query did in a test when I changed 11 to 12 in my delete statement:


ERROR at line 2:
ORA-00600: internal error code, arguments: [ktrvtgr_2], [79691907], [0], [11], [], [], [], [], [], [], [], []

79691907 is the block number of the block where the table rows are stored.
11 is the correct row directory index for the last row (12) in the table – but my hypothesis is that the flashback code was trying to find (non-existent) directory index 12 because it adds one to the supplied index entry.

Update

In the interval between completing the final draft yesterday and publishing the note this morning, Oracle support replied to the forum thread with a bug number (28910586 – not yet publicly visible) and a note that a patch could be available on request if an SR were raised giving the Oracle version and platform. It will be interesting to see whether the patch blocks the creation of the 11.12 op codes or whether it corrects the interpretation of 11.12 codes by the flashback versions code.

August 18, 2020

Explain Rewrite

Filed under: Materialized view,Oracle,Troubleshooting — Jonathan Lewis @ 7:07 pm BST Aug 18,2020

This is one of those notes that I thought I’d written years ago. It answers two questions:

  • what can I do with my materialized view?
  • why isn’t the optimizer using my materialized view for query rewrite?

I’ve actually supplied an example of code to address the first question as a throwaway comment in a blog that dealt with a completely different problem, but since the two questions above go together, and the two answers depend on the same package, I’m going to repeat the first answer.

The reason for writing this note now is that the question “why isn’t this query using my materialized view” came up on the Oracle Developer community forum a few days ago – and I couldn’t find the article that I thought I’d written.

Note: a couple of days after I started drafting this note Franck Pachot tweeted the links to a couple of extensive posts on materialized views that included everything I had to say (and more) about “what can my materialized view do”. Fortunately he didn’t get on to the second question – so I decided to publish the whole of my note anyway as the two questions go well together.

The key feature is the dbms_mview package, and the two procedures (both overloaded) explain_mview() and explain_rewrite(). To quote the 12.2 “PL/SQL Supplied Packages” manual page, the first procedure:

“explains what is possible with a materialized view or potential [ed: my emphasis] materialized view”

note particularly that the materialized view doesn’t need to have been created before you run the package – the second procedure:

“explains why a query failed to rewrite or why the optimizer chose to rewrite a query with a particular materialized view or materialized views”.

Here’s the relevant extract from the SQL*Plus describe of the package –

PROCEDURE EXPLAIN_MVIEW
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 MV                             VARCHAR2                IN
 STMT_ID                        VARCHAR2                IN     DEFAULT

PROCEDURE EXPLAIN_MVIEW
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 MV                             CLOB                    IN
 STMT_ID                        VARCHAR2                IN     DEFAULT

PROCEDURE EXPLAIN_MVIEW
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 MV                             VARCHAR2                IN
 MSG_ARRAY                      EXPLAINMVARRAYTYPE      IN/OUT

PROCEDURE EXPLAIN_MVIEW
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 MV                             CLOB                    IN
 MSG_ARRAY                      EXPLAINMVARRAYTYPE      IN/OUT

PROCEDURE EXPLAIN_REWRITE
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 QUERY                          VARCHAR2                IN
 MV                             VARCHAR2                IN     DEFAULT
 STATEMENT_ID                   VARCHAR2                IN     DEFAULT

PROCEDURE EXPLAIN_REWRITE
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 QUERY                          CLOB                    IN
 MV                             VARCHAR2                IN     DEFAULT
 STATEMENT_ID                   VARCHAR2                IN     DEFAULT

PROCEDURE EXPLAIN_REWRITE
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 QUERY                          VARCHAR2                IN
 MV                             VARCHAR2                IN     DEFAULT
 MSG_ARRAY                      REWRITEARRAYTYPE        IN/OUT

PROCEDURE EXPLAIN_REWRITE
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 QUERY                          CLOB                    IN
 MV                             VARCHAR2                IN     DEFAULT
 MSG_ARRAY                      REWRITEARRAYTYPE        IN/OUT

As you can see there are 4 overloaded versions of explain_mview() and 4 of explain_rewrite(): both procedures take an input of an SQL statement (parameter mv for explain_mview(), parameter query for explain_rewrite()) – which can be either a varchar2() or a CLOB, and both procedures supply an output which can be written to a table or written to an “in/out” pl/sql array.

Two possible input options times two possible output options gives 4 overloaded versions of each procedure. In this note I’ll restrict myself to the first version of the two procedures – varchar2() input, writing to a pre-created table.

Here’s a simple demo script. Before we do anything else we need to call a couple of scripts in the $ORACLE_HOME/rdbms/admin directory to create the target tables. (If you want to do something a little clever you could tweak the scripts to create them as global temporary tables in the sys schema with a public synonym – just like the plan_table used in calls to explain plan.)

rem
rem     Script:         c_explain_mv.sql
rem     Author:         Jonathan Lewis
rem     Dated:          March 2002
rem

@$ORACLE_HOME/rdbms/admin/utlxmv.sql
@$ORACLE_HOME/rdbms/admin/utlxrw.sql

-- @$ORACLE_HOME/sqlplus/demo/demobld.sql

create materialized view dept_cost
refresh complete on demand
enable query rewrite
as
select 
        d.deptno,sum(e.sal) 
from 
        emp e,dept d
where 
        e.deptno = d.deptno
group by 
        d.deptno
;
 
set autotrace traceonly explain
 
select 
        d.deptno,sum(e.sal) 
from 
        emp e,dept d
where 
        e.deptno = d.deptno
and     d.deptno=10
group by 
        d.deptno
;

set autotrace off

/*

Execution Plan
----------------------------------------------------------
Plan hash value: 3262931184

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |     7 |     2   (0)| 00:00:01 |
|*  1 |  MAT_VIEW REWRITE ACCESS FULL| DEPT_COST |     1 |     7 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DEPT_COST"."DEPTNO"=10)

*/

My original script is rather old and in any recent version of Oracle the Scott tables are no longer to be found in demobld.sql, so I’ve added them at the end of this note.

After creating the emp and dept tables I’ve created a materialized view and then enabled autotrace to see if a suitable query will use the materialized view – and as you can see from the execution plan the materialized view can be used.

So let’s do a quick analysis of the view and the rewrite:


column audsid new_value m_audsid
select sys_Context('userenv','sessionid') audsid from dual;

begin
        dbms_mview.explain_mview(
--              mv      => 'DEPT_COST',
                mv      => q'{
                        create materialized view dept_cost
                        refresh complete on demand
                        enable query rewrite
                        as
                        select 
                                d.deptno,sum(e.sal) 
                        from 
                                emp e,dept d
                        where 
                                e.deptno = d.deptno
                        group by 
                                d.deptno
                }',
                stmt_id         => '&m_audsid'
        );
end;
/
 
set linesize 180

column cap_class noprint
column capability_name format a48
column related_text format a15
column short_msg format a90

break on cap_class skip 1
 
select
        substr(capability_name,1,3) cap_class,
        capability_name, possible, related_text, msgno, substr(msgtxt,1,88) short_msg
from
        mv_capabilities_table
where
        mvname       = 'DEPT_COST'
and     statement_id = '&m_audsid'
order by
        substr(capability_name,1,3), related_num, seq
;


I’ve captured the session’s audsid because the mv_capabilities_table table and the rewrite_table table both have a statement_id column and the audsid is a convenient value to use to identify the most recent data you’ve created if you’re sharing the table. Then I’ve called dbms_mview.explain_mview() indicating two possible strategies (with one commented out, of course).

I could pass in a materialized view name as the mv parameter, or I could pass in the text of a statement to create a materialized view (whether or not I have previously created it). As you can see, I’ve also used a substitution variable to pass in my audsid as the statement id.

After setting up a few SQL*Plus format options I’ve then queried some of the columns from the mv_capabilities_table table, with the following result:


CAPABILITY_NAME                                  P RELATED_TEXT    MSGNO SHORT_MSG
------------------------------------------------ - --------------- ----------------------------------------
PCT_TABLE                                        N EMP             2068 relation is not a partitioned table
PCT_TABLE_REWRITE                                N EMP             2068 relation is not a partitioned table
PCT_TABLE                                        N DEPT            2068 relation is not a partitioned table
PCT_TABLE_REWRITE                                N DEPT            2068 relation is not a partitioned table
PCT                                              N

REFRESH_FAST_AFTER_ONETAB_DML                    N SUM(E.SAL)      2143 SUM(expr) without COUNT(expr)
REFRESH_COMPLETE                                 Y
REFRESH_FAST                                     N
REFRESH_FAST_AFTER_INSERT                        N TEST_USER.EMP   2162 the detail table does not have a materialized view log
REFRESH_FAST_AFTER_INSERT                        N TEST_USER.DEPT  2162 the detail table does not have a materialized view log
REFRESH_FAST_AFTER_ONETAB_DML                    N                 2146 see the reason why REFRESH_FAST_AFTER_INSERT is disabled
REFRESH_FAST_AFTER_ONETAB_DML                    N                 2142 COUNT(*) is not present in the select list
REFRESH_FAST_AFTER_ONETAB_DML                    N                 2143 SUM(expr) without COUNT(expr)
REFRESH_FAST_AFTER_ANY_DML                       N                 2161 see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
REFRESH_FAST_PCT                                 N                 2157 PCT is not possible on any of the detail tables in the materialized view

REWRITE                                          Y
REWRITE_FULL_TEXT_MATCH                          Y
REWRITE_PARTIAL_TEXT_MATCH                       Y
REWRITE_GENERAL                                  Y
REWRITE_PCT                                      N                 2158 general rewrite is not possible or PCT is not possible on any of the detail tables


20 rows selected.

If you want the complete list of possible messages, the msgno is interpreted from $ORACLE_HOME/rdbms/mesg/qsmus.msg (which is shared with the explain_rewrite() procedure). Currently (19.3) it holds a slightly surprising 796 messages. A call to “oerr qsm nnnnn” will translate the number to the message text.
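
For example, picking msgno 2143 from the capabilities output above, a quick check from the shell:

oerr qsm 2143

which should report the “SUM(expr) without COUNT(expr)” text along with its cause and action.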

In a similar vein we can call dbms_mview.explain_rewrite(). Before we do so I’m going to do something that will stop the rewrite from taking place, then call the procedure:

update emp set ename = 'JAMESON' where ename = 'JAMES';
commit;

begin
        dbms_mview.explain_rewrite ( 
                query => q'{
                        select 
                                d.deptno,sum(e.sal) 
                        from 
                                emp e,dept d
                        where 
                                e.deptno = d.deptno
                        and     d.deptno=10
                        group by 
                                d.deptno
                        }',
                mv => null,
                statement_id => '&m_audsid'
        );
end;
/

column message format a110

select
        sequence, message 
from 
        rewrite_table
where
        statement_id = '&m_audsid'
order by
        sequence
/

You’ll notice that there’s an input parameter of mv – there may be cases where the optimizer has a choice of which materialized view to use to rewrite a query, and you could supply the name of a materialized view for this parameter to find out why Oracle didn’t choose the materialized view you were expecting. (In this case the mv parameter has to supply a materialized view name, not the text to create a materialized view.)

In my example I get the following results:

  SEQUENCE MESSAGE
---------- --------------------------------------------------------------------------------------------------------------
         1 QSM-01150: query did not rewrite
         2 QSM-01106: materialized view, DEPT_COST, is stale with respect to some partition(s) in the base table(s)
         3 QSM-01029: materialized view, DEPT_COST, is stale in ENFORCED integrity mode

3 rows selected.

The 2nd and 3rd messages are a bit of a clue – so let’s change the session’s query_rewrite_integrity parameter to stale_tolerated and re-execute the procedure; which gets us to:


  SEQUENCE MESSAGE
---------- --------------------------------------------------------------------------------------------------------------
         1 QSM-01151: query was rewritten
         2 QSM-01033: query rewritten with materialized view, DEPT_COST
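
For reference, the change that made the difference was just a session-level setting (the legal values for the parameter are enforced, trusted and stale_tolerated):

alter session set query_rewrite_integrity = stale_tolerated;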

Footnote:

You’ll notice that I’ve used quote-escaping in my inputs for the materialized view definition and query. This is a convenience that lets me easily cut and paste a statement into the scripts – or even use the @@script_name mechanism – writing a query (without a trailing slash, semi-colon, or any blank lines) in a separate file and then citing the script name between the opening and closing quote lines, e.g.

begin
        dbms_mview.explain_rewrite ( 
                query        => q'{
@@test_script
                                }',
                mv           => null,
                statement_id => '&m_audsid'
        );
end;
/

It’s worth noting that the calls to dbms_mview.explain_mview() and dbms_mview.explain_rewrite() don’t commit the rows they write to their target tables, so you ought to include a rollback; at the end of your script to avoid any confusion as you test multiple views and queries.

Footnote:

Here’s the text of the demobld.sql script as it was when I copied it in the dim and distant past. I’m not sure which version it came from – but I don’t think it’s the original v4/v5 release, which I’m fairly sure had only the emp and dept tables. (For reference – and for anyone thinking about hash joins – there used to be a MOS note explaining that EMP was the big table and DEPT was the small table ;)


CREATE TABLE EMP (
        EMPNO           NUMBER(4) NOT NULL,
        ENAME           VARCHAR2(10),
        JOB             VARCHAR2(9),
        MGR             NUMBER(4),
        HIREDATE        DATE,
        SAL             NUMBER(7, 2),
        COMM            NUMBER(7, 2),
        DEPTNO          NUMBER(2)
);

insert into emp values
        (7369, 'SMITH',  'CLERK',     7902,
        to_date('17-DEC-1980', 'DD-MON-YYYY'),  800, NULL, 20);

insert into emp values
        (7499, 'ALLEN',  'SALESMAN',  7698,
        to_date('20-FEB-1981', 'DD-MON-YYYY'), 1600,  300, 30);

insert into emp values
        (7521, 'WARD',   'SALESMAN',  7698,
        to_date('22-FEB-1981', 'DD-MON-YYYY'), 1250,  500, 30);

insert into emp values
        (7566, 'JONES',  'MANAGER',   7839,
        to_date('2-APR-1981', 'DD-MON-YYYY'),  2975, NULL, 20);

insert into emp values
        (7654, 'MARTIN', 'SALESMAN',  7698,
        to_date('28-SEP-1981', 'DD-MON-YYYY'), 1250, 1400, 30);

insert into emp values
        (7698, 'BLAKE',  'MANAGER',   7839,
        to_date('1-MAY-1981', 'DD-MON-YYYY'),  2850, NULL, 30);

insert into emp values
        (7782, 'CLARK',  'MANAGER',   7839,
        to_date('9-JUN-1981', 'DD-MON-YYYY'),  2450, NULL, 10);

insert into emp values
        (7788, 'SCOTT',  'ANALYST',   7566,
        to_date('09-DEC-1982', 'DD-MON-YYYY'), 3000, NULL, 20);

insert into emp values
        (7839, 'KING',   'PRESIDENT', NULL,
        to_date('17-NOV-1981', 'DD-MON-YYYY'), 5000, NULL, 10);

insert into emp values
        (7844, 'TURNER', 'SALESMAN',  7698,
        to_date('8-SEP-1981', 'DD-MON-YYYY'),  1500,    0, 30);

insert into emp values
        (7876, 'ADAMS',  'CLERK',     7788,
        to_date('12-JAN-1983', 'DD-MON-YYYY'), 1100, NULL, 20);

insert into emp values
        (7900, 'JAMES',  'CLERK',     7698,
        to_date('3-DEC-1981', 'DD-MON-YYYY'),   950, NULL, 30);

insert into emp values
        (7902, 'FORD',   'ANALYST',   7566,
        to_date('3-DEC-1981', 'DD-MON-YYYY'),  3000, NULL, 20);

insert into emp values
        (7934, 'MILLER', 'CLERK',     7782,
        to_date('23-JAN-1982', 'DD-MON-YYYY'), 1300, NULL, 10);

CREATE TABLE DEPT(
        DEPTNO  NUMBER(2),
        DNAME   VARCHAR2(14),
        LOC     VARCHAR2(13) 
);

insert into dept values (10, 'ACCOUNTING', 'NEW YORK');
insert into dept values (20, 'RESEARCH',   'DALLAS');
insert into dept values (30, 'SALES',      'CHICAGO');
insert into dept values (40, 'OPERATIONS', 'BOSTON');

CREATE TABLE BONUS
        (
        ENAME VARCHAR2(10)      ,
        JOB VARCHAR2(9)  ,
        SAL NUMBER,
        COMM NUMBER
);

CREATE TABLE SALGRADE
      ( GRADE NUMBER,
        LOSAL NUMBER,
        HISAL NUMBER 
);

INSERT INTO SALGRADE VALUES (1,700,1200);
INSERT INTO SALGRADE VALUES (2,1201,1400);
INSERT INTO SALGRADE VALUES (3,1401,2000);
INSERT INTO SALGRADE VALUES (4,2001,3000);
INSERT INTO SALGRADE VALUES (5,3001,9999);
COMMIT;

 

July 12, 2020

Massive Deletes

Filed under: 12c,Infrastructure,Oracle,Upgrades — Jonathan Lewis @ 7:36 pm BST Jul 12,2020

One of the recurrent questions on the Oracle Developer Community forum is:

What’s the best way to delete millions of rows from a table?

There are an enormous number of relevant details that you need to know before you can give the “right” answer to this question, e.g.

  • Which version of Oracle
  • Are you running Standard Edition or Enterprise Edition
  • Is “millions” a tiny percentage of the table or a large percentage
  • Are there any referential integrity constraints in place
  • Does the system have to keep running while the deletion takes place
  • Is the table compressed
  • How many indexes are there – and can you drop, or mark unusable, some of them
  • How much space do you have to do this job
  • How much time do you have to do this job

One of the most important ones, of course, is “Which version of Oracle?” because it can make an enormous difference to the range of possible strategies. I’m writing this particular note because the question came up a little while ago from a user who wanted to delete all the data from 2008 through to the end of 2018, keeping only the last 18 months of data.

That sounds like the volume of data to be deleted (11 years) is very much larger than the volume of data to be kept (1.5 years) – but we can’t be sure of that since businesses tend to grow over time so that last 18 months of data might actually be just as big as the previous 11 years.

As usually happens in response to this question there were suggestions to “create a new table selecting the data you want to keep”, “use dbms_parallel_execute to delete by rowid ranges in parallel”, and a relatively new one “convert to a partitioned table so that the data you want to keep is in its own partition and drop the other partition(s)”. 

I wrote a note a few years ago giving an example of converting a simple heap table to a partitioned table online while maintaining indexes (choosing between local and global) and eliminating the data you don’t want to keep, so there’s no need to waste resources copying redundant data.  So, after learning that the OP was running 12.2 Enterprise Edition with the Partitioning option, I suggested that (s)he convert the table into a hash partitioned table with a single partition – which should (for purposes of optimisation) behave just like a simple heap table – using the “including rows” clause to copy only the last 18 months of data.

I pointed out that their version of Oracle (EE + PO) gave them the second-best option – because I knew that in 19c you could simply do something like:

rem
rem     Script:         122_move.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem

create table t1
as
select  *
from    all_objects
where   rownum <= 50000 -- > comment to avoid wordpress format issue
;

alter table t1 add constraint t1_pk primary key(object_id);

alter table t1 move
        including rows where owner != 'SYS'
        online
;

It wasn’t until a little later that a tiny nagging doubt crept into my mind that this enhancement to the basic move command might have appeared at the same time as the modify partition enhancement – in other words in 12.2.0.1; so I ran the test above and found that it did actually seem to work. (And I haven’t yet found any bugs on MOS suggesting that it shouldn’t be used.)

Having discovered that the command was available I thought that I’d better check whether it was also documented, and found that it was in the 12.2 SQL Reference Manual (though not the 12.1 reference manual – for the obvious reason) under Alter Table. Page down to the “tram-tracks” for the Alter Table command and follow the link for the “move_table_clause”, and from there follow the link for “filter_condition”.

Note:

This option is not available on 12.1 and, if you run the test using that version, Oracle will raise error “ORA-25184: column name expected” at the point where the word “rows” appears. This may look somewhat counter-intuitive but, for a very long time, a command like “alter table TabX move including ColY online” is how you would rebuild an index organized table (IOT) with all columns up to ColY in the “IOT_TOP” segment.

Update [The following morning]

Once you’ve got the framework of a test in place it really doesn’t take very long to start running through “what if” cases or potential boundary conditions.  So this morning I added one very obvious test – what happens if you have referential integrity declared between two tables and try to move both of them including a subset of rows from each that ensures that the referential integrity is still in place:


rem
rem     Script:         122_move_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:
rem
rem     Last tested
rem             19.3.0.0        Parent can't move
rem             12.2.0.1        Parent can't move
rem

create table parent
as
select  *
from    all_objects
where   rownum <= 50000 -- > comment to avoid wordpress format issue
;

alter table parent add constraint par_pk primary key(object_id);

create table child
as
select  *
from    parent
order by
        dbms_random.value
;

alter table child add constraint chi_pk primary key(object_id);
alter table child add constraint chi_fk_par foreign key(object_id) references parent;

I’ve created the child table from the parent data, with random ordering. Now I’m going to delete all the child rows where owner = ‘PUBLIC’ using an online move, then I’ll try and do the same for the parent.


alter table child move
        including rows where owner != 'PUBLIC'
        online
;

-- Child move succeeds (of course)

alter table parent move
        including rows where owner != 'PUBLIC'
        online
;

--
-- Trying to do the matching move on the parent results in:
-- ORA-02266: unique/primary keys in table referenced by enabled foreign keys
--

So there’s a gap in the functionality that makes it less desirable than the simplest case suggests. The referential integrity constraint has to be disabled before the parent table can be restructured.

But something that merits a little further investigation is the option to set the foreign key to “disable validate” (which is sufficient to allow the parent move to take place) and then to set the constraint back to “enable”. When I tried this I had expected Oracle to do a lot of work to revalidate the constraint before enabling it, but I couldn’t find any indication that any such work had taken place.
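
For reference, here’s a minimal sketch of the sequence I mean, using the constraint name from the script above – treat it as the outline of the experiment rather than a recommendation:

alter table child modify constraint chi_fk_par disable validate;

alter table parent move
        including rows where owner != 'PUBLIC'
        online
;

alter table child modify constraint chi_fk_par enable;

-- I expected the final "enable" to revalidate the constraint (and do a
-- noticeable amount of work) but couldn't find any sign that it had.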

Update (Nov 2020 – following comment #3 below about IOTs)

It’s important to check the manuals before getting stuck into too many experiments. The “problem” with foreign key constraints is one of the specified restrictions, as given in the 19c Language Reference manual under Alter Table:

Restrictions on Filter Conditions

The following restrictions apply to the filter_condition clause:

    • Filter conditions are supported only for heap-organized tables.
    • Filter conditions can refer only to columns in the table being altered. Filter conditions cannot contain operations, such as joins or subqueries, that reference other database objects.
    • Filter conditions are unsupported for tables with primary or unique keys that are referenced by enabled foreign keys.

Update (Feb 2021)

Another example of experimentation and reading the manuals – there are restrictions on the online move command.  Among other things the online option is not available if the table has domain indexes (e.g. context indexes). The annoying thing about such restrictions, of course, is that it might not even occur to you that it’s something you would need to check the manuals for before you build the model. From the 19c SQL Language Reference manual for Alter Table again:

Restrictions on the ONLINE Clause

The ONLINE clause is subject to the following restrictions when moving table partitions:

    • You cannot specify the ONLINE clause for tables owned by SYS
    • You cannot specify the ONLINE clause for index-organized tables.
    • You cannot specify the ONLINE clause for heap-organized tables that contain object types or on which bitmap join indexes or domain indexes are defined.
    • Parallel DML and direct path INSERT operations require an exclusive lock on the table. Therefore, these operations are not supported concurrently with an ongoing online partition MOVE, due to conflicting locks.

There are a number of little features scattered through this section of the manual (e.g. drop column) that have their own restrictions on the use of the ONLINE option, so you may have to search for “online” if you want to get complete coverage.

 

February 4, 2020

maxquerylen

Filed under: Bugs,Infrastructure,Oracle,undo — Jonathan Lewis @ 3:50 pm GMT Feb 4,2020

The view v$undostat is a view holding summary information about undo activity that can be used by the automatic undo mechanism to deal with optimising the undo retention time (hence undo space allocation). The view holds one row for every ten minute interval in the last 4 days (96 hours) and includes two columns called maxquerylen and maxqueryid – which tell you something about the query that was considered to be the longest running query active in the interval.

In this note I want to explain why the contents of these two columns are sometimes (possibly often) completely irrelevant, despite there being a few notes on the internet about how you should investigate them to help you decide on a suitable setting for the undo_retention parameter.

The descriptions in the 19c reference manual for these columns are as follows:

  • maxquerylen – Identifies the length of the longest query (in seconds) executed in the instance during the period. You can use this statistic to estimate the proper setting of the UNDO_RETENTION initialization parameter. The length of a query is measured from the cursor open time to the last fetch/execute time of the cursor. Only the length of those cursors that have been fetched/executed during the period are reflected in the view.
  • maxqueryid – SQL identifier of the longest running SQL statement in the period

It would be hard to misunderstand the meaning of the second column: if the first column tells us that Oracle has spotted a “longest query” then the second column gives us the sql_id so we can check v$sql to find out what it was. But what sort of queries are going to show up as the longest query in the interval?

Here’s an example from a few hours of a nearly idle instance, querying the begin and end times (formatted to show just day of month + time) along with the two critical columns:


select
        begin_time, end_time, maxquerylen, maxqueryid
from
        v$undostat
order by
        begin_time
;

BEGIN_TIME	END_TIME    MAXQUERYLEN MAXQUERYID
--------------- ----------- ----------- -------------
...
04 10:50:18	04 11:00:18	      0
04 11:00:18	04 11:10:18	    356 f3yfg50ga0r8n
04 11:10:18	04 11:20:18	    883 25u1mbkcr9rnu
04 11:20:18	04 11:30:18	   1486 25u1mbkcr9rnu
04 11:30:18	04 11:40:18	   2090 25u1mbkcr9rnu
04 11:40:18	04 11:50:18	      0
04 11:50:18	04 12:00:18	   3299 25u1mbkcr9rnu
04 12:00:18	04 12:10:18	   3903 25u1mbkcr9rnu
04 12:10:18	04 12:20:18	   4507 25u1mbkcr9rnu
04 12:20:18	04 12:30:18	      0
04 12:30:18	04 12:40:18	      0
04 12:40:18	04 12:50:18	      0
04 12:50:18	04 13:00:18	      0
04 13:00:18	04 13:10:18	      0
04 13:10:18	04 13:20:18	      0
04 13:20:18	04 13:30:18	      0
04 13:30:18	04 13:37:27	   9035 25u1mbkcr9rnu

173 rows selected.

Notice, particularly, that the SQL_ID 25u1mbkcr9rnu disappears from the 11:40 interval, then re-appears at 11:50, then disappears again from 12:20 through 13:20 (lunchtime), then reappears again at 13:30. And when it reappears after an absence the query length has increased in a way that’s consistent with the gap. So it looks as if the query wasn’t running during the gap, but turns out to have been running after the gap ended. (Is this a Quantum query – does it only run if you’re looking at it?)

The explanation is in the detail of the definition: “from the cursor open time to the last fetch/execute time”. From an SQL*Plus session I “set pause on” then executed the query “select * from all_objects” and hit return a few times to get a few pages of output. Then, spread over the next couple of hours, I hit return a few more times to get a few more pages of output. Each time I hit return my session does another fetch call, and the code behind v$undostat notices that my query is still active.
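
In case you want to reproduce the effect, the SQL*Plus experiment was nothing more sophisticated than this (the pagesize is arbitrary):

set pause on
set pagesize 40

select * from all_objects;

-- hit return from time to time; each return is another fetch call,
-- so the cursor stays "active" as far as v$undostat is concerned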

I don’t know exactly how Oracle is keeping track of “active” statements because there seem to be some inconsistencies in the reporting (and I’ll comment on those in a moment) but, as a first approximation, until you close a cursor (either explicitly or implicitly) some piece of Oracle’s code registers the fact that the query might do further fetches, which means it might need to apply more undo to make current data read-consistent with the cursor’s start SCN, which means that it should take note of the time the cursor has been open because the undo retention time might need to be that long.

Inconsistencies.

I said there were some inconsistencies in the reporting. I’ve noticed three anomalies – perhaps due to the extreme idleness of the instance I was using for testing.

  1. At about 12:45 I hit return a few times to get the maxquerylen and maxqueryid refreshed – but the code didn’t seem to notice that I had done a few more fetches of a long running query. So it seems to be possible for Oracle to miss the query that should be reported.
  2. At about 11:52 I hit return a few times again, and you can see that the two critical columns were refreshed, but if you do the arithmetic Oracle has attributed 600 seconds to the query – the whole of the 10 minute refresh interval, not just the time up to the moment of the last fetch that I had done in that interval.
  3. I didn’t hit return again until about 12:45 (the moment in point 1 above when the two columns didn’t refresh the way I thought they should), but the columns kept updating through the 12:00 and 12:10 intervals anyway before disappearing from the 12:20 interval. So it looks like queries can be reported as long running even when they haven’t been executing or fetching.

There is one last anomaly – and this relates to the reason I started looking at these columns. If you check the entry for 11:00 you’ll see that the SQL Id ‘f3yfg50ga0r8n’ has been reported as “running” for 356 seconds. But here’s what I found in v$sql for that query:


select  executions, parse_calls, fetches, end_of_fetch_count, elapsed_time, cpu_time, rows_processed, sql_text
from    v$sql
where   sql_id = 'f3yfg50ga0r8n'
;

EXECUTIONS PARSE_CALLS	  FETCHES END_OF_FETCH_COUNT ELAPSED_TIME   CPU_TIME ROWS_PROCESSED SQL_TEXT
---------- ----------- ---------- ------------------ ------------ ---------- -------------- ------------------------------------------
        79          79         79                  0        20487      10667             79 select obj# from obj$ where dataobj# = :1

The SQL looks like a sys-recursive statement which, in 79 executions, has accumulated 20 milliseconds of elapsed time (rather than 356 seconds – but that difference could just be related to one or other of the anomalies I reported earlier). The key thing to note is that the value of column end_of_fetch_count is zero: this looks like a statement where Oracle has simply not bothered to fetch all the data and has then not bothered to close the cursor. As a result it’s possible that each time the statement is executed (note that parse_calls = executions, so we’re not looking at a “held” cursor) the code behind v$undostat looks back at the time the cursor was initially opened to measure the query length, rather than looking at the time the statement was re-executed.

This may go a long way to answering the question that came up on Oracle-l earlier on today as follows:

The following query (sql_id is 89w8y2pgn25yd) was recorded in v$undostat.maxqueryid in a 12.2. database during a period of a high undo usage: select ts# from sys.ts$ where ts$.online$ != 3 and bitand(flags,2048) != 2048;

select
        undoblks,txncount,maxquerylen,maxconcurrency,activeblks
from    v$undostat u
where   maxqueryid='89w8y2pgn25yd'
;

UNDOBLKS  TXNCOUNT  MAXQUERYLEN  MAXCONCURRENCY  ACTIVEBLKS
--------  --------  -----------  --------------  ----------
   39199      4027         1378               5     2531960

What is this query, and how does it manage to report a maximum query length of 1,378 seconds?

Just like the one above it’s a sys-recursive query; and this one appears when you query dba_tablespaces – and even though it executes once for every row it takes just fractions of a second to execute. But if you trace a query like “select tablespace_name from dba_tablespaces” you’ll see that every time the query is called the trace file will show lines for: “Parse, Exec, Fetch, Close” until the last call – which doesn’t report a “Close”.
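
If you want to see the pattern for yourself, a very simple approach (just the basic sql_trace switch – a 10046 event at a higher level would do equally well) is:

alter session set sql_trace = true;

select tablespace_name from dba_tablespaces;

alter session set sql_trace = false;

-- then check the raw trace file for the PARSE / EXEC / FETCH / CLOSE lines
-- relating to the recursive statement against ts$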

Just like my original “select * from all_objects” there’s a query dangling with an open cursor, leaving Oracle with the opportunity to go back to the moment it was opened and report it as a “long running query”.

tl;dr

The maxquerylen and maxqueryid in v$undostat don’t tell you about statements that have taken a long time to change a lot of data and generate a lot of undo; they tell you about statements that might need to apply a lot of undo to see read-consistent data and therefore might become victims of an “ORA-01555: snapshot too old” error.

The statements reported may be completely irrelevant because there are various reasons (some, possibly, bug-related) why a cursor may be treated as if it was opened a long time in the past when it is actually a re-use of an existing “dangling” cursor. It’s also possible that some bad client code will treat cursors in a way that does no harm to the client program but hides a more interesting query that would otherwise have been reported by these two columns.

Update Feb 2020

I don’t know why I didn’t think of searching MOS before poking around at this issue, but I’ve just done a search on “maxquerylen maxqueryid” and one of the first Knowledgebase articles to show up – dated September 2019 – was Doc ID 2005931.1: Wrongly calculated MAXQUERYLEN Causing ORA-01555 or ORA-30036 Errors.

The document reports several bugs all relating to the same issue, all closed as duplicates of unpublished bug 12314489: “OCIRELEASESTMT() TO PUT THE CURSORS IN CURBOUND STATE WHEN PLACING THEM IN CACHE”. This, in turn, is closed and replaced by bug 26833932, which is also unpublished and doesn’t even have a description.

Update Dec 2020

A question on the MOS Community Forums today was about SQL_ID f3yfg50ga0r8n appearing in v$active_session_history with an sql_exec_start (22:25) that was nearly 38 minutes before the sample_time (23:03), with no other samples appearing in ASH between those two times. This doesn’t necessarily mean anything special (it could be a query which is very efficient at the database but is only asked to return a few rows to the client every few seconds), but it’s the same query that I commented on in the section on Inconsistencies.
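
A minimal sketch of the sort of query you might use to check a case like this, using the sql_id from the forum question:

select  sample_time, sql_id, sql_exec_id, sql_exec_start
from    v$active_session_history
where   sql_id = 'f3yfg50ga0r8n'
order by
        sample_time
;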

This prompted me to do another search for the bug numbers I reported in February, and I found that something has been published fairly recently about bug 26833932 – search for document id 26833932.8 and you’ll find “Bug 26833932 OCIStmtRelease() To Put Cursors In CURBOUND State When Placing Them In Cache”. This is reported as fixed in 18.1, with various patches available for 12.1.0.2 and 12.2.0.1.

Update Jan 2021

Writing up some notes on undo handling I noticed that my 19.3 v$undostat was reporting a maxqueryid of g0t052az3rx44 that had spontaneously appeared with a maxquerylen of 42,000 seconds, but with no previous appearances. The text of the query was another sys-recursive statement:

select  name, intcol#, segcol#, type#, length, nvl(precision#,0),
        decode(type#, 2, nvl(scale,-127/*MAXSB1MINAL*/), 178, scale, 179, scale, 180, scale,
               181, scale, 182, scale, 183, scale, 231, scale, 0),
        null$, fixedstorage, nvl(deflength,0), default$, rowid, col#, property,
        nvl(charsetid,0), nvl(charsetform,0), spare1, spare2, nvl(spare3,0),
        nvl(evaledition#,1), nvl(unusablebefore#,0), nvl(unusablebeginning#,0),
        case when (type# in (1,8,9,96,112)) then nvl(collid, 16382) else 0 end case,
        nvl(collintcol#,0), nvl(acdrrescol#, 0), nvl(spare7, 0), nvl(spare9, 0), nvl(spare10, 0)
from    col$
where   obj# = :1
order by
        intcol#

So the problem hasn’t gone away yet.
