Oracle Scratchpad

May 24, 2018

Missing Audit

Filed under: audit,Oracle — Jonathan Lewis @ 10:27 am BST May 24,2018

Here’s a detail I discovered a long time ago – and rediscovered very recently: it’s possible to delete data from a table which is subject to audit without the delete being audited. I think the circumstances where it would matter are a little peculiar, and I’ve rarely seen systems that use the basic Oracle audit feature anyway, but this may solve a puzzle for someone, somewhere, someday.

The anomaly appears if you create a referential integrity constraint as “on delete cascade”. A delete from the parent table will automatically (pre-)delete matching rows from the child table but the delete on the child table will not be audited. Here’s a demonstration script – note that you will need to have set the parameter audit_trail to ‘DB’ to prove the point.


rem
rem     Script:         del_cascade_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2004
rem
rem     Last tested
rem             12.1.0.2
rem             11.2.0.4
rem

drop table t2 purge;
drop table t1 purge;

create table t1 (
        id              number(6),
        v1              varchar2(10),
        padding         varchar2(100),
        constraint t1_pk 
                primary key (id)
);


create table t2 (
        id_par          number(6),
        id_ch           number(6),
        v1              varchar2(10),
        padding         varchar2(100),
        constraint t2_pk 
                primary key (id_par,id_ch),
        constraint t2_fk_t1 
                foreign key (id_par) references t1 
                on delete cascade
);


insert into t1
select
        rownum,
        rownum,
        rpad('x',100)
from
        all_objects
where
        rownum <= 100 -- > comment to avoid wordpress format issue
;


insert into t2
select
        1+trunc((rownum-1)/5),
        rownum,
        rownum,
        rpad('x',100)
from
        all_objects
where
        rownum <= 500 -- > comment to avoid wordpress format issue
;

commit;

prompt  =================================
prompt  Parent/Child rowcounts for id = 1
prompt  =================================

select count(*) from t1 where id = 1;
select count(*) from t2 where id_par = 1;

column now new_value m_now
select to_char(sysdate,'dd-mon-yyyy hh24:mi:ss') now from dual;

audit delete on t2 by access; 
audit delete on t1 by access; 

prompt  =======================================================
prompt  If you allow the cascade (keep the t2 delete commented)
prompt  then the cascade deletion is not audited.
prompt  =======================================================

-- delete from t2 where id_par = 1;
delete from t1 where id = 1;

noaudit delete on t1; 
noaudit delete on t2; 

column obj_name format a32

select  action_name, obj_name 
from    user_audit_trail
where   timestamp >= to_date('&m_now','dd-mon-yyyy hh24:mi:ss')
;

The script has an optional delete from the child table (t2) just before the delete from the parent table (t1). When you run the script you should see that before the delete t1 reports one row while t2 reports 5 rows. After the delete(s) both tables will report zero rows.

If you leave the t2 delete commented out then the delete from t2 will have been the recursive delete due to the cascade and the number of rows returned from user_audit_trail will be one (the delete from t1). If you allow the explicit delete from t2 to take place then user_audit_trail will report two rows, one each for t1 and t2.

Sample output (with a little cosmetic editing) – when the delete from t2 is commented out:

=================================
Parent/Child rowcounts for id = 1
=================================

  COUNT(*)
----------
         1


  COUNT(*)
----------
         5

Audit succeeded.
Audit succeeded.

=======================================================
If you allow the cascade (keep the t2 delete commented)
then the cascade deletion is not audited.
=======================================================

1 row deleted.


Noaudit succeeded.
Noaudit succeeded.


  COUNT(*)
----------
         0


  COUNT(*)
----------
         0


ACTION_NAME                  OBJ_NAME
---------------------------- --------------------------------
DELETE                       T1

1 row selected.

As you can see, I’ve deleted just one row from one table (the t1 delete), but the final query against t2 shows that the child rows have also been deleted, but the only audit record reported is the one for the parent – despite the fact that if you enable sql_trace before the delete from t1 you will find the actual recursive statement ‘ delete from “TEST_USER”.”T2″ where “ID_PAR” = :1’ in the trace file.

The “recursive” nature of the delete in the trace file might be a bit of a clue – it is reported as being operated by SYS (lid = 0), not by the real end-user, and the parsing_user_id and parsing_schema_id in v$sql are both SYS (i.e. 0). Checking dba_audit_trail and the audit_file_dest for audited SYS operations, though, there was still no sign of any audit record for the child delete.

 

August 16, 2016

Month End

Filed under: audit,CBO,Non-technical — Jonathan Lewis @ 1:04 pm BST Aug 16,2016

A question about parallel query and cardinality estimates appeared on OTN a little while ago that prompted me to write this note about helping the optimizer do the best job with the least effort.  (A critical point in the correct answer to the original question is that parallel query may lead to “unexpected” dynamic sampling, which can make a huge difference to the choice of execution plans, but that’s another matter.)

The initial cardinality error in the plan came from the following predicate on a “Date dimension” table:


      AR_DATE.CALENDAR_DT   = AR_DATE.MONTH_END_DT 
AND   AR_DATE.CALENDAR_DT  >= ADD_MONTHS(TRUNC(SYSDATE,'fmyy'), -36) 
AND   AR_DATE.MONTH_END_DT >= ADD_MONTHS(TRUNC(SYSDATE,'fmyy'), -36)

In the parallel plan the estimated number of rows on a full tablescan of the table was 742, while on the serial plan the same tablescan produced a cardinality of 1. You will appreciate that having an estimate of 1 (or less) that is nearly three orders of magnitude wrong is likely to lead to a very bad execution plan.

My first thought when I saw this was (based on a purely intuitive interpretation): “there’s one day every month that’s the last day of the month and we’re looking at roughly that last 36 months so we might predict a cardinality of about 36”. That’s still a long way off the 742 estimate and 1,044 actual for the parallel query, but it’s a warning flag that the serial estimate is probably an important error – it’s also an example of the very simple “sanity checking” mental exercises that can accompany almost any execution plan analysis.

My second thought (which happened to be wrong, and would only have been right some time well before version 10.2.0.5) was that the optimizer would treat the add_months() expressions as unknown values and assign a selectivity of 5% to each of the predicates, reducing the combined selectivity to 1/400th of the selectivity it gave to the first predicate. In fact the optimizer evaluates the expressions and would have used the normal (required range / total range) calculation for those two predicates.

It’s the first predicate that I want to examine, though – how does the optimizer calculate a selectivity for it ? Here’s some code to generate sample data to use for testing.


rem
rem     Script:         month_end.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table t1
nologging
as
select
        rownum                                                   id,
        to_date('01-Jan-2010','dd-mon-yyyy') + rownum - 1       calendar_date,
        add_months(
                trunc(to_date('01-Jan-2010','dd-mon-yyyy') + rownum - 1 ,'MM' ),
                1
        ) - 1                                                   month_end_date
from
        dual
connect by
        level <= trunc(sysdate) - to_date('01-jan_2010','dd-mon-yyyy') + 1
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1')

This clunky bit of code gives me consecutive dates from 1st Jan 2010 up to “today” with the month_end_date column holding the month end date corresponding to the row’s calendar_date. So now we can check what the optimizer makes of the predciate calendar_date = month_end_date:


set autotrace on explain

select count(*) from t1 where calendar_date = month_end_date;

  COUNT(*)
----------
        79

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    16 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    16 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |    16 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CALENDAR_DATE"="MONTH_END_DATE")

Looking at operation 2 we can see that, in effect, the optimizer has considered two independent predicates “calendar_date = {unknown}” and “month_end_date = {unknown}” and taken the lower of the two selectivities – which means the cardinality estimate is 1 because the calendar_date column is unique across this table.

There are various ways to work around problems like this. One of the simplest would be to tell Oracle to sample this table with the (table-level) hint /*+ dynamic_sampling(t1 1) */; in fact, since this predicate is effectively treated as an examination of two predicates the (cursor-level) hint /*+ dynamic_sampling(4) */ would also cause sampling to take place – note that level 4 or higher is required to trigger sampling for “multiple” predicates on a single table. As a general guideline we always try to minimise the risk of side effects so if this problem were embedded in a much larger query I would prefer the table-level hint over the cursor-level hint.

There are other options, though, that would allow you to bypass sampling – provided you can modify the SQL. The script I used to create this table also included the following statement:


alter table t1 add (
        date_offset1 generated always as (calendar_date - month_end_date) virtual,
        date_flag generated always as (case when calendar_date - month_end_date = 0 then 'Y' end) virtual
);

In 12c I would declare these virtual columns to be invisible to avoid problems with any SQL that didn’t use explicit column lists. For demonstration purposes I’ve set up two options – I can find the rows I want with one of two obvious predicates:

    date_offset1 = 0
    date_flag = 'Y'

In fact there’s a third predicate I could use that doesn’t need to know about the virtual columns:

    calendar_date - month_end_date = 0

Unfortunately I can’t arbitrarily swap the order of the two dates in the last predicate, and the optimizer won’t spot that it is also equivalent to “calendar_date = month_end_date”. Here are a few execution plans – for which the only significant bit is the cardinality estimate of the full tablescans:


select count(*) from t1 where date_flag = 'Y';

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     2 |     4  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     2 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    79 |   158 |     4  (25)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DATE_FLAG"='Y')



select count(*) from t1 where date_offset1 = 0;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    78 |   312 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DATE_OFFSET1"=0)



select count(*) from t1 where calendar_date - month_end_date = 0;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    78 |   312 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."DATE_OFFSET1"=0)

It’s interesting to note that the optimizer has transformed the last predicate into the equivalent virtual column expression to do the arithmetic. You might also note that the date_flag option is slightly more accurate, but that’s because it’s based on an expression which is null for the rows we don’t want while the date_offset1 column has a value for every row and a little bit of arithmetical rounding comes into play. You might also note that there’s a small cost difference – which I’d ascribe to the CPU cost that the optimizer has added for the CASE expression being applied on top of the simple date arithmetic.

Of course, whatever else you might play around with when working around a cardinality problem like this, I think the strategic aim for a data warehouse system would be to get a REAL flag column on the table and populate it at data loading time if month-end dates played an important part in the proceedings – though I have to say that the virtual flag column is one I do rather fancy.

 

June 7, 2011

Audit Excess

Filed under: audit,Bugs,Infrastructure,Oracle,trace files,Troubleshooting — Jonathan Lewis @ 6:18 pm BST Jun 7,2011

So you’ve decided you want to audit a particular table in your database and think that Oracle’s built in audit command will do what you want. You discover two options that seem to be relevant:

audit all on t1 by access;
audit all on t1 by session;

To check the audit state of any object in your schema you then run a simple query – with a few SQL*Plus formatting commands – to see something like the following:

(more…)

June 1, 2011

audsid

Filed under: audit,Oracle,Troubleshooting,Upgrades — Jonathan Lewis @ 5:42 pm BST Jun 1,2011

Here’s an example of how the passing of time can allow a problem to creep up on you.

A recent client had a pair of logon/logoff database triggers to capture some information that wasn’t available in the normal audit trail, and they had been using these triggers successfully for many years, but one day they realised that the amount of redo generated per hour had become rather large, and had actually been quite bad and getting worse over the last few months for no apparent reason. (You’ve heard this one before … “honest, guv, nothing has changed”).
(more…)

May 27, 2011

Audit Ouch!

Filed under: audit,Bugs,Infrastructure,Oracle,redo — Jonathan Lewis @ 5:37 pm BST May 27,2011

A few days ago I was rehearsing a presentation about how to investigate how Oracle works, and came across something surprising. Here’s a simple bit of code:
(more…)

May 25, 2011

audit

Filed under: audit,Bugs,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 6:41 pm BST May 25,2011

Here’s one of those funny little details that can cause confusion:

SQL> select * from user_audit_object;

no rows selected

SQL> audit select on indjoin by session whenever successful;

Audit succeeded.

SQL> select
  2     count(*)
  3  from
  4     indjoin         ij
  5  where
  6     id between 100 and 200
  7  and        val between 50 and 150
  8  ;

  COUNT(*)
----------
        51

1 row selected.

SQL> select * from user_audit_object where obj_name = 'INDJOIN';

no rows selected

(more…)

April 5, 2010

Failed Login

Filed under: audit,Infrastructure,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 7:59 pm BST Apr 5,2010

Here’s a piece of code I found recently running every half hour on a client site:

SQL_ID = 2trtpvb5jtr53

SELECT
        TO_CHAR(current_timestamp AT TIME ZONE :"SYS_B_0", :"SYS_B_1") AS curr_timestamp,
        COUNT(username) AS failed_count
FROM
        sys.dba_audit_session
WHERE
        returncode != :"SYS_B_2"
AND     TO_CHAR(timestamp, :"SYS_B_3") >= TO_CHAR(current_timestamp - TO_DSINTERVAL(:"SYS_B_4"), :"SYS_B_5")

(more…)

September 19, 2009

Audit

Filed under: audit,Infrastructure,Oracle — Jonathan Lewis @ 12:18 pm BST Sep 19,2009

Just a quick pointer to a comment about deleting from aud$ that came up recently on the OTN Database Forum.

I know the answer’s in the manuals, but sometimes you just don’t spot little details, or remember where you think you read them, so it’s nice to capture the comment when you can.  (And there’s a nice little follow-up from Mark Powell that extends the topic to fga_log$ – the “fine-grained audit” table)

Update Feb 2012

Here’s another OTN link to capture a reference on how to clean out the audit trail (aud$, fga_log$, or file system text of XML logs) “officially” and automatically in 11g using the dbms_audit_mgmt package. (The link currently points to Chapter 27 of the 11.2 PL/SQL Packages reference guide).

September 9, 2007

Set Role

Filed under: audit,Oracle,Troubleshooting — Jonathan Lewis @ 7:24 pm BST Sep 9,2007

I’ve been meaning to post a couple of comments about the set role command for several months – and a few days ago someone came on to the OTN Database Forum with the exact problem that I had been meaning to write about (except that it turned out that there was an even more important problem [thread subsequently deleted, unfortunately, so I’ve removed the URL] causing the issue).

(more…)

Website Powered by WordPress.com.