Oracle Scratchpad

August 16, 2016

Month End

Filed under: audit,CBO,Non-technical — Jonathan Lewis @ 1:04 pm BST Aug 16,2016

A question about parallel query and cardinality estimates appeared on OTN a little while ago that prompted me to write this note about helping the optimizer do the best job with the least effort.  (A critical point in the correct answer to the original question is that parallel query may lead to “unexpected” dynamic sampling, which can make a huge difference to the choice of execution plans, but that’s another matter.)

The initial cardinality error in the plan came from the following predicate on a “Date dimension” table:


      AR_DATE.CALENDAR_DT   = AR_DATE.MONTH_END_DT 
AND   AR_DATE.CALENDAR_DT  >= ADD_MONTHS(TRUNC(SYSDATE,'fmyy'), -36) 
AND   AR_DATE.MONTH_END_DT >= ADD_MONTHS(TRUNC(SYSDATE,'fmyy'), -36)

In the parallel plan the estimated number of rows on a full tablescan of the table was 742, while on the serial plan the same tablescan produced a cardinality of 1. You will appreciate that having an estimate of 1 (or less) that is nearly three orders of magnitude wrong is likely to lead to a very bad execution plan.

My first thought when I saw this was (based on a purely intuitive interpretation): “there’s one day every month that’s the last day of the month and we’re looking at roughly that last 36 months so we might predict a cardinality of about 36”. That’s still a long way off the 742 estimate and 1,044 actual for the parallel query, but it’s a warning flag that the serial estimate is probably an important error – it’s also an example of the very simple “sanity checking” mental exercises that can accompany almost any execution plan analysis.

My second thought (which happened to be wrong, and would only have been right some time well before version 10.2.0.5) was that the optimizer would treat the add_months() expressions as unknown values and assign a selectivity of 5% to each of the predicates, reducing the combined selectivity to 1/400th of the selectivity it gave to the first predicate. In fact the optimizer evaluates the expressions and would have used the normal (required range / total range) calculation for those two predicates.

It’s the first predicate that I want to examine, though – how does the optimizer calculate a selectivity for it ? Here’s some code to generate sample data to use for testing.


rem
rem     Script:         month_end.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table t1
nologging
as
select
        rownum                                                   id,
        to_date('01-Jan-2010','dd-mon-yyyy') + rownum - 1       calendar_date,
        add_months(
                trunc(to_date('01-Jan-2010','dd-mon-yyyy') + rownum - 1 ,'MM' ),
                1
        ) - 1                                                   month_end_date
from
        dual
connect by
        level <= trunc(sysdate) - to_date('01-jan_2010','dd-mon-yyyy') + 1
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 1')

This clunky bit of code gives me consecutive dates from 1st Jan 2010 up to “today” with the month_end_date column holding the month end date corresponding to the row’s calendar_date. So now we can check what the optimizer makes of the predciate calendar_date = month_end_date:


set autotrace on explain

select count(*) from t1 where calendar_date = month_end_date;

  COUNT(*)
----------
        79

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    16 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    16 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |    16 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CALENDAR_DATE"="MONTH_END_DATE")

Looking at operation 2 we can see that, in effect, the optimizer has considered two independent predicates “calendar_date = {unknown}” and “month_end_date = {unknown}” and taken the lower of the two selectivities – which means the cardinality estimate is 1 because the calendar_date column is unique across this table.

There are various ways to work around problems like this. One of the simplest would be to tell Oracle to sample this table with the (table-level) hint /*+ dynamic_sampling(t1 1) */; in fact, since this predicate is effectively treated as an examination of two predicates the (cursor-level) hint /*+ dynamic_sampling(4) */ would also cause sampling to take place – note that level 4 or higher is required to trigger sampling for “multiple” predicates on a single table. As a general guideline we always try to minimise the risk of side effects so if this problem were embedded in a much larger query I would prefer the table-level hint over the cursor-level hint.

There are other options, though, that would allow you to bypass sampling – provided you can modify the SQL. The script I used to create this table also included the following statement:


alter table t1 add (
        date_offset1 generated always as (calendar_date - month_end_date) virtual,
        date_flag generated always as (case when calendar_date - month_end_date = 0 then 'Y' end) virtual
);

In 12c I would declare these virtual columns to be invisible to avoid problems with any SQL that didn’t use explicit column lists. For demonstration purposes I’ve set up two options – I can find the rows I want with one of two obvious predicates:

    date_offset1 = 0
    date_flag = 'Y'

In fact there’s a third predicate I could use that doesn’t need to know about the virtual columns:

    calendar_date - month_end_date = 0

Unfortunately I can’t arbitrarily swap the order of the two dates in the last predicate, and the optimizer won’t spot that it is also equivalent to “calendar_date = month_end_date”. Here are a few execution plans – for which the only significant bit is the cardinality estimate of the full tablescans:


select count(*) from t1 where date_flag = 'Y';

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     2 |     4  (25)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     2 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    79 |   158 |     4  (25)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DATE_FLAG"='Y')



select count(*) from t1 where date_offset1 = 0;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    78 |   312 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DATE_OFFSET1"=0)



select count(*) from t1 where calendar_date - month_end_date = 0;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    78 |   312 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."DATE_OFFSET1"=0)

It’s interesting to note that the optimizer has transformed the last predicate into the equivalent virtual column expression to do the arithmetic. You might also note that the date_flag option is slightly more accurate, but that’s because it’s based on an expression which is null for the rows we don’t want while the date_offset1 column has a value for every row and a little bit of arithmetical rounding comes into play. You might also note that there’s a small cost difference – which I’d ascribe to the CPU cost that the optimizer has added for the CASE expression being applied on top of the simple date arithmetic.

Of course, whatever else you might play around with when working around a cardinality problem like this, I think the strategic aim for a data warehouse system would be to get a REAL flag column on the table and populate it at data loading time if month-end dates played an important part in the proceedings – though I have to say that the virtual flag column is one I do rather fancy.

 

June 7, 2011

Audit Excess

Filed under: audit,Bugs,Infrastructure,Oracle,trace files,Troubleshooting — Jonathan Lewis @ 6:18 pm BST Jun 7,2011

So you’ve decided you want to audit a particular table in your database, and think that Oracle’s built in audit command will do what you want. You discover two options that seem to be relevant:

audit all on t1 by access;
audit all on t1 by session;

To check the audit state of anything in your schema you can then run a simple query – with a few SQL*Plus formatting commands – to see something like the following:

(more…)

June 1, 2011

audsid

Filed under: audit,Oracle,Troubleshooting,Upgrades — Jonathan Lewis @ 5:42 pm BST Jun 1,2011

Here’s an example of how the passing of time can allow a problem to creep up on you.

A recent client had a pair of logon/logoff database triggers to capture some information that wasn’t available in the normal audit trail, and they had been using these triggers successfully for many years, but one day they realised that the amount of redo generated per hour had become rather large, and had actually been quite bad and getting worse over the last few months for no apparent reason. (You’ve heard this one before … “honest, guv, nothing has changed”).
(more…)

May 27, 2011

Audit Ouch!

Filed under: audit,Bugs,Infrastructure,Oracle,redo — Jonathan Lewis @ 5:37 pm BST May 27,2011

A few days ago I was rehearsing a presentation about how to investigate how Oracle works, and came across something surprising. Here’s a simple bit of code:
(more…)

May 25, 2011

audit

Filed under: audit,Bugs,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 6:41 pm BST May 25,2011

Here’s one of those funny little details that can cause confusion:

SQL> select * from user_audit_object;

no rows selected
 
SQL> audit select on indjoin by session whenever successful;

Audit succeeded.

SQL> select
  2     count(*)
  3  from
  4     indjoin         ij
  5  where
  6     id between 100 and 200
  7  and        val between 50 and 150
  8  ;

  COUNT(*)
----------
        51

1 row selected.

SQL> select * from user_audit_object where obj_name = 'INDJOIN';

no rows selected

(more…)

April 5, 2010

Failed Login

Filed under: audit,Infrastructure,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 7:59 pm BST Apr 5,2010

Here’s a piece of code I found recently running every half hour on a client site:

SQL_ID = 2trtpvb5jtr53
SELECT
        TO_CHAR(current_timestamp AT TIME ZONE :"SYS_B_0", :"SYS_B_1") AS curr_timestamp,
        COUNT(username) AS failed_count
FROM
        sys.dba_audit_session
WHERE
        returncode != :"SYS_B_2"
AND     TO_CHAR(timestamp, :"SYS_B_3") >= TO_CHAR(current_timestamp - TO_DSINTERVAL(:"SYS_B_4"), :"SYS_B_5")

(more…)

September 19, 2009

Audit

Filed under: audit,Infrastructure,Oracle — Jonathan Lewis @ 12:18 pm BST Sep 19,2009

Just a quick pointer to a comment about deleting from aud$ that came up recently on the OTN Database Forum.

I know the answer’s in the manuals, but sometimes you just don’t spot little details, or remember where you think you read them, so it’s nice to capture the comment when you can.  (And there’s a nice little follow-up from Mark Powell that extends the topic to fga_log$ – the “fine-grained audit” table)

Update Feb 2012

Here’s another OTN link to capture a reference on how to clean out the audit trail (aud$, fga_log$, or file system text of XML logs) “officially” and automatically in 11g using the dbms_audit_mgmt package. (The link currently points to Chapter 27 of the 11.2 PL/SQL Packages reference guide).

Blog at WordPress.com.