Oracle Scratchpad

October 17, 2014


Filed under: Oracle,Statistics,System Stats,Troubleshooting — Jonathan Lewis @ 1:22 pm BST Oct 17,2014

Here’s a question that appeared in my email a few days ago:

Based on the formula: “sreadtim = ioseektim + db_block_size/iotrfrspeed”, sreadtim should always bigger than ioseektim.

But I just did a query on my system, find it otherwise, get confused:


SNAME                          PNAME                               PVAL1 PVAL2
------------------------------ ------------------------------ ---------- --------------------
SYSSTATS_INFO                  STATUS                                    COMPLETED
SYSSTATS_INFO                  DSTART                                    10-08-2014 10:45
SYSSTATS_INFO                  DSTOP                                     10-10-2014 10:42
SYSSTATS_INFO                  FLAGS                                   1
SYSSTATS_MAIN                  CPUSPEEDNW                     680.062427
SYSSTATS_MAIN                  IOSEEKTIM                              10
SYSSTATS_MAIN                  IOTFRSPEED                           4096
SYSSTATS_MAIN                  SREADTIM                            4.716
SYSSTATS_MAIN                  MREADTIM                            2.055
SYSSTATS_MAIN                  CPUSPEED                             1077
SYSSTATS_MAIN                  MBRC                                    4
SYSSTATS_MAIN                  MAXTHR                          956634112
SYSSTATS_MAIN                  SLAVETHR                           252928

How do we explain this ?

This question highlights two points – one important, the other only slightly less so.

The really important point is one of interpretation.  Broadly speaking we could reasonably say that the (typical) time required to perform a single block read is made up of the (typical) seek time plus the transfer time which, using the names of the statistics above, would indeed give us the relationship: sreadtim = ioseektim + db_block_size/iotfrspeed; but we have to remember that we are thinking of a simplified model of the world. The values that we capture for sreadtim include the time it takes for a request to get from Oracle to the O/S, through the various network software and hardware layers and back again; the formula ignores those components completely and, moreover, doesn’t allow for the fact that some “reads” could actually come from one of several caches between Oracle and the disc without any physical disc access actually taking place. Similarly we should be aware that the time for an actual I/O seek would vary dramatically with the current position  of the read head, the radial position of the target block, the speed and current direction of movement of the read head, and the rotational distance to the target block. The formula is not attempting to express a physical law, it is simply expressing an approximation that we might use in a first line estimate of performance.

In fact we can see in the figures above that multi-block reads (typically of 4 blocks)  were faster than single block reads on this hardware for the duration of the sampling period – and that clearly doesn’t fit with the simple view embedded in our formula of how disc drives work.  (It’s a fairly typical effect of SANs, of course, that large read requests make the SAN software start doing predictive read-ahead, so that the next read request from Oracle may find that the SAN has already loaded the required data into its cache.)

There is, however, the second point that these figures highlight – but you have to be in the know to spot the detail: whatever the complexities introduced by SAN caching, we’re not comparing the right numbers. The ioseektim and iotfrspeed shown here are the default values used by Oracle. It looks as if the user has called dbms_stats.gather_system_stats() with a 48 hour workload (dstart = 8th Oct, dstop = 10th Oct) but hasn’t yet executed the procedure using the ‘noworkload’ option. Perhaps the ioseektim and iotfrspeed figures from a noworkload call would look a little more reasonable when compared with the 4.716 milliseconds of the gathered sreadtim.

There may still be a large gap between the model and the reality, but until the two sets of figures we’re using come from the same place we shouldn’t be comparing them.


September 12, 2014

Unusual Deadlock

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 1:08 pm BST Sep 12,2014

Prompted by a question on OTN I came up with a strategy for producing an ORA-00060 deadlock that DIDN’T produce a deadlock graph (because there isn’t one) and didn’t get reported in the alert log (at least, not when tested on It’s a situation that shouldn’t arise in a production system because it’s doing the sorts of things that you shouldn’t do in a production system: but possibly if you’re trying to do some maintenance or upgrades while keeping the system live it could happen. Here’s the starting code:

drop procedure p2;
drop procedure p1;

drop table t1 purge;
create table t1 (n1 number);

insert into t1 values(1);
insert into t1 values(3);

create or replace procedure p1
        update t1 set n1 = 2 where n1 = 1;
        update t1 set n1 = 4 where n1 = 3;

create or replace procedure p2

        procedure q
                pragma autonomous_transaction;
                execute immediate 'drop procedure p1';


        update t1 set n1 = 4 where n1 = 3;


Of course you’re asking for trouble if you start doing DDL as part of your production code; and you’re asking for trouble if you start playing around with autonomous transaction; and if you use one to do the other you’re almost guaranteed to hit a problem eventually. All it takes in this case is a simple sequence of actions followed by a short wait (ca. 10 seconds):

Session 1: execute p1
Session 2: wait a couple of seconds, then execute p2

I got the following result from session 2:


ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "TEST_USER.P2", line 8
ORA-06512: at "TEST_USER.P2", line 14
ORA-06512: at line 1

While this doesn’t show up in the alert log, I do get a trace file dumped for the session; in fact I got a trace file from both processes. The trace from the process that reported the deadlock started like this:


  Performing diagnostic dump and signaling ORA-00060

  Complete deadlock information is located in the trace file of process (pid: 8, osid: 17861, DIA0)
    trace file: /u01/app/oracle/diag/rdbms/test/test/trace/test_dia0_17861.trc


*** 2014-09-09 12:44:13.427

    dump requested by process (pid: 8, osid: 17861, DIA0)
    trace file: /u01/app/oracle/diag/rdbms/test/test/trace/test_dia0_17861.trc

The other process didn’t mention a deadlock, but started at the “HUNG PROCESS DIAGNOSTIC” line – the trace file had been triggered by the DIAG process. Of course, the DIAG process also dumped a trace file, summarising the situation; and this started with the following important note:

*** 2014-09-09 12:44:13.426

DEADLOCK DETECTED (id=0xf678cfe3)

Chain Signature: 'enq: TX - row lock contention'<='library cache pin' (cycle)
Chain Signature Hash: 0x39f9868d

The rest of the trace file told me what the other two processes had been doing when the ORA-00060 was initiated, but the point I want to pick up here is that we have a very brief summary in the “Chain Signature” that tells us we’ve had a collision between some DML (enq: TX) and some sort of cursor-like problem (library cache pin) and not a simple data cross-over.

If you’ve looked at the original OTN posting you’ll see that the Chain Signature in that case is “Chain Signature: ‘library cache lock'<=’library cache pin’ (cycle)”, which indicates a collision restricted entirely to the library cache (the lock suggests someone is using a package/cursor while the pin suggests that another session is trying to destroy/(re)compile it – and they’re each trying to do it to each other’s package ! (In the past when I’ve a deadlock of this type it’s been reported as ORA-04020 rather than ORA-00060.) I still have to work out exactly how the OP got into their deadlock (especially in view of their latest comment) -but since I don’t have the right version of Oracle to hand, and it might be a version-specific bug, I don’t think I’m going to try.


September 8, 2014

ASSM Truncate.

Filed under: ASSM,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 11:34 am BST Sep 8,2014

Here’s one that started off with a tweet from Kevin Closson, heading towards a finish that shows some interesting effects when you truncate large objects that are using ASSM. To demonstrate the problem I’ve set up a tablespace using system allocation of extents and automatic segment space management (ASSM).  It’s the ASSM that causes the problem, but it requires a mixture of circumstances to create a little surprise.

	tablespace test_8k_auto_assm
	datafile	-- OMF
	SIZE 1030M
	autoextend off
	blocksize 8k
	extent management local
	segment space management auto

create table t1 (v1 varchar2(100)) pctfree 99 tablespace test_8k_auto_assm storage(initial 1G);

insert into t1 select user from dual;

alter system flush buffer_cache;

truncate table t1;

I’ve created a table with an initial definition of 1GB, which means that (in a clean tablespace) the autoallocate option will jump straight to extents of 64MB, with 256 table blocks mapped per bitmap block for a total of 32 bitmap blocks in each 64MB extent. Since I’m running this on and haven’t included “segment creation immediate” in the definition I won’t actually see any extents until I insert the first row.

So here’s the big question – when I truncate this table (using the given command) how much work will Oracle have to do ?

Exchanging notes over twitter (140 char at a time) and working from a model of the initial state, it took a little time to get to understand what was (probably) happening and then produce this silly example – but here’s the output from a snapshot of v$session_event for the session across the truncate:

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
local write wait                                    490           0          83.26        .170          13
enq: RO - fast object reuse                           2           0         104.90      52.451         105
db file sequential read                              47           0           0.05        .001           0
db file parallel read                                 8           0           0.90        .112           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.67        .067         153
events in waitclass Other                             2           0           0.04        .018         109

The statistic I want to highlight is the number recorded against “local write wait”: truncating a table of one row we wait for 490 blocks to be written! We also have 8 “db file parallel read”  waits which, according to a 10046 trace file, were reading hundreds of blocks. (I think the most significant time in this test – the RO enqueue wait – may have been waiting for the database writer to complete the work needed for an object checkpoint, but I’m not sure of that.)

The blocks written were the space management bitmap blocks for the extent(s) that remained after the truncate – even the ones that referenced extents above the high water mark for the table. Since we had set the tables initial storage to 1GB, we had a lot of bitmap blocks. At 32 per extent and 16 extents (64MB * 16 = 1GB) we might actually expect something closer to 512 blocks, but actually Oracle had formatted the last extent with only 8 space management blocks. and the first extent had an extra 2 to cater for the level 2 bitmap lock and segment header block giving: 32 * 15 + 8 + 2 = 490.

As you may have seen above, the impact on the test that Kevin was doing was quite dramatic – he had set the initial storage to 128GB (lots of bitmap blocks), partitioned the table (more bitmap blocks) and was running RAC (so the reads were running into waits for global cache grants).

I had assumed that this type of behaviour happened only with the “reuse storage” option of the truncate command: and I hadn’t noticed before that it also appeared even if you didn’t reuse storage – but that’s probably because the effect applies only to the bit you keep, which may typically mean a relatively small first extent. It’s possible, then, that in most cases this is an effect that isn’t going to be particularly visible in production systems – but if it is, can you work around it ? Fortunately another tweeter asked the question “What happens if you ‘drop all storage?'” Here’s the result from adding that clause to my test case:

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
enq: RO - fast object reuse                           1           0           0.08        .079           0
log file sync                                         1           0           0.03        .031           0
db file sequential read                              51           0           0.06        .001           0
SQL*Net message to client                            10           0           0.00        .000           0
SQL*Net message from client                          10           0           0.56        .056         123
events in waitclass Other                             3           0           0.87        .289         186

Looking good – if you don’t keep any extents you don’t need to make sure that their bitmaps are clean. (The “db file sequential read” waits are almost all about the data dictionary, following on from my “flush buffer cache”).

Footnote 1: the same effect appears in
Footnote 2: it’s interesting to note that the RO enqueue wait time seems to parallel the local write wait time: perhaps a hint that there’s some double counting going on. (To be investigated, one day).

August 21, 2014

Quiz night

Filed under: CBO,Indexing,NULL,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 6:05 pm BST Aug 21,2014

Here’s a script to create a table, with index, and collect stats on it. Once I’ve collected stats I’ve checked the execution plan to discover that a hint has been ignored (for a well-known reason):

create table t2
        mod(rownum,200)         n1,
        mod(rownum,200)         n2,
        rpad(rownum,180)        v1
        rownum <= 3000

create index t2_i1 on t2(n1);

                method_opt => 'for all columns size 1'

explain plan for
select  /*+ index(t2) */
from    t2
where   n2 = 45

select * from table(dbms_xplan.display);

| Id  | Operation         | Name | Rows  | Bytes | Cost  |
|   0 | SELECT STATEMENT  |      |    15 |   120 |    15 |
|*  1 |  TABLE ACCESS FULL| T2   |    15 |   120 |    15 |

Predicate Information (identified by operation id):
   1 - filter("N2"=45)

Of course we don’t expect the optimizer to use the index because we didn’t declare n1 to be not null, so there may be rows in the table which do not appear in the index. The only option the optimizer has for getting the right answer is to use a full tablescan. So the question is this – how come Oracle will obey the hint in the following SQL statement:

explain plan for
                leading (t2 t1)
                index(t2) index(t1)
        t2.n1, t1.n2
        t2      t2,
        t2      t1
        t2.n2 = 45
and     t2.n1 = t1.n1

select * from table(dbms_xplan.display);

| Id  | Operation                             | Name  | Rows  | Bytes | Cost  |
|   0 | SELECT STATEMENT                      |       |   225 |  3600 |  3248 |
|   1 |  NESTED LOOPS                         |       |   225 |  3600 |  3248 |
|   2 |   NESTED LOOPS                        |       |   225 |  3600 |  3248 |
|*  3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T2    |    15 |   120 |  3008 |
|   4 |     INDEX FULL SCAN                   | T2_I1 |  3000 |       |     8 |
|*  5 |    INDEX RANGE SCAN                   | T2_I1 |    15 |       |     1 |
|   6 |   TABLE ACCESS BY INDEX ROWID         | T2    |    15 |   120 |    16 |

Predicate Information (identified by operation id):
   3 - filter("T2"."N2"=45)
   5 - access("T2"."N1"="T1"."N1")

I ran this on, but it does the same on earlier versions.


This was clearly too easy – posted at 18:04, answered correctly at 18:21. At some point in it’s evolution the optimizer acquired a rule that allowed it to infer unwritten “is not null” predicates from the join predicate.




July 13, 2014

Deferrable RI – 2

Filed under: Infrastructure,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 7:46 pm BST Jul 13,2014

A question came up on Oracle-L recently about possible locking anomalies with deferrable referential integrity constraints.

An update by primary key is taking a long time; the update sets several columns, one of which is the child end of a referential integrity constraint. A check on v$active_session_history shows lots of waits for “enq: TX – row lock contention” in mode 4 (share), and many of these waits also identify the current object as the index that has been created to avoid the “foreign key locking” problem on this constraint (though many of the waits show the current_obj# as -1). A possible key feature of the issue is that foreign key constraint is defined as “deferrable initially deferred”. The question is, could such a constraint result in TX/4 waits.

My initial thought was that if the constraint was deferrable it was unlikely, there would have to be other features coming into play.

Of course, when the foreign key is NOT deferrable it’s easy to set up cases where a TX/4 appears: for example you insert a new parent value without issuing a commit then I insert a new matching child, at that point my session will wait for your session to commit or rollback. If you commit my insert succeeds if you rollback my session raises an error (ORA-02291: integrity constraint (schema_name.constraint_name) violated – parent key not found). But if the foreign key is deferred the non-existence (or potential existence, or not) of the parent should matter.  If the constraint is deferrable, though, the first guess would be that you could get away with things like this so long as you fixed up the data in time for the commit.

I was wrong. Here’s a little example:

create table parent (
	id	number(4),
	name	varchar2(10),
	constraint par_pk primary key (id)

create table child(
	id_p	number(4)
		constraint chi_fk_par
		references parent
		deferrable initially deferred,
	id	number(4),
	name	varchar2(10),
	constraint chi_pk primary key (id_p, id)

insert into parent values (1,'Smith');
insert into parent values (2,'Jones');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

insert into child values(2,1,'Jack');
insert into child values(2,2,'Jill');



pause Press return

update child set id_p = 3 where id_p = 2 and id = 2;

If you don’t do anything after the pause and before the insert then the update will succeed – but fail on a subsequent commit unless you insert parent 3 before committing. But if you take advantage of the pause to use another session to insert parent 3 first, the update will then hang waiting for the parent insert to commit or rollback – and what happens next may surprise you. Basically the deferrability doesn’t protect you from the side effects of conflicting transactions.

The variations on what can happen next (insert the parent elsewhere, commit or rollback) are interesting and left as an exercise.

I was slightly surprised to find that I had had a conversation about this sort of thing some time ago, triggered by a comment to an earlier post. If you want to read a more thorough investigation of the things that can happen and how deferrable RI works then there’s a good article at this URL.


May 23, 2014

10053 trace

Filed under: CBO,Oracle,Troubleshooting — Jonathan Lewis @ 1:37 pm BST May 23,2014

I published a note yesterday about enabling SQL trace system-wide for a single statement – and got a response on twitter from Bertrand Drouvot referencing a blog post he’d done a few months ago about using a similar method to dump the optimizer trace (10053) for a statement whenever it was optimized. He also highlighted the dbms_sqldiag package with its dump_trace() procedure – which is something I’d wanted to use a couple of weeks ago but couldn’t remember the package name – and supplied a reference to MoS note 225598.1 which then led on to Greg Rahn’s blog note about dbms_sqldiag.

The package is so important that I decided I’d create a reference note on my blog about it (I finally found my own reference notes on my laptop after getting a clue from Bertrand’s blog about the filename); but, at the same time, I think I have to report that it might not have helped me when I was looking for it two weeks ago. Here’s a critical paragraph from the MoS note:

How to Obtain Tracing of Optimizer Computations (EVENT 10053) (Doc ID 225598.1)

NOTE: The parse environment uses information captured in V$SQL_OPTIMIZER_ENV which does not record all information about the environment that parsed the query such as NLS settings. The result of this is that trace extracted from this may not always generate a trace that is truly representative of what happens when parsed from an application client. For example, if you parse from a client with NLS_SORT set differently to the Database then the application plan may be different to the database and so a trace generated from V$SQL_OPTIMIZER_ENV may cause confusion when the plan for a given cursor in the application is different to the one extracted.

The particular task I wanted to do was to re-optimize a query that had been run by another user (I was logged on as SYS at the time) because the critical user would have added a security predicate based on their current context to the table definitions in the query – and I didn’t know what that predicate would look like, but I knew it would have affected the execution path.  Judging from the note, though, it seems likely that the call to the package wouldn’t have re-acquired the user’s context and might, therefore, not be able to regenerate the necessary predicates. On the other hand, the actual security predicates used during the original optimisation could still be in-memory (in v$vpd_policy) so maybe dump_trace() would be able to do something about them.

Status: to be tested when I next have a few minutes free.


May 22, 2014


Filed under: Oracle,trace files,Troubleshooting — Jonathan Lewis @ 1:24 pm BST May 22,2014

Here’s a convenient aid to trouble-shooting that appeared in 11g with its enhancements to setting events. It’s a feature that may help you to work out (among other things) why a given statement seems to have a highly variable performance profile. If you can find the SQL_ID for a parent cursor you can enable tracing for just that cursor whenever it executes, whoever executes it.

--	Last tested:

define m_sql_id = '&1'
define m_sql_id = '9tz4qu4rj9rdp'

alter system set events
	sql_trace[SQL: &m_sql_id ]

pause Press return to stop tracing

alter system set events
	sql_trace[SQL: &m_sql_id ]

This is the whole of a little script I’ve got – the generic &1 is how I normally use it, I’ve just included a specific value (which is the sql_id for “select count(*) from all_objects;” as an example – that starts tracing across the entire system, but only for a given SQL_ID. On a production system, if I think I really need to do this, I would check the expected frequency of execution for the statement and wait enough time to capture a few occurrences before disabling the trace. Each session generates its own trace file, but if you’ve been sufficiently patient you get a reasonable sampling of the different workloads – and if you’ve captured the bind variables as well, this may give you some clues about why different work loads can appear.

A nice detail about this feature is that if the SQL_ID is for a pl/sql block, all the SQL executing inside the block is traced as part of the trace on the block (and that was particularly helpful the last time I had to make use of the feature); in the sample I’ve given I also found that some recursive SQL – relating to the XMLSCHEMA object types – executed within the view was also traced as the main statement was traced … so that makes it easy to see the effects of SQL statements calling PL/SQL functions that contain SQL statements.


May 14, 2014

Feature Bypass

Filed under: CBO,Oracle,Troubleshooting — Jonathan Lewis @ 1:23 pm BST May 14,2014

Here’s a little tip that might be helpful occasionally when you’re trying to work out why the optimizer transformation you were expecting isn’t appearing

If you’ve ever checked the 10053 trace (and who wants to do that for a complex query) you may have noticed lines like:

SU: SU bypassed: Remote table referenced.

So now you know that SU – Subquery Unnesting – has limitations in distributed queries.

When I first saw a line like this, it crossed my mind that it would be useful to keep a reference list of features that could be reported as bypassed, which I do through a simple unix line:

strings -a oracle | grep -i bypassed > bypassed.txt

If you need a reference for the various short codes for transformations you can find it near the top of the 10053 trace, looking like this:

[sourecode gutter=”false”]
CBQT – cost-based query transformation
JPPD – join predicate push-down
OJPPD – old-style (non-cost-based) JPPD
FPD – filter push-down
PM – predicate move-around
CVM – complex view merging
SPJ – select-project-join
SJC – set join conversion
SU – subquery unnesting
OBYE – order by elimination
OST – old style star transformation
ST – new (cbqt) star transformation
CNT – count(col) to count(*) transformation
JE – Join Elimination
JF – join factorization
SLP – select list pruning
DP – distinct placement
… big gap here
AP – adaptive plans

April 4, 2014

NVL() change

Filed under: CBO,Oracle,Troubleshooting,Upgrades — Jonathan Lewis @ 6:10 pm BST Apr 4,2014

One of the problems of functions is that the optimizer generally doesn’t have any idea on how a predicate based on function(col) might affect the cardinality. However,  the optimizer group are constantly refining the algorithms to cover an increasing number of special cases more accurately. This is a good thing, of course – but it does mean that you might be unlucky on an upgrade where a better cardinality estimate leads to a less efficient execution plan. Consider for example the simple query (where d1 is column of type date):

select	*
from	t1
where	nvl(d1,to_date('01-01-1900','dd-mm-yyyy')) < sysdate

Now, there are many cases in many versions of Oracle, where the optimizer will appear to calculate the cardinality of

nvl(columnX,{constant}) operator {constant}

as if it were:

columnX is null or columnX operator {constant}

Unfortunately this doesn’t seem to be one of them – until you get to 11.2.something. Here’s a little demonstration code:

create table t1
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
		when rownum > 100 then null else sysdate - rownum
	end	d1
	generator	v1,
	generator	v2
	rownum <= 50000

		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'

set autotrace traceonly explain

prompt	query with NVL

select	*
from	t1
where	nvl(d1,to_date('01-01-1900','dd-mm-yyyy')) < sysdate

prompt	query with OR clause

select	*
from	t1
where	d1 is null or d1 < sysdate

If you run this code in you get the following – with numeric variations for cost (which I’m interested not in at the moment):

query with NVL
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |  2500 |  2500 |    18  (39)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  2500 |  2500 |    18  (39)| 00:00:01 |

Predicate Information (identified by operation id):
   1 - filter(NVL("D1",TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd

query with OR clause
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      | 50000 | 50000 |    13  (16)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 50000 | 50000 |    13  (16)| 00:00:01 |

Predicate Information (identified by operation id):
   1 - filter("D1" IS NULL OR "D1"<SYSDATE@!)

Take note particularly of the difference in the estimated cardinality for the tablescans.

When you upgrade to (possibly earlier – though there are some nvl() related patches that appeared only in, you get this:

query with NVL
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      | 50000 | 50000 |    18  (39)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 50000 | 50000 |    18  (39)| 00:00:01 |

Predicate Information (identified by operation id):

   1 - filter(NVL("D1",TO_DATE(' 1900-01-01 00:00:00', 'syyyy-mm-dd

query with OR clause
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      | 49900 | 49900 |    13  (16)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 49900 | 49900 |    13  (16)| 00:00:01 |

Predicate Information (identified by operation id):
   1 - filter("D1" IS NULL OR "D1"<SYSDATE@!)

As you can see the estimate for the “NVL()” example is now correct – which means vastly different from the estimate in which seemed to be using the standard “5% for range-based predicate on function(col)”.

It’s interesting to note that a (relatively) small error has crept in to the “OR” example – interestingly the size of the error is exactly the number of rows where d1 is not null (which looks like enough of a coincidence to be a bug – but maybe there’s a good rationale for it)


Yet again, a simple upgrade has the capacity to make a dramatic change to a cardinality estimate – which could mean a significant change to an execution plan and major change in performance. If you’ve read this note, though, you may recognise the pattern that is the driving feature behind the problem.


If you have access to versions through and find that the test data produces different cardinalities please publish the results in the comments – it would be nice to know exactly when this change appears.  (You need only show the body of the execution plans  with labels, not the entire output of the script).


March 28, 2014


Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 8:12 am BST Mar 28,2014

One of the problems of “knowing” so much about Oracle is that the more you know the more you have to check on each new release of the software. An incoming ping on my posting “Lock Horror” reminded me that I was writing about, and the terminal release is, and the whole thing may have changed in – so I ought to re-run some tests to make sure that the articel is up to date if it’s likely to be read a few times in the next few days.

Unfortunately, although I often add a URL to scripts I’ve used to confirm results published in the blog, I don’t usually include a script name in my blog postings  to remind me where to go if I want to re-run the tests. So how do I find the right script(s) ? Typically I list all the likely scripts and compare dates with the date on the blog; so here’s what I got for “lock”.

SQL> host ls -ltr *lock*.sql | grep -v block
-rwxr-xr-x 1 jonathan dba 1569 Jun 28  2002 c_bitlock.sql
-rwxr-xr-x 1 jonathan dba 1303 Oct  5  2002 ddl_deadlock.sql
-rwxr-xr-x 1 jonathan dba 1875 Oct  7  2002 ddl_deadlock_2.sql
-rwxr-xr-x 1 jonathan dba 1654 Aug  6  2003 hw_lock.sql
-rwxr-xr-x 1 jonathan dba 2626 Sep 17  2004 lock_oddity.sql
-rwxr-xr-x 1 jonathan dba 1804 Sep 17  2004 lock_speed.sql
-rwxr-xr-x 1 jonathan dba 3194 May  8  2006 space_locks.sql
-rwxr-xr-x 1 jonathan dba 4337 Jan  3  2008 tm_deadlock.sql
-rwxr-xr-x 1 jonathan dba 1149 Jan  3  2008 show_lock.sql
-rwxr-xr-x 1 jonathan dba 2068 Apr 21  2008 hw_lock_2.sql
-rwxr-xr-x 1 jonathan dba 1482 Feb  5  2010 tt_lock.sql
-rwxr-xr-x 1 jonathan dba 1692 Feb 16  2010 to_lock.sql
-rwxr-xr-x 1 jonathan dba 3308 Jun  1  2010 skip_locked.sql
-rwxr-xr-x 1 jonathan dba 2203 Nov  2  2010 deadlock_statement.sql
-rwxr-xr-x 1 jonathan dba 2883 Nov  3  2010 merge_locking.sql
-rwxr-xr-x 1 jonathan dba 1785 Dec 14  2010 sync_lock.sql
-rwxr-xr-x 1 jonathan dba  984 Apr 23  2011 para_dml_deadlock.sql
-rwxr-xr-x 1 jonathan dba 4305 Jun  4  2011 locking_fifo.sql
-rwxr-xr-x 1 jonathan dba 5970 Jun  5  2011 locking_fifo_2.sql
-rwxr-xr-x 1 jonathan dba  917 Jun 30  2011 ul_deadlock.sql
-rwxr-xr-x 1 jonathan dba  936 Jul  8  2011 funny_deadlock.sql
-rwxr-xr-x 1 jonathan dba  741 Sep  8  2011 row_lock_wait_index.sql
-rwxr-xr-x 1 jonathan dba 2590 Nov 30  2012 fk_lock_stress.sql
-rwxr-xr-x 1 jonathan dba 4561 Feb  6  2013 dbms_lock.sql
-rwxr-xr-x 1 jonathan dba 1198 Apr  6  2013 libcache_locks.sql
-rwxr-xr-x 1 jonathan dba 5636 Nov 27 19:40 ash_deadlocks.sql
-rwxr-xr-x 1 jonathan dba  379 Mar 27 19:17 fk_constraint_locks.sql

Nothing leaps out as an obvious candidate, though “funny_deadlock.sql” catches my eye for future reference; maybe I should look for “foreign key”.

SQL> host ls -ltr *fk*.sql | grep -v fkr
-rwxr-xr-x 1 jonathan dba  2140 Jun 16  2005 fk_check.sql
-rwxr-xr-x 1 jonathan dba  2897 Jun 16  2005 fk_order.sql
-rwxr-xr-x 1 jonathan dba   650 Oct 26  2007 pk_fk_null.sql
-rwxr-xr-x 1 jonathan dba  5444 Nov  4  2007 c_fk.sql
-rwxr-xr-x 1 jonathan dba  1568 Dec  5  2008 null_fk.sql
-rwxr-xr-x 1 jonathan dba  2171 Mar  2  2009 fk_anomaly_2.sql
-rwxr-xr-x 1 jonathan dba  3922 Mar  2  2009 fk_anomaly.sql
-rwxr-xr-x 1 jonathan dba  5512 Oct 15  2009 fk_check_2.sql
-rwxr-xr-x 1 jonathan dba  1249 Feb 15  2010 c_pk_fk_2.sql
-rwxr-xr-x 1 jonathan dba  1638 Feb 16  2010 c_pk_fk_3.sql
-rwxr-xr-x 1 jonathan dba  5121 Jun  1  2012 c_pt_fk_2.sql
-rwxr-xr-x 1 jonathan dba  4030 Jun  5  2012 c_pt_fk_3.sql
-rwxr-xr-x 1 jonathan dba  2062 Jun  5  2012 c_pt_fk_3a.sql
-rwxr-xr-x 1 jonathan dba  2618 Sep 23  2012 c_pk_fk_02.sql
-rwxr-xr-x 1 jonathan dba  1196 Oct 19  2012 deferrable_fk.sql
-rwxr-xr-x 1 jonathan dba  2590 Nov 30  2012 fk_lock_stress.sql
-rwxr-xr-x 1 jonathan dba  4759 Sep  1  2013 fk_bitmap.sql
-rwxr-xr-x 1 jonathan dba  1730 Sep 30 07:51 virtual_fk.sql
-rwxr-xr-x 1 jonathan dba  3261 Dec 22 09:41 pk_fk_gets.sql
-rwxr-xr-x 1 jonathan dba  8896 Dec 31 13:19 fk_delete_gets.sql
-rwxr-xr-x 1 jonathan dba 10071 Dec 31 14:52 fk_delete_gets_2.sql
-rwxr-xr-x 1 jonathan dba  4225 Jan 14 11:15 c_pk_fk.sql
-rwxr-xr-x 1 jonathan dba  2674 Jan 14 13:42 append_fk.sql
-rwxr-xr-x 1 jonathan dba  1707 Feb 10 12:34 write_cons_fk.sql
-rwxr-xr-x 1 jonathan dba  9677 Feb 24 17:23 c_pt_fk.sql
-rwxr-xr-x 1 jonathan dba   379 Mar 27 19:17 fk_constraint_locks.sql

(The “-fkr” is to eliminate scripts about “first K rows optimisation”). With a little luck, the dates are about right, c_pk_fk_2.sql and c_pk_fk_3.sql will be relevant. So keep an eye on “Lock Horror” for an update in the next few days.

You’ll notice that some of the scripts have a very old datestamp on them – that’s an indication of how hard it is to keep up; when I re-run a script on a new version of Oracle I invariably add a “Last Tested:” version to the header, and a couple of notes about changes.  A couple of my scripts date back to June 2001 – but that is, at least, the right century, and some people are still using Oracle 7.


It should be obvious that I can’t test everything on every new release – but it’s amazing how often on a client site I can recognize a symptom and pick at script that I’ve used in the past to construct the problem – and that’s when a quick bit of re-testing helps me find a solution or workaround (or Oracle bug note).


March 26, 2014


Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 2:13 pm BST Mar 26,2014

Here’s a little test you might want to try. Examine the following script, and decide what sort of symptoms you would see in the AWR report.

create global temporary table gtt1(n1 number);

execute dbms_workload_repository.create_snapshot;

insert into gtt1 values(1);
truncate table gtt1;

-- repeat insert/truncate for a total of 100 cycles

execute dbms_workload_repository.create_snapshot;

-- generate an AWR report across the interval.

I don’t need anyone to tell me their results – but if your predictions and the actual results match then you can give yourself a pat on the head.
You might also like to enable SQL trace for all the inserts/truncate to see if that shows you anything interesting.

This is one of the simpler scripts of the 3,500 I have on my laptop that help me interpret the symptoms I see in client systems.

March 3, 2014

Flashback Fail ?

Filed under: Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 4:19 pm BST Mar 3,2014

Sitting in an airport, waiting for a plane, I decided to read a note (pdf) about Flashback data archive written by Beat Ramseier from Trivadis.  I’d got about three quarters of the way through it when I paused for thought and figured out that on the typical database implementation something nasty is going to happen after approximately 3 years and 9 months.  Can you guess why ?

It’s all about smon_scn_time – which normally records one row every five minutes (created by smon) with a continuous cycle of 24 hours – typically giving you about 1,440 rows in the table. The table is in a cluster, and the cluster key is the instance (thread) number. Clearly this was originally a clever idea from someone who realised that a cluster key of thread number would be beneficial if you had a RAC system with multiple instances – each instance gets its own blocks and the data for any one instance is as well clustered as possible.

The trouble is, when you enable flashback data archive smon no longer sticks to a 24 hour cycle, it just keeps adding rows. Now on my 8KB block tablespace I see 6 rows per block in the table/cluster – which means I get through 48 blocks per days,  17,520 blocks per year, and in 3 years and 9 months I’ll get to roughly 65,700 blocks – and that’s the problem.  An index entry in a cluster index points to a chain of cluster blocks, and the last two bytes of the “rowid” in the index entry identify which block within the chain the cluster key scan should start at – and two bytes means you can only have 65,536 blocks for a single cluster key.

I don’t know what’s going to happen when smon tries to insert a row into the 65,535th (-ish) block for the current thread – but it ought to raise an Oracle error, and then you’ll probably have to take emergency action to make sure that the flashback mechanisms carry on running.

Although oraus.msg indicates that it’s an error message about hash clusters it’s possible that the first sight will be: Oracle error: “ORA-02475 maximum cluster chain block count of %s has been exceeded”. If you’re using a 16KB block size then you’ve got roughly 7.5 years, and 32KB block sizes give you about 15 years (not that that’s a good argument for selecting larger block sizes, of course.)


Searching MoS for related topics (smon_scn_time flashback) I found doc ID: 1100993.1 from which we could possibly infer that the 1,440 rows was a fixed limit in 10g but that the number of rows allowed in smon_scn_time could increase in 11g if you enable automatic undo management. I also found a couple of bugs relating to index or cluster corruption – fixed by, though.



February 10, 2014

Row Migration

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 6:55 pm BST Feb 10,2014

At one of the presentations I attended at RMOUG this year the presenter claimed that if a row kept increasing in size and had to migrate from block to block as a consequence then each migration of that row would leave a pointer in the previous block so that an indexed access to the row would start at the original table block and have to follow an ever growing chain of pointers to reach the data.

This is not correct, and it’s worth making a little fuss about the error since it’s the sort of thing that can easily become an urban legend that results in people rebuilding tables “for performance” when they don’t need to.

Oracle behaves quite intelligently with migrated rows. First, the migrated row has a pointer back to the original location and if the row has to migrate a second time the first place that Oracle checks for space is the original block, so the row might “de-migrate” itself; however, even if it can’t migrate back to the original block, it will still revisit the original block to change the pointer in that block to refer to the block it has moved on to – so the row is never more than one step away from its original location. As a quick demonstration, here’s some code to generate and manipulate some data:


RAC Plans

Filed under: Execution plans,Hints,Oracle,RAC,Troubleshooting — Jonathan Lewis @ 1:12 pm BST Feb 10,2014

Recently appeared on Mos – “Bug 18219084 : DIFFERENT EXECUTION PLAN ACROSS RAC INSTANCES”

Now, I’m not going to claim that the following applies to this particular case – but it’s perfectly reasonable to expect to see different plans for the same query on RAC, and it’s perfectly possible for the two different plans to have amazingly different performance characteristics; and in this particular case I can see an obvious reason why the two nodes could have different plans.

Here’s the query reported in the bug:


February 5, 2014


Filed under: Execution plans,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 5:42 pm BST Feb 5,2014

Here’s a little script to demonstrate an observation about a missed opportunity for avoiding work that appeared in my email this morning (that’s morning Denver time):


« Previous PageNext Page »

Blog at