Oracle Scratchpad

May 13, 2013

Parse Time

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 6:59 pm UTC May 13,2013

Dominic Brooks published a note recently about some very nasty SQL – originally thinking that it was displaying a run-time problem due to the extreme number of copies of the lnnvl() function the optimizer had produced. In fact it turned out to be a parse-time problem rather than a run-time problem, but when I first read Dominic’s note I was sufficiently surprised that I decided to try modelling the query.

Unfortunately the query had more than 1,000 predicates, (OR’ed together) and some of them included in-lists.  Clearly, writing this up by hand wasn’t going to be a good idea, so I wrote a script to generate both the data, and the query, as follows – first a table to query:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum		id1,
	rownum		id2,
	rownum		id,
	lpad(rownum,10)	v1,
	rpad('x',100)	padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5
;

create index t1_i1 on t1(id1, id2);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

Then a piece of code to write a nasty query:

set pagesize 0
set feedback off
set termout off

spool temp1.sql

prompt select * from t1 where 1 = 2

select
	'or (id1 = ' || rownum || ' and id2 = ' || (rownum + 1) || ')'
from
	t1
where
	rownum <= 750
union all
select
	'or ( id1 =  ' || (rownum + 1000) || ' and id2 in (' || rownum || ',' || (rownum+1) || '))'
from
	t1
where
	rownum <= 250
;

prompt /

spool off

Here’s an example of the text generated by the code – with the parameters set to 5 and 3 respectively (and notice how I’ve rigged the query so that it doesn’t return any data, whatever the optimizer thinks):

select * from t1 where 1 = 2
or (id1 = 1 and id2 = 2)
or (id1 = 2 and id2 = 3)
or (id1 = 3 and id2 = 4)
or (id1 = 4 and id2 = 5)
or (id1 = 5 and id2 = 6)
or ( id1 =  1001 and id2 in (1,2))
or ( id1 =  1002 and id2 in (2,3))
or ( id1 =  1003 and id2 in (3,4))
/

So here’s the plan from the above query:


---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     8 |  1008 |    16   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   8 |   INLIST ITERATOR             |       |       |       |            |          |
|   9 |    TABLE ACCESS BY INDEX ROWID| T1    |     5 |   630 |     7   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | T1_I1 |     5 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID1"=1003)
       filter("ID2"=3 OR "ID2"=4)
   5 - access("ID1"=1002)
       filter((LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND
              ("ID2"=2 OR "ID2"=3))
   7 - access("ID1"=1001)
       filter((LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND ("ID2"=1 OR
              "ID2"=2))
  10 - access(("ID1"=1 AND "ID2"=2 OR "ID1"=2 AND "ID2"=3 OR "ID1"=3 AND
              "ID2"=4 OR "ID1"=4 AND "ID2"=5 OR "ID1"=5 AND "ID2"=6))
       filter((LNNVL("ID1"=1001) OR LNNVL("ID2"=1) AND LNNVL("ID2"=2)) AND
              (LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)))

As you can see, the first five predicates end up in line 10 of the plan with 10 repetitions (5 * 2) of the lnnvl() function. The last three predicates show up in lines 3, 5, and 7 – and the on each line we see two more lnnvl() calls than on the previous – just imagine, then, how many lnnvl() calls the optimizer will have added to the query plan by the time we have 750 occurrences in the inlist iterator (line 8) and 250 occurrences of the slightly complex predicate. Here are the relevant CPU stats (from v$session_stats) from running the generated script on 11.1.0.7, on Windows 32-bit, 2.8GHz CPU:

Name                                           Value
----                                           -----
recursive cpu usage                            1,848
CPU used when call started                     1,854
CPU used by this session                       1,854
DB time                                        1,870
parse time cpu                                 1,847
parse time elapsed                             1,862

Clearly the parse time is extreme – though not as dramatic as in Dominic’s example; but having set up the first draft of the sample code it’s easy enough to change the number of occurrences of each type of predicate, and it’s pretty easy to make longer in-lists in the more complex of the two types of predicate. It’s not too difficult to get an execution plan that mimics Dominic’s in length and time to parse.

It’s not just the parse times that are interesting when you start doing this, by the way – it’s worth playing around to see what happens. It’s probably best to run the query to pull the plans from memory if you want to see the plans, though – if you try using “explain plan” then you start using memory in the SGA for some of the work: in one of my examples I had to abort the instance after a few minutes.

May 10, 2013

Hakan Factor

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 3:52 pm UTC May 10,2013

Here’s a quick and dirty script to create a procedure (in the SYS schema – so be careful) to check the Hakan Factor for an object. If you’re not familiar with the Hakan Factor, it’s the value that gets set when you use the command “alter table minimize records_per_block;”.

I was prompted to publish this note by an item on the OTN SQL forum describing a problem with partition exchange with a table when there were bitmap indexes in place and the table had been changed to have some extra columns added. (Problem as yet unresolved as I publish).

If you start playing with the Hakan Factor, you’ll find that there are some odd little bugs in what gets stored and how it gets used. (SQL updated to use bitand() to reflect comments below and Karsten Spang’s blog note; also edited following a comment on OTN to show the rest of the spare1 flag bits)


create or replace procedure show_hakan(
	i_table		in	varchar2,
	i_owner		in	varchar2 default user
) as
	m_obj		number(8,0);
	m_flags		varchar2(12);
	m_hakan		number(8,0);
begin

	/* created by show_hakan.sql	*/
	select
		obj#, 
/*
		case 
			when	(spare1 > 5 * power(2,15)) 
				then	(spare1 - 5 * power(2,15))
			when	(spare1 > power(2,17)) 
				then	(spare1 - power(2,17))
			when	(spare1 > power(2,15)) 
				then	(spare1 - power(2,15))
				else	spare1
		end 				hakan
*/
	to_char(
		bitand(
			spare1, to_number('ffff8000','xxxxxxxx')
		),
		'xxxxxxxx'
	)					flags,
	bitand(spare1, 32767)			hakan	-- 0x7fff	
	into
		m_obj,
		m_flags,
		m_hakan
	from
		tab$
	where	obj# in (
			select	object_id 
			from	dba_objects
			where	object_name = upper(i_table)
			and	object_type = 'TABLE'
			and	owner = upper(i_owner)
		)
	;

	dbms_output.put_line(
		'Hakan factor for object ' || 
		m_obj   || ' (' ||
		i_owner || '.' || 
		i_table || ') is ' || 
		m_hakan || ' with flags ' ||
		m_flags
	);
end;
/

drop public synonym show_hakan;
create public synonym show_hakan for show_hakan;
grant execute on show_hakan to public;

 

You’ll notice that I’ve done an “upper()” on the table and owner – that means you’re in trouble if you have created an schemas or tables with mixed-case names (but you wouldn’t do that in a production system, would you?)

Update – A little bug history

One of the odd details of the Hakan factor is that the value it shows is one less than the number of rows that will be stored in a block; and since it looks as if the factor is not allowed to drop to zero, you can’t hack the Hakan factor to force one row per block.

So here’s a (trivial and sub-optimal) piece of code to check current number of rows per block in a simple heap table (assuming the tablespace consists of a single file):

select 
	ct, count(*) 
from	(
		select	dbms_rowid.rowid_block_number(rowid), count(*) ct  
		from	t1
		group by
			dbms_rowid.rowid_block_number(rowid)
	)
group by ct
order by ct
;

Here’s the output of a session, running under 9.2.0.8, cut and pasted from the screen:

SQL> @afiedt.buf

        CT   COUNT(*)
---------- ----------
         9          1
        16          1
SQL> alter table t1 nominimize records_per_block;
SQL> alter table t1 minimize records_per_block;
SQL> execute show_hakan('t1')
Hakan factor for object 48865 (TEST_USER.t1) is: 15
SQL> alter table t1 move;
SQL> @afiedt.buf

        CT   COUNT(*)
---------- ----------
        10          1
        15          1
SQL> alter table t1 nominimize records_per_block;
SQL> alter table t1 minimize records_per_block;
SQL> execute show_hakan('t1')
Hakan factor for object 48865 (TEST_USER.t1) is: 14

Every time you moved the table, 9.2.0.8 (and earlier) used the actual stored value of the Hakan Factor to rebuild the table; but if you regenerated the Hakan Factor the stored value was one less than the actual row count. So if you kept repeating the process the number of rows per block would decrease by one each time and the table would get bigger and bigger.

It’s a silly example – but the real-world relevance was that a direct path insert behaved differently from a normal insert and this could result in a significant amount of wasted space if you were doing bulk loads in your overnight batch; so the code changed in 10g to make the normal and direct path inserts consistent with each other, but the change went the wrong way and, as a side effect, you get one more row per block than suggested by the Hakan Factor – and you can’t trick the Hakan factor into enforcing one row per block any more.

May 9, 2013

Clustering_factor

Filed under: CBO,Indexing,Oracle — Jonathan Lewis @ 8:14 am UTC May 9,2013

Cost Based Oracle – Fundamentals (November 2005)

But the most interesting function for our purposes is sys_op_countchg(). Judging from its name, this function is probably counting changes, and the first input parameter is the block ID portion (object_id, relative file number, and block number) of the table’s rowid, so the function is clearly matching our notional description of how the clustering_factor is calculated. But what is that 1 we see as the second parameter?

When I first understood how the clustering_factor was defined, I soon realized that its biggest flaw was that Oracle wasn’t remembering recent history as it walked the index; it only remembered the previous table block so that it could check whether the latest row was in the same table block as last time or in a new table block. So when I saw this function, my first guess (or hope) was that the second parameter was a method of telling Oracle to remember a list of previous block visits as it walked the index.

And finally, Oracle Corp. had implemented an official interface to the second parameter of sys_op_countchg() – provided you install the right patch – through a new table (or schema, or database) preference type available to the dbms_stats.set_table_prefs() procedure.

I’ve been meaning to write this post for two or three months, ever since Sean Molloy sent me an email about short blog note from Martin Decker describing Bug 13262857  Enh: provide some control over DBMS_STATS index clustering factor computation. Unfortunately I’ve not yet had time to investigate the patch, but I don’t think I need to any more because Richard Foote has written it up in his latest blog post.

Read Richard’s post – it’s important.

Update 10th May

Richard’s post has, unsurprisingly, produced a buzz of excitement in his reader – and started up the discussion of how best to use this capability; so here’s another quote from the book (p.111 – available in the download of chapter 5):

So using Oracle’s own function for calculating the clustering_factor, but substituting the freelists value for the table, may be a valid method for correcting some errors in the clustering_factor for indexes on strongly sequenced data. (The same strategy applies if you use multiple freelist groups—but multiply freelists by freelist groups to set the second parameter.)

Can a similar strategy be used to find a modified clustering_factor in other circumstances? I think the answer is a cautious “yes” for tables that are in ASSM tablespaces. Remember that Oracle currently allocates and formats 16 new blocks at a time when using automatic segment space management (even when the extent sizes are very large, apparently). This means that new data will be roughly scattered across groups of 16 blocks, rather than being tightly packed.

Calling Oracle’s sys_op_countchg() function with a parameter of 16 could be enough to produce a reasonable clustering_factor where Oracle currently produces a meaningless one. The value 16 should, however, be used as an upper bound. If your real degree of concurrency is typically less than 16, then your actual degree of concurrency would probably be more appropriate.

Whatever you do when experimenting with this function—don’t simply apply it across the board to all indexes, or even all indexes on a particular table. There will probably be just a handful of critical indexes where it is a good way of telling Oracle a little more of the truth about your system—in other cases you will simply be confusing the issue.

Note particularly the comments about how the best value depends on the data in the indexed columns, the table configuration, and the degree of concurrency - you don’t necessarily want to use the same value for every index on a given table. That’s a shame, since Oracle has defined the interface as a TABLE preference, so if you set it then you get the same for every index. Despite this, if you’re prepared to put in a little control work, it does mean that you can use an official Oracle mechanism to play the game I was suggesting in the book – for each “special” index, set the preference, collect the stats, then clear the preference.

May 5, 2013

SLOB

Filed under: Oracle,Performance — Jonathan Lewis @ 6:21 pm UTC May 5,2013

Anyone who has used Kevin Closson’s “Silly Little Oracle Benchmark” will want to know about his significantly updated SLOB2.

May 2, 2013

v$lock

Filed under: Locks,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 7:03 pm UTC May 2,2013

The problem of slow queries on v$lock just came up again on the OTN database forum, so I thought I’d better push out a post that’s been hanging around on my blog for the last few months. This is actually mentioned in MOS in note 1328789.1: “Query Against v$lock Run from OEM Performs Slowly” which points out that it is basically a problem of bad statistics and all you have to do is collect the stats.

However, because the view is a messy union and join of several dynamic performance views sitting on top of a load of x$ structures, the advisory from MOS is to call the procedure  dbms_stats.gather_fixed_objects_stats. This is not a nice thing to do on a busy production system, especially if it has a large number of users and a large shared pool – and doing it at a representative run-time is important if you’re going to do it at all.

There is an alternative. Here’s the basic execution plan (on my 11.2.0.2) for “select * from v$lock”:

-----------------------------------------------------
| Id  | Operation              | Name       | Rows  |
-----------------------------------------------------
|   0 | SELECT STATEMENT       |            |     1 |
|   1 |  HASH JOIN             |            |     1 |
|   2 |   VIEW                 | GV$_LOCK   |    10 |
|   3 |    UNION-ALL           |            |       |
|   4 |     FILTER             |            |       |
|   5 |      VIEW              | GV$_LOCK1  |     2 |
|   6 |       UNION-ALL        |            |       |
|   7 |        FIXED TABLE FULL| X$KDNSSF   |     1 |
|   8 |        FIXED TABLE FULL| X$KSQEQ    |     1 |
|   9 |     FIXED TABLE FULL   | X$KTADM    |     1 |
|  10 |     FIXED TABLE FULL   | X$KTATRFIL |     1 |
|  11 |     FIXED TABLE FULL   | X$KTATRFSL |     1 |
|  12 |     FIXED TABLE FULL   | X$KTATL    |     1 |
|  13 |     FIXED TABLE FULL   | X$KTSTUSC  |     1 |
|  14 |     FIXED TABLE FULL   | X$KTSTUSS  |     1 |
|  15 |     FIXED TABLE FULL   | X$KTSTUSG  |     1 |
|  16 |     FIXED TABLE FULL   | X$KTCXB    |     1 |
|  17 |   MERGE JOIN CARTESIAN |            |   100 |
|  18 |    FIXED TABLE FULL    | X$KSUSE    |     1 |
|  19 |    BUFFER SORT         |            |   100 |
|  20 |     FIXED TABLE FULL   | X$KSQRS    |   100 |
-----------------------------------------------------

Note, particularly, the Cartesian merge join at line 17, which assumes there will be one row from v$session (x$ksuse) joined to 100 rows from v$resource (x$ksqrs – the structure used to represent any resources that you want to lock). The big UNION ALL is then all the different types of locks (enqueues) that you might attach to a resource. The estimates relating to these two structures are the most significant problem – v$session always has FAR more than one row in it, and v$resource isn’t small: in my little system the Cartesian join produced about 325,000 rows, and that was just after starting up the database.

Having identified a couple of critical tables, I decided to see what would happen if I just collected stats on these two objects rather than doing the whole system, and the following little piece of pl/sql did what I wanted:

begin
        dbms_stats.gather_table_stats('SYS','x$ksuse',method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('SYS','x$ksqrs',method_opt=>'for all columns size 1');
end;
/

The effect of the stats was to change the plan to the following which, while it might not be the absolute best in all cases, is certainly better than the previous one (it’s possible that you may also find that it helps to collect stats on x$ksqeq (the “generic enqueues” structure) which is likely to be the next most significant in terms of number of rows):

------------------------------------------------------
| Id  | Operation               | Name       | Rows  |
------------------------------------------------------
|   0 | SELECT STATEMENT        |            |   133 |
|   1 |  HASH JOIN              |            |   133 |
|   2 |   HASH JOIN             |            |    10 |
|   3 |    VIEW                 | GV$_LOCK   |    10 |
|   4 |     UNION-ALL           |            |       |
|   5 |      FILTER             |            |       |
|   6 |       VIEW              | GV$_LOCK1  |     2 |
|   7 |        UNION-ALL        |            |       |
|   8 |         FIXED TABLE FULL| X$KDNSSF   |     1 |
|   9 |         FIXED TABLE FULL| X$KSQEQ    |     1 |
|  10 |      FIXED TABLE FULL   | X$KTADM    |     1 |
|  11 |      FIXED TABLE FULL   | X$KTATRFIL |     1 |
|  12 |      FIXED TABLE FULL   | X$KTATRFSL |     1 |
|  13 |      FIXED TABLE FULL   | X$KTATL    |     1 |
|  14 |      FIXED TABLE FULL   | X$KTSTUSC  |     1 |
|  15 |      FIXED TABLE FULL   | X$KTSTUSS  |     1 |
|  16 |      FIXED TABLE FULL   | X$KTSTUSG  |     1 |
|  17 |      FIXED TABLE FULL   | X$KTCXB    |     1 |
|  18 |    FIXED TABLE FULL     | X$KSUSE    |   252 |
|  19 |   FIXED TABLE FULL      | X$KSQRS    |  1328 |
------------------------------------------------------

Thanks to Timur Akhmadeev who recently published a note pointing out that you could collect stats on individual X$ tables. Do make sure you test this on your specific version of Oracle, though, and don’t use the production system as your first test case.

April 29, 2013

MV Refresh

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 2:13 pm UTC Apr 29,2013

Materialized views open up all sorts of possibilities for making reporting more efficient – but at the same time they can introduce some “interesting” side effects when you start seeing refreshes taking place. (Possibly one of the most dramatic surprises appeared in the upgrade that switched many refreshes into “atomic” mode, changing a “truncate / append” cycle into a massively expensive “delete / insert” cycle).

If you want to have some ideas of the type of work that is involved in the materialized view “fast refresh”, you could look at a recent pair of articles by Alberto Dell’Era on (very specifically) outer join materialized views (which a link back to a much older article on inner join materialized view refresh):

 

 

April 16, 2013

systimestamp

Filed under: Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 12:46 am UTC Apr 16,2013

For your entertainment – there’s nothing up my sleeves, this was a simple cut-n-paste after real-time typing with no tricks:

20:39:51 SQL> create table t1 (t1 timestamp);

Table created.

20:39:55 SQL> insert into t1 values(systimestamp);

1 row created.

20:39:59 SQL> select t1 - systimestamp  from t1;

T1-SYSTIMESTAMP
---------------------------------------------------------------------------
+000000000 04:59:50.680620

1 row selected.

20:40:08 SQL>

My laptop runs Oracle so quickly that it took only 4 seconds for 5 hours to elapse !
11.2.0.3 on 64-bit Linux – the client is running with TZ=EST5EDT, while the server is running UK Time (currently BST (GMT+1))

Comments available on MOS: 340512.1 Timestamps & time zones – Frequently Asked Questions
Another MOS note, thanks to Jure Bratina in the comments: 227334.1 – “Dates & Calendars – Frequently Asked Questions” in the question

Update:

As Niall quotes in the comments: “times are difficult”.
Oracle supplies three timestamps: systimestamp, localtimestamp, and current_timestamp. (For reasons of consistency, only one of uses an underscore ;) )
Oracle also supplies three timestamp types: timestamp, timestamp with time zone, and timestamp with local time zone.
Oracle also supplies two timezone calls: dbtimezone, and sessiontimezone

If you need to figure out all the details of how these things hang together, I think you need to set your machine timezone to something that isn’t UTC (or GMT as I still tend to call it), then use two separate machines as clients, with their timezones set to two other timezones (again avoiding UTC).

I’ve done a few experiments but without being so rigorous in my settings – my machine was running on GMT, but I opened a (UNIX) session and set the session time zone to EST5EDT to start the database, while running other (UNIX) session with different TZ settings. The reason I should have restarted the machine in a different timezone is that Oracle “normalises” some timestamps to UTC – which means there are cases when I can’t be certain whether the stored value is in UTC because it has been normalised or because it simply was the actual machine time.

So here’s a little experiment (11.2.0.2, instance started in EST5EDT, unix session running in UTC, connecting across the network to the server).

select
        current_timestamp,
        localtimestamp,
        systimestamp
from
        dual
;

CURRENT_TIMESTAMP
---------------------------------------------------------------------------
LOCALTIMESTAMP
---------------------------------------------------------------------------
SYSTIMESTAMP
---------------------------------------------------------------------------
17-APR-13 11.37.10.870658 AM +01:00
17-APR-13 11.37.10.870658 AM
17-APR-13 06.37.10.870554 AM -04:00

Notes:
systimestamp reflects the instance timestamp – which is 5 hours earlier than the session timestamp.
systimestamp returns a timestamp with time zone, not just a timestamp
localtimestamp and current_timestamp show the client time, but localtimestamp doesn’t show the timezone, current_timestamp does (the +1:00 appears because Daylight Saving Time (British Summer Time) is active so my session is one hour ahead of UTC, while the database is 4 hours behind.)

Another quick test:

create table t1 (
        t0              timestamp,
        tz              timestamp with time zone,
        tl              timestamp with local time zone,
        ts_type         varchar2(20)
)
;

insert into t1 values(
        systimestamp, systimestamp, systimestamp,
        'sys Timestamp'
);

commit;

select * from t1;

T0
---------------------------------------------------------------------------
TZ
---------------------------------------------------------------------------
TL                                                                          TS_TYPE
--------------------------------------------------------------------------- --------------------
17-APR-13 06.44.04.353489 AM
17-APR-13 06.44.04.353489 AM -04:00
17-APR-13 11.44.04.353489 AM                                                sys Timestamp

select
        dump(t0,16),
        dump(tz,16),
        dump(tl,16),
        ts_type
from
        t1
;
DUMP(T0,16)
------------------------------------------------------------------------------------------------------------------------
DUMP(TZ,16)
------------------------------------------------------------------------------------------------------------------------
DUMP(TL,16)
------------------------------------------------------------------------------------------------------------------------
TS_TYPE
--------------------
Typ=180 Len=11: 78,71,4,11,7,2d,5,15,11,d0,68
Typ=181 Len=13: 78,71,4,11,b,2d,5,15,11,d0,68,10,3c
Typ=231 Len=11: 78,71,4,11,b,2d,5,15,11,d0,68
sys Timestamp

Notes:
T0 – the timestamp column, has the instance timestamp in it – but doesn’t have any timezone information stored; the raw dump show the value 6:44:04 (7, 2d, 5 – convert from hex and substract one). Anyone on ANY timezone will see their output showing 6:44:04 if they select this column.

TZ – the timestamp with time zone column, has the instance timestamp, but has stored it as (b, 2d,5 – 11:44:04) with time zone information (10,3c) that allows the session to know what “global” moment the information really represents and the location (or, rather, time zone) where is was entered.

TL – the timestamp with local time zone, has the instance timestamp, but has stored it as (b, 2d, 5 – 11:44:04) with NO timezone information. So the output when you query this column is adjusted to suit the local timestamp. It’s the right “global” moment, and it displays as the relevant local time. But, as a penalty, it’s lost the information about where (in which time zone) it was entered.

I think that examination of the content of the raw dumps of the three different types may help you understand why you need to store timestamps in a column type that includes a time zone – if you don’t then you lose some information, and time-based arithmetic will give you some surprises if your application crosses timezones.

Next Issue:

Indexing time (though the link in the comments below to Tony Hasler’s blog probably gives you all the answers you need), and an Oracle design error that I’ve visited before.

April 13, 2013

Deadlocks

Filed under: deadlocks,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 11:36 am UTC Apr 13,2013

Here’s a deadlock graph that might cause a little confusion:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
 Resource Name          process session holds waits  process session holds waits
 TX-001a0015-00014787        34      90     X             32       3           S
 TX-00190008-0000601b        32       3     X             34      90           S

 session 90: DID 0001-0022-00000327      session 3: DID 0001-0020-000009E9
 session 3: DID 0001-0020-000009E9       session 90: DID 0001-0022-00000327

 Rows waited on:
  Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
  (dictionary objn - 199909, file - 6, block - 276654, slot - 0)
  Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
  (dictionary objn - 199909, file - 6, block - 433952, slot - 0)

Both sessions are holding X and waiting S on a TX lock. There are several well-known reasons why you might see a share (mode 4) lock on a transaction table slot: collisions on bitmap indexes, unique indexes, index organized tables, or referential integrity checks are the commonest “external” examples, and problems with interested transaction lists (ITLs) or freelists are the common “internal” ones – so the presence of the waits in share mode shouldn’t, of themselves, a source of confusion.

The confusion is in the reported rowids. If you try to interpret them as real rowids you may be unlucky and discover that they seem to be related to the SQL reported for the deadlocked sessions when really the information they hold is garbage. (Just to avoid confusion, let me make it clear that there are cases where the rowids reported definitely WILL be garbage; on the other hand, there may be some scenarios where the rowids are relevant – although I haven’t done any exhaustive check to see if there really are such scenarios.)

When I see a deadlock graph on transaction locks and the waits are for S mode I tend to assume that the information about the rows waited on is probably misleading; when the slot number for the rowid is zero this increases my confidence that the rowid is rubbish. (Zero is a legal value for a rowid slot, of course, so a zero doesn’t prove that the rowid is rubbish, it’s just a coincidence that allows me to continue following a hypothesis.)

The problem is that Oracle doesn’t waste resources tidying up after itself, and in the case of deadlock graphs this laziness shows up in the rowids reported. The trace is simply reporting whatever happens to be in the row_wait_obj#, row_wait_file#, row_wait_block# and row_wait_row# columns of v$session; and if the waiting process hasn’t updated these columns with current row information you could be looking at the details of the last row (or block) that the session waited for. Here’s the description of a test to demonstrate the behaviour:

create table t1 (n1 number, n2 number);
insert into t1 values(1,1);
create unique index t1_i1 on t1(n1);
create unique index t1_i2 on t1(n2);

session 1:	insert into t1 values(2,11);
session 2:	insert into t1 values(3,21);
session 1:	insert into t1 values(4,21);
session 2:	insert into t1 values(5,11);

With this table, and sequence of events following it, session 1 raised a deadlock error, and dumped the following trace:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-000a0021-0000c4aa        16      93     X             13      90           S
TX-00020020-0000fdb9        13      90     X             16      93           S
session 93: DID 0001-0010-00000057	session 90: DID 0001-000D-000000F2
session 90: DID 0001-000D-000000F2	session 93: DID 0001-0010-00000057
Rows waited on:
Session 90: obj - rowid = 00000009 - AAAAAJAABAAAQJcAAA
  (dictionary objn - 9, file - 1, block - 66140, slot - 0)
Session 93: obj - rowid = 0002E7DC - AAAufaAAFAAAAAJAAA
  (dictionary objn - 190428, file - 5, block - 9, slot - 0)

Object 9 is the I_FILE#_BLOCK# index in the data dictionary – and session 90 is definitely not doing anything with that object in this transaction.
Object 190428 is another table in the test schema, but session 93 didn’t access it in this transaction, and the block referenced is the segment header block, not a block that could hold a row.

In fact, just before I started the sequence of inserts I ran this query from a third session (connected as sys) with the following results:

select
	sid,
	row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
from
	v$session
where
	sid in (90,93)
;

       SID ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
---------- ------------- -------------- --------------- -------------
        90             9              1           66140             0
        93        190428              5               9             0

The “rows” reported for the deadlock simply echoed the values that were already in the row_wait columns before the test started. (The sessions were waiting on “SQL*Net message from client” at the time.)

Footnote:

Session 90 created the tables and indexes, that’s probably why it happened to have an outstanding reference to the i_file#_block# index.
Session 93 had just run a script to drop all the objects in the schema, which may explain why it happened to have an outstanding reference to a segment header block
If the row_wait_obj# had been set to -1 for either session then the deadlock graph would have reported “No row” for that session.

April 7, 2013

DML Tracking

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 10:04 pm UTC Apr 7,2013

You’ve probably seen questions on the internet occasionally about finding out how frequently an object has been modified. The question is a little ambiguous – does it mean how much change has occurred, or how many DML statements have been executed; either may be an interesting measure. Of course, Oracle gave us a method of answering the first question a long time ago: v$segstat (or v$segment_statistics if you don’t mind doing the join) and the resulting content in the AWR or Statspack reports:

Segments by DB Blocks Changes   DB/Inst: XXXXXXXX/XXXXXXXX  Snaps: 85563-85564
-> % of Capture shows % of DB Block Changes for each top segment compared
-> with total DB Block Changes for all segments captured by the Snapshot

           Tablespace                      Subobject  Obj.      DB Block    % of
Owner         Name    Object Name            Name     Type       Changes Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
XXXXXXXXX  DATA       MLOG$_xxxxxxxxxxxxxx            TABLE    1,144,112   22.16
XXXXXXXXX  DATA       PK_xxxxxxxxxxxxxxxxx            INDEX      614,256   11.90
XXXXXXXXX  DATA       xxxxxxxxxxxxxxx                 TABLE      490,080    9.49
XXXXXXXXX  IDX        IDX_xxxxxxxxxxxxxxxx            INDEX      353,392    6.84
XXXXXXXXX  IDX        PK_yyyyyyyyyyyyyyy              INDEX      273,664    5.30
          -------------------------------------------------------------

There’s still a little ambiguity in this report (in the right circumstances you can change multiple rows in a block while only reporting a single block change), but it’s a very good indicator of the volume of change to an object; but is the million or so block changes due to one SQL statement or 1,000,000 SQL statements ?

With 11.2.0.2 there is an (undocumented) option for finding the answer to that question. Since it’s undocumented I wouldn’t use it on production unless I was really desperate, and even then I’d do the standard “check with Oracle support” first; however I might use it on a test system if I were do something like running tests of a new version of an overnight suite of batch jobs.

You can enable “DML frequency tracking” which seems to count the number of DML statements that modify an object or, to be more accurate, a table. It doesn’t do anything to identify indexes, although it will report index organized tables (using the table name); it won’t identify partition-level DML individually for partitioned tables, it will simply record the DML as being relevant to the table.

To enable the feature you modify hidden parameter _dml_frequency_tracking and then query dynamic performance view v$object_dml_frequencies.


alter system set "_dml_frequency_tracking"=true;

select 
	object, working, slot0, slot1, slot2, slot3, slot4 
from	
	v$object_dml_frequencies
order by
	working + slot0 + slot1 + slot2 + slot3
;

The view has 25 columns for numbered “slots”, and a “working” column. Each slot represents a 15 minute interval (set by parameter _dml_frequency_tracking_slot_time) and only 4 slots are maintained by default (set by parameter _dml_frequency_tracking_slots). The values roll from slot to slot every 15 minutes, and the “working” column keeps track of objects that have suffered DML since the last roll-over. Unfortunately there doesn’t seem to be any obvious way of tracking when a roll-over takes place, so you can’t tell how long the “working” column has been active for.

If you have the default 4 slots set, then “working” and slots 0 to 2 are populated with 15 minutes worth of information each, and slot3 seems to be the accumulated history of all accesses since you enabled the feature.

The view sits on top of x$ksxm_dft, but there are no interesting extra columns in the x$ that aren’t displayed in the v$ and gv$. (It’s an interesting little quirk that the v$ doesn’t following the usual pattern of being a view of the gv$ restricted to the current instance.) You can see the information in the shared pool (by querying v$sgastat) in an area named “dml frequency” – although the figure reported the last time I checked was wrong by a factor of roughly 3 – show about 8KB when I was expecting roughly 24KB.

I haven’t spent a lot of time investigating the feature yet, so there are probably a few questions and boundary conditions to test before using it in anger; but there don’t appear to be any special latches associated with the counters (so possibly the results would be under-count in a highly concurrent system, but maybe the results are collected under a mutex).

Related parameters

_dml_frequency_tracking (false) Control DML frequency tracking — can be set at system level on demand
_dml_frequency_tracking_advance (true) Control automatic advance and broadcast of DML frequencies — ?
_dml_frequency_tracking_slot_time (15) Time length of each slot for DML frequency tracking — needs restart
_dml_frequency_tracking_slots ( 4) Number of slots to use for DML frequency tracking — needs restart

March 31, 2013

Index Selectivity

Filed under: CBO,Oracle,Troubleshooting — Jonathan Lewis @ 6:42 pm UTC Mar 31,2013

Here’s a summary of a recent posting on OTN:

I have two indexes (REFNO, REFTYPESEQNO) and (REFNO,TMSTAMP,REFTYPESEQNO). When I run the following query the optimizer uses the second index rather than the first index – which is an exact match for the predicates, unless I hint it otherwise:

select *
from   RefTable
where  RefTypeSeqNo = :1
and    RefNo = :2;

Default plan:
---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     3 |   126 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| REFTABLE      |     3 |   126 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | REFTABLE_CX03 |     3 |       |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("REFNO"=TO_NUMBER(:2) AND "REFTYPESEQNO"=TO_NUMBER(:1))
       filter("REFTYPESEQNO"=TO_NUMBER(:1))

Hinted plan:
---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     3 |   126 |    15   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| REFTABLE      |     3 |   126 |    15   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | REFTABLE_CX02 |    14 |       |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("REFNO"=TO_NUMBER(:2) AND "REFTYPESEQNO"=TO_NUMBER(:1))

This is actually an example of a feature of the optimizer that I described a few years ago. The original note described a change as you moved from 10.1 to 10.2 and on to 11.1 – but once you’ve seen the basic issue there are a number of variations on how it might appear. In this case the OP seems to be using 10gR2, where the distinct_keys value from an index is used to calculate the cost and row estimate for the index and for the table access cost when that specific indexed access path is being considered.

So with the hint to use the accurate index we see an index cardinality estimate of 14 rows with a table cardinality of 3 despite the fact that the plan shows no extra predicates applied at the table; the cost of accessing the table is also clearly related to the cardinality estimate on the index line.

In the default plan when the wrong index is used, the optimizer doesn’t pay any attention to the distinct_keys from the other index, and simply uses the standard “product of column selectivities”.

11g introduces two changes – when calculating the table cardinality the distinct_keys value for the index is carried forward (so the plan with the high index cardinality but low table cardinality would report the same cardinality for both operations), and the distinct_keys from the first index would be used when doing the calculations for the second index – which would increase the cost of using the wrong index.

There’s really very little you can do to find a strategic fix for this type of problem in 10g – obviously you could add hints whenever Oracle used the wrong index, but that’s not reallya desirable approach, and it is possible to adjust column statistics in such a way that the calculations the optimizer uses give better approximations, but that’s not always very easy to do well. Ultimately you just have to be very careful about your choice of indexes – and when you think that two indexes show a significant overlap in columns consider the possibility that one carefully defined index may be able to do the job of both.

 

 

March 27, 2013

Open Cursors

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 6:13 pm UTC Mar 27,2013

Here’s a little detail that appeared in 11gR2 that may help you answer questions about open cursors. Oracle has added a “cursor type” column to the view v$open_cursor, so you can now see which cursors have been held open because of the pl/sql cursor cache, which have been held by the session cursor cache, and various other reasons why Oracle may take a short-cut when you fire a piece of SQL at it.

The following is the output showing the state of a particular session just after it has started up in SQL*Plus and called a PL/SQL procedure to run a simple count:

select
        cursor_type, sql_text
from
        V$open_cursor
where
        sid = 17
order by
        cursor_type,
        sql_text
;

CURSOR_TYPE                      SQL_TEXT
-------------------------------- ------------------------------------------------------------
DICTIONARY LOOKUP CURSOR CACHED  BEGIN DBMS_APPLICATION_INFO.SET_MODULE(:1,NULL); END;
                                 BEGIN DBMS_OUTPUT.DISABLE; END;
                                 BEGIN DBMS_OUTPUT.ENABLE(1000000); END;
                                 BEGIN dbms_random.seed(0); END;
                                 SELECT ATTRIBUTE,SCOPE,NUMERIC_VALUE,CHAR_VALUE,DATE_VALUE F
                                 SELECT CHAR_VALUE FROM SYSTEM.PRODUCT_PRIVS WHERE   (UPPER('
                                 SELECT USER FROM DUAL
                                 select /*+ connect_by_filtering */ privilege#,level from sys
                                 select SYS_CONTEXT('USERENV', 'SERVER_HOST'), SYS_CONTEXT('U
                                 select decode(failover_method, NULL, 0 , 'BASIC', 1, 'PRECON
                                 select metadata from kopm$  where name='DB_FDO'
                                 select privilege# from sysauth$ where (grantee#=:1 or grante
                                 select to_char(sysdate,'hh24miss') time_now from dual
                                 select value$ from props$ where name = 'GLOBAL_DB_NAME'

OPEN                             BEGIN spin_1; END;
                                 table_1_ff_208_0_0_0

OPEN-RECURSIVE                   insert into sys.aud$( sessionid,entryid,statement,ntimestamp

PL/SQL CURSOR CACHED             SELECT COUNT(*) X FROM KILL_CPU CONNECT BY N > PRIOR N START

SESSION CURSOR CACHED            BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;
                                 SELECT DECODE('A','A','1','2') FROM DUAL

Variations are left to the user.
There are a few other cursor types – here’s the list given in the 11.2 Server Reference manual under the definition of v$open_cursor:

  • BUNDLE DICTIONARY LOOKUP CACHED
  • CONSTRAINTS CURSOR CACHED
  • DICTIONARY LOOKUP CURSOR CACHED
  • OPEN
  • OPEN-PL/SQL
  • OPEN-RECURSIVE
  • PL/SQL CURSOR CACHED
  • REPLICATION TRIGGER CURSOR CACHED
  • SESSION CURSOR CACHED

It’s an interesting exercise to consider why there are so many types, and then create some tests to confirm or refute your hypothesis. I haven’t checked, but here are a few ideas:

  • replication trigger cursor cached: I don’t remember which version introduced the change, but once upon a time the triggers updating the materialized view logs were real after insert/update/delete triggers, but now they’re “pre-compiled” – so it’s not surprising they form a special case.
  • dictionary lookup cursor cached: are these, perhaps, the statements that are currently cached in the “_row_cache_cursors” cache for data dictionary access; the parameter was once set to 10, but currently defaults to 20.
  • bundle dictionary lookup cached: why would there be a special case of dictionary lookup ? perhaps this is the set of cursors needed to read the first few tables in the data dictionary that allow the optimizer to do its work (how do you optimize a query against tab$ if you need to query syn$, obj$ and tab$ to discover that tab$ is a table ?)
  • constraints cursor cached: probably something to do with the SQL (internal, or externalised) that Oracle has to run to check or implement details of referential integrity constraints.

Footnote (28th March):

By a strange coincidence a note came up on OTN today that pointed to a different version of the Oracle manual where the possible cursor types are listed under their “internal” names – but I’m not sure if there’s a version of Oracle where you’d see them looking like this:

  • CACHED
  • KNT CACHED
  • KQD BUNDLE CACHED
  • KQD CACHED
  • KXCC CACHED
  • PL/SQL
  • PL/SQL CACHED
  • SYSTEM

March 22, 2013

LOB Update

Filed under: Infrastructure,LOBs,Oracle — Jonathan Lewis @ 10:36 pm UTC Mar 22,2013

This note is about a feature of LOBs that I first desribed in “Practial Oracle 8i” but have yet to see used in real life. It’s a description of how efficient Oracle can be, which I’ll start with a description of, and selection from, a table:

create table test_lobs (
	id              number(5),
	bytes           number(38),
	text_content    clob
)
lob (text_content) store as text_lob(
	disable storage in row
	cache
)
;

-- insert a row

SQL> desc test_lobs
 Name                    Null?    Type
 ----------------------- -------- ----------------
 ID                               NUMBER(5)
 BYTES                            NUMBER(38)
 TEXT_CONTENT                     CLOB

SQL> select id, bytes, dbms_lob.getlength(text_content) from test_lobs;

        ID      BYTES DBMS_LOB.GETLENGTH(TEXT_CONTENT)
---------- ---------- --------------------------------
         1     365025                           365025

1 row selected.

I’ve got a table with a single CLOB column holding a single row. The size of the single CLOB is roughly 365KB (or about 45 blocks of 8KB). Old hands who have had to suffer LONG columns will recognise the trick of recording the size of a LONG as a separate column in the table; it’s a strategy that isn’t really necessary with LOBs but old coding habits die hard. It’s quite hard to find details of how much space has been used in a LOB segment (the space_usage procedure in the dbms_space package doesn’t allow you to examine LOBSEGMENTs), but I did a coupld of block dumps to check on this LOBSEGMENT and it had allocated 46 blocks on the first insert.

So here’s the clever bit – how big will the LOBSEGMENT grow when I update that one CLOB ?

It’s common knowledge (to users of LOBs) that the undo mechanism Oracle has for LOBs is simply to leave the old LOB in place and create a new one – so the intial response to the question might be to guess that the LOBSEGMENT will grow to roughly double the size. But it doesn’t have to be like that, at least, not if you update the LOB the way I happen to have done, which is like this:

declare

	m_length	integer;
	m_lob		clob;

begin

	select
		text_content,
		dbms_lob.getlength(text_content)
	into	m_lob, m_length
	from
		test_lobs
	where
		id = 1
	for update
	;

	dbms_output.put_line('Lob size: ' || m_length);

	dbms_lob.write(
		lob_loc	=> m_lob,
		amount	=> 17,
		offset	=> 1,
		buffer	=> 'This is an update'
	);

	commit;

end;
/

My code very specifically changes only the first 17 bytes of the LOB. So how much does Oracle have to do to effect this change ? The LOB-handling mechanisms are smart enough to work out that only the first (of 45) blocks in the LOB need to be changed, so Oracle need only add one block to the segment and write the new version of the first LOB block to that one block. (In fact the segment – which was in a tablespace using freelist management – grew by the “standard” 5 blocks from which Oracle selected just one block to add to the LOB.)

So how does Oracle keep track of the whole LOB if it can change it one piece at a time ? This is where the (notionally invisible and you don’t need to know about it) LOBINDEX comes into play. Oracle maintains an index keyed by (LOB_ID, chunk_number) *** pointing to all the chunks of a LOB in order, so when you update a single chunk Oracle simply creates an updated copy of the chunk and changes the appropriate index entry to point to the new chunk. So here’s an image representing our one LOB value just after we’ve created it and before we’ve updated:

lob_1

And then we “modify” the first chunk – which means we have to add a chunk (which in this case is a single block) to the segment, create a new version of the first chunk, modify the index to point to the new block, and add an index entr – keyed by time-stamp – to the end of the index to point to the old chunk; something like this:

lob_2

Now, when we run a query to select the LOB, Oracle will follow the index entries in order and pick up the new chunk from the end of the LOBSEGMENT. But the LOBINDEX is protected by undo in the standard fashion, so if another long-running query that started before our update needs to see the old version of the LOB it will create a read-consistent copy of the relevant index leaf block- which means that from its perspective the index will automatically be pointing to the correct LOB chunk.

The index is actually quite an odd one because it serves two functions; apart from pointing to current lobs by chunk number, it also points to “previous” chunks by timestamp (specifically the number of seconds between Midnight of 1st Jan 1970 and the time at which the chunk was “overwritten”). This makes it easy for Oracle to deal with the retention interval for LOBs – any time it needs space in the LOBSEGMENT it need only find the minimum timestamp value in the index and compare it with “sysdate – retention” to see if there are any chunks available for re-use.

To sum up – when you update LOBs, and it’s most beneficial if you have an application which doees piece-wise updates, you leave a trail of old chunks in  in the LOBSEGMENT. The version of the LOB you see is dictated by the version of the index that you generate when you request a copy of the LOB at a given SCN.

 

*** Footnote: My description of the LOBINDEX was an approximation. Each index entry carries a fixed size “payload” listing up to eight lob chunks; so the (LOB_ID, chunk_number) index entries in a LOBINDEX may point to every 8th chunk in the LOB. The significance of the “fixed size” payload is that the payload can be modified in place if the pointer to a LOB chunk has to be changed – and this minimises disruption of the index (at a cost of some wasted space).

 

March 20, 2013

Lock Bug

Filed under: Bugs,deadlocks,Locks,Oracle — Jonathan Lewis @ 12:08 pm UTC Mar 20,2013

Here’s an oddity that I ran into a little while ago while trying to prepare a sample trace file showing a particular locking pattern; it was something that I’d done before, but trace files can change with different versions of Oracle so I decided to use a copy of 11.2.0.2 that happened to be handy at the time to check if anything had changed since the previous (11gR1) release. I never managed to finish the test; here are the steps I got through:


-- created a table t1 to work with, no data needed, column definitions don't matter.

Session 1:
lock table t1 in row exclusive mode;        -- mode 3, lock acquired

Session 2:
lock table t1 in row exclusive mode;        -- mode 3, lock acquired

Session 3:
lock table t1 in exclusive mode;            -- mode 6, goes into wait for session 1

Session 1 (again):
lock table t1 in share row exclusive mode;  -- mode 5, should move to converter queue, wait for session 2

The mode 3 to mode 5 conversion is what happens if you have a foreign key referential integrity constraint without a covering index and run code that does something like “delete child rows for parent X, delete parent X” (which also happens under the covers if your foreign key is declared as “on delete cascade”).

Notice the “should” on the lock conversion line – this is exactly what happens in 10.2.0.3 and 11.2.0.3; but on the system I was using session 1 got an immediate deadlock (ORA-00060) error – in the absence of a deadlock scenario !

My problem is this – the instance that gives the deadlock error is 64-bit 11.2.0.2 running RAC on Linux (OEL 5); the instances that don’t give the error are 32-bit, non-RAC, running on Windows XP Pro. SO is the anomaly due to some difference in:

  • Windows vs. Linux
  • 32 bit vs. 64 bit
  • RAC vs. non-RAC
  • Specific version of Oracle

I’m inclined to think it’s a version dependent bug, but it’s possible that it’s a necessary side effect of RAC. So if you’ve got 11.2.0.3 RAC, or 11.2.0.2 non-RAC, or the terminal releases of 10g and 11.1 whether RAC or non-RAC, I’d like to hear from you which versions – if any – produce the same deadlock and which don’t. (And the scientific method being what it is, anyone with 11.2.0.2 on RAC on Linux might like to confirm – or contradict – my result.)

Update 8:30 p.m. GMT

The results in so far tend to support the idea that this may be a RAC-related issue. Justin Mitchell’s result is particularly interesting, and I confirmed on my 11.2.0.2 RAC system that if the two starting sessions are on different nodes then I don’t get the deadlock behaviour.

Users of RAC will probably be aware of the fact that v$lock.block commonly sets itself to 2 (potential blocker) on RAC the moment you acquire the lock; and this is what happens to both the “row exclusive” locks. When we request the exclusive lock, both TM locks change from block=2 to block=1 if they are on the same node; if they are on different nodes then only one of them changes.

Unfortunately Oleksandr didn’t see a deadlock when he tried RAC 11.2.0.3 – and since he didn’t make any comment about using multiple nodes we will have to wait to see if he can repeat the test and tell us whether his results are consistent with Justin’s, or whether they suggest that there are further considerations in effect.

March 4, 2013

Duplicate indexes ?

Filed under: Indexing,Oracle,Troubleshooting — Jonathan Lewis @ 5:39 pm UTC Mar 4,2013

I don’t think this is likely to happen on a production system (until 12c) – but look what you can do if you try hard enough:

  1  select
  2     index_name, column_name from user_ind_columns
  3  where
  4     table_name = 'T1'
  5  order by
  6*    index_name , column_position
SQL> /

INDEX_NAME           COLUMN_NAME
-------------------- --------------------
T1_I1                N1
                     V1

T1_I2                N1
                     V1


4 rows selected.

That’s a straight cut-n-paste from an Oracle 11.1.0.7 SQL*Plus session. (You can tell I typed it in real time because I missed the return before the FROM, and couldn’t be bothered to go back and do it again ;) )

March 1, 2013

Source Control

Filed under: Advertisements,Oracle — Jonathan Lewis @ 6:37 pm UTC Mar 1,2013

You may recall that I spent some time with the developers at the redgate offices in Cambridge (UK) a little while ago, looking at their Source Control for Oracle package. The product is about to go live, with a launch date of 12th March.

Because of the help I’ve given them they’ve offered my readers the chance of winning one of two 5-user licences for the product – provided I devise a strategy for picking the recipients.

So, to make it easy, all you have to do is persuade me that you really need or really deserve a copy. Write up a short description in the comments of the worst problem you’ve had to face because you didn’t have a decent source control system, or the  best use you think you could make of a source control system for Oracle.

I’ll get the redgate developers to read what you wrote, and they will be the final judges of the two most interesting, or possibly most horrifying, or maybe even the most entertaining, cases.

Entries to be in by 23:59 GMT on 11th March.

Update

The product has launched early !

You can find out more and see screenshots on this web page.

And there’s an online demo of the tool on March 14 at 16:00 GMT (17:00 CET / 11:00 EST / 10:00 CST / 08:00 PST), where you will be able to ask any questions. If you’re interesting in seeing it you’ll have to register at this “go to meeting” URL.

The closing date for entries is still 11th March, 23:59 GMT.

Next Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 1,391 other followers