Oracle Scratchpad

May 23, 2013

Dynamic Sampling – 2

Filed under: CBO,Hints,Oracle,Troubleshooting — Jonathan Lewis @ 12:46 pm UTC May 23,2013

I’ve written about dynamic sampling in the past, but here’s a little wrinkle that’s easy to miss. How do you get the optimizer to work out the correct cardinality for a query like (the table creation statement follows the query):

select	count(*)
from	t1
where	n1 = n2
;

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	mod(rownum, 1000)	n1,
	mod(rownum, 1000)	n2
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6 ; 

If you’re running 11g and can changed the code there are a couple of easy options – adding a virtual column, or applying extended stats and then modifying the SQL accordingly would be appropriate.

 -- Virtual Column alter table t1 add ( 	n3	generated always as ( case n1 when n2 then 1 end) virtual ) ; execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for columns n3 size 1')

-- Extended Stats

begin
	dbms_output.put_line(
		dbms_stats.create_extended_stats(
			ownname		=> user,
			tabname		=> 'T1',
			extension	=> '(case n1  when n2 then 1 else null end)'
		)
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		block_sample 	 => true,
		method_opt 	 => 'for columns (case n1  when n2 then 1 else null end) size 1'
	);
end;
/

select	count(*)
from	t1
where	(case n1 when n2 then 1 else null end)= 1
;

If you can’t change the SQL statement, there’s always the option for bypassing the problem by fixing a suitable execution plan with an SQL Baseline, of course. Alternatively, if you can think of the right hint you could create an “SQL Patch” for the statement – but what hint might be appropriate ? I’ll answer that question in a minute.

Here’s another option, though: get Oracle to use dynamic sampling. (You probably guessed that from the title of the post.) So which level would you use to make this work ? Left to its own devices, Oracle would calculate the selectivity of the predicate n1 = n2 as the smaller of the two separate predicates “n1 = unknown” and “n2 = unknown”. So you might hope that level 3 (Oracle is “guessing”) or level 4 (more than one predicate on a single table) might be appropriate. It’s the latter that works. If you execute “alter session set optimizer_dynamic_sampling=4;” before executing this query, Oracle will sample the table before optimising.

The method works, but can you apply it ? Possibly not, if you’re not allowed to inject any extra SQL anywhere – after all, you probably don’t want to set the parameter at the system level (spfile or init.ora) because it may affect lots of other queries – introducing more work because of the sample, and then risking unexpected changes in execution plans. Setting the parameter for a session is often no better. And this brings me back to the SQL Patch approach – if you don’t want to create a baseline for the query then perhaps a patch with the hint /*+ opt_param(‘optimizer_dynamic_sampling’ 4) */ will do the trick. Don’t forget all the doubling of single quotes that you’ll need, though (this is the code fragment I used):

begin
	sys.dbms_sqldiag_internal.i_create_patch(
		sql_text	=>
'
select
	count(*)
from	t1
where	n1 = n2
',
		hint_text	=> 'opt_param(''optimizer_dynamic_sampling'' 4)'

	);
end;
/

For more analysis and commentary on the SQL Patch mechanism, you might like to read Dominic Brooks’ mini-series:

May 22, 2013

gather_plan_statistics – 2

Filed under: Performance,Oracle,Troubleshooting — Jonathan Lewis @ 12:40 pm UTC May 22,2013

Some time ago – actually a few years ago – I wrote a note about the hint /*+ gather_plan_statistics */ making some informal comments about the implementation and relevant hidden parameters. I’ve recently discovered a couple of notes from Alexander Anokhin describing the feature in far more detail and describing some of the misleading side effects of the implementaiton. There are two parts (so far): part 1 and part 2.

 

May 17, 2013

UKOUG

Filed under: Uncategorized — Jonathan Lewis @ 2:25 pm UTC May 17,2013

The call for papers is open for Tech 13 – the “server-side” conference of the UKOUG.

The conference was getting so big that we’ve split Apps from Server Tech and will be running the two conferences separately this year. The Server Tech conference will be in Manchester from 2nd to 4th Dec.

The closing date for submissions is Friday 31st May (only 2 weeks !) and confirmation of acceptance given by August.

There is a slightly shorter route to submission (if you don’t want to watch the video on “Why to speak”).

 

 

 

May 16, 2013

Performance Monitoring

Filed under: Uncategorized — Jonathan Lewis @ 4:51 am UTC May 16,2013

Updated – just a quick reminder for next week; I’ll be doing a short webinar next Wednesday comparing the performance monitoring tools Oracle and SQL server provide.

I think I may have broken my record with 6 countries in 6 weeks – so I haven’t been very thorough at updating my blog recently. Just time, before I head off to Heathrow once again, to do a quick advert for the next redgate webinar that I’m doing with Grant Fritchey. This time comparing built-in performance monitoring tools. Details and Registrations at this URL.

I’ll see if I can catch up with a couple of answers while I’m in the airport lounge – but no promises, since the simple act of walking into an airport makes me  feel like falling asleep.

Webinars

Filed under: Advertisements — Jonathan Lewis @ 4:47 am UTC May 16,2013

Reminder:

The webinars on “Smarter Statistics in 11g” are on tomorrow (Friday) at 2:00 pm and 6:00 pm. There’s a waiting list for the 6:00 pm event, so if you’ve signed up but can’t make it please delete your registration. (The event will be repeated on 10th June).  If you want to vote a better time for me to do short webinars there’s a poll at the end of the article.

 

 

I’m about to make a serious move into online webinars, and as a warm-up exercise I’ll be doing a couple of one-hour free events on Friday 17th May.

I’ll be talking through a Powerpoint presentation called “Smarter Statistics in 11g” twice, once at 2:00 pm – 3:00 pm BST, and again at 6:00 pm BST  (12:00 pm - 1:00 pm CDT) . Broadly speaking the first one is for the benefit people from the UK and eastwards, and the second is for the benefit of people from the US and westwards. This is just a trial run, of course, and if it works well I will be doing more of the same, perhaps three times per day to spread across more time zones.

John Goodhue (my O1 sponsor for the USA) is arranging all the mechanical details, and I’ll post links for registration when they become available – we’ll be using GoToWebinar as the supply mechanism, and we’ll be limiting access to 100 people (so if you do register and can’t attend, please remove yourself from the list; if you don’t manage to register for either event, you’ll get another chance later as I plan to repeat each event a few times.)

I’ll also be doing a full day paid event on 23rd May which will be my “Indexing Strategies” tutorial. This first full day event will be timed to suit the American audience – although anyone can register, of course – but we plan to have further events suited to other time zones. The URL for registration is now available – with an option to purchase a 30-day window to the recording of my “Oracle Mechanisms” presentation in Minneapolis.

May 13, 2013

Parse Time

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 6:59 pm UTC May 13,2013

Dominic Brooks published a note recently about some very nasty SQL – originally thinking that it was displaying a run-time problem due to the extreme number of copies of the lnnvl() function the optimizer had produced. In fact it turned out to be a parse-time problem rather than a run-time problem, but when I first read Dominic’s note I was sufficiently surprised that I decided to try modelling the query.

Unfortunately the query had more than 1,000 predicates, (OR’ed together) and some of them included in-lists.  Clearly, writing this up by hand wasn’t going to be a good idea, so I wrote a script to generate both the data, and the query, as follows – first a table to query:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum		id1,
	rownum		id2,
	rownum		id,
	lpad(rownum,10)	v1,
	rpad('x',100)	padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5
;

create index t1_i1 on t1(id1, id2);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

Then a piece of code to write a nasty query:

set pagesize 0
set feedback off
set termout off

spool temp1.sql

prompt select * from t1 where 1 = 2

select
	'or (id1 = ' || rownum || ' and id2 = ' || (rownum + 1) || ')'
from
	t1
where
	rownum <= 750
union all
select
	'or ( id1 =  ' || (rownum + 1000) || ' and id2 in (' || rownum || ',' || (rownum+1) || '))'
from
	t1
where
	rownum <= 250
;

prompt /

spool off

Here’s an example of the text generated by the code – with the parameters set to 5 and 3 respectively (and notice how I’ve rigged the query so that it doesn’t return any data, whatever the optimizer thinks):

select * from t1 where 1 = 2
or (id1 = 1 and id2 = 2)
or (id1 = 2 and id2 = 3)
or (id1 = 3 and id2 = 4)
or (id1 = 4 and id2 = 5)
or (id1 = 5 and id2 = 6)
or ( id1 =  1001 and id2 in (1,2))
or ( id1 =  1002 and id2 in (2,3))
or ( id1 =  1003 and id2 in (3,4))
/

So here’s the plan from the above query:


---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     8 |  1008 |    16   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   8 |   INLIST ITERATOR             |       |       |       |            |          |
|   9 |    TABLE ACCESS BY INDEX ROWID| T1    |     5 |   630 |     7   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | T1_I1 |     5 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID1"=1003)
       filter("ID2"=3 OR "ID2"=4)
   5 - access("ID1"=1002)
       filter((LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND
              ("ID2"=2 OR "ID2"=3))
   7 - access("ID1"=1001)
       filter((LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND ("ID2"=1 OR
              "ID2"=2))
  10 - access(("ID1"=1 AND "ID2"=2 OR "ID1"=2 AND "ID2"=3 OR "ID1"=3 AND
              "ID2"=4 OR "ID1"=4 AND "ID2"=5 OR "ID1"=5 AND "ID2"=6))
       filter((LNNVL("ID1"=1001) OR LNNVL("ID2"=1) AND LNNVL("ID2"=2)) AND
              (LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)))

As you can see, the first five predicates end up in line 10 of the plan with 10 repetitions (5 * 2) of the lnnvl() function. The last three predicates show up in lines 3, 5, and 7 – and the on each line we see two more lnnvl() calls than on the previous – just imagine, then, how many lnnvl() calls the optimizer will have added to the query plan by the time we have 750 occurrences in the inlist iterator (line 8) and 250 occurrences of the slightly complex predicate. Here are the relevant CPU stats (from v$session_stats) from running the generated script on 11.1.0.7, on Windows 32-bit, 2.8GHz CPU:

Name                                           Value
----                                           -----
recursive cpu usage                            1,848
CPU used when call started                     1,854
CPU used by this session                       1,854
DB time                                        1,870
parse time cpu                                 1,847
parse time elapsed                             1,862

Clearly the parse time is extreme – though not as dramatic as in Dominic’s example; but having set up the first draft of the sample code it’s easy enough to change the number of occurrences of each type of predicate, and it’s pretty easy to make longer in-lists in the more complex of the two types of predicate. It’s not too difficult to get an execution plan that mimics Dominic’s in length and time to parse.

It’s not just the parse times that are interesting when you start doing this, by the way – it’s worth playing around to see what happens. It’s probably best to run the query to pull the plans from memory if you want to see the plans, though – if you try using “explain plan” then you start using memory in the SGA for some of the work: in one of my examples I had to abort the instance after a few minutes.

ISS

Filed under: Non-technical — Jonathan Lewis @ 6:45 pm UTC May 13,2013

I’d like to dedicate this posting to fellow Oak Table member Richard Foote, for reasons that the readers we have in common will immediately recognise: http://www.youtube.com/watch?v=KaOC9danxNo

The singer is Canadian astronaut Commander Chris Hadfield who has been tweeting and posting pictures from space – be careful, you may get hooked: https://twitter.com/Cmdr_Hadfield/status/332819772989378560/photo/1

Update:

When I posted the link to the video it had received 1.5M views; less than 24 hours later it’s up to roughly 7M. (And they weren’t all Richard Foote). Clearly the images have caught the imagination of a lot of people. If you have looked at the twitter stream it’s equally inspiring – and not just for the pictures.

 

May 10, 2013

Hakan Factor

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 3:52 pm UTC May 10,2013

Here’s a quick and dirty script to create a procedure (in the SYS schema – so be careful) to check the Hakan Factor for an object. If you’re not familiar with the Hakan Factor, it’s the value that gets set when you use the command “alter table minimize records_per_block;”.

I was prompted to publish this note by an item on the OTN SQL forum describing a problem with partition exchange with a table when there were bitmap indexes in place and the table had been changed to have some extra columns added. (Problem as yet unresolved as I publish).

If you start playing with the Hakan Factor, you’ll find that there are some odd little bugs in what gets stored and how it gets used. (SQL updated to use bitand() to reflect comments below and Karsten Spang’s blog note; also edited following a comment on OTN to show the rest of the spare1 flag bits)


create or replace procedure show_hakan(
	i_table		in	varchar2,
	i_owner		in	varchar2 default user
) as
	m_obj		number(8,0);
	m_flags		varchar2(12);
	m_hakan		number(8,0);
begin

	/* created by show_hakan.sql	*/
	select
		obj#, 
/*
		case 
			when	(spare1 > 5 * power(2,15)) 
				then	(spare1 - 5 * power(2,15))
			when	(spare1 > power(2,17)) 
				then	(spare1 - power(2,17))
			when	(spare1 > power(2,15)) 
				then	(spare1 - power(2,15))
				else	spare1
		end 				hakan
*/
	to_char(
		bitand(
			spare1, to_number('ffff8000','xxxxxxxx')
		),
		'xxxxxxxx'
	)					flags,
	bitand(spare1, 32767)			hakan	-- 0x7fff	
	into
		m_obj,
		m_flags,
		m_hakan
	from
		tab$
	where	obj# in (
			select	object_id 
			from	dba_objects
			where	object_name = upper(i_table)
			and	object_type = 'TABLE'
			and	owner = upper(i_owner)
		)
	;

	dbms_output.put_line(
		'Hakan factor for object ' || 
		m_obj   || ' (' ||
		i_owner || '.' || 
		i_table || ') is ' || 
		m_hakan || ' with flags ' ||
		m_flags
	);
end;
/

drop public synonym show_hakan;
create public synonym show_hakan for show_hakan;
grant execute on show_hakan to public;

 

You’ll notice that I’ve done an “upper()” on the table and owner – that means you’re in trouble if you have created an schemas or tables with mixed-case names (but you wouldn’t do that in a production system, would you?)

Update – A little bug history

One of the odd details of the Hakan factor is that the value it shows is one less than the number of rows that will be stored in a block; and since it looks as if the factor is not allowed to drop to zero, you can’t hack the Hakan factor to force one row per block.

So here’s a (trivial and sub-optimal) piece of code to check current number of rows per block in a simple heap table (assuming the tablespace consists of a single file):

select 
	ct, count(*) 
from	(
		select	dbms_rowid.rowid_block_number(rowid), count(*) ct  
		from	t1
		group by
			dbms_rowid.rowid_block_number(rowid)
	)
group by ct
order by ct
;

Here’s the output of a session, running under 9.2.0.8, cut and pasted from the screen:

SQL> @afiedt.buf

        CT   COUNT(*)
---------- ----------
         9          1
        16          1
SQL> alter table t1 nominimize records_per_block;
SQL> alter table t1 minimize records_per_block;
SQL> execute show_hakan('t1')
Hakan factor for object 48865 (TEST_USER.t1) is: 15
SQL> alter table t1 move;
SQL> @afiedt.buf

        CT   COUNT(*)
---------- ----------
        10          1
        15          1
SQL> alter table t1 nominimize records_per_block;
SQL> alter table t1 minimize records_per_block;
SQL> execute show_hakan('t1')
Hakan factor for object 48865 (TEST_USER.t1) is: 14

Every time you moved the table, 9.2.0.8 (and earlier) used the actual stored value of the Hakan Factor to rebuild the table; but if you regenerated the Hakan Factor the stored value was one less than the actual row count. So if you kept repeating the process the number of rows per block would decrease by one each time and the table would get bigger and bigger.

It’s a silly example – but the real-world relevance was that a direct path insert behaved differently from a normal insert and this could result in a significant amount of wasted space if you were doing bulk loads in your overnight batch; so the code changed in 10g to make the normal and direct path inserts consistent with each other, but the change went the wrong way and, as a side effect, you get one more row per block than suggested by the Hakan Factor – and you can’t trick the Hakan factor into enforcing one row per block any more.

May 9, 2013

Clustering_factor

Filed under: CBO,Indexing,Oracle — Jonathan Lewis @ 8:14 am UTC May 9,2013

Cost Based Oracle – Fundamentals (November 2005)

But the most interesting function for our purposes is sys_op_countchg(). Judging from its name, this function is probably counting changes, and the first input parameter is the block ID portion (object_id, relative file number, and block number) of the table’s rowid, so the function is clearly matching our notional description of how the clustering_factor is calculated. But what is that 1 we see as the second parameter?

When I first understood how the clustering_factor was defined, I soon realized that its biggest flaw was that Oracle wasn’t remembering recent history as it walked the index; it only remembered the previous table block so that it could check whether the latest row was in the same table block as last time or in a new table block. So when I saw this function, my first guess (or hope) was that the second parameter was a method of telling Oracle to remember a list of previous block visits as it walked the index.

And finally, Oracle Corp. had implemented an official interface to the second parameter of sys_op_countchg() – provided you install the right patch – through a new table (or schema, or database) preference type available to the dbms_stats.set_table_prefs() procedure.

I’ve been meaning to write this post for two or three months, ever since Sean Molloy sent me an email about short blog note from Martin Decker describing Bug 13262857  Enh: provide some control over DBMS_STATS index clustering factor computation. Unfortunately I’ve not yet had time to investigate the patch, but I don’t think I need to any more because Richard Foote has written it up in his latest blog post.

Read Richard’s post – it’s important.

Update 10th May

Richard’s post has, unsurprisingly, produced a buzz of excitement in his reader – and started up the discussion of how best to use this capability; so here’s another quote from the book (p.111 – available in the download of chapter 5):

So using Oracle’s own function for calculating the clustering_factor, but substituting the freelists value for the table, may be a valid method for correcting some errors in the clustering_factor for indexes on strongly sequenced data. (The same strategy applies if you use multiple freelist groups—but multiply freelists by freelist groups to set the second parameter.)

Can a similar strategy be used to find a modified clustering_factor in other circumstances? I think the answer is a cautious “yes” for tables that are in ASSM tablespaces. Remember that Oracle currently allocates and formats 16 new blocks at a time when using automatic segment space management (even when the extent sizes are very large, apparently). This means that new data will be roughly scattered across groups of 16 blocks, rather than being tightly packed.

Calling Oracle’s sys_op_countchg() function with a parameter of 16 could be enough to produce a reasonable clustering_factor where Oracle currently produces a meaningless one. The value 16 should, however, be used as an upper bound. If your real degree of concurrency is typically less than 16, then your actual degree of concurrency would probably be more appropriate.

Whatever you do when experimenting with this function—don’t simply apply it across the board to all indexes, or even all indexes on a particular table. There will probably be just a handful of critical indexes where it is a good way of telling Oracle a little more of the truth about your system—in other cases you will simply be confusing the issue.

Note particularly the comments about how the best value depends on the data in the indexed columns, the table configuration, and the degree of concurrency - you don’t necessarily want to use the same value for every index on a given table. That’s a shame, since Oracle has defined the interface as a TABLE preference, so if you set it then you get the same for every index. Despite this, if you’re prepared to put in a little control work, it does mean that you can use an official Oracle mechanism to play the game I was suggesting in the book – for each “special” index, set the preference, collect the stats, then clear the preference.

May 5, 2013

SLOB

Filed under: Oracle,Performance — Jonathan Lewis @ 6:21 pm UTC May 5,2013

Anyone who has used Kevin Closson’s “Silly Little Oracle Benchmark” will want to know about his significantly updated SLOB2.

May 2, 2013

v$lock

Filed under: Locks,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 7:03 pm UTC May 2,2013

The problem of slow queries on v$lock just came up again on the OTN database forum, so I thought I’d better push out a post that’s been hanging around on my blog for the last few months. This is actually mentioned in MOS in note 1328789.1: “Query Against v$lock Run from OEM Performs Slowly” which points out that it is basically a problem of bad statistics and all you have to do is collect the stats.

However, because the view is a messy union and join of several dynamic performance views sitting on top of a load of x$ structures, the advisory from MOS is to call the procedure  dbms_stats.gather_fixed_objects_stats. This is not a nice thing to do on a busy production system, especially if it has a large number of users and a large shared pool – and doing it at a representative run-time is important if you’re going to do it at all.

There is an alternative. Here’s the basic execution plan (on my 11.2.0.2) for “select * from v$lock”:

-----------------------------------------------------
| Id  | Operation              | Name       | Rows  |
-----------------------------------------------------
|   0 | SELECT STATEMENT       |            |     1 |
|   1 |  HASH JOIN             |            |     1 |
|   2 |   VIEW                 | GV$_LOCK   |    10 |
|   3 |    UNION-ALL           |            |       |
|   4 |     FILTER             |            |       |
|   5 |      VIEW              | GV$_LOCK1  |     2 |
|   6 |       UNION-ALL        |            |       |
|   7 |        FIXED TABLE FULL| X$KDNSSF   |     1 |
|   8 |        FIXED TABLE FULL| X$KSQEQ    |     1 |
|   9 |     FIXED TABLE FULL   | X$KTADM    |     1 |
|  10 |     FIXED TABLE FULL   | X$KTATRFIL |     1 |
|  11 |     FIXED TABLE FULL   | X$KTATRFSL |     1 |
|  12 |     FIXED TABLE FULL   | X$KTATL    |     1 |
|  13 |     FIXED TABLE FULL   | X$KTSTUSC  |     1 |
|  14 |     FIXED TABLE FULL   | X$KTSTUSS  |     1 |
|  15 |     FIXED TABLE FULL   | X$KTSTUSG  |     1 |
|  16 |     FIXED TABLE FULL   | X$KTCXB    |     1 |
|  17 |   MERGE JOIN CARTESIAN |            |   100 |
|  18 |    FIXED TABLE FULL    | X$KSUSE    |     1 |
|  19 |    BUFFER SORT         |            |   100 |
|  20 |     FIXED TABLE FULL   | X$KSQRS    |   100 |
-----------------------------------------------------

Note, particularly, the Cartesian merge join at line 17, which assumes there will be one row from v$session (x$ksuse) joined to 100 rows from v$resource (x$ksqrs – the structure used to represent any resources that you want to lock). The big UNION ALL is then all the different types of locks (enqueues) that you might attach to a resource. The estimates relating to these two structures are the most significant problem – v$session always has FAR more than one row in it, and v$resource isn’t small: in my little system the Cartesian join produced about 325,000 rows, and that was just after starting up the database.

Having identified a couple of critical tables, I decided to see what would happen if I just collected stats on these two objects rather than doing the whole system, and the following little piece of pl/sql did what I wanted:

begin
        dbms_stats.gather_table_stats('SYS','x$ksuse',method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('SYS','x$ksqrs',method_opt=>'for all columns size 1');
end;
/

The effect of the stats was to change the plan to the following which, while it might not be the absolute best in all cases, is certainly better than the previous one (it’s possible that you may also find that it helps to collect stats on x$ksqeq (the “generic enqueues” structure) which is likely to be the next most significant in terms of number of rows):

------------------------------------------------------
| Id  | Operation               | Name       | Rows  |
------------------------------------------------------
|   0 | SELECT STATEMENT        |            |   133 |
|   1 |  HASH JOIN              |            |   133 |
|   2 |   HASH JOIN             |            |    10 |
|   3 |    VIEW                 | GV$_LOCK   |    10 |
|   4 |     UNION-ALL           |            |       |
|   5 |      FILTER             |            |       |
|   6 |       VIEW              | GV$_LOCK1  |     2 |
|   7 |        UNION-ALL        |            |       |
|   8 |         FIXED TABLE FULL| X$KDNSSF   |     1 |
|   9 |         FIXED TABLE FULL| X$KSQEQ    |     1 |
|  10 |      FIXED TABLE FULL   | X$KTADM    |     1 |
|  11 |      FIXED TABLE FULL   | X$KTATRFIL |     1 |
|  12 |      FIXED TABLE FULL   | X$KTATRFSL |     1 |
|  13 |      FIXED TABLE FULL   | X$KTATL    |     1 |
|  14 |      FIXED TABLE FULL   | X$KTSTUSC  |     1 |
|  15 |      FIXED TABLE FULL   | X$KTSTUSS  |     1 |
|  16 |      FIXED TABLE FULL   | X$KTSTUSG  |     1 |
|  17 |      FIXED TABLE FULL   | X$KTCXB    |     1 |
|  18 |    FIXED TABLE FULL     | X$KSUSE    |   252 |
|  19 |   FIXED TABLE FULL      | X$KSQRS    |  1328 |
------------------------------------------------------

Thanks to Timur Akhmadeev who recently published a note pointing out that you could collect stats on individual X$ tables. Do make sure you test this on your specific version of Oracle, though, and don’t use the production system as your first test case.

April 29, 2013

MV Refresh

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 2:13 pm UTC Apr 29,2013

Materialized views open up all sorts of possibilities for making reporting more efficient – but at the same time they can introduce some “interesting” side effects when you start seeing refreshes taking place. (Possibly one of the most dramatic surprises appeared in the upgrade that switched many refreshes into “atomic” mode, changing a “truncate / append” cycle into a massively expensive “delete / insert” cycle).

If you want to have some ideas of the type of work that is involved in the materialized view “fast refresh”, you could look at a recent pair of articles by Alberto Dell’Era on (very specifically) outer join materialized views (which a link back to a much older article on inner join materialized view refresh):

 

 

April 16, 2013

systimestamp

Filed under: Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 12:46 am UTC Apr 16,2013

For your entertainment – there’s nothing up my sleeves, this was a simple cut-n-paste after real-time typing with no tricks:

20:39:51 SQL> create table t1 (t1 timestamp);

Table created.

20:39:55 SQL> insert into t1 values(systimestamp);

1 row created.

20:39:59 SQL> select t1 - systimestamp  from t1;

T1-SYSTIMESTAMP
---------------------------------------------------------------------------
+000000000 04:59:50.680620

1 row selected.

20:40:08 SQL>

My laptop runs Oracle so quickly that it took only 4 seconds for 5 hours to elapse !
11.2.0.3 on 64-bit Linux – the client is running with TZ=EST5EDT, while the server is running UK Time (currently BST (GMT+1))

Comments available on MOS: 340512.1 Timestamps & time zones – Frequently Asked Questions
Another MOS note, thanks to Jure Bratina in the comments: 227334.1 – “Dates & Calendars – Frequently Asked Questions” in the question

Update:

As Niall quotes in the comments: “times are difficult”.
Oracle supplies three timestamps: systimestamp, localtimestamp, and current_timestamp. (For reasons of consistency, only one of uses an underscore ;) )
Oracle also supplies three timestamp types: timestamp, timestamp with time zone, and timestamp with local time zone.
Oracle also supplies two timezone calls: dbtimezone, and sessiontimezone

If you need to figure out all the details of how these things hang together, I think you need to set your machine timezone to something that isn’t UTC (or GMT as I still tend to call it), then use two separate machines as clients, with their timezones set to two other timezones (again avoiding UTC).

I’ve done a few experiments but without being so rigorous in my settings – my machine was running on GMT, but I opened a (UNIX) session and set the session time zone to EST5EDT to start the database, while running other (UNIX) session with different TZ settings. The reason I should have restarted the machine in a different timezone is that Oracle “normalises” some timestamps to UTC – which means there are cases when I can’t be certain whether the stored value is in UTC because it has been normalised or because it simply was the actual machine time.

So here’s a little experiment (11.2.0.2, instance started in EST5EDT, unix session running in UTC, connecting across the network to the server).

select
        current_timestamp,
        localtimestamp,
        systimestamp
from
        dual
;

CURRENT_TIMESTAMP
---------------------------------------------------------------------------
LOCALTIMESTAMP
---------------------------------------------------------------------------
SYSTIMESTAMP
---------------------------------------------------------------------------
17-APR-13 11.37.10.870658 AM +01:00
17-APR-13 11.37.10.870658 AM
17-APR-13 06.37.10.870554 AM -04:00

Notes:
systimestamp reflects the instance timestamp – which is 5 hours earlier than the session timestamp.
systimestamp returns a timestamp with time zone, not just a timestamp
localtimestamp and current_timestamp show the client time, but localtimestamp doesn’t show the timezone, current_timestamp does (the +1:00 appears because Daylight Saving Time (British Summer Time) is active so my session is one hour ahead of UTC, while the database is 4 hours behind.)

Another quick test:

create table t1 (
        t0              timestamp,
        tz              timestamp with time zone,
        tl              timestamp with local time zone,
        ts_type         varchar2(20)
)
;

insert into t1 values(
        systimestamp, systimestamp, systimestamp,
        'sys Timestamp'
);

commit;

select * from t1;

T0
---------------------------------------------------------------------------
TZ
---------------------------------------------------------------------------
TL                                                                          TS_TYPE
--------------------------------------------------------------------------- --------------------
17-APR-13 06.44.04.353489 AM
17-APR-13 06.44.04.353489 AM -04:00
17-APR-13 11.44.04.353489 AM                                                sys Timestamp

select
        dump(t0,16),
        dump(tz,16),
        dump(tl,16),
        ts_type
from
        t1
;
DUMP(T0,16)
------------------------------------------------------------------------------------------------------------------------
DUMP(TZ,16)
------------------------------------------------------------------------------------------------------------------------
DUMP(TL,16)
------------------------------------------------------------------------------------------------------------------------
TS_TYPE
--------------------
Typ=180 Len=11: 78,71,4,11,7,2d,5,15,11,d0,68
Typ=181 Len=13: 78,71,4,11,b,2d,5,15,11,d0,68,10,3c
Typ=231 Len=11: 78,71,4,11,b,2d,5,15,11,d0,68
sys Timestamp

Notes:
T0 – the timestamp column, has the instance timestamp in it – but doesn’t have any timezone information stored; the raw dump show the value 6:44:04 (7, 2d, 5 – convert from hex and substract one). Anyone on ANY timezone will see their output showing 6:44:04 if they select this column.

TZ – the timestamp with time zone column, has the instance timestamp, but has stored it as (b, 2d,5 – 11:44:04) with time zone information (10,3c) that allows the session to know what “global” moment the information really represents and the location (or, rather, time zone) where is was entered.

TL – the timestamp with local time zone, has the instance timestamp, but has stored it as (b, 2d, 5 – 11:44:04) with NO timezone information. So the output when you query this column is adjusted to suit the local timestamp. It’s the right “global” moment, and it displays as the relevant local time. But, as a penalty, it’s lost the information about where (in which time zone) it was entered.

I think that examination of the content of the raw dumps of the three different types may help you understand why you need to store timestamps in a column type that includes a time zone – if you don’t then you lose some information, and time-based arithmetic will give you some surprises if your application crosses timezones.

Next Issue:

Indexing time (though the link in the comments below to Tony Hasler’s blog probably gives you all the answers you need), and an Oracle design error that I’ve visited before.

April 13, 2013

Deadlocks

Filed under: deadlocks,Locks,Oracle,Troubleshooting — Jonathan Lewis @ 11:36 am UTC Apr 13,2013

Here’s a deadlock graph that might cause a little confusion:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
 Resource Name          process session holds waits  process session holds waits
 TX-001a0015-00014787        34      90     X             32       3           S
 TX-00190008-0000601b        32       3     X             34      90           S

 session 90: DID 0001-0022-00000327      session 3: DID 0001-0020-000009E9
 session 3: DID 0001-0020-000009E9       session 90: DID 0001-0022-00000327

 Rows waited on:
  Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
  (dictionary objn - 199909, file - 6, block - 276654, slot - 0)
  Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
  (dictionary objn - 199909, file - 6, block - 433952, slot - 0)

Both sessions are holding X and waiting S on a TX lock. There are several well-known reasons why you might see a share (mode 4) lock on a transaction table slot: collisions on bitmap indexes, unique indexes, index organized tables, or referential integrity checks are the commonest “external” examples, and problems with interested transaction lists (ITLs) or freelists are the common “internal” ones – so the presence of the waits in share mode shouldn’t, of themselves, a source of confusion.

The confusion is in the reported rowids. If you try to interpret them as real rowids you may be unlucky and discover that they seem to be related to the SQL reported for the deadlocked sessions when really the information they hold is garbage. (Just to avoid confusion, let me make it clear that there are cases where the rowids reported definitely WILL be garbage; on the other hand, there may be some scenarios where the rowids are relevant – although I haven’t done any exhaustive check to see if there really are such scenarios.)

When I see a deadlock graph on transaction locks and the waits are for S mode I tend to assume that the information about the rows waited on is probably misleading; when the slot number for the rowid is zero this increases my confidence that the rowid is rubbish. (Zero is a legal value for a rowid slot, of course, so a zero doesn’t prove that the rowid is rubbish, it’s just a coincidence that allows me to continue following a hypothesis.)

The problem is that Oracle doesn’t waste resources tidying up after itself, and in the case of deadlock graphs this laziness shows up in the rowids reported. The trace is simply reporting whatever happens to be in the row_wait_obj#, row_wait_file#, row_wait_block# and row_wait_row# columns of v$session; and if the waiting process hasn’t updated these columns with current row information you could be looking at the details of the last row (or block) that the session waited for. Here’s the description of a test to demonstrate the behaviour:

create table t1 (n1 number, n2 number);
insert into t1 values(1,1);
create unique index t1_i1 on t1(n1);
create unique index t1_i2 on t1(n2);

session 1:	insert into t1 values(2,11);
session 2:	insert into t1 values(3,21);
session 1:	insert into t1 values(4,21);
session 2:	insert into t1 values(5,11);

With this table, and sequence of events following it, session 1 raised a deadlock error, and dumped the following trace:

Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-000a0021-0000c4aa        16      93     X             13      90           S
TX-00020020-0000fdb9        13      90     X             16      93           S
session 93: DID 0001-0010-00000057	session 90: DID 0001-000D-000000F2
session 90: DID 0001-000D-000000F2	session 93: DID 0001-0010-00000057
Rows waited on:
Session 90: obj - rowid = 00000009 - AAAAAJAABAAAQJcAAA
  (dictionary objn - 9, file - 1, block - 66140, slot - 0)
Session 93: obj - rowid = 0002E7DC - AAAufaAAFAAAAAJAAA
  (dictionary objn - 190428, file - 5, block - 9, slot - 0)

Object 9 is the I_FILE#_BLOCK# index in the data dictionary – and session 90 is definitely not doing anything with that object in this transaction.
Object 190428 is another table in the test schema, but session 93 didn’t access it in this transaction, and the block referenced is the segment header block, not a block that could hold a row.

In fact, just before I started the sequence of inserts I ran this query from a third session (connected as sys) with the following results:

select
	sid,
	row_wait_obj#, row_wait_file#, row_wait_block#, row_wait_row#
from
	v$session
where
	sid in (90,93)
;

       SID ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW#
---------- ------------- -------------- --------------- -------------
        90             9              1           66140             0
        93        190428              5               9             0

The “rows” reported for the deadlock simply echoed the values that were already in the row_wait columns before the test started. (The sessions were waiting on “SQL*Net message from client” at the time.)

Footnote:

Session 90 created the tables and indexes, that’s probably why it happened to have an outstanding reference to the i_file#_block# index.
Session 93 had just run a script to drop all the objects in the schema, which may explain why it happened to have an outstanding reference to a segment header block
If the row_wait_obj# had been set to -1 for either session then the deadlock graph would have reported “No row” for that session.

April 7, 2013

DML Tracking

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 10:04 pm UTC Apr 7,2013

You’ve probably seen questions on the internet occasionally about finding out how frequently an object has been modified. The question is a little ambiguous – does it mean how much change has occurred, or how many DML statements have been executed; either may be an interesting measure. Of course, Oracle gave us a method of answering the first question a long time ago: v$segstat (or v$segment_statistics if you don’t mind doing the join) and the resulting content in the AWR or Statspack reports:

Segments by DB Blocks Changes   DB/Inst: XXXXXXXX/XXXXXXXX  Snaps: 85563-85564
-> % of Capture shows % of DB Block Changes for each top segment compared
-> with total DB Block Changes for all segments captured by the Snapshot

           Tablespace                      Subobject  Obj.      DB Block    % of
Owner         Name    Object Name            Name     Type       Changes Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
XXXXXXXXX  DATA       MLOG$_xxxxxxxxxxxxxx            TABLE    1,144,112   22.16
XXXXXXXXX  DATA       PK_xxxxxxxxxxxxxxxxx            INDEX      614,256   11.90
XXXXXXXXX  DATA       xxxxxxxxxxxxxxx                 TABLE      490,080    9.49
XXXXXXXXX  IDX        IDX_xxxxxxxxxxxxxxxx            INDEX      353,392    6.84
XXXXXXXXX  IDX        PK_yyyyyyyyyyyyyyy              INDEX      273,664    5.30
          -------------------------------------------------------------

There’s still a little ambiguity in this report (in the right circumstances you can change multiple rows in a block while only reporting a single block change), but it’s a very good indicator of the volume of change to an object; but is the million or so block changes due to one SQL statement or 1,000,000 SQL statements ?

With 11.2.0.2 there is an (undocumented) option for finding the answer to that question. Since it’s undocumented I wouldn’t use it on production unless I was really desperate, and even then I’d do the standard “check with Oracle support” first; however I might use it on a test system if I were do something like running tests of a new version of an overnight suite of batch jobs.

You can enable “DML frequency tracking” which seems to count the number of DML statements that modify an object or, to be more accurate, a table. It doesn’t do anything to identify indexes, although it will report index organized tables (using the table name); it won’t identify partition-level DML individually for partitioned tables, it will simply record the DML as being relevant to the table.

To enable the feature you modify hidden parameter _dml_frequency_tracking and then query dynamic performance view v$object_dml_frequencies.


alter system set "_dml_frequency_tracking"=true;

select 
	object, working, slot0, slot1, slot2, slot3, slot4 
from	
	v$object_dml_frequencies
order by
	working + slot0 + slot1 + slot2 + slot3
;

The view has 25 columns for numbered “slots”, and a “working” column. Each slot represents a 15 minute interval (set by parameter _dml_frequency_tracking_slot_time) and only 4 slots are maintained by default (set by parameter _dml_frequency_tracking_slots). The values roll from slot to slot every 15 minutes, and the “working” column keeps track of objects that have suffered DML since the last roll-over. Unfortunately there doesn’t seem to be any obvious way of tracking when a roll-over takes place, so you can’t tell how long the “working” column has been active for.

If you have the default 4 slots set, then “working” and slots 0 to 2 are populated with 15 minutes worth of information each, and slot3 seems to be the accumulated history of all accesses since you enabled the feature.

The view sits on top of x$ksxm_dft, but there are no interesting extra columns in the x$ that aren’t displayed in the v$ and gv$. (It’s an interesting little quirk that the v$ doesn’t following the usual pattern of being a view of the gv$ restricted to the current instance.) You can see the information in the shared pool (by querying v$sgastat) in an area named “dml frequency” – although the figure reported the last time I checked was wrong by a factor of roughly 3 – show about 8KB when I was expecting roughly 24KB.

I haven’t spent a lot of time investigating the feature yet, so there are probably a few questions and boundary conditions to test before using it in anger; but there don’t appear to be any special latches associated with the counters (so possibly the results would be under-count in a highly concurrent system, but maybe the results are collected under a mutex).

Related parameters

_dml_frequency_tracking (false) Control DML frequency tracking — can be set at system level on demand
_dml_frequency_tracking_advance (true) Control automatic advance and broadcast of DML frequencies — ?
_dml_frequency_tracking_slot_time (15) Time length of each slot for DML frequency tracking — needs restart
_dml_frequency_tracking_slots ( 4) Number of slots to use for DML frequency tracking — needs restart

Next Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 1,397 other followers