I wrote a note for my website some time ago about the dangers of using scripts that ‘looked cool’ by virtue of their reference to v$ and x$ objects. So I was quite amused to get the following script emailed to me:
select
distinct
ws.osuser locked, ws.process,
hs.osuser holding, hs.process,
obj$.name, sw.seconds_in_wait time
from
v$session ws,
v$session hs,
v$lock wl,
v$lock hl,
obj$,
v$session_wait sw
where
ws.sid = sw.sid
and ws.row_wait_obj# = obj$.obj#
and ws.row_wait_obj# > 0
and sw.seconds_in_wait > 0
and wl.sid = ws.sid
and wl.id1 = hl.id1
and wl.id2 = hl.id2
and hl.block = 1
and wl.request != 0
and hl.sid = hs.sid
;
With the script came the request for assistance in tuning it because: “at the client site they’ve managed to use OEM tuning advisor on 10.2.0.2 to get the time down from 40 seconds to eight seconds, but on my system I don’t get the same execution plan and it behaves quite reasonably”.
The functional description of the statement was that it would show: “who was locking what and who was waiting for this lock”, with the comment that the statement had originally been acquired from the Internet.
So here’s the generic advice for tuning any SQL:
- understand what the code is supposed to achieve
- understand the data
- determine the rational order of work
- discover if Oracle can follow that order automatically
- make Oracle follow that order (if necessary)
Of course, there are numerous subtle points behind this stark list of actions – but it does give you the broad outline of a sensible strategy. But there’s a rule zero for any SQL you find on the Internet: don’t use it until you’ve worked out what it’s doing.
If you apply these rules to this particular statement you can immediately start to pick up all sorts of details.
First – have a look at v$session: in 10g most of the information from v$session_wait is built into v$session (the underlying x$ksusecst is part of the same C-structure as x$ksuse) so there’s a completely redundant join in the query.
Second – what’s that distinct doing? There are times when you look at the intent of a query and know that the distinct is only there because the original author hadn’t quite got the code right and needed to get rid of unexpected duplicates. That leads us to the point that in the self-join of v$lock, the join on lock type is missing – in this case it’s not likely to make a difference, but it is a mistake.
Third – v$lock is a very messy object. Its definition in 10.2.0.3 includes a 10 table union all followed by a join to x$ksuse (v$session again) and x$ksqrs: so it might be better if we avoided that self-join – and we could even try to get clever with the direct link to v$session ! In fact, if this is a piece of code that’s supposed to run as efficiently as possible when we see a problem, we might even take the whole thing down to the x$ objects.
Fourth – dynamic performance views are not guaranteed to give us self-consistent data (it’s in-memory, there’s no consistent read mechanism) – so if we look at v$lock and then track to v$session we might get to v$session at a moment when the lock has disappeared and the object number and seconds in wait have changed, so the tests for seconds_in_wait and row_wait_obj# might lose some useful data – or even give some misleading data. In fact, whatever we do with this query, there is some chance that some of the results will be misleading because of the consistency problem, and that’s a point that we should make clear in our documentation for the script.
Finally – we need to consider the data volumes. I rarely have more than 20 sessions on my laptop, but some of my clients have several thousand connected sessions with a few hundred active. Does this tell me anything about a general-purpose execution path; and since I’m querying something as variable as the dynamic performance views, should I stack my query with hints ?
The purpose of the query is to find out how locks are causing problems, so we probably ought to start the query at v$lock – to find out if there are any locks to worry about, then add value through v$session and obj$.
Since v$lock is such a messy view, and we’re going to join it to itself, it would probably be a good idea to take advantage of subquery factoring to materialize the smallest result set before joining to the rest of the views. We might start with something like the following:
with locks as (
select /* materialize */
sid,
type, id1, id2,
block, request, ctime
from
v$lock
where
block = 1 -- blockers
or request != 0 -- waiters
)
select
/*+ qb_name(inline) */
wss.sid waiting_session,
wss.process waiting_process,
wss.osuser waiting_user,
bss.sid blocker_session,
bss.process blocker_process,
bss.osuser blocker_user,
blk.ctime blocker_hold_time,
wlk.ctime waiting_time,
blk.type blocker_type,
blk.id1 blocker_id1,
blk.id2 blocker_id2,
wss.row_wait_obj# waiting_obj
from
locks blk,
locks wlk,
v$session bss,
v$session wss
where
blk.block = 1
and wlk.request != 0
and wlk.type = blk.type
and wlk.id1 = blk.id1
and wlk.id2 = blk.id2
and bss.sid = blk.sid
and wss.sid = wlk.sid
)
I’ve included a /*+ materialize */ hint to force Oracle to generate a (GTT) global temporary table to hold the subset of data I need from v$lock. You’ll also notice that I’ve given the query a slightly surprising query block name – which I will explain shortly.
I haven’t bothered to join this view to obj$ (as the original query does) because it’s a waste of resources – remember, this is supposed to be a low-impact query that runs when the system is in trouble! An object isn’t going to change its id, so if this query finds anything interesting, I’ll use the reported object id to get the object name from a follow-up query.
On the the other hand it might not cost much to do the join. But which object id do I want, the one from v$lock (if it’s a lock of type ‘TM’) or the one from v$session (for all other lock types) ? Bear in mind, by the way, that the lack of read-consistency may mean that the object id in v$session could be the wrong one when the lock type shows ‘TM’.
Clearly, I need the following join predicate:
obj$.obj# = decode(blocker_type,
'TM',blocker_id1,
waiting_obj
)
UNfortunately this has to be turned into an outer join because the object id in v$session will be -1 (which doesn’t exist as an object) if there is no guilty object. And you can’t do outer joins to two different tables, or you get Oracle error:
ORA-01417: a table may be outer joined to at most one other table
And that’s why I had the funny query block name above. I’ve folded that query into an inline view before joining to the obj$ table. Here’s the final query:
select
/*+
leading(@sel$03ADD216 blk@inline wlk@inline)
use_hash(@sel$03ADD216 wlk@inline)
no_swap_join_inputs(@sel$03ADD216 wlk@inline)
*/
lock_view.*,
obj$.name,
obj$.subname -- plus a decode() for type ?, plus owner ?
from (
with locks as (
select /* materialize */
sid,
type, id1, id2,
block, request, ctime
from
v$lock
where
block = 1
or request != 0
)
select
/*+ qb_name(inline) */
wss.sid waiting_session,
wss.process waiting_process,
wss.osuser waiting_user,
bss.sid blocker_session,
bss.process blocker_process,
bss.osuser blocker_user,
blk.ctime blocker_hold_time,
wlk.ctime waiting_time,
blk.type blocker_type,
blk.id1 blocker_id1,
blk.id2 blocker_id2,
wss.row_wait_obj# waiting_obj
from
locks blk,
locks wlk,
v$session bss,
v$session wss
where
blk.block = 1
and wlk.request != 0
and wlk.type = blk.type
and wlk.id1 = blk.id1
and wlk.id2 = blk.id2
and bss.sid = blk.sid
and wss.sid = wlk.sid
) lock_view,
obj$
where
obj$.obj#(+) = decode(blocker_type,
'TM',blocker_id1,
waiting_obj
)
;
I’ve had to include a couple of hints to get the correct behaviour inside the (non-mergeable) lock_view – without the hints the query crashed with an interesting ORA-00600 error when I tried to generate an execution plan for it.
You’ll notice that I’ve had to use the method I decribed in an earlier post to discover the name of a transformed query block so that I could make sure that the optimizer started by doing a hash join that used the blockers set (which is likely to be the smaller set) as the build table and the waiters set as the probe table.
In theory I should have included a load more hints to make sure that the joins into v$session (or rather x$ksuse and x$ksled) followed the paths I wanted – but I got a bit lazy, so I’ve left that as an exercise to the reader. In my case, the optimizer used a nested loop all the way down – but for a big system it is possible that a nested loop into the blockers followed by a hash join into the waiters might be more appropriate.
The final execution plan – excluding the bit that generates the locks global temporary table:
-------------------------------------------------------------------------------
| Id | Operation | Name | Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 00:00:01 |
| 2 | VIEW | | 00:00:01 |
...
| 23 | NESTED LOOPS | | 00:00:01 |
| 24 | NESTED LOOPS | | 00:00:01 |
| 25 | NESTED LOOPS | | 00:00:01 |
| 26 | NESTED LOOPS | | 00:00:01 |
|* 27 | HASH JOIN | | 00:00:01 |
|* 28 | VIEW | | 00:00:01 |
| 29 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661A_73691C | 00:00:01 |
|* 30 | VIEW | | 00:00:01 |
| 31 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661A_73691C | 00:00:01 |
|* 32 | FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) | 00:00:01 |
|* 33 | FIXED TABLE FIXED INDEX | X$KSLED (ind:2) | 00:00:01 |
|* 34 | FIXED TABLE FIXED INDEX | X$KSUSE (ind:1) | 00:00:01 |
|* 35 | FIXED TABLE FIXED INDEX | X$KSLED (ind:2) | 00:00:01 |
| 36 | TABLE ACCESS BY INDEX ROWID | OBJ$ | 00:00:01 |
|* 37 | INDEX UNIQUE SCAN | I_OBJ1 | 00:00:01 |
-------------------------------------------------------------------------------
It’s probably about as efficient as it gets.
The only nasty thing about it is that it generates a global temporary table – so if the system is jammed in some way on the temporary tablespace, or temp space allocation lock, this query could just end up joining the jam (in which case you need to inline the locks subquery twice, and specify a /*+ no_merge */ hint each time).
I’m not going to guarantee this query – it’s just something I put together in 20 minutes. I think it’s correct and sensible, but I haven’t tested it properly. I still can’t think why the original query had a distinct in it, and that worries me a bit – have I overlooked something ?
I imagine Asimov would be proud of your Zeroth law!
Comment by Jeff Moss — July 2, 2007 @ 6:26 am UTC Jul 2,2007 |
I recognize the performance problem on v$ and also v$lock views. Now I always use my rewrite of Oracle’s ?/rdbms/admin/utllockt.sql, this shows a tree view of all blocking locks and their dependencies. Blockers & waiters need to be installed (?/rdbms/admin/catblock.sql). It runs fast and I’ve used it without trouble on 9i and 10g, but that’s no guarantee it will be OK on your database
WITH locks_temp AS
(SELECT /*+ MATERIALIZE */
*
FROM dba_locks)
, lock_holders1 AS
(
SELECT /*+ MATERIALIZE */
w.session_id waiting_session
, h.session_id holding_session
, w.lock_type lock_type
, h.mode_held mode_held
, w.mode_requested mode_requested
, w.lock_id1 lock_id1
, w.lock_id2 lock_id2
FROM locks_temp w
, locks_temp h
WHERE h.blocking_others = ‘Blocking’
AND h.mode_held != ‘None’
AND h.mode_held != ‘Null’
AND w.mode_requested != ‘None’
AND w.lock_type = h.lock_type
AND w.lock_id1 = h.lock_id1
AND w.lock_id2 = h.lock_id2)
, lock_holders2 AS
(
SELECT /*+ MATERIALIZE */
holding_session waiting_session
, 0 holding_session
, ‘None’ lock_type
, ” mode_held
, ” mode_requested
, ” lock_id1
, ” lock_id2
FROM lock_holders1
MINUS
SELECT waiting_session
, 0 waiting_session
, ‘None’ lock_type
, ” mode_held
, ” mode_requested
, ” lock_id1
, ” lock_id2
FROM lock_holders1)
, lock_holders_all AS
(SELECT *
FROM lock_holders1
UNION ALL
SELECT *
FROM lock_holders2)
SELECT RPAD (’ ‘, 3 * (LEVEL – 2), ‘ ‘)|| decode(level,1,”,’|–’) || l.waiting_session sid
, l.lock_type
, s.username username
, s.machine machine
, s.osuser osuser
, NVL (s.module, s.program) program
, ‘alter system kill session ‘||””||s.sid||’,'||s.serial#||””||’;’ kill
FROM lock_holders_all l
, v$session s
WHERE s.SID = l.waiting_session
CONNECT BY PRIOR l.waiting_session = l.holding_session
START WITH l.holding_session = 0
ORDER SIBLINGS by s.sid
Comment by bstav — July 2, 2007 @ 6:48 am UTC Jul 2,2007 |
Excellent post! I would like to see more of these detailed query walk throughs. They give an interesting insight into some of the complexities of writing efficient queries.
Possible display error – I think your query hints are in the incorrect format.
Best regards.
mathew butler
Comment by Mathew Butler — July 3, 2007 @ 8:02 am UTC Jul 3,2007 |
Hi Jonathan,
Thanks for this analysis. Your comment about the read consistency of v$ views got me to thinking. As you noted, some of these are very complex structures. Does the AWR snapshot provided in 10G take care of this? In other words, it may not pragmatically make a difference, but if the snapshot procedure selects from v$sql, and then later from v$sql_plan, the contents of the two views and their links may not correspond.
As I noted, it probably isn’t an issue, but I wondered if Oracle accounts for this somehow?
Thanks,
Steve
Comment by Steve Howard — July 3, 2007 @ 3:37 pm UTC Jul 3,2007 |
Mathew, if you’re thinking of the missing “+” from the /*+ materialize */ hint, I left that out to check that the view subquery would materialize with or without the hint – then forgot to put it back in.
Steve, I doubt if the AWR snapshot can do anything about it, but I’d have to walk through the snapshot SQL to check.
Comment by Jonathan Lewis — July 3, 2007 @ 5:16 pm UTC Jul 3,2007 |
[...] another top quality post, Jonathan reiterates the need to be wary of downloaded SQL and highlights the importance of understanding what you’re about to run. Continuing on the vein of caution and wariness, Niall Litchfield says just be careful out [...]
Pingback by Log buffer 52 - a carnival of the vanities for dbas « OraStory — July 6, 2007 @ 6:45 pm UTC Jul 6,2007 |
[...] of SQL to identify unindexed foreign keys, I was lazy and did a search on google, albeit with this caveat in mind. Interestingly I found a script on experts-exchange that I wrote to do exactly the same [...]
Pingback by Deadlock detection - unindexed foreign keys « OraStory — August 14, 2007 @ 9:00 am UTC Aug 14,2007 |