Here’s a deadlock graph that might cause a little confusion:
Resource Name process session holds waits process session holds waits
TX-001a0015-00014787 34 90 X 32 3 S
TX-00190008-0000601b 32 3 X 34 90 S
session 90: DID 0001-0022-00000327 session 3: DID 0001-0020-000009E9
session 3: DID 0001-0020-000009E9 session 90: DID 0001-0022-00000327
Rows waited on:
Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
(dictionary objn - 199909, file - 6, block - 276654, slot - 0)
Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
(dictionary objn - 199909, file - 6, block - 433952, slot - 0)
You’ve probably seen questions on the internet occasionally about finding out how frequently an object has been modified. The question is a little ambiguous – does it mean how much change has occurred, or how many DML statements have been executed; either may be an interesting measure. Of course, Oracle gave us a method of answering the first question a long time ago: v$segstat (or v$segment_statistics if you don’t mind suffering the join) and the resulting content in the AWR or Statspack reports:
Here’s a summary of a recent posting on OTN:
I have two indexes (REFNO, REFTYPESEQNO) and (REFNO,TMSTAMP,REFTYPESEQNO). When I run the following query the optimizer uses the second index rather than the first index – which is an exact match for the predicates, unless I hint it otherwise:
From time to time I’ve looked at an AWR report and pointed out to the owner the difference in work load visible in the “SQL ordered by” sections of the report when they compare the summary figure with the sum of the work done by the individual statements. Often the summary will state that the captured SQL in the interval represents some percentage of the total workload in the high 80s to mid 90s – sometimes you might see a statement that the capture represents a really low percentage, perhaps in the 30s or 40s.
You have to be a little sensible about interpreting these figures, of course – at one extreme it’s easy to double-count the cost of SQL inside PL/SQL, at the other you may notice that every single statement reported does about the same amount of work so you can’t extrapolate from a pattern to decide how significant a low percentage might be. Nevertheless I have seen examples of AWR reports where I’ve felt justified in suggesting that at some point in the interval some SQL has appeared, worked very hard, and disappeared from the library cache before the AWR managed to capture it.
Now, from Nigel Noble, comes another explanation for why the AWR report might be hiding expensive SQL – a bug, which doesn’t get fixed until 12.2 (although there are backports in hand).
Here’s a little detail that appeared in 11gR2 that may help you answer questions about open cursors. Oracle has added a “cursor type” column to the view v$open_cursor, so you can now see which cursors have been held open because of the pl/sql cursor cache, which have been held by the session cursor cache, and various other reasons why Oracle may take a short-cut when you fire a piece of SQL at it.
The following is the output showing the state of a particular session just after it has started up in SQL*Plus and called a PL/SQL procedure to run a simple count:
I don’t think this is likely to happen on a production system (until 12c) – but look what you can do if you try hard enough:
2 index_name, column_name from user_ind_columns
4 table_name = 'T1'
5 order by
6* index_name , column_position
4 rows selected.
That’s a straight cut-n-paste from an Oracle 184.108.40.206 SQL*Plus session. (You can tell I typed it in real time because I missed the return before the FROM, and couldn’t be bothered to go back and do it again ;) )
By some strange coincidence, the “London Bus” effect perhaps, there have been three posts on the OTN database forum in the last couple of days relating to deadlocks; and they have prompted me to indulge in a little rant about the myth of Oracle and deadlock detection; it’s the one that goes:
“Oracle detects and resolves deadlocks automatically.”
Oracle may detect deadlocks automatically, but it doesn’t resolve them, it simply reports them (by raising error ORA-00060 and rolling back one statement) then leaves the deadlocked sessions stuck until the session that received the report resolves the problem or an external agent resolves the problem.
Consider the following example (which, I have to admit, I wrote without access to a live instance):
Here’s a little follow-on from Friday’s posting. I’ll start it off as a quiz, and follow up tomorrow with an explanation of the results (though someone will probably have given the correct solution by then anyway).
I have a simple heap table t1(id number(6,0), n1 number, v1 varchar2(10), padding varchar2(100)). The primary key is the id column, and the table holds 3,000 rows where id takes the values from 1 to 3,000. There are no other indexes. (I’d show you the code, but I don’t want to make it too easy to run the code, I want you to try to work it out in your heads).
It’s very easy to get a lot of information from an AWR (or Statspack) report – provided you remember what all the numbers represent. From time to time I find that someone asks me a question about some statistic and my mind goes completely blank about the exact interpretation; but fortunately it’s always possible to cross check because so many of the statistics are cross-linked. Here’s an example of a brief mental block I ran into a few days ago – I thought I knew the answer, but realised that I wasn’t 100% sure that my memory was correct:
The example I gave last week showing how a SORT operation in an execution plan might include the work of resolving function calls in your SQL and might, therefore, be reporting much higher resource utilisation than expected reminded me of some problems I’ve had with gaps in execution plans in the past. So I thought I’d give a little demonstration of the way in which the completeness of execution plans can develop over time.
Here’s a little quirk of execution plans that came up recently on the Oak Table network. If you call a function in a query, and do some sorting with the results, where does the work of calling the function get reported in the execution plan if you trace the query or look at the in-memory rowsource execution stats. Let’s take a look at a simple example:
Warning – this is a catch question, and I haven’t given you enough information to have any idea of the right answer; though, by telling you that I haven’t given you enough information to have any idea of the right answer, you now have some information that might help you to get closer to the right answer.
A recent (Jan 2013) post on the OTN database forum reported a performance problem on Oracle 220.127.116.11 (so no AWR), and posted a complete statspack report to one of the public file-sharing sites. It’s been some time since I did a quick run through the highlights of trouble-shooting with statspack, so I’ve picked out a few points from this one to comment on.
As usual, although this specific report is Statspack, the same analysis would have gone into looking at a more modern AWR report, although I will make a couple of comments at the end about the extra material that would have been available by default with the AWR report that would have helped us help the OP.
When doing updates with statements that use correlated subqueries, it’s important to make sure that your brackets (parentheses) are in the right place. Here are two statements that look fairly similar but have remarkably different results – and sometimes people don’t realise how different the statements are:
Sometimes you find bugs on MOS (Metalink, OCIS, whatever) that make you feel positively ill. I’ve just been on a customer site where (in passing) they mentioned that one of their historic queries against v$sqlstats now tool just over one second (CPU) in 18.104.22.168 when it had previously taken about 200ms on 10.2.0.4***. After a little checking it seemed likely that the change was possibly related to the fact that they had increased the size of the SGA significantly, allowing for a much larger shared pool and library cache; however there have been numerous code changes in the shared pool area on the route from 10g to 11g, so I decided to check MOS to see if anyone else had seen a similar problem. I found this: