I thought I’d try to spend some of today catching up on old comments – first the easier ones, then the outstanding questions on Oracle Core.
The very first one I looked at was about pushing predicates, and the specific comment prompted me to jot down this little note about the 10053 trace file (the CBO trace).
In “real-life” I don’t often look at 10053 trace files because they tend to be very long and messy, and usually you can see all you need from an execution plan – perhaps through SQL Monitoring, or possibly with rowsource execution stats enabled. However there are questions that can sometimes be resolved very quickly by a simple text search of a trace file; questions of the form: “why isn’t the optimizer using feature X?” – for example: “why didn’t the optimizer use predicate pushing for this view?” If you have a question of this type, then search the trace file for the word “bypass” – you may find comments like the following:
JPPD: JPPD bypassed: Outer query references right side of outer join
JPPD: JPPD bypassed: View has a single or group set function.
JPPD: JPPD bypassed: OLD_PUSH_PRED hint specified
JPPD: JPPD bypassed: User-defined operator
JPPD: JPPD bypassed: Push-down not enabled
JPPD: JPPD bypassed: View not on right-side of outer join
JPPD: JPPD bypassed: View contains a group by.
JPPD: JPPD bypassed: View contains a window function.
JPPD: JPPD bypassed: View contains a MODEL clause.
JPPD: JPPD bypassed: View contains a DISTINCT.
JPPD: JPPD bypassed: View contains a rownum reference.
JPPD: JPPD bypassed: START WITH query block.
JPPD: JPPD bypassed: View is a set query block.
JPPD: JPPD bypassed: Negative hint found
JPPD: JPPD bypassed: Outline does not contain hint
JPPD: JPPD bypassed: Base table missing statistics
JPPD: JPPD bypassed: Remote table
JPPD: JPPD bypassed: Security violation
JPPD: JPPD bypassed: Possible security violation.
JPPD bypassed: Table level NO_PUSH_PRED hint.
JPPD bypassed: Table-level hint.
JPPD bypassed: Query block NO_PUSH_PRED hint.
JPPD bypassed: View semijoined to table
JPPD bypassed: Contained view has no table in fro list.
JPPD bypassed: View contains a START WITH query block.
JPPD bypassed: Does not contain a view.
JPPD bypassed: Found branches of different types
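As a quick sketch of what I mean by a simple text search – here’s the idea, with a couple of the messages above standing in for a real trace file (in real life you’d point grep at your own session’s trace file, whose name you can find in v$diag_info under ‘Default Trace File’):

```shell
# Simulate a fragment of a 10053 trace file, then search it for "bypass".
# In real life you would grep your session's actual trace file instead.
cat > fake_10053.trc <<'EOF'
JPPD:     JPPD bypassed: View contains a group by.
JPPD:     JPPD bypassed: View not on right-side of outer join
EOF
grep -c "bypass" fake_10053.trc     # counts the bypass messages: 2
rm fake_10053.trc
```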
According to a note that I’ve got with this list, by the way, I used “strings -a” on the Oracle executable for 10.2.0.1 to generate it. It’s probably about time I did the same for newer versions of Oracle.
My plans for addressing comments have been disrupted somewhat. Just as I published this note, an email arrived containing two 10053 trace files. (The author had asked before sending them, and I was sufficiently curious that I had agreed to take a quick look.) So I’ve spent most of the last hour doing what I’ve just said I hardly ever do – looking at 10053 trace files.
The question was: “why does this query run serially if I have a particular scalar subquery in the select list, but run parallel if I replace it with a function?” The immediate answer, after I’d seen the query and thought about it for a bit, was: “because the manuals (10.2 – the relevant version) say that you don’t parallelize if you have scalar subqueries in the select list”; but this changed the question to: “why is it just this one scalar subquery that causes serialization when the other two scalar subqueries don’t?” Of the three scalar subqueries, only one of them caused the query to serialize.
The answer to that question is a little more subtle – and I’ll blog about it when I can find time to model the scenario.
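To make the shape of the problem concrete, here’s a hedged sketch (all object names invented – this is not the original query) of the two variants; in 10.2 the first form could stop the query running parallel while the second need not:

```sql
-- Variant 1: a scalar subquery in the select list
-- (the form the 10.2 manuals say blocks parallel execution).
select  /*+ parallel(t 4) */
        t.id,
        (select d.description from descriptions d where d.id = t.desc_id) description
from    big_table t;

-- Variant 2: the same lookup wrapped in a PL/SQL function.
create or replace function get_description(p_id number) return varchar2 is
        v_desc  varchar2(80);
begin
        select description into v_desc from descriptions where id = p_id;
        return v_desc;
end;
/

select  /*+ parallel(t 4) */
        t.id,
        get_description(t.desc_id) description
from    big_table t;
```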
Here’s an interesting anomaly from the OTN database forum – PL/SQL track. It’s a single row delete from a table that does a huge number of db block gets and (in the first example shown) physical reads. Here’s the supplied output with autotrace enabled:
I’ve written about dynamic sampling in the past, but here’s a little wrinkle that’s easy to miss. How do you get the optimizer to work out the correct cardinality for a query like (the table creation statement follows the query):
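The original query and table creation statement aren’t reproduced here, but as a sketch of the mechanism involved (all names hypothetical) the dynamic_sampling hint is the usual starting point:

```sql
-- Hypothetical example: the names and the sampling level are invented.
-- Sampling at level 4 (or above) lets the optimizer estimate the
-- combined selectivity of the predicates by sampling t1 at parse time.
select  /*+ dynamic_sampling(t1 4) */
        *
from    t1
where   col1 = 'A'
and     col2 = 'A';
```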
Some time ago – actually a few years ago – I wrote a note about the hint /*+ gather_plan_statistics */ making some informal comments about the implementation and relevant hidden parameters. I’ve recently discovered a couple of notes from Alexander Anokhin describing the feature in far more detail and describing some of the misleading side effects of the implementation. There are two parts (so far): part 1 and part 2.
Dominic Brooks published a note recently about some very nasty SQL – originally thinking that it was displaying a run-time problem due to the extreme number of copies of the lnnvl() function the optimizer had produced. In fact it turned out to be a parse-time problem rather than a run-time problem, but when I first read Dominic’s note I was sufficiently surprised that I decided to try modelling the query.
Here’s a quick and dirty script to create a procedure (in the SYS schema – so be careful) to check the Hakan Factor for an object. If you’re not familiar with the Hakan Factor, it’s the value that gets set when you use the command “alter table minimize records_per_block;”.
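If you just want a quick (and entirely unsupported) peek rather than the full procedure, the value can be seen in the data dictionary – a sketch to run as SYS on a test system only, on the assumption that tab$.spare1 holds the Hakan factor in its low-order bits:

```sql
-- Unsupported peek at the data dictionary (test systems only).
-- Assumption: the low-order bits of tab$.spare1 hold the Hakan factor,
-- and the 0x8000 bit records that "minimize records_per_block" is set.
select  o.name,
        bitand(t.spare1, 32767)         hakan_factor,
        sign(bitand(t.spare1, 32768))   minimized
from    sys.obj$ o, sys.tab$ t
where   o.name = 'T1'
and     t.obj# = o.obj#;
```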
The problem of slow queries on v$lock just came up again on the OTN database forum, so I thought I’d better push out a post that’s been hanging around on my blog for the last few months. This is actually mentioned in MOS in note 1328789.1: “Query Against v$lock Run from OEM Performs Slowly” which points out that it is basically a problem of bad statistics and all you have to do is collect the stats.
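In other words (a sketch – run as a suitably privileged user, and preferably at a quiet moment, since it has to read a lot of X$ structures):

```sql
-- Gather statistics on the fixed (X$) objects that underpin dynamic
-- performance views like v$lock; this is the documented dbms_stats
-- call for the job.
execute dbms_stats.gather_fixed_objects_stats;
```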
For your entertainment – there’s nothing up my sleeves, this was a simple cut-n-paste after real-time typing with no tricks:
20:39:51 SQL> create table t1 (t1 timestamp);
20:39:55 SQL> insert into t1 values(systimestamp);
1 row created.
20:39:59 SQL> select t1 - systimestamp from t1;
1 row selected.
Here’s a deadlock graph that might cause a little confusion:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-001a0015-00014787        34      90     X             32       3           S
TX-00190008-0000601b        32       3     X             34      90           S

session 90: DID 0001-0022-00000327    session 3: DID 0001-0020-000009E9
session 3: DID 0001-0020-000009E9     session 90: DID 0001-0022-00000327
Rows waited on:
Session 90: obj - rowid = 00030CE5 - AAAwzlAAGAABDiuAAA
(dictionary objn - 199909, file - 6, block - 276654, slot - 0)
Session 3: obj - rowid = 00030CE5 - AAAwzlAAGAABp8gAAA
(dictionary objn - 199909, file - 6, block - 433952, slot - 0)
You’ve probably seen questions on the internet occasionally about finding out how frequently an object has been modified. The question is a little ambiguous – does it mean how much change has occurred, or how many DML statements have been executed; either may be an interesting measure. Of course, Oracle gave us a method of answering the first question a long time ago: v$segstat (or v$segment_statistics if you don’t mind suffering the join) and the resulting content in the AWR or Statspack reports:
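If you want to query the view directly rather than read the report, a query like the following does the job (a sketch – the schema name is hypothetical, and the statistic names are case-sensitive):

```sql
-- How much change: "db block changes" recorded per segment.
select  owner, object_name, object_type, value
from    v$segment_statistics
where   statistic_name = 'db block changes'
and     owner = 'MY_SCHEMA'     -- hypothetical schema name
order by
        value desc;
```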
Here’s a summary of a recent posting on OTN:
I have two indexes, (REFNO, REFTYPESEQNO) and (REFNO, TMSTAMP, REFTYPESEQNO). When I run the following query the optimizer uses the second index – rather than the first index, which is an exact match for the predicates – unless I hint it otherwise:
From time to time I’ve looked at an AWR report and pointed out to the owner the difference in work load visible in the “SQL ordered by” sections of the report when they compare the summary figure with the sum of the work done by the individual statements. Often the summary will state that the captured SQL in the interval represents some percentage of the total workload in the high 80s to mid 90s – sometimes you might see a statement that the capture represents a really low percentage, perhaps in the 30s or 40s.
You have to be a little sensible about interpreting these figures, of course – at one extreme it’s easy to double-count the cost of SQL inside PL/SQL, at the other you may notice that every single statement reported does about the same amount of work so you can’t extrapolate from a pattern to decide how significant a low percentage might be. Nevertheless I have seen examples of AWR reports where I’ve felt justified in suggesting that at some point in the interval some SQL has appeared, worked very hard, and disappeared from the library cache before the AWR managed to capture it.
Now, from Nigel Noble, comes another explanation for why the AWR report might be hiding expensive SQL – a bug, which doesn’t get fixed until 12.2 (although there are backports in hand).
Here’s a little detail that appeared in 11gR2 that may help you answer questions about open cursors. Oracle has added a “cursor type” column to the view v$open_cursor, so you can now see which cursors are being held open because of the PL/SQL cursor cache, which are being held by the session cursor cache, and which for the various other reasons that allow Oracle to take a short-cut when you fire a piece of SQL at it.
The following is the output showing the state of a particular session just after it has started up in SQL*Plus and called a PL/SQL procedure to run a simple count:
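A query of the following shape (a sketch – the sid is hypothetical, and you’d normally pick it up from v$session) is one way to summarise the cursor types for a session:

```sql
-- Count a session's open cursors by the reason they are being held
-- open; the cursor_type column appeared in 11gR2.
select  cursor_type, count(*)
from    v$open_cursor
where   sid = 123               -- hypothetical target session id
group by
        cursor_type
order by
        count(*) desc;
```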
I don’t think this is likely to happen on a production system (until 12c) – but look what you can do if you try hard enough:
SQL> select
  2  index_name, column_name from user_ind_columns
  3  where
  4  table_name = 'T1'
  5  order by
  6* index_name , column_position
4 rows selected.
That’s a straight cut-n-paste from an Oracle SQL*Plus session. (You can tell I typed it in real time because I missed the return before the FROM, and couldn’t be bothered to go back and do it again ;) )
By some strange coincidence, the “London Bus” effect perhaps, there have been three posts on the OTN database forum in the last couple of days relating to deadlocks; and they have prompted me to indulge in a little rant about the myth of Oracle and deadlock detection; it’s the one that goes:
“Oracle detects and resolves deadlocks automatically.”
Oracle may detect deadlocks automatically, but it doesn’t resolve them; it simply reports them (by raising error ORA-00060 and rolling back one statement), then leaves the deadlocked sessions stuck until the session that received the report resolves the problem or an external agent intervenes.
Consider the following example (which, I have to admit, I wrote without access to a live instance):