Oracle Scratchpad

June 25, 2013

System Stats

Filed under: CBO,Oracle,Parallel Execution,Statistics,System Stats — Jonathan Lewis @ 5:27 pm BST Jun 25,2013

Several years ago I wrote the following in “Cost Based Oracle – Fundamentals” (p.47):

The maxthr and slavethr figures relate to throughput for parallel execution slaves. I believe that the figures somehow control the maximum degree of parallelism that any given query may operate at by recording the maximum rate at which slaves have historically been able to operate—but I have not been able to verify this.

Browsing the internet recently, I discovered that that no-one else seems to have published anything to verify my comment, so I decided it was about time I did so myself.  I’m going to work up to it in two blog notes , so if you do happen to know of any document that describes the impact of maxthr and slavethr on the optimizer’s costing algorithms please give me a reference in the comments – that way I might not have to write the second note.

(more…)

June 6, 2013

Parallel DML

Filed under: Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 8:06 am BST Jun 6,2013

Parallel DML is not enabled by default, and it’s easy to forget this and think you’ve made an operation “as parallel as possible” by declaring objects parallel or putting in hints for parallelism.

A recent question on OTN asked about speeding up a  materialized view refresh that seemed to be spending a lot of its time waiting on “PX Deq Credit: send blkd”. The manuals describe this as an “idle event”; but that’s not always true. The OP had supplied the output from tkprof for one of the sessions showing the “insert as select” that was the (complete) refresh and it was clear that the select was running in parallel, but the insert wasn’t – and that’s one case in which the “PX Deq Credit: send blkd” is arguably an “idle” wait (with a timeout of 2 seconds). It’s possible that the refresh could go faster if the OP enabled parallel DML.

The argument for calling this wait idle (in this case) is that N-1 of the parallel slaves that are trying to feed the query co-ordinator are waiting because the query co-ordinator is soaking up data from the Nth slave as fast as it can – the query co-ordinator can’t go any faster and the slaves are being told to wait until the query co-ordinator is ready for their input. On the other hand, if you don’t need those waits to happen at all you could argue that they aren’t idle because they are affecting the end-user response time. To stop them happening, you can minimise the messages from the PX slaves to the query co-ordinator by making the insert run in parallel; and since it’s important to recognise the difference in plans between a parallel and non-parallel insert I thought I’d give you a little model to test.

create table t1 as select * from all_objects;
create table t2 as select * from t1 where rownum <= 1;

alter table t1 parallel (degree 2);
alter table t2 parallel (degree 2);

explain plan for
insert /*+ append */ into t2 select * from t1;

select * from table(dbms_xplan.display(null,null,'-note -cost -rows -bytes'));

commit;
alter session enable parallel dml;

explain plan for
insert /*+ append */ into t2 select * from t1;

select * from table(dbms_xplan.display(null,null,'-note -cost -rows -bytes'));

This code creates a couple of tables, declared as parallel, then inserts (twice) from one to the other. In the first case only the select can run parallel so all the data from the parallel slaves will be passed to the query co-ordinator to be inserted. In the second case both the select and the insert can take place in parallel, so each slave would build its own data segments, and the only messages sent to the query co-ordinator would be about the list of extents in each “private” segment that have to be merged. Here are the two execution plans:

-----------------------------------------------------------------------
| Id  | Operation             | Name     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------
|   0 | INSERT STATEMENT      |          |        |      |            |
|   1 |  LOAD AS SELECT       | T2       |        |      |            |
|   2 |   PX COORDINATOR      |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 |  Q1,00 | P->S | QC (RAND)  |
|   4 |     PX BLOCK ITERATOR |          |  Q1,00 | PCWC |            |
|   5 |      TABLE ACCESS FULL| T1       |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------

-----------------------------------------------------------------------
| Id  | Operation             | Name     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------
|   0 | INSERT STATEMENT      |          |        |      |            |
|   1 |  PX COORDINATOR       |          |        |      |            |
|   2 |   PX SEND QC (RANDOM) | :TQ10000 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT     | T2       |  Q1,00 | PCWP |            |
|   4 |     PX BLOCK ITERATOR |          |  Q1,00 | PCWC |            |
|   5 |      TABLE ACCESS FULL| T1       |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------

Both plans are so simple in shape that you basically read them from the bottom up. The first plan shows the data passing from the PX slaves to the QC (PX coordinator) which does the LOAD AS SELECT. The second plan shows the PX slaves loading as they select (the load is grouped as PCWP – parallel combined with child – with the tablescan), and information is sent to the QC only after the load has completed.

Bottom line: if the PX SEND QC is above (lower line number) the LOAD AS SELECT the slaves are doing the insert, if the LOAD AS SELECT is above the PX SEND QC the query coordinator is doing the insert (and you’ll probably see lots of “PX Deq Credit: send blkd” at that point as the PX slaves wait for the QC to load data into blocks).

June 14, 2012

PX and system allocation

Filed under: fragmentation,Infrastructure,Oracle,Parallel Execution — Jonathan Lewis @ 5:00 pm BST Jun 14,2012

A few years ago (2007) I wrote about a problem that could appear when you mixed parallel execution with system managed extent allocation. A couple of years later I added a note that Christian Antognini had observed a patch in 11.1.0.7 that addressed the specific issue I had raised. Today, thanks to an email exchange with Christo Kutrovsky of Pythian, I can report that there is a variation of this issue still available even in 11.2.0.3.

The basic problem is that you can end up with a very large number of very small extents, leading to poor performance in parallel queries and a significant waste of space in a data segment. Here’s a simple, though not particularly realistic, way to demonstrate the problem.
(more…)

June 30, 2011

Virtual bug

Filed under: Bugs,Function based indexes,Indexing,Oracle,Parallel Execution,Statistics,Troubleshooting — Jonathan Lewis @ 5:37 pm BST Jun 30,2011

I’ve said in the past that one of the best new features, in my view, in 11g was the appearance of proper virtual columns; and I’ve also been very keen on the new “approximate NDV” that makes it viable to collect stats with the “auto_sample_size”.

Who’d have guessed that if you put them both together, then ran a parallel stats collection it would break :(

The bug number Karen quotes (10013177.8) doesn’t (appear to) mention extended stats – but since virtual columns, function-based indexes, and extended stats share a number of implementation details I’d guess that they might be affected as well.

January 3, 2010

Pseudo-parallel

Filed under: Infrastructure,Parallel Execution — Jonathan Lewis @ 11:32 am BST Jan 3,2010

There’s a posting on the OTN database forum asking for suggestions about restricting a query to data from one datafile of a tablespace.  It seems to be a slightly odd thing to do, but it reminded me of a “pseudo-parallel” trick I once used (and described) more than 10 years ago on a system running Oracle version 7. It was a long time ago, but I think I needed to do this because I wanted to execute a massive update with as many parallel threads as possible in the days when parallel DML wasn’t possible.

(more…)

January 19, 2009

Space Checks

Filed under: Infrastructure,Parallel Execution,Troubleshooting — Jonathan Lewis @ 7:20 pm BST Jan 19,2009

At least twice in the last few weeks, there’s been a question on OTN about a mismatch between dba_extents and dba_segments  after rebuilding some indexes.

Here’s one of the threads, where Randolf Geist supplies the explanation for the anomaly.

December 13, 2008

Direct Path

Filed under: Infrastructure,Parallel Execution,Performance — Jonathan Lewis @ 10:31 am BST Dec 13,2008

Doug Burn’s sent me an email a couple of days ago about a question that had appeared on his blog in response to a comment I made several months ago.

I’ve followed up with an answer, of course, but reading the blog reminded me that some interesting stuff had come up about direct path reads and writes, as well as some comments on parallel query. So I thought I’d publish a link to the item so you could see the whole discussion.

November 5, 2008

PX Buffer

Filed under: Execution plans,Parallel Execution,Performance,trace files,Troubleshooting — Jonathan Lewis @ 7:11 pm BST Nov 5,2008

Here’s a surprising anomaly that showed up in a question on the OTN forums a little while ago. Consider a simple query that uses a hash join between two tables.

select
	/*+
		ordered
		use_hash(t2)
		parallel(t1 2)
		parallel(t2 2)
		pq_distribute(t2 hash hash)
	*/
	t1.padding,
	t2.padding
from 	t1, t2
where	t2.n1 = t1.n1
and	t2.small_vc = t1.small_vc
;

When it runs serially the join completes in memory and the only I/O you see comes from the two tablescans. When the query runs parallel something causes a spill to the temporary tablespace.
(more…)

June 25, 2007

qb_name

Filed under: Execution plans,Hints,Parallel Execution,Tuning — Jonathan Lewis @ 8:15 pm BST Jun 25,2007

When Oracle tries to optimize a complex SQL statement one of the first steps it takes is to transform it into something simpler. The ideal target (from the optimizer’s perspective) is to eliminate any subqueries and in-line views so that your SQL statement ends up as a list of tables in a single from clause. Every in-line view and subquery is a separate query block, and the optimizer really wants to turn the whole thing into a single query block. Unfortunately this isn’t always possible, and sometimes it isn’t even a good idea - which is why 10g introduced ‘cost base query transformation’.

(more…)

May 29, 2007

Autoallocate and PX

Filed under: Infrastructure,Oracle,Parallel Execution,Performance,Troubleshooting — Jonathan Lewis @ 9:22 pm BST May 29,2007

Here’s a quirky little set of results from trace files. First a 10391 at level 64 – this shows the way in which an object was broken into granules for a parallel tablescan. I’ve just listed the first few granules: (more…)

March 14, 2007

How parallel

Filed under: Execution plans,Parallel Execution,trace files,Troubleshooting — Jonathan Lewis @ 7:16 am BST Mar 14,2007

Or as Doug Burns put it on his blog: how can I tell the actual DOP used for my parallel query”

As Doug points out, you first have to ask “When are you asking the question”. Immediately after you have run the query you could check v$pq_tqstat to see what happened, but apart from that you have  very little hope of finding out in retrospect what happened in a specific query. 

(more…)

February 19, 2007

Parallelism and CBO

Filed under: CBO,Parallel Execution,Troubleshooting — Jonathan Lewis @ 9:43 pm BST Feb 19,2007

In the 9i Database Performance Tuning Guide and Reference part no: A96533-01, there is a convenient list of features that “require the CBO” on page 1-15, with the note

“Use of any of these features enables the CBO, even if the parameter OPTIMIZER_MODE is set to RULE.”

The list is as follows:
(more…)

January 11, 2007

Resc/Resp

Filed under: CBO,Parallel Execution,trace files — Jonathan Lewis @ 8:56 pm BST Jan 11,2007

If you look at a 10053 trace file, you will find references to Resc and Resp (sometimes rsc and rsp) lurking in the background whenever the trace file mentions Cost. When you get to the 10gR2 trace file, you see these described in the “Legend” section as: (more…)

December 28, 2006

Parallel Execution

Filed under: Infrastructure,Parallel Execution,Partitioning,Performance — Jonathan Lewis @ 7:21 pm BST Dec 28,2006

Looking back at some of my previous posts I was reminded how easy it is to overlook one important feature when trying to comment on another.  In particular, in this case, a short posting on indexed access paths omitted any mention of parallel execution until a comment from David Aldridge jogged my memory. So here’s an important thought about parallel execution.

(more…)

« Previous Page

The Rubric Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,505 other followers