Or as Doug Burns put it on his blog: how can I tell the actual DOP used for my parallel query”
As Doug points out, you first have to ask “When are you asking the question”. Immediately after you have run the query you could check v$pq_tqstat to see what happened, but apart from that you have very little hope of finding out in retrospect what happened in a specific query.
This leads on to the suggestion that, since parallel queries are supposed to be heavy-duty events consuming lots of resources, you should probably adopt a system of ‘pre-emptive’ trouble-shooting that follows each parallel query with a query against v$pq_tqstat and dump its output to a trace file. Doug suggests a couple of summary queries – but I think you might as well dump the driving SQL statement, and the entire content of v$pq_tqstat.
There are two problems with this approach, though. First, as Doug points out, v$pq_tqstat doesn’t always work properly. The other problem is that “queries” don’t run in parallel, it’s “data flow operations” (DFOs) that run in parallel, and a single query can be made up of several data flow operations. And, unfortunately, v$pq_tqstat tends to go wrong at exactly the moment when the queries get tough and start using multiple DFOs.
The only workaround I know to this is set event 10391 at level 128. This dumps a bit of debug information that describes the allocation of slave processes. For example (from one complex parallel query):
*** SESSION ID:(9.7) 2007-03-14 06:57:35.906
kxfrDmpSys
dumping system information
arch:255 ((unknown)
sess:9 myiid:1 mynid:1 ninst:1 maxiid:1
Instances running on that system:
inum:0 iid:1
kxfrDmpUpdSys
allocated slave set: nsset:1 nbslv:4
Slave set 0: #nodes:1
Min # slaves 4: Max # slaves:4
List of Slaves:
slv 0 nid:0
slv 1 nid:0
slv 2 nid:0
slv 3 nid:0
List of Nodes:
node 0
kxfrDmpSys
dumping system information
arch:255 ((unknown)
sess:9 myiid:1 mynid:1 ninst:1 maxiid:1
Instances running on that system:
inum:0 iid:1
kxfrDmpUpdSys
allocated slave set: nsset:1 nbslv:7
Slave set 0: #nodes:1
Min # slaves 7: Max # slaves:7
List of Slaves:
slv 0 nid:0
slv 1 nid:0
slv 2 nid:0
slv 3 nid:0
slv 4 nid:0
slv 5 nid:0
slv 6 nid:0
List of Nodes:
node 0
kxfrDmpSys
dumping system information
arch:255 ((unknown)
sess:9 myiid:1 mynid:1 ninst:1 maxiid:1
Instances running on that system:
inum:0 iid:1
kxfrDmpUpdSys
allocated slave set: nsset:1 nbslv:5
Slave set 0: #nodes:1
Min # slaves 5: Max # slaves:5
List of Slaves:
slv 0 nid:0
slv 1 nid:0
slv 2 nid:0
slv 3 nid:0
slv 4 nid:0
List of Nodes:
node 0
kxfrDmpSys
dumping system information
arch:255 ((unknown)
sess:9 myiid:1 mynid:1 ninst:1 maxiid:1
Instances running on that system:
inum:0 iid:1
kxfrDmpUpdSys
allocated slave set: nsset:2 nbslv:3
Slave set 0: #nodes:1
Min # slaves 3: Max # slaves:3
List of Slaves:
slv 0 nid:0
slv 1 nid:0
slv 2 nid:0
List of Nodes:
node 0
Slave set 1: #nodes:1
Min # slaves 3: Max # slaves:3
List of Slaves:
slv 0 nid:0
slv 1 nid:0
slv 2 nid:0
List of Nodes:
node 0
As you can see from the dump, this deals with multiple DFOs (each kxfrDmpSys section is a separate DFO), and for users of RAC it identifies multi-node activity (node / nid) and slave counts fairly clearly.
This particular query used 22 slaves when they were available, and downgraded some of its DFOs if not – with details in the session statistics of “Parallel operations downgraded”. The output from v$pq_tqstat (9.2.0.8) unfortunately showed only the last DFO, and lost the query co-ordinator:
DFO_NUMBER TQ_ID SERVER_TYP PROCESS NUM_ROWS BYTES WAITS TIMEOUTS AVG_LATENCY INSTANCE
---------- ----- ---------- ------- -------- ------- ----- -------- ----------- --------
1 0 Consumer P003 2500 108756 6 1 0 1
P001 2500 108688 6 1 0 1
P000 2500 108549 6 1 0 1
P002 2500 108736 6 1 0 1
1 Producer P003 8 652 56 0 0 1
P001 8 652 52 0 0 1
P000 8 652 56 0 0 1
P002 8 652 57 0 0 1
Bottom line: I don’t think there’s really any easy way to see exactly what happened with a parallel query – but as a starting point, you have to use pre-emptive trouble-shooting. So set your code up to read:
- dump statement to trace file
- take starting snapshot of critical session statistics
- alter session set events ’10391 trace name context forever, level 128′;
- execute statement
- alter session set events ’10391 trace name context off’;
- dump delta of session statistics to trace
If you want to cover all options, you could also set event 10132 just before executing the query (which means you wouldn’t need to dump the query, as it would happen automatically) to get the current execution plan for the query.
For notes on dumping text into your own trace file, see the procedure dbms_system.ksdwrt() mentioned in my article on logoff triggers.
Update Jan 2011: I’ve recently re-read this note, and realised that it needs an update to point you at the new pieces of “Notes” information reported by dbms_xplan, as 11.1 and 11.2 to tell you about the degree of parallelism (DOP).

One advantage that we users of PQ have at least is that the overhead of all the optional tracing is relatively small, so automaticaly dumping post-query wait stats and plans, or pre-emptively turning on tracing, is not some system-halting issue.
I like this 10391 — useful stuff, and thanks.
Comment by David Aldridge — March 14, 2007 @ 2:50 pm UTC Mar 14,2007 |
Hi, I think that here DFO means DFO tree here, could you give me an example of query that using multiple DFO trees?
It’s best if you can also mail to kaiyao.huang@gmail.com:)
Comment by kaiyaohuang — January 10, 2008 @ 1:36 pm UTC Jan 10,2008 |
I had a little discussion with Greg Rahn recently on the topic of DFOs (data flow operations), and he pointed out that the terminology had changed, although some of the column names and statistics names had not.
In the output from v$pq_tqstat above, there is a column labelled dfo_number – and I had been using the term DFO to refer to all the table queues (TQ_ID) within a single DFO. Greg refers to these as parallelizers, and I think it’s the thing I call a table queue that is now called a DFO.
However, if you want to see a plan with multiple DFO sections (as per v$pq_tqstat) in it, then a parallel query that manages to serialize with some non-mergeable views is one way of doing it. For example:
create table t1 as
select
rownum id, object_name, object_type, owner
from
all_objects
where
rownum <= 10000
;
create table t2 as
select * from t1;
create table t3
partition by hash (id) partitions 8
as
select * from t1
where rownum <= 1;
begin
dbms_stats.gather_table_stats(
user,
't1',
cascade => true,
estimate_percent => null,
method_opt => 'for all columns size 1'
);
dbms_stats.gather_table_stats(
user,
't2',
cascade => true,
estimate_percent => null,
method_opt => 'for all columns size 1'
);
end;
/
alter session enable parallel dml;
spool pq_multi
explain plan for
select
/*+
parallel(t1,7)
*/
rownum, t1.object_name, t1.object_type, t1.owner
from
t1,
(
select
/*+
parallel(t1,5)
*/
rownum, t1.object_name, t1.object_type, t1.owner
from
t1,
(
select
/*+
ordered
parallel(t1,3)
parallel(t2,3)
use_hash(t2)
*/
rownum,
t1.object_name,
t2.object_type,
t2.owner
from
t1, t2
where
t2.object_name = t1.object_name
and t2.object_type = t1.object_type
and t2.owner = t1.owner
) v1
where
v1.object_name = t1.object_name
and v1.object_type = t1.object_type
and v1.owner = t1.owner
) v2
where
v2.object_name = t1.object_name
and v2.object_type = t1.object_type
and v2.owner = t1.owner
;
Comment by Jonathan Lewis — January 10, 2008 @ 6:48 pm UTC Jan 10,2008 |
“_px_trace” might be the most convenient way.
Metalink doc# 444164.1
Comment by Dion Cho — April 16, 2008 @ 9:02 am UTC Apr 16,2008 |
Dion Cho,
That’s a useful little paper. Be careful though, I just tried “high, all” on a little query and ended up with a 200 Mb trace file before I killed the query ;(
Comment by Jonathan Lewis — April 16, 2008 @ 6:58 pm UTC Apr 16,2008 |
[...] but it seems likely that they’re intended to supply the answer to the common question “How parallel did my query go, and why [...]
Pingback by Plan Notes « Oracle Scratchpad — January 5, 2011 @ 8:46 am UTC Jan 5,2011 |