Here’s a surprising (to me) execution plan from 12.1.0.2 – parallel execution to find one row in a table using a unique scan of a unique index – produced by running the following script (data creation SQL to follow):
set serveroutput off
set linesize 180
set trimspool on
set pagesize 60

alter session set statistics_level = all;

variable b1 number
exec :b1 := 50000

select /*+ parallel (3) */ id, v1 from t2 where id = :b1;

select * from table(dbms_xplan.display_cursor(null,null,'allstats parallel'));

break on dfo_number skip 1 on tq_id skip 1 on server_type

select
        dfo_number, tq_id, server_type, instance, process, num_rows
from
        v$pq_tqstat
order by
        dfo_number, tq_id, server_type desc, instance, process
;
All I’ve done is enable rowsource execution statistics, set a bind variable to a value, query a table with a /*+ parallel(3) */ hint to find the one row that will be identified by primary key, and then report the actual execution plan. When I first ran the test Oracle didn’t report the execution statistics correctly, so I’ve also queried v$pq_tqstat to show the PX servers used and the flow of data through the plan. Here’s the plan, followed by the results from v$pq_tqstat:
SQL_ID  0dzynh9d29pt9, child number 0
-------------------------------------
select /*+ parallel (3) */ id,v1 from t2 where id=:b1

Plan hash value: 247082613

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Starts | E-Rows |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |      1 |        |        |      |            |      1 |00:00:00.02 |
|   1 |  PX COORDINATOR                    |          |      1 |        |        |      |            |      1 |00:00:00.02 |
|   2 |   PX SEND QC (RANDOM)              | :TQ10001 |      0 |      1 |  Q1,01 | P->S | QC (RAND)  |      0 |00:00:00.01 |
|   3 |    TABLE ACCESS BY INDEX ROWID     | T2       |      0 |      1 |  Q1,01 | PCWP |            |      0 |00:00:00.01 |
|   4 |     BUFFER SORT                    |          |      0 |        |  Q1,01 | PCWC |            |      0 |00:00:00.01 |
|   5 |      PX RECEIVE                    |          |      0 |      1 |  Q1,01 | PCWP |            |      0 |00:00:00.01 |
|   6 |       PX SEND HASH (BLOCK ADDRESS) | :TQ10000 |      0 |      1 |  Q1,00 | S->P | HASH (BLOCK|      0 |00:00:00.01 |
|   7 |        PX SELECTOR                 |          |      0 |        |  Q1,00 | SCWC |            |      0 |00:00:00.01 |
|*  8 |         INDEX UNIQUE SCAN          | T2_PK    |      0 |      1 |  Q1,00 | SCWP |            |      0 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access("ID"=:B1)

Note
-----
   - Degree of Parallelism is 3 because of hint


DFO_NUMBER      TQ_ID SERVER_TYP   INSTANCE PROCES   NUM_ROWS
---------- ---------- ---------- ---------- ------ ----------
         1          0 Producer            1 P003            0
                                          1 P004            1
                                          1 P005            0
                      Consumer            1 P000            0
                                          1 P001            1
                                          1 P002            0

                    1 Producer            1 P000            0
                                          1 P001            1
                                          1 P002            0
                      Consumer            1 QC              1
As you can see, the table access follows a unique scan of an index and, although the rowsource execution stats report zero starts for the unique scan, we can see from v$pq_tqstat that slave P004 acquired a “row” (actually a rowid) and passed it to slave P001, which then acquired a row from the table and passed that row to the query coordinator. Oracle really did execute a parallel query, starting and stopping a total of 6 sessions to perform a single unique index access.
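If you want to convince yourself that the PX servers really do appear, a quick sanity check (this query is my addition, not part of the original test) is to look at v$px_process from another session while the query is running, or immediately afterwards if the servers are still in the pool:

-- List the PX server processes and whether they are currently in use.
select
        server_name, status, pid, sid
from
        v$px_process
order by
        server_name;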
You’ll notice that operation 7 is one that has appeared only in the latest version of Oracle. The PX SELECTOR was introduced in 12c to reduce the number of times a complex parallel query would funnel into the query coordinator (parallel to serial) and then fan out again (serial to parallel), generating a new data flow operation tree (DFO tree) and spawning one or two new parallel server groups as it did so. To stop this happening, a step that needs to serialise in a 12c parallel plan can nominate one of the existing PX server processes (from each set, if necessary) to do the job, so that the same set of PX servers can carry on running the query without the need for a new DFO tree to appear.
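Each DFO tree, incidentally, shows up in the TQ column of a plan as a distinct Qn prefix (Q1,xx, Q2,xx and so on), and as a distinct dfo_number in v$pq_tqstat – so a rough and ready sketch for counting the DFO trees and table queues used by the last parallel query in your session might be:

-- Sketch only: v$pq_tqstat is populated in your session after a
-- parallel query completes; expect one dfo_number per DFO tree.
select
        dfo_number, count(distinct tq_id) tq_count
from
        v$pq_tqstat
group by
        dfo_number
order by
        dfo_number;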
This enhancement to parallel execution plans is a good idea – except when it appears in my silly little query and turns something that ought to be quick and cheap into a job that is far more resource-intensive than it should be.
At this point, of course, you’re probably wondering what kind of idiot would put a parallel() hint into a query that was doing nothing but selecting one row by primary key – the answer is: “the Oracle optimizer in 12c”. I discovered this anomaly while creating a demonstration of the way that a distributed parallel query has to serialise through a single database link even if the operations at the two ends of the link run in parallel. Here’s the SQL I wrote for the full demonstration:
rem
rem     Script:         distributed_pq.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2016
rem

define m_remote='test@loopback'
define m_remote='orcl@loopback'

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid wordpress format issue
)
select
        cast(rownum as number(8,0))                     id,
        cast(lpad(rownum,8,'0') as varchar2(8))         v1,
        cast(rpad('x',100) as varchar2(100))            padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5 -- > comment to avoid wordpress format issue
;

create table t2
as
select  *
from    t1
where   mod(id,100) = 0
;

alter table t2 add constraint t2_pk primary key(id);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

set serveroutput off

select
        /*+ parallel(3) */
        t1.v1, t2.v1
from
        t1,
        t2@&m_remote
where
        mod(t1.id,10) = 0
and     t2.id = t1.id
and     mod(to_number(t2.v1),10) = 1
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats parallel'));
If you want to run this demo you’ll need to do something about formatting the output; more importantly you’ll have to create a database link (I used a loopback link) and set up a define identifying it at the lines where I’ve got test@loopback and orcl@loopback (my 11g and 12c loopback links respectively – only the second define in the script takes effect, so keep the one you need).
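For reference, a loopback link can be as simple as the following sketch – the username, password and connect string here are placeholders of mine, not values from the original test:

-- Hypothetical example: a private link named "orcl" with connection
-- qualifier "loopback", pointing the database back at itself.
create database link orcl@loopback
connect to test_user identified by test_password
using 'orcl';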
Here’s the plan (with rowsource stats) I got from the 12c test:
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Starts | E-Rows |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |      1 |        |        |      |            |      0 |00:01:14.67 |       7 |      0 |
|   1 |  NESTED LOOPS          |          |      1 |     10 |        |      |            |      0 |00:01:14.67 |       7 |      0 |
|   2 |   PX COORDINATOR       |          |      1 |        |        |      |            |  10000 |00:00:00.11 |       7 |      0 |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |      0 |   1000 |  Q1,00 | P->S | QC (RAND)  |      0 |00:00:00.01 |       0 |      0 |
|   4 |     PX BLOCK ITERATOR  |          |      3 |   1000 |  Q1,00 | PCWC |            |  10000 |00:03:17.72 |    1745 |   1667 |
|*  5 |      TABLE ACCESS FULL | T1       |     39 |   1000 |  Q1,00 | PCWP |            |  10000 |00:00:00.06 |    1745 |   1667 |
|   6 |   REMOTE               | T2       |  10000 |      1 |        |      |            |      0 |00:01:14.44 |       0 |      0 |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access(:Z>=:Z AND :Z<=:Z)
       filter(MOD("T1"."ID",10)=0)

Remote SQL Information (identified by operation id):
----------------------------------------------------
   6 - SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND
       MOD(TO_NUMBER("V1"),10)=1 (accessing 'ORCL@LOOPBACK' )

Note
-----
   - Degree of Parallelism is 3 because of hint
I have hacked this output a little – the “Remote SQL” section didn’t get reported by display_cursor(), so I’ve inserted the remote SQL I got from a call to dbms_xplan.display() after using explain plan to generate a plan. Note the /*+ shared(3) */ hint that appears in the remote SQL – that’s the internal version of a parallel(3) hint.
In 11g the query completed in 2.4 seconds; in 12c the query took nearly 75 seconds to run, thanks to the 12c enhancement that allowed it to obey the hint! Looking at the time column (and ignoring the anomalous 3:17 at operation 4 – which might, roughly, be echoing 3 * 1:14) we can see that the time goes on the calls to the remote database, and a check of v$session_event shows this time spent in “SQL*Net message from db link” (there’s a sketch of that check after the library cache results below). So the obvious thing to do is check what actually happened at the remote database, and we can do that by searching the library cache for a recognizable piece of the remote SQL – here’s the SQL to do that, with the results from 11g followed by the results from 12c:
SQL> select sql_id, child_number, executions, px_servers_executions, sql_text from v$sql
  2  where sql_text like '%SHARED%' and sql_text not like 'select sql_id%';

11g results

SQL_ID        CHILD_NUMBER EXECUTIONS PX_SERVERS_EXECUTIONS
------------- ------------ ---------- ---------------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
c0f292z5czhwk            0      10000                     0
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE MOD(TO_NUMBER("V1"),10)=1 AND "ID"=:1


12c results

SQL_ID        CHILD_NUMBER EXECUTIONS PX_SERVERS_EXECUTIONS
------------- ------------ ---------- ---------------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
7bk51w7vtagwd            0      10000                     0
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND MOD(TO_NUMBER("V1"),10)=1

7bk51w7vtagwd            1          0                 59995
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND MOD(TO_NUMBER("V1"),10)=1
It’s not surprising to see that the query has executed 10,000 times – that’s what we were told by the Starts statistic from dbms_xplan.display_cursor() – but 12c reports 60,000 (give or take a small counting error) PX server executions of the statement. That’s 10,000 executions * degree 3 * the 2 slave sets we saw in my original execution plan. (It’s an odd little quirk of the two versions of Oracle that the order of predicates in the remote SQL was reversed between 11g and 12c – leading to two different SQL_IDs.)
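As an aside, the v$session_event check I mentioned above can be as simple as the following sketch – the SID is a hypothetical value you’d replace with the session you’re watching:

-- Sketch: where did the session's wait time go?  (141 is a made-up SID.)
select
        event, total_waits, time_waited
from
        v$session_event
where
        sid = 141
and     event like 'SQL*Net%'
order by
        time_waited desc;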
By enabling rowsource execution stats at the system level I was able to capture the remote execution plan with its stats:
SQL_ID  7bk51w7vtagwd, child number 0
-------------------------------------
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND
MOD(TO_NUMBER("V1"),10)=1

--------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |      0 |        |      0 |00:00:00.01 |       0 |
|   1 |  PX COORDINATOR                    |          |      0 |        |      0 |00:00:00.01 |       0 |
|   2 |   PX SEND QC (RANDOM)              | :TQ10001 |      0 |      1 |      0 |00:00:00.01 |       0 |
|*  3 |    TABLE ACCESS BY INDEX ROWID     | T2       |  29983 |      1 |      0 |00:00:22.21 |    1000 |
|   4 |     BUFFER SORT                    |          |  29995 |        |    999 |00:00:21.78 |       0 |
|   5 |      PX RECEIVE                    |          |  29924 |      1 |    994 |00:00:21.21 |       0 |
|   6 |       PX SEND HASH (BLOCK ADDRESS) | :TQ10000 |      0 |      1 |      0 |00:00:00.01 |       0 |
|   7 |        PX SELECTOR                 |          |  29993 |        |    999 |00:00:06.08 |   19992 |
|*  8 |         INDEX UNIQUE SCAN          | T2_PK    |  29999 |      1 |   1000 |00:00:00.24 |   20000 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(MOD(TO_NUMBER("V1"),10)=1)
   8 - access("ID"=:1)
Unlike the test case I started with, this output did show the number of starts (with a few missing) and the work done across the slaves. Our index probe had to do two buffer gets on every execution, and we have 10,000 executions of the query, so 20,000 buffer gets on the index unique scan. Even though only one slave actually does any work with the PX SELECTOR, all three slaves in that set seem to “start” the relevant operations. The definition of the data meant that only one index probe in 10 was successful, so we only visited 1,000 buffers from the table. If you’re familiar with reading plans with rowsource execution stats you’ll appreciate that something has gone wrong in the reporting here – that 1,000 at operation 3 should read 21,000 because it ought to include the 20,000 from the index scan (at least, that’s what a serial plan would do).
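For completeness – capturing the remote plan “at the system level” needed nothing more exotic than something like the following on the remote (in this case loopback) instance; don’t leave it set on a busy production system, since rowsource statistics add measurable overhead:

alter system set statistics_level = all;

-- ... run the test, pull the plan with dbms_xplan.display_cursor() ...

alter system set statistics_level = typical;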
If you’re still wondering why running this query as a parallel query should take so long – after all it’s only 10,000 executions in 70 seconds – bear in mind that Oracle has to allocate 6 PX servers to new sessions, and then deallocate them, for every single execution; the instance activity stats showed “logons cumulative” going up by 60,000 each time I ran the driving query: that’s about 850 logons (and logoffs) per second. I don’t think my test machine would give a realistic impression of the impact of a couple of copies of this query running simultaneously, but when I tried it the contention this introduced increased the run time to 93 seconds.
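The logon figure is easy to check for yourself – take a snapshot of the statistic before and after running the driving query and look at the difference:

-- Instance-level count of session logons since startup; the delta
-- across one run of the driving query was roughly 60,000.
select
        name, value
from
        v$sysstat
where
        name = 'logons cumulative';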
tl;dr
Watch out for poor performance becoming disastrous for distributed parallel queries when you upgrade from 11g to 12c.
Update (May 2017)
Thanks to Mauro Pagano for identifying this as fix 13345888 (check v$system_fix_control), introduced in 12.1.0.1.
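If you want to see the fix (and experiment with disabling it) on your own system, a sketch:

-- Check the fix and its current setting:
select
        bugno, value, description, optimizer_feature_enable
from
        v$system_fix_control
where
        bugno = 13345888;

-- Disable it experimentally at the session level (test systems only):
alter session set "_fix_control" = '13345888:0';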
Update (Sept 2021)
Still does the same horrible thing in 19.11.0.0 – even with fix_control 13345888 set to zero. To avoid getting the shared(3) hint sent to the remote database, the hint in the main query needs to go from statement level to object level, viz: /*+ parallel(t1 3) */ – as in the sketch below.
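In other words the driving query from the demo becomes (only the hint has changed):

select
        /*+ parallel(t1 3) */
        t1.v1, t2.v1
from
        t1,
        t2@&m_remote
where
        mod(t1.id,10) = 0
and     t2.id = t1.id
and     mod(to_number(t2.v1),10) = 1
;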
Comments and related questions are welcome.