A quirky little feature of interval partitioning showed up on Twitter today – a parallel insert that would use only a single PX slave to do the inserting. With 1.1 billion rows and the option of running parallel 32 this made the loading process rather slower than it ought to have been.
Fortunately it’s quite easy to model (and work around) the oddity. So here’s a small data set and an empty partitioned table to work with:
rem
rem     Script:         pt_int_load_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2020
rem

create table t1 nologging
as
select  ao.*
from
        all_Objects ao,
        (select rownum id from dual connect by level <= 20)
;

create table pt1
partition by range (object_id) interval (1000000) (
        partition p1 values less than (1)
)
as
select  *
from    all_Objects
where   rownum = 0
/
I’ve created a table by copying all_objects 20 times which, for my little sandbox, has given me a total of about 1.2M rows. Then I’ve created an empty interval-partitioned clone of all_objects, with the first partition defined to hold all rows where the object_id is less than 1 (and there’s no object in the database that could match that criterion). I’ve defined the interval to be 1,000,000 and since the highest object_id in my database is about 90,000 the first partition that gets added to this table will be able to hold all the data from t1.
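Before running the insert it’s worth a quick sanity check that the highest object_id really is well below the first interval boundary, so that all the rows will end up in a single (new) partition – something like:

select max(object_id) from t1;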
So now we try to do a parallel insert from t1 into pt1, and check the execution plan and parallel execution statistics:
set serveroutput off

insert /*+ append enable_parallel_dml parallel(6) */ into pt1
select * from t1;

select * from table(dbms_xplan.display_cursor);

start pq_tqstat
Note how I’ve used the hint /*+ enable_parallel_dml */ (possibly a 12c hint back-ported to 11.2.0.4) rather than an “alter session” command – it’s just a little convenience to be able to embed the directive in the SQL. The pq_tqstat script is one I published some time ago to report the contents of the session-specific dynamic performance view v$pq_tqstat immediately after running a parallel statement.
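If you don’t have a copy of the script to hand, a minimal sketch of the sort of query it runs (omitting the share percentages that the published version calculates) might look like this – remember it has to run in the same session, immediately after the parallel statement:

select
        dfo_number, tq_id, server_type, instance, process,
        num_rows, bytes, waits, timeouts, avg_latency
from
        v$pq_tqstat
order by
        dfo_number, tq_id, server_type desc, instance, process
;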
Here’s the plan:
SQL_ID  25hub68pf1z1s, child number 0
-------------------------------------
insert /*+ append enable_parallel_dml parallel(6) */ into pt1 select *
from t1

Plan hash value: 2888707464

-------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                            |          |       |       |   631 (100)|          |        |      |            |
|   1 |  PX COORDINATOR                             |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)                       | :TQ10001 |  1235K|   159M|   631  (10)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT (HIGH WATER MARK BROKERED)| PT1      |       |       |            |          |  Q1,01 | PCWP |            |
|   4 |     OPTIMIZER STATISTICS GATHERING          |          |  1235K|   159M|   631  (10)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX RECEIVE                             |          |  1235K|   159M|   631  (10)| 00:00:01 |  Q1,01 | PCWP |            |
|   6 |       PX SEND RANDOM LOCAL                  | :TQ10000 |  1235K|   159M|   631  (10)| 00:00:01 |  Q1,00 | P->P | RANDOM LOCA|
|   7 |        PX BLOCK ITERATOR                    |          |  1235K|   159M|   631  (10)| 00:00:01 |  Q1,00 | PCWC |            |
|*  8 |         TABLE ACCESS FULL                   | T1       |  1235K|   159M|   631  (10)| 00:00:01 |  Q1,00 | PCWP |            |
-------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 6 because of hint
The most important detail of this plan is that the PX slaves do the load as select (operation 3), then send a message to the query coordinator (PX send QC, operation 2) to tell it about the data load. They do not send their data to the QC for the QC to do the load.
So the plan says we will be doing parallel DML, but here’s what v$pq_tqstat tells us:
DFO_NUMBER      TQ_ID SERVER_TYPE     INSTANCE PROCESS           NUM_ROWS      BYTES ROW_SHARE DATA_SHARE      WAITS   TIMEOUTS AVG_LATENCY
---------- ---------- --------------- -------- --------------- ---------- ---------- --------- ---------- ---------- ---------- -----------
         1          0 Producer               1 P006                215880   34785363     17.47      16.86         16          0           0
                                             1 P007                202561   34436325     16.39      16.69         17          0           0
                                             1 P008                207519   34564496     16.79      16.75         17          0           0
                                             1 P009                208408   34594770     16.86      16.77         17          0           0
                                             1 P00A                198915   33529627     16.10      16.25         16          0           0
                                             1 P00B                202537   34430603     16.39      16.69         16          0           0
                      Consumer               1 P000                     0        144      0.00       0.00         51         47           0
                                             1 P001                     0        144      0.00       0.00         51         47           0
                                             1 P002               1235820  206340464    100.00     100.00         75         47           0
                                             1 P003                     0        144      0.00       0.00         51         47           0
                                             1 P004                     0        144      0.00       0.00       1138       1134           0
                                             1 P005                     0        144      0.00       0.00       1137       1133           0

                    1 Producer               1 P000                     0         24      0.00       5.91         51         42           0
                                             1 P001                     0         24      0.00       5.91         50         41           0
                                             1 P002                     2        286    100.00      70.44         58         14           0
                                             1 P003                     0         24      0.00       5.91         51         43           0
                                             1 P004                     0         24      0.00       5.91         51         42           0
                                             1 P005                     0         24      0.00       5.91         51         43           0
                      Consumer               1 QC                       2        406    100.00     100.00        311        179           0

19 rows selected.
The query did run parallel 6 as hinted – and 6 PX slaves scanned the t1 table; but they all sent all their data to one PX slave in the second slave set, and that one PX slave did all the inserts. The plan was parallel, but the execution was effectively serial. (You’ll note there is something a little odd about the waits and timeouts for P004 and P005 when they are acting as consumers. I may worry about that later, but it could be a host-based side effect of running parallel 6 on a VM with 4 CPUs.)
The serialization leads to two questions:

- What went wrong?
- How do we work around this and make the insert “truly” parallel?
My answer to (1) is “I don’t know – but I’ll look at it if necessary”, combined with the guess that it’s something to do with the table having only one partition at the outset, which has an unexpected side effect on the randomising function for the PX distribution.
My answer to (2) is “if I’m right about (1), why not try pre-defining two partitions, and I’ll even let both of them stay empty”.
So here’s my new definition for pt1:
create table pt1
partition by range (object_id) interval (1000000) (
        partition p0 values less than (0),
        partition p1 values less than (1)
)
as
select  *
from    all_Objects
where   rownum = 0
/
Re-running the test with the completely redundant (and permanently empty) p0 partition, the plan doesn’t change but the results from v$pq_tqstat change dramatically:
DFO_NUMBER      TQ_ID SERVER_TYPE     INSTANCE PROCESS           NUM_ROWS      BYTES ROW_SHARE DATA_SHARE      WAITS   TIMEOUTS AVG_LATENCY
---------- ---------- --------------- -------- --------------- ---------- ---------- --------- ---------- ---------- ---------- -----------
         1          0 Producer               1 P006                207897   34581153     16.82      16.76         23          4           0
                                             1 P007                215669   34786429     17.45      16.86         30          5           0
                                             1 P008                221474   36749626     17.92      17.81         28          5           0
                                             1 P009                204959   34497164     16.58      16.72         22          2           0
                                             1 P00A                177755   30141002     14.38      14.61         21          0           0
                                             1 P00B                208066   35585810     16.84      17.25         25          2           0
                      Consumer               1 P000                213129   35612973     17.25      17.26         82         57           0
                                             1 P001                200516   33570586     16.23      16.27         84         55           0
                                             1 P002                203395   33950449     16.46      16.45         83         56           0
                                             1 P003                205458   34235575     16.63      16.59         82         54           0
                                             1 P004                204111   33999932     16.52      16.48        581        555           0
                                             1 P005                209211   34971669     16.93      16.95        580        553           0

                    1 Producer               1 P000                     2        286     16.67      16.67        422        149           0
                                             1 P001                     2        286     16.67      16.67        398        130           0
                                             1 P002                     2        286     16.67      16.67        405        128           0
                                             1 P003                     2        286     16.67      16.67        437        161           0
                                             1 P004                     2        286     16.67      16.67        406        116           0
                                             1 P005                     2        286     16.67      16.67        440        148           0
                      Consumer               1 QC                      12       1716    100.00     100.00        242        111           0

19 rows selected.
Every consumer receives and inserts roughly 200,000 rows – it’s a perfectly fair distribution of the parallel DML. Timings are pretty irrelevant for such a small data set, but the execution time did drop from 7 seconds to 4 seconds when parallelism was working “properly”.
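As a final check you can ask the data dictionary what interval partitioning actually did – a quick query like the one below (not part of the timing test) should report the pre-defined partition(s) plus a single system-generated interval partition holding all the data:

select
        partition_name, high_value, interval
from
        user_tab_partitions
where
        table_name = 'PT1'
order by
        partition_position
;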
I’ve tested this script on 12.2.0.1 and 19.3.0.0 – the same anomaly appears in both versions though it might be worth noting that the strange skew in the waits and timeouts doesn’t appear in 19.3.0.0.