Oracle Scratchpad

June 13, 2019

Parallel Fun – 2

Filed under: Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 6:24 pm BST Jun 13,2019

I started writing this note in March 2015 with the following introductory comment:

A little while ago I wrote a few notes about a very resource-intensive parallel query. One of the points I made about it was that it was easy to model, and then interesting to run on later versions of Oracle. So today I’m going to treat you to a few of the observations and notes I made after modelling the problem; and here’s the SQL to create the underlying objects:

Unfortunately I failed to do anything more with the model I had created until a few days ago (June 2019 – in case I stall again) when a related question came up on the ODC database forum. This time I’m ready to go a little further – so I’ll start with a bait-and-switch approach. Here are the first few lines (omitting the SQL) of an SQL Monitor report from an instance of 18.3 – is this a power-crazed machine or what ?


Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  TEST_USER (169:11324)
 SQL ID              :  73y5quma4jnw4
 SQL Execution ID    :  16777216
 Execution Started   :  06/13/2019 22:06:32
 First Refresh Time  :  06/13/2019 22:06:32
 Last Refresh Time   :  06/13/2019 22:07:03
 Duration            :  31s
 Module/Action       :  MyModule/MyAction
 Service             :  SYS$USERS
 Program             :  sqlplus@linux183.localdomain (TNS V1-V3)
 Fetch Calls         :  591

Global Stats
=========================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs | Bytes |
=========================================================================================
|      14 |    3.18 |     0.00 |        0.05 |       11 |   591 |  25978 |   62 |  13MB |
=========================================================================================

Parallel Execution Details (DOP=3 , Servers Allocated=6730)
==========================================================================================

It didn’t take long to run the query, only about 31 seconds. But the thing to notice in the report is that while the DOP is reported as 3, the number of “Servers Allocated” is a massive 6,730. So the big question – before I show you more of the report, explain what’s happening, and supply the code to build the model: how many PX processes did I actually start.

Here’s a little more of the output:


Parallel Execution Details (DOP=3 , Servers Allocated=6730)
==========================================================================================================================================================
|      Name      | Type  | Group# | Server# | Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Buffer | Read | Read  |        Wait Events         |
|                |       |        |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes |         (sample #)         |
==========================================================================================================================================================
| PX Coordinator | QC    |        |         |      14 |    3.13 |          |        0.05 |       11 |  23727 |      |     . | PX Deq: Join ACK (5)       |
|                |       |        |         |         |         |          |             |          |        |      |       | PX Deq: Signal ACK EXT (2) |
|                |       |        |         |         |         |          |             |          |        |      |       | sql_id: 6405a2hc50bt4 (1)  |
| p004           | Set 1 |      1 |       1 |    0.00 |    0.00 |          |             |          |    180 |      |     . | library cache: mutex X (1) |
|                |       |        |         |         |         |          |             |          |        |      |       |                            |
| p005           | Set 1 |      1 |       2 |    0.00 |    0.00 |          |             |          |    100 |      |     . |                            |
| p006           | Set 1 |      1 |       3 |    0.00 |    0.00 |          |             |          |     90 |      |     . |                            |
| p000           | Set 1 |      2 |       1 |    0.01 |    0.01 |          |             |          |        |      |     . |                            |
| p001           | Set 1 |      2 |       2 |    0.02 |    0.02 |          |             |          |        |      |     . |                            |
| p002           | Set 2 |      2 |       1 |    0.01 |    0.01 |     0.00 |             |          |    944 |   32 |   7MB |                            |
| p003           | Set 2 |      2 |       2 |    0.01 |    0.01 |     0.00 |             |          |    937 |   30 |   7MB |                            |
==========================================================================================================================================================

Despite “allocating” 6,730 servers Oracle is only admitting to having used 7 of them -so let’s take a closer look at how they’re used. There are two groups, and we have one set of 3 slaves in group 1, and two sets of two slaves in group 2. (It looks to me as if the Group# and Type columns should be the other way around given the hierarchy of group / type / server#). We can understand a little more of what these numbers mean if we look at the execution plan – particularly the special columns relating to Data Flow Operations (DFOs) and “DFO trees”.


SQL Plan Monitoring Details (Plan Hash Value=3398913290)
========================================================================================================================================================================
| Id |          Operation           |   Name   |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Mem  | Activity |      Activity Detail       |
|    |                              |          | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | (Max) |   (%)    |        (# samples)         |
========================================================================================================================================================================
|  0 | SELECT STATEMENT             |          |         |      |        32 |     +0 |     1 |     8846 |      |       |     . |     2.70 | Cpu (1)                    |
|  1 |   FILTER                     |          |         |      |        32 |     +0 |     1 |     8846 |      |       |     . |     5.41 | PX Deq: Signal ACK EXT (2) |
|  2 |    PX COORDINATOR            |          |         |      |        32 |     +0 |     5 |     8846 |      |       |     . |          |                            |
|  3 |     PX SEND QC (RANDOM)      | :TQ20002 |    9146 |  128 |        29 |     +2 |     2 |     8846 |      |       |     . |          |                            |
|  4 |      HASH JOIN BUFFERED      |          |    9146 |  128 |        29 |     +2 |     2 |     8846 |      |       |   9MB |          |                            |
|  5 |       PX RECEIVE             |          |    8846 |   11 |        14 |     +2 |     2 |     8846 |      |       |     . |          |                            |
|  6 |        PX SEND HYBRID HASH   | :TQ20000 |    8846 |   11 |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  7 |         STATISTICS COLLECTOR |          |         |      |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  8 |          PX BLOCK ITERATOR   |          |    8846 |   11 |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  9 |           TABLE ACCESS FULL  | T2       |    8846 |   11 |         1 |     +0 |    23 |     8846 |   24 |   1MB |     . |          |                            |
| 10 |       PX RECEIVE             |          |   50000 |  116 |        14 |     +2 |     2 |     2509 |      |       |     . |          |                            |
| 11 |        PX SEND HYBRID HASH   | :TQ20001 |   50000 |  116 |         1 |     +0 |     2 |     2509 |      |       |     . |          |                            |
| 12 |         PX BLOCK ITERATOR    |          |   50000 |  116 |         1 |     +0 |     2 |     2509 |      |       |     . |          |                            |
| 13 |          TABLE ACCESS FULL   | T1       |   50000 |  116 |         1 |     +0 |    26 |     2509 |   38 |  12MB |     . |          |                            |
| 14 |    PX COORDINATOR            |          |         |      |        31 |     +1 |  8978 |     2252 |      |       |     . |    13.51 | PX Deq: Join ACK (5)       |
| 15 |     PX SEND QC (RANDOM)      | :TQ10000 |       1 |   77 |        32 |     +0 |  6667 |     3692 |      |       |     . |          |                            |
| 16 |      PX BLOCK ITERATOR       |          |       1 |   77 |        32 |     +0 |  6667 |    92478 |      |       |     . |     2.70 | Cpu (1)                    |
| 17 |       TABLE ACCESS FULL      | T3       |       1 |   77 |        32 |     +0 | 53118 |    92478 |   32 |   8MB |     . |    67.57 | Cpu (25)                   |
========================================================================================================================================================================

The “Name” column shows us that we have two DFO trees (:TQ2nnnn, and :TQ1nnnn) – this is why we see two “groups” in PX server detail, and why those groups can have difference deggrees of parallelism.

Looking at the general shape of the plan you can see that operation 1 is a FILTER operation with two child operations, one at operation 2 the other at operation 14. So we probably have a filter subquery in place operated as DFO tree #1 while the main query is operated as DFO tree #2. This means the main query is running at DOP = 2 (it’s a hash join with hash distribution so it needs two sets of slave processes so all the details agree with what we’ve learned abaout Group# 2 above); and the subquery is operating a DOP = 3 – and it’s using only one set of slave processes.

There is a little anomaly in the number of Execs of operation 14 – at some point I will examine this more closely, but it might simply be a reporting error that has added the number of Execs of its child operations to its own Execs, it might be something to do with counting in Exec calls by its parent, it might be a side effect of scalar subquery caching. I’ll worry about it when I have a good reason to do so. What I want to look at is the Execs of operations 15/16, the PX Block Iterator / PX Send QC. There are 6,667 reports of PX slave executing, and that matches up quite nicely with the 6,730 reported “Servers Allocated” – so it would appear that Oracle says it’s allocating a server whenever it uses a server. But does it really “allocate” (and, presumably, de-allocate).

Here’s how you find out – you run the query again, taking various snapshot and looking for cross-references. I’ve got some results from v$pq_tqstat and v$pq_slace for the run that produced the SQL Monitor report above, and some of the QC session stats and enqueue stats for a subsequent run. This is what we see:


select  process, count(*) 
from    v$pq_tqstat 
group by 
        process 
order by 
        process
;


PROCESS                    COUNT(*)
------------------------ ----------
P000                              3
P001                              3
P002                              2
P003                              2
P004                           2225
P005                           2214
P006                           2218
QC                             2243


SQL> select slave_name, sessions from V$pq_slave order by slave_name;

SLAV   SESSIONS
---- ----------
P000          1
P001          1
P002          1
P003          1
P004       2242
P005       2242
P006       2242

Key Session Stats
=================
Name                                                                         Value                                                                          
----                                                                         -----                                                                          
opened cursors cumulative                                                    6,955                                                                          
user calls                                                                  20,631                                                                          
recursive calls                                                             20,895                                                                          
enqueue requests                                                            22,699                                                                          
enqueue conversions                                                         13,610                                                                          
enqueue releases                                                            15,894                                                                          
parse count (total)                                                          6,857                                                                          
execute count                                                                6,966                                                                          
DFO trees parallelized                                                           2
Parallel operations not downgraded                                           2,268

Key Enqueue Stats
=================
Type Short name                   Requests       Waits     Success      Failed    Wait m/s                                                                  
---- ----------                   --------       -----     -------      ------    --------                                                                  
DA   Slave Process Array             2,272          13       2,272           0          43                                                                  
PS   contention                     27,160       1,586      27,080           7         415                                                                  
SE   contention                      6,784           0       6,785           0           0                                                                  

TYPE                 DESCRIPTION
-------------------- ------------------------------------------------------------------------
PS                   Parallel Execution Server Process reservation and synchronization
DA                   Slave Process Spawn reservation and synchronization
SE                   Lock used by transparent session migration

Oracle really did start and stop something like 6,700 PX session (constantly re-using the same small set of PX slave processes) for each execution of the filter subquery. This is definitely a performance threat – we keep acquiring and releasing PX slaves, we keep creating new sessions (yes, really), and we keep searching for cursors in the library cache. All these activities are highly contentious. If you start running multiple queries that do this sort of thing you find that you see increasing amounts of time being spent on latch contention, PX slave allocation, mutex waits, and all the other problems you get with sessions that log on, do virtually nothing, then log off in rapid succession.

So how do you write SQL that does this type of thing. Here’s my data model (you may want to limit the number of rows in the tables:


create table t1 as
select * from all_source;

create table t2 as
select * from all_source where mod(line,20) = 1;

create table t3 as
select * from all_source;

And here’s all you have to do to start creating problems – I’ve added explicit hints to force parallelism (particularly for the subquery), it’s more likely that it has been introduced accidentally by table or index definitions, or by an “alter session” to “force parallel”:


set feedback only

select
        /*+ 
                parallel(t1 2) 
                parallel(t2 2)
                leading(t1 t2)
                use_hash(t2)
                swap_join_inputs(t2)
                pq_distribute(t2 hash hash)
                cardinality(t1,50000)
        */
        t1.owner,
        t1.name,
        t1.type
from
        t1
join
        t2
on      t2.owner = t1.owner
and     t2.name = t1.name
and     t2.type = t1.type
where
        t1.line = 1
and     (
           mod(length(t1.text), 10) = 0
        or exists (
                select --+ parallel(t3 3) 
                        null
                from    t3
                where   t3.owner = t1.owner
                and     t3.name = t1.name
                and     t3.type = t1.type
                and     t3.line >= t1.line
                )
        )
;

set feedback on

I’ve written notes in the past about SQL that forces the optimizer to run subqueries as filter subqueries instead of unnesting them – this is just an example of that type of query, pushed into parallelism. It’s not the only way (see comment #1 from Dmitry Remizov below) to end up with scalar subqueries being executed many times as separate DFO trees even though Oracle has enhanced the optimizer several times over the years in ways that bypass the threat – but the probalm can still appear and it’s important to notice in development that you’ve got a query that Oracle can’t work around.

 

3 Comments »

  1. Hi Jonathan,

    I’m not sure that I’ve observed the same issue recently but looks very similar (tested on 11.2.0.4, 12.1).
    it was related to CASE case with internal subqueries inside.

    I’ve managed to created my simplified test case:

    DROP TABLE MAIN;  
    DROP TABLE TST1;  
    DROP TABLE TST2;  
      
    CREATE TABLE TST1 AS   
    SELECT rownum id,  
           1e5 * MOD(rownum + 7, 11) id2,  
           rpad('A', MOD(rownum + 17, 7), 'A') filler  
      FROM dual  
    CONNECT BY LEVEL <= 1e3;  
      
    CREATE TABLE TST2 AS   
    SELECT rownum id,  
           1e5 * MOD(rownum + 7, 27) id2,  
           rpad('A', MOD(rownum + 17, 9), 'A') filler  
      FROM dual  
    CONNECT BY LEVEL <= 1e2;  
      
    CREATE TABLE MAIN AS   
    SELECT rownum id,  
           MOD(1e5 - rownum + 7, 23) id2,  
           rpad('A', MOD(1e4 - rownum + 17, 17), 'A') filler  
      FROM dual  
    CONNECT BY LEVEL <= 2e4;  
    
    
    -------------------------- end tables preparation ---------------------------------------
    
    ALTER SESSION FORCE PARALLEL QUERY PARALLEL 2; 
    
    
    SELECT * FROM (  
    select  /*+ MONITOR */  
        CASE WHEN id2 IN (SELECT  id2 FROM TST1) THEN 'A1'  
             WHEN id2 IN (SELECT  id2 FROM TST2) THEN 'A2'  
             ELSE 'C'    
        END ctg,  
        m.*  
    from main m);
    
    

    Comment by Dmitry Remizov — June 13, 2019 @ 6:51 pm BST Jun 13,2019 | Reply

    • Dmitry,

      That’s one of the ways that this type of report can appear
      (I’ll have to use Plan B for my example now ;)

      I edited your comment to put in the “sourcecode” (in square brackets) tags that make it fixed font. I haven’t done a test run, so I hope I didn’t make any mistakes while doing the edit.

      Comment by Jonathan Lewis — June 13, 2019 @ 7:10 pm BST Jun 13,2019 | Reply

  2. […] By a lucky coincidence this allowed me to rediscover and publish an old example of parallel execution gone wild before moving on to talk about the fundamental problem exhibited in the latest […]

    Pingback by Can’t Unnest | Oracle Scratchpad — June 17, 2019 @ 3:36 pm BST Jun 17,2019 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.