Oracle Scratchpad

November 20, 2023

gby_pushdown

Filed under: CBO,Execution plans,Hints,Oracle,Parallel Execution — Jonathan Lewis @ 6:42 pm GMT Nov 20,2023

This is a note that’s been awaiting completion for nearly 10 years. It’s about a feature (or, at least, a hint for the feature) that appeared in 10.2.0.5 to control some of the inter-process messaging that takes place in parallel execution.

It’s a very simple idea that can make a significant difference in CPU usage for large parallel queries – can you aggregate the raw data before distributing it between slave sets (minimising the volume of traffic) or should you simply distribute the data and aggregate late to avoid aggregating twice. The strategy of aggregating early is known as “group by pushdown”.

I’ll start with a script to create a simple data set and a trivial query with a parallel hint:

rem
rem     Script:         gby_pushdown.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2023
rem
rem     Last tested 
rem             19.11.0.0
rem             12.2.0.1
rem

set linesize 180
set pagesize 60
set trimspool on
set tab off

create table t1 
as 
select 
        * 
from 
        all_objects 
where 
        rownum <= 50000
/


alter session set statistics_level = all;

set serveroutput off

prompt  ==================================================
prompt  Default (for this data) pushdown and hash group by
prompt  ==================================================

set feedback only

select 
        /*+ 
                qb_name(main)
                parallel(t1 2) 
--              no_gby_pushdown(@main)
--              no_use_hash_gby_for_pushdown(@main)
        */ 
        owner, object_type, count(*) ct
from 
        t1 
group by 
        owner, object_type
order by
        owner, object_type
/

set feedback on
select * from table(dbms_xplan.display_cursor(format=>'outline allstats cost hint_report -predicate'));

In the absence of any hints (apart from the qb_name() and parallel() hints), the plan I get from the query is as follows:

Plan hash value: 2919148568

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |    77 (100)|     96 |00:00:00.07 |       5 |      0 |       |       |          |
|   1 |  PX COORDINATOR          |          |      1 |        |            |     96 |00:00:00.07 |       5 |      0 | 11264 | 11264 |     1/0/0|
|   2 |   PX SEND QC (ORDER)     | :TQ10001 |      0 |    276 |    77   (8)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   3 |    SORT GROUP BY         |          |      2 |    276 |    77   (8)|     96 |00:00:00.01 |       0 |      0 |  4096 |  4096 |     2/0/0|
|   4 |     PX RECEIVE           |          |      2 |    276 |    77   (8)|    121 |00:00:00.01 |       0 |      0 |       |       |          |
|   5 |      PX SEND RANGE       | :TQ10000 |      0 |    276 |    77   (8)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   6 |       HASH GROUP BY      |          |      2 |    276 |    77   (8)|    121 |00:00:00.04 |    1043 |    991 |  1079K|  1079K|     2/0/0|
|   7 |        PX BLOCK ITERATOR |          |      2 |  50000 |    73   (3)|  50000 |00:00:00.02 |    1043 |    991 |       |       |          |
|*  8 |         TABLE ACCESS FULL| T1       |     26 |  50000 |    73   (3)|  50000 |00:00:00.01 |    1043 |    991 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"MAIN")
      FULL(@"MAIN" "T1"@"MAIN")
      GBY_PUSHDOWN(@"MAIN")
      USE_HASH_GBY_FOR_PUSHDOWN(@"MAIN")
      END_OUTLINE_DATA
  */

You’ll notice in the Outline Data that Oracle has recorded the use_hash_gby_for_pushdown() hint and the gby_pushdown() hint. I’ll be repeating the query disabling each of these hints in turn – which is why the negative versions of the hints appear as comments in my original query.

If you look at operation 6 of the plan you can see that the optimizer has chosen to use a hash group by to aggregate the selected rows, reducing 50,000 rows to 121 rows. We could query v$pq_tqstat, or run the SQL Monitor report to get more detail about how much work each PX process did, but it’s fairly safe to assume that it was shared reasonably evenly between the two processes.

After aggregating their data the first layer of PX processes distributes the results by range (operation 5, PX Send Range) to the second layer of PX processes, which re-aggregate the much reduced data set. At this point Oracle chooses to aggregate by sorting (Sort Group By) as this will deal with the order by clause at the same time. (Note: you could tell Oracle to split the aggregation and ordering by adding a use_hash_aggregation hint to the query.)

With my data set it’s fairly clear that it’s a good idea to do this “two stage” aggregation because the initial raw data is reduced by a very large factor the first layer of PX processes before they forward the results – and the reduction in inter-process messaging is likely to be a good idea.

There may be cases, of course, where some feature of the data pattern means that two-stage aggregation is a good idea, but aggregating by a sort is more efficient than an aggregation by hashing – a cardinality or clustering estimate might have persuaded the optimizer to make the wrong choice – so let’s see what happens to the plan if we enable the no_use_hash_gby_for_pushdown() hint:

lan hash value: 3954212205

---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |    77 (100)|     96 |00:00:00.14 |       5 |       |       |          |
|   1 |  PX COORDINATOR          |          |      1 |        |            |     96 |00:00:00.14 |       5 | 11264 | 11264 |     1/0/0|
|   2 |   PX SEND QC (ORDER)     | :TQ10001 |      0 |    276 |    77   (8)|      0 |00:00:00.01 |       0 |       |       |          |
|   3 |    SORT GROUP BY         |          |      2 |    276 |    77   (8)|     96 |00:00:00.03 |       0 |  4096 |  4096 |     2/0/0|
|   4 |     PX RECEIVE           |          |      2 |    276 |    77   (8)|    120 |00:00:00.03 |       0 |       |       |          |
|   5 |      PX SEND RANGE       | :TQ10000 |      0 |    276 |    77   (8)|      0 |00:00:00.01 |       0 |       |       |          |
|   6 |       SORT GROUP BY      |          |      2 |    276 |    77   (8)|    120 |00:00:00.13 |    1043 |  9216 |  9216 |     2/0/0|
|   7 |        PX BLOCK ITERATOR |          |      2 |  50000 |    73   (3)|  50000 |00:00:00.07 |    1043 |       |       |          |
|*  8 |         TABLE ACCESS FULL| T1       |     26 |  50000 |    73   (3)|  50000 |00:00:00.02 |    1043 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

It doesn’t really make any difference in this very small test case, though the sorting does seem to have take a fraction of a second more CPU. The key change is that operation 6 has become a Sort Group By.

There is an interesting detail to look out for, though, in the Outline Data and Hint Report:

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"MAIN")
      FULL(@"MAIN" "T1"@"MAIN")
      GBY_PUSHDOWN(@"MAIN")
      END_OUTLINE_DATA
  */

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   1 -  MAIN
           -  qb_name(main)

   8 -  MAIN / T1@MAIN
           -  parallel(t1 2)

I added the hint /*+ no_use_hash_gby_for_pushdown(@main) */ to the query, and the hint has clearly worked; but it’s not reported in the Hint Report, and it doesn’t appear in the Outline Data. This suggests that if you created an SQL Baseline for this query with this hint in place the plan would not reproduce because SQL Baseline would be missing the critical hint. (To be tested – left as an exercise to readers.)

The next step is to enable the no_gby_pushdown() hint. For the purposes of the note this is the important one. It’s also the one that you are more likely to use as it’s fairly obvious (if you know the data) when it’s a good idea to use it. (In some cases, of course, the optimizer might have avoided “group by pushdown” when it should have used it – in which case you’d be using the gby_pushdown() hint.) Here’s the plan when I block “group by pushdown”.

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |      1 |        |    77 (100)|     96 |00:00:00.08 |       5 |       |       |          |
|   1 |  PX COORDINATOR         |          |      1 |        |            |     96 |00:00:00.08 |       5 |  6144 |  6144 |     1/0/0|
|   2 |   PX SEND QC (ORDER)    | :TQ10001 |      0 |    276 |    77   (8)|      0 |00:00:00.01 |       0 |       |       |          |
|   3 |    SORT GROUP BY        |          |      2 |    276 |    77   (8)|     96 |00:00:00.11 |       0 |  4096 |  4096 |     2/0/0|
|   4 |     PX RECEIVE          |          |      2 |  50000 |    73   (3)|  50000 |00:00:00.07 |       0 |       |       |          |
|   5 |      PX SEND RANGE      | :TQ10000 |      0 |  50000 |    73   (3)|      0 |00:00:00.01 |       0 |       |       |          |
|   6 |       PX BLOCK ITERATOR |          |      4 |  50000 |    73   (3)|  50091 |00:00:00.02 |    1051 |       |       |          |
|*  7 |        TABLE ACCESS FULL| T1       |     28 |  50000 |    73   (3)|  50091 |00:00:00.01 |    1051 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"MAIN")
      FULL(@"MAIN" "T1"@"MAIN")
      END_OUTLINE_DATA
  */

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3
---------------------------------------------------------------------------
   1 -  MAIN
           -  no_gby_pushdown(@main)
           -  qb_name(main)

   7 -  MAIN / T1@MAIN
           -  parallel(t1 2)

Key details to highlight here are:

  • There’s only one aggregation step, appearing at operation 3 after the PX SEND/RECEIVE
  • 50,000 rows are passed from slave set 1 (operation 6) to slave set 2 (operation 4).
  • The no_gby_pushdown(@main) does appear in the Hint Report
  • The no_gby_pushdown(@main) doesn’t appear in the Outline Data (which is, again, a little worrying).

Again, with this small data set, the change in plan isn’t going to make much difference to the end user, but you may find cases where there is a best choice to keep the user sufficiently happy and save machine resources.

Summary

When using parallel query the optimizer may choose to aggregate the data in two steps so that the current rowsource is reduced by one set of PX processes before it is passed to the next set of PX processes that does a second aggregation step. This tends to be a good idea if the first set of processes can reduce the size of the data set by a large factor and save a lot of machine resources on the subsequence inter-process communication; but it is a bad idea if the data doesn’t reduce and a large volume of data ends up being aggregated in both sets of PX processes.

You can over-ride the optimizer’s choice with the [no_]gby_pushdown() hint.

4 Comments »

  1. […] gby_pushdown (Nov 2023): a hint to change the way parallel query handles aggregation […]

    Pingback by Hinting Catalogue | Oracle Scratchpad — November 20, 2023 @ 6:48 pm GMT Nov 20,2023 | Reply

  2. […] gby_pushdown (Nov 2023): a hint to change the way parallel query handles aggregation […]

    Pingback by Parallel Execution Catalogue | Oracle Scratchpad — November 20, 2023 @ 6:49 pm GMT Nov 20,2023 | Reply

  3. Many thanks Jonathan,
    Especially for remembering the existence of these hints no_gby_pushdown and gby_pushdown.
    In my cases, there is often a lack of efficient group pushdown for early aggregation in optimizers decision. So I enforced this with subselects and no_merge hints.
    I’m curious if gby_pushdown does this more smoothly without rewriting the statement.
    Best regards
    Peter

    Comment by Anonymous — November 20, 2023 @ 8:49 pm GMT Nov 20,2023 | Reply

    • Thanks for the comment.

      It sounds as if the transformation you want to see is “group by placement” rather than this “group by pushdown”. The former is a logical rewrite of the statement that aims make it more efficient, the latter is just a mechanical detail of how the run-time engine should behave.

      I wrote an introductory note to “group by placement” quite a long time ago that might be of some help.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — November 21, 2023 @ 12:45 pm GMT Nov 21,2023 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.