Oracle Scratchpad

September 13, 2013

Quiz Night

Filed under: Exadata,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 6:32 pm BST Sep 13,2013

Here’s a little quiz about Bloom filtering. There seem to be at least three different classes of query where Bloom filters can come into play – all involving hash joins: partition elimination, aggregate reduction on non-mergeable aggregate views, and parallelism.

This quiz is about parallel queries – and all you have to do is work out how many Bloom filters were used in the following two execution plans (produced by 11.2.0.2), and where they were used.

I’ve got 4 tables, 3 very small dimensions and one large fact. I’ve joined the three dimensions to the fact on their primary key, and filtered on each dimension (SQL for the test supplied here). Stripping out the eighteen hints that I inserted to get the plans I wanted the queries both looked like this:

select
	count(t1.small_vc),
	count(t2.small_vc),
	count(t3.small_vc),
	count(t4.small_vc)
from
	t4,
	t1,
	t2,
	t3
where
	t1.id = t4.id1
and	t2.id = t4.id2
and	t3.id = t4.id3
and	t1.small_vc in (1,2,3)
and	t2.small_vc in (1,2,3)
and	t3.small_vc in (1,2,3)
;

The t4 table is the fact table and t1, t2, t3 are the dimensions. Here are two execution plans; in both cases I’ve hinted a path that produces three consecutive hash joins, building in-memory hash tables from the dimension tables, before scanning the fact table and probing the three dimensions in turn. I captured the plans from memory with calls to dbms_xplan.display_cursor() after running the queries.

Plan 1 – which uses the “broadcast” distribution for the dimension tables.

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |   464 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE             |          |     1 |    38 |            |          |        |      |            |
|   2 |   PX COORDINATOR            |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)      | :TQ10003 |     1 |    38 |            |          |  Q1,03 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE          |          |     1 |    38 |            |          |  Q1,03 | PCWP |            |
|*  5 |      HASH JOIN              |          |    26 |   988 |   464   (2)| 00:00:06 |  Q1,03 | PCWP |            |
|   6 |       PX RECEIVE            |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
|   7 |        PX SEND BROADCAST    | :TQ10000 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | P->P | BROADCAST  |
|   8 |         PX BLOCK ITERATOR   |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|*  9 |          TABLE ACCESS FULL  | T3       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
|* 10 |       HASH JOIN             |          |   612 | 19584 |   461   (1)| 00:00:06 |  Q1,03 | PCWP |            |
|  11 |        PX RECEIVE           |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
|  12 |         PX SEND BROADCAST   | :TQ10001 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,01 | P->P | BROADCAST  |
|  13 |          PX BLOCK ITERATOR  |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|* 14 |           TABLE ACCESS FULL | T2       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|* 15 |        HASH JOIN            |          | 14491 |   367K|   459   (1)| 00:00:06 |  Q1,03 | PCWP |            |
|  16 |         PX RECEIVE          |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
|  17 |          PX SEND BROADCAST  | :TQ10002 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | P->P | BROADCAST  |
|  18 |           PX BLOCK ITERATOR |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | PCWC |            |
|* 19 |            TABLE ACCESS FULL| T1       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|  20 |         PX BLOCK ITERATOR   |          |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,03 | PCWC |            |
|* 21 |          TABLE ACCESS FULL  | T4       |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,03 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T3"."ID"="T4"."ID3")
   9 - access(:Z>=:Z AND :Z<=:Z)
       filter((TO_NUMBER("T3"."SMALL_VC")=1 OR TO_NUMBER("T3"."SMALL_VC")=2 OR TO_NUMBER("T3"."SMALL_VC")=3))
  10 - access("T2"."ID"="T4"."ID2")
  14 - access(:Z>=:Z AND :Z<=:Z)
       filter((TO_NUMBER("T2"."SMALL_VC")=1 OR TO_NUMBER("T2"."SMALL_VC")=2 OR TO_NUMBER("T2"."SMALL_VC")=3)) 
  15 - access("T1"."ID"="T4"."ID1") 
  19 - access(:Z>=:Z AND :Z<=:Z) 
       filter((TO_NUMBER("T1"."SMALL_VC")=1 OR TO_NUMBER("T1"."SMALL_VC")=2 OR TO_NUMBER("T1"."SMALL_VC")=3)) 
  21 - access(:Z>=:Z AND :Z<=:Z) 
       filter(SYS_OP_BLOOM_FILTER_LIST(SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID1"),SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID2"),SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID3")))

Plan 2 – which uses the “hash” distribution for the dimension tables.

---------------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib | 
---------------------------------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT                   |          |       |       |   464 (100)|          |        |      |            | 
|   1 |  SORT AGGREGATE                    |          |     1 |    38 |            |          |        |      |            | 
|   2 |   PX COORDINATOR                   |          |       |       |            |          |        |      |            | 
|   3 |    PX SEND QC (RANDOM)             | :TQ10006 |     1 |    38 |            |          |  Q1,06 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE                 |          |     1 |    38 |            |          |  Q1,06 | PCWP |            |
|*  5 |      HASH JOIN                     |          |    26 |   988 |   464   (2)| 00:00:06 |  Q1,06 | PCWP |            |
|   6 |       JOIN FILTER CREATE           | :BF0000  |     3 |    18 |     2   (0)| 00:00:01 |  Q1,06 | PCWP |            |
|   7 |        PX RECEIVE                  |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,06 | PCWP |            |
|   8 |         PX SEND HASH               | :TQ10004 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,04 | P->P | HASH       |
|   9 |          PX BLOCK ITERATOR         |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,04 | PCWC |            |
|* 10 |           TABLE ACCESS FULL        | T3       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,04 | PCWP |            |
|  11 |       PX RECEIVE                   |          |   612 | 19584 |   461   (1)| 00:00:06 |  Q1,06 | PCWP |            |
|  12 |        PX SEND HASH                | :TQ10005 |   612 | 19584 |   461   (1)| 00:00:06 |  Q1,05 | P->P | HASH       |
|  13 |         JOIN FILTER USE            | :BF0000  |   612 | 19584 |   461   (1)| 00:00:06 |  Q1,05 | PCWP |            |
|* 14 |          HASH JOIN BUFFERED        |          |   612 | 19584 |   461   (1)| 00:00:06 |  Q1,05 | PCWP |            |
|  15 |           JOIN FILTER CREATE       | :BF0001  |     3 |    18 |     2   (0)| 00:00:01 |  Q1,05 | PCWP |            |
|  16 |            PX RECEIVE              |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,05 | PCWP |            |
|  17 |             PX SEND HASH           | :TQ10002 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | P->P | HASH       |
|  18 |              PX BLOCK ITERATOR     |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | PCWC |            |
|* 19 |               TABLE ACCESS FULL    | T2       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
|  20 |           PX RECEIVE               |          | 14491 |   367K|   459   (1)| 00:00:06 |  Q1,05 | PCWP |            |
|  21 |            PX SEND HASH            | :TQ10003 | 14491 |   367K|   459   (1)| 00:00:06 |  Q1,03 | P->P | HASH       |
|  22 |             JOIN FILTER USE        | :BF0001  | 14491 |   367K|   459   (1)| 00:00:06 |  Q1,03 | PCWP |            |
|* 23 |              HASH JOIN BUFFERED    |          | 14491 |   367K|   459   (1)| 00:00:06 |  Q1,03 | PCWP |            |
|  24 |               JOIN FILTER CREATE   | :BF0002  |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
|  25 |                PX RECEIVE          |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
|  26 |                 PX SEND HASH       | :TQ10000 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | P->P | HASH       |
|  27 |                  PX BLOCK ITERATOR |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|* 28 |                   TABLE ACCESS FULL| T1       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
|  29 |               PX RECEIVE           |          |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,03 | PCWP |            |
|  30 |                PX SEND HASH        | :TQ10001 |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,01 | P->P | HASH       |
|  31 |                 JOIN FILTER USE    | :BF0002  |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,01 | PCWP |            |
|  32 |                  PX BLOCK ITERATOR |          |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,01 | PCWC |            |
|* 33 |                   TABLE ACCESS FULL| T4       |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,01 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T3"."ID"="T4"."ID3")
  10 - access(:Z>=:Z AND :Z<=:Z) 
       filter((TO_NUMBER("T3"."SMALL_VC")=1 OR TO_NUMBER("T3"."SMALL_VC")=2 OR TO_NUMBER("T3"."SMALL_VC")=3)) 
  14 - access("T2"."ID"="T4"."ID2")   19 - access(:Z>=:Z AND :Z<=:Z) 
       filter((TO_NUMBER("T2"."SMALL_VC")=1 OR TO_NUMBER("T2"."SMALL_VC")=2 OR TO_NUMBER("T2"."SMALL_VC")=3)) 
  23 - access("T1"."ID"="T4"."ID1")   28 - access(:Z>=:Z AND :Z<=:Z) 
       filter((TO_NUMBER("T1"."SMALL_VC")=1 OR TO_NUMBER("T1"."SMALL_VC")=2 OR TO_NUMBER("T1"."SMALL_VC")=3)) 
  33 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID1"))

If you can answer the Bloom filter question then here’s a bonus question – what is the significance of the access(:Z>=:Z AND :Z<=:Z) predicates that appear all over the place ?

It’s quite important to be able to answer the Bloom filter question correctly if you’re running parallel queries – especially if you’re on an Exadata machine – since it can make a big difference to the performance. The significance of the Exadata comment is that Bloom filters can be pushed down to the cell servers, which could reduce the traffic back to the database servers quite dramatically (and reduce the memory and CPU utilisation of the database servers at the same time).

Update (Feb 2014)

As often happens, each new version of Oracle fills some of the gaps left in older versions – here’s the execution plan for the “broadcast” query from 12.1.0.1 (from v$sql_monitor); you might like to compare it with Sayan’s “prediction” in the comments:


Plan hash value: 722858074
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Starts | E-Rows |E-Bytes| Cost  |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |      1 |        |       |   345 |        |      |            |      1 |00:00:00.03 |       5 |      0 |       |       |          |
|   1 |  SORT AGGREGATE              |          |      1 |      1 |    38 |       |        |      |            |      1 |00:00:00.03 |       5 |      0 |       |       |          |
|   2 |   PX COORDINATOR             |          |      1 |        |       |       |        |      |            |      3 |00:00:00.03 |       5 |      0 |       |       |          |
|   3 |    PX SEND QC (RANDOM)       | :TQ10000 |      0 |      1 |    38 |       |  Q1,00 | P->S | QC (RAND)  |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   4 |     SORT AGGREGATE           |          |      3 |      1 |    38 |       |  Q1,00 | PCWP |            |      3 |00:00:00.08 |    6158 |   6018 |       |       |          |
|*  5 |      HASH JOIN               |          |      3 |     26 |   988 |   345 |  Q1,00 | PCWP |            |     27 |00:00:00.08 |    6158 |   6018 |  2061K|  2061K|     3/0/0|
|   6 |       JOIN FILTER CREATE     | :BF0000  |      3 |      3 |    18 |     2 |  Q1,00 | PCWP |            |      9 |00:00:00.01 |      12 |      0 |       |       |          |
|*  7 |        TABLE ACCESS FULL     | T3       |      3 |      3 |    18 |     2 |  Q1,00 | PCWP |            |      9 |00:00:00.01 |      12 |      0 |       |       |          |
|*  8 |       HASH JOIN              |          |      3 |    612 | 19584 |   343 |  Q1,00 | PCWP |            |     27 |00:00:00.08 |    6146 |   6018 |  2061K|  2061K|     3/0/0|
|   9 |        JOIN FILTER CREATE    | :BF0001  |      3 |      3 |    18 |     2 |  Q1,00 | PCWP |            |      9 |00:00:00.01 |      12 |      0 |       |       |          |
|* 10 |         TABLE ACCESS FULL    | T2       |      3 |      3 |    18 |     2 |  Q1,00 | PCWP |            |      9 |00:00:00.01 |      12 |      0 |       |       |          |
|* 11 |        HASH JOIN             |          |      3 |  14491 |   367K|   341 |  Q1,00 | PCWP |            |     27 |00:00:00.07 |    6134 |   6018 |  2061K|  2061K|     3/0/0|
|  12 |         JOIN FILTER CREATE   | :BF0002  |      3 |      3 |    18 |     2 |  Q1,00 | PCWP |            |      9 |00:00:00.01 |      12 |      0 |       |       |          |
|* 13 |          TABLE ACCESS FULL   | T1       |      3 |      3 |    18 |     2 |  Q1,00 | PCWP |            |      9 |00:00:00.01 |      12 |      0 |       |       |          |
|  14 |         JOIN FILTER USE      | :BF0000  |      3 |    343K|  6699K|   339 |  Q1,00 | PCWP |            |     27 |00:00:00.07 |    6122 |   6018 |       |       |          |
|  15 |          JOIN FILTER USE     | :BF0001  |      3 |    343K|  6699K|   339 |  Q1,00 | PCWP |            |     27 |00:00:00.07 |    6122 |   6018 |       |       |          |
|  16 |           JOIN FILTER USE    | :BF0002  |      3 |    343K|  6699K|   339 |  Q1,00 | PCWP |            |     27 |00:00:00.07 |    6122 |   6018 |       |       |          |
|  17 |            PX BLOCK ITERATOR |          |      3 |    343K|  6699K|   339 |  Q1,00 | PCWC |            |     27 |00:00:00.07 |    6122 |   6018 |       |       |          |
|* 18 |             TABLE ACCESS FULL| T4       |     52 |    343K|  6699K|   339 |  Q1,00 | PCWP |            |     27 |00:00:00.04 |    6122 |   6018 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T3"."ID"="T4"."ID3")
   7 - filter((TO_NUMBER("T3"."SMALL_VC")=1 OR TO_NUMBER("T3"."SMALL_VC")=2 OR TO_NUMBER("T3"."SMALL_VC")=3))
   8 - access("T2"."ID"="T4"."ID2")
  10 - filter((TO_NUMBER("T2"."SMALL_VC")=1 OR TO_NUMBER("T2"."SMALL_VC")=2 OR TO_NUMBER("T2"."SMALL_VC")=3))
  11 - access("T1"."ID"="T4"."ID1")
  13 - filter((TO_NUMBER("T1"."SMALL_VC")=1 OR TO_NUMBER("T1"."SMALL_VC")=2 OR TO_NUMBER("T1"."SMALL_VC")=3))
  18 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER_LIST(SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID1"),SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID2"),SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID3")))

The improvement, of course, is the in introduction of the “JOIN FILTER CREATE” and “JOIN FILTER USE” lines. (There is still the odd little detail that the predicate section for line 18 manages to cite Bloom filter :BF0000 three times, rather than synchronising the labels with the body of the plan – in more complex plans this could be important.)

For completeness, here’s the 12c version of the hash distribution plan:

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Starts | E-Rows |E-Bytes| Cost  |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem |  O/1/M   | Max-Tmp |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |          |      1 |        |       |   345 |        |      |            |      1 |00:00:00.96 |      14 |      0 |      0 |       |       |          |         |
|   1 |  SORT AGGREGATE                       |          |      1 |      1 |    38 |       |        |      |            |      1 |00:00:00.96 |      14 |      0 |      0 |       |       |          |         |
|   2 |   PX COORDINATOR                      |          |      1 |        |       |       |        |      |            |      3 |00:00:00.96 |      14 |      0 |      0 |       |       |          |         |
|   3 |    PX SEND QC (RANDOM)                | :TQ10006 |      0 |      1 |    38 |       |  Q1,06 | P->S | QC (RAND)  |      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|   4 |     SORT AGGREGATE                    |          |      3 |      1 |    38 |       |  Q1,06 | PCWP |            |      3 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|*  5 |      HASH JOIN                        |          |      3 |     26 |   988 |   345 |  Q1,06 | PCWP |            |     27 |00:00:00.01 |       0 |      0 |      0 |  2061K|  2061K|     3/0/0|         |
|   6 |       JOIN FILTER CREATE              | :BF0000  |      3 |      3 |    18 |     2 |  Q1,06 | PCWP |            |      9 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|   7 |        PX RECEIVE                     |          |      3 |      3 |    18 |     2 |  Q1,06 | PCWP |            |      9 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|   8 |         PX SEND HYBRID HASH           | :TQ10004 |      0 |      3 |    18 |     2 |  Q1,04 | P->P | HYBRID HASH|      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|   9 |          STATISTICS COLLECTOR         |          |      3 |        |       |       |  Q1,04 | PCWC |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|  10 |           PX BLOCK ITERATOR           |          |      3 |      3 |    18 |     2 |  Q1,04 | PCWC |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|* 11 |            TABLE ACCESS FULL          | T3       |      2 |      3 |    18 |     2 |  Q1,04 | PCWP |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|  12 |       PX RECEIVE                      |          |      3 |    612 | 19584 |   343 |  Q1,06 | PCWP |            |    604 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  13 |        PX SEND HYBRID HASH            | :TQ10005 |      0 |    612 | 19584 |   343 |  Q1,05 | P->P | HYBRID HASH|      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  14 |         JOIN FILTER USE               | :BF0000  |      3 |    612 | 19584 |   343 |  Q1,05 | PCWP |            |    604 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|* 15 |          HASH JOIN BUFFERED           |          |      3 |    612 | 19584 |   343 |  Q1,05 | PCWP |            |    630 |00:00:00.01 |       0 |      0 |      0 |  2405K|  2405K|     3/0/0|         |
|  16 |           JOIN FILTER CREATE          | :BF0001  |      3 |      3 |    18 |     2 |  Q1,05 | PCWP |            |      9 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  17 |            PX RECEIVE                 |          |      3 |      3 |    18 |     2 |  Q1,05 | PCWP |            |      9 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  18 |             PX SEND HYBRID HASH       | :TQ10002 |      0 |      3 |    18 |     2 |  Q1,02 | P->P | HYBRID HASH|      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  19 |              STATISTICS COLLECTOR     |          |      3 |        |       |       |  Q1,02 | PCWC |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|  20 |               PX BLOCK ITERATOR       |          |      3 |      3 |    18 |     2 |  Q1,02 | PCWC |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|* 21 |                TABLE ACCESS FULL      | T2       |      2 |      3 |    18 |     2 |  Q1,02 | PCWP |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|  22 |           PX RECEIVE                  |          |      3 |  14491 |   367K|   341 |  Q1,05 | PCWP |            |   1233 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  23 |            PX SEND HYBRID HASH        | :TQ10003 |      0 |  14491 |   367K|   341 |  Q1,03 | P->P | HYBRID HASH|      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  24 |             JOIN FILTER USE           | :BF0001  |      3 |  14491 |   367K|   341 |  Q1,03 | PCWP |            |   1233 |00:00:02.68 |       0 |     42 |     42 |       |       |          |         |
|* 25 |              HASH JOIN BUFFERED       |          |      3 |  14491 |   367K|   341 |  Q1,03 | PCWP |            |  14700 |00:00:02.68 |       0 |     42 |     42 |  2405K|  2405K|     3/0/0|    1024 |
|  26 |               JOIN FILTER CREATE      | :BF0002  |      3 |      3 |    18 |     2 |  Q1,03 | PCWP |            |      9 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  27 |                PX RECEIVE             |          |      3 |      3 |    18 |     2 |  Q1,03 | PCWP |            |      9 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  28 |                 PX SEND HYBRID HASH   | :TQ10000 |      0 |      3 |    18 |     2 |  Q1,00 | P->P | HYBRID HASH|      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  29 |                  STATISTICS COLLECTOR |          |      3 |        |       |       |  Q1,00 | PCWC |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|  30 |                   PX BLOCK ITERATOR   |          |      3 |      3 |    18 |     2 |  Q1,00 | PCWC |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|* 31 |                    TABLE ACCESS FULL  | T1       |      2 |      3 |    18 |     2 |  Q1,00 | PCWP |            |      3 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|  32 |               PX RECEIVE              |          |      3 |    343K|  6699K|   339 |  Q1,03 | PCWP |            |  14700 |00:00:00.06 |       0 |      0 |      0 |       |       |          |         |
|  33 |                PX SEND HYBRID HASH    | :TQ10001 |      0 |    343K|  6699K|   339 |  Q1,01 | P->P | HYBRID HASH|      0 |00:00:00.01 |       0 |      0 |      0 |       |       |          |         |
|  34 |                 JOIN FILTER USE       | :BF0002  |      3 |    343K|  6699K|   339 |  Q1,01 | PCWP |            |  14700 |00:00:00.10 |    6122 |   6018 |      0 |       |       |          |         |
|  35 |                  PX BLOCK ITERATOR    |          |      3 |    343K|  6699K|   339 |  Q1,01 | PCWC |            |  14700 |00:00:00.10 |    6122 |   6018 |      0 |       |       |          |         |
|* 36 |                   TABLE ACCESS FULL   | T4       |     52 |    343K|  6699K|   339 |  Q1,01 | PCWP |            |  14700 |00:00:00.07 |    6122 |   6018 |      0 |       |       |          |         |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T3"."ID"="T4"."ID3")
  11 - access(:Z>=:Z AND :Z<=:Z)
       filter((TO_NUMBER("T3"."SMALL_VC")=1 OR TO_NUMBER("T3"."SMALL_VC")=2 OR TO_NUMBER("T3"."SMALL_VC")=3))
  15 - access("T2"."ID"="T4"."ID2")
  21 - access(:Z>=:Z AND :Z<=:Z)
       filter((TO_NUMBER("T2"."SMALL_VC")=1 OR TO_NUMBER("T2"."SMALL_VC")=2 OR TO_NUMBER("T2"."SMALL_VC")=3))
  25 - access("T1"."ID"="T4"."ID1")
  31 - access(:Z>=:Z AND :Z<=:Z)
       filter((TO_NUMBER("T1"."SMALL_VC")=1 OR TO_NUMBER("T1"."SMALL_VC")=2 OR TO_NUMBER("T1"."SMALL_VC")=3))
  36 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T4"."ID1"))

There ‘s basically no change in presentation quality of this plan, although the “px send hybrid hash” operators, along with the associated “statisics collector” operators, show that Oracle has considered adaptive execution and could switch from a hash distribution to a broadcast distribution at three different points in the plan. Again there’s a little oddity with the name of the Bloom filters (:BFnnnn) – they’re numbered in the opposite order to which they are created and used.

16 Comments »

  1. Plan 1 – 1 bloom filter
    Plan 2 – 3 bloom filters

    Filter predicate :Z >= :Z indicates allocation of blocks among parallel processes.

    Comment by vishaldesai — September 13, 2013 @ 7:45 pm BST Sep 13,2013 | Reply

  2. I think:
    1. in the first plan was 1 bloom filter, but it checked by list, which contains 3 bloom filters sets(SYS_OP_BLOOM_FILTER_LIST with 3 bloom filters). It seems like and-conditions.
    2. in the second – 3 different bloom filters in different places

    “access(:Z>=:Z AND :Z<=:Z)" – as far as I can remember, I never seen such predicates related to bloom filters. It may be datatype(size) checking?

    Comment by Sayan Malakshinov — September 13, 2013 @ 8:02 pm BST Sep 13,2013 | Reply

    • Oops, i forgot that access(:Z>=:Z AND :Z<=:Z) – it just blocks iterating through rowid filtering. But are they related to bloom filters in these plans?

      Comment by Sayan Malakshinov — September 13, 2013 @ 8:24 pm BST Sep 13,2013 | Reply

      • I think, the first plan should be really like that:

        -----------------------------------------------------------------------------------------------------------------------
        | Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
        ------------------------------------------------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT               |          |       |       |   464 (100)|          |        |      |            |
        |   1 |  SORT AGGREGATE                |          |     1 |    38 |            |          |        |      |            |
        |   2 |   PX COORDINATOR               |          |       |       |            |          |        |      |            |
        |   3 |    PX SEND QC (RANDOM)         | :TQ10003 |     1 |    38 |            |          |  Q1,03 | P->S | QC (RAND)  |
        |   4 |     SORT AGGREGATE             |          |     1 |    38 |            |          |  Q1,03 | PCWP |            |
        |*  5 |      HASH JOIN                 |          |    26 |   988 |   464   (2)| 00:00:06 |  Q1,03 | PCWP |            |
        | AAA |       JOIN FILTER LIST CREATE  | :BF0000  |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
        |   6 |        PX RECEIVE              |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
        |   7 |         PX SEND BROADCAST      | :TQ10000 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | P->P | BROADCAST  |
        |   8 |          PX BLOCK ITERATOR     |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |
        |*  9 |           TABLE ACCESS FULL    | T3       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |
        |* 10 |       HASH JOIN                |          |   612 | 19584 |   461   (1)| 00:00:06 |  Q1,03 | PCWP |            |
        | BBB |        JOIN FILTER LIST APPEND | :BF0000  |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
        |  11 |         PX RECEIVE             |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
        |  12 |          PX SEND BROADCAST     | :TQ10001 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,01 | P->P | BROADCAST  |
        |  13 |           PX BLOCK ITERATOR    |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,01 | PCWC |            |
        |* 14 |            TABLE ACCESS FULL   | T2       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |
        |* 15 |        HASH JOIN               |          | 14491 |   367K|   459   (1)| 00:00:06 |  Q1,03 | PCWP |            |
        | CCC |         JOIN FILTER LIST APPEND| :BF0000  |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
        |  16 |          PX RECEIVE            |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,03 | PCWP |            |
        |  17 |           PX SEND BROADCAST    | :TQ10002 |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | P->P | BROADCAST  |
        |  18 |            PX BLOCK ITERATOR   |          |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | PCWC |            |
        |* 19 |             TABLE ACCESS FULL  | T1       |     3 |    18 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |
        | DDD |         JOIN FILTER LIST USE   |          |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,03 | PCWC |            |
        |  20 |          PX BLOCK ITERATOR     |          |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,03 | PCWC |            |
        |* 21 |           TABLE ACCESS FULL    | T4       |   343K|  6699K|   455   (1)| 00:00:06 |  Q1,03 | PCWP |            |
        ------------------------------------------------------------------------------------------------------------------------
        

        But such plan operations do not exist yet

        Comment by Sayan Malakshinov — September 13, 2013 @ 11:45 pm BST Sep 13,2013 | Reply

  3. in Christian Antognini’s paper on bloom filters (http://antognini.ch/papers/BloomFilters20080620.pdf) there is the statement: “[the operation creates] the bloom filter named :BF0000 (if there are serveral bloom filters, the numeric value is increased, i.e. the second one would be named :BF0001)”. According to this I would say that there are three bloom filters in query 2 and just one bloom filter in query 1. But that’s not a technical analysis but merely text exegesis…

    Comment by Martin Preiss — September 13, 2013 @ 11:07 pm BST Sep 13,2013 | Reply

  4. Answers so far:

    The “access(:Z>=:Z AND :Z<=:Z)" appears because parallel tablescans operate through rowid ranges, so the PX slave receives a rowid range as a "between bind1 and bind2" clause, so this check ensures that bind1 is less than bind 2. There's a related statistic counting rowid range access: "table scans (rowid ranges)" – if you want to play with it you may find that a single tablescan results in 13 x DOP tablescans by rowed range for any reasonable sized table.

    There are 3 Bloom filters created and used in the first plan, though you can't see them being created and they are all used at the same time at line 21 of the plan, where you can only see them being used because of the filter predicate. Sayan's modified plan shows you where they are created.

    There are 3 Blooms being created and one filter used according to the second plan structure. Unfortunately the predicate at line 33 says that only one filter is being used, and it's not the filter that the body of the plan says is being used. So we still need to know – how many filters really are used, and where are they used in the hash/hash plan.

    Comment by Jonathan Lewis — September 14, 2013 @ 7:36 am BST Sep 14,2013 | Reply

    • Jonathan,

      I suggest the very first information to notice is to read TQ column properly. Although the first plan is also parallel, thanks to broadcast distribution it can be read in same way as serial plan. The execution starts at line 9, the Parallel Process Set 1 read table T3 in parallel and send to the Set 2 (BF is also created, of course). Then Set 1 read table T2 (creating the second BF silently) and send data to the Set 2, again the same story with table T3. Finally, Set 2 read table T4 (huge fact table) in parallel , with the help of all 3 Bloom Filters, and performing all the necessary hash joins, since each Parallel Process in the second Set has all broadcasted data from T1,T2,T3 in virtual tables Q0,Q1,Q2.

      This is not the case for the second plan. The Set 1 read table T1 (line 28), redistributing data by HASH to the Set 2, which will be joining data, same time creating BF by the first Set. Since we have only two PQ Sets and there are more execution layers, data has to be written to TEMP, which can be also observed on line 23 (HASH JOIN BUFFERED, blocking operation). To be honest, I’m not able to say how is work exactly distributed between PQ Sets at this moment (HASH JOIN BUFFERED is still shrouded in mystery to me). However, it seems to be clear that only one BF can be used while reading T4, immediately discarding majority of rows which won’t match later.
      The table T2 is scanned thereafter, creating the second BF, which will help only to the resultset of join T1,T4. Similarly, the last BF will be created while reading T3 and used for resultset of T1,T4,T2.

      Comment by Pavol Babel — September 17, 2013 @ 3:20 am BST Sep 17,2013 | Reply

  5. And the answer is – as Sayan Malakshinov described:

    Both plans produced three Bloom filters – but plan 1 (with the broadcast distribution) doesn’t show them being created, and doesn’t have an operation showing them being used; you have to infer the details from the predicate section at line 21 which shows three Bloom filters (for columns id1, id2 and id3) being used in a “Bloom filter list” as table T4 is scanned.

    Plan 2, on the other hand, shows operations for the Bloom filter creation and usage that goes on (JOIN FILTER CREATE, JOIN FILTER USE). Unfortunately the predicate section makes it look as if only one of the Bloom filters is used – although all three do get actually get used – and the predicate at line 33 even lists the wrong join filter (:BF0000 when it’s really applying join filter :BF0002).

    The easiest way to check this interpretation is to use the SQL Monitoring feature – if you’re licenced for it – it’s very easy to see how many rows flow through each section of the plan, summed across all the PX slaves, and correlate the numbers with what is (or isn’t) visible in the execution plans. If you’re not licensed to use SQL Monitor then v$pq_tqstat is a free, and very helpful, view that gives you enough of the same information to show you what’s happening.

    If you’re running Exadata it’s very important to track the Bloom filters – in the first (broadcast) case all three filters will be pushed down to the storage servers, so the minimum volume of data travels back up the network to the database servers. In the second (hash/hash) case only one filter will be pushed to the storage servers, so a lot more data will come back and will be passed between parallel execution slaves as the other filters are applied late. This isn’t a bug, by the way, it’s a necessary side effect of the way that the hash/hash distribution works.

    It helps to understand what’s going on if you can read parallel execution plans properly – and the method for doing that isn’t the good old “start at the first child, recursive descent” strategy. But that’s a topic for another day (and for a few minutes in my OpenWorld presentation).

    Comment by Jonathan Lewis — September 16, 2013 @ 10:20 pm BST Sep 16,2013 | Reply

    • Oh, I was writing answer in slow manner (there were, meanwhile, other important things to do), so I didn’t notice you’ve provided the answer. I have still many questions in my head regarding HASH JOIN BUFFERED operation, although Randolf Geist made some very interesting observations few months ago.

      Comment by Pavol Babel — September 17, 2013 @ 3:34 am BST Sep 17,2013 | Reply

      • Pavol,

        I started reading your previous comment and thought I might not have to write the next article after all.

        Comment by Jonathan Lewis — September 17, 2013 @ 6:32 pm BST Sep 17,2013 | Reply

        • Jonathan,

          please, write next article, at least you are able to explain whole thing with much better english :) I could hardly find few people who read my Comment, your article would be read by plenty of readers and it could be usefull.People always forget that PARALLEL HASH JOINS can be processed in very different order.

          I’m still quite confused by HASH JOIN BUFFERED operator. You had also wrote article, but later you found out you were wrong in some details. Randolf Geist observed in spectacular way the second row source is being buffered before the probe phase starts and non-matching rows are discarded. However, I do not understand how is Parallel Slave Set 1 able to find out which row can be discarded (when Bloom Filter is not present) before Sending to Set 2 (only Set 2 does have the build table).
          Did you have time to dig deeper into HASH JOIN BUFFERED?

          Comment by Pavol Babel — September 18, 2013 @ 4:46 pm BST Sep 18,2013

        • Meanwhile, I asked Randolf few questions and confused him for a while. It seems he found satisfactory explanation.

          Comment by Pavol Babel — September 20, 2013 @ 8:34 am BST Sep 20,2013

  6. […] to Jonathan Lewis for quiz on bloom filter. I learned more about bloom filters from quiz and trying to find answers for below […]

    Pingback by Learning Bloom Filters | Vishal desai's Oracle Blog — September 18, 2013 @ 10:43 pm BST Sep 18,2013 | Reply

  7. […] be basing my comments on the two execution plans I produced for a recent question on Bloom filters, so you may want to read that article and its subsequent comments first (the link will open in a […]

    Pingback by Parallel Execution – 1 | Oracle Scratchpad — October 13, 2013 @ 7:44 pm BST Oct 13,2013 | Reply

  8. […] that I’ve been using in my previous demonstrations.  This was a query first introduced in a note on Bloom Filters (opens in a separate window) where I show two versions of a four-table hash parallel hash join, one […]

    Pingback by Parallel Execution – 5 | Oracle Scratchpad — March 10, 2014 @ 1:30 pm GMT Mar 10,2014 | Reply

  9. […] A quiz introducing Bloom Filters […]

    Pingback by Parallel Plans | Oracle Scratchpad — March 11, 2014 @ 7:39 am GMT Mar 11,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.