Oracle Scratchpad

July 8, 2016

DML and Bloom

Filed under: Bugs,Oracle,Partitioning,Troubleshooting — Jonathan Lewis @ 1:01 pm BST Jul 8,2016

One of the comments on my recent posting about “Why use pl/sql bulk strategies over simple SQL” pointed out that it’s not just distributed queries that can change plans dramatically when you change from a simple select to “insert into … select …”; there’s a similar problem with queries that use Bloom filters – the filter disappears when you change from the query to the DML.

This seemed a little bizarre, so I did a quick search on MoS (using the terms “insert select Bloom Filter”) to check for known bugs and then tried to run up a quick demo. Here’s a summary of the related bugs that I found through my first simple search:

Bug 20112932 : BLOOM FILTER IS NOT CHOOSED WHEN EXECUTING INSERT SELECT STATEMENT
11.2.0.4 - Nov 2014: "not a bug".
This cites three more documents:
  Bug 13801198  BLOOM PRUNING/FILTER NOT USED IN DML STATEMENTS 
    11.2.0.3 - March 2012 - Duplicate, base bug: 18949550
    18949550: MERGE QUERY NOT USING BLOOM FILTER
    11.2.0.3 - June 2014: fixed in 12.2 (no patches yet)

  Bug 14325392  BLOOM FILTER NOT SUPPORTED IN PARTITION WISE JOIN 
    11.2.0.3 - July 2012 - fixed in 12.1 
    (but I think this is a PX filter, not a partition filter)

  Bug 17716301  BLOOM FILTER NOT USED FOR OUTER JOIN

    Aug 2014 - hidden bug note. Patches available for 11.2.0.3
    Note 1919508.1 describes the problem.

And here’s some code to demonstrate the problem – running 11.2.0.4 (with similar results on 12.1.0.2).


rem
rem     Script:         dml_bloom_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table pt_hash (
        id,
        grp,
        small_vc,
        padding
)
nologging
pctfree 90 pctused 10
partition by hash(grp)
(
        partition p1,
        partition p2,
        partition p3,
        partition p4
)
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                          id,
        trunc(rownum/50)                grp,
        to_char(trunc(rownum/20))       small_vc,
        rpad('x',100)                   padding
from
        generator, generator
where
        rownum <= 1e5
;

create table t1
as
select
        rownum  id,
        rownum  n1,
        lpad(rownum,10) v1,
        rpad('x',100) padding
from
        dual
connect by
        rownum <= 4
;

create table target (
        n1      number,
        id      number
)
;

-- gather stats on t1 and pt_hash (for all columns size 1, otherwise default)

There’s nothing fancy in this code, just a hash partitioned table with 100,000 rows (and a lot of wasted space), and a “driving” table with 4 rows that I can use in a very simple join to the partitioned table. I’ve also got an empty table ready for an insert statement.

So here’s a query with its execution plan (with rowsource execution statistics) that (based on the content of the t1 table) should select from at most two of the four partitions. I’ve hinted a hash join which is where Bloom filtering can be very effective with partitioned table joins:


set serveroutput off
alter session set statistics_level = all;

select
        /*+
                leading(t1 h1)
                use_hash(h1)    no_swap_join_inputs(h1)
        */
        t1.n1,
        h1.id
from
        t1,
        pt_hash h1
where
        t1.id between 2 and 3
and     h1.grp = t1.n1
and     h1.id <= 100
;

select * from table(dbms_xplan.display_cursor(null,null,'outline allstats last cost partition'));

------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |      1 |        |  2115 (100)|       |       |      1 |00:00:00.16 |    8374 |   8367 |       |       |          |
|*  1 |  HASH JOIN                  |         |      1 |      3 |  2115   (2)|       |       |      1 |00:00:00.16 |    8374 |   8367 |  2440K|  2440K|  777K (0)|
|   2 |   PART JOIN FILTER CREATE   | :BF0000 |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |      0 |       |       |          |
|*  3 |    TABLE ACCESS FULL        | T1      |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |      0 |       |       |          |
|   4 |   PARTITION HASH JOIN-FILTER|         |      1 |    100 |  2113   (2)|:BF0000|:BF0000|     51 |00:00:00.16 |    8372 |   8367 |       |       |          |
|*  5 |    TABLE ACCESS FULL        | PT_HASH |      2 |    100 |  2113   (2)|:BF0000|:BF0000|     51 |00:00:00.16 |    8372 |   8367 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------

-- Predicates and Outline redacted

The most significant points to notice are:

  • Operation 2 – Part Join Filter Create: we’ve created a Bloom filter
  • Operation 4 – Partition Hash Join-Filter: we’ve used the Bloom filter, with Pstart/Pstop = :BF0000
  • Operation 5 – Table Access Full: starts twice, once for each of the two partitions we actually need
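The pruning these operations describe can be sketched in a few lines of Python. This is a conceptual model only, not Oracle internals: the join filter is treated as a simple bitmap of candidate partitions built from the driving table’s join keys.

```python
# Conceptual sketch (not Oracle internals) of partition join-filter
# pruning on a hash-partitioned table: :BF0000 behaves like a bitmap
# of the partitions that could possibly hold matching rows.

NUM_PARTITIONS = 4

def partition_of(key):
    # Stand-in for Oracle's internal hash-partitioning function.
    return hash(key) % NUM_PARTITIONS

# Build side of the hash join: join keys from the small driving table
# (t1.n1 takes the values 2 and 3 for "t1.id between 2 and 3").
driving_keys = [2, 3]

# PART JOIN FILTER CREATE: record the partitions those keys hash to.
bf = {partition_of(k) for k in driving_keys}

# PARTITION HASH JOIN-FILTER: the probe side scans only the marked
# partitions, so "Starts" for the table scan is len(bf), not 4.
print(sorted(bf))
```

With only two distinct join keys the bitmap can mark at most two of the four partitions, which is why the plan reports two starts for the scan of pt_hash.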

And now we use the same SQL statement but insert the result set into the holding table:

insert into target(n1, id)
select
        /*+
                ordered
                use_hash(h1)    no_swap_join_inputs(h1)
        */
        t1.id,
        h1.id
from
        t1,
        pt_hash h1
where
        t1.id between 2 and 3
and     h1.grp = t1.n1
and     h1.id <= 100
;

select * from table(dbms_xplan.display_cursor(null,null,'outline allstats last cost partition'));

------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name    | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |         |      1 |        |  2115 (100)|       |       |      0 |00:00:00.02 |   16698 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL |         |      1 |        |            |       |       |      0 |00:00:00.02 |   16698 |       |       |          |
|*  2 |   HASH JOIN              |         |      1 |      3 |  2115   (2)|       |       |      1 |00:00:00.02 |   16689 |  1969K|  1969K|  723K (0)|
|*  3 |    TABLE ACCESS FULL     | T1      |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |       |       |          |
|   4 |    PARTITION HASH ALL    |         |      1 |    100 |  2113   (2)|     1 |     4 |    100 |00:00:00.02 |   16687 |       |       |          |
|*  5 |     TABLE ACCESS FULL    | PT_HASH |      4 |    100 |  2113   (2)|     1 |     4 |    100 |00:00:00.02 |   16687 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------

-- Predicate and Outline sections redacted

In this case we don’t see any operation to create a Bloom filter, and operation 5 reports 4 starts, one for each partition, with Pstart/Pstop showing 1 – 4. As a cross-check you can also see that the number of buffers visited has doubled, and the time (though don’t trust it too much on such a small test) has increased.

To my surprise the outline information for the first plan didn’t report a Bloom filter hint (px_join_filter), but I tried adding one to the insert statement anyway, and nothing changed. However, there is a “subquery pruning” mechanism that has been available to the optimizer for a very long time, so I decided to try hinting that in both statements with /*+ subquery_pruning(@sel$1 h1@sel$1 partition) */. This had no effect on the select statement, but produced the following plan for the insert:


-------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name    | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT          |         |      1 |        |  2115 (100)|       |       |      0 |00:00:00.01 |    8390 |       |       |          |
|   1 |  LOAD TABLE CONVENTIONAL  |         |      1 |        |            |       |       |      0 |00:00:00.01 |    8390 |       |       |          |
|*  2 |   HASH JOIN               |         |      1 |      3 |  2115   (2)|       |       |      1 |00:00:00.01 |    8381 |  1969K|  1969K|  829K (0)|
|*  3 |    TABLE ACCESS FULL      | T1      |      1 |      3 |     2   (0)|       |       |      2 |00:00:00.01 |       2 |       |       |          |
|   4 |    PARTITION HASH SUBQUERY|         |      1 |    100 |  2113   (2)|KEY(SQ)|KEY(SQ)|     51 |00:00:00.01 |    8379 |       |       |          |
|*  5 |     TABLE ACCESS FULL     | PT_HASH |      2 |    100 |  2113   (2)|KEY(SQ)|KEY(SQ)|     51 |00:00:00.01 |    8377 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------

As you can see operation 4 now reports Partition Hash Subquery – Oracle has “pre-queried” the driving table (you can see the statement in the 10046 trace file if you enable SQL trace) to discover which partitions it will have to visit – and operation 5 starts twice with a Pstart/Pstop of KEY(SQ).
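That two-step mechanism can also be modelled in Python. Again this is only a sketch under assumed behaviour, not the recursive SQL Oracle actually issues: a cheap preliminary query against the driving table determines which partitions can match, and only those are read.

```python
# Conceptual sketch of subquery pruning (not the actual recursive SQL):
# before the main scan, a cheap "pre-query" against the driving table
# determines which partitions can match; only those are then read.

NUM_PARTITIONS = 4

def partition_of(grp):
    # Stand-in for the hash-partitioning function on pt_hash(grp).
    return grp % NUM_PARTITIONS

# A toy pt_hash: rows of (id, grp), distributed by hash of grp.
partitions = [[] for _ in range(NUM_PARTITIONS)]
for row_id in range(1, 101):
    grp = row_id // 5
    partitions[partition_of(grp)].append((row_id, grp))

# Step 1 - the "pre-query" against the driving table, equivalent to:
#   select distinct n1 from t1 where id between 2 and 3
driving_keys = [2, 3]

# Step 2 - hash those keys to the partitions that must be visited
# (reported as KEY(SQ) in the plan).
key_sq = sorted({partition_of(k) for k in driving_keys})

# Step 3 - the main scan starts once per pruned partition: 2 starts
# instead of 4, and roughly half the buffers, as in the demo plan.
rows_read_pruned = sum(len(partitions[p]) for p in key_sq)
rows_read_full   = sum(len(partitions[p]) for p in range(NUM_PARTITIONS))
assert rows_read_pruned < rows_read_full
```

The trade-off against the join filter is that the pre-query is a small amount of extra work done before the main scan starts, which is why Oracle only considers it when the partitioned table is expensive enough to make the saving worthwhile.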

Interestingly the hint didn’t end up in the Outline section of the execution plan, so this is an example of a query which you can’t fix by capturing the SQL Plan Baseline for one statement against the text for another – though you can hack the hint into an SQL Profile or (my preference in this case, I think) an SQL Patch.

Footnote:

After publishing this note I started to browse through the comments on the article I’d linked to for subquery pruning, and discovered that an example of this insert/select issue had appeared there with a surprising workaround: using a “multi-table insert” (with just a single table). At the moment, though, I haven’t managed to get that workaround working with this example.

Footnote 2:

I’ve just been reminded by email of another detail mentioned in the comments of the earlier post: this problem isn’t present for direct path (i.e. /*+ append */) inserts, and I have run a quick test on 12.1.0.2 to confirm that this is indeed the case (though the fix isn’t present in 11.2.0.4). Don’t forget to check the listed bug notes on MoS in case there’s a back-port available for your version of Oracle.

 

10 Comments »

  1. Jonathan, nice example, as always, demonstrating this Oracle bug. And nice workaround. I wasn’t able to reproduce the workaround when the table is sub-partitioned (range by date, hash by location –> a number) with the bloom filter on the sub-partition when the select is not within an insert. A date constant is used for the partition key and the sub-partition keys come from another table joined. Also, I can confirm that even with the append hint, it is still not working in 11.2.0.4, because that is the version I’m using.

    Comment by Louis — July 8, 2016 @ 2:47 pm BST Jul 8,2016 | Reply

    • Louis,

      I’ve got an example where the subquery_pruning hint gives key(SQ) pruning on the subpartition – but my partition key is numeric.
      The hint uses the word “subpartition” instead of “partition” to aim itself.
      To get more complicated my example has a range that crosses two partitions.
      I’ll modify my model to use dates instead of numbers over the weekend.

      Comment by Jonathan Lewis — July 8, 2016 @ 3:27 pm BST Jul 8,2016 | Reply

      • Thanks Jonathan. The fact that the range crosses two partitions is not an issue; I actually use a between with 2 date constants. The problem is that the bloom filter does not get applied at the sub-partition level with insert-select, and all sub-partitions are read instead of only 1 (when the select is within an insert).

        One thing that could be different in mine is that the locations (sub-partition keys) actually come from an inline view (a union of 2 tables). I also have a RANK() function to get the last row (most current information) within the range of dates.

        From a master script looping on the sub-partitions, I pass the number of sub-partitions and the partition position so that I process only the locations that reside in that one sub-partition (using the ORA_HASH function). This way, I can process one sub-partition at a time. I then run this concurrently in the background up to a max number of threads (example 64 sub-partitions to process using 24 concurrent threads to minimize the load of the database server). The sub-partitions can then be processed in parallel in direct path (append) mode because I name the partition of the target table (no locking of the whole table). The target table has a partition key equal to the sub-partition key of the source table. This is faster than a full insert-select (whole table), uses less undo space and has the possibility of reprocessing only one sub-partition if needed in case of error.

        If it can help, this is my query:

        SELECT h.item,
               h.location,
               h.loc_type,
               h.channel_id,
               h.eow_date,
               h.regular_unit_retail_amt,
               h.selling_unit_retail_amt,
               h.unit_cost_amt
        FROM (SELECT j.item,
                     j.location,
                     j.loc_type,
                     l.channel_id,
                     j.eow_date,
                     j.regular_unit_retail_amt,
                     j.selling_unit_retail_amt,
                     j.unit_cost_amt,
                     RANK() OVER(PARTITION BY j.item, j.location ORDER BY j.eow_date DESC)
                        AS wk_rank
              FROM (SELECT s.store     location,
                           'S'         loc_type,
                           s.channel_id
                    FROM store s
                    UNION ALL
                    SELECT w.wh     location,
                           'W'      loc_type,
                           w.channel_id
                    FROM wh w
                    WHERE w.wh != w.physical_wh) l,
                   jcp_wk_item_soh     j
              WHERE ORA_HASH(l.location, :g_nb_partitions-1) + 1 = :g_partition_position
                AND j.location = l.location
                AND j.loc_type = l.loc_type
                AND j.eow_date BETWEEN TO_DATE(:g_from_eow_date, 'YYYYMMDD')
                                   AND TO_DATE(:g_to_eow_date, 'YYYYMMDD')) h
        WHERE h.wk_rank = 1
        
        And this is the plan (select only, no insert). The bloom filter disappears when the select is used in an insert-select.
        -----------------------------------------------------------------------------------------------------------------------------
        | Id  | Operation                        | Name            | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | Pstart| Pstop |
        -----------------------------------------------------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT                 |                 |    719K|    70M|       |  4765K  (4)| 15:53:04 |       |       |
        |*  1 |  VIEW                            |                 |    719K|    70M|       |  4765K  (4)| 15:53:04 |       |       |
        |*  2 |   WINDOW SORT PUSHED RANK        |                 |    719K|    44M|    57M|  4765K  (4)| 15:53:04 |       |       |
        |*  3 |    FILTER                        |                 |        |       |       |            |          |       |       |
        |*  4 |     HASH JOIN                    |                 |    719K|    44M|       |  4754K  (4)| 15:50:59 |       |       |
        |   5 |      PART JOIN FILTER CREATE     | :BF0000         |     17 |   493 |       |    20   (0)| 00:00:01 |       |       |
        |   6 |       VIEW                       |                 |     17 |   493 |       |    20   (0)| 00:00:01 |       |       |
        |   7 |        UNION-ALL                 |                 |        |       |       |            |          |       |       |
        |*  8 |         TABLE ACCESS STORAGE FULL| STORE           |     12 |    84 |       |    14   (0)| 00:00:01 |       |       |
        |*  9 |         TABLE ACCESS STORAGE FULL| WH              |      5 |    65 |       |     6   (0)| 00:00:01 |       |       |
        |  10 |      PARTITION RANGE ITERATOR    |                 |     99M|  3414M|       |  4754K  (4)| 15:50:55 |   KEY |   KEY |
        |  11 |       PARTITION HASH JOIN-FILTER |                 |     99M|  3414M|       |  4754K  (4)| 15:50:55 |:BF0000|:BF0000|
        |* 12 |        TABLE ACCESS STORAGE FULL | JCP_WK_ITEM_SOH |     99M|  3414M|       |  4754K  (4)| 15:50:55 |   KEY |   KEY |
        -----------------------------------------------------------------------------------------------------------------------------
        

        Comment by Louis — July 8, 2016 @ 9:20 pm BST Jul 8,2016 | Reply

        • Louis,

          There doesn’t seem to be much to trace to determine how the optimizer bypasses the subquery pruning, but in your case it does seem to be the UNION ALL view that is the driving “table”.

          If you can change the code you could create a materialized view, defined as refresh on commit, for the UNION ALL.
          You’d need materialized view logs with rowid on the two tables, and you’ve conveniently got a “UNION ALL marker” in the view already.

          Given the table names (store, warehouse) I don’t think the tables are likely to be subject to frequent rapid update, so “refresh on commit” looks fairly safe.

          You’d then need to change the query to reference the materialized view directly. (Though who knows without testing it; maybe query rewrite could substitute the MV name and then optimize on partition pruning – but I’d be surprised (and impressed)).

          Comment by Jonathan Lewis — July 9, 2016 @ 1:21 pm BST Jul 9,2016

        • What about using a WITH clause with a /*+ materialize */ hint? It would create a temp segment and that temp segment would then be referenced directly.

          Comment by Louis — July 9, 2016 @ 5:53 pm BST Jul 9,2016

        • Louis,

          It doesn’t work – it was one of the first things I tried after finding that the UNION ALL seemed to be the blocker.

          Comment by Jonathan Lewis — July 9, 2016 @ 5:59 pm BST Jul 9,2016

  2. Interestingly enough, CTAS uses Bloom filters again. A “workaround” I used a few times ….

     -----------------------------------------------------------------------------------------------------------------
    |   0 | CREATE TABLE STATEMENT       |         |        |  4630 (100)|       |       |       |       |          |
    |   1 |  LOAD AS SELECT              |         |        |            |       |       |   267K|   267K|  267K (0)|
    |*  2 |   HASH JOIN                  |         |      3 |  4629   (1)|       |       |   968K|   968K|  491K (0)|
    |   3 |    PART JOIN FILTER CREATE   | :BF0000 |      3 |     3   (0)|       |       |       |       |          |
    |*  4 |     TABLE ACCESS FULL        | T1      |      3 |     3   (0)|       |       |       |       |          |
    |   5 |    PARTITION HASH JOIN-FILTER|         |    100 |  4625   (1)|:BF0000|:BF0000|       |       |          |
    |*  6 |     TABLE ACCESS FULL        | PT_HASH |    100 |  4625   (1)|:BF0000|:BF0000|       |       |          |
    -----------------------------------------------------------------------------------------------------------------
    

    Comment by stefan zehnder — July 9, 2016 @ 10:21 pm BST Jul 9,2016 | Reply

    • which does not invalidate the bug description, as CTAS is not considered DML, but DDL with data

      Comment by stefan zehnder — July 9, 2016 @ 10:26 pm BST Jul 9,2016 | Reply

    • Stefan,

      That’s a point worth remembering; especially if you’re running 12c (and have already paid for the partitioning licence) where you could define your final table as a one-partition hash partitioned table, do a CTAS to generate the data you wanted in it, and then do an online partition exchange. As a fringe benefit you also get the basic statistics on the table as you do the CTAS.

      Comment by Jonathan Lewis — July 11, 2016 @ 3:35 pm BST Jul 11,2016 | Reply

  3. […] an article I wrote last week about Bloom filters disappearing as you changed a SELECT to a (conventional) INSERT/SELECT I suggested using the subquery_pruning() […]

    Pingback by Union All MV | Oracle Scratchpad — July 12, 2016 @ 10:10 am BST Jul 12,2016 | Reply

