Oracle Scratchpad

January 3, 2015

Table Duplication

Filed under: Execution plans,Oracle,subqueries — Jonathan Lewis @ 11:54 am GMT Jan 3,2015

I’ve probably seen a transformation like the following before and I may even have written about it (though if I have I can’t find the article), but since it surprised me when I was experimenting with a little problem a few days ago I thought I’d pass it on as an example of how sophisticated the optimizer can be with query transformation.  I’ll be talking about the actual problem that I was working on in a later post so I won’t give you the table and data definitions in this post, I’ll just show some SQL and its plan:


rem
rem     Script:         most_recent.sql
rem     Author:         Jonathan Lewis
rem     Dated:          2nd Jan 2015
rem

select
        tr.item, tr.trans_date, tr.quantity
    , pr.gross_price
    , pr.net_price
    , pr.special_price
from
        xo_stock_trans tr,
        xo_prices pr
where
        tr.trans_date between '01-AUG-2014' and '3-AUG-2014'
and     pr.item = tr.item
and     pr.price_date = (
                select
                        max(pr2.price_date)
                from
                        xo_prices pr2
                where   pr2.item = tr.item
                and     pr2.price_date <= tr.trans_date
        )
;

The code is a fairly standard expression of “find me the most recent price available for each stock item as at the stock date of that item”. As you can see I’ve referenced the stock table once and the pricing table twice – the second appearance being in a “max()” correlated subquery. Oracle has decided to unnest the subquery – but spot the interesting detail in the plan:


-----------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                |    14M|   748M|       | 77955  (84)| 00:06:30 |
|*  1 |  HASH JOIN                 |                |    14M|   748M|    37M| 77955  (84)| 00:06:30 |
|*  2 |   HASH JOIN                |                |   829K|    28M|       | 70867  (92)| 00:05:55 |
|   3 |    JOIN FILTER CREATE      | :BF0000        | 25274 |   370K|       |  2530  (13)| 00:00:13 |
|*  4 |     TABLE ACCESS FULL      | XO_STOCK_TRANS | 25274 |   370K|       |  2530  (13)| 00:00:13 |
|   5 |    VIEW                    | VW_SQ_1        |   210M|  4206M|       | 64135  (94)| 00:05:21 |
|   6 |     HASH GROUP BY          |                |   210M|  5408M|       | 64135  (94)| 00:05:21 |
|   7 |      JOIN FILTER USE       | :BF0000        |   210M|  5408M|       | 11807  (67)| 00:01:00 |
|*  8 |       HASH JOIN            |                |   210M|  5408M|       | 11807  (67)| 00:01:00 |
|*  9 |        TABLE ACCESS FULL   | XO_STOCK_TRANS | 25274 |   370K|       |  2530  (13)| 00:00:13 |
|* 10 |        INDEX FAST FULL SCAN| XO_PRICES_IX1  |  3918K|    44M|       |  1936  (10)| 00:00:10 |
|  11 |   TABLE ACCESS FULL        | XO_PRICES      |  4571K|    87M|       |  2275  (11)| 00:00:12 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("PR"."ITEM"="TR"."ITEM" AND "PR"."PRICE_DATE"="MAX(PR2.PRICE_DATE)")
   2 - access("ITEM_1"=ROWID)
   4 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   8 - access("PR2"."ITEM"="TR"."ITEM")
       filter("PR2"."PRICE_DATE"<="TR"."TRANS_DATE") 9 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  10 - filter("PR2"."PRICE_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

I was running 11.2.0.4 at the time, which is why we can get a serial Bloom filter on the hash join (though, perhaps, only on non-mergeable, aggregate, views) – and it’s interesting to see that the filter has been pushed inside the view operator; but the really interesting part of the plan is the second appearance of the XO_STOCK_TRANS table.

My correlated subquery returns a value that is used in a comparison with a column in the XO_PRICES table, but the correlation predicates referred back to the XO_STOCK_TRANS table, so the optimizer has added the XO_STOCK_TRANS to the subquery as it unnested it.

I’ve written several examples of how we can optimise SQL manually by rewriting it to introduce extra copies of some of the tables (typically in a fashion analogous to the optimizer’s mechanism for star transformations), so it’s nice to see another variation on the theme of the optimizer using table duplication to optimise a statement.

Footnote:

The execution plan in 10.2.0.5 is slightly different, but it still unnests the subquery, introducing a second occurrence of XO_STOCK_TRANS as it does so.

Update Sept 2021

After writing up a number of notes on the optimizer’s ability to transform queries I took some time to revisit this example to see if there was any way to block the transformation, or any way I could see the transformation being invoked in the optimizer trace file.

It’s virtually invisible in the trace file until you know the answer; but an exhaustive check of parameter settings and optimizer environments revealed that the feature appeared in 10.1 with the parameter “_optimizer_squ_bottomup”, and setting this parameter to false (which can be done through the opt_param() hint) will disable the transformation.

The clue in the trace file is the series of annotation (with a few lines deleted):


SU: Considering bottom-up subquery unnesting
...
SU:   Checking validity of unnesting subquery SEL$2 (#2)
...
SU:   Unnesting subquery query block SEL$2 (#2)
Registered qb: SEL$0CF828B0 0x6a716258 (QUERY BLOCK TABLES CHANGED SEL$2)

 

4 Comments »

  1. Jonathan sir,
    Bloom filter was fine,Didn’t understand why we have below hash join access conditions

    2 – access(“ITEM_1″=ROWID)

    Why Rowid was use as joining column and not tr.item?

    Can you please post, transform sql by optimizer for better understanding on same.?

    Comment by Deepak Mahto — January 3, 2015 @ 3:28 pm GMT Jan 3,2015 | Reply

    • Deepak,

      Here’s the transformed SQL (a re-formatted versin of the “Unparsed SQL” in the 10053 trace file) that makes it easier to see the answer to your first question:

      
      SELECT 
              TR.ITEM ITEM,
              TR.TRANS_DATE TRANS_DATE,
              TR.QUANTITY QUANTITY,
              PR.GROSS_PRICE GROSS_PRICE,
              PR.NET_PRICE NET_PRICE,
              PR.SPECIAL_PRICE SPECIAL_PRICE 
      FROM    (
              SELECT
                      MAX(PR2.PRICE_DATE)  "MAX(PR2.PRICE_DATE)",
                      TR.ROWID ITEM_1
              FROM 
                      TEST_USER.XO_STOCK_TRANS  TR,
                      TEST_USER.XO_PRICES       PR2 
              WHERE 
                      PR2.ITEM = TR.ITEM
              AND     PR2.PRICE_DATE <= TR.TRANS_DATE
              AND     TR.TRANS_DATE  <= TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND     TR.TRANS_DATE  >= TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND     PR2.PRICE_DATE <= TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND     TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss') >= 
                              TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              GROUP BY 
                      TR.ROWID
              )                          VW_SQ_1,
              TEST_USER.XO_STOCK_TRANS   TR,
              TEST_USER.XO_PRICES        PR
      WHERE 
              TR.TRANS_DATE >= TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
      AND     TR.TRANS_DATE <= TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
      AND     PR.ITEM        = TR.ITEM
      AND     PR.PRICE_DATE  = VW_SQ_1."MAX(PR2.PRICE_DATE)" 
      AND     VW_SQ_1.ITEM_1 = TR.ROWID
      ;
      
      

      Just in case you missed it, the column on the left of the join is “ITEM_1”, not “ITEM”, and it’s the alias for the XO_STOCKTRANS.ROWID in the subquery.

      Since each row in xo_stock_trans could join to many rows in xo_prices based on the item and trans/price dates, and since there are many rows in xo_stock_trans for each item, we can’t aggregate and join on item; we want to aggregate on the xo_stock_trans ROW to find the max date for that row after the join, and that means aggregating on the rowid – which we then join back.

      Comment by Jonathan Lewis — January 3, 2015 @ 3:46 pm GMT Jan 3,2015 | Reply

  2. […] to find the maximum price_date for each target row in the xo_stock_trans table. (See my earlier posting on table duplication for a variation and explanation of what Oracle has done here). This step is a lot of work, but the […]

    Pingback by Most Recent | Oracle Scratchpad — January 7, 2015 @ 6:21 pm GMT Jan 7,2015 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: