Oracle Scratchpad

January 3, 2015

Table Duplication

Filed under: Execution plans,Oracle,subqueries — Jonathan Lewis @ 11:54 am GMT Jan 3,2015

I’ve probably seen a transformation like the following before and I may even have written about it (though if I have I can’t find the article), but since it surprised me when I was experimenting with a little problem a few days ago I thought I’d pass it on as an example of how sophisticated the optimizer can be with query transformation.  I’ll be talking about the actual problem that I was working on in a later post so I won’t give you the table and data definitions in this post, I’ll just show some SQL and its plan:


rem
rem     Script:         most_recent.sql
rem     Author:         Jonathan Lewis
rem     Dated:          2nd Jan 2015
rem

select
        tr.item, tr.trans_date, tr.quantity
    , pr.gross_price
    , pr.net_price
    , pr.special_price
from
        xo_stock_trans tr,
        xo_prices pr
where
        tr.trans_date between '01-AUG-2014' and '3-AUG-2014'
and     pr.item = tr.item
and     pr.price_date = (
                select
                        max(pr2.price_date)
                from
                        xo_prices pr2
                where   pr2.item = tr.item
                and     pr2.price_date <= tr.trans_date
        )
;

The code is a fairly standard expression of “find me the most recent price available for each stock item as at the stock date of that item”. As you can see I’ve referenced the stock table once and the pricing table twice – the second appearance being in a “max()” correlated subquery. Oracle has decided to unnest the subquery – but spot the interesting detail in the plan:


-----------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                |    14M|   748M|       | 77955  (84)| 00:06:30 |
|*  1 |  HASH JOIN                 |                |    14M|   748M|    37M| 77955  (84)| 00:06:30 |
|*  2 |   HASH JOIN                |                |   829K|    28M|       | 70867  (92)| 00:05:55 |
|   3 |    JOIN FILTER CREATE      | :BF0000        | 25274 |   370K|       |  2530  (13)| 00:00:13 |
|*  4 |     TABLE ACCESS FULL      | XO_STOCK_TRANS | 25274 |   370K|       |  2530  (13)| 00:00:13 |
|   5 |    VIEW                    | VW_SQ_1        |   210M|  4206M|       | 64135  (94)| 00:05:21 |
|   6 |     HASH GROUP BY          |                |   210M|  5408M|       | 64135  (94)| 00:05:21 |
|   7 |      JOIN FILTER USE       | :BF0000        |   210M|  5408M|       | 11807  (67)| 00:01:00 |
|*  8 |       HASH JOIN            |                |   210M|  5408M|       | 11807  (67)| 00:01:00 |
|*  9 |        TABLE ACCESS FULL   | XO_STOCK_TRANS | 25274 |   370K|       |  2530  (13)| 00:00:13 |
|* 10 |        INDEX FAST FULL SCAN| XO_PRICES_IX1  |  3918K|    44M|       |  1936  (10)| 00:00:10 |
|  11 |   TABLE ACCESS FULL        | XO_PRICES      |  4571K|    87M|       |  2275  (11)| 00:00:12 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("PR"."ITEM"="TR"."ITEM" AND "PR"."PRICE_DATE"="MAX(PR2.PRICE_DATE)")
   2 - access("ITEM_1"=ROWID)
   4 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   8 - access("PR2"."ITEM"="TR"."ITEM")
       filter("PR2"."PRICE_DATE"<="TR"."TRANS_DATE") 9 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  10 - filter("PR2"."PRICE_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

I was running 11.2.0.4 at the time, which is why we can get a serial Bloom filter on the hash join (though, perhaps, only on non-mergeable, aggregate, views) – and it’s interesting to see that the filter has been pushed inside the view operator; but the really interesting part of the plan is the second appearance of the XO_STOCK_TRANS table.

My correlated subquery returns a value that is used in a comparison with a column in the XO_PRICES table, but the correlation predicates referred back to the XO_STOCK_TRANS table, so the optimizer has added the XO_STOCK_TRANS to the subquery as it unnested it.

I’ve written several examples of how we can optimise SQL manually by rewriting it to introduce extra copies of some of the tables (typically in a fashion analogous to the optimizer’s mechanism for star transformations), so it’s nice to see another variation on the theme of the optimizer using table duplication to optimise a statement.

Footnote:

The execution plan in 10.2.0.5 is slightly different, but it still unnests the subquery, introducing a second occurrence of XO_STOCK_TRANS as it does so.

4 Comments »

  1. Jonathan sir,
    Bloom filter was fine,Didn’t understand why we have below hash join access conditions

    2 – access(“ITEM_1″=ROWID)

    Why Rowid was use as joining column and not tr.item?

    Can you please post, transform sql by optimizer for better understanding on same.?

    Comment by Deepak Mahto — January 3, 2015 @ 3:28 pm GMT Jan 3,2015 | Reply

    • Deepak,

      Here’s the transformed SQL (a re-formatted versin of the “Unparsed SQL” in the 10053 trace file) that makes it easier to see the answer to your first question:

      
      SELECT 
              TR.ITEM ITEM,
              TR.TRANS_DATE TRANS_DATE,
              TR.QUANTITY QUANTITY,
              PR.GROSS_PRICE GROSS_PRICE,
              PR.NET_PRICE NET_PRICE,
              PR.SPECIAL_PRICE SPECIAL_PRICE 
      FROM    (
              SELECT
                      MAX(PR2.PRICE_DATE)  "MAX(PR2.PRICE_DATE)",
                      TR.ROWID ITEM_1
              FROM 
                      TEST_USER.XO_STOCK_TRANS  TR,
                      TEST_USER.XO_PRICES       PR2 
              WHERE 
                      PR2.ITEM = TR.ITEM
              AND     PR2.PRICE_DATE <= TR.TRANS_DATE
              AND     TR.TRANS_DATE  <= TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND     TR.TRANS_DATE  >= TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND     PR2.PRICE_DATE <= TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND     TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss') >= 
                              TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              GROUP BY 
                      TR.ROWID
              )                          VW_SQ_1,
              TEST_USER.XO_STOCK_TRANS   TR,
              TEST_USER.XO_PRICES        PR
      WHERE 
              TR.TRANS_DATE >= TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
      AND     TR.TRANS_DATE <= TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
      AND     PR.ITEM        = TR.ITEM
      AND     PR.PRICE_DATE  = VW_SQ_1."MAX(PR2.PRICE_DATE)" 
      AND     VW_SQ_1.ITEM_1 = TR.ROWID
      ;
      
      

      Just in case you missed it, the column on the left of the join is “ITEM_1”, not “ITEM”, and it’s the alias for the XO_STOCKTRANS.ROWID in the subquery.

      Since each row in xo_stock_trans could join to many rows in xo_prices based on the item and trans/price dates, and since there are many rows in xo_stock_trans for each item, we can’t aggregate and join on item; we want to aggregate on the xo_stock_trans ROW to find the max date for that row after the join, and that means aggregating on the rowid – which we then join back.

      Comment by Jonathan Lewis — January 3, 2015 @ 3:46 pm GMT Jan 3,2015 | Reply

  2. […] to find the maximum price_date for each target row in the xo_stock_trans table. (See my earlier posting on table duplication for a variation and explanation of what Oracle has done here). This step is a lot of work, but the […]

    Pingback by Most Recent | Oracle Scratchpad — January 7, 2015 @ 6:21 pm GMT Jan 7,2015 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.