I’ve probably seen a transformation like the following before and I may even have written about it (though if I have I can’t find the article), but since it surprised me when I was experimenting with a little problem a few days ago I thought I’d pass it on as an example of how sophisticated the optimizer can be with query transformation. I’ll be talking about the actual problem that I was working on in a later post so I won’t give you the table and data definitions in this post, I’ll just show some SQL and its plan:
rem rem Script: most_recent.sql rem Author: Jonathan Lewis rem Dated: 2nd Jan 2015 rem select tr.item, tr.trans_date, tr.quantity , pr.gross_price , pr.net_price , pr.special_price from xo_stock_trans tr, xo_prices pr where tr.trans_date between '01-AUG-2014' and '3-AUG-2014' and pr.item = tr.item and pr.price_date = ( select max(pr2.price_date) from xo_prices pr2 where pr2.item = tr.item and pr2.price_date <= tr.trans_date ) ;
The code is a fairly standard expression of “find me the most recent price available for each stock item as at the stock date of that item”. As you can see I’ve referenced the stock table once and the pricing table twice – the second appearance being in a “max()” correlated subquery. Oracle has decided to unnest the subquery – but spot the interesting detail in the plan:
----------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 14M| 748M| | 77955 (84)| 00:06:30 | |* 1 | HASH JOIN | | 14M| 748M| 37M| 77955 (84)| 00:06:30 | |* 2 | HASH JOIN | | 829K| 28M| | 70867 (92)| 00:05:55 | | 3 | JOIN FILTER CREATE | :BF0000 | 25274 | 370K| | 2530 (13)| 00:00:13 | |* 4 | TABLE ACCESS FULL | XO_STOCK_TRANS | 25274 | 370K| | 2530 (13)| 00:00:13 | | 5 | VIEW | VW_SQ_1 | 210M| 4206M| | 64135 (94)| 00:05:21 | | 6 | HASH GROUP BY | | 210M| 5408M| | 64135 (94)| 00:05:21 | | 7 | JOIN FILTER USE | :BF0000 | 210M| 5408M| | 11807 (67)| 00:01:00 | |* 8 | HASH JOIN | | 210M| 5408M| | 11807 (67)| 00:01:00 | |* 9 | TABLE ACCESS FULL | XO_STOCK_TRANS | 25274 | 370K| | 2530 (13)| 00:00:13 | |* 10 | INDEX FAST FULL SCAN| XO_PRICES_IX1 | 3918K| 44M| | 1936 (10)| 00:00:10 | | 11 | TABLE ACCESS FULL | XO_PRICES | 4571K| 87M| | 2275 (11)| 00:00:12 | ----------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("PR"."ITEM"="TR"."ITEM" AND "PR"."PRICE_DATE"="MAX(PR2.PRICE_DATE)") 2 - access("ITEM_1"=ROWID) 4 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 8 - access("PR2"."ITEM"="TR"."ITEM") filter("PR2"."PRICE_DATE"<="TR"."TRANS_DATE") 9 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 10 - filter("PR2"."PRICE_DATE"<=TO_DATE(' 2014-08-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
I was running 11.2.0.4 at the time, which is why we can get a serial Bloom filter on the hash join (though, perhaps, only on non-mergeable, aggregate, views) – and it’s interesting to see that the filter has been pushed inside the view operator; but the really interesting part of the plan is the second appearance of the XO_STOCK_TRANS table.
My correlated subquery returns a value that is used in a comparison with a column in the XO_PRICES table, but the correlation predicates referred back to the XO_STOCK_TRANS table, so the optimizer has added the XO_STOCK_TRANS to the subquery as it unnested it.
I’ve written several examples of how we can optimise SQL manually by rewriting it to introduce extra copies of some of the tables (typically in a fashion analogous to the optimizer’s mechanism for star transformations), so it’s nice to see another variation on the theme of the optimizer using table duplication to optimise a statement.
Footnote:
The execution plan in 10.2.0.5 is slightly different, but it still unnests the subquery, introducing a second occurrence of XO_STOCK_TRANS as it does so.
Update Sept 2021
After writing up a number of notes on the optimizer’s ability to transform queries I took some time to revisit this example to see if there was any way to block the transformation, or any way I could see the transformation being invoked in the optimizer trace file.
It’s virtually invisible in the trace file until you know the answer; but an exhaustive check of parameter settings and optimizer environments revealed that the feature appeared in 10.1 with the parameter “_optimizer_squ_bottomup”, and setting this parameter to false (which can be done through the opt_param() hint) will disable the transformation.
The clue in the trace file is the series of annotation (with a few lines deleted):
SU: Considering bottom-up subquery unnesting ... SU: Checking validity of unnesting subquery SEL$2 (#2) ... SU: Unnesting subquery query block SEL$2 (#2) Registered qb: SEL$0CF828B0 0x6a716258 (QUERY BLOCK TABLES CHANGED SEL$2)
Jonathan sir,
Bloom filter was fine,Didn’t understand why we have below hash join access conditions
2 – access(“ITEM_1″=ROWID)
Why Rowid was use as joining column and not tr.item?
Can you please post, transform sql by optimizer for better understanding on same.?
Comment by Deepak Mahto — January 3, 2015 @ 3:28 pm GMT Jan 3,2015 |
Deepak,
Here’s the transformed SQL (a re-formatted versin of the “Unparsed SQL” in the 10053 trace file) that makes it easier to see the answer to your first question:
Just in case you missed it, the column on the left of the join is “ITEM_1”, not “ITEM”, and it’s the alias for the XO_STOCKTRANS.ROWID in the subquery.
Since each row in xo_stock_trans could join to many rows in xo_prices based on the item and trans/price dates, and since there are many rows in xo_stock_trans for each item, we can’t aggregate and join on item; we want to aggregate on the xo_stock_trans ROW to find the max date for that row after the join, and that means aggregating on the rowid – which we then join back.
Comment by Jonathan Lewis — January 3, 2015 @ 3:46 pm GMT Jan 3,2015 |
Thanks a lot of information.. Now its crystal clear!!
Appreciate.!
Comment by Deepak Mahto — January 3, 2015 @ 3:53 pm GMT Jan 3,2015 |
[…] to find the maximum price_date for each target row in the xo_stock_trans table. (See my earlier posting on table duplication for a variation and explanation of what Oracle has done here). This step is a lot of work, but the […]
Pingback by Most Recent | Oracle Scratchpad — January 7, 2015 @ 6:21 pm GMT Jan 7,2015 |