In an echo of a very old “conditional SQL” posting a recent posting on the ODC general database discussion forum ran into a few classic errors of trouble-shooting. By a lucky coincidence this allowed me to rediscover and publish an old example of parallel execution gone wild before moving on to talk about the fundamental problem exhibited in the latest query.
The ODC thread started with a question along the lines of “why isn’t Oracle using the index I hinted” through the minor variation “When I hint my SQL with an index hint it runs quickly so I’ve created a profile that applies the hint, but the hint doesn’t get used in production.”
The query was a bit messy and, as is often the case with ODC, the formatting wasn’t particularly readable so I’ve extracted the where clause from the SQL that the OP used to generate the profile and reformatted it below. See if you can spot the clue that tells you why there might be a big problem using this SQL to generate a profile for the production environment:
WHERE MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE' AND MSG.MSG_CAPTR_STG_CD = 'PRE_BCS' AND MSG.SRCH_4_FLD_VAL = '123456' AND ( ( 'INVOICENUMBER' = 'INVOICENUMBER' AND MSG.MSG_ID IN ( SELECT * FROM TABLE(CAST(FNM_GN_IN_STRING_LIST('123456') AS TABLE_OF_VARCHAR))) ) OR ( 'INVOICENUMBER' = 'SIEBELORDERID' AND MSG.SRCH_3_FLD_VAL IN ( SELECT * FROM TABLE(CAST(FNM_GN_IN_STRING_LIST('') AS TABLE_OF_VARCHAR))) ) ) AND MSG.MSG_ID = TRK.INV_NUM(+) AND ( TRK.RESEND_DT IS NULL OR TRK.RESEND_DT = ( SELECT MAX(TRK1.RESEND_DT) FROM FNM.BCS_INV_RESEND_TRK TRK1 WHERE TRK1.INV_NUM = TRK.INV_NUM ) )
If the SQL by itself doesn’t give you an inportant clue, compare it with the Predicate Information from the “good” execution plan that it produced:
Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(("TRK"."RESEND_DT" IS NULL OR "TRK"."RESEND_DT"=)) 8 - filter(("MSG"."SRCH_4_FLD_VAL"='123456' AND "MSG"."MSG_CAPTR_STG_CD"='PRE_BCS')) 9 - access("MSG"."MSG_ID"="COLUMN_VALUE" AND "MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE') filter("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE') 10 - access("MSG"."MSG_ID"="TRK"."INV_NUM") 13 - access("TRK1"."INV_NUM"=:B1)
Have you spotted the thing that isn’t there in the predicate information ?
What happened to the ‘INVOICENUMBER’ = ‘INVOICENUMBER’ predicate and the ‘INVOICENUMBER’ = ‘SIEBELORDERID’ predicate?
They’ve disappeared because the optimizer knows that the first predicate is always true and doesn’t need to be tested at run-time and the second predicate is always false and doesn’t need to be tested at run-time. Moreover both predicates are part of a conjunct (AND) – so in the second case the entire two-part predicate can be eliminated; so the optimizer can immediately transform the original where clause into the following:
WHERE MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE' AND MSG.MSG_CAPTR_STG_CD = 'PRE_BCS' AND MSG.SRCH_4_FLD_VAL = '123456' AND ( MSG.MSG_ID IN ( SELECT * FROM TABLE(CAST(FNM_GN_IN_STRING_LIST('123456') AS TABLE_OF_VARCHAR))) ) AND MSG.MSG_ID = TRK.INV_NUM(+) AND ( TRK.RESEND_DT IS NULL OR TRK.RESEND_DT = ( SELECT MAX(TRK1.RESEND_DT) FROM FNM.BCS_INV_RESEND_TRK TRK1 WHERE TRK1.INV_NUM = TRK.INV_NUM ) )
Looking at this reduced predicate you may note that the IN subquery referencing the fnm_gn_in_string_list() collection could now be unnested and used to drive the final execution plan and the optimizer will even recognize that it’s a rowsource with at most one row. So here’s the “good” execution plan:
--------------------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 2 |00:00:00.08 | 12 | 7 | | | | | 1 | SORT ORDER BY | | 1 | 1 | 2 |00:00:00.08 | 12 | 7 | 2048 | 2048 | 2048 (0)| |* 2 | FILTER | | 1 | | 2 |00:00:00.08 | 12 | 7 | | | | | 3 | NESTED LOOPS OUTER | | 1 | 1 | 2 |00:00:00.08 | 10 | 7 | | | | | 4 | NESTED LOOPS | | 1 | 1 | 2 |00:00:00.06 | 6 | 5 | | | | | 5 | VIEW | VW_NSO_1 | 1 | 1 | 1 |00:00:00.01 | 0 | 0 | | | | | 6 | HASH UNIQUE | | 1 | 1 | 1 |00:00:00.01 | 0 | 0 | 1697K| 1697K| 487K (0)| | 7 | COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST | 1 | 1 | 1 |00:00:00.01 | 0 | 0 | | | | |* 8 | TABLE ACCESS BY INDEX ROWID | FNM_VSBL_MSG | 1 | 1 | 2 |00:00:00.06 | 6 | 5 | | | | |* 9 | INDEX RANGE SCAN | XIE2FNM_VSBL_MSG | 1 | 4 | 4 |00:00:00.04 | 4 | 3 | | | | |* 10 | INDEX RANGE SCAN | XPKBCS_INV_RESEND_TRK | 2 | 1 | 2 |00:00:00.01 | 4 | 2 | | | | | 11 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 2 | 0 | | | | | 12 | FIRST ROW | | 1 | 1 | 1 |00:00:00.01 | 2 | 0 | | | | |* 13 | INDEX RANGE SCAN (MIN/MAX) | XPKBCS_INV_RESEND_TRK | 1 | 1 | 1 |00:00:00.01 | 2 | 0 | | | | ---------------------------------------------------------------------------------------------------------------------------------------------------------------
The plan looks great – Oracle predicts a single row driver (operation 5) which can use a very good index (XIE2FNM_VSBL_MSG) in a nested loop, followed by a second nested loop, followed by a filter subquery and a sort of a tiny amount of data. Predictions match actuals all the way down the plan, and the workload is tiny. So what goes wrong in production?
You’ve probably guessed the flaw in this test. Why would anyone include a predicate like ‘INVOICENUMBER’ = ‘INVOICENUMBER’ in production code, or even worse ‘INVOICENUMBER’ = ‘SIEBELORDERID’. The OP has taken a query using bind variables picked up the actual values that were peeked when the query was executed, and substituted them into the test as literals. This has allowed the optimizer to discard two simple predicates and one subquery when the production query would need a plan that catered for the possibility that the second subquery would be the one that had to be executed and the first one bypassed. Here’s the corrected where clause using SQL*Plus variables (not the substitution type, the proper type) for the original bind variables:
WHERE MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE' AND MSG.MSG_CAPTR_STG_CD = 'PRE_BCS' AND MSG.SRCH_4_FLD_VAL = :BindInvoiceTo AND ( ( :BindSearchBy = 'INVOICENUMBER' AND MSG.MSG_ID IN ( SELECT * FROM TABLE(CAST(FNM_GN_IN_STRING_LIST(:BindInvoiceList) AS TABLE_OF_VARCHAR))) ) OR ( :BindSearchBy = 'SIEBELORDERID' AND MSG.SRCH_3_FLD_VAL IN ( SELECT * FROM TABLE(CAST(FNM_GN_IN_STRING_LIST(:BindSeibelIDList) AS TABLE_OF_VARCHAR))) ) ) AND MSG.MSG_ID = TRK.INV_NUM(+) AND ( TRK.RESEND_DT IS NULL OR TRK.RESEND_DT = ( SELECT MAX(TRK1.RESEND_DT) FROM FNM.BCS_INV_RESEND_TRK TRK1 WHERE TRK1.INV_NUM = TRK.INV_NUM ) )
And this, with the “once good” hint in place to force the use of the XIE2FNM_VSBL_MSG index, is the resulting execution plan
--------------------------------------------------------------------------------------------------------- | Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | | | | 1 | SORT ORDER BY | | 1 | 73728 | 73728 | | |* 2 | FILTER | | | | | | | 3 | NESTED LOOPS OUTER | | 1 | | | | |* 4 | TABLE ACCESS BY INDEX ROWID | FNM_VSBL_MSG | 1 | | | | |* 5 | INDEX FULL SCAN | XIE2FNM_VSBL_MSG | 4975K| | | | |* 6 | INDEX RANGE SCAN | XPKBCS_INV_RESEND_TRK | 1 | | | | |* 7 | COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST | 1 | | | | |* 8 | COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST | 1 | | | | | 9 | SORT AGGREGATE | | 1 | | | | | 10 | FIRST ROW | | 1 | | | | |* 11 | INDEX RANGE SCAN (MIN/MAX) | XPKBCS_INV_RESEND_TRK | 1 | | | | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter((((:BINDSEARCHBY='INVOICENUMBER' AND IS NOT NULL) OR (:BINDSEARCHBY='SIEBELORDERID' AND IS NOT NULL)) AND ("TRK"."RESEND_DT" IS NULL OR "TRK"."RESEND_DT"=))) 4 - filter(("MSG"."SRCH_4_FLD_VAL"=:BINDINVOICETO AND "MSG"."MSG_CAPTR_STG_CD"='PRE_BCS')) 5 - access("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE') filter("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE') 6 - access("MSG"."MSG_ID"="TRK"."INV_NUM") 7 - filter(VALUE(KOKBF$)=:B1) 8 - filter(VALUE(KOKBF$)=:B1) 11 - access("TRK1"."INV_NUM"=:B1)
The “unnested driving subquery” approach can no longer be used – we now start with the fnm_vsbl_msg table (accessing it using a most inefficient execution path because that’s what the hint demands, and the optimizer can obey the hint) and for each row check which of the two subqueries we need to execute. There is, in fact, no way we can hint this query to operate efficiently [at least, that’s my opinion, .I may be wrong].
The story so far
If you’re going to try to use SQL*Plus (or similar) to test a production query with bind variables you can’t just use a sample of literal values in place of the bind variables (though you may get lucky sometimes, of course), you should set up some SQL*Plus variables and assign values to them.
Though I haven’t said it presiously in this article this is an example where a decision that really should have been made by the front-end code has been embedded in the SQL and passed to the database as SQL which cannot be run efficiently. The front end code should have been coded to recognise the choice between invoice numbers and Siebel order ids and sent the appropriate query to the database.
Next Steps
WIthout making a significant change to the front-end mechanism wrapper is it possible to change the SQL to something the optimizer can handle efficiently? Sometimes the answer is yes; so I’ve created a simpler model to demonstrate the basic problem and supply a solution for cases like this one. The key issue is finding a way of working around the OR clauses that are trying to allow the optimizer to choose between two subqueries but make it impossible for either to be unnested into a small driving data set.
First, some tables:
rem rem Script: or_in_twice.sql rem Author: Jonathan Lewis rem Dated: June 2019 rem rem Last tested rem 18.3.0.0 rem 12.2.0.1 rem create table t1 as with generator as ( select rownum id from dual connect by level <= 1e4 -- > comment to avoid WordPress format issue ) select rownum id, mod(rownum,371) n1, lpad(rownum,10,'0') v1, lpad('x',100,'x') padding from generator v1, generator v2 where rownum <= 1e4 -- > comment to avoid WordPress format issue ; alter table t1 add constraint t1_pk primary key(id); create table t2 as with generator as ( select rownum id from dual connect by level <= 1e4 -- > comment to avoid WordPress format issue ) select rownum id, mod(rownum,372) n1, lpad(rownum,10,'0') v1, lpad('x',100,'x') padding from generator v1, generator v2 where rownum <= 1e4 -- > comment to avoid WordPress format issue ; create table t3 as with generator as ( select rownum id from dual connect by level <= 1e4 -- > comment to avoid WordPress format issue ) select rownum id, mod(rownum,373) n1, lpad(rownum,10,'0') v1, lpad('x',100,'x') padding from generator v1, generator v2 where rownum <= 1e4 -- > comment to avoid WordPress format issue ;
Now a query – first setting up a variable in SQL*Plus to allow us to emulate a production query with bind variables. Since I’m only going to use Explain Plan the variable won’t be peekable, so there would still be some scope for this plan not matching a production plan, but it’s adequate to demonstrate the structural problem:
variable v1 varchar2(10) exec :v1 := 'INVOICE' explain plan for select t1.v1 from t1 where ( :v1 = 'INVOICE' and t1.id in (select id from t2 where n1 = 0) ) or ( :v1 = 'ORDERID' and t1.id in (select id from t3 where n1 = 0) ) ; select * from table(dbms_xplan.display); --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 150 | 26 (4)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL| T1 | 10000 | 146K| 26 (4)| 00:00:01 | |* 3 | TABLE ACCESS FULL| T2 | 1 | 8 | 26 (4)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T3 | 1 | 8 | 26 (4)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(:V1='INVOICE' AND EXISTS (SELECT 0 FROM "T2" "T2" WHERE "ID"=:B1 AND "N1"=0) OR :V1='ORDERID' AND EXISTS (SELECT 0 FROM "T3" "T3" WHERE "ID"=:B2 AND "N1"=0)) 3 - filter("ID"=:B1 AND "N1"=0) 4 - filter("ID"=:B1 AND "N1"=0)
As you can see, thanks to the OR that effectively gives Oracle the choice between running the subquery against t3 or the one against t2, Oracle is unable to do any unnesting. (In fact different versions of Oracle allow different levels of sophistication with disjuncts (OR) of subqueries, so this is the kind of example that’s always useful to keep for tests against future versions.)
Since we know that we are going to use one of the data sets supplied in one of the subqueries and have no risk of double-counting or eliminating required duplicates, one strategy we could adopt for this query is to rewrite the two subqueries as a single subquery with a union all – because we know the optimizer can usually handle a single IN subquery very nicely. So let’s try the following:
explain plan for select t1.v1 from t1 where t1.id in ( select id from t2 where n1 = 0 and :v1 = 'INVOICE' union all select id from t3 where n1 = 0 and :v1 = 'ORDERID' ) ; select * from table(dbms_xplan.display); ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 54 | 1512 | 77 (3)| 00:00:01 | |* 1 | HASH JOIN | | 54 | 1512 | 77 (3)| 00:00:01 | | 2 | VIEW | VW_NSO_1 | 54 | 702 | 51 (2)| 00:00:01 | | 3 | HASH UNIQUE | | 54 | 432 | 51 (2)| 00:00:01 | | 4 | UNION-ALL | | | | | | |* 5 | FILTER | | | | | | |* 6 | TABLE ACCESS FULL| T2 | 27 | 216 | 26 (4)| 00:00:01 | |* 7 | FILTER | | | | | | |* 8 | TABLE ACCESS FULL| T3 | 27 | 216 | 26 (4)| 00:00:01 | | 9 | TABLE ACCESS FULL | T1 | 10000 | 146K| 26 (4)| 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T1"."ID"="ID") 5 - filter(:V1='INVOICE') 6 - filter("N1"=0) 7 - filter(:V1='ORDERID') 8 - filter("N1"=0)
Thanks to the FILTERs at operations 5 and 7 this plan will pick the data from just one of the two subqueries, reduce it to a unique list and then use that as the build table to a hash join. Of course, with different data (or suitable hints) that hash join could become a nested loop using a high precision index.
But there’s an alternative. We manually rewrote the two subqueries as a single union all subquery and as we did so we moved the bind variable comparisons inside their respective subqueries; maybe we don’t need to introduce the union all. What would happen if we simply take the original query and move the “constant” predicates inside their subqueries?
explain plan for select t1.v1 from t1 where t1.id in (select id from t2 where n1 = 0 and :v1 = 'INVOICE') or t1.id in (select id from t3 where n1 = 0 and :v1 = 'ORDERID') ; select * from table(dbms_xplan.display); ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 54 | 1512 | 77 (3)| 00:00:01 | |* 1 | HASH JOIN | | 54 | 1512 | 77 (3)| 00:00:01 | | 2 | VIEW | VW_NSO_1 | 54 | 702 | 51 (2)| 00:00:01 | | 3 | HASH UNIQUE | | 54 | 432 | 51 (2)| 00:00:01 | | 4 | UNION-ALL | | | | | | |* 5 | FILTER | | | | | | |* 6 | TABLE ACCESS FULL| T3 | 27 | 216 | 26 (4)| 00:00:01 | |* 7 | FILTER | | | | | | |* 8 | TABLE ACCESS FULL| T2 | 27 | 216 | 26 (4)| 00:00:01 | | 9 | TABLE ACCESS FULL | T1 | 10000 | 146K| 26 (4)| 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T1"."ID"="ID") 5 - filter(:V1='ORDERID') 6 - filter("N1"=0) 7 - filter(:V1='INVOICE') 8 - filter("N1"=0)
In 12.2.0.1 and 18.3.0.0 it gets the same plan as we did with our “single subquery” rewrite – the optimizer is able to construct the union all single subquery (although the ordering of the subqueries has been reversed) and unnest without any other manual intervention. (You may find that earlier versions of Oracle don’t manage to do this, but you might have to go all the way back to 10g.
Conclusion
Oracle doesn’t like disjuncts (OR) and finds conjuncts (AND) much easier to cope with. Mixing OR and subqueries is a good way to create inefficient execution plans especially when you try to force the optimizer to handle a decision that should have been taken in the front-end code. The optimizer, however, gets increasingly skilled at handling the mixture as you move through the newer versions but you may have to find ways to give it a little help if you see it running subqueries as filter subqueries when you’re expecting it to unnest a subquery to produce a small driving data set.
Worth mentioning a really nice script by Kerry Osborne to automate such task.
Comment by Timur Akhmadeev — June 18, 2019 @ 12:18 pm BST Jun 18,2019 |