Here’s a note that has been sitting around for more than 3 years (the draft date is Jan 2015), waiting for me to finish it off; and in that time we’ve got a new version of Oracle that changes the solution to the problem it presented. (I also managed to write “Conditional SQL – 6” in the intervening period !)
This posting started with a question on the Oracle Developer forum about an execution plan used by 11.2.0.3. Here’s a model to represent the data and the query:
rem rem Script: null_plan_4.sql rem Author: Jonathan Lewis rem Dated: Jan 2015 rem rem Last tested rem 12.2.0.1 rem 12.1.0.2 rem 11.2.0.4 rem create table catentry as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 -- > comment here to avoid format issue ) select rownum catentry_id, case when mod(rownum-1,100) > 0 then mod(rownum-1,100) end member_id, case when trunc((rownum-1)/100) > 0 then trunc((rownum-1)/100) end partnumber, rpad('x',100) padding from generator, generator where rownum <= 100 * 100 -- > comment here to avoid format issue ; execute dbms_stats.gather_table_stats(user,'catentry'); create unique index cat_i0 on catentry(member_id, partnumber) compress 1; -- create index cat_i1 on catentry(member_id, partnumber, 0) compress 1; -- create index cat_i2 on catentry(partnumber, member_id, 0) compress 1; variable b1 number variable b2 number variable b3 number variable b4 number begin :b1 := 22; :b2 := 1; :b3 := 44; :b4 := 1; end; / select catentry_id from catentry where ( partnumber= :b1 or (0 = :b2 and partnumber is null) ) and ( member_id= :b3 or (0 = :b4 and member_id is null) ) ; select * from table(dbms_xplan.display_cursor); ============================================================================== ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 23 (100)| | |* 1 | TABLE ACCESS FULL| CATENTRY | 1 | 10 | 23 (5)| 00:00:01 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter((("PARTNUMBER"=:B1 OR ("PARTNUMBER" IS NULL AND 0=:B2)) AND ("MEMBER_ID"=:B3 OR ("MEMBER_ID" IS NULL AND 0=:B4))))
The question this example raised was, effectively: “Why didn’t Oracle use bind peeking to work out that the best plan for this query – with these input values – was an index range scan?”
The basic answer to this question is this: “No matter how clever Oracle can be with bind peeking and executions plans it has to produce an execution plan that will give the right answer whatever the supplied values might be.”
The OP was hoping that the optimizer would see :b2 and :b4 were arriving with the value 1, infer that “0 = 1” is always false, and reduce the query predicate to “partnumber =22 and member_id = 44” to produce the following plan:
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 2 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID| CATENTRY | 1 | 10 | 2 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | CAT_I0 | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("PARTNUMBER"=22 AND "MEMBER_ID"=44)
But this plan could produce the wrong results if the next execution of the query supplied zeros for :b2 or :b4, so Oracle has to do something more generic. (Bear in mind that adaptive cursor sharing keeps reusing the same execution plan until it detects that the performance for some input values is bad; it doesn’t pre-emptively create new plans based on the incoming values – though in principle it might be possible for the Oracle developers to introduce code that can recognise special cases for predicates of the form “constant1 = constant2”).
If you review the SQL statement you can see that it’s clearly trying to allow the user to find data about member_ids and partnumbers where both, neither, or either value is allowed to be null: a couple of “if – then – else” conditions that should have been handled in the client code have been embedded in the code. As we have seen several times before (1) (2) (3) (4) if you can’t change the client code then you have to hope that Oracle will use some clever transformation to handle the query in sections.
We can infer from various details of the posting that the member_id and partnumber columns were both allowed to be null, so if we want to make sure that Oracle always uses an indexed access path to acquire data for this query we need to have an index which starts with those two columns and then has at least one column which is guaranteed to be non-null so, for example, we could simply drop the current index and replace it with one that has a fixed zero on the end:
create index cat_i1 on catentry(member_id, partnumber, 0) compress 1;
With my particular data set, query, and version of Oracle this didn’t make any difference to the plan. But then I thought about the data definition and realised (and checked) that the index had a terrible clustering_factor, so I dropped it and created it with the first two columns in the opposite order:
create index cat_i2 on catentry(partnumber, member_id, 0) compress 1;
Side note:
You’ll notice that I’ve replaced the original unique index with a non-unique index. This was necessary because there were many rows where both partnumber and member_id were null, so if I want to maintain the logic of the previous unique index I’ll need to add a unique constraint on (member_id, partnumber). It’s possible, of course, that in similar circumstances I might want both indexes – one for the uniqueness and to access the data using only one of the columns, the other to access the data using only the other column.
With this index in place, and unhinted, the plan I got from 11.2.0.4 changed to use concatenation with an impressive four-way split:
------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 12 (100)| | | 1 | CONCATENATION | | | | | | |* 2 | FILTER | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID| CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | CAT_I2 | 1 | | 2 (0)| 00:00:01 | |* 5 | FILTER | | | | | | |* 6 | TABLE ACCESS BY INDEX ROWID| CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | CAT_I2 | 1 | | 2 (0)| 00:00:01 | |* 8 | FILTER | | | | | | |* 9 | TABLE ACCESS BY INDEX ROWID| CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | CAT_I2 | 1 | | 2 (0)| 00:00:01 | |* 11 | TABLE ACCESS BY INDEX ROWID | CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 12 | INDEX RANGE SCAN | CAT_I2 | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter((0=:B2 AND 0=:B4)) 4 - access("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL) filter("MEMBER_ID" IS NULL) 5 - filter(0=:B2) 6 - filter((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4))) 7 - access("PARTNUMBER" IS NULL AND "MEMBER_ID"=:B3) filter("MEMBER_ID"=:B3) 8 - filter(0=:B4) 9 - filter((LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2))) 10 - access("PARTNUMBER"=:B1 AND "MEMBER_ID" IS NULL) 11 - filter(((LNNVL("MEMBER_ID" IS NULL) OR LNNVL(0=:B4)) AND (LNNVL("PARTNUMBER" IS NULL) OR LNNVL(0=:B2)))) 12 - access("PARTNUMBER"=:B1 AND "MEMBER_ID"=:B3)
To execute this plan the run-time engine works as follows:
- Operation 2: If :b2 and :b4 are both zero we use the index to find the rows where member_id and partnumber are null (the filter “member_id is null” seems to be redundant)
- Operation 5: if :b2 is zero we use the index to find rows where the partnumber is null and the member_id is the supplied value (and if that’s null the access will immediately return zero rows because of the equality predicate), and – through the calls to lnnvl() – discard any rows that have already been returned by operation 2
- Operation 8: if :b4 is zero we will use the index to find rows where the partnumber is the supplied value (and if the partnumber is null, that access will immediately return zero rows because of the equality predicate) and the member_id is null, and discard any rows that have already been returned by operations 2 and 5.
- Operations 11 and 12 will always run – using the index to find rows that match with equality on both the incoming member_id and partnumber, discarding any rows already returned by the previous operations, and obviously not matching any rows where either column “IS” null.
The critical feature of this plan, of course, is that we got it because we have given Oracle an efficient option to find the rows where both member_id and partnumber are null – and that allows the rest of the concatenation options to take place.
Hints and Upgrades
Interestingly, after the clue that 11g would happily use concatenation with a “good enough” index I went back to the example where I’d just added a zero to the existing index and checked to see what would happen if I added a /*+ use_concat */ hint (without any of the qualifying parameters that the hint can now use) and got the same concatenated plan. The fact that the path appeared wasn’t the interesting bit – see if you can spot what is the interesting bit:
------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 12 (100)| | | 1 | CONCATENATION | | | | | | |* 2 | FILTER | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID| CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | CAT_I1 | 1 | | 2 (0)| 00:00:01 | |* 5 | FILTER | | | | | | |* 6 | TABLE ACCESS BY INDEX ROWID| CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | CAT_I1 | 1 | | 2 (0)| 00:00:01 | |* 8 | FILTER | | | | | | |* 9 | TABLE ACCESS BY INDEX ROWID| CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | CAT_I1 | 1 | | 2 (0)| 00:00:01 | |* 11 | TABLE ACCESS BY INDEX ROWID | CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 12 | INDEX RANGE SCAN | CAT_I1 | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------
Check the cost, and compare it with the cost of the full tablescan. The hinted path has a lower cost than the default path. I think this may be another case of an “unknowable” range scan being ignored in favour of a known alternative.
Finally, we get to today – when I tested the code against 12.1.0.2 and 12.2.0.1. Nothing exciting happened in 12.1.0.2 – the plans were just like the 11g plans, but here’s the plan I got in 12.2 with the “bad” indexing (original column order with added zero column – index cat_i1) without any hints in the SQL:
---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 9 (100)| | | 1 | VIEW | VW_ORE_37A7142B | 4 | 52 | 9 (0)| 00:00:01 | | 2 | UNION-ALL | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID | CATENTRY | 1 | 10 | 2 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | CAT_I1 | 1 | | 1 (0)| 00:00:01 | |* 5 | FILTER | | | | | | | 6 | TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY | 1 | 10 | 2 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | CAT_I1 | 1 | | 1 (0)| 00:00:01 | |* 8 | FILTER | | | | | | | 9 | TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY | 1 | 10 | 2 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | CAT_I1 | 1 | | 1 (0)| 00:00:01 | |* 11 | FILTER | | | | | | | 12 | TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY | 1 | 10 | 3 (0)| 00:00:01 | |* 13 | INDEX RANGE SCAN | CAT_I1 | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------- Outline Data ------------- /*+ BEGIN_OUTLINE_DATA ... OR_EXPAND(@"SEL$1" (1) (2) (3) (4)) ... END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1) 5 - filter(0=:B4) 7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1) filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3))) 8 - filter(0=:B2) 10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL) filter(LNNVL("PARTNUMBER"=:B1)) 11 - filter((0=:B4 AND 0=:B2)) 13 - access("MEMBER_ID" IS NULL AND "PARTNUMBER" IS NULL) filter(("PARTNUMBER" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND LNNVL("MEMBER_ID"=:B3)))
The full tablescan didn’t appear – but it wasn’t eliminated by concatenation but by the “new” 12.2 variant: “OR EXPANSION”. In this case the net effect is remarkably similar – we still have filter operations comparing :b2 and :b4 with zero, and we still have a scattering of lnnvl() function calls being used to discard rows we’ve already accessed, but the pattern is slightly different and we have a union all rather than concatenation.
This change prompted me to go back to testing with just the original index (member_id, partnumber – index cat_i0) … which took me back to the full tablescan until I added the hint /*+ or_expand */ to the query to get the following plan:
---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 29 (100)| | | 1 | VIEW | VW_ORE_37A7142B | 4 | 52 | 29 (4)| 00:00:01 | | 2 | UNION-ALL | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID | CATENTRY | 1 | 10 | 2 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | CAT_I0 | 1 | | 1 (0)| 00:00:01 | |* 5 | FILTER | | | | | | | 6 | TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY | 1 | 10 | 2 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | CAT_I0 | 1 | | 1 (0)| 00:00:01 | |* 8 | FILTER | | | | | | | 9 | TABLE ACCESS BY INDEX ROWID BATCHED| CATENTRY | 1 | 10 | 2 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | CAT_I0 | 1 | | 1 (0)| 00:00:01 | |* 11 | FILTER | | | | | | |* 12 | TABLE ACCESS FULL | CATENTRY | 1 | 10 | 23 (5)| 00:00:01 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("MEMBER_ID"=:B3 AND "PARTNUMBER"=:B1) 5 - filter(0=:B4) 7 - access("MEMBER_ID" IS NULL AND "PARTNUMBER"=:B1) filter(("PARTNUMBER"=:B1 AND LNNVL("MEMBER_ID"=:B3))) 8 - filter(0=:B2) 10 - access("MEMBER_ID"=:B3 AND "PARTNUMBER" IS NULL) filter(LNNVL("PARTNUMBER"=:B1)) 11 - filter((0=:B4 AND 0=:B2)) 12 - filter(("PARTNUMBER" IS NULL AND "MEMBER_ID" IS NULL AND LNNVL("PARTNUMBER"=:B1) AND LNNVL("MEMBER_ID"=:B3)))
The plan shows “or expansion”, and highlights the most significant difference between concatenation and expansion – concatenation requires indexed access paths in all branches, or-expansion doesn’t.
At first sight this plan with its full tablescan at operation 12 might seem like a bad idea; but check operation 11, the guarding filter, the tablescan occurs only if both :b2 and :b4 are null. Perhaps that special condition is never supposed to appear, perhaps it’s going to do a lot of work whether or not you can use an index. The fact that you can now handle the original problem without adding or altering existing indexes – provided you can inject this or_expand hint – may be of significant benefit. (Of course, being able to recreate the original index with the columns in the reverse order would even avoid the necessity of worrying about the hint.)
tl;dr
Applications that push “if-then-else” decisions into the SQL and down to the optimizer are a pain in the backside; the performance problems they produce can sometimes be bypassed by the addition of extra indexes that might give you plans (possibly hinted) that report the use of the concatentation operation. In 12cR2 the optimizer has an improved strategy for damage limitation “Cost-based Or Expansion” that can produce very similar effects without the addition of extra indexes. These plans will report union all operations, referencing views with names like: VW_ORE_xxxxxxxx and will probably include several appearances of the lnnvl() function in their predicate information.
When you next upgrade you may find a few cases where you can get rid of indexes that were only created to work around defective application coding strategies. You may also want to hunt down any code where you’ve added use_concat hints and see if they can be removed, or if they should be replaced by or_expand. since the former hint will disable the latter feature.
Jonathan,
Very interesting and useful article. I think I found some minor typos while going though it, and thought to post them in hope they are useful to other readers.
– The “create index cat_i2 on catentry(member_id, partnumber, 0) compress 1;” statement located approx. in the middle of the post should probably be changed to
“create index cat_i2 on catentry(partnumber, member_id, 0) compress 1;”, like it’s correctly stated at the beginning of the post.
– “Operation 5: if :b2 is null we use the index ” should probably read: “Operation 5: if :b2 is zero we use the index”
– a “create table catentry as ..” is missing at the beginning of the “with generator as (..” statement.
Regards,
Jure
Comment by Jure Bratina — March 15, 2018 @ 3:33 pm GMT Mar 15,2018 |
Jure,
Thanks for that – correct on all counts, and now corrected in the post.
Comment by Jonathan Lewis — March 15, 2018 @ 3:58 pm GMT Mar 15,2018 |
[…] reading the example my first thought was to wonder why the optimizer hadn’t simple used “OR-expansion” (or concatenation if you’re running an older version), then I remembered that by the time the […]
Pingback by Danger – Hints | Oracle Scratchpad — May 22, 2019 @ 2:56 pm BST May 22,2019 |
[…] three-part union all. This, by the way, is a 12c feature, though older versions of Oracle could get similar effects in some cases from the /*+ use_concat */ hint. Here’s the plan, with outline, I got from […]
Pingback by ANSI Plans | Oracle Scratchpad — October 22, 2019 @ 6:59 pm BST Oct 22,2019 |
[…] OR-expansion appeared specifically in 12.2.0.1; prior to that we had a similar option in the use_concat() hint and concatenation – which also attempts to rewrite your query to produce a union all of […]
Pingback by Subquery with OR | Oracle Scratchpad — August 19, 2020 @ 2:03 pm BST Aug 19,2020 |