This is a little case study on working around a problem caused by a 3rd-party application that couldn’t be changed quickly and easily. There’s a touch of irony in this case: the problem I’ll address is a side effect of the best possible workaround to a more generic design flaw in the application, and the side effect appears because of a special-case optimisation mechanism introduced in 12.2.
A recent post on the MOS database forum (needs an account) came up with a slightly surprising problem: a query with a predicate “columnX like ‘%XYZ%’” would perform reasonably well if hinted to use a B-tree index on columnX (doing an index full scan) but performed badly when cursor_sharing was set to force, because it switched to an index range scan which did a lot more work.
Modelling
Here’s a little script to build some data to demonstrate the issue:
rem
rem Script: index_wildcard.sql
rem Author: Jonathan Lewis
rem Dated: July 2022
rem Purpose:
rem
rem Last tested
rem 19.11.0.0
create table t1
as
select obj.*
from
(select * from all_objects where rownum <= 50000) obj,
(select rownum from dual connect by level <= 5) gen
/
create index t1_id on t1(object_name);
alter session set cursor_sharing = force;
alter session set statistics_level = all;
alter session set events '10053 trace name context forever';
set serveroutput off
All I’ve done is create a simple heap table with 250,000 rows and an index (which happens to be on a column declared as NOT NULL, but that detail is not important). Here’s my query:
select /*+ index(t1(object_name)) cursor_sharing_exact */
t1.*
from
t1 t1
where
object_name like '%XYZ%'
/
select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
In normal circumstances the leading wildcard would mean that the optimizer would not be able to use an index driven plan (unless the query satisfied the requirements to allow an index fast full scan), so the default plan would have been a tablescan with a selectivity of 5% (estimated rows = 12,500).
Because I’ve hinted the use of an index I get the following plan from a call to dbms_xplan.display_cursor():
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
SQL_ID 1q1p6fyubhafu, child number 0
-------------------------------------
select /*+ index(t1(object_name)) cursor_sharing_exact */ t1.* from
t1 t1 where object_name like '%XYZ%'
Plan hash value: 2798063786
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.08 | 1718 | 1717 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 12500 | 0 |00:00:00.08 | 1718 | 1717 |
|* 2 | INDEX FULL SCAN | T1_ID | 1 | 12500 | 0 |00:00:00.08 | 1718 | 1717 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_NAME" LIKE '%XYZ%')
Because I’ve set cursor_sharing to force I’ve included the hint /*+ cursor_sharing_exact */ in the query to make sure that Oracle sees the query with a literal rather than a system-generated bind variable. The plan tells us that the optimizer has picked an index full scan, the Predicate Information for operation 2 tells us that Oracle has checked our predicate for every single entry in the index, and the A-Rows column tells us that it didn’t find any matching index entries. So the total work done was 1,718 buffer gets and 250,000 index entries tested. (You’ll notice that the E-Rows is still the standard 5% guess.)
Now let’s try again, allowing enforced cursor sharing to be applied (i.e. taking out the cursor_sharing_exact hint):
SQL_ID cm4tjwuqxgbqg, child number 0
-------------------------------------
select /*+ index(t1(object_name)) */ t1.* from t1 t1 where
object_name like :"SYS_B_0"
Plan hash value: 539998951
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.26 | 241K|
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 12500 | 0 |00:00:00.26 | 241K|
|* 2 | INDEX RANGE SCAN | T1_ID | 1 | 12500 | 250K|00:00:00.04 | 1718 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME" LIKE :SYS_B_0)
2 - access("OBJECT_NAME" LIKE :SYS_B_0)
The SQL text shows us that the literal has been replaced by the bind variable :SYS_B_0. Operation 2 has changed from an index full scan to an index range scan – which “feels” as if “the range” should be the whole index and the query shouldn’t, therefore, do any more work than the index full scan plan. But the Buffers column now shows 241K buffer visits at operation 1, the table access! What’s gone wrong?
In the case of the OP the unexpected visit to the table blocks didn’t do just a few hundred thousand “unnecessary” buffer visits, it did 4 million buffer gets and 127,000 disk reads.
Look at the Predicate Information (always). The filter() that matched our original predicate is no longer being applied to the index; it’s applied only at the table. The only predicate applied to the index is an access() predicate, so Oracle takes the bind value and works out that the range scan should start at the beginning of the index and stop at the end of the index. That’s what an access() predicate does: it derives the start and stop locations but doesn’t check the actual values in the index entries in between. As a result all 250,000 index entries are passed up to operation 1, and the table is visited for every one of them before the predicate rejects the row.
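If you want to compare the two cursors side by side without re-reading the full plans, a query against v$sql_plan can pull out just the predicate columns. This is only a sketch: it assumes you have select privileges on v$sql_plan and that both cursors are still in the library cache, and the sql_id values are the ones from my test, so yours will differ.

```sql
rem     Show where each predicate is applied for the two test cursors.
rem     The sql_id values come from the plans above and are specific
rem     to this test; substitute your own.

select
        sql_id, child_number, id,
        operation || ' ' || options     operation,
        object_name,
        access_predicates,
        filter_predicates
from
        v$sql_plan
where
        sql_id in ('1q1p6fyubhafu', 'cm4tjwuqxgbqg')
and     (access_predicates is not null or filter_predicates is not null)
order by
        sql_id, child_number, id
/
```

A predicate that appears in filter_predicates only against the table operation, and not against the index operation, is the warning sign described above.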
Trouble-shooting
You will have noticed in the initial code to set up the data I had a call to set event 10053 (the CBO trace event) – a fairly quick and easy option to look at in this case since I’m looking at such a simple query. The critical details that showed up in the trace files were:
- in both cases the optimizer reported “User hint to use this index” in the Base Statistical Information.
- in the case of the “literal string” SQL the optimizer did not consider an index range scan
- in the case of the “forced bind” SQL the optimizer did not consider an index full scan
This is worth knowing as it tells us that the switch between paths was not an “accident” of circumstances or statistics, it was a deliberately coded strategy; so we need to discover whether there is something we can do to change the strategy.
Time to check the hidden parameters and fix controls, and this is what I found in a listing of the 19.11.0.0 system fix controls (listed as appearing in 12.2.x.x):
BUGNO OPTIMIZE SQL_FEATURE DESCRIPTION
---------- -------- ---------------------------------- ----------------------------------------------------------------
20289688 12.2.0.1 QKSFM_ACCESS_PATH_20289688 check for leading wildcard in LIKE with bind
Doesn’t that sound a little bit promising? Let’s turn the fix off by adding the hint /*+ opt_param('_fix_control' '20289688:0') */ to the problem query and see what we get:
SQL_ID 75cxtymdrh8r2, child number 0
-------------------------------------
select /*+ index(t1(object_name)) opt_param('_fix_control'
'20289688:0') */ t1.* from t1 t1 where object_name like :"SYS_B_0"
Plan hash value: 539998951
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.08 | 1718 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 12500 | 0 |00:00:00.08 | 1718 |
|* 2 | INDEX RANGE SCAN | T1_ID | 1 | 12500 | 0 |00:00:00.08 | 1718 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_NAME" LIKE :SYS_B_0)
filter("OBJECT_NAME" LIKE :SYS_B_0)
We still have an index range scan (with a range that is obviously the whole index), but our predicate is now being applied as a filter() in the index with a total of only 1,718 buffer gets even though the literal has been replaced by a system-generated bind variable.
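For anyone who wants to repeat the search and test the effect at the session level before touching any SQL, something like the following should do. Treat it as a sketch: it assumes access to v$system_fix_control and v$session_fix_control, the description filter is just my guess at a useful search string, and (as with any underscore parameter) it should be tried on a test system first and discussed with Oracle Support before going near production.

```sql
rem     Look for fix controls whose description mentions wildcards

select
        bugno, value, optimizer_feature_enable, sql_feature, description
from
        v$system_fix_control
where
        lower(description) like '%wildcard%'
order by
        bugno
/

rem     Switch the fix off for the current session only

alter session set "_fix_control" = '20289688:0';

rem     Check that the session setting has taken effect

select
        bugno, value, is_default
from
        v$session_fix_control
where
        session_id = sys_context('userenv','sid')
and     bugno      = 20289688
/
```

The session-level setting affects every statement optimized in that session, which is why the hint (or a patch carrying the hint) is the more targeted choice.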
Summary
The application is accepting a search value from the user interface and constructing “literal string” SQL that adds a leading and trailing wildcard to the value. As a result of this the optimizer will not consider using a particular index unless hinted, and does a very expensive tablescan.
The optimizer will accept and use an index() hint to produce a plan that is much more efficient than the tablescan even though it has to use an index full scan and apply a filter predicate on every index entry.
Unfortunately cursor_sharing has been set to force to deal with the general parsing overheads of applications that construct “literal string” SQL, and when the index() hint is applied to a bind-variable version of this query the optimizer applies an index range scan and postpones the filter() until it has acquired rows from the table.
A check of the system fix controls identified a fix that looked as if it might have some relevance to our situation, “leading wild card, bind variable, LIKE clause”, and a test with an opt_param() hint to switch off that optimizer “fix” had the effect we wanted – the optimizer still reported an index range scan but it also used our initial predicate as a filter() predicate on the index.
Technically, since the fix_control was introduced in 12.2.0.1 we could have got the same effect by setting the optimizer_features_enable back to 12.1.0.2 – but that’s not a good idea when the production version is so much newer although it might be an acceptable strategy when implemented as a hint for a few special case queries.
After showing that the fix control works by hinting the next step should be to create an SQL Patch (using dbms_sqldiag.create_sql_patch) attached to the (forced) bind-variable version of the query.
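A minimal sketch of that step follows. It assumes 12.2 or later, where create_sql_patch is a public function that accepts a sql_id, and it assumes the index() hint is already present in the statement text, as it is in my test. The sql_id is the forced-bind cursor from this test and the patch name is a hypothetical one I’ve made up; both would have to be replaced with the real values from the application.

```sql
set serveroutput on

declare
        v_patch_name    varchar2(128);
begin
        -- sql_id of the bind-variable version of the statement (test value)
        -- patch name and description are illustrative only
        v_patch_name := dbms_sqldiag.create_sql_patch(
                sql_id      => 'cm4tjwuqxgbqg',
                hint_text   => q'{opt_param('_fix_control' '20289688:0')}',
                name        => 'index_wildcard_fix_20289688',
                description => 'Disable fix 20289688 for LIKE with leading-wildcard bind'
        );
        dbms_output.put_line('Created patch: ' || v_patch_name);
end;
/
```

After the statement has been re-optimized, the Note section of the plan from dbms_xplan.display_cursor() should report that the SQL patch was used. The patch can be removed with dbms_sqldiag.drop_sql_patch(name => 'index_wildcard_fix_20289688').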
Update (a few hours later)
It didn’t take long to find that the fix_control and the optimizer_features_enable were not the only workarounds. There’s a hidden parameter “_optim_peek_user_binds” with the description “enable peeking of user binds” and default value “true”, so if you set this to false the optimizer can’t notice that the critical bind value starts with a wildcard and behaves the way the OP wants. As with the other options you could set this as a system parameter or a session parameter, or as a hint (SQL patch etc.) /*+ opt_param('_optim_peek_user_binds' 'false') */, the last being the kindest for the system.
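As a sketch of how to test this variation against the model from the start of the note (it assumes cursor_sharing is still set to force and statistics_level to all in the session, as in the original setup script):

```sql
select  /*+
                index(t1(object_name))
                opt_param('_optim_peek_user_binds' 'false')
        */
        t1.*
from
        t1 t1
where
        object_name like '%XYZ%'
/

select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
```

The thing to check in the output is the Predicate Information: the workaround is doing its job only if the LIKE predicate appears as a filter() against the index operation rather than against the table access.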
Update (21st July)
Bug number 33500777 (unpublished) looks like an exact match for the problem; the description reads:
FILTER FOR PREDICATE WITH LIKE AND BIND WITH LEADING WILDCARD IS NOT ALLOCATED ON INDEX SCAN AFTER FIX 20289688
Hello Jonathan,
I find the behavior here somewhat contradictory …
On the one hand, using CURSOR_SHARING_EXACT (i.e. forcing the optimizer to “see” the exact value used for the bind variable) does produce a good plan.
On the other hand, when using CURSOR_SHARING = FORCE, setting “_optim_peek_user_binds” to FALSE (i.e. preventing the optimizer from “seeing” the value of the bind variable) will also produce the good plan.
I don’t see any logical reason for the optimizer NOT to apply ALL the filter predicates already on the INDEX, once it decided (or was driven by a hint) to use an index, so it looks to me that this is a coding flaw waiting to be corrected.
Interestingly, there is no documented hint for explicitly requesting an INDEX FULL SCAN, as of 21c.
Anyway, if we already decide to use hint(s) for optimizing a statement, I guess that it is probably better to tell the optimizer what to do than to tell it what not to do.
And this comes back again to the old problem of missing full documentation and explanations for all the available hints …
Thanks a lot & Best Regards,
Iudith Mentzel
Comment by Iudith Mentzel, July 16, 2022 @ 2:26 pm BST
Iudith,
Thanks for the comment.
Yes, definitely contradictory – but I think such inconsistencies are the natural result of gradual changes that are somewhere between extensions and fixes that are applied as boundary conditions show up.
I don’t think this behaviour was designed in, I think it emerged over time, possibly as SRs were raised describing special cases. If you check the v$system_fix_control you’ll find:
So over the range of 3 major versions some code in this area received changes that were probably sensible solutions to specific SRs, but turned out to be a very bad idea when viewed from a higher level.
It’s also worth remembering two other points:
1) this inconsistency showed up when an indexed path that would not otherwise be considered was taken, and the hint was necessary to produce a really bad execution path because of a terrible design feature in the application
2) a better plan still than the hinted one would be an index fast full scan to find the small (in this case non-existent) set of rowids that should be used to access the table – but index_ffs() … table access by rowid is not programmed in to the optimizer and has to be coded by hand.
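As an illustration of point 2, the hand-coded version could look something like the following against the model table. This is a sketch only: the hints are my choice for getting the intended shape and I haven’t shown a plan for it here, so check the execution plan before relying on it.

```sql
select  /*+ leading(v1 t1) use_nl(t1) */
        t1.*
from
        (
        select  /*+ no_merge index_ffs(t1a(object_name)) */
                rowid   rid
        from
                t1 t1a
        where
                object_name like '%XYZ%'
        )       v1,
        t1
where
        t1.rowid = v1.rid
/
```

If the hints are honoured the inline view should appear as an index fast full scan that filters on object_name, followed by a nested loop that visits the table by rowid only for the index entries that survive the filter.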
Regards
Jonathan Lewis
Comment by Jonathan Lewis, July 17, 2022 @ 7:20 pm BST