Saturday’s posting about setting cursor_sharing to force reminded me about one of the critical limitations/restrictions of SQL Profiles (which is one of those little reasons why you shouldn’t be hacking SQL Profiles as a substitute for SQL Plan Baselines). Here’s a demo (taking advantage of some code that I think Kerry Osborne published several years ago) of creating an SQL Profile from the current execution plan of a simple statement – first we create some data and find the sql_id and child_number for a simple query:
rem
rem     Script:         sql_profile_restriction.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2018
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2

create table t1
as
select
        rownum                  n1,
        rownum                  n2,
        lpad(rownum,10)         small_vc,
        rpad('x',100,'x')       padding
from
        dual
connect by
        level <= 1e4 -- > comment to avoid WordPress format issue
;

alter system flush shared_pool;

select /*+ find this */ count(*) from t1 where n1 = 15 and n2 = 15;

column sql_id new_value m_sql_id
column child_number new_value m_child_number

select
        sql_id, child_number
from
        v$sql
where
        sql_text like 'selec%find this%'
and     sql_text not like '%v$sql%'
;
Now I can create the SQL Profile for this query using the Kerry Osborne code:
declare
        ar_profile_hints        sys.sqlprof_attr;
        cl_sql_text             clob;
begin
        select
                extractvalue(value(d), '/hint') as outline_hints
        bulk collect into
                ar_profile_hints
        from
                xmltable(
                        '/*/outline_data/hint'
                        passing (
                                select
                                        xmltype(other_xml) as xmlval
                                from
                                        v$sql_plan
                                where
                                        sql_id = '&m_sql_id'
                                and     child_number = &m_child_number
                                and     other_xml is not null
                        )
                ) d;

        select
                sql_fulltext
        into
                cl_sql_text
        from
                v$sql
        where
                sql_id = '&m_sql_id'
        and     child_number = &m_child_number
        ;

        dbms_sqltune.import_sql_profile(
                sql_text        => cl_sql_text,
                profile         => ar_profile_hints,
                category        => 'DEFAULT',
                name            => 'PROFILE_LITERAL',
                force_match     => true
        );
end;
/
Note particularly that I have given the profile a simple name, put it in the DEFAULT category, and set force_match to true (which means that the profile ought to be used even if I change the literal values in the query). So now let’s check that the profile will be used as expected. First I’ll create an index that is a really good index for this query, then I’ll run the query to see if Oracle uses the index or obeys the profile; then I’ll change the query (literals) slightly and check again. I’ll also run a query that won’t be recognised as legally matching (thanks to the changed “hint”) to demonstrate that the index could have been used if the profile hadn’t been there:
alter system flush shared_pool;
set serveroutput off

prompt  =============================
prompt  Is the SQL Profile used ? Yes
prompt  =============================

select /*+ find this */ count(*) from t1 where n1 = 15 and n2 = 15;
select * from table(dbms_xplan.display_cursor);

select /*+ find this */ count(*) from t1 where n1 = 16 and n2 = 16;
select * from table(dbms_xplan.display_cursor);

select /*+ Non-match */ count(*) from t1 where n1 = 16 and n2 = 16;
select * from table(dbms_xplan.display_cursor);
Here (with a little cosmetic adjustment) are the three outputs from dbms_xplan.display_cursor():
SQL_ID  ayxnhrqzd38g3, child number 0
-------------------------------------
select /*+ find this */ count(*) from t1 where n1 = 15 and n2 = 15

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    24 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |     8 |    24   (5)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("N1"=15 AND "N2"=15))

Note
-----
   - SQL profile PROFILE_LITERAL used for this statement


SQL_ID  gqjb8pp35cnyp, child number 0
-------------------------------------
select /*+ find this */ count(*) from t1 where n1 = 16 and n2 = 16

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    24 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |     8 |    24   (5)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("N1"=16 AND "N2"=16))

Note
-----
   - SQL profile PROFILE_LITERAL used for this statement


SQL_ID  3gvaxypny9ry1, child number 0
-------------------------------------
select /*+ Non-match */ count(*) from t1 where n1 = 16 and n2 = 16

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |       |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_I1 |     1 |     8 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=16 AND "N2"=16)
As you can see the SQL Profile is reported as used in the first two queries, and (visibly) seems to have been used. Then in the third query where we wouldn’t expect a match the SQL Profile is not used and we get a plan that shows the index would have been used for the other queries had the SQL Profile not been there. So far, so good – the profile behaves as everyone might expect.
Bind Variable Breaking
Now let’s repeat the entire experiment but first do a global find and replace to change every occurrence of “n2 = 16” to “n2 = :b1”. We’ll also change the name of the SQL Profile when we create it to PROFILE_MIXED, and we’ll put in a couple of lines at the top of the script to declare the variable b1 and set its value, then the final test in the script will look like this:
alter system flush shared_pool;
create index t1_i1 on t1(n1, n2);

exec :b1 := 15

select /*+ find this */ count(*) from t1 where n1 = 15 and n2 = :b1;
select * from table(dbms_xplan.display_cursor);

exec :b1 := 16

select /*+ find this */ count(*) from t1 where n1 = 16 and n2 = :b1;
select * from table(dbms_xplan.display_cursor);
And here are the execution plans from the two queries:
SQL_ID  236f82vmsvjab, child number 0
-------------------------------------
select /*+ find this */ count(*) from t1 where n1 = 15 and n2 = :b1

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    24 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |     8 |    24   (5)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("N1"=15 AND "N2"=:B1))

Note
-----
   - SQL profile PROFILE_MIXED used for this statement


SQL_ID  7nakm3tw27z3c, child number 0
-------------------------------------
select /*+ find this */ count(*) from t1 where n1 = 16 and n2 = :b1

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |       |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_I1 |     1 |     8 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=16 AND "N2"=:B1)
As you can see the execution plan for the original query is still doing a full tablescan and reporting the SQL Profile as used; but we’re not using (or reporting) the SQL Profile when we change the literal values – even though a query against dba_sql_profiles will tell us that the profile has force_matching = ‘YES’.
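For reference, here’s a quick sketch of that check against dba_sql_profiles (using the profile names from this demo):

select
        name, category, type, status, force_matching
from
        dba_sql_profiles
where
        name in ('PROFILE_LITERAL', 'PROFILE_MIXED')
order by
        name
;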
tl;dr
(Clarified in response to Mohammed Houri’s comment below)
If you use an SQL Profile with force_match => true to “hide” the literals in a statement that also contains bind variables (even if they appear only in the select list, in fact) the force-matching mechanism will not be used, and the SQL Profile will apply only to the original statement.
Update
Christian Antognini has an elegant little script that uses the dbms_sqltune.sqltext_to_signature() function to highlight this point (among others). Bear in mind, before you run the script, that you need to be licensed to use the dbms_sqltune package to do so.
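If you just want a flavour of the signature behaviour – this isn’t Christian’s script, just a minimal sketch using the statements from this demo – a PL/SQL block like the following should show the two all-literal texts collapsing to the same signature while the two literal-plus-bind texts do not:

set serveroutput on

declare
        -- helper: print the force-matching signature next to the text it came from
        procedure show(p_text in varchar2) is
        begin
                dbms_output.put_line(
                        to_char(dbms_sqltune.sqltext_to_signature(p_text, true)) ||
                        '  <--  ' || p_text
                );
        end;
begin
        show('select count(*) from t1 where n1 = 15 and n2 = 15');
        show('select count(*) from t1 where n1 = 16 and n2 = 16');
        show('select count(*) from t1 where n1 = 15 and n2 = :b1');
        show('select count(*) from t1 where n1 = 16 and n2 = :b1');
end;
/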
Jonathan
It seems to me that it is the mixture of both literals and bind variables that prevents the SQL Profile from force matching. If we don’t mix them then the SQL Profile will be used.
Best regards
Mohamed
Comment by hourim — June 26, 2018 @ 10:01 am BST Jun 26,2018 |
Mohammed,
That was exactly the point of the blog. However, I could have expressed my tl;dr better – after all, if you don’t have ANY literals then you wouldn’t need to worry about force matching. It would be redundant.
I’ve updated the tl;dr to clarify the point.
Comment by Jonathan Lewis — June 26, 2018 @ 10:07 am BST Jun 26,2018 |
Jonathan
Yes you are right.
Best regards
Mohamed
Comment by hourim — June 26, 2018 @ 10:12 am BST Jun 26,2018 |
Jonathan, I have come across a very strange issue with the 12.1.0.2 optimizer using a SQL plan baseline for SQLs having a mix of both literals and binds. We didn’t turn on SPB auto capture, but after creating an SPB for the SQL it started capturing too many SPBs (~300 in a few hours) with an origin of “Auto Capture”, taking longer (from milliseconds to minutes) even for parsing, generating too many child cursors (12 or more), and building up a queue of sessions, most of them waiting on either “cursor pin S wait on X” or “row cache lock”, with TOP_LEVEL_CALL_NAME reported as “lob/read” and the corresponding segment being SQLOBJ$PLAN. No answers from Oracle support yet. We do use cursor_sharing = exact and this SQL is part of a PL/SQL procedure…
Comment by Tony — June 28, 2018 @ 5:00 am BST Jun 28,2018 |
Tony,
One of the drawbacks of SQL Plan Baselines is that once a specific SQL statement has a baseline associated with it Oracle will carry on creating new baselines for it even when the parameter optimizer_capture_sql_plan_baselines is set to false. Although this helps to explain why you could end up with lots of SPBs for this statement, it shouldn’t really explain why the time for parsing can grow dramatically.
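As a rough sketch of how you might watch that accumulation happening (the sql_text filter below is just a placeholder – adjust it to identify your statement):

select
        sql_handle, plan_name, origin, enabled, accepted, fixed, created
from
        dba_sql_plan_baselines
where
        sql_text like '%find this%'     -- placeholder: identify your statement here
order by
        created
;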
When Oracle has to optimize a statement which has an SQL plan baseline it first goes through its standard optimisation process to generate a plan – then checks to see if [correction thanks to Mohammed Houri, see comment 4 below] the plan already exists as a baseline; if it does then Oracle executes it, if not then Oracle starts working through the baselines that are enabled and accepted to see if it should use one of them; for details of this process follow the link in Mohammed’s comment below. (If the optimisation didn’t produce an existing baseline Oracle stores the new plan as a new (not enabled, not accepted) baseline – which is why the list of baselines for a statement can grow.)

From what you’ve said I do not know why Oracle would see the need to optimize the statement lots of times in a few hours – the simplest explanation is that you have lots of different (literal-string, perhaps) statements that keep pushing other stuff out of the library cache; on the other hand it may be something to do with the way you’re using PL/SQL to run the SQL – perhaps calls to dbms_sql or to execute immediate behave differently in some cases compared to embedded SQL.
I suspect the time spent relates to statistics feedback and the appearance of SQL plan directives for the objects in the statement – combined, perhaps, with the presence of histograms on various of the columns mentioned in the predicates. The directives generally tell Oracle to sample tables, and partial joins between tables, in the query, and there have been lots of complaints about the feature in this version of Oracle leading to long parse times (and the type of library cache contention you’re recording). In 12.1.0.2 this is part of “optimizer_adaptive_features”, but in 12.2 it has been isolated as “optimizer_adaptive_statistics”, with a backport available for 12.1.0.2 so you can disable it (the default is off in 12.2). See this note (especially Update 5 – with the link to Mike Dietrich’s definitive blog note on the patch) for more details.
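As a sketch: on 12.2, or on 12.1.0.2 once the backport patch has made the parameter available, disabling it is just a parameter change (check it against your own patch level first):

alter system set optimizer_adaptive_statistics = false scope = both;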
One query you could run to review the threat of directives is:
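A sketch of the sort of query I mean – assuming you have access to the dba_sql_plan_directives and dba_sql_plan_dir_objects views:

select
        spo.owner, spo.object_name, count(*)    directive_count
from
        dba_sql_plan_directives         spd,
        dba_sql_plan_dir_objects        spo
where
        spo.directive_id = spd.directive_id
group by
        spo.owner, spo.object_name
order by
        count(*) desc
;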
Obviously you could include predicates on owner and object_name to get better precision. In my case I’ve just found 370 directives associated with sys.obj$.
Comment by Jonathan Lewis — June 28, 2018 @ 8:17 am BST Jun 28,2018 |
Jonathan
I haven’t looked at the CBO-SPM interaction again for a couple of years, but your explanation doesn’t coincide with what I believe I have clearly understood about this CBO-SPM interaction, summarized here:
https://hourim.wordpress.com/?s=CBO-+interaction
In effect, when the CBO goes through its standard optimisation and comes up with a plan that exists in the SPM baseline, it will use this plan whatever its cost is. It is only when the CBO comes up with an execution plan that doesn’t exist in the SPM baseline that Oracle will start the two-phase reproduction of all accepted and enabled SPM plans. The more accepted and enabled SPM plans there are, the longer the parsing phase will be. So the threat always lies behind these two points:
1) The CBO comes up with a plan that doesn’t exist in the SPB
2) There are too many accepted and enabled SPM plans in the SPB which Oracle has to reproduce and cost before selecting the best-costed one.
Best regards
Mohamed
Comment by hourim — July 2, 2018 @ 9:27 am BST Jul 2,2018 |
Mohammed,
Thanks for the corrections – I’ve edited in part of the correction to the original comment to make sure that people don’t take it as gospel and forget to carry on reading.
Comment by Jonathan Lewis — July 2, 2018 @ 10:29 am BST Jul 2,2018 |
Hi Jonathan, if we suspect the time spent relates to the 12.1 optimizer’s adaptive features (statistics feedback and/or SPDs), we shouldn’t be having the same behavior when there is no SPB in place – Oracle still has to do hard parsing anyway, whether we have SPBs or not. The procedure, or rather the SQL statement, has been working well for months now after the SPBs were dropped. I could prevent Oracle capturing additional plans (due to the adaptive mayhem) by setting the plans in the baseline to FIXED=YES. But when the optimizer couldn’t reproduce the plan in the baselines, why do we need to have SPBs in the first place? So I didn’t spend more time on it.
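For anyone wanting to try the same thing, fixing an accepted plan is a single dbms_spm call – a sketch, where the sql_handle and plan_name below are just placeholders:

set serveroutput on

declare
        n       pls_integer;
begin
        -- mark a specific accepted plan as fixed (placeholder identifiers)
        n := dbms_spm.alter_sql_plan_baseline(
                sql_handle      => 'SQL_1234567890abcdef',        -- placeholder
                plan_name       => 'SQL_PLAN_example_plan_name',  -- placeholder
                attribute_name  => 'fixed',
                attribute_value => 'YES'
        );
        dbms_output.put_line('Plans altered: ' || n);
end;
/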
I can’t use a SQL profile in this case as the SQL has a mix of literals and binds. Btw, I don’t use the SQL tuning advisor’s OPT_ESTIMATE stuff; instead I always use the coe script (part of SQLT) to generate SQL profiles, which actually grabs the hints from the outline section of the plan. Carlos, who wrote SQLT, has recommended using SQL plan baselines instead of profiles, and the Oracle 12.2 SQL Tuning Guide says the following:
No strict relationship exists between the SQL profile and a SQL plan baseline. If a statement has multiple plans in a SQL plan baseline, then a SQL profile is useful because it enables the optimizer to choose the lowest-cost plan in the baseline.
SQL profile enables the optimizer to choose the lowest-cost???
Neither SQL profiles nor SQL plan management are truly great, at least in my experience so far, given all these limitations. Why can’t they code it in such a way that the optimizer chooses the lowest-cost plan directly from the baselines, if a baseline exists? I understand that Oracle needs all dependent objects in place to reproduce a plan, but the check could be done without hard parsing in the name of re-optimization.
Comment by dbabible — September 7, 2018 @ 2:23 am BST Sep 7,2018 |
dbabible,
Thanks for the comment. It would be nice to know what triggered the endless generation of new baselines but I sympathise completely with the need to move on once a solution (or workaround) to the important part of the problem has been found.
I haven’t been able to locate the comment in the 12.2 manuals, but for the benefit of other readers here’s a link into the 12.1 Database SQL Tuning Guide – the comment is just above the heading I’ve linked to: https://docs.oracle.com/database/121/TGSQL/tgsql_profiles.htm#TGSQL604
I hadn’t spotted this before, but it mirrors the comment I’ve been making for years that you really shouldn’t use the CoE strategy for copying what is essentially an SQL Plan Baseline into the area defined for SQL Profiles – even though an SQL Profile allows you to “force_match” to work around problems of literals. I hadn’t been able to come up with an example of why this might turn out to be a very bad idea, but the comment from Oracle suggests the obvious threat:
If Oracle has to pick the lowest cost SQL Plan Baseline then it needs to run the optimizer against all the different hint sets. But SQL Profiles are statistical corrections to the optimizer generally used to correct cardinality estimates. So Oracle should use the corrected cardinalities when trying to find the lowest cost plan – which means applying the SQL Profile to each SQL Plan Baseline. I wonder what happens when the profile contains directives like “do a full tablescan” rather than “there are only 20 distinct combinations of col1 and col2, not 2,000”.
Comment by Jonathan Lewis — September 7, 2018 @ 7:14 am BST Sep 7,2018
SQL re-optimization was the main culprit behind generating the extra plans, and since we had SQL plan baselines without the FIXED=YES option it kept adding the new plans to the baseline. I really don’t understand how it could possibly increase the parsing time when the new plans are not accepted by default – we had just one plan in the baseline marked “ACCEPTED” for this SQL. There were 40+ sessions waiting on “cursor pin S wait on X” with the blocking session doing hard parsing (in_parse = ‘Y’ and in_hard_parse = ‘Y’, and all the other sessions in the state IN_PARSE = ‘Y’). I have also noticed the blocking session doing “lob/read” operations against SQLOBJ$DATA and SQLOBJ$PLAN.
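A query along these lines (the sql_id is a placeholder) shows whether the child cursors are being marked for re-optimization by statistics feedback:

select
        sql_id, child_number, is_reoptimizable, is_resolved_adaptive_plan,
        sql_plan_baseline, executions
from
        v$sql
where
        sql_id = '&sql_id'      -- placeholder: the problem statement's sql_id
order by
        child_number
;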
Thanks
Tony
Comment by dbabible — September 7, 2018 @ 9:43 pm BST Sep 7,2018 |