There are times when an upgrade makes a big difference to performance because an enhancement to the optimizer code path changes the costing of a plan that was always possible, but badly costed. I noticed an example of this while testing the code in the email I mentioned in last month’s posting on the “Incremental Sort” that Postgres 13 can do. Here’s a model to create some data and demonstrate the principle – the code is a modified (reduced) version of the code published by Phil Florent describing the feature.
rem
rem Script: fetch_first_postgres.sql
rem author: Phil Florent
rem Dated: 6th Nov 2020
rem
rem Last tested
rem 19.3.0.0 Uses index descending unhinted at low cost
rem 18.3.0.0 Used index desc efficiently if hinted, but high cost
rem 12.2.0.1 Used index desc efficiently if hinted, but high cost
rem
create table lancers(dtl timestamp, idg integer not null, perf integer);
insert into lancers(dtl, idg, perf)
with serie(i) as (
select 25e4 from dual
UNION ALL
select i - 1 from serie where i > 1
)
select
current_timestamp - (i / 1440),
trunc(dbms_random.value * 1e5 + 1),
case
when dbms_random.value <= 0.001 then 50000 + trunc(dbms_random.value * 50000 + 1)
else trunc(dbms_random.value * 50000 + 1)
end
from serie
/
execute dbms_stats.gather_table_stats(user,'lancers',method_opt=>'for all columns size 1')
create index perf_i1 on lancers(perf, dtl);
alter table lancers modify (perf not null, dtl not null);
This is the basic statement I want to execute – but in some versions of Oracle it will have to be hinted to produce the execution plan I want to see.
select
idg, perf
from
lancers
order by
perf desc
fetch first 5 rows only
/
If you check the order by clause and the definition of the index perf_i1 you’ll see that Oracle could (in principle) walk the index in descending order, stopping after just 5 rows, to produce the result.
But here are the execution plans from 19.3.0.0, 18.3.0.0, and 12.2.0.1, with their plans pulled from memory and showing the rowsource execution statistics (hinted by gather_plan_statistics) to show you what happens – starting from the newest first:
19.3.0.0: (unhinted)
--------------------
SQL_ID 8nmavy42tzrhb, child number 0
-------------------------------------
select /*+ gather_plan_statistics -- index_desc(lancers(perf,
dtl)) */ idg, perf from lancers order by perf desc fetch first 5
rows only
Plan hash value: 843745288
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 8 (100)| 5 |00:00:00.01 | 9 |
|* 1 | VIEW | | 1 | 5 | 8 (0)| 5 |00:00:00.01 | 9 |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 5 | 8 (0)| 5 |00:00:00.01 | 9 |
| 3 | TABLE ACCESS BY INDEX ROWID| LANCERS | 1 | 250K| 8 (0)| 5 |00:00:00.01 | 9 |
| 4 | INDEX FULL SCAN DESCENDING| PERF_I1 | 1 | 5 | 3 (0)| 5 |00:00:00.01 | 4 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("PERF") DESC )<=5)
You can see an index_desc() hint in the output, but it has been commented out. The key feature to note is that the optimizer has found the path I was hoping to see, and it’s a low-cost path, although there is one oddity in the plan – the E-rows (cardinality estimate) for the table access doesn’t allow for the stopkey and, since there are no predicates in the query, reports the 250K rows that exist in the table.
For 18.3.0.0 I had to include the hint, and you’ll see why:
18.3.0.0 (hinted with index_desc)
---------------------------------
SQL_ID fgxvcaz3sab4q, child number 0
-------------------------------------
select /*+ gather_plan_statistics index_desc(lancers(perf, dtl))
*/ idg, perf from lancers order by perf desc fetch first 5 rows
only
Plan hash value: 843745288
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 250K(100)| 5 |00:00:00.01 | 10 |
|* 1 | VIEW | | 1 | 5 | 250K (1)| 5 |00:00:00.01 | 10 |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 250K| 250K (1)| 5 |00:00:00.01 | 10 |
| 3 | TABLE ACCESS BY INDEX ROWID| LANCERS | 1 | 250K| 250K (1)| 6 |00:00:00.01 | 10 |
| 4 | INDEX FULL SCAN DESCENDING| PERF_I1 | 1 | 250K| 854 (3)| 6 |00:00:00.01 | 4 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("PERF") DESC )<=5)
Again we see the plan is possible, but the optimizer’s cardinality estimate for the hinted index scan is 250K rows – the full size of the index, and it has allowed for that in the cost of the query. So the cost of this plan is high and in the absence of the hint the optimizer would have used a full tablescan with sort.
Finally we get down to 12.2.0.1 – and I’ve shown the hinted and unhinted plans.
12.2.0.1 (hinted index_desc)
-----------------------------
SQL_ID fgxvcaz3sab4q, child number 0
-------------------------------------
select /*+ gather_plan_statistics index_desc(lancers(perf, dtl))
*/ idg, perf from lancers order by perf desc fetch first 5 rows
only
Plan hash value: 843745288
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 250K(100)| 5 |00:00:00.01 | 10 |
|* 1 | VIEW | | 1 | 5 | 250K (1)| 5 |00:00:00.01 | 10 |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 250K| 250K (1)| 5 |00:00:00.01 | 10 |
| 3 | TABLE ACCESS BY INDEX ROWID| LANCERS | 1 | 250K| 250K (1)| 6 |00:00:00.01 | 10 |
| 4 | INDEX FULL SCAN DESCENDING| PERF_I1 | 1 | 250K| 854 (3)| 6 |00:00:00.01 | 4 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("PERF") DESC )<=5)
12.2.0.1 Unhinted
------------------
SQL_ID 8nmavy42tzrhb, child number 0
-------------------------------------
select /*+ gather_plan_statistics -- index_desc(lancers(perf,
dtl)) */ idg, perf from lancers order by perf desc fetch first 5
rows only
Plan hash value: 1374242431
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1102 (100)| 5 |00:00:00.24 | 822 | | | |
|* 1 | VIEW | | 1 | 5 | 1102 (10)| 5 |00:00:00.24 | 822 | | | |
|* 2 | WINDOW SORT PUSHED RANK| | 1 | 250K| 1102 (10)| 5 |00:00:00.24 | 822 | 2048 | 2048 | 2048 (0)|
| 3 | TABLE ACCESS FULL | LANCERS | 1 | 250K| 132 (16)| 250K|00:00:00.13 | 822 | | | |
--------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("PERF") DESC )<=5)
As you can see, 12.2.0.1 and 18.3.0.0 behave exactly the same way when hinted – the path is acceptable, but the cost is high. Consequently when I remove the hint the optimizer switches to using a full tablescan with sort because it’s cost is lower (thanks, in part, to the pattern in the data) than the indexed access path.
Summary
Two thoughts to take away from this note.
- First, there were two possible execution plans for the same query and the optimizer in versions below 19c was picking the one that was clearly a bad idea. The presence of alternatives, though, means that the patterns in the data, the index definition and statistics (especially the clustering_factor) the number of rows to fetch, and various other optimizer settings may mean that you find yourself in the unlucky position that the optimizer’s arithmetic is on the boundary between the two plans and it switches randomly between them from day to day.
- Secondly, when you upgrade to 19c the optimizer seems to be more likely to pick the indexed access path for a query like this – and that will probably be a good thing, but in a few cases it might turn out to be a bad thing.
Jonathan – thanks for the nice demo.
However we dont need index_desc hint, even a first_rows hint would do that. and the optimizer team have gone through that in detail at this link (https://blogs.oracle.com/optimizer/fetch-first-rows-just-got-faster)
Comment by Rajeshwaran Jeyabal — December 23, 2020 @ 2:53 pm GMT Dec 23,2020 |
Rajesh,
Thanks for the comment.
First_rows(n) is in some ways a nicer hint, but it’s nice to know (following your link) that there’s a patch available for those still on 12.2.0.1 (or 12.1.0.2) that means there no need to hint at all.
WordPress messed up your cut-n-paste, so I’ve cleaned it up as much as possible; I think I’ve got the important bits right.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — December 23, 2020 @ 4:57 pm GMT Dec 23,2020 |
Hi,
We rarely use outlines (baseline with a single plan in 18c+) and we never use direct hints but they are not a problem for double compatibility. They are considered as comments.
Perhaps it can be a good idea to put hints as comments without the “+” ? It would be a real hint, an indication given by the developer to the DBA and perhaps to the autonomous things in the future. “I think it could be good in most cases but I don’t want to make that mandatory with a cost-based optimizer”
Oracle is capable to analyze activity a posteriori to improve plans but we had major problems with that in 11.2.0.4. It was called “cardinality feedback”. 1st executions was fast, next executions were slow. Strangely CBO was not able to reuse the first good plan by itself. We deactivated the feature and it was OK but I guess it works better in EE since its machine learning capabilities are more sophisticated than a simple cardinality feedback.
For example, in this case, did you notice if Oracle 12.2.0.1 EE is automatically able to correct the plan even if the patch is not applied ?
Best regards,
Phil
Comment by Phil Florent — December 24, 2020 @ 12:10 pm GMT Dec 24,2020 |