Here’s a silly little detail about execution plans on (interval) partitioned tables that I hadn’t noticed until it showed up on this thread on a public Oracle forum: it’s an execution plan that claims that Oracle will be visiting a partition that clearly won’t be holding the data requested.
Here’s the starting section of a demonstration script – mostly by Solomon Yakobson with minor tweaks and additions from me:
rem
rem Script: non_existent_partition.sql
rem Author: Solomon Yakobson / Jonathan Lewis
rem Dated: Mar 2024
rem
rem Last tested
rem 19.11.0.0
rem
create table invoices_partitioned(
invoice_no number not null,
invoice_date date not null,
comments varchar2(500)
)
partition by range (invoice_date)
interval (interval '3' month)
(
partition invoices_past values less than (date '2023-01-01')
);
insert into invoices_partitioned
select level,
date '2023-01-01' + numtoyminterval(3 * (level - 1),'month'),
null
from dual
connect by level <= 6
/
insert into invoices_partitioned select * from invoices_partitioned;
/
/
/
/
/
/
/
/
/
/
/
/
/
/
/
/
commit
/
execute dbms_stats.gather_table_stats(user,'invoices_partitioned')
set linesize 156
column high_value format a80
select partition_position, num_rows,
partition_name,
high_value
from user_tab_partitions
where table_name = 'INVOICES_PARTITIONED'
order by partition_position
/
alter table invoices_partitioned drop partition for (date'2023-09-01');
purge recyclebin;
select partition_position, num_rows,
partition_name,
high_value
from user_tab_partitions
where table_name = 'INVOICES_PARTITIONED'
order by partition_position
/
The script creates an interval partitioned table, with an interval of 3 months, then inserts 131,072 rows per partition (the strange re-execution of “insert into x select from x” was my lazy way of increasing the volume of data from the original one row per partition without having to think too carefully.
After creating the data we report the partition names and high values in order, then drop (and purge) the partition that should hold the value 1st Sept 2023 (which will be the partition with the high_value of 1st Oct 2023) and report the partition names and high values again so that you can see the “gap” in the high values and the adjustment to the partition_position values. Here are the “before” and “after” outputs:
PARTITION_POSITION NUM_ROWS PARTITION_NAME HIGH_VALUE
------------------ ---------- ---------------------- --------------------------------------------------------------------------------
1 0 INVOICES_PAST TO_DATE(' 2023-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
2 131072 SYS_P39375 TO_DATE(' 2023-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
3 131072 SYS_P39376 TO_DATE(' 2023-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
4 131072 SYS_P39377 TO_DATE(' 2023-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
5 131072 SYS_P39378 TO_DATE(' 2024-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
6 131072 SYS_P39379 TO_DATE(' 2024-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
7 131072 SYS_P39380 TO_DATE(' 2024-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
7 rows selected.
PARTITION_POSITION NUM_ROWS PARTITION_NAME HIGH_VALUE
------------------ ---------- ---------------------- --------------------------------------------------------------------------------
1 0 INVOICES_PAST TO_DATE(' 2023-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
2 131072 SYS_P39375 TO_DATE(' 2023-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
3 131072 SYS_P39376 TO_DATE(' 2023-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
4 131072 SYS_P39378 TO_DATE(' 2024-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
5 131072 SYS_P39379 TO_DATE(' 2024-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
6 131072 SYS_P39380 TO_DATE(' 2024-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
6 rows selected.
Now we check the execution plan for a query that would have accessed the partition we’ve just dropped:
explain plan for
select *
from invoices_partitioned
where invoice_date = date '2023-09-01';
select *
from dbms_xplan.display();
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
Plan hash value: 1148008570
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 109K| 1173K| 104 (1)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 109K| 1173K| 104 (1)| 00:00:01 | 4 | 4 |
|* 2 | TABLE ACCESS FULL | INVOICES_PARTITIONED | 109K| 1173K| 104 (1)| 00:00:01 | 4 | 4 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("INVOICE_DATE"=TO_DATE(' 2023-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
The execution plans says it’s going to visit partition number 4 (pstart/pstop) – which we know definitely cannot be holding any relevant data. If this were an ordinary range-partitioned table – as opposed to interval partitioned – it would be the correct partition for 1st Sept 2024, of course, but it isn’t, so it feels like the pstart/pstop ought to say something like “non-existent” and all the numeric estimates should be zero.
A quick trick for making an interval partition appear without inserting data into it is to issue a “lock table … partition for () …” statement (See footnote to this blog note). I did wonder if the attempt to explain a plan that needed a non-existent partition had actually had the same effect of making Oracle create the partition, so I ran the query against user_tab_partitions again just to check that this hadn’t happend.
So what’s going to happen at run-time: is this an example of “explain plan” telling us a story that’s not going to match what shows up in v$sql_plan (dbms_xplan.display_cursor). Let’s run the query (with rowsource execution stats enabled) and find out:
set serveroutput off
alter session set statistics_level = all;
alter session set "_rowsource_execution_statistics"=true;
select *
from invoices_partitioned
where invoice_date = date '2023-09-01';
select *
from table(dbms_xplan.display_cursor(format=>'allstats last partition'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID d42kw12htubhn, child number 0
-------------------------------------
select * from invoices_partitioned where invoice_date = date
'2023-09-01'
Plan hash value: 1148008570
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Pstart| Pstop | A-Rows | A-Time |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | | 0 |00:00:00.01 |
| 1 | PARTITION RANGE SINGLE| | 1 | 109K| 4 | 4 | 0 |00:00:00.01 |
|* 2 | TABLE ACCESS FULL | INVOICES_PARTITIONED | 0 | 109K| 4 | 4 | 0 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("INVOICE_DATE"=TO_DATE(' 2023-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
It’s the same plan with the same “wrong” partition identified, and the same estimate for rows returned – but the access never actually happened: Starts = 0 on the table access full.
My hypothesis about this misleading reporting is that Oracle knows from the table definition everything about every partition that might eventually exist – the high_value for the “anchor” partition is known and the interval is known so the appropriate partition number for any partition key value can be derived [ed: see comment #4 for details]. Then, at some point, a disconnect appears between the theoretical partition position and the set of physically instantiated partitions, so the optimizer gets the message “theoretically it’s in the 4th partition” and collects the stats from “partition_position = 4” to do the arithmetic and produce the plan. [ed: a later thought – the 109K estimate in this particularly simple example is based on {number of rows in table}/ {number of distinct values in table}]
Fortunately there’s some metadata somewhere that means the run-time engine doesn’t try to access the wrong partition, so this little glitch doesn’t really matter for this simple query – beyond its ability to cause a little confusion.
It’s possible, though, that this type of behaviour leaves the optimizer with another reason for getting the arithmetic wrong and picking the wrong path if you have a number of “missing” partitions in an interval partitioned table that you’re querying with a range-based predicate that crosses several (non-existent) partitions. So treat this as a warning/clue if you recognise that pattern in any of your partitioned table.