Despite the title and content, this article is more about thinking and testing than it is about list partitions as such. It’s just based on a little thought experiment with list partitioning.
I have a list-partitioned table, with no indexes, defined as follows:
rem rem Script: pt_l_elimination.sql rem Author: Jonathan Lewis rem Dated: Oct 2010 rem create table area_sales ( area varchar2(10) not null, dated date not null, quantity number(6), value number(8,2), padding char(100), constraint as_area_ck check (area in ('England','Ireland','Scotland','Wales')) ) partition by list (area) ( partition england values ('England'), partition scotland values ('Scotland'), partition ireland values ('Ireland'), partition wales values ('Wales') ) ;
You’ll notice that there is no default partition, the partitioning column is declared as not null and is limited by a check constraint to a very specific set of values, and that every partition is defined to hold exactly one of the legal partition key values. Given those restrictions you might like to think about which of the following queries will be able to do perfect partition elimination:
select count(*) from area_sales where area = 'England' ; select count(*) from area_sales where area in ( 'England', 'Ireland', 'Wales') ; select count(*) from area_sales where area != 'England' ; select count(*) from area_sales where area not in ( 'England', 'Ireland', 'Wales') ; select count(*) from area_sales where area between 'England' and 'Ireland' ;
Once you’ve thought about what’s likely to happen, you might want to create and populate the table with a few thousand rows and run some tests to see if your thoughts were correct. Here’s a simple statement to spread some data evenly through the tables:
insert into area_sales select decode(mod(rownum,4), 0,'England', 2,'Scotland', 1,'Ireland', 3,'Wales' ), sysdate + 0.01 * rownum, rownum, rownum, rownum from all_objects where rownum <= 30000 ;
Once you’ve done the tests, and satisfied yourself that you understand what’s going on, you might start thinking about a few variations that might make the behaviour change, such as:
- you add a default partition
- you add a default partition and remove the check constraint
- you add a default partition and remove the not null constraint
- you define partitions that hold multiple values
- you change the order of partitions so the key values don’t appear in alphabetical order
- you use a different version of Oracle
I’m not going to tell you the answers to these questions. The point I want to make is that this is the type of model, and these are the types of question you should ask yourself when you start to investigate a feature to see whether it will behave as well as you hope, and what special benefits you might get from it.
In fact, these are the questions I’m asking myself right now as I look at a system which is using list-partitioned tables to hold (lots of) sales data. I think it might be a good idea to restructure the table to disallow nulls, include a check constraint, and hold just one key value per partition – and then modify some of the code to get rid of “not in” and “!=” predicates.
The trouble is it’s a massive table with lots of indexes, the number of partitions would jump from eight to 40, and the number of times that rows would migrate due to partition key updates would increase; so the cost of restructuring would be large and there would be some fairly undesirable impact on DML that would offset the benefits of more efficient queries.
Update (June 2020)
I was prompted by a question on the Oracle developer forum to re-discover this note and decided to recreate and re-run the simple tests above. Originally I had run the tests under 11.1.0.6, and two of the queries ( “area != ‘England'”, and “area not in (‘England’,’Ireland’,’Wales’)”) had failed to do partition elimination, producing the following execution plans:
---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 8 | 107 | | | | 1 | SORT AGGREGATE | | 1 | 8 | | | | | 2 | PARTITION LIST ALL| | 22500 | 175K| 107 | 1 | 4 | |* 3 | TABLE ACCESS FULL| AREA_SALES | 22500 | 175K| 107 | 1 | 4 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("AREA"<>'England') ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 8 | 107 | | | | 1 | SORT AGGREGATE | | 1 | 8 | | | | | 2 | PARTITION LIST ALL| | 12656 | 98K| 107 | 1 | 4 | |* 3 | TABLE ACCESS FULL| AREA_SALES | 12656 | 98K| 107 | 1 | 4 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("AREA"<>'England' AND "AREA"<>'Ireland' AND "AREA"<>'Wales')
Note, particularly, the operator “Partition list all”.
Upgrade to 12c, and the optimizer has been enhanced – quite cleverly – and in this test I’ve used rowsource execution statistics to show that partition elimination really is happening at runtime:
---------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 389 (100)| | | 1 |00:00:00.01 | 591 | | 1 | SORT AGGREGATE | | 1 | 1 | | | | 1 |00:00:00.01 | 591 | | 2 | PARTITION LIST INLIST| | 1 | 22500 | 389 (3)| KEY | KEY | 22500 |00:00:00.01 | 591 | | 3 | TABLE ACCESS FULL | AREA_SALES | 3 | 22500 | 389 (3)| KEY | KEY | 22500 |00:00:00.01 | 591 | ---------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 131 (100)| | | 1 |00:00:00.01 | 197 | | 1 | SORT AGGREGATE | | 1 | 1 | | | | 1 |00:00:00.01 | 197 | | 2 | PARTITION LIST INLIST| | 1 | 12656 | 131 (4)| KEY | KEY | 7500 |00:00:00.01 | 197 | | 3 | TABLE ACCESS FULL | AREA_SALES | 1 | 12656 | 131 (4)| KEY | KEY | 7500 |00:00:00.01 | 197 | ----------------------------------------------------------------------------------------------------------------------------
There are two points to note here – first the operation “Partition list inlist”, and in both cases that has resulted in the number of Starts for the “Table access full” being less than four; secondly the absence of any predicate information (and that isn’t a bug, or accident in copying).
While deriving the INLIST, the optimiser has managed to take advantage of the fact that all of my partitions are defined by exactly one key value, so (for example) it knows that “area != ‘England'” will eliminate all the data from the partition that holds ‘England’, so it doesn’t need to visit the ‘England’ partition therefore it doesn’t need to use the predicate on any partition.
A very similar, but slightly more complex, argument holds with my “not in ()” predicate, viz: for each of England, Ireland, Wales, the value will eliminate all the data from the partition that holds that value, so the predicate never needs to be used – Oracle will not be visiting a partition where the predicate would discard a subset of that partition’s data.
That’s one thing that puzzles me. Oracle is doing partition *elimination*, yet the most basic form – list partitions – cannot process even simple “not” conditions to eliminate partition from scanning.
Comment by Christo Kutrovsky — November 4, 2010 @ 4:54 am GMT Nov 4,2010 |
I have pinpointed the same problem many times. I work with Oracle and netezza and usually try to compare their internal structures. in Netezza we have a structure called zone map. its a map of min/max values in extents. a SPU can be taken equivalent to Partition. during execution plan creation Netezza looks into zone map and eliminate SPU(partition in oracle terms) and based on remain extents evaluate the DOP and SPU on which snippet will run. In oracle we have partition to do the same with Bitmap index and in Oracle exadata we have stored index. but stored index does not eliminate data at plan creation time but at run time. due to which some times it uses more data for parallel processing and DOP cannot be calculated as exactly as it been calculated in Netezza. I have expressed my opinion on different forums and to Oracle team but… last time i did this i lost my work…
Comment by Amir Riaz — November 4, 2010 @ 7:36 am GMT Nov 4,2010 |
It’s not really able to do “Partition Elimination”. It can do “Partition Pruning” to identify that it needs 3 of 4 partitions (but doesn’t know identify the partitions in the case of “area in ( ‘England’, ‘Ireland’, ‘Wales’)”.
Comment by Hemant K Chitale — November 8, 2010 @ 6:17 am GMT Nov 8,2010 |
Hemant,
Are you making a distinction between “elimination” and “pruning” or just pointing out that I ought to be using the term “pruning” rather than “elimination” since that’s the term used virtually everywhere in the manuals ?
There may be some variation across versions in how well Oracle can handle the IN-list, but here’s the execution plan for the (‘England’,’Ireland’,’Wales’) query from an 11.1.0.6 database:
Note – there is NO predicate section. So the optimizer has deduced that the whole of the England partition is necessary and sufficient to answer the ‘England’ key, and so on for the other two values. Partition pruning has been applied with perfect precision in this case.
Comment by Jonathan Lewis — November 8, 2010 @ 6:35 pm GMT Nov 8,2010 |
My comment was in reply to Christo’s : “Oracle is doing partition *elimination*, yet the most basic form – list partitions – cannot process even simple “not” conditions to eliminate partition from scanning”
I pointed out that Oracle does Pruning. So it doesn’t do “Elimination” directly but by way of Pruning.
There’s one reference to Elimination at http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#CNCPT88868 which goes “Both types of indexes can take advantage of partition elimination (also called partition pruning), which occurs when the optimizer speeds data access by excluding partitions from consideration” when talking of Local Indexes.
Comment by Hemant K Chitale — November 9, 2010 @ 9:21 am GMT Nov 9,2010 |
[…] a recent article about list partitioned tables I raised some questions about the cases where the optimizer can’t (yet) do partitioning […]
Pingback by Local Indexes « Oracle Scratchpad — November 11, 2010 @ 10:45 pm GMT Nov 11,2010 |
[…] the cost of using the index t1_id to access the data ? A2: As I pointed out in an earlier note on list partitioned tables, Oracle doesn’t recognise the (obvious to the human eye) option for partition pruning in this […]
Pingback by Local Indexes – 2 « Oracle Scratchpad — November 14, 2010 @ 5:42 pm GMT Nov 14,2010 |
[…] Lewis explains how he tests list partitions, a post which gives an excellent review on how one should go about testing new features in […]
Pingback by Log Buffer #206, A Carnival of the Vanities for DBAs | The Pythian Blog — November 15, 2010 @ 6:27 am GMT Nov 15,2010 |
Sorry for the late comment
Looking for the NOT is a common occurrence in my world. A few years back I came across (and blogged) an alternate way to construct this type of query.
select count(*)
from area_sales
where area != 'England'
;
could be changed to
select count(*)
from area_sales
where case when area = 'England' then 1 else 0 end = 0
Of course the case I wrote about was not really about partitioning but it does represent an alternative way to preset a query – and worthy of testing to see if it better approximates what is known of the data. It certainly is not a “one size fits all” solution – test, test, test!
Comment by Pete Scott — November 27, 2010 @ 11:11 pm GMT Nov 27,2010 |
In a few idle moments I managed to find a suitable list partitioned table to test my query approach on – the case statement variant did not produce an appreciably different query plan to the simple not equals case. It appeared marginally faster, but not by anything that could be regarded as significant.
The key thing about the test was it was just that a test – an idea (no matter how ‘off the wall’) is not dismissed out of hand but tested and compared to other approaches – it is all about the process of discovery.
Comment by Pete Scott — November 30, 2010 @ 8:32 pm GMT Nov 30,2010 |
Peter,
Thanks for the follow-up.
Comment by Jonathan Lewis — November 30, 2010 @ 10:50 pm GMT Nov 30,2010 |
[…] Lewis does a little thought experiment with list […]
Pingback by Log Buffer #205, A Carnival of the Vanities for DBAs — February 13, 2013 @ 4:54 pm GMT Feb 13,2013 |
[…] with partitions you can also exploit parallel query and partition elimination, which should make your users very […]
Pingback by Best way to delete very large recordset in Oracle — October 14, 2023 @ 7:22 am BST Oct 14,2023 |