Oracle Scratchpad

November 3, 2010

List partitions

Filed under: Partitioning,Tuning — Jonathan Lewis @ 7:01 pm GMT Nov 3,2010

Despite the title and content, this article is more about thinking and testing than it is about list partitions as such. It’s just based on a little thought experiment with list partitioning.

I have a list-partitioned table, with no indexes, defined as follows:

rem
rem     Script:         pt_l_elimination.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2010
rem

create table area_sales (
        area            varchar2(10)    not null,
        dated           date            not null,
        quantity        number(6),
        value           number(8,2),
        padding         char(100),
        constraint as_area_ck check (area in ('England','Ireland','Scotland','Wales'))
)
partition by list (area) (
        partition england values ('England'),
        partition scotland values ('Scotland'),
        partition ireland values ('Ireland'),
        partition wales values ('Wales')
)
;


You’ll notice that there is no default partition, the partitioning column is declared as not null and is limited by a check constraint to a very specific set of values, and that every partition is defined to hold exactly one of the legal partition key values. Given those restrictions you might like to think about which of the following queries will be able to do perfect partition elimination:


select  count(*)
from    area_sales 
where   area = 'England'
;

select  count(*)
from    area_sales 
where   area in ( 'England', 'Ireland', 'Wales')
;

select  count(*)
from    area_sales 
where   area != 'England'
;

select  count(*)
from    area_sales 
where   area not in ( 'England', 'Ireland', 'Wales')
;

select  count(*)
from    area_sales 
where   area between 'England' and 'Ireland'
;

Once you’ve thought about what’s likely to happen, you might want to create and populate the table with a few thousand rows and run some tests to see if your thoughts were correct. Here’s a simple statement to spread some data evenly through the tables:


insert into area_sales
select
        decode(mod(rownum,4),
                0,'England',
                2,'Scotland',
                1,'Ireland',
                3,'Wales'
        ),
        sysdate + 0.01 * rownum,
        rownum,
        rownum,
        rownum
from
        all_objects
where
        rownum <= 30000
;

Once you’ve done the tests, and satisfied yourself that you understand what’s going on, you might start thinking about a few variations that might make the behaviour change, such as:

    you add a default partition
    you add a default partition and remove the check constraint
    you add a default partition and remove the not null constraint
    you define partitions that hold multiple values
    you change the order of partitions so the key values don’t appear in alphabetical order
    you use a different version of Oracle

I’m not going to tell you the answers to these questions. The point I want to make is that this is the type of model, and these are the types of question you should ask yourself when you start to investigate a feature to see whether it will behave as well as you hope, and what special benefits you might get from it.

In fact, these are the questions I’m asking myself right now as I look at a system which is using list-partitioned tables to hold (lots of) sales data. I think it might be a good idea to restructure the table to disallow nulls, include a check constraint, and hold just one key value per partition – and then modify some of the code to get rid of “not in” and “!=” predicates.

The trouble is it’s a massive table with lots of indexes, the number of partitions would jump from eight to 40, and the number of times that rows would migrate due to partition key updates would increase; so the cost of restructuring would be large and there would be some fairly undesirable impact on DML that would offset the benefits of more efficient queries.

Update (June 2020)

I was prompted by a question on the Oracle developer forum to re-discover this note and decided to recreate and re-run the simple tests above. Originally I had run the tests under 11.1.0.6, and two of the queries ( “area != ‘England'”, and “area not in (‘England’,’Ireland’,’Wales’)”) had failed to do partition elimination, producing the following execution plans:


----------------------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |            |     1 |     8 |   107 |       |       |
|   1 |  SORT AGGREGATE     |            |     1 |     8 |       |       |       |
|   2 |   PARTITION LIST ALL|            | 22500 |   175K|   107 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| AREA_SALES | 22500 |   175K|   107 |     1 |     4 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("AREA"<>'England')

----------------------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |            |     1 |     8 |   107 |       |       |
|   1 |  SORT AGGREGATE     |            |     1 |     8 |       |       |       |
|   2 |   PARTITION LIST ALL|            | 12656 |    98K|   107 |     1 |     4 |
|*  3 |    TABLE ACCESS FULL| AREA_SALES | 12656 |    98K|   107 |     1 |     4 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("AREA"<>'England' AND "AREA"<>'Ireland' AND "AREA"<>'Wales')

Note, particularly, the operator “Partition list all”.

Upgrade to 12c, and the optimizer has been enhanced – quite cleverly – and in this test I’ve used rowsource execution statistics to show that partition elimination really is happening at runtime:

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |            |      1 |        |   389 (100)|       |       |      1 |00:00:00.01 |     591 |
|   1 |  SORT AGGREGATE        |            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     591 |
|   2 |   PARTITION LIST INLIST|            |      1 |  22500 |   389   (3)|   KEY |   KEY |  22500 |00:00:00.01 |     591 |
|   3 |    TABLE ACCESS FULL   | AREA_SALES |      3 |  22500 |   389   (3)|   KEY |   KEY |  22500 |00:00:00.01 |     591 |
----------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |            |      1 |        |   131 (100)|       |       |      1 |00:00:00.01 |     197 |
|   1 |  SORT AGGREGATE        |            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     197 |
|   2 |   PARTITION LIST INLIST|            |      1 |  12656 |   131   (4)|   KEY |   KEY |   7500 |00:00:00.01 |     197 |
|   3 |    TABLE ACCESS FULL   | AREA_SALES |      1 |  12656 |   131   (4)|   KEY |   KEY |   7500 |00:00:00.01 |     197 |
----------------------------------------------------------------------------------------------------------------------------

There are two points to note here – first the operation “Partition list inlist”, and in both cases that has resulted in the number of Starts for the “Table access full” being less than four; secondly the absence of any predicate information (and that isn’t a bug, or accident in copying).

While deriving the INLIST, the optimiser has managed to take advantage of the fact that all of my partitions are defined by exactly one key value, so (for example) it knows that “area != ‘England'” will eliminate all the data from the partition that holds ‘England’, so it doesn’t need to visit the ‘England’ partition therefore it doesn’t need to use the predicate on any partition.

A very similar, but slightly more complex, argument holds with my “not in ()” predicate, viz:  for each of England, Ireland, Wales, the value will eliminate all the data from the partition that holds that value, so the predicate never needs to be used – Oracle will not be visiting a partition where the predicate would discard a subset of that partition’s data.

 

13 Comments »

  1. That’s one thing that puzzles me. Oracle is doing partition *elimination*, yet the most basic form – list partitions – cannot process even simple “not” conditions to eliminate partition from scanning.

    Comment by Christo Kutrovsky — November 4, 2010 @ 4:54 am GMT Nov 4,2010 | Reply

  2. I have pinpointed the same problem many times. I work with Oracle and netezza and usually try to compare their internal structures. in Netezza we have a structure called zone map. its a map of min/max values in extents. a SPU can be taken equivalent to Partition. during execution plan creation Netezza looks into zone map and eliminate SPU(partition in oracle terms) and based on remain extents evaluate the DOP and SPU on which snippet will run. In oracle we have partition to do the same with Bitmap index and in Oracle exadata we have stored index. but stored index does not eliminate data at plan creation time but at run time. due to which some times it uses more data for parallel processing and DOP cannot be calculated as exactly as it been calculated in Netezza. I have expressed my opinion on different forums and to Oracle team but… last time i did this i lost my work…

    Comment by Amir Riaz — November 4, 2010 @ 7:36 am GMT Nov 4,2010 | Reply

  3. It’s not really able to do “Partition Elimination”. It can do “Partition Pruning” to identify that it needs 3 of 4 partitions (but doesn’t know identify the partitions in the case of “area in ( ‘England’, ‘Ireland’, ‘Wales’)”.

    Comment by Hemant K Chitale — November 8, 2010 @ 6:17 am GMT Nov 8,2010 | Reply

  4. Hemant,

    Are you making a distinction between “elimination” and “pruning” or just pointing out that I ought to be using the term “pruning” rather than “elimination” since that’s the term used virtually everywhere in the manuals ?

    There may be some variation across versions in how well Oracle can handle the IN-list, but here’s the execution plan for the (‘England’,’Ireland’,’Wales’) query from an 11.1.0.6 database:

    -------------------------------------------------------------------------------------
    | Id  | Operation              | Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |            |     1 |     8 |    64 |       |       |
    |   1 |  SORT AGGREGATE        |            |     1 |     8 |       |       |       |
    |   2 |   PARTITION LIST INLIST|            | 22500 |   175K|    64 |KEY(I) |KEY(I) |
    |   3 |    TABLE ACCESS FULL   | AREA_SALES | 22500 |   175K|    64 |KEY(I) |KEY(I) |
    -------------------------------------------------------------------------------------
    
    

    Note – there is NO predicate section. So the optimizer has deduced that the whole of the England partition is necessary and sufficient to answer the ‘England’ key, and so on for the other two values. Partition pruning has been applied with perfect precision in this case.

    Comment by Jonathan Lewis — November 8, 2010 @ 6:35 pm GMT Nov 8,2010 | Reply

  5. My comment was in reply to Christo’s : “Oracle is doing partition *elimination*, yet the most basic form – list partitions – cannot process even simple “not” conditions to eliminate partition from scanning”
    I pointed out that Oracle does Pruning. So it doesn’t do “Elimination” directly but by way of Pruning.

    There’s one reference to Elimination at http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/schemaob.htm#CNCPT88868 which goes “Both types of indexes can take advantage of partition elimination (also called partition pruning), which occurs when the optimizer speeds data access by excluding partitions from consideration” when talking of Local Indexes.

    Comment by Hemant K Chitale — November 9, 2010 @ 9:21 am GMT Nov 9,2010 | Reply

  6. […] a recent article about list partitioned tables I raised some questions about the cases where the optimizer can’t (yet) do partitioning […]

    Pingback by Local Indexes « Oracle Scratchpad — November 11, 2010 @ 10:45 pm GMT Nov 11,2010 | Reply

  7. […] the cost of using the index t1_id to access the data ? A2: As I pointed out in an earlier note on list partitioned tables, Oracle doesn’t recognise the (obvious to the human eye) option for partition pruning in this […]

    Pingback by Local Indexes – 2 « Oracle Scratchpad — November 14, 2010 @ 5:42 pm GMT Nov 14,2010 | Reply

  8. […] Lewis explains how he tests list partitions, a post which gives an excellent review on how one should go about testing new features in […]

    Pingback by Log Buffer #206, A Carnival of the Vanities for DBAs | The Pythian Blog — November 15, 2010 @ 6:27 am GMT Nov 15,2010 | Reply

  9. Sorry for the late comment
    Looking for the NOT is a common occurrence in my world. A few years back I came across (and blogged) an alternate way to construct this type of query.

    select count(*)
    from area_sales
    where area != 'England'
    ;

    could be changed to

    select count(*)
    from area_sales
    where case when area = 'England' then 1 else 0 end = 0

    Of course the case I wrote about was not really about partitioning but it does represent an alternative way to preset a query – and worthy of testing to see if it better approximates what is known of the data. It certainly is not a “one size fits all” solution – test, test, test!

    Comment by Pete Scott — November 27, 2010 @ 11:11 pm GMT Nov 27,2010 | Reply

  10. In a few idle moments I managed to find a suitable list partitioned table to test my query approach on – the case statement variant did not produce an appreciably different query plan to the simple not equals case. It appeared marginally faster, but not by anything that could be regarded as significant.

    The key thing about the test was it was just that a test – an idea (no matter how ‘off the wall’) is not dismissed out of hand but tested and compared to other approaches – it is all about the process of discovery.

    Comment by Pete Scott — November 30, 2010 @ 8:32 pm GMT Nov 30,2010 | Reply

  11. […] Lewis does a little thought experiment with list […]

    Pingback by Log Buffer #205, A Carnival of the Vanities for DBAs — February 13, 2013 @ 4:54 pm GMT Feb 13,2013 | Reply

  12. […] with partitions you can also exploit parallel query and partition elimination, which should make your users very […]

    Pingback by Best way to delete very large recordset in Oracle — October 14, 2023 @ 7:22 am BST Oct 14,2023 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.