It’s possible to spend ages talking about the best ways of collecting, or creating, statistics on partitioned tables.
The possible strategies for maintaining partitioned tables, (exchange partition, split partition, drop partition etc.) the types of partitioning available, and the way that the optimizer plays with the stats as you do so, have kept changing over the years, and I’ve got a large set of examples designed to test what happens to the stats as you do things to the table – but it’s impossible to keep it up to date.
Doug Burns is writing a series of articles about the trials, tribulations, and successes about partitioned tables and statistics. The series was well worth reading and will give you an insight into the problems you may have to address, so I’ve produced a catalogue to make it easy to visit the individual chapters in order. Make sure you also read the comments and related links.
- Part 1 – In which we see a simple example and do a default stats collection
- Part 2 – In which we consider Global Stats
- Part 3 – In which subpartitions and aggregation cause problems
- Part 4 – In which our hero fights his way through stats aggregation woes
- Part 5 – In which we encounter a partition exchange
- Part 6a – In which we start to use dbms_stats.copy_table_stats()
- Part 6b – In which we see how reputable individuals handle their mistakes
- Part 6c – In which we hear about 10.2.0.5 and lots of bugs
- Part 6d – In which we revisit earlier errors and discuss the benefit of discussion
- Part 6e – In which we revisit earlier problem again and talk about a bug.
- Part 7 – Not in the original series, but an interesting (slow) experience in 11g
I thought I’d collate a few other items on partition stats and optimizer behaviour – mainly from Randolf Geist’s blog:
And one from Kerry Osbourne – which lists a new granularity option, and a patch for 10.2.0.4
- Feb 2009: Maintaining statistics on a large partitioned table. (See also Metalink Doc ID: 6526370.8)
A couple (as pdf files) from David Kurtz, with a particular view to optimising Peoplesoft.
And an investigation into an oddity with the optimizer when using partitioned indexes
- Feb 2011: Jokes of the CBO with local indexes (10.2.0.4, 18.104.22.168)
In an earlier article I gave a description of how splitting a single date ranges into a pair of date ranges with an OR would change the arithmetic and so run the risk (or introduce the benefit) of changing the execution plan.
At the time I made a couple of comments about other details that could be demonstrated by the same query – but postponed saying anything about them. This follow-up article addresses the omission.
From time to time I get asked if it’s possible to index a partitioned table so that recent partitions have different (local) indexes from older partitions. The answer is “not really, but there’s a couple of dirty tricks which aren’t very nice and aren’t very stable“. (You can always play around – dangerously – with unusable indexes or function-based indexes).
With Oracle 11.2 there’s a new optimizer feature called “table expansion” which I’m guessing has been created to address this issue. Christian Antognini introduces it in this posting – which is actually starts by talking about zero-sized segments and unusable indexes.
In one of those little coincidences that seem to happen so often a question came up on the comp.databases.oracle. server news group that has an answer in a blog note I’d written just a few of days earlier . The question was simply asking for a way of counting the number of rows in each partition of a partitioned table.
A few weeks ago I wrote a note demonstrating the way in which Oracle’s strategy for hash partitioning is engineered to give an even data distribution when the number of partitions is a power of two. In one of the comments, Christo Kutrovsky pre-empted my planned follow-up by mentioning the hashing function ora_hash() that appeared in 10g.
I made a throwaway comment in a recent posting about using powers of two for the number of partitions when using hash partitioning. The article in question was talking about globally partitioned indexes, but the “power of 2” principle was first associated with tables.
Here’s a simple demonstration of hash partitioning in action demonstrating why Oracle adopted this “power of 2” rule. We start by creating a table that doesn’t obey the rule – with six partitions – and collect stats on it to see how many rows go into each partition:
I wrote a short reply this morning to a question from someone on the OTN forum who wanted a little help with indexing strategies after partitioning one of his larger tables.
The comments I made there capture the key points you need to consider when comparing the costs and benefits of locally partitioned indexes with those of global, or globally partitioned indexes – so I thought I’d post a pointer to the thread.
Experiences like this one [Ed: Nov 2008 – the blog has become private since I wrote this note]are always worth reading about to remind yourself what you can do with the dbms_stats package when it’s really necessary.
And while I’m pointing to other URLs, here’s another one worth knowing about – event “Cursor: pin S wait on X”. It’s not surprising to see this wait event occasionally in a busy 10g system, but if you’re losing a significant amount of time, it could be a bug.
No matter how simple a topic you pick, a few minutes thought invariably allow you to conjure up some new anomalies that could appear in the right (or possibly wrong) circumstances.
Yesterday I made a few comments about hash partitioning and performance. Today I suddenly realised that global indexes on hash-partitioned tables could exhibit an unfortunate behaviour pattern that would make them pretty useless – unless hacked or hinted. Consider the following table: (more…)
A question recently appeared on an article I wrote about partitioning a few months ago:
We are planning to create 8 HASH partitions. Looking only at PERFORMANCE would be there be any improvements, if we go for 16 or 32 partitions (maintainance and availability is not a problem in our case). There are only 2 indexes on our 350M table – one is LOCAL, another non-partitioned index.
Here’s a little example of how technology catches you out.
You have a table of 1,000,000 blocks in your system and from time to time it is necessary to run a tablescan through it. It’s not nice, but it only happens once a day, and it’s not worth building an index to create an alternative access path. (more…)
A little while ago, I received the following question by email:
What is the optimum level of partitioning in Oracle 10g, as in what is an efficient number of partitions that you can have in a single table? What are the advantages and disadvantages of having more than 3,000 partitions in a table?
These are actually questions I addressed seven years ago in my first book. The answers you might come up with for any specific occasion may vary, but the analysis does not change.
Looking back at some of my previous posts I was reminded how easy it is to overlook one important feature when trying to comment on another. In particular, in this case, a short posting on indexed access paths omitted any mention of parallel execution until a comment from David Aldridge jogged my memory. So here’s an important thought about parallel execution.
Here’s a dirty trick for those of you who are having problems with “partition0-wise” referential integrity on partitioned tables.
If you’ve created a pair of partitioned tables, parent and child say, with referential integrity declared between them, then you’ve probably discovered that you have to set the foreign key constraint on the child table to novalidate if you want to use “partition exchange loading” (as Oracle now calls it) to get data into the child table.
If you’ve got to the point where you decide to drop a pair of old partitions, you may have tried the following:
Posting from the UKOUG (UK Oracle User Group) conference. A couple of useful details from Tom Kyte’s technical keynote on “Things which might be in 11g but we’re not making any promises and you can’t hold us to it”.
The optimizer will be extended to allow us to collect some statistical information about correlated columns. This should help the optimizer to deal with combining predicates like: “Orders made in the last two weeks” with “Orders that have not yet been delivered” – if they’re recent orders, they’re more likely to be undelivered. I’m looking forward to seeing how far the Optimizer team has got with handling this rather difficult problem.
Partition handling: one of the current irritations for partitioning is that you have to disable referential integrity between partitioned tables if you want to drop old partitions. (Drop child partition Jan2001, followed by an attempt to drop the “obviously matching” parent partition Jan2001 currently results in Oracle error “ORA-02266: unique/primary keys in table referenced by enabled foreign keys”). 11g will give us the ability to declare that the partitioning of the child table is dependent on the partitioning of the parent table, and therefore guaranteed to be in-synch with the parent table. Apart from handling the drop partition problem, this should also help to avoid accidents that manage to disable partition-wise joins.