Oracle Scratchpad

March 23, 2010

Fake Histograms

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:49 pm BST Mar 23,2010

I got an email a few days ago asking me how to create a “height-balanced” histogram using the dbms_stats.set_column_stats() procedure.

I posted an example of creating a frequency histogram a few months ago – and it doesn’t take much to turn this into an example for the height-balanced type; the key change apppears in three critical lines.

(more…)

March 17, 2010

Partition Stats

Filed under: Oracle,Partitioning,Statistics — Jonathan Lewis @ 7:44 am BST Mar 17,2010

It’s possible to spend ages talking about the best ways of collecting, or creating, statistics on partitioned tables.

The possible strategies for maintaining partitioned tables, (exchange partition, split partition, drop partition etc.) the types of partitioning available, and the way that the optimizer plays with the stats as you do so, have kept changing over the years, and I’ve got a large set of examples designed to test what happens to the stats as you do things to the table – but it’s impossible to keep it up to date.

Doug Burns is writing a series of articles about the trials, tribulations, and successes about partitioned tables and statistics. The series was well worth reading and will give you an insight into the problems you may have to address, so I’ve produced a catalogue to make it easy to visit the individual chapters in order.  Make sure you also read the comments and related links.

  • Part 1 – In which we see a simple example and do a default stats collection
  • Part 2 – In which we consider Global Stats
  • Part 3 – In which subpartitions and aggregation cause problems
  • Part 4 – In which our hero fights his way through stats aggregation woes
  • Part 5 - In which we encounter a partition exchange
  • Part 6a – In which we start to use dbms_stats.copy_table_stats()
  • Part 6b – In which we see how reputable individuals handle their mistakes
  • Part 6c – In which we hear about 10.2.0.5 and lots of bugs
  • Part 6d – In which we revisit earlier errors and discuss the benefit of discussion
  • Part 6e – In which we revisit earlier problem again and talk about a bug.
  • Part 7 – Not in the original series, but an interesting (slow) experience in 11g

I thought I’d collate a few other items on partition stats and optimizer behaviour – mainly from Randolf Geist’s blog:

And one from Kerry Osbourne – which lists a new granularity option, and a patch for 10.2.0.4

  • Feb 2009: Maintaining statistics on a large partitioned table.  (See also Metalink Doc ID: 6526370.8)

A couple (as pdf files) from David Kurtz, with a particular view to optimising Peoplesoft.

And an investigation into an oddity with the optimizer when using partitioned indexes

  • Feb 2011: Jokes of the CBO with local indexes (10.2.0.4, 11.1.0.7)

February 20, 2010

Time on Stats

Filed under: Infrastructure,Statistics — Jonathan Lewis @ 6:35 pm BST Feb 20,2010

In a blog that’s now three years old I pointed out the issue of the automatic stats collection job and the way it could affect performance by changing object-level statistics behind your back. There’s another reason it could have an annoying side effect, though – it actually takes time to collect statistics.

The automatic job is, by default, allowed to run from 10:00 pm to 6:00 am on weekdays and for even longer over the weekend – and the time it needs to complete can vary dramatically from day to day. So it’s nice to have a little script to cross-check when it started and how long it ran. All you have to do is query the view dba_scheduler_job_run_details:

(more…)

February 4, 2010

SQL Server

Filed under: CBO,Execution plans,Hints,Infrastructure,Oracle,SQL Server,Statistics,Wishlist — Jonathan Lewis @ 7:07 pm BST Feb 4,2010

A few days ago I did a presentation on SQL Server. This probably sounds a little strange given my status as an Oracle specialist – but the nice people at Microsoft asked me if I would contribute to one of their seminars so I downloaded and installed the 180 day free trial copy of the Enterprise version, then downloaded the “Books Online” manuals and started to play.

It was an interesting experience – and I think the audience (and organisers) found my presentation useful. The title was “What the Enterprise needs in an RDBMS” – and that’s something I do know about – and the presentation was about whether or not you could find everything you needed in SQL Server 2008, where you’d have to look in the manuals, and supplementary questions you’d have to ask.

(more…)

January 6, 2010

Copy Stats

Filed under: Statistics — Jonathan Lewis @ 2:24 pm BST Jan 6,2010

In a recent posting on OTN, someone was having trouble copying stats from one index to another using the import_index_stats and export_index_stats procedures from package dbms_stats modifying the contents of their “stat table” between the export and import.

(more…)

June 2, 2009

Fixed Stats

Filed under: Execution plans,Performance,Statistics,Troubleshooting,Tuning — Jonathan Lewis @ 7:19 pm BST Jun 2,2009

There was a question about the notorious dba_extents view on the comp.databases.oracle.server newsgroup a little while ago.

This is a view that has been popularly abused over the years despite warnings such as a note of mine in 2006 and even back as far as 2001 this one from Connor McDonald.

There are many reasons why this is a nasty view – and the number of reasons has grown as Oracle has evolved – but David FitzJarell has highlighted an important point in one of his blog postings that was worth a mention. So here it is.

May 28, 2009

Frequency Histograms

Filed under: CBO,Histograms,Statistics,Tuning — Jonathan Lewis @ 7:34 pm BST May 28,2009

In a recent article on changes to the calculation of cardinality with frequency histograms I made a comment about writing programs to create fake but realistic and complete frequency histograms to work around the problems of dramatic changes in execution plan that can occur if you let Oracle create the histograms by gathering stats.

As a follow-up to that comment, here’s a little script to demonstrate the method (for a numeric column), and a few points to remember when using this method.

(more…)

May 11, 2009

Cardinality feedback

Filed under: CBO,Execution plans,Statistics,Troubleshooting — Jonathan Lewis @ 8:09 pm BST May 11,2009

One of my faviourite presentations in Collaborate 09 was a quick tip (30 minutes) on Tuning by Cardinality Feedback – i.e. comparing the optimizer’s predictions of cardinality with the actual rowcounts returned when you run the query.

The strategy is one that I first saw demonstrated in a presentation by Wolfgang Breitling a few years ago at one of the Hotsos seminars – but this latest example demonstrated an important variation on the theme in a short, precise, lesson.

(more…)

May 6, 2009

Philosophy – 1

Filed under: Infrastructure,Philosophy,Statistics,Troubleshooting — Jonathan Lewis @ 6:15 pm BST May 6,2009

There are some ideas about Oracle that are so fundamental that they should be lodged in your brain and floating behind your eyes whenever you want to investigate SQL performance problems. Here’s one of them:

Histograms and bind variables exist for diametrically opposed reasons – they won’t work well together without help.

You use bind variables because you want everyone to share the same child cursor for an SQL statement because that statement is going to be used very frequently, everyone is going to do the same (small) amount of work with it, the same execution plan should be ideal for everyone, and you don’t want to re-optimise it every time you use it because the overheads for optimisation would probably be greater than the resources need to run the statement. Typically we want to make great (but not exclusive) use of bind variables in OLTP systems.

You create histograms because statements that are virtually identical do enormously different amounts of work, need different execution plans, and the work done in optimising is tiny compared to the work done by the statements, and getting the wrong execution plan would lead to a huge waste of resources. Typically you need to generate histograms in data warehouse or decision support systems where the queries can be brutal and expensive.

Spot the contradiction – one technology is supposed to give you one plan that everyone uses; the other technology is supposed to give each individual the plan that’s best for them.

Remember that and you will remember to be cautious about sticking histograms into OLTP systems, and won’t be tempted to turn absolutely every literal into a bind variable.

Footnote: It seems that this article should have been called “Philosophy – 2″ because I’ve just rediscovered a short note called “Philosophy”

[The Philosophy Series]

May 4, 2009

IOUG Day 1

Filed under: CBO,Execution plans,Performance,Statistics,Troubleshooting,Tuning — Jonathan Lewis @ 10:00 pm BST May 4,2009

I attended four sessions today (one of them was a 2-hour session), and the highlight of the day for me was a lunchtime “quick tip” from Michelle Deng of Sanofi Aventis who gave a 30-minute talk with the title “Cardinality Analysis – a life saver for DBAs and Developers”.

(more…)

April 30, 2009

Analyze this!

Filed under: CBO,Execution plans,Statistics,Troubleshooting — Jonathan Lewis @ 8:29 pm BST Apr 30,2009

There’s a question on the Database General forum on OTN containing the following observation:

I have some chained rows in some of my tables. I have exported, truncated, import back the data in the table. Then collect the stats. But I have the same Chained_count. Nothing has changed.

If you export, truncate the table, then re-import the data, you expect to get rid of any migrated rows – but if a row is too big to fit in a single block, or has more than 255 columns and a non-null value after the 255th, (either of which will be reported as a chained row) it’s not going to change after exporting and importing; so the first question to ask is: “What’s the definition of the table ?” optionally followed by: “Can you run some SQL to check for very long rows.”

(more…)

April 23, 2009

Histogram change

Filed under: CBO,Histograms,Statistics,Troubleshooting — Jonathan Lewis @ 7:17 pm BST Apr 23,2009

When you upgrade from 10.2.0.3 to 10.2.0.4 or 11g, watch out for SQL that depends on the existence of frequency histograms.

In 10.2.0.3 (and earlier versions of Oracle) if you have a frequency histogram on a column, and then use a predicate of the form:

where colX = {value not found in histogram}

then the optimizer would calculate a cardinality of one for this predicate (having recorded a density of 1/(2 * num_rows) in the data dictionary (see comments 21 and 22 of this note on dbms_xplan).

(more…)

April 16, 2009

Virtual Columns

Filed under: Performance,Statistics,Tuning — Jonathan Lewis @ 7:25 pm BST Apr 16,2009

A recent post on comp.databases.oracle.server asked:

I have read various articles on virtual columns? I still do not understand their advantages other than they save some disk space, one can put logic of virtual columns in a trigger which will save information in a real column. Real column will need some additional disk space, but it will save some cpu time when one is doing selects on “virtual” columns.

(more…)

March 20, 2009

Hotsos 2009

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 7:43 am BST Mar 20,2009

So much for my belief that I’d have some quiet time for catching up with a little internet gossip while attending Hotsos 2009.

The days were busy and I crashed out at about 7:00 pm (local time) each evening and was asleep by 7:30 pm and up by 2:00 a.m – then I spent the morning (until breakfast) writing up notes I had taken the day before. So hardly a moment for blogging or answering questions on OTN.

Best topic of Hotsos (for me, at any rate – there were lots of very good presentations): Amit Poddar’s presentation on the new “approximate NDV” mechanism that Oracle 11g uses to do a one pass, accurate, estimate of the number of distinct values in a column (no more massive sorts for “count(distinct)” and how it manages to keep a “synopsis” for each partition of a partitioned table so that there is no need to sample the entire table when you need to recalculate table-level statistics.

Amit Poddar’s website is no longer online, but he has given me permission to publish the material, so here are the links to a pair of pdf files: the presentation (1.6MB), and the white paper (3.45MB).

The mathematics is brilliant, and I’m going to have to review my previous strategy for stats collection as a consequence.

The upside to starting the day at 2:00 am in Dallas, by the way – no jet lag when I got home !

Update Dec 2010:

Just in from an OTN thread and Greg Rahn; there is a bug relating to synopses and approximate NDV that shows up with partitioned tables and incremental stats – leading to very long stats collection time. The bug number is 8310339, and Greg Rahn recommends applying the fix for bug 8719831.

Update Jan 2012:

A recent OTN note highlights a problem with the Approximate NDV code when collecting statistics on external tables. MOS notes 1290722.1 or 1305127.1 are  relevant.

 

December 28, 2008

timed_os_stats

Filed under: Infrastructure,Statistics — Jonathan Lewis @ 9:20 pm BST Dec 28,2008

There is a startup parameter called timed_os_statistics, carrying a warning in the manuals that said:

Note: Gathering OS statistics is very expensive. Oracle Corporation recommends that you set this parameter in an ALTER SYSTEM statement rather than in the initialization parameter file, and that you reset the value to zero as soon as the needed statistics have been gathered.

(more…)

« Previous PageNext Page »

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,266 other followers