Oracle Scratchpad

May 1, 2014

Delete Histogram

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 1:27 pm GMT May 1,2014

Here’s a note that I drafted in November 2010, then didn’t publish. I found it earlier on this morning while looking for another note I’d written about histograms so, even though it may not be something that people need so much these days, I thought: better late than never.

I’ve pointed out in the past that I’m not keen on seeing lots of histograms on a system and tend to delete them if I think they are not needed. Here’s an example of the type of code I use to delete a histogram.

declare

	srec			dbms_stats.statrec;
	m_distcnt		number;
	m_density		number;
	m_nullcnt		number;
	m_avgclen		number;

	n_array                 dbms_stats.numarray;

begin

	dbms_stats.get_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'n1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	); 

	srec.bkvals := null;
	srec.novals :=	dbms_stats.numarray(
				utl_raw.cast_to_number(srec.minval),
				utl_raw.cast_to_number(srec.maxval)
			);
	srec.epc := 2;
	dbms_stats.prepare_column_values(srec, srec.novals);

	m_density := 1/m_distcnt;

	dbms_stats.set_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'n1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	); 

exception
	when others then
		raise;		-- should handle div/0

end;
/

The code basically reads the column stats, resets the histogram figures to just the low and high values for the column, setting the endpoint-count to two, then adjusts the density to the standard for a column with no histogram. This specific example is for a numeric column.

Footnote: my preferred method of collecting statistics is to use method_opt => ‘for all columns size 1’ (i.e. no histograms) and then run scripts to create the histograms I want. This means that after any stats collection I need to run code that checks to see which tables have new stats, and then re-run any histogram code that I’ve written for that table.

To move from Oracle’s default histogram collection to this strategy, you could start by switching to method_opt => ‘for all columns size repeat’ (i.e. recreate existing histograms, don’t create new ones [ but see footnote 3, this is no longer safe]), then simply delete histograms as you find that you don’t need them, and introduce scripts to recreate the histograms that you do need. When you’ve finally got to the point where every histogram is scripted you can then switch to method_opt => ‘for all columns size 1’.

 Footnote 2: Since 2010 when I drafted this note Oracle 12c has launched, and the changes it has introduced for frequency and Top-N histograms means that I’m far less stringent in my demand that if a histogram is worth having it’s better to write code to create it. There’s a series of three articles about 12c histograms in particular at this link.

Footnote 3: Since writing this article I’ve discovered that “size repeat” has what Oracle should have published as a “notable change in behaviour” for 12c and it is no longer a suitable method_opt for histogram collection. See this note.

October 9, 2013

12c Histograms pt.3

Filed under: 12c,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:13 pm GMT Oct 9,2013

It has taken much longer than I anticipated to get around to writing part 3 of this mini-series on what Oracle has done about histograms in 12c.
In part 1 I gave a thumbnail sketch of the three types of histogram available in 12c
In part 2 I described in some detail the improvements in performance and accuracy for the frequency and top-frequency histograms

In part 3 of this mini-series I’ll be describing how the implementation of the “hybrid” histogram that Oracle produces if the “approximate NDV” mechanism has been enabled and you’ve left the estimate_percent to auto_sample_size. There is little difference between the work needed to create a hybrid histogram and the work needed to generate the old “height-balanced” histogram, but the degree of information captured by the hybrid is much greater than that of the height-balanced.

(more…)

September 1, 2013

Histograms

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 9:41 am GMT Sep 1,2013

A three-part series on histograms up to and including 11g – written for Allthings Oracle. With Chinese transalation:

And a three part series on histograms in 12c on this blog

A short video recorded at RMOUG 2013 about the improvements in Oracle’s histogram mechanisms as you upgrade from 11g to 12c.

Update (Jan 2016):

A pair of articles by Mohamed Houri, published on allthingsoracle.com adding discussing details of the mechanisms and selectivity calculations for 12c histograms:

 

 

August 19, 2013

Distributed Queries – 3

Filed under: distributed,Histograms,Oracle — Jonathan Lewis @ 7:25 am GMT Aug 19,2013

A comment I’ve made many times in the past about distributed queries is that Oracle doesn’t try to retrieve histogram information from remote databases when optimizing a query. Checking back through previous posts, though, I don’t think I’ve ever mentioned it on the blog – so here’s an example demonstrating the point.

(more…)

July 30, 2013

12c Histograms pt.2

Filed under: 12c,Histograms,Oracle,Statistics — Jonathan Lewis @ 9:00 pm GMT Jul 30,2013

In part 2 of this mini-series I’ll be describing the new mechanism for the simple frequency histogram and the logic of the Top-N frequency histogram. In part 3 I’ll be looking at the new hybrid histogram.

You need to know about the approximate NDV before you start examining the 12c implementation of the frequency and top-frequency histograms – but there’s a thumbnail sketch at the end of the posting if you need a quick reminder.

(more…)

July 14, 2013

12c histograms

Filed under: 12c,Histograms,Oracle,Statistics — Jonathan Lewis @ 7:11 pm GMT Jul 14,2013

There are a few enhancements in 12c that might make a big difference to performance for a small investment in effort. One of the important enhancements comes from changes in histograms – which improve speed of collection with accuracy of results. The changes are so significant that I chose the topic as my presentation at OpenWorld last year.

(more…)

September 13, 2012

Histogram Generation

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 5:21 pm GMT Sep 13,2012

This really could be published in the Errata and Addenda of “Cost Based Oracle – Fundamentals”, but it’s more convenient to publish the whole thing here and just add a link to the Errata pages.

In chapter 7, on page 156, I gave an example of the type of SQL that Oracle runs (in the dbms_stats package) to generate a histogram. A sample of the code, and the plan from the 9.2.0.8 tkprof output, is listed below:
(more…)

January 3, 2012

NewDensity

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 5:56 pm GMT Jan 3,2012

A recent comment on a note I wrote some time ago about faking histograms asked about the calculations of selectivity in the latest versions of Oracle. As I read the question, I realised that I had originally supplied a formula for calculating cardinality, rather than selectivity, so I thought I’d supply a proper example.

We’ll start with a script to create some data and stats – and I’m going to start with a script I wrote in Jan 2001 (which is why it happens to use the analyze command rather than dbms_stats.gather_table_stats, even though this example comes from an instance of 11.2.0.2).
(more…)

October 19, 2010

Frequency Histograms – 6

Filed under: Histograms,Statistics,Troubleshooting,Tuning — Jonathan Lewis @ 6:01 pm GMT Oct 19,2010

In an earlier post on frequency histograms I described how Oracle creates an approximate histogram when dealing with character columns, and warned you that the strategy could lead to a couple of anomalies if you were unlucky. I’ve already published a note about one such anomaly that can occur with fairly long character strings, this note describes another anomaly that could appear in less extreme cases. Again, we start by constructing a data set.

(more…)

October 13, 2010

Frequency Histogram 5

Filed under: Histograms,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 9:21 am GMT Oct 13,2010

In an earlier post on frequency histograms I described how Oracle creates an approximate histogram when dealing with character columns, and warned you  that the strategy could lead to a couple of anomalies if you were unlucky. This note describes one such anomaly. We start with a slightly curious data set:
(more…)

October 5, 2010

Frequency Histogram 4

Filed under: Histograms,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 6:25 pm GMT Oct 5,2010

In an earlier note on interpreting the content of frequency histograms I made a throwaway comment about the extra complexity of interpreting frequency histograms on character-based columns. This note starts to examine some of the complications.

The driving problem behind character columns is that they can get quite large – up to 4,000 bytes – so the content of an “accurate histogram” could become quite large, and Oracle seems to have taken a strategic decision (at some point in history) to minimise this storage. As a result we can see an algorithm that works roughly as follows:
(more…)

September 24, 2010

Frequency Histogram 3

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 6:45 pm GMT Sep 24,2010

Here’s the output I get from querying dba_tab_histograms for a column that has been given a frequency histogram by a call to dbms_stats.gather_table_stats().
(more…)

September 20, 2010

Frequency Histograms 2

Filed under: Histograms,Statistics — Jonathan Lewis @ 6:18 pm GMT Sep 20,2010

I find it convenient occasionally to “translate” a frequency histogram into a report of the underlying data (sometimes to see how closely the histogram matches the real data). To demonstrate the type of query I use I’ve created a data set with a small number of distinct values and generated a frequency histogram on the data set. This is what the data and histogram look like:
(more…)

March 23, 2010

Fake Histograms

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:49 pm GMT Mar 23,2010

I got an email a few days ago asking me how to create a “height-balanced” histogram using the dbms_stats.set_column_stats() procedure.

I posted an example of creating a frequency histogram a few months ago – and it doesn’t take much to turn this into an example for the height-balanced type; the key change apppears in three critical lines.

(more…)

May 28, 2009

Frequency Histograms

Filed under: CBO,Histograms,Statistics,Tuning — Jonathan Lewis @ 7:34 pm GMT May 28,2009

In a recent article on changes to the calculation of cardinality with frequency histograms I made a comment about writing programs to create fake but realistic and complete frequency histograms to work around the problems of dramatic changes in execution plan that can occur if you let Oracle create the histograms by gathering stats.

As a follow-up to that comment, here’s a little script to demonstrate the method (for a numeric column), and a few points to remember when using this method.

(more…)

« Previous PageNext Page »

Powered by WordPress.com.