Oracle Scratchpad

January 18, 2018

Column Stats

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 2:22 pm BST Jan 18,2018

I’ve made several comments in the past about the need for being selective when gathering objects statistics with particular reference to the trade-offs when creating histograms. With Oracle 12c it’s now reasonably safe (as far as I’m concerned) to set a method_opt as a table preference that identifies columns where you expect to see Frequency or (pace the buggy behaviour described in a recent post) a Top-N histograms. The biggest problem I have is that I keep forgetting the exact syntax I need – so I’ve written this note more as a reminder to myself than anything else.

Typically I might expect to use the standard 254 columns for gathering histograms, with an occasional variation to increase the bucket count; but for the purposes of this note I’m going to demonstarate with a much lower value. So here’s a table creation statement (running 12.1.0.2 – so it will gather basic stats on the create) and two variations of a call to gather stats with a specific method_opt – followed by a question:


rem
rem     Script:         method_opt.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2003
rem

create table t1
as
select
        object_type o1,
        object_type o2,
        object_type o3,
        object_id,
        object_name
from
        all_objects
where
        rownum <= 50000 -- > comment to bypass wordpress format problem
;

select  column_name, num_distinct, histogram, num_buckets, to_char(last_analyzed,'hh24:mi:ss')
from    user_tab_cols where table_name = 'T1' order by column_id;

execute dbms_lock.sleep(2)

begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                method_opt=>'for all columns size 1 for columns o1 o2 o3 size 15'
        );
end;
/

select  column_name, num_distinct, histogram, num_buckets, to_char(last_analyzed,'hh24:mi:ss')
from    user_tab_cols where table_name = 'T1' order by column_id;

execute dbms_lock.sleep(2)

begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                method_opt=>'for all columns size 1 for columns size 15 o1 o2 o3'
        );
end;
/

select  column_name, num_distinct, histogram, num_buckets, to_char(last_analyzed,'hh24:mi:ss')
from    user_tab_cols where table_name = 'T1';


The big question is this: which columns will have histograms after each of the gather_table_stats() calls:

method_opt=>'for all columns size 1 for columns o1 o2 o3 size 15'
method_opt=>'for all columns size 1 for columns size 15 o1 o2 o3'

The problem I have is simple – to me both options look as if they will create histograms on all three named columns but the first option is the one that I type in “intuitively” if I don’t stop to think about it carefully. The first option, alas, will only gather a histogram on column o3 – the second option is the one that creates three histograms.

The manuals are a little unclear and ambiguous about how to construct a slightly complicated method_opt; there’s a fragment of text with the usual mix of square brackets, italics and ellipses to indicate optional and repeated clauses (interestingly the only clue about multiple columns is that comma separation seems to be required – despite one of the examples above working withough commas) but there’s no explanation of when a “size” clause should go before a “column” column and when it should go after.

So here are a few more method_opt clauses – can you work out in advance which columns would have histograms if you used them and how many buckets in each histogram; there are a couple that may surprise you:


for columns o1 size 12, o2 size 13, o3 size 14

for columns o1 size 15 o2 size 16 o3 size 17

for columns size 18 o1 size 19 o2 size 20 o3

for columns size 21 o1 o2 size 22 o3

for columns o1 size 12, o2 size 12, o3 size 13, object_id size 13 object_name size 14

for columns size 22 o1 o2 for columns size 23 o3 object_id for columns size 24  object_name

Bottom line – to me – is to check very carefully that the method_opt is going to do what I want it to do; and for production systems I tend to use the final form that repeats the “for columns {size clause} {column list}”.

1 Comment »

  1. […] issue then was to change the table preferences for stats collection on this table to add the clause “for columns size 1 skew” so that the histogram would not be recreated on the next […]

    Pingback by Histogram Threat | Oracle Scratchpad — January 30, 2018 @ 8:07 am BST Jan 30,2018 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply to Histogram Threat | Oracle Scratchpad Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.