Oracle Scratchpad

September 12, 2018

Column Stats

Filed under: 12c,extended stats,Oracle,Statistics — Jonathan Lewis @ 1:46 pm BST Sep 12,2018

A little while ago I added a postscript about gathering stats on a virtual column to a note I’d written five years ago and then updated with a reference to a problem on the Oracle database forum that complained that stats collection had taken much longer after the addition of a function-based index. The problem related to the fact that the function-based index was supported by a virtual column that used an instr() function on a CLOB (XML) column – and gathering stats on the virtual column meant applying the function to every CLOB in the table.

So my post-script, added about a month ago, suggested adding a preference (dbms_stats.set_table_prefs) to avoid gathering stats on that column. There’s a problem with this suggestion – it doesn’t work

Oracle doesn’t play nicely when you try to limit the stats collection to a few columns – even in version 18.3. Here’s a demonstration of the effect. First we create a table that includes a column group (extended stats), a virtual column, and a function-based index – i.e. the three different ways of generating user-related virtual columns.

rem     Script:         stats_struggle_06.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sep 2018

create table t1
with generator as (
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
        rownum                          id,
        lpad(rownum,10,'0')             v1,
        lpad(rownum,10,'0')             v2
        generator       v1
        rownum <= 1e4 -- > comment to avoid WordPress format issue

execute dbms_stats.delete_table_stats(user,'t1')

                        ownname         => user,
                        tabname         => 'T1',
                        extension       => '(v1, v2)'

alter table t1 add id_12 
        generated always as (mod(id,12)) virtual

create index t1_id on t1(mod(id,10));

Since I’ve run this on 12c and 18c I’ve included a call to delete table stats after creating the table. So the next step is to enable SQL trace and see what Oracle does under the covers when we try to gather stats on just a couple of columns in the table:

alter session set events '10046 trace name context forever';

                ownname     => user,
                tabname     => 't1',
                method_opt  => 'for columns size 1 id v1',
                cascade     => false

alter session set events '10046 trace name context off';

column column_name  format a32
column data_default format a32

        column_name, data_default,
        num_nulls, num_distinct, to_char(last_analyzed,'hh24:mi:ss') gathered
from    user_tab_cols 
where   table_name = 'T1' 
order by 

COLUMN_NAME                      DATA_DEFAULT                      NUM_NULLS NUM_DISTINCT GATHERED
-------------------------------- -------------------------------- ---------- ------------ --------
ID                                                                         0        10000 16:13:12
V1                                                                         0        10000 16:13:12
ID_12                            MOD("ID",12)
SYS_NC00006$                     MOD("ID",10)

According to the output of the last query we’ve gathered stats only on the two columns specified. But have we really avoided the work ? Here, with some cosmetic tidying, is the SQL executed by the package:

                full(t) no_parallel(t) no_parallel_index(t) dbms_stats
                cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
                xmlindex_sel_idx_tbl no_substrb_pad 

We can see that Oracle has done a count(), min() and max() on id and v1, and the “comment” at the end of the text tells us that it’s applied the approximate_ndv mechanism to the first two columns queried but not the rest. So that’s collected the statistics we asked for.  However Oracle has also count()ed all the other columns – which means it’s evaluated their underlying expressions. So if you were hoping that limiting the columns gathered would avoid a really expensive function call, bad luck it won’t work. (The same is still true in 19c).

Threat / Bug alert

A further irritation showed up when I ran a test case that used a deterministic PL/SQL function to generate a virtual column: in the function was called once per row (possibly because every row had a different value) whether or not it was in the list of columns for gathering stats; in 18.3 the function was called nearly twice per row when I didn’t specify stats gathering for the column and nearly 4 times per row when I did. This looks like it might be a change (possibly accidental) to how deterministic functions can cache their inputs and outputs across selected columns – possibly something as “minor” as the size of the cache. To be continued when time permits …




  1. […] any stats on any other columns remaining unchanged. Unfortunately I’ve just got around to testing this (Sept 2018) and discovered that the code Oracle uses to gather the stats will still count() the number of […]

    Pingback by Virtual Stats | Oracle Scratchpad — September 24, 2019 @ 11:56 pm BST Sep 24,2019 | Reply

  2. […] you limit the number of columns where you collect stats I’ve had to introduce a link to a note I wrote in September 2018 which points out that Oracle still count the number of non-null values for every column even if […]

    Pingback by IOT limitation | Oracle Scratchpad — February 11, 2021 @ 8:18 pm GMT Feb 11,2021 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by

%d bloggers like this: