Oracle Scratchpad

July 2, 2018

Clustering_Factor

Filed under: CBO,Indexing,Oracle,Statistics — Jonathan Lewis @ 1:24 pm BST Jul 2,2018

Here’s another little note on the clustering_factor for an index and the table preference table_cached_blocks that can be set with a call to dbms_stats.set_table_prefs(). I might be repeating a point that someone made in a comment on an older posting but if that’s the case I can’t find the comment at present, and it’s worth its own posting anyway.

The call to dbms_stats.set_table_prefs(null,'{tablename}’,’table_cached_blocks’,N) – where N can be any integer between 1 and 255, will modify Oracle’s algorithm for calculating the clustering_factor of an index. The default is 1, which often means the clustering_factor is much higher than it ought to be from a humanly visible perspective and leads to Oracle not using an index that could be a very effective index.

The big point is this: the preference has no effect when you execute a “create index” statement, or an “alter index rebuild” statement. Here’s a simple script to demonstrate the point.


rem
rem     Script:         table_cached_blocks_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2018
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem

drop table t1 purge;
create table t1
segment creation immediate
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum-1,100)               n1,
        mod(rownum-1,100)               n2,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator
;

column blocks new_value m_blocks

select  blocks 
from    user_tables
where   table_name = 'T1'
;

column preference_value format a40

select  preference_name, preference_value
from    user_tab_stat_prefs
where
        table_name = 'T1'
;

I’ve created a very simple table of 10,000 rows with two identical columns and captured the number of blocks (which I know will be less than 256) in a substitution variable which I’m going to use in a call to set_table_prefs(). I’ve also run a quick check to show there are no table preferences set for the table. I’ll be running the same check again after setting the table_cached_blocks preference. Step 1 – create two indexes, but set the preference after building the first one; I’ve shown the result of the query against user_indexes immediately after the query:


create index t1_i1 on t1(n1);

execute dbms_stats.set_table_prefs(null,'t1','table_cached_blocks',&m_blocks)

create index t1_i2 on t1(n2);

select
        index_name, clustering_factor, to_char(last_analyzed, 'dd-mon-yyyy hh24:mi:ss') analyzed
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;

INDEX_NAME	     CLUSTERING_FACTOR ANALYZED
-------------------- ----------------- -----------------------------
T1_I1				 10000 26-jun-2018 14:13:51
T1_I2				 10000 26-jun-2018 14:13:51


Now we check the effect of rebuilding the t1_i2 index – the one second sleep is so that we can use the last_analyzed time to see that new stats have been created for the index:


execute dbms_lock.sleep(1)
alter index t1_i2 rebuild /* online */ ;

select
        index_name, clustering_factor, to_char(last_analyzed, 'dd-mon-yyyy hh24:mi:ss') analyzed
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;

INDEX_NAME	     CLUSTERING_FACTOR ANALYZED
-------------------- ----------------- -----------------------------
T1_I1				 10000 26-jun-2018 14:13:51
T1_I2				 10000 26-jun-2018 14:13:52


Finally we do an explicit gather_index_stats():


execute dbms_lock.sleep(1)
execute dbms_stats.gather_index_stats(null,'t1_i2')

select
        index_name, clustering_factor, to_char(last_analyzed, 'dd-mon-yyyy hh24:mi:ss') analyzed
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;

INDEX_NAME	     CLUSTERING_FACTOR ANALYZED
-------------------- ----------------- -----------------------------
T1_I1				 10000 26-jun-2018 14:13:51
T1_I2				   179 26-jun-2018 14:13:53

At last – on the explicit call to gather stats – the table_cached_blocks preference is used.

Dire Threat

Think about what this means: you’ve carefully worked out that a couple of indexes really need a special setting of table_cached_blocks and you gathered stats on those indexes so you have a suitable value for the clustering_factor. Then, one night, someone decides that they’re going to rebuild some of those indexes. The following morning the clustering_factor is much higher and a number of critical execution plans change as a consequence, and don’t revert until the index statistics (which are perfectly up to date!) are re-gathered.

Footnote

The same phenomenon appears even when you’ve set the global preference for stats collection with dbms_stats.set_global_prefs().

Update (Oct 2021)

The problem is still present in 21.3.0.0 (and 19.11.0.0)

 

13 Comments »

  1. Not surprisingly, 18c demonstrates same behavior.

    Comment by Viacheslav Andzhich — July 2, 2018 @ 1:44 pm BST Jul 2,2018 | Reply

  2. Hi Jonathan – Investigating. I will post an update soon.

    Comment by Nigel Bayliss — July 4, 2018 @ 11:49 am BST Jul 4,2018 | Reply

  3. We’ve raised bug 28292026 (unpublished at the moment). Thank you for bringing this to our attention.

    Comment by Nigel Bayliss — July 4, 2018 @ 2:56 pm BST Jul 4,2018 | Reply

  4. Nigel,

    Thanks for picking this one up.

    Comment by Jonathan Lewis — July 4, 2018 @ 7:10 pm BST Jul 4,2018 | Reply

  5. […] Jonathan Lewis’s excellent Oracle blog, do yourself a favour. In a recent article, Jonathan highlighted a danger with rebuilding indexes (or indeed creating an index) when used in relation to collecting index […]

    Pingback by Rebuilding Indexes: Danger With Clustering Factor Calculation (Chilly Down) | Richard Foote's Oracle Blog — July 17, 2018 @ 7:33 am BST Jul 17,2018 | Reply

  6. […] A current bug relating to create index and index rebuild […]

    Pingback by CBO Oddities – 1 | Oracle Scratchpad — October 18, 2019 @ 6:10 pm BST Oct 18,2019 | Reply

  7. Has it been addressed? Just tested in LiveSQL, and results are still the same: table_cached blocks is ignored when index is rebuilt

    Comment by Viacheslav Andzhich — October 19, 2019 @ 9:58 pm BST Oct 19,2019 | Reply

    • Viacheslav,

      As far as I know this hasn’t been fixed yet.
      I couldn’t find anything on MOS about it, and it hasn’t changed (for create or rebuild) in the latest version I have (19.3).

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — October 19, 2019 @ 10:41 pm BST Oct 19,2019 | Reply

  8. […] Change some table preferences. Do not rely on the default settings.Especially you should take care about the new type of creating synopses (APPROXIMATE_NDV_ALGORITHM), if you upgraded some pre 12.2 database to the newer version.https://blogs.oracle.com/optimizer/efficient-statistics-maintenance-for-partitioned-tables-using-incremental-statistics-part-3TABLE_CACHED_BLOCKS can improve your clustering factor.https://richardfoote.wordpress.com/category/table_cached_blocks/https://jonathanlewis.wordpress.com/2018/07/02/clustering_factor-5/ […]

    Pingback by You should take care about statistics! – Oracle Ninja — April 18, 2020 @ 10:09 pm BST Apr 18,2020 | Reply

  9. Mea culpa – I forgot about this. The code change to implement TABLE_CACHED_BLOCKS was added to the DBMS_STATS package. There is no matching code for implementing this during index build/rebuild (this is an entirely different code path of course). There is an unpublished bug for this but its priority has never been elevated because (rightly or wrongly) there has only ever been one SR attached to it, and even that looks dubious. The bug is 28292026 TABLE_CACHED_BLOCKS PREFERENCE NOT CONSIDERED WHEN CREATING OR RECREATING INDEX. I will at least bug the DBMS_STATS API doc to make this clear – until 28292026 is addressed. My reasoning is that, arguably, 28292026 is not a bug because the behavior is “as implemented” and specific to the DBMS_STATS API, but don’t burn me at the stake for saying that because I do understand the arguments made above. I am not closing the bug. As for the default being “1” (I received an @Tweet about this): the value was chosen to preserve compatibility with 12.1 etc. This was the right choice IMO, and I say that based on the frequency and intensity of feedback I receive about changes to defaults. Of course, perhaps we could/should have started out with a larger value like 2,4,8,16, etc, but we didn’t. Taking a positive, half-pint-full perspective: the TABLE_CACHED_BLOCKS DBMS_STATS preference is a useful spanner to have in an expert’s toolkit and is a reason for some degree of celebration.

    Comment by Nigel Bayliss — November 3, 2021 @ 12:14 pm GMT Nov 3,2021 | Reply

    • Nigel,

      Thanks for the response.

      I have no problem with the “default = 1 for backwards compatibility” reasoning – there are probably a number of sites where a global change to (e.g.) 16 would result in some plans picking the wrong index because the site had already set optimizer_index_cost_adj as a workaround to the optimizer’s overestimate of the clustering_factor. It would have been nice, though, if the “changes” manual had included a specific note about the current setting, and warned of a change of default in a future release. As it is the resistance to change will continue because “the manuals don’t say it”.

      if you’re going to raise a doc bug on dbms_stats it would also be good to have a doc bug for the SQL Language reference manual under Create Index and Alter Index pointing out that the commands will not honour the preference.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — November 5, 2021 @ 9:56 am GMT Nov 5,2021 | Reply

      • Hi Jonathan, in the doc bug, I’ve included a mention WRT the SQL language ref guide. As of today, I don’t anticipate a change in the default from “1”, but the topic does come up periodically. As an aside (and genuinely – this is not a sales pitch), for those that want to quantify the effects of changing the value, I have always found SQL Performance Analyzer to be brilliant for experimenting with optimizer stats (requires a license). I use “dbms_sqlpa.execute_analysis_task (execution_type => ‘compare performance'” and compare runs on published stats vs experimental stats gathered in a pending state.
        Regards,
        Nigel

        Comment by Nigel Bayliss — November 5, 2021 @ 12:45 pm GMT Nov 5,2021 | Reply

  10. […] (CF), but if you want to read more about it I would recommend to read blog posts by Richard Foote, Jonathan Lewis and few other great articles by Randolf […]

    Pingback by Bad clustering factor on >12.1 ? Have you tried Attribute Clustering ? « Tales From A Lazy Fat DBA — May 16, 2022 @ 4:50 am BST May 16,2022 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: