Oracle Scratchpad

April 30, 2009

Analyze this!

Filed under: CBO,Execution plans,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 8:29 pm BST Apr 30,2009

There’s a question on the Database General forum on OTN containing the following observation:

I have some chained rows in some of my tables. I have exported, truncated, import back the data in the table. Then collect the stats. But I have the same Chained_count. Nothing has changed.

If you export, truncate the table, then re-import the data, you expect to get rid of any migrated rows – but if a row is too big to fit in a single block, or has more than 255 columns and a non-null value after the 255th, (either of which will be reported as a chained row) it’s not going to change after exporting and importing; so the first question to ask is: “What’s the definition of the table ?” optionally followed by: “Can you run some SQL to check for very long rows.”

In this case the table definition showed that there couldn’t possibly be any long rows – so any figures for the chain_cnt must be from row migration – i.e. rows that got longer during an update at a time when there was not enough free space in the current block to allow for the growth. So how come the chain_cnt hasn’t changed after the export/truncate/import ?

There is one obvious (and fairly likely) idea that will spring to mind if you’ve spent many years using Oracle – and this idea prompts the question: “How, exactly, did you collect the stats after the import?”

If the chain_cnt was originally non-zero then someone must have used the analyze command to collect statistics at some point – the calls in the dbms_stats package don’t affect the chain_cnt. When you truncate a table most of the stats for the table will (in recent versions of Oracle) be cleared – but that won’t include the chain_cnt.

After re-importing the data the original chain_cnt would still be in place in the data dictionary, and if the person posting this question used the dbms_stats package to “collect the stats” the chain_cnt would remain unaltered. If he wants to get rid of that chain_cnt he will have to run “analyze table XXX delete statistics;” or use the analyze to collect the new statistics (and this would show that the chain_cnt really had dropped to zero).

Surprising CBO:

Here’s a little surprise, though, and the main reason for writing this note. I’ve got a table with a simple index on it. There are 10,000 rows in the table, and the indexed column has 1,000 tightly clustered rows for each value.

Watch what happens when I switch from using dbms_stats to using analyze (running 10.2.0.3 – but the same happens in 9i):

rem
rem     Script: chain_cnt.sql
rem     Author: Jonathan Lewis
rem     Dated:  June 2006
rem

execute dbms_stats.gather_table_stats()
set autotrace traceonly explain

select v1, v2  from t1 where n1 = 7;

--------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |  1000 |  8000 |    19 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1   |  1000 |  8000 |    19 |
|*  2 |   INDEX RANGE SCAN          | I1   |  1000 |       |     2 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=7)

analyze table t1 compute statistics;
select /*+ index(t1) */ v1, v2  from t1 where n1 = 7;

--------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |  1000 |  6000 |   905 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1   |  1000 |  6000 |   905 |
|*  2 |   INDEX RANGE SCAN          | I1   |  1000 |       |     2 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=7)

The cost of the query changed so much after the analyze that I had to put in a hint to force Oracle to use the index.

I happen to have 8,852 rows in the table which migrated after a bulk update, and when I used the analyze command to set the chain_cnt the CBO changed the cost of the query. (The increase is “query selectivity * chain_cnt” viz: 0.1  * 8,852 = 886).

Two things to bear in mind: (a) you should be using dbms_stats and not the analyze command to collect object statistics – and Oracle Corp. has been saying that since version 8; but (b) even though dbms_stats does nothing to clear, set, export, or import the chain_cnt, the cost based optimizer will use the value if it’s there – so if you’re still waiting to change your stats collection procedures, watch out – when you change the procedures some of your execution paths may change for “no apparent reason”.

Footnote

If you decide to check a table for chained rows, be careful – if you use the basic analyze command to populate the chain_cnt column you need to “analyze delete” the statistics afterwards before gathering (or reloading) stats through a call to dbms_stats. It’s much better to create the chained_rows table (see @$ORACLE_HOME/rdbms/admin/utlchain.sql) and use the extended command: “analyze table xxx list chained rows;” as this doesn’t change the stats on the table and it identifies the rows which are chained.

Inevitably there’s another type of price to pay – this process will generate undo and redo as it runs because it is writing the list of chained rows into the chained_rows table.

Update (Aug 2021)

The same behaviour appears in 19.11.0.0

 

14 Comments »

  1. In this case wouldn’t the analyze cost be more accurate as it takes the migrated rows into account?

    Henry

    Comment by Henry Poras — May 4, 2009 @ 7:22 pm BST May 4,2009 | Reply

  2. Henry,

    There’s certainly a strong case for saying that the adjustment for the chain_cnt is consistent with the rest of the strategy used by the optimizer in calculating indexed access paths – and possibly that’s why the code changed in 9i.

    The issue I wanted to highlight, though, was that you could get significant changes in costs if you changed your stats collection method.

    Comment by Jonathan Lewis — May 5, 2009 @ 6:04 pm BST May 5,2009 | Reply

  3. […] I’ve just rediscovered another note I’ve written about the difference between dbms_stats and analyze, and the effect this […]

    Pingback by Analyze This – 2 « Oracle Scratchpad — April 3, 2010 @ 10:30 am BST Apr 3,2010 | Reply

  4. You are highlighting that if i change the method of statistic collection; i.e. from analyze command to dbms_stats; now optimizer is having smart and most worth full statistics, but suppose there are two execution plans; (A) With Chain_cnt (B) Without Chain_cnt; so how particular this (chain_cnt) is significantly influence for calculation of index access path?. Which one is good?
    If i conclude your line “adjustment for the chain_cnt is consistent with the rest of the strategy used by the optimizer..”; so there will no/minor impact of chain_cnt, because it is consistent. Please correct me, if i am wrong.

    Thanks and Regards
    Girish Sharma

    Comment by Girish Sharma — June 10, 2010 @ 3:42 am BST Jun 10,2010 | Reply

  5. “You are highlighting …”
    The previous response to Henry states explicitly what I was trying to highlight.

    “most worth full statistics” I did not say that using dbms_stats means you get “most worth full statistics”, I said that Oracle Corp. has been telling you for years to use dbms_stats and not analyze – you’re supposed to use dbms_stats() – whether or not the statistics it gives are “better” or “worse”.

    “Which one is good?” The best path for a query may be one that the optimizer doesn’t find, regardless of what statistics you’ve got. The important point is that analyze and dbms_stats() generate different stats and therefore may give you different paths — either strategy could give you the better paths for some of your queries and worse paths for other queries.

    “no/minor impact”. The blog shows an example where the impact of changing from analyze to dbms_stats() is not minor. Consistent means “always does the same thing” which means, amongst other things: “always makes the same mistakes”. If, for a given version of Oracle, you always use analyze, the optimizer will always make the same mistakes; if you always use dbms_stats() the optimizer will make a different set of mistakes – but it will always make the same “different” set of mistakes.

    Comment by Jonathan Lewis — June 10, 2010 @ 9:30 am BST Jun 10,2010 | Reply

  6. You are saying something very different; while i wish to request something very different. I am not saying that you have said that dbms_stats will collect more smart/worthy stats; its my opinion; assumption. My request is what are implications of existence of chain_cnt in stats? Will it influence to calculate index access path? If yes; then how? (for which you have this blog entry in reference of above OTN thread in which OP is somewhat confused regarding export/import Vs Stats of chained rows).

    Best Regards
    Girish Sharma

    Comment by Girish Sharma — June 11, 2010 @ 6:23 am BST Jun 11,2010 | Reply

    • Girish:

      I’m sorry, I still don’t understand what your question is. You say: “My request is what are implications of existence of chain_cnt in stats? Will it influence to calculate index access path?”

      But the article is about the effects of analyze and the chain_cnt value, and includes an example with the following comment:

      “The cost of the query changed so much after the analyze that I had to put in a hint to force Oracle to use the index.”

      This seems to address the question you appear to be asking.

      Comment by Jonathan Lewis — June 12, 2010 @ 2:30 pm BST Jun 12,2010 | Reply

  7. I don’t believe CHAIN_CNT is actually used by the optimizer – at least not in my experience with 9i or 10g. In the past I have encountered several tables with high migrated rows, however, CHAIN_CNT = 0 because dbms_stats() doesn’t compute CHAIN_CNT! I performed a 10053 trace on a query against this table, and then again after an ANALYLZE COMPUTE STATISTICS, and found a different costing as is mentioned above. However, after re-applying dbms_stats() after the ANALYZE, the 10053 trace costing model reverted back to the pre-ANALYZE state, even though CHAIN_CNT retained its high non-zero value. Thus a technique I often use is to run a periodic sweep through the database using ANALYZE to identify those tables with increasing CHAIN_CNT – and then, immediately follow-up with dbms_stats() so as to revert to the dbms_stats costing. This enables computation of the optimum PCTFREE while not signficantly compromising the execution plans. It is also very helpful to identify tables that need to be reorganized (e.g., ALTER TABLE xxx MOVE). It would be nice to know why dbms_stats() dropped support for CHAIN_CNT in the first place as it can be a very useful statistic.

    Comment by Bruce Zikmund — July 3, 2010 @ 8:30 pm BST Jul 3,2010 | Reply

    • Bruce,

      That’s an interesting observation – unfortunately my first attempt to emulate your results on 10.2.0.3 and 9.2.0.8 failed.

      I modified the test scripts from the original blog, and added code after the example with the ANALYZE command to use dbms_stats.gather_table_stats() again – which left the chain_cnt in place. The resulting execution path (and 10053 trace) showed that Oracle still considered the chain_cnt.

      First observation – when you use dbms_stats the global_stats column on user_tables is set to ‘YES’, when you use analyze it is set to ‘NO’. Possibly there are some versions of Oracle where the handling of chain_cnt is dependent on that flag.

      Can you give an example of the code sequence you used to produce your results.

      Comment by Jonathan Lewis — July 4, 2010 @ 9:31 am BST Jul 4,2010 | Reply

  8. […] chained and migrated rows counted separately in dba_tables when we run the packaged procedure. The optimizer will use information about chained or migrated rows, but the information is only collected if you use the analyze command (and even then the two […]

    Pingback by I Wish « Oracle Scratchpad — December 16, 2011 @ 6:32 pm GMT Dec 16,2011 | Reply

  9. […] went) I used the analyze command to gather stats on the tables so that I could get a count of the number of chained rows; and I dumped a couple of blocks from the tables to see what the inserted rows looked […]

    Pingback by 255 columns | Oracle Scratchpad — May 20, 2017 @ 8:58 am BST May 20,2017 | Reply

  10. […] doesn’t tell you about the LOBs that have ended up in the LOB segment) but remember to use analuze again to delete the stats afterwards before using dbms_stats to recreate the stats the […]

    Pingback by Fragmentation ? | Oracle Scratchpad — August 11, 2021 @ 1:48 pm BST Aug 11,2021 | Reply

  11. […] CBO had started using the CHAIN_CNT statistic in its calculations, that Jonathan Lewis had already written on this subject way way back in April 2009 […]

    Pingback by CBO Costing Plans With Migrated Rows Part II (“New Killer Star”) | Richard Foote's Oracle Blog — March 28, 2023 @ 12:21 am BST Mar 28,2023 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: