Oracle Scratchpad

July 14, 2013

12c histograms

Filed under: 12c,Histograms,Oracle,Statistics — Jonathan Lewis @ 7:11 pm BST Jul 14,2013

There are a few enhancements in 12c that might make a big difference to performance for a small investment in effort. One of the important enhancements comes from changes in histograms – which improve speed of collection with accuracy of results. The changes are so significant that I chose the topic as my presentation at OpenWorld last year.

In 11g Oracle gave us the option for using an “approximate NDV (number of distinct values)” for rapid and accurate collection of basic column stats. In 12c Oracle extends the mechanism to frequency histograms and a new type of frequency histogram called a “Top-N” histogram (also called a “Top-Frequency” histogram). At the same time the maximum number of buckets allowed for a histogram has gone up from 254 to 2000 2048 [corrected: see note 4 below] – although the default stays at 254 and you probably don’t need to increase it in most cases (remember the side effect you might have on the sysaux tablespace as Oracle automatically retains the historical stats).

Height-balanced histograms are still relatively expensive to collect – but there is a newer type of height-balanced histogram named the “Hybrid” histogram which gives you better information than the old height-balanced for the same amount of work and, thanks to the Top-N histogram you may find that some of your old (expensive, inaccurate) height-balanced histograms are replaced by cheap, accurate Top-N histograms.

In part 2 of this mini-series I’ll describe the new mechanism for the frequency histogram and the logic of the Top-N histogram and in part 3 I’ll describe the mechanism and demonstrate the benefits of the Hybrid histogram. The takeaway from this note, though, is that you need to look at your current stats collection and think about how a small change in strategy could both reduce the workload and improve the accuracy of your histograms.

15 Comments »

  1. Jonathan,

    I thought you were generally against using histograms – you were/are recommending on multiple occasions to turn them off (~ set size to 1).
    Did anything about 12c change your mind in this matter?

    Comment by Josef — July 15, 2013 @ 9:44 am BST Jul 15,2013 | Reply

    • Josef,

      I’m still very careful about (and generally against) using histograms.

      Key points:
      * You can collect a (frequency – in particular) histogram at the wrong time
      * A (height-balanced) histogram can vary its contents from collection to collection even when you collect on the same data at the same time of day – just because of the luck of random selection when sampling.

      With 12c (and suitable configuration – to be written up):
      * a frequency histogram is “free of charge” while being 100% accurate
      * a “top-N” histogram is “free of charge” and can be very accurate when there are very many values but most of the data comes from a relatively small number of values
      * a “hybrid” histogram gives much more information, stability, and accuracy at the same cost as the equivalent height-based.

      So if you’re letting Oracle gather histograms now (and aren’t having problems) then 12c will probably gather them faster and better.

      It’s still possible, though, to gather them at the wrong time of day, and it will still be cheaper and safer to write your own code to fake them in some (probably fewer) special cases.

      Comment by Jonathan Lewis — July 24, 2013 @ 12:39 pm BST Jul 24,2013 | Reply

  2. there are some interesting bugs that are found in 12cR1, i know about one which is follows:

    Bug 16933305
    Adding a nullable column with a default value and then later unsetting the default value in an Oracle Database 12c Release 1 (12.1) environment will not unset the default (back to NULL) despite the data dictionary reflecting the change to a default value of NULL. The statements that are impacted are ALTER TABLE x ADD (y NUMBER DEFAULT 99) followed by a subsequent ALTER TABLE x MODIFY (y DEFAULT NULL) where y is a nullable column.
    Workaround: Unset the default by using the ALTER TABLE x MODIFY (y DEFAULT TRIM(”)) statement which has the same semantic result as unsetting the default.

    Comment by OracleCommunity — July 19, 2013 @ 10:49 am BST Jul 19,2013 | Reply

    • Thanks for the comment.
      I think the guideline here is to test new features carefully – and make sure that one of the test is: “what if I change my mind”.

      The Oak Table mailing list has recently had an interesting chat about the effect of: “make column invisible, oops not that one the other one”, and what could happen to an application when you make the column visible again. (Answer – if your code doesn’t always identify columns by name then maybe your application crashes until you fiddle things back into the right order.)

      I’ve cross-linked your comment into a posting on “12 Debug” which references Oracle’s list of known 12c bugs: https://jonathanlewis.wordpress.com/2013/06/28/12c-debug/

      Comment by Jonathan Lewis — July 24, 2013 @ 12:49 pm BST Jul 24,2013 | Reply

  3. […] around to writing part 3 of this mini-series on what Oracle has done about histograms in 12c. In part 1 I gave a thumbnail sketch of the three types of histogram available in 12c In part 2 I described in […]

    Pingback by 12c Histograms pt.3 | Oracle Scratchpad — October 9, 2013 @ 8:14 pm BST Oct 9,2013 | Reply

  4. Jonathan,
    “at the same time the maximum number of buckets allowed for a histogram has gone up from 254 to 2000”

    in 12c it seems that the maximum number of buckets is of 2,048 …

    SQL> select banner from v$version;
    
    BANNER
    --------------------------------------------------------------------------------
    Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
    PL/SQL Release 12.1.0.1.0 - Production
    CORE    12.1.0.1.0      Production
    TNS for Linux: Version 12.1.0.1.0 - Production
    NLSRTL Version 12.1.0.1.0 - Production
    
    
    SQL> exec dbms_stats.gather_table_stats(user, 'Z', method_opt=> 'for columns c size 2048', estimate_percent=>100);
    
    PL/SQL procedure successfully completed.
    
    [
    COLUMN_NAM HISTOGRAM       NUM_BUCKETS
    ---------- --------------- -----------
    C          FREQUENCY              2048
    ]
    
    

    I also believe that there is some problem (or something that I haven’t yet clear)
    using DBMS_STATS with the limit values​​:

    SQL> exec dbms_stats.gather_table_stats(user, 'Z', method_opt=> 'for columns c size 2048');
    BEGIN dbms_stats.gather_table_stats(user, 'Z', method_opt=> 'for columns c size 2048'); END;
    
    *
    ERROR at line 1:
    ORA-20010: DBMS_STATS INTERNAL ERROR in process_topn_values : parse count != given count
    ORA-06512: at "SYS.DBMS_STATS", line 33859
    ORA-06512: at line 1
    
    
    SQL> exec dbms_stats.set_param('APPROXIMATE_NDV','FALSE');
    
    PL/SQL procedure successfully completed.
    
    
    SQL> exec dbms_stats.gather_table_stats(user, 'Z', method_opt=> 'for columns c size 2048');
    
    PL/SQL procedure successfully completed.
    
    [
    COLUMN_NAM HISTOGRAM       NUM_BUCKETS
    ---------- --------------- -----------
    C          FREQUENCY              2047
    ]
    
    

    Regards,
    Donatello

    Comment by dsettembrino — October 15, 2013 @ 7:00 pm BST Oct 15,2013 | Reply

    • I don’t have the exact problem pinned down but it seems if your number of distinct buckets is less than the number you ask for this can happen. When you really seem to need them it seems to work. Comprehensive testing of common and edge cases pending and I know at least one SR has been submitted.

      Comment by rsiz — March 25, 2017 @ 4:52 pm GMT Mar 25,2017 | Reply

      • The failure to create histograms at 2048 size turned out to be highly data dependent (those apparently not too rare, since I hit it more than once.) A patch received and applied to the customer’s machine seems to have been effective. I am no longer experiencing any failures to create up to and including size 2048. Given that it was a data dependent problem there is no way to prove the error cannot crop up again, but I am no longer able to cause it with the data I have.

        Comment by rsiz — May 4, 2017 @ 2:07 pm BST May 4,2017 | Reply

  5. Donatello,

    Thanks for the correction and warning – not sure why I wrote 2,000, but it’s the sort of error that I would keep repeating once I’d made it once. (Good job I think you should rarely need to change the default,)

    Since you get an error when you disable TOP-N, and given the content of the error message, and since 2048 really is a valid number of buckets I’ll take a guess that it’s something to do with Oracle’s handling for a TOP-N histogram when the low or high values for the column aren’t part of the initial capture of popular values. There’s a query that might help the investigation in part 2 of the series on histograms that I wrote for AllThingsOracle: http://allthingsoracle.com/histograms-pt-2/

    Comment by Jonathan Lewis — October 15, 2013 @ 7:28 pm BST Oct 15,2013 | Reply

  6. […] Part 1 – Overview of histogram enhancements in 12c […]

    Pingback by Histograms | Oracle Scratchpad — October 16, 2013 @ 5:22 pm BST Oct 16,2013 | Reply

  7. […] the legacy “frequency” and “height balanced” ones [posts by Jonathan Lewis: part 1, part 2, part 3 (with a very clear example of hybrid […]

    Pingback by DB Oriented — January 4, 2014 @ 2:19 pm GMT Jan 4,2014 | Reply

  8. […] statistics with particular reference to the trade-offs when creating histograms. With Oracle 12c it’s now reasonably safe (as far as I’m concerned) to set a method_opt as a table preference that identifies columns where you expect to see Frequency […]

    Pingback by Column Stats | Oracle Scratchpad — January 18, 2018 @ 2:22 pm GMT Jan 18,2018 | Reply

  9. […] the default histogram on trunc(dt_tm) would be a height-balanced histogram while we could create a frequency histogram in 12c since 12c allows us to specify up to 2,048 […]

    Pingback by Cardinality Puzzle | Oracle Scratchpad — July 12, 2018 @ 12:58 pm BST Jul 12,2018 | Reply

  10. […] 12c, of course, you ought to be taking advantage of the “approximate NDV” implementation for using a 100% sample to generate frequency (and Top-N / Top-Frequency histograms). If […]

    Pingback by Extended Histograms | Oracle Scratchpad — August 1, 2018 @ 3:02 pm BST Aug 1,2018 | Reply

  11. […] Overview of histogram enhancements in 12c (July 2013) […]

    Pingback by Histogram catalogue | Oracle Scratchpad — February 25, 2022 @ 10:03 am GMT Feb 25,2022 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.