Oracle Scratchpad

July 14, 2013

12c histograms

Filed under: 12c,Histograms,Oracle,Statistics — Jonathan Lewis @ 7:11 pm BST Jul 14,2013

There are a few enhancements in 12c that might make a big difference to performance for a small investment in effort. One of the important enhancements comes from changes in histograms – which improve speed of collection with accuracy of results. The changes are so significant that I chose the topic as my presentation at OpenWorld last year.

In 11g Oracle gave us the option for using an “approximate NDV (number of distinct values)” for rapid an accurate collection of basic column stats. In 12c Oracle extends the approximate NDV mechanism to frequency histograms and a new type of frequency histogram called a “Top-N histogram” – at the same time the maximum number of buckets allowed for a histogram has gone up from 254 to 2000 – although the default stays at 254 and you probably don’t need to increase it in most cases (remember the side effect you might have on the sysaux tablespace as Oracle retains the historical stats).

Height-balanced histograms are still relatively expensive to collect – but there is a newer type of height-balanced histogram named the hybrid histogram which gives you better information than the old height-balanced for the same amount of work, and thanks to the “Top-N” histogram you may find that some of your old (expensive, inaccurate) height-balanced histograms are replaced by cheap, accurate Top-N histograms.

In part 2 of this mini-series I’ll describe the new mechanism for the frequency histogram and the logic of the Top-N histogram, and in part 3 I’ll describe the mechanism and demonstrate the benefits of the Hybrid histogram. The takeaway from this note, though, is that you need to look at your current stats collection, and think about how a small change in strategy could both reduce the workload and improve the accuracy of your histograms.

9 Comments »

  1. Jonathan,

    I thought you were generally against using histograms – you were/are recommending on multiple occasions to turn them off (~ set size to 1).
    Did anything about 12c change your mind in this matter?

    Comment by Josef — July 15, 2013 @ 9:44 am BST Jul 15,2013 | Reply

    • Josef,

      I’m still very careful about (and generally against) using histograms.

      Key points:
      * You can collect a (frequency – in particular) histogram at the wrong time
      * A (height-balanced) histogram can vary its contents from collection to collection even when you collect on the same data at the same time of day – just because of the luck of random selection when sampling.

      With 12c (and suitable configuration – to be written up):
      * a frequency histogram is “free of charge” while being 100% accurate
      * a “top-N” histogram is “free of charge” and cab be very accurate when there are very many values but most of the data comes from a relatively small number of values
      * a “hybrid” histogram gives much more information, stability, and accuracy at the same cost as the equivalent height-based.

      So if you’re letting Oracle gather histograms now (and not having problems) then 12c will probably gather them faster and better.

      It’s still possible, though, to gather them at the wrong time of day, and it will still be cheaper and safer to write your own code to fake them in some (probably fewer) special cases.

      Comment by Jonathan Lewis — July 24, 2013 @ 12:39 pm BST Jul 24,2013 | Reply

  2. there are some interesting bugs that are found in 12cR1, i know about one which is follows:

    Bug 16933305
    Adding a nullable column with a default value and then later unsetting the default value in an Oracle Database 12c Release 1 (12.1) environment will not unset the default (back to NULL) despite the data dictionary reflecting the change to a default value of NULL. The statements that are impacted are ALTER TABLE x ADD (y NUMBER DEFAULT 99) followed by a subsequent ALTER TABLE x MODIFY (y DEFAULT NULL) where y is a nullable column.
    Workaround: Unset the default by using the ALTER TABLE x MODIFY (y DEFAULT TRIM(”)) statement which has the same semantic result as unsetting the default.

    Comment by OracleCommunity — July 19, 2013 @ 10:49 am BST Jul 19,2013 | Reply

    • Thanks for the comment.
      I think the guideline here is to test new features carefully – and make sure that one of the test is: “now what if I change my mind”.
      The Oak Table mailing list has recently had an interesting chat about the effect of: “make column invisible, oops not that one the other one”, and what could happen to an application when make the column visible again. (Answer – if your code doesn’t always identify columns by name, then maybe crash the application until you fiddle things back into the right order.)

      I’ve cross-linked your comment into a posting on “12 Debug” which references Oracle’s list of known 12c bugs: http://jonathanlewis.wordpress.com/2013/06/28/12c-debug/

      Comment by Jonathan Lewis — July 24, 2013 @ 12:49 pm BST Jul 24,2013 | Reply

  3. […] around to writing part 3 of this mini-series on what Oracle has done about histograms in 12c. In part 1 I gave a thumbnail sketch of the three types of histogram available in 12c In part 2 I described in […]

    Pingback by 12c Histograms pt.3 | Oracle Scratchpad — October 9, 2013 @ 8:14 pm BST Oct 9,2013 | Reply

  4. Jonathan,
    “at the same time the maximum number of buckets allowed for a histogram has gone up from 254 to 2000″

    in 12c it seems that the maximum number of buckets is of 2,048 …

    SQL> select banner from v$version;

    BANNER
    ——————————————————————————–
    Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 – 64bit Production
    PL/SQL Release 12.1.0.1.0 – Production
    CORE 12.1.0.1.0 Production
    TNS for Linux: Version 12.1.0.1.0 – Production
    NLSRTL Version 12.1.0.1.0 – Production

    SQL> exec dbms_stats.gather_table_stats(user, ‘Z’, method_opt=> ‘for columns c size 2048′, estimate_percent=>100);

    PL/SQL procedure successfully completed.

    [
    COLUMN_NAM HISTOGRAM NUM_BUCKETS
    ---------- --------------- -----------
    C FREQUENCY 2048
    ]

    I also believe that there is some problem (or something that I haven’t yet clear)
    using DBMS_STATS with the limit values​​:

    SQL> exec dbms_stats.gather_table_stats(user, ‘Z’, method_opt=> ‘for columns c size 2048′);
    BEGIN dbms_stats.gather_table_stats(user, ‘Z’, method_opt=> ‘for columns c size 2048′); END;

    *
    ERROR at line 1:
    ORA-20010: DBMS_STATS INTERNAL ERROR in process_topn_values : parse count != given count
    ORA-06512: at “SYS.DBMS_STATS”, line 33859
    ORA-06512: at line 1

    SQL> exec dbms_stats.set_param(‘APPROXIMATE_NDV’,’FALSE’);

    PL/SQL procedure successfully completed.

    SQL> exec dbms_stats.gather_table_stats(user, ‘Z’, method_opt=> ‘for columns c size 2048′);

    PL/SQL procedure successfully completed.

    [
    COLUMN_NAM HISTOGRAM NUM_BUCKETS
    ---------- --------------- -----------
    C FREQUENCY 2047
    ]

    Regards,
    Donatello

    Comment by dsettembrino — October 15, 2013 @ 7:00 pm BST Oct 15,2013 | Reply

  5. Donatello,

    Thanks for the correction and warning – not sure why I wrote 2,000, but it’s the sort of error that I would keep repeating once I’d made it once. (Good job I don’t think you should rarely need to change the default,)

    Since you get an error when you disable TOP-N, and given the content of the error message, and since 2048 really is a valid number of buckets I’ll take a guess that it’s something to do with Oracle’s handling for a TOP-N histogram when the low or high values for the column aren’t part of the initial capture of popular values. There’s a query that might help the investigation in part 2 of the series on histograms that I wrote for AllThingsOracle: http://allthingsoracle.com/histograms-pt-2/

    Comment by Jonathan Lewis — October 15, 2013 @ 7:28 pm BST Oct 15,2013 | Reply

  6. […] Part 1 – Overview of histogram enhancements in 12c […]

    Pingback by Histograms | Oracle Scratchpad — October 16, 2013 @ 5:22 pm BST Oct 16,2013 | Reply

  7. […] the legacy “frequency” and “height balanced” ones [posts by Jonathan Lewis: part 1, part 2, part 3 (with a very clear example of hybrid […]

    Pingback by DB Oriented — January 4, 2014 @ 2:19 pm BST Jan 4,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,014 other followers