Oracle Scratchpad

May 1, 2014

Delete Histogram

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 1:27 pm BST May 1,2014

Here’s a note which I drafted in November 201then 0, and didn’t publish. I found it earlier on this morning while looking for another note I’d written about histograms so, even though it may not be something that people need so much these days, I thought: better late than never.

I’ve pointed out in the past that I’m not keen on seeing lots of histograms on a system and tend to delete them if I think they are not needed. Here’s an example of the type of code I use to delete a histogram.

declare

	srec			dbms_stats.statrec;
	m_distcnt		number;
	m_density		number;
	m_nullcnt		number;
	m_avgclen		number;

	n_array                 dbms_stats.numarray;

begin

	dbms_stats.get_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'n1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	); 

	srec.bkvals := null;
	srec.novals :=	dbms_stats.numarray(
				utl_raw.cast_to_number(srec.minval),
				utl_raw.cast_to_number(srec.maxval)
			);
	srec.epc := 2;
	dbms_stats.prepare_column_values(srec, srec.novals);

	m_density := 1/m_distcnt;

	dbms_stats.set_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'n1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	); 

exception
	when others then
		raise;		-- should handle div/0

end;
/

The code basically reads the column stats, resets the histogram figures to just the low and high values for the column, setting the endpoint-count to two, then adjusts the density to the standard for a column with no histogram. This specific example is for a numeric column.

Footnote: my preferred method of collecting statistics is to use method_opt => ‘for all columns size 1′ (i.e. no histograms) and then run scripts to create the histograms I want. This means that after any stats collection I need to run code that checks to see which tables have new stats, and then re-run any histogram code that I’ve written for that table.

To move from Oracle’s default histogram collection to this strategy, you could start by switching to method_opt => ‘for all columns size repeat’ (i.e. recreate existing histograms, don’t create new ones), then simply delete histograms as you find that you don’t need them, and introduce scripts to recreate the histograms that you do need. When you’ve finally got to the point where every histogram is scripted you can then switch to method_opt => ‘for all columns size 1′.

 Footnote 2: Since 2010 when I drafted this note Oracle 12c has launched, and the changes it has introduced for frequency and Top-N histograms means that I’m far less stringent in my demand that if a histogram is worth having it’s better to write code to create it. There’s a series of three articles about 12c histograms in particular at this link.

10 Comments »

  1. On 11g, DBMS_STATS. DELETE_COLUMN_STATS includes parameter col_stat_type, which allows to delete just the HISTOGRAM of a column, or ALL stats

    Comment by Carlos Sierra — May 1, 2014 @ 1:49 pm BST May 1,2014 | Reply

    • Carlos,

      Thanks for that.
      I guess that may be why I never got around to publishing the note originally – I had forgotten all about the easy option.
      Still – there are a few people around using 10g and (much) older versions ;)

      Comment by Jonathan Lewis — May 1, 2014 @ 3:44 pm BST May 1,2014 | Reply

      • Jonathan,
        no worries, i remembered that one because some time ago, i had just incorporated to SQLT that functionality (delete histograms) when I learned from Maria that 11g provided the same!… Your script is very useful for 10g or older. Thanks for sharing!

        Comment by Carlos Sierra — May 1, 2014 @ 3:59 pm BST May 1,2014 | Reply

  2. Hi Jonathan,
    a few month ago i hit Oracle bug #11786774 (MOS ID #11786774.8 – Bug 11786774 invalid histogram created by set_column_stats) at client site, which creates an invalid histogram even if only two endpoint values (and no buckets) are specified – it was on 10.2.0.5. I noticed this bug by cloning and adjusting (local) partition statistics which results in subsequent copy issues, when the number of buckets were not explicitly specified / reset every time. Unfortunately there is no (official, not hacking) work around for getting the rid of (invalid) histograms then.

    Regards
    Stefan

    Comment by Stefan Koehler — May 1, 2014 @ 5:26 pm BST May 1,2014 | Reply

    • Stefan,

      Thanks for the note.
      According to the bug description the problem appears when set_column_stats is called with a single value; obviously this shouldn’t be allowed to cause a problem, but any code that calls it is inherently wrong anyway. There’s a patch available for 10.2.0.5 for anyone on extended support.

      Interestingly I found another bug (13697529) when searching for 11786774 (which is considered to be the base bug) complaining that a call to prepare_column_stats() results in:

      ERROR at line 1:
      ORA-20001: Invalid or inconsistent input values
      ORA-6512: at "SYS.DBMS_STATS", line 5012
      ORA-6512: at line 29

      The bug is also against 10.2.0.5 and has been dormant for a couple of years – but looking at the information offered (and the sample code) the problem is that the programmer didn’t realise you had to set srec.epc (endpoint counter) to match the number of elements in the array.

      Comment by Jonathan Lewis — May 1, 2014 @ 5:40 pm BST May 1,2014 | Reply

      • Hi Jonathan,
        i used set_column_stats with the whole parameter set and not just one single value. I used the following PL/SQL code and it created that invalid (frequency) histogram (in this specific case i needed the manual adjustment of the low/high value due to Oracle bug #14607573 – dbms_stats.copy_table_stats does not adjust min/max correctly when NDV=1 on first partition column).

        ——————8<————–

        DECLARE  
            va_srec     DBMS_STATS.STATREC;  
            va_distinct    NUMBER;  
            va_density    NUMBER;  
            va_nullcnt    NUMBER;  
            va_avglen    NUMBER;  
            va_numvals    DBMS_STATS.NUMARRAY;  
        BEGIN  
            DBMS_STATS.DELETE_TABLE_STATS(USER, 'PARTTEST','P4_L5');  
            DBMS_STATS.COPY_TABLE_STATS(USER, 'PARTTEST', 'P3_L4', 'P4_L5');  
            DBMS_STATS.GET_COLUMN_STATS(USER,'PARTTEST','QID','P3_L4',NULL,NULL,va_distinct,  
                                            va_density,va_nullcnt,va_srec,va_avglen);  
            va_numvals := DBMS_STATS.NUMARRAY(4,4);  
            va_srec.epc := 2;  
            va_srec.bkvals := NULL;  
            DBMS_STATS.PREPARE_COLUMN_VALUES (va_srec, va_numvals);  
            DBMS_STATS.SET_COLUMN_STATS(USER,'PARTTEST','QID','P4_L5',NULL,NULL,va_distinct,  
                                            va_density,va_nullcnt,va_srec,va_avglen);  
            commit;  
        END;  
        / 
        

        ——————8<————–

        If Oracle creates that invalid histogram and you try to copy the (local) partition statistics from that "target partition" (in my case P4_L5) without explicitly setting va_srec.epc and va_srec.bkvals once again – you will receive that "ORA-20001: Invalid or inconsistent input values" by executing prepare_column_stats.

        Regards
        Stefan

        P.S.: If you are interested into the whole story – i have written a blog about the bug-to-bug hopping here: http://tinyurl.com/p3qxfch

        Comment by Stefan Koehler — May 1, 2014 @ 7:01 pm BST May 1,2014 | Reply

        • Stefan,
          Thanks for the link – it’s a useful reference.
          Have you considered faking it even further — even less official.
          I ran through the steps in you blog and then checked histgrm$ and hist_head$ to see if they were consistent, and they weren’t (part of an Oracle bug). The new partition had a row in histgrm$ when none of the older partitions did.

          If you bypass the prepare completely you can get a match. In the past I’ve done something like this:

          va_srec.epc := 2;
          va_srec.eavs := 0;
          va_srec.novals := DBMS_STATS.NUMARRAY(4,4);
          va_srec.bkvals := DBMS_STATS.NUMARRAY(0,1);

          va_srec.minval := utl_raw.cast_from_number(4);
          va_srec.maxval := utl_raw.cast_from_number(4);

          I don’t know if you have noticed, by the way, but I put a link to your blog in my blog roll a couple of days ago.

          Comment by Jonathan Lewis — May 1, 2014 @ 8:37 pm BST May 1,2014

        • Hi Jonathan,
          yes, i also thought about hacking and fixing that invalid (frequency) histogram information under the hood, but the client wanted it as official as possible (extended support). The cost based optimizer itself got no issues with that invalid histogram information in that application SQL context and so we decided to use the repeating “va_srec.epc” / “va_srec.bkvals” approach. I was not aware, that i can by-pass the prepare call with utl_raw – thank you very much for that hint.

          Thank you for adding me to your blog roll – what a honor to be mentioned on your site beside all the other well known Oracle guys :-))

          Regards
          Stefan

          Comment by Stefan Koehler — May 2, 2014 @ 10:17 am BST May 2,2014

  3. […] the recent article on deleting histograms this is another draft that I rediscovered while searching for some notes I had written on a […]

    Pingback by Extended stats | Oracle Scratchpad — May 4, 2014 @ 1:24 pm BST May 4,2014 | Reply

  4. Hi Jonathan,

    If I’m not mistaken the approach described above does not work perfectly for sub-partitioned tables. Please correct me if I’m wrong – I’ve been relying on statement that stats exported from table with no histograms should be equal to stats exported from the same table after histogram(s) had been added and removed afterwards.

    -- Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit
    -- 1. create and populate table with range-hash composite partition 
    CREATE TABLE SUBPART_TABLE  
    (
    owner                 varchar2(30), 
    object_name       varchar2(30), 
    subobject_name varchar2(30), 
    object_id             number, 
    data_object_id    number, 
    object_type         varchar2(19), 
    created               date, 
    last_ddl_time      date, 
    timestamp           varchar2(19), 
    status                  varchar2(7), 
    temporary           varchar2(1), 
    generated           varchar2(1), 
    secondary          varchar2(1)
    )
    partition by range(created)
    subpartition by hash (object_id)
    subpartition template(  subpartition sp1,  subpartition sp2,  subpartition sp3,  subpartition sp4) 
    (
    partition objects_2013 values less than(to_date('01/01/2014','mm/dd/yyyy')),
    partition objects_2014 values less than(to_date('01/01/2015','mm/dd/yyyy'))
    );
    
    insert into SUBPART_TABLE 
    select owner, object_name, subobject_name, object_id, data_object_id, object_type, 
               sysdate - rownum, 
               last_ddl_time, timestamp, status, temporary, generated, secondary
       from all_objects 
    where rownum  100, 
                  method_opt =&gt; 'FOR ALL COLUMNS SIZE 1');
       dbms_stats.export_table_stats(user, 'SUBPART_TABLE', stattab =&gt; 'STATS_TABLE_SIZE1');
    end; 
    
    -- 3.  Delete previous and gather stats again with histogram on OBJECT_TYPE column
    begin
       dbms_stats.delete_table_stats(user, 'SUBPART_TABLE'); -- just to be sure
       dbms_stats.gather_table_stats(user, 'SUBPART_TABLE', estimate_percent =&gt; 100, 
                       method_opt =&gt; 'FOR ALL COLUMNS SIZE 1 FOR COLUMNS SIZE 254 OBJECT_TYPE');
    end;
    
    -- 4. Check histogram 
    select count(*) from user_tab_columns utc where utc.table_name = 'SUBPART_TABLE' and utc.histogram  'NONE';
    
    count(*)
    ----------
    1 
    
    -- 5. Run "delete histogram" routine described above
    ......
    
    -- 6. Export stats again to another table
    begin
       dbms_stats.create_stat_table(user, 'STATS_TABLE_AFTER_HSGM_REMOVAL');
       dbms_stats.export_table_stats(user, 'SUBPART_TABLE', stattab =&gt; 'STATS_TABLE_AFTER_HSGM_REMOVAL');
    end;
    
    -- 7. Compare stats when method_opt was set to ‘FOR ALL COLUMNS SIZE 1’ and after histogram removal
    _tab_columns disctionary shows no histograms in both cases but there is a huge difference in STATS_TABLE_SIZE1 and STATS_TABLE_AFTER_HSGM_REMOVAL tables (number of rows is not the same, etc.). 
    
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    
    

    I’ve been using following routine to remove histograms in 10g:

    -- 1. Export stats from the source table 
    begin
       dbms_stats.create_stat_table(user, 'HSTRGM$STATS');
       dbms_stats.export_table_stats(user, 'SUBPART_TABLE', stattab =&gt; 'HSTRGM$STATS');
    end; 
    
    -- 2. Remove records with histogram info 
    declare
          
       p_table varchar2(30) := 'SUBPART_TABLE';
       p_owner varchar2(30) := USER;
    
       type tstatstype is table of hstrgm$stats%rowtype;
       v_stats_row      tStatsType;
       v_sysdate        constant date := sysdate;
       type THelpV2Type is table of varchar2(30);
       v_c2_list THelpV2Type          := THelpV2Type();
       v_c3_list THelpV2Type          := THelpV2Type();
          
       cursor c_get_hstgrm_data(p_owner_name  varchar2
                              , p_table_name  varchar2                          
                              , p_column_name varchar2
       )
       is
          select max(statid)
                ,max(s.type)
                ,max(s.version)
                ,2 
                ,c1,c2,c3,c4,c5
                ,max(n1)
                ,case when nvl(max(n1), 0) = 0 then 0 else trunc(1 / max(n1), 18) end 
                ,max(n3) 
                ,max(n4) 
                ,max(n5) 
                ,max(n6) 
                ,max(n7) 
                ,max(n8) 
                ,null 
                ,null 
                ,null 
                ,max(n12) 
                ,v_sysdate
                ,max(r1) 
                ,max(r2) 
                ,null
                /* ,null  - uncomment in case of 11g */
            from hstrgm$stats s
           where s.c1 = p_table_name
             and s.c4 = p_column_name
             and s.c5 = p_owner_name
             and s.type = 'C'
           group by s.c1, s.c2, s.c3, s.c4, s.c5;
          
    begin
          
       for iHistgmColumn in (select * 
                                             from user_tab_columns u 
                                          where u.table_name = p_table
                                              and u.histogram  'NONE'                                                 
       )
       loop
          open c_get_hstgrm_data(p_owner_name   =&gt; p_owner
                                                 , p_table_name     =&gt; p_table
                                                 , p_column_name =&gt; iHistgmColumn.Column_Name);
          loop
             fetch c_get_hstgrm_data
             bulk collect into v_stats_row limit 1000;
             exit when v_stats_row.count = 0;   
         
             /* begin 10g workaround (not the best) */
             v_c2_list.delete;
             v_c2_list.extend(v_stats_row.count);
             v_c3_list.delete;
             v_c3_list.extend(v_stats_row.count);
                
             for i in 1 .. v_stats_row.count loop
                v_c2_list(i) := v_stats_row(i).c2;         
                v_c3_list(i) := v_stats_row(i).c3;                        
             end loop;
             /* end 10g workaround */            
    
             forall j in 1..v_stats_row.count 
                delete from hstrgm$stats s 
                 where s.c1   = p_table
                   and decode(s.c2, v_c2_list(j), 1) = 1
                   and decode(s.c3, v_c3_list(j), 1) = 1
                   and s.c4   = iHistgmColumn.Column_Name
                   and s.c5   = p_owner
                   and s.type = 'C';
    
              forall j in 1..v_stats_row.count 
                insert into hstrgm$stats s values v_stats_row(j);                  
                   
          end loop;
          close c_get_hstgrm_data; 
       end loop;
                
    end;  
    
    
    -- 3. Import updated stats to the source table
    begin
       dbms_stats.import_table_stats(user, 'SUBPART_TABLE', stattab =&gt; 'HSTRGM$STATS');
    end;
    
    

    This routine could be also used in 11g. It is little bit faster for sub-partitioned tables than removing histograms using DELETE_COLUMN_STATS (with col_stat_type parameter)

    Regards,
    Yuriy

    Comment by Yuriy — June 18, 2014 @ 1:10 pm BST Jun 18,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,877 other followers