Oracle Scratchpad

December 2, 2016

Histogram Upgrade

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 3:02 pm GMT Dec 2,2016

I’ve written notes in the past about the improvements 12c introduces for histograms – particularly the frequency and top-N histograms which can be generated “free of charge” while the basic “approximate NDV” scans are taking place to gather stats. Gathering histograms in 12c is much safer than it used to be in earlier versions of Oracle even in the case of the new hybrid histograms (which are still sampled on a very small sample and therefore still a stability risk).

There is a threat, though, recently highlighted by Franck Pachot, that sneaked in at 11.2.0.4 – the way in which the calculation of endpoint values for histograms on char() and nchar() columns has changed. When you upgrade from anything prior to 11.2.0.4 to either 11.2.0.4 or 12c you need to recreate any histograms on those columns; note that this does not apply to varchar2() and nvarchar2() columns, just the fixed length character types. If you fail to do this then you may find that Oracle produces some very silly estimates of cardinality, which could result in some very inefficient tablescans – in particular you are likely to find (as Franck showed) cases where you “know” that a particular value is IN the histogram but the optimizer behaves as if it isn’t – which means it uses the “half the least popular” estimate for the cardinality.

Here’s a little demo to show the underlying difference:


rem
rem     Script:         histogram_change_11204.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2016
rem     Purpose:
rem

create table t1 (v1 varchar2(32), c1 char(32));
insert into t1
select
        case when rownum <= 100 then 'N' else 'Y' end,
        case when rownum <= 100 then 'N' else 'Y' end
from
        all_objects
where
        rownum <= 1000
;
begin
        dbms_stats.gather_table_stats( ownname => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 10'
        );
end;
/

column endpoint_value format 999,999,999,999,999,999,999,999,999,999,999,999
break on column_name skip 1

select
        column_name, endpoint_number, endpoint_value, to_char(endpoint_value,'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
from
        user_tab_histograms
where
        table_name = 'T1'
order by
        column_name,
        endpoint_number
;

Here are the results from an instance of 11.1.0.7 (though anything up to 11.2.0.3 should produce the same), and 11.2.0.4 (and later – including 12.2):

Results 11.1.0.7
----------------
COLUMN_NAME          ENDPOINT_NUMBER                                   ENDPOINT_VALUE TO_CHAR(ENDPOINT_VALUE,'XXXXXXXXX
-------------------- --------------- ------------------------------------------------ ---------------------------------
C1                               100  405,650,737,316,592,000,000,000,000,000,000,000    4e20202020203a7bb119d5f6000000
                                1000  462,766,002,760,475,000,000,000,000,000,000,000    59202020202034d998ff0b5ae00000

V1                               100  404,999,154,965,717,000,000,000,000,000,000,000    4e0000000000181f436c7bbb200000
                                1000  462,114,420,409,600,000,000,000,000,000,000,000    590000000000127d2b51b120000000

Results 11.2.0.4
----------------
COLUMN_NAME          ENDPOINT_NUMBER                                   ENDPOINT_VALUE TO_CHAR(ENDPOINT_VALUE,'XXXXXXXXX
-------------------- --------------- ------------------------------------------------ ---------------------------------
C1                               100  404,999,154,965,717,000,000,000,000,000,000,000    4e0000000000181f436c7bbb200000
                                1000  462,114,420,409,600,000,000,000,000,000,000,000    590000000000127d2b51b120000000

V1                               100  404,999,154,965,717,000,000,000,000,000,000,000    4e0000000000181f436c7bbb200000
                                1000  462,114,420,409,600,000,000,000,000,000,000,000    590000000000127d2b51b120000000

Look particularly at the first 6 bytes of the Hex version of the endpoint values for the char() column c1. In 11.1.0.7 you see “4e2020202020”, “592020202020” – that’s ASCII ‘N’ and ‘Y’ respectively, padded to 6 characters with spaces. In 11.2.0.4 the spaces have disappeared – the char() columns are now padded to 6 characters with zeros (which is how varchar2() columns have always been treated).
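As a rough cross-check of the padding, the two stored values for ‘N’ can be rebuilt by hand. This is only a sketch, and it rests on an assumption that isn’t spelled out in this note: that Oracle treats the first 15 bytes of the string as a hexadecimal number and rounds the result to 15 significant digits (which, for a 36 digit number, means rounding at the 21st digit to the left of the decimal point). The column aliases are just for illustration.

```sql
column space_padded format 999,999,999,999,999,999,999,999,999,999,999,999
column zero_padded  format 999,999,999,999,999,999,999,999,999,999,999,999

select
        -- 'N' (0x4e) followed by 14 spaces (0x20): the pre-11.2.0.4 char() treatment
        round(
                to_number('4e' || rpad('20', 28, '20'), rpad('x', 30, 'x')),
                -21
        )       space_padded,
        -- 'N' (0x4e) followed by 14 zero bytes: the varchar2() treatment
        round(
                to_number('4e' || rpad('00', 28, '00'), rpad('x', 30, 'x')),
                -21
        )       zero_padded
from
        dual
;
```

If the assumption holds, the two columns should line up with the endpoint values reported for ‘N’ in the c1 rows of the 11.1.0.7 and 11.2.0.4 listings respectively.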

In 11.1.0.7 the optimizer will find a histogram entry for c1 = ‘Y’ and produce a cardinality of 900; if you upgrade the database to 11.2.0.4 without recreating the histograms the optimizer won’t find a histogram entry for the predicate and will produce a cardinality of 50 (i.e. 100 / 2).
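A minimal sketch of how you might list the columns that need attention after such an upgrade: it simply looks for char() and nchar() columns that currently have a histogram. It needs access to the dba_ views, and you may want to add your own filter on owner.

```sql
select
        owner, table_name, column_name,
        data_type, histogram, last_analyzed
from
        dba_tab_cols
where
        data_type in ('CHAR', 'NCHAR')          -- fixed length character types only
and     histogram <> 'NONE'                     -- column currently has a histogram
order by
        owner, table_name, column_name
;
```

Any column reported with a last_analyzed date earlier than the upgrade is a candidate for a fresh call to dbms_stats.gather_table_stats().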

Footnote

There’s a brief summary of the algorithm Oracle uses to generate values for character-based histograms at this URL.

 

October 24, 2016

Anniversary OICA

Filed under: CBO,Oracle,Performance,Statistics,Troubleshooting — Jonathan Lewis @ 1:00 pm GMT Oct 24,2016

Happy anniversary to me!

On this day 10 years ago I published the first article in my blog. It was about the parameter optimizer_index_cost_adj (hence OICA), a parameter that has been a source of many performance problems and baffled DBAs over the years and, if you read my first blog posting and follow the links, a parameter that should almost certainly be left untouched.

It seems appropriate to mention it today because I recently found a blog posting (dated 3rd May 2013) on the official Oracle Blogs where the director for Primavera advises setting this parameter to 1 (and the optimizer_index_caching parameter to 90) for the Primavera P6 OLTP (PMDB) database. The recommendation is followed by a fairly typical “don’t blame me” warning, viz: “As with any changes that affect query optimization, it is paramount to TEST, TEST and TEST again. At least these settings are easily adjusted or change back to the original value”.

Here’s a thought, though: setting the optimizer_index_cost_adj to the extreme value 1 is a catastrophic change so don’t suggest it unless you are extremely confident that it’s almost certain to be the right thing to do. If you’re confident that it’s a good idea to reduce the parameter to a much smaller value than the default then suggest a range of values that varies from “ideal if it works, but high risk” to “low risk and mostly helpful”. Maybe a suggestion like: “Primavera P6 OLTP (PMDB) tends to work best with this parameter set to a value in the range of 1 to 15” would be a more appropriate comment from someone in a position of authority.

Here’s another thought: if you work for Oracle you could always contact the optimizer group to present them with your argument for the strategy and see what they think about it. Then you can include their opinion when you offer your suggestion.

For what it’s worth, here’s my opinion: as a general rule you shouldn’t be working around performance issues by fiddling with the optimizer_index_cost_adj; as a specific directive do not set it to 1. If you want to encourage Oracle to be enthusiastic about indexes in general then adjust the system statistics (preferably with a degree of truth). If you need to persuade Oracle that particular indexes are highly desirable then you can use dbms_stats.set_index_stats() to adjust the clustering_factor (and avg_data_blocks_per_key) of those indexes. If you are running 11.2.0.4 or later then you can use dbms_stats.set_table_prefs() to set the “table_cached_blocks” parameter for tables where you think Oracle should be particularly keen on using indexes but isn’t; and if your queries are suffering from bad cardinality estimates because of a pattern of multi-column filter predicates create some column group (extended) statistics.
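As a rough illustration of the last three suggestions, here is a sketch of the calls involved. The table T1, the index T1_I1, the columns col1 and col2, and the numeric values are all hypothetical; pick values that reflect what you know about your own data.

```sql
declare
        m_extension     varchar2(128);
begin
        -- Make one specific index look more desirable by
        -- setting a lower clustering_factor (hypothetical value).
        dbms_stats.set_index_stats(
                ownname         => user,
                indname         => 'T1_I1',
                clstfct         => 1000
        );

        -- 11.2.0.4 onwards: tell Oracle to allow for some caching history
        -- when it calculates the clustering_factor for indexes on this table.
        dbms_stats.set_table_prefs(
                ownname         => user,
                tabname         => 'T1',
                pname           => 'table_cached_blocks',
                pvalue          => '16'
        );

        -- Define a column group for a pair of columns that
        -- regularly appear together in filter predicates.
        m_extension := dbms_stats.create_extended_stats(
                ownname         => user,
                tabname         => 'T1',
                extension       => '(col1, col2)'
        );
end;
/
```

The table preference and the column group only take effect the next time the relevant stats are gathered, so a call to dbms_stats.gather_table_stats() would normally follow.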

Why am I so firmly set against setting the optimizer_index_cost_adj to 1 ? Because it doesn’t tell Oracle to “use indexes instead of doing tablescans”, it tells Oracle that every index is just about as good as every other index for almost any query. Here’s a pdf file of an article (formerly published on DBAZine and then on my old website) I wrote over twelve years ago explaining the issue. Various links in the article no longer work, and the data pattern was generated to display the problem in 8i and 9i and you would need to modify the data to display the same effect in newer versions of Oracle – but the principle remains the same.

If you would like to see a slightly newer example of how the parameter causes problems, here’s a thread dated April 2012 from the OTN database forum where a SYS-recursive query caused a performance problem because the parameter was set to 1.
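If you want to find out whether a system you have inherited is running with non-default values for these parameters, a simple check against v$parameter will do (the defaults are 100 for optimizer_index_cost_adj and 0 for optimizer_index_caching):

```sql
select
        name, value, isdefault
from
        v$parameter
where
        name in (
                'optimizer_index_cost_adj',
                'optimizer_index_caching'
        )
order by
        name
;
```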

 

October 17, 2016

Fixed Stats

Filed under: Execution plans,Oracle,Statistics,Statspack,Troubleshooting — Jonathan Lewis @ 12:43 pm GMT Oct 17,2016

There are quite a lot of systems around the world that aren’t using the AWR (automatic workload repository) and ASH (active session history) tools to help them with troubleshooting because of the licensing requirement, so I’m still finding plenty of sites that are using Statspack. I recently came across a little oddity at one of these sites that I hadn’t noticed before: one of the Statspack snapshot statements was appearing fairly regularly in the Statspack report under the “SQL Ordered by Elapsed Time” section – even when the application had been rather busy and had generated lots of other work that was being reported. It was the following statement – the collection of file-level statistics:


select
       ts.name      tsname
     , df.name      filename
     , fs.phyrds
     , fs.phywrts
     , fs.readtim
     , fs.writetim
     , fs.singleblkrds
     , fs.phyblkrd
     , fs.phyblkwrt
     , fs.singleblkrdtim
     , fw.count     wait_count
     , fw.time      time
     , df.file#
  from x$kcbfwait   fw
     , v$filestat   fs
     , v$tablespace ts
     , v$datafile   df
 where ts.ts#    = df.ts#
   and fs.file#  = df.file#
   and fw.indx+1 = df.file#
;

The execution plan didn’t look very friendly, and the volume of I/O it generated (several hundred thousand disk reads) was surprising. The reason why the statement stood out so much in this case was that there was a fairly large number of files in the database (over 1,000) and the default execution plan was showing very bad cardinality estimates that resulted in highly inappropriate cartesian merge joins. At best the statement was taking around 2 minutes to run, at worst it was much, much worse.

This system was running 10g – also something which is still fairly common, though becoming much scarcer – which produced the following execution plan (which I’ve recreated on a much smaller system):

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     1 |   535 |     2 (100)| 00:00:01 |
|*  1 |  HASH JOIN                    |                 |     1 |   535 |     2 (100)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN        |                 |     5 |  1190 |     0   (0)| 00:00:01 |
|   3 |    NESTED LOOPS               |                 |     1 |   199 |     0   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN      |                 |     1 |   173 |     0   (0)| 00:00:01 |
|*  5 |      FIXED TABLE FULL         | X$KCCTS         |     1 |    43 |     0   (0)| 00:00:01 |
|   6 |      BUFFER SORT              |                 |     1 |   130 |     0   (0)| 00:00:01 |
|*  7 |       FIXED TABLE FULL        | X$KCFIO         |     1 |   130 |     0   (0)| 00:00:01 |
|*  8 |     FIXED TABLE FIXED INDEX   | X$KCCFE (ind:1) |     1 |    26 |     0   (0)| 00:00:01 |
|   9 |    BUFFER SORT                |                 |   100 |  3900 |     0   (0)| 00:00:01 |
|  10 |     FIXED TABLE FULL          | X$KCBFWAIT      |   100 |  3900 |     0   (0)| 00:00:01 |
|* 11 |   VIEW                        | GV$DATAFILE     |     1 |   297 |     1 (100)| 00:00:01 |
|  12 |    SORT ORDER BY              |                 |     1 |   957 |     1 (100)| 00:00:01 |
|  13 |     NESTED LOOPS              |                 |     1 |   957 |     0   (0)| 00:00:01 |
|  14 |      NESTED LOOPS             |                 |     1 |   647 |     0   (0)| 00:00:01 |
|  15 |       NESTED LOOPS            |                 |     1 |   371 |     0   (0)| 00:00:01 |
|* 16 |        FIXED TABLE FULL       | X$KCCFN         |     1 |   323 |     0   (0)| 00:00:01 |
|* 17 |        FIXED TABLE FIXED INDEX| X$KCVFH (ind:1) |     1 |    48 |     0   (0)| 00:00:01 |
|* 18 |       FIXED TABLE FIXED INDEX | X$KCCFE (ind:1) |     1 |   276 |     0   (0)| 00:00:01 |
|* 19 |      FIXED TABLE FULL         | X$KCCFN         |     1 |   310 |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TSTSN"="TS#" AND "K"."KCFIOFNO"="FILE#" AND "FILE#"="FW"."INDX"+1)
   5 - filter("TSTSN"<>(-1) AND "INST_ID"=USERENV('INSTANCE'))
   7 - filter("K"."INST_ID"=USERENV('INSTANCE'))
   8 - filter("F"."FEDUP"<>0 AND "F"."FENUM"="K"."KCFIOFNO")
  11 - filter("INST_ID"=USERENV('INSTANCE'))
  16 - filter("FN"."FNNAM" IS NOT NULL AND "FN"."FNTYP"=4 AND BITAND("FN"."FNFLG",4)<>4)
  17 - filter("FN"."FNFNO"="FH"."HXFIL")
  18 - filter("FE"."FEDUP"<>0 AND "FN"."FNFNO"="FE"."FENUM" AND
              "FE"."FEFNH"="FN"."FNNUM" AND "FE"."FETSN"<>(-1))
  19 - filter("FE"."FEPAX"<>65535 AND "FE"."FEPAX"<>0 AND "FE"."FEPAX"="FNAUX"."FNNUM"
              OR ("FE"."FEPAX"=0 OR "FE"."FEPAX"=65535) AND "FE"."FENUM"="FNAUX"."FNFNO" AND
              "FNAUX"."FNTYP"=4 AND "FNAUX"."FNNAM" IS NOT NULL AND BITAND("FNAUX"."FNFLG",4)<>4 AND
              "FE"."FEFNH"="FNAUX"."FNNUM")

Note particularly the two Cartesian merge joins and the very late filter at operation 1.

Note also the number of times the cardinality estimate is 1 – always a bit of a threat when the query gets complicated: “anything goes following a one for Rows”.

The easy (first thought) solution was simply to gather stats on all the fixed objects in this query:


begin
        dbms_stats.gather_table_stats('sys','x$kcbfwait',method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kccfe',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kccfn',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kccts',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kcfio',   method_opt=>'for all columns size 1');
        dbms_stats.gather_table_stats('sys','x$kcvfh',   method_opt=>'for all columns size 1');
end;
/

The option to gather fixed objects stats individually with a call to dbms_stats.gather_table_stats() is not commonly known, but it does work.
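A quick way to see whether the fixed objects in this query already have stats is to look for them in dba_tab_statistics, where they are reported with an object_type of ‘FIXED TABLE’. This is only a sketch; the list of names is simply the set of X$ objects that appear in the plan above.

```sql
select
        table_name, num_rows, last_analyzed
from
        dba_tab_statistics
where
        object_type = 'FIXED TABLE'
and     table_name in (
                'X$KCBFWAIT', 'X$KCCFE', 'X$KCCFN',
                'X$KCCTS',    'X$KCFIO', 'X$KCVFH'
        )
order by
        table_name
;
```

A null last_analyzed means no stats have been gathered for that object. The blanket alternative to gathering object by object is a call to dbms_stats.gather_fixed_objects_stats(), which covers all the fixed objects in one go.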

Here’s the plan (again from the small system) after stats collection:

--------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                 |    29 | 10411 |     3 (100)| 00:00:01 |
|*  1 |  HASH JOIN                     |                 |    29 | 10411 |     3 (100)| 00:00:01 |
|*  2 |   HASH JOIN                    |                 |     1 |   350 |     2 (100)| 00:00:01 |
|   3 |    MERGE JOIN CARTESIAN        |                 |    25 |  1325 |     0   (0)| 00:00:01 |
|   4 |     NESTED LOOPS               |                 |     4 |   148 |     0   (0)| 00:00:01 |
|*  5 |      FIXED TABLE FULL          | X$KCFIO         |   200 |  6200 |     0   (0)| 00:00:01 |
|*  6 |      FIXED TABLE FIXED INDEX   | X$KCCFE (ind:1) |     1 |     6 |     0   (0)| 00:00:01 |
|   7 |     BUFFER SORT                |                 |     7 |   112 |     0   (0)| 00:00:01 |
|*  8 |      FIXED TABLE FULL          | X$KCCTS         |     7 |   112 |     0   (0)| 00:00:01 |
|*  9 |    VIEW                        | GV$DATAFILE     |     1 |   297 |     1 (100)| 00:00:01 |
|  10 |     SORT ORDER BY              |                 |     1 |   316 |     1 (100)| 00:00:01 |
|  11 |      NESTED LOOPS              |                 |     1 |   316 |     0   (0)| 00:00:01 |
|  12 |       NESTED LOOPS             |                 |     1 |   248 |     0   (0)| 00:00:01 |
|  13 |        NESTED LOOPS            |                 |     1 |   226 |     0   (0)| 00:00:01 |
|* 14 |         FIXED TABLE FULL       | X$KCCFE         |     4 |   612 |     0   (0)| 00:00:01 |
|* 15 |         FIXED TABLE FIXED INDEX| X$KCCFN (ind:1) |     1 |    73 |     0   (0)| 00:00:01 |
|* 16 |        FIXED TABLE FIXED INDEX | X$KCVFH (ind:1) |     1 |    22 |     0   (0)| 00:00:01 |
|* 17 |       FIXED TABLE FULL         | X$KCCFN         |     1 |    68 |     0   (0)| 00:00:01 |
|  18 |   FIXED TABLE FULL             | X$KCBFWAIT      |   400 |  3600 |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("FILE#"="FW"."INDX"+1)
   2 - access("TSTSN"="TS#" AND "K"."KCFIOFNO"="FILE#")
   5 - filter("K"."INST_ID"=USERENV('INSTANCE'))
   6 - filter("F"."FEDUP"<>0 AND "F"."FENUM"="K"."KCFIOFNO")
   8 - filter("TSTSN"<>(-1) AND "INST_ID"=USERENV('INSTANCE'))
   9 - filter("INST_ID"=USERENV('INSTANCE'))
  14 - filter("FE"."FEDUP"<>0 AND "FE"."FETSN"<>(-1))
  15 - filter("FN"."FNTYP"=4 AND "FN"."FNNAM" IS NOT NULL AND BITAND("FN"."FNFLG",4)<>4
              AND "FN"."FNFNO"="FE"."FENUM" AND "FE"."FEFNH"="FN"."FNNUM")
  16 - filter("FN"."FNFNO"="FH"."HXFIL")
  17 - filter("FE"."FEPAX"<>65535 AND "FE"."FEPAX"<>0 AND "FE"."FEPAX"="FNAUX"."FNNUM" OR
              ("FE"."FEPAX"=0 OR "FE"."FEPAX"=65535) AND "FE"."FENUM"="FNAUX"."FNFNO" AND
              "FNAUX"."FNTYP"=4 AND "FNAUX"."FNNAM" IS NOT NULL AND BITAND("FNAUX"."FNFLG",4)<>4 AND
              "FE"."FEFNH"="FNAUX"."FNNUM")

Note the changes in cardinality estimates: they now look a little more realistic and we’re down to one cartesian merge join which (if you have a rough idea of what your X$ tables hold) still looks a little surprising at first sight but not completely unreasonable. A change of plan doesn’t necessarily mean much without the data and time behind it, of course, so here are the two sets of results from a 10g database with a handful of datafiles and tablespaces showing the Row Source Operation sections from the tkprof output before and after stats collection:

Before stats collection:

Rows     Row Source Operation
-------  ---------------------------------------------------
      6  HASH JOIN  (cr=0 pr=0 pw=0 time=1957860 us)
  16800   MERGE JOIN CARTESIAN (cr=0 pr=0 pw=0 time=33855662 us)
     42    NESTED LOOPS  (cr=0 pr=0 pw=0 time=73795 us)
   1400     MERGE JOIN CARTESIAN (cr=0 pr=0 pw=0 time=21555 us)
      7      FIXED TABLE FULL X$KCCTS (cr=0 pr=0 pw=0 time=3204 us)
   1400      BUFFER SORT (cr=0 pr=0 pw=0 time=7233 us)
    200       FIXED TABLE FULL X$KCFIO (cr=0 pr=0 pw=0 time=1210 us)
     42     FIXED TABLE FIXED INDEX X$KCCFE (ind:1) (cr=0 pr=0 pw=0 time=1859092 us)
  16800    BUFFER SORT (cr=0 pr=0 pw=0 time=67643 us)
    400     FIXED TABLE FULL X$KCBFWAIT (cr=0 pr=0 pw=0 time=2008 us)
      6   VIEW  GV$DATAFILE (cr=0 pr=0 pw=0 time=68087 us)
      6    SORT ORDER BY (cr=0 pr=0 pw=0 time=68065 us)
      6     NESTED LOOPS  (cr=0 pr=0 pw=0 time=65989 us)
      6      NESTED LOOPS  (cr=0 pr=0 pw=0 time=56632 us)
      6       NESTED LOOPS  (cr=0 pr=0 pw=0 time=47217 us)
      6        FIXED TABLE FULL X$KCCFN (cr=0 pr=0 pw=0 time=19830 us)
      6        FIXED TABLE FIXED INDEX X$KCVFH (ind:1) (cr=0 pr=0 pw=0 time=25568 us)
      6       FIXED TABLE FIXED INDEX X$KCCFE (ind:1) (cr=0 pr=0 pw=0 time=9849 us)
      6      FIXED TABLE FULL X$KCCFN (cr=0 pr=0 pw=0 time=9715 us)

After stats collection:


Rows     Row Source Operation
-------  ---------------------------------------------------
      6  HASH JOIN  (cr=0 pr=0 pw=0 time=196576 us)
      6   HASH JOIN  (cr=0 pr=0 pw=0 time=195829 us)
     42    MERGE JOIN CARTESIAN (cr=0 pr=0 pw=0 time=4390 us)
      6     NESTED LOOPS  (cr=0 pr=0 pw=0 time=7810 us)
    200      FIXED TABLE FULL X$KCFIO (cr=0 pr=0 pw=0 time=1224 us)
      6      FIXED TABLE FIXED INDEX X$KCCFE (ind:1) (cr=0 pr=0 pw=0 time=150150 us)
     42     BUFFER SORT (cr=0 pr=0 pw=0 time=1574 us)
      7      FIXED TABLE FULL X$KCCTS (cr=0 pr=0 pw=0 time=1353 us)
      6    VIEW  GV$DATAFILE (cr=0 pr=0 pw=0 time=41058 us)
      6     SORT ORDER BY (cr=0 pr=0 pw=0 time=41005 us)
      6      NESTED LOOPS  (cr=0 pr=0 pw=0 time=39399 us)
      6       NESTED LOOPS  (cr=0 pr=0 pw=0 time=34229 us)
      6        NESTED LOOPS  (cr=0 pr=0 pw=0 time=15583 us)
      6         FIXED TABLE FULL X$KCCFE (cr=0 pr=0 pw=0 time=1124 us)
      6         FIXED TABLE FIXED INDEX X$KCCFN (ind:1) (cr=0 pr=0 pw=0 time=15067 us)
      6        FIXED TABLE FIXED INDEX X$KCVFH (ind:1) (cr=0 pr=0 pw=0 time=18971 us)
      6       FIXED TABLE FULL X$KCCFN (cr=0 pr=0 pw=0 time=5581 us)
    400   FIXED TABLE FULL X$KCBFWAIT (cr=0 pr=0 pw=0 time=1615 us)

The execution time has dropped from about 2 seconds to less than 1/10th of a second – and all I’ve got is 6 or 7 files and tablespaces.  (Some of the “time=” values in the first plan are very odd, but the final time figure is about right.)

Generating an intermediate result set of 16,800 rows for a tiny number of files is not a good thing – just imagine how big that number would get with 1,000 files and a couple of hundred tablespaces.

I have to say that (for a couple of tiny databases) the 11.2.0.4 and 12.1.0.2 systems I checked this query on had no problem and immediately picked a sensible path. It’s possible that the definition of some of the v$ objects has actually changed or that the optimizer features have changed (some assistance from complex view merging, perhaps) – but if you are still running Statspack, even if it’s on 11g or 12c, then it’s worth checking from time to time how much work goes into executing the snapshot and seeing if you need some fixed object stats to make things a little more efficient.

Footnote:

Several years ago I wrote a short note about how Statspack actually captured its own execution time (from 10g onwards) and how you could run a report on it to check the run time. It’s worth running that report from time to time. I’ve recently updated that note to add the equivalent query against the AWR.

September 26, 2016

PK Histogram

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 10:25 am GMT Sep 26,2016

One of the little myths of Oracle appeared on the Oracle-L list server a few days ago – the one that says: “you don’t need a histogram on a single column unique/primary key”.

Not only can a histogram be helpful on a column that’s declared to hold unique values, the optimizer may even spot the need automatically. It’s a little unusual (and probably the result of poor programming practice) but it does happen. Here’s an example demonstrating the principle:


rem
rem     Script:         pk_histogram.sql
rem     Author:         Jonathan Lewis
rem
rem     Last tested
rem             12.1.0.2
rem             11.2.0.4
rem             11.1.0.7
rem             10.2.0.5
rem

create table t1
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum  id
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4
;

insert into t1 select id + 1e6 from t1;
insert into t1 select id + 1e7 from t1;

alter table t1 add constraint t1_pk primary key(id);

select
        /*+ dynamic_sampling(0) */
        *
from    t1
where
        id between 12000 and 13000
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1'
        );
end;
/


select
        column_name, sample_size,
        num_distinct, num_nulls, density,
        histogram, num_buckets
from   
        user_tab_cols
where
        table_name = 'T1'
order by
        column_name
;

I’ve created a small data set with some large gaps in the ranges of values used, then queried the data with a range-based predicate that will return no rows. After that I’ve let Oracle do its own thing with gathering stats (I think all the preferences are at their defaults). This is the result I got:

COLUMN_NAME          SAMPLE_SIZE NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM       NUM_BUCKETS
-------------------- ----------- ------------ ---------- ---------- --------------- -----------
ID                          5555        40000          0 .000024752 HEIGHT BALANCED         254

Oracle will gather a histogram automatically if it can see that the distribution of the data you are querying is skewed; but people tend to think of “skewed” as meaning there is wide variation in the number of rows for a given value and forget that it can also mean a variation in the number of rows within a given size of range. All I’ve done with my example is cue Oracle to the fact that I have big gaps in the range of known values – so it has produced a histogram that lets it know where the gaps are.

Note: The results above are from 11.2.0.4. In 12.1.0.2 I got a HYBRID histogram; on 10.2.0.5 the sample size was 40,000.
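To see how the histogram describes the gaps you could list the bucket endpoints for the column together with the distance from the previous endpoint. A minimal sketch (the “gap” alias is mine), run against the t1 table from the script above:

```sql
select
        endpoint_number,
        endpoint_value,
        -- distance from the previous endpoint: very large values mark the gaps
        endpoint_value -
                lag(endpoint_value, 1) over (order by endpoint_number)  gap
from
        user_tab_histograms
where
        table_name  = 'T1'
and     column_name = 'ID'
order by
        endpoint_number
;
```

Most consecutive endpoints should be a fairly small distance apart, with a few very large jumps where the data set has no values.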

September 16, 2016

Stats collection time

Filed under: Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 12:58 pm GMT Sep 16,2016

Yesterday I posted a note about querying dba_optstat_operations to get a quick report of how long calls to dbms_stats had been taking but said I had another script that helped to fill some of the gaps it left. One of my readers pointed out fairly promptly that 12c enhances the feature considerably, with a view dba_optstat_operation_tasks that (for example) lists all the tables processed during a single call to gather_schema_stats.
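For anyone on 12c, a query along the following lines gives the per-object detail. It’s a sketch rather than a polished report, and the one-day window is an arbitrary choice of mine:

```sql
select
        opid,
        target,
        target_type,
        start_time,
        end_time,
        end_time - start_time   elapsed,        -- interval day to second
        status
from
        dba_optstat_operation_tasks
where
        start_time > systimestamp - interval '1' day
order by
        start_time
;
```

The opid column ties each task back to its parent row (the id column) in dba_optstat_operations.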

Well, I wrote my script years (if not decades) before 12c came out, so I’m going to publish it anyway.

It definitely falls into the quick and dirty category. It’s not trying to be clever, makes a critical assumption about the recent past, and isn’t a generally useful tool, but I’ve found it a quick way to highlight a possible problem with expensive stats collection routines (especially those written by 3rd parties who have disabled Oracle’s automatic job.)

The concept is very simple: list all the tables in the order they were last analyzed and assume that for any “obviously continuous” stretch of time the start of stats collections for one table coincided with the end of stats collection for the previous one. A little fiddling with the lag() function then allows you to report the time spent on gathering the stats.

rem
rem     stats_check.sql
rem
rem     Quick and dirty to get an idea of
rem     where the time went on the most
rem     recent stats collection job.
rem
rem     Weakness: It assumes that the start
rem     of stats collection of a table is the
rem     end of stats collection for the previous
rem     table.  This may be far from true.
rem

define m_schema = '&1'

set linesize 144
set pagesize 60
set trimspool on

column a_date format a20

spool stats_check

select
        table_name,
        blocks, num_rows, sample_size,
        round(100 * sample_size/nullif(num_rows,0),2)   sample_pct,
        to_char(last_analyzed,'dd-mon-yyyy hh24:mi:ss') a_date,
        to_char(
                round(1440 *
                        (
                        last_analyzed -
                        lag(last_analyzed, 1) over (order by last_analyzed)
                        ), 2
                ),'9,999.00'
        )                        a_time
from
        dba_tables
where
        owner = upper('&m_schema')
order by
        last_analyzed
;

spool off

It’s a very simple script – one detail that’s quite useful is the presence of the data about table size and sample size. But that’s really only the starting point. If you’re gathering stats on a table then most of the time might be spent on gathering stats for histograms or indexes, so there’s plenty of scope to add bits to the script to report number of histograms per table, number of indexes per table, and so on. Personally I tend to do “incremental” trouble-shooting and keep my scripts (and their run time) short, so I’d only worry about specific details if I saw a headline figure that seemed worth a little more effort.

Here’s some (camouflaged) output – which happens to highlight one of the weaknesses of the approach:

TABLE_NAME              BLOCKS   NUM_ROWS SAMPLE_SIZE SAMPLE_PCT A_DATE                  A_TIME
--------------------- --------  --------- ----------- ---------- --------------------  --------
TABLE_AAA                    0          0           0            14-jul-2016 01:27:55       .00
TABLE_BBB              6320666  540356865   108071373         20 14-jul-2016 02:30:45     62.83
TABLE_CCC              9587372  930770535   186154107         20 14-jul-2016 03:36:13     65.47
TABLE_DDD              2272319  104698080    20939616         20 14-jul-2016 04:19:45     43.53
TABLE_EEE              4413696  258746720    51749344         20 14-jul-2016 04:49:13     29.47
TABLE_FFF                    0          0           0            14-jul-2016 05:02:28     13.25
TABLE_GGG              4377656  258740995    51748199         20 14-jul-2016 05:02:28       .00
TABLE_HHH                    0          0           0            14-jul-2016 05:02:28       .00
TABLE_III                   60         52          52        100 14-jul-2016 05:02:29       .02
TABLE_JJJ                   60        170         170        100 14-jul-2016 05:02:30       .02
TABLE_KKK                   60        100         100        100 14-jul-2016 05:02:30       .00
TABLE_LLL                   60       5548        5021       90.5 14-jul-2016 05:02:31       .02
TABLE_MMM                 4660     686575      137315         20 14-jul-2016 05:02:32       .02
...
TABLE_TTT                    0          0           0            14-jul-2016 05:02:38       .00
TABLE_UUU                   60        659         659        100 14-jul-2016 05:02:38       .00
TABLE_VVV               325324   18618685     3723737         20 14-jul-2016 05:04:14      1.60
TABLE_WWW                   60       1106        1106        100 14-jul-2016 05:06:01      1.78
TABLE_XXX                24001    2534810      506962         20 14-jul-2016 05:06:13       .20
TABLE_YYY                    0          0           0            14-jul-2016 05:06:20       .12
TABLE_ZZZ                    0          0           0            14-jul-2016 05:06:20       .00

You’ll notice that the zero row FFF table is reported to have taken 13.25 minutes for stats gathering, while the GGG table (4.3M blocks, 258M rows) took no time at all. That’s the problem of sorting by the last_analyzed time when it’s accurate only to the second and some tables take less than a second to gather stats. Clearly these two time estimates are the wrong way round. (WWW and XXX look fairly suspect too, but their timestamps are different so there must be some other explanation – but the time is too small to worry about at the moment.)

Tables BBB and CCC are also an example of how variable the results are when you compare cost with time. Although table CCC (and its sample) had nearly twice as many rows as BBB, it took very little extra time because it has fewer histograms and indexes. The same variation shows up clearly between EEE and GGG (after allowing for the order error), and slightly less immediately between DDD and all the other large tables.

That brings me to one more threat of mis-interpretation (and a reason why I’m not terribly keen on publishing scripts that I’ve hacked together). If you see Oracle claiming (through this script) that it took no more than a couple of minutes to gather stats on a 5 billion row table it may be true – but maybe that stats collection was only related to gathering stats for a single, fairly new, partition in a partitioned table. There are, no doubt, several boundary cases that a simple script like this will have overlooked – and when I run it I’ll be at a client site with knowledge of the client database and I’ll spot the anomalies and omissions.

Update

One simple enhancement to the script above – indexes take time to analyze as well, and if you query only for tables the time spent analyzing the indexes on a table will be attributed to the NEXT table in the output. All we have to do is change the reference to dba_tables to be an inline view referencing dba_tables and dba_indexes in a UNION ALL:


select
        table_name, index_name,
        blocks, num_rows, sample_size,
        round(100 * sample_size/nullif(num_rows,0),2)   sample_pct,
        to_char(last_analyzed,'dd-mon-yyyy hh24:mi:ss') a_date,
        to_char(
                round(1440 *
                        (
                        last_analyzed -
                        lag(last_analyzed, 1) over (order by last_analyzed)
                        ), 2
                ),'9,999.99'
        )                        a_time
from
        (
        select
                owner, table_name, null as index_name,
                blocks, num_rows, sample_size, last_analyzed
        from
                dba_tables
        union all
        select
                table_owner, table_name, index_name,
                leaf_blocks, num_rows, sample_size, last_analyzed
        from
                dba_indexes
        )
where
        owner = upper('&m_schema')
order by
        last_analyzed
;

You’ll notice that I’ve left the ‘&m_schema’ outside the view so that I only have to supply a value once; the optimizer is smart enough to push this predicate inside the union all view so that the plan shows the two separate subqueries being operated only for the given schema.

Footnote

A reader has emailed me to point out that one of the cases where this script won’t help (or, at the least, would produce highly misleading results) is when the CONCURRENT preference has been set to true, allowing multiple jobs to run to gather stats on different objects at the same time.
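Before trusting the timings it may be worth checking that preference. This is a minimal sketch using dbms_stats.get_prefs(); the alias concurrent_pref is just a name picked for the example, and the values reported depend on version (as far as I know 11.2 shows TRUE or FALSE, while 12c shows OFF, MANUAL, AUTOMATIC or ALL), so treat anything other than FALSE or OFF as a warning sign:

```sql
-- Report the global CONCURRENT preference for stats gathering.
-- Anything other than FALSE (11.2) or OFF (12c) suggests that several
-- objects may have been analyzed at the same time, which would make
-- the "time since previous last_analyzed" calculation misleading.
select  dbms_stats.get_prefs('CONCURRENT')      concurrent_pref
from    dual
;
```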

 

September 15, 2016

Stats time

Filed under: Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 9:41 am GMT Sep 15,2016

I don’t really remember how long it’s been since Oracle created an automatic log of how long a call to the dbms_stats package took, though it was probably some time in the 10g time-line. It wasn’t until it had been around for several years, though, that I wrote a little script (possibly prompted by a comment from Martin Widlake) that I’ve used occasionally since to see what’s been going on in the past, how variable stats collection times have been, and what unexpected dbms_stats calls an application may have been making. Here’s what it currently looks like:

rem
rem     Script:         optimizer_operations.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2010
rem     Purpose:
rem
rem     Last tested
rem             11.2.0.4
rem     Not tested
rem             12.1.0.2
rem             10.2.0.5
rem
rem     Notes:
rem     Simple script to list calls to dbms_stats showing
rem     what they did and how long it took.
rem
rem     Notice that gather_database_stats and gather_schema_stats
rem     don't tell you anything about individual tables.
rem

spool optimizer_operations

column start_time       format a35
column operation        format a28
column target           format a28

column duration         format a12
column max_duration     format a12

column utc_start_time   format a28
column first_utc_time   format a28

select
        operation,
        cast(sys_extract_utc(min(start_time)) as timestamp(0))                  first_utc_time,
        --
        to_char(extract(hour from max(end_time - start_time)),'FM00') ||
        ':' ||
        to_char(extract(minute from max(end_time - start_time)),'FM00') ||
        ':' ||
        to_char(round(extract(second from max(end_time - start_time)),2),'FM00.00')     max_duration,
        --
        count(*)        Frequency
from
        dba_optstat_operations
group by
        operation
order by
        operation
;

select
        operation,
        target,
        -- start_time,
        cast(sys_extract_utc(start_time) as timestamp(0))                               utc_start_time,
        to_char(extract(hour from (end_time - start_time)),'FM00') ||
        ':' ||
        to_char(extract(minute from (end_time - start_time)),'FM00') ||
        ':' ||
        to_char(round(extract(second from (end_time - start_time)),2),'FM00.00')        duration
from
        dba_optstat_operations
where
        end_time - start_time >= numtodsinterval(2,'SECOND')
and     target is not null
order by
        start_time
;

spool off

This version of the script first prints a summary showing how many calls to which operation have occurred, the worst case scenario for each, and when the first one happened – in case a recent change has introduced a problem. You might find it useful to include other bits of information, like the min(duration), in the first section; it’s not difficult to modify the SQL. The second part of the report is then just a list of the individual calls, with a minimum interesting time included.
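As an illustration of the sort of change to the summary section that might be useful, here is a sketch (not part of the original script) that reports the shortest call alongside the longest. To keep it short it leaves the durations as raw interval values rather than formatting them as hh:mi:ss.ss, so the column widths are an assumption you may want to adjust:

```sql
column min_duration     format a30
column max_duration     format a30

-- Summary by operation: number of calls, shortest and longest call.
-- min() and max() are applied directly to the interval
-- (end_time - start_time), as in the original summary query.
select
        operation,
        count(*)                        frequency,
        min(end_time - start_time)      min_duration,
        max(end_time - start_time)      max_duration
from
        dba_optstat_operations
group by
        operation
order by
        operation
;
```

A large gap between the two durations for one operation is a hint that it is worth looking at the individual calls in the second part of the report.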

And here’s some (camouflaged) output:


OPERATION                    FIRST_UTC_TIME               MAX_DURATION  FREQUENCY
---------------------------- ---------------------------- ------------ ----------
copy_table_stats             14-JUL-16 11.12.55 PM        00:00:03.43         320
gather_database_stats(auto)  14-JUL-16 06.00.01 AM        01:13:17.25          67
gather_schema_stats          14-JUL-16 04.00.02 AM        05:16:01.22          35
gather_table_stats           14-JUL-16 01.15.59 AM        00:04:07.98         153


OPERATION                    TARGET                             UTC_START_TIME               DURATION
---------------------------- ---------------------------------- ---------------------------- ------------
gather_schema_stats          MY_SCHEMA                          01-AUG-16 04.00.02 AM        04:40:33.03
copy_table_stats             SYS.WRH$_SQLSTAT.WRH$_SQLSTA_24706 01-AUG-16 11.18.15 PM        00:00:02.04
                             69947_32443

gather_schema_stats          MY_SCHEMA                          02-AUG-16 04.00.02 AM        04:57:39.61
gather_schema_stats          MY_SCHEMA                          03-AUG-16 04.00.02 AM        04:40:10.88
gather_schema_stats          MY_SCHEMA                          04-AUG-16 04.00.02 AM        05:00:48.05
gather_schema_stats          MY_SCHEMA                          05-AUG-16 04.00.03 AM        05:07:42.33
gather_schema_stats          MY_SCHEMA                          06-AUG-16 04.00.02 AM        04:50:39.93
gather_table_stats           SYS.WRI$_ADV_ACTIONS               06-AUG-16 12.42.38 PM        00:00:03.01
gather_table_stats           SYS.WRI$_ADV_FINDINGS              06-AUG-16 12.42.41 PM        00:00:02.94
gather_table_stats           SYS.WRI$_ADV_RECOMMENDATIONS       06-AUG-16 12.42.46 PM        00:00:02.03
gather_table_stats           SYS.WRI$_ADV_SQLT_PLANS            06-AUG-16 12.42.49 PM        00:00:04.45
gather_schema_stats          MY_SCHEMA                          07-AUG-16 04.00.02 AM        04:48:57.23
gather_schema_stats          MY_SCHEMA                          08-AUG-16 04.00.01 AM        04:40:57.81
gather_table_stats           MY_SCHEMA.TABLE_A                  09-AUG-16 01.16.21 AM        00:00:03.61
gather_schema_stats          MY_SCHEMA                          09-AUG-16 04.00.03 AM        04:51:14.48
gather_table_stats           MY_SCHEMA.TABLE_A                  10-AUG-16 01.16.44 AM        00:00:02.07
gather_schema_stats          MY_SCHEMA                          10-AUG-16 04.00.02 AM        04:36:19.35
gather_schema_stats          MY_SCHEMA                          11-AUG-16 04.00.02 AM        04:43:30.77
copy_table_stats             SYS.WRH$_FILESTATXS.WRH$_FILEST_24 11-AUG-16 11.24.21 PM        00:00:02.65
                             70669947_32683

gather_schema_stats          MY_SCHEMA                          12-AUG-16 04.00.02 AM        04:42:15.92
gather_schema_stats          MY_SCHEMA                          12-AUG-16 04.00.02 AM        04:42:15.89
gather_schema_stats          MY_SCHEMA                          13-AUG-16 04.00.02 AM        04:41:06.68
gather_table_stats           SYS.WRI$_ADV_PARAMETERS            13-AUG-16 10.31.35 AM        00:00:03.88
gather_table_stats           SYS.WRI$_ADV_RATIONALE             13-AUG-16 10.31.39 AM        00:00:03.03
gather_table_stats           MY_SCHEMA.TABLE_B                  13-AUG-16 01.24.33 PM        00:04:07.98
gather_table_stats           MY_SCHEMA.TABLE_C.SYS_P12423       13-AUG-16 02.06.34 PM        00:03:22.33
copy_table_stats             SYS.WRH$_ACTIVE_SESSION_HISTORY.WR 13-AUG-16 11.03.35 PM        00:00:02.11
                             H$_ACTIVE_2470669947_32731


There is an important limitation with this script – it’s all very well seeing that gather_schema_stats has been running at a fairly uniform 4.5 to 5 hours, but it doesn’t tell us where the time went inside that call and it’s a call that could have been covering a lot of tables. There’s not a lot you can do about that from this set of data, but there’s another script I wrote a long time ago (before I wrote this one) that tried to do something about that question, and I’ll be publishing that tomorrow.

 

June 28, 2016

Index Sanity

Filed under: CBO,extended stats,Indexing,Oracle,Statistics — Jonathan Lewis @ 8:43 am GMT Jun 28,2016

By popular demand (well, one person emailed me to ask for it) I’m going to publish the source code for a little demo I’ve been giving since the beginning of the millennium – it concerns indexes and the potential side effects that you can get when you drop an index that you’re “not using”. I think I’ve mentioned the effect several times in the history of this blog, but I can’t find an explicit piece of demo code, so here it is – starting at the conclusion – as a cut and paste from an SQL*Plus session running against an 11g instance:


SQL> set autotrace traceonly explain
select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
 11  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1184213596

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   484 | 64856 |   227   (2)| 00:00:02 |
|*  1 |  HASH JOIN          |      |   484 | 64856 |   227   (2)| 00:00:02 |
|*  2 |   HASH JOIN         |      |   484 | 57596 |    14   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T2   |    20 |  1160 |     4   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T1   |   484 | 29524 |    10   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL | T3   |  5000 | 75000 |   213   (2)| 00:00:02 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T3"."ID"="T1"."ID1")
   2 - access("T2"."ID1"="T1"."ID1" AND "T2"."IND_PAD"="T1"."IND_PAD"
              AND "T2"."ID2"="T1"."ID2")
   4 - filter("T1"."N1"<=50 AND "T1"."N1">=40)

SQL> drop index t2_i1;

Index dropped.

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
 11  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2290830436

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    52 |  6968 |    67   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                |       |    52 |  6968 |    67   (2)| 00:00:01 |
|   2 |   NESTED LOOPS               |       |    52 |  6968 |    67   (2)| 00:00:01 |
|*  3 |    HASH JOIN                 |       |    52 |  6188 |    14   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL        | T2    |    20 |  1160 |     4   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL        | T1    |   484 | 29524 |    10   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN         | T3_PK |     1 |       |     0   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID| T3    |     1 |    15 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T2"."ID1"="T1"."ID1" AND "T2"."IND_PAD"="T1"."IND_PAD" AND
              "T2"."ID2"="T1"."ID2")
   5 - filter("T1"."N1"<=50 AND "T1"."N1">=40)
   6 - access("T3"."ID"="T1"."ID1")

Starting from the top – I’ve enabled autotrace which, technically, could mean that the plans are not the ones I’d see at run-time, but you can take my word for it that in 11g they are the run-time plans; then I’ve supplied a query that produces a plan with 3 full tablescans, two hash joins, and no index usage at all.

You’ll notice at operation 3 of the plan that table t2 is very small – only 20 rows selected, with no predicates that could have filtered that result down from a large table (take my word for it the stats have just been collected) so, as the ancient mythology would have it, we don’t really need an index on that table (a quick check tells me that the index wasn’t there to enforce uniqueness). Immediately after the first execution plan you can see that I’ve dropped an index called t2_i1 – trust me that IS the index on table t2.

We “run” the original query again, it gets re-optimised (and there’s no question of cardinality feedback or any other feature coming into play) and we get a different plan.

Dropping, or adding, a multi-column index to a table could change execution plans – even if the index is not present in the plan.

The reason for this is the “index sanity check”. When the optimizer is doing its cardinality estimates, if it sees equality conditions on the set of columns that make up an index it can use the distinct_keys statistic from the index in the calculation rather than using the standard calculation of multiplying together the num_distinct of the separate columns. In earlier versions of Oracle there were some restrictions about uniqueness, but the limitations were removed in 11.1.0.7.

In my case there were 10 distinct values for id1, just one value for ind_pad, and 20 distinct values for id2 – but a total of only 20 distinct values for the combination. With an index in place on the combination the optimizer used the value 20 in its calculation, in the absence of the index it used the value 200 – that factor of 10 led to a drop in the join cardinality estimate from 484 rows to 52 rows – at which point the optimizer calculations made the next step in the plan change from a hash join to a nested loop join.
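The two competing figures can be seen side by side in the data dictionary. The following is a sketch that assumes the demo tables from the script below and that the index on t2 still exists; the alias product_of_ndv is a name invented for the example. It uses exp(sum(ln())) to multiply the num_distinct values of the indexed columns, which is fine here because none of them is zero:

```sql
-- Compare the index's distinct_keys with the product of the
-- num_distinct values of the columns in the index.
-- For the demo data this should show 20 against 200.
select
        i.index_name,
        i.distinct_keys,
        round(exp(sum(ln(c.num_distinct))))     product_of_ndv
from
        user_indexes            i,
        user_ind_columns        ic,
        user_tab_columns        c
where
        i.table_name    = 'T2'
and     ic.index_name   = i.index_name
and     c.table_name    = ic.table_name
and     c.column_name   = ic.column_name
group by
        i.index_name,
        i.distinct_keys
;
```

Any index where the two numbers are very different is a candidate for this kind of side effect if it is dropped.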

If you want to reproduce the demo, here’s the full script – the data isn’t a realistic data set, and I’ve had to use various non-standard settings to make the script as repeatable as possible – I’ve built the data set in a tablespace using an 8KB block size, 1MB uniform extents and manual (freelist) segment space management.


rem
rem     Script:         index_sanity.sql
rem     Author:         Jonathan Lewis
rem

drop table t3;
drop table t2;
drop table t1;

execute dbms_random.seed(0);

begin   
        begin           execute immediate 'purge recyclebin';
        exception       when others then null;
        end; 

        begin
                dbms_stats.set_system_stats('MBRC',16);
                dbms_stats.set_system_stats('MREADTIM',10);
                dbms_stats.set_system_stats('SREADTIM',5);
                dbms_stats.set_system_stats('CPUSPEED',1000);
        exception
                when others then null;
        end;

end;
/

create table t1
as
select
        mod(rownum,10)          id1,
        mod(rownum,20)          id2,
        rpad('x',40,'x')        ind_pad,
        mod(rownum,100)         n1,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',50)            padding
from
        all_objects
where
        rownum  <= 4000
;

create table t2 
pctfree 99
pctused 1
as
select
        mod(rownum,10)          id1,
        mod(rownum,20)          id2,
        rpad('x',40,'x')        ind_pad,
        mod(rownum,100)         n1, 
        lpad(rownum,10,'0')     small_vc,
        rpad('x',200)           padding
from
        all_objects
where
        rownum <= 20
;

create table t3
pctfree 95
pctused 1
as
select
        rownum          id,
        rpad(rownum,10) small_vc,
        rpad('x',100)   padding
from
        all_objects
where
        rownum <= 5000
;
begin
        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T1',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T2',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T3',
                method_opt => 'for all columns size 1'
        );

end;
/

create        index t1_i1 on t1(id1, ind_pad, id2) pctfree 91;
create        index t2_i1 on t2(id1, ind_pad, id2) pctfree 91;
alter table t3 add constraint t3_pk primary key (id);

set autotrace traceonly explain

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
;

-- alter index t1_i1 invisible;
-- alter index t2_i1 invisible;

drop index t1_i1;
-- drop index t2_i1;

accept X prompt "Press return to continue"

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
;

set autotrace off

You’ll notice from the commented lines in the above that the effect appears whether you drop the index or make it invisible, also that there’s a similar index on the t1 table that matches the index on the t2 table – I could get the effect from dropping or making invisible either index.

There is a saving grace in 11g – if I do drop, or make invisible, one of these indexes I can protect myself against the statistical effect by creating a column group on the same set of columns, and the num_distinct from the column group would serve the same purpose as the distinct_keys from the index.
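A minimal sketch of that protection, assuming the t2 table from the demo and the same three columns as the dropped index, might look like this. It uses dbms_stats.create_extended_stats() to define the column group and then gathers stats on it without a histogram; the alias extension_name on the first query is only there to label the output:

```sql
-- Define a column group matching the columns of the dropped index.
-- The function returns the system-generated name of the extension.
select
        dbms_stats.create_extended_stats(
                ownname   => user,
                tabname   => 'T2',
                extension => '(id1, ind_pad, id2)'
        )       extension_name
from
        dual
;

-- Gather stats on the new column group (no histogram).
begin
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'T2',
                method_opt => 'for columns (id1, ind_pad, id2) size 1'
        );
end;
/

-- Confirm that the column group exists.
select  extension_name, extension
from    user_stat_extensions
where   table_name = 'T2'
;
```

After this, re-running the test query without the index should show whether the join cardinality estimate has gone back to the figure it had when the index was in place.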

April 27, 2016

Stats History

Filed under: Histograms,Infrastructure,Oracle,Statistics — Jonathan Lewis @ 1:09 pm GMT Apr 27,2016

From time to time we see a complaint on OTN about the stats history tables being the largest objects in the SYSAUX tablespace and growing very quickly, with requests about how to work around the (perceived) threat. The quick answer is – if you need to save space then stop holding on to the history for so long, and then clean up the mess left by the history that you have captured; on top of that you could stop gathering so many histograms because you probably don’t need them, they often introduce instability to your execution plans, and they are often the largest single component of the history (unless you are using incremental stats on partitioned objects***)

For many databases it’s the histogram history – using the default Oracle automatic stats collection job – that takes the most space; here’s a sample query that the sys user can run to get some idea of how significant this history can be:


SQL> select table_name , blocks from user_tables where table_name like 'WRI$_OPTSTAT%HISTORY' order by blocks;

TABLE_NAME                           BLOCKS
-------------------------------- ----------
WRI$_OPTSTAT_AUX_HISTORY                 80
WRI$_OPTSTAT_TAB_HISTORY                244
WRI$_OPTSTAT_IND_HISTORY                622
WRI$_OPTSTAT_HISTHEAD_HISTORY          1378
WRI$_OPTSTAT_HISTGRM_HISTORY           2764

5 rows selected.

As you can see the “histhead” and “histgrm” tables (histogram header and histogram detail) are the largest stats history tables in this (admittedly very small) database.

Oracle gives us a couple of calls in the dbms_stats package to check and change the history setting, demonstrated as follows:


SQL> select dbms_stats.get_stats_history_retention from dual;

GET_STATS_HISTORY_RETENTION
---------------------------
                         31

1 row selected.

SQL> execute dbms_stats.alter_stats_history_retention(7)

PL/SQL procedure successfully completed.

SQL> select dbms_stats.get_stats_history_retention from dual;

GET_STATS_HISTORY_RETENTION
---------------------------
                          7

1 row selected.

Changing the retention period doesn’t reclaim any space, of course – it simply tells Oracle how much of the existing history to eliminate in the next “clean-up” cycle. This clean-up is controlled by a “savtime” column in each table:

SQL> select table_name from user_tab_columns where column_name = 'SAVTIME' and table_name like 'WRI$_OPTSTAT%HISTORY';

TABLE_NAME
--------------------------------
WRI$_OPTSTAT_AUX_HISTORY
WRI$_OPTSTAT_HISTGRM_HISTORY
WRI$_OPTSTAT_HISTHEAD_HISTORY
WRI$_OPTSTAT_IND_HISTORY
WRI$_OPTSTAT_TAB_HISTORY

5 rows selected.
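Since the clean-up is driven by savtime, a count by day gives some idea of how much work each step of a purge will have to do. This is a sketch to be run as sys against the largest of the history tables; the aliases save_day and row_ct are names chosen for the example:

```sql
-- How many histogram detail rows were saved on each day.
-- trunc() converts the savtime timestamp to a date at midnight.
select
        trunc(savtime)  save_day,
        count(*)        row_ct
from
        sys.wri$_optstat_histgrm_history
group by
        trunc(savtime)
order by
        trunc(savtime)
;
```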

If all you wanted to do was stop the tables from growing further you’ve probably done all you need to do. From this point onwards the automatic Oracle job will start deleting the oldest saved stats and re-using space in the existing table. But you may want to be a little more aggressive about tidying things up, and Oracle gives you a procedure to do this – and it might be sensible to use this procedure anyway at a time of your own choosing:


SQL> execute dbms_stats.purge_stats(sysdate - 7);

Basically this issues a series of delete statements (including a delete on the “stats operation log (wri$_optstat_opr)” table that I haven’t previously mentioned) – here’s an extract from an 11g trace file of a call to this procedure (output from a simple grep command):


delete /*+ dynamic_sampling(4) */ from sys.wri$_optstat_tab_history          where savtime < :1 and rownum <= NVL(:2, rownum)
delete /*+ dynamic_sampling(4) */ from sys.wri$_optstat_ind_history h        where savtime < :1 and rownum <= NVL(:2, rownum)
delete /*+ dynamic_sampling(4) */ from sys.wri$_optstat_aux_history          where savtime < :1 and rownum <= NVL(:2, rownum)
delete /*+ dynamic_sampling(4) */ from sys.wri$_optstat_opr                  where start_time < :1 and rownum <= NVL(:2, rownum)
delete /*+ dynamic_sampling(4) */ from sys.wri$_optstat_histhead_history     where savtime < :1 and rownum <= NVL(:2, rownum)
delete /*+ dynamic_sampling(4) */ from sys.wri$_optstat_histgrm_history      where savtime < :1 and rownum <= NVL(:2, rownum)

Two points to consider here: although the appearance of the rownum clause suggests that there’s a damage limitation strategy built into the code I only saw one commit after the entire delete cycle, and I never saw a limiting bind value being supplied. If you’ve got a large database with very large history tables you might want to delete one day (or even just a few hours) at a time. The potential for a very long, slow, delete is also why you might want to do a manual purge at a time of your choosing rather than letting Oracle do the whole thing on auto-pilot during some overnight operation.
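The idea of deleting one day at a time could be sketched as a simple PL/SQL loop. This is only an outline under some assumptions: the target of 7 days is a hypothetical value, the variable names are invented, and the loop relies on dbms_stats.get_stats_history_availability to find the age of the oldest history, working forward from there so that each call to purge_stats() has only one more day of data to delete:

```sql
declare
        m_keep_days     constant pls_integer := 7;      -- hypothetical target retention
        m_oldest        date;
        m_age           pls_integer;
begin
        -- Oldest history currently available (implicitly converted to a date).
        m_oldest := dbms_stats.get_stats_history_availability;
        m_age    := ceil(sysdate - m_oldest);

        -- Step forward one day at a time: purge everything older than
        -- (m_age - 1) days, then (m_age - 2) days, and so on down to
        -- the target retention. Nothing happens if the history is
        -- already shorter than the target.
        for d in reverse m_keep_days .. m_age - 1 loop
                dbms_stats.purge_stats(sysdate - d);
        end loop;
end;
/
```

It would be sensible to test this on a copy of the system first and to check how long a single step takes before letting the loop run to completion.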

Secondly, even though you may have deleted a lot of data from these tables you still haven’t reclaimed the space – so if you’re trying to find space in the sysaux tablespace you’re going to have to rebuild the tables and their indexes. Unfortunately a quick check of v$sysaux_occupants tells us that there is no official “move” procedure:


SQL> execute print_table('select occupant_desc, move_procedure, move_procedure_desc from v$sysaux_occupants where occupant_name = ''SM/OPTSTAT''')

OCCUPANT_DESC                 : Server Manageability - Optimizer Statistics History
MOVE_PROCEDURE                :
MOVE_PROCEDURE_DESC           : *** MOVE PROCEDURE NOT APPLICABLE ***

So we have to run a series of explicit calls to alter table move and alter index rebuild. (Preferably not when anyone is trying to gather stats on an object). Coding that up is left as an exercise to the reader, but it may be best to move the tables in the order of smallest table first, rebuilding indexes as you go.

Footnote:

*** Incremental stats on partitioned objects: I tend to assume that sites which use partitioning are creating very large databases and have probably paid a lot more attention to the details of how to use statistics effectively and successfully; that’s why this note is aimed at sites which don’t use partitioning and therefore think that the space taken up by the stats history is significant.

April 20, 2016

Column Groups

Filed under: extended stats,Oracle,Statistics — Jonathan Lewis @ 9:07 am GMT Apr 20,2016

Patrick Jolliffe alerted the Oracle-L list to a problem that appears when you combine fixed length character columns (i.e. char() or nchar())  with column group statistics. The underlying cause of the problem is the “blank padding” semantics that Oracle uses by default to compare varchar2 with char, so I’ll start with a little demo of that. First some sample data:


rem     Script:         col_group_char_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Apr 2016

execute dbms_random.seed(0)

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        cast(chr(trunc(dbms_random.value(1,6))+64) as char(1))  c1,
        cast(chr(trunc(dbms_random.value(1,6))+64) as char(2))  c2,
        cast('X' as varchar2(2))                                v2
from
        generator       v1
where
        rownum <= 5 * 5 * 10
;

insert into t1(c1, c2, v2)
select  'X', 'X', 'X'
from    t1
;

update t1 set v2 = c2;
commit;


The little demos I’m going to report here don’t use all the data in this table – there are several other tests in the script that I won’t be reporting – so I’ll just point out that there are 500 rows in the table, half of them have ‘X’ in all three columns, and half of them have a uniform distribution of the letters ‘A’ to ‘E’ in every column.

  • Column c1 is declared as char(1) – so it will hold the data exactly as it was inserted by the script.
  • Column c2 is declared as char(2) – so even though the script apparently inserts a character string of length 1, this will be padded with a space to two characters before being stored.

Now we can create some stats – in particular a frequency histogram on the c2 column – and check the cardinality estimates for a couple of queries:

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 254'
        );
end;
/

set autotrace traceonly explain

prompt  ==================
prompt  c2 without padding
prompt  ==================

select  *
from    t1
where   c2 = 'X'
;

prompt  ================
prompt  c2 with padding
prompt  ================

select  *
from    t1
where   c2 = 'X '
;

set autotrace off

The first query compares c2 with the single character ‘X’, the second compares it with the two-character string ‘X ’. But since the comparison is with a char(2) column the optimizer pads the first constant with spaces, and both queries end up predicting the same cardinality:


==================
c2 without padding
==================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C2"='X')

================
c2 with padding
================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C2"='X ')


Note that both queries predict the 250 rows where (we know) c2 = ‘X ’; even though the predicate sections suggest the queries are looking for different data sets. This IS the expected behaviour.
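The same point can be checked against the data rather than the estimates. This is a small sketch using the t1 table created above; the column aliases are names invented for the example. If the blank-padded comparison is working as described both counts should be the same (250 for this data set), and the length of c2 should always be 2:

```sql
-- Count the rows matched by the padded and unpadded literals,
-- and confirm that c2 is always stored at its declared length.
select
        count(case when c2 = 'X'  then 1 end)   unpadded_literal,
        count(case when c2 = 'X ' then 1 end)   padded_literal,
        min(length(c2))                         min_len_c2,
        max(length(c2))                         max_len_c2
from
        t1
;
```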

Now let’s make things more complex – we’ll add the predicate “and c1 = ‘X’” to both queries but we’ll create a column group with histogram on (c1, c2) before checking the plans. Again we expect both versions of the new query to predict the same volume of data and (in fact) to produce a perfect prediction because we have so few rows and so few distinct combinations that we should get a perfect frequency histogram:


begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1 for columns (c1, c2) size 254'
        );
end;
/

prompt  ========================
prompt  (c1, c2) without padding
prompt  ========================

select  *
from    t1
where   c1 = 'X' and c2 = 'X'
;

prompt  =====================
prompt  (c1, c2) with padding
prompt  =====================

select  *
from    t1
where   c1 = 'X' and c2 = 'X '
;

And here are the execution plans:

========================
(c1, c2) without padding
========================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     2 |    16 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     2 |    16 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C1"='X' AND "C2"='X')

=====================
(c1, c2) with padding
=====================

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250 |  2000 |    17   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   250 |  2000 |    17   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C1"='X' AND "C2"='X ')


If we run the query where the literal is padded with spaces to the correct length (2nd query) then the prediction is correct. But if we haven’t padded the literal the prediction is wrong; the estimate is the one the optimizer would have used for “value not found in histogram”.

I think what’s happening is that the optimizer doesn’t “remember” that the literal is being compared with a char() when making the call to sys_op_combined_hash() that it uses for calculating column group stats, so it doesn’t pad the literal with spaces before calling the function and, as a consequence, the hashed value isn’t the one it should be using.
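One way to probe this hypothesis is to compare the hash values for the unpadded and padded literals with the endpoint values stored for the column group. This is only a sketch under assumptions: that the column group is the (c1, c2) group created above, that the stored endpoint values are the hash reduced mod 9999999999, and that the function treats a character literal the same way it treats a char() column value. None of these is confirmed by the test above.

```sql
-- Hash of the literal pair as typed, and with the c2 literal padded
select
        mod(sys_op_combined_hash('X', 'X' ), 9999999999)        unpadded_hash,
        mod(sys_op_combined_hash('X', 'X '), 9999999999)        padded_hash
from
        dual
;

-- Endpoint values recorded for the system-named column group
select
        column_name, endpoint_number, endpoint_value
from
        user_tab_histograms
where
        table_name = 'T1'
and     column_name like 'SYS_ST%'
order by
        column_name, endpoint_number
;
```

If the hypothesis holds, only the padded hash should appear among the endpoint values.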

I’ve run this test on 11.2.0.4 and 12.1.0.2 – the effects are the same on both versions.

Bottom Line:

Be careful about how you use char() data types in your code, and be especially careful if you think you’re going to be creating column group stats involving char() columns – and then remember that 12c may generate column group stats automatically for you. If you use char() columns you will have to ensure that any literal values in your predicates are padded with spaces to the full declared length of the column if you want the best possible chance of getting the correct execution plans.
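To see where this risk might apply, a query along the following lines lists the column groups in the current schema that include a char() or nchar() column. It is a sketch that assumes the extension text in user_stat_extensions holds the column names in double quotes, e.g. ("C1","C2"), so a column name that also appears inside a quoted expression could produce a false match.

```sql
select
        e.table_name,
        e.extension_name,
        e.creator,              -- SYSTEM marks a group Oracle created for itself
        c.column_name,
        c.data_type,
        c.data_length
from
        user_stat_extensions    e,
        user_tab_columns        c
where
        c.table_name = e.table_name
and     c.data_type in ('CHAR', 'NCHAR')
and     instr(e.extension, '"' || c.column_name || '"') > 0
order by
        e.table_name, e.extension_name, c.column_name
;
```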

 

February 11, 2016

Quiz Night

Filed under: 12c,Infrastructure,LOBs,Oracle,Statistics — Jonathan Lewis @ 3:28 pm GMT Feb 11,2016

I was setting up a few tests on a copy of 12.1.0.2 recently when I made a mistake creating the table – I forgot to put in a couple of CAST() calls in the select list, so I just patched things up with a couple of “modify column” commands. Since I was planning to smash the table in all sorts of ways and it had taken me several minutes to create the data set (10 million rows) I decided to create a clean copy of the data so that I could just drop the original table and copy back the clean version – and after I’d done this I noticed something a little odd.

Here’s the code (cut down to just 10,000 rows), with a little output:

--
--      Script:  12c_vc32767_b.sql
--

create table t1 (
        id      not null,
        date_open,
        date_closed,
        deal_type,
        client_ref,
        small_vc,
        padding
)
nologging
as
select
        rownum                                          id,
        trunc(
                add_months(sysdate, -120) +
                        (rownum)* 3652 / 1e7
        )                                               date_open,
        trunc(
                add_months(
                        add_months(sysdate, -120) +
                                (rownum) * 3652 / 1e7,
                        12 * trunc(dbms_random.value(1,6))
                )
        )                                               date_closed,
        dbms_random.string('U',1)                       deal_type,
        dbms_random.string('U',4)                       client_ref,
        lpad(rownum,10)                                 vc_small,
        rpad('x',100,'x')                               vc_padding
from
        dual
connect by
        level <= 10000
;
alter table t1 modify deal_type varchar2(1);
alter table t1 modify client_ref varchar2(4);

create table t2 nologging as select * from t1;

begin
        dbms_stats.gather_table_stats(
                user,
                't1',
                method_opt=>'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                user,
                't2',
                method_opt=>'for all columns size 1'
        );
end;
/

select table_name, num_rows, blocks, avg_row_len from user_tables;

TABLE_NAME             NUM_ROWS     BLOCKS AVG_ROW_LEN
-------------------- ---------- ---------- -----------
T1                        10000        304         139
T2                        10000        218         139

There’s no hidden code – this is the code I ran, and the tables both went into the same tablespace – yet one table has used about 30% more blocks than the other on a simple “create as select”, even though the row lengths and row counts are the same!

When I first did this test it just so happened that the only place in the database I had to make the t2 copy was in a tablespace using freelist management, and I’d created the original table in a tablespace using ASSM, so I wasted a little time examining the ASSM space management (bitmap) blocks before I realised what had happened.

Suggestions about what has happened, and how to confirm your hypothesis, invited in the comments.

Update (following morning)

The explanation is in the comments: I had enabled extended (32,767 byte) varchar2().  My calls to dbms_random.string() had returned character values, and the return type defaults to the maximum character length, which meant they were declared as varchar2(32767) and those are implemented as (unhackable) CLOBs in 12.1.0.2.

When I modified the table to change the varchar2(32767) to shorter lengths Oracle let me get away with it because none of the actual stored values at that time was longer than my target declaration – but it didn’t physically change the stored column values to “short” varchar2(), it left them as CLOBs, and a “short” CLOB includes a LOB Locator which starts out at about 30 bytes.

When I created t2 from t1 Oracle first copied the (short) column definitions, and then copied the data, reading and converting the CLOBs to normal varchar2() storage, and that explains why t2 needed fewer blocks.  The avg_row_len for the two tables matched because the code to gather stats simply applies the sys_op_opnsize() function to the varchar2() defined in t1, and doesn’t give you any indication about the LOB locator.

(If you use the extremely deprecated analyze command to gather stats on the two tables you’ll find that the avg_row_len of table t1 then allows for the LOB locator size.)
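A minimal sketch of that check, assuming t1 and t2 still exist as created above. The analyze command overwrites the statistics gathered by dbms_stats, so the dbms_stats calls should be repeated afterwards.

```sql
analyze table t1 compute statistics;
analyze table t2 compute statistics;

select
        table_name, num_rows, blocks, avg_row_len
from
        user_tables
where
        table_name in ('T1', 'T2')
;
```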

“Workaround:”

I should have used:  cast(dbms_random.string('U',4) as varchar2(4)) and the column definition would have been a proper varchar2(4) from the start.  (As noted in a comment from Ivica Arsov below, substr(…, 1, 4) would also have worked.)  Of course I should have used cast() for the numerics as well so that I could declare them as things like number(8,0) rather than the slightly dangerous “number”.

Note:  I’ve put “workaround” in quotes because it’s not really a workaround to a problem – it’s just an example of doing it right.
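As an illustration, here is a cut-down version of the create statement with the casts in place. The table name t1_cast and the declared lengths for the two padding columns are choices made for this sketch, not part of the original script.

```sql
create table t1_cast
nologging
as
select
        cast(rownum as number(8,0))                             id,
        cast(dbms_random.string('U',1) as varchar2(1))          deal_type,
        cast(dbms_random.string('U',4) as varchar2(4))          client_ref,
        cast(lpad(rownum,10) as varchar2(10))                   small_vc,
        cast(rpad('x',100,'x') as varchar2(100))                padding
from
        dual
connect by
        level <= 10000
;

-- Check the declared types without any "modify column" patching
select
        column_name, data_type, data_length, data_precision, data_scale
from
        user_tab_columns
where
        table_name = 'T1_CAST'
order by
        column_id
;
```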

Footnote:

Here’s a symbolic dump of a single character (‘T’) being stored as a varchar2(32767):


LOB
Locator:
  Length:        84(31)
  Version:        1
  Byte Length:    1
  LobID: 00.00.00.01.00.00.01.6b.ba.d7
  Flags[ 0x01 0x0c 0x00 0x80 ]:
    Type: BLOB
    Storage: SecureFile
    Characterset Format: IMPLICIT
    Partitioned Table: No
    Options: ReadWrite
  SecureFile Header:
    Length:   11
    Old Flag: 0x48 [ DataInRow SecureFile ]
    Flag 0:   0x90 [ INODE Valid ]
    Layers:
      Lengths Array: INODE:5
      INODE:
        00 00 01 01 54

Note the 84(31) at line 3: this is a reminder of how big a LOB locator could get for an out of line LOB, compared to the current size of the LOB locator. Rows that hold out of line LOBs can actually be much longer than avg_row_len tells you.

Addendum

Another side effect of enabling extended varchar2() types is the surprise you get when you use error logging in SQL (i.e. syntax like: “insert into t1 select … log errors”). When you create the “clone” table for the log it changes every column from the original into a varchar2(32767) e.g.:

desc target
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 ID                                                                                NUMBER
 N1                                                                                NUMBER
 N2                                                                                NUMBER
 SMALL_VC                                                                          VARCHAR2(10)
 PADDING                                                                           VARCHAR2(100)


begin
        dbms_errlog.create_error_log('TARGET');
end;
/

desc err$_TARGET
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 ORA_ERR_NUMBER$                                                                   NUMBER
 ORA_ERR_MESG$                                                                     VARCHAR2(2000)
 ORA_ERR_ROWID$                                                                    ROWID
 ORA_ERR_OPTYP$                                                                    VARCHAR2(2)
 ORA_ERR_TAG$                                                                      VARCHAR2(2000)
 ID                                                                                VARCHAR2(32767)
 N1                                                                                VARCHAR2(32767)
 N2                                                                                VARCHAR2(32767)
 SMALL_VC                                                                          VARCHAR2(32767)
 PADDING                                                                           VARCHAR2(32767)

You probably hope that you’re not going to see many rows inserted into the error log table, so the extra space taken up in the table by error rows probably won’t be a problem – but the fact that you get two segments (the LOB and the LOBINDEX) created for every column in the original table might cause a problem – even if they are completely empty.
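A quick way to see what has been defined for the log table is to query user_lobs and user_segments. This sketch assumes the error log table is the err$_TARGET shown above. Whether the second query reports any rows for an empty log table may depend on deferred segment creation, which is not covered by the test here.

```sql
-- One row per LOB column, with the names of its LOB and LOBINDEX segments
select
        column_name, segment_name, index_name
from
        user_lobs
where
        table_name = 'ERR$_TARGET'
order by
        column_name
;

-- Segments that physically exist for those LOBs and LOB indexes
select
        segment_name, segment_type, blocks
from
        user_segments
where
        segment_name in (
                select segment_name from user_lobs where table_name = 'ERR$_TARGET'
                union all
                select index_name   from user_lobs where table_name = 'ERR$_TARGET'
        )
order by
        segment_type, segment_name
;
```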

December 29, 2015

Column Groups

Filed under: extended stats,Oracle,Statistics,Tuning — Jonathan Lewis @ 1:13 pm GMT Dec 29,2015

I think the “column group” variant of extended stats is a wonderful addition to the Oracle code base, but there’s a very important detail about using the feature that I hadn’t really noticed until a question came up on the OTN database forum recently about a very bad join cardinality estimate.

The point is this: if you have a multi-column equality join and the optimizer needs some help to get a better estimate of join cardinality then column group statistics may help if you create matching stats at both ends of the join. There is a variation on this directive that helps to explain why I hadn’t noticed it before – multi-column indexes (with exactly the correct columns) have the same effect and, most significantly, the combination of  one column group and a matching multi-column index will do the trick.

Here’s some code to demonstrate the effect:

create table t8
as
select
        trunc((rownum-1)/125)   n1,
        trunc((rownum-1)/125)   n2,
        rpad(rownum,180)        v1
from
        all_objects
where
        rownum <= 1000
;

create table t10
as
select
        trunc((rownum-1)/100)   n1,
        trunc((rownum-1)/100)   n2,
        rpad(rownum,180)        v1
from
        all_objects
where
        rownum <= 1000
;
begin
        dbms_stats.gather_table_stats(
                user,
                't8',
                method_opt => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                user,
                't10',
                method_opt => 'for all columns size 1'
        );
end;
/

set autotrace traceonly

select
        t8.v1, t10.v1
from
        t8,t10
where
        t10.n1 = t8.n1
and     t10.n2 = t8.n2
/

set autotrace off

Table t8 has eight distinct values for n1 and n2, and 8 combinations (though the optimizer will assume there are 64 combinations); table t10 has ten distinct values for n1 and n2, and ten combinations (though the optimizer will assume there are 100 combinations). In the absence of any column group stats (or histograms, or indexes) and with no filter predicates on either table, the join cardinality will be “{Cartesian Join cardinality} * {join selectivity}”, and in the absence of any nulls the join selectivity – thanks to the “multi-column sanity check” – will be 1/(greater number of distinct combinations). So we get 1,000,000 / 100 = 10,000.
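The arithmetic can be written out as a query. The row counts and numbers of distinct values are hard-coded from the description above rather than read from the data dictionary, so this only illustrates the formula; it is not a model of the optimizer’s code.

```sql
select
        t8.num_rows * t10.num_rows                      cartesian_card,
        greatest(t8.combos, t10.combos)                 assumed_combinations,
        t8.num_rows * t10.num_rows /
                greatest(t8.combos, t10.combos)         join_card
from
        (select 1000 num_rows,  8 *  8 combos from dual)        t8,
        (select 1000 num_rows, 10 * 10 combos from dual)        t10
;
```

With 64 and 100 assumed combinations the larger value is 100, which gives the join cardinality of 10,000 quoted above.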

Here’s the output from autotrace in 11.2.0.4 to prove the point:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |  3652K|    11  (10)| 00:00:01 |
|*  1 |  HASH JOIN         |      | 10000 |  3652K|    11  (10)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T8   |  1000 |   182K|     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T10  |  1000 |   182K|     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T10"."N1"="T8"."N1" AND "T10"."N2"="T8"."N2")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        835  consistent gets
          0  physical reads
          0  redo size
   19965481  bytes sent via SQL*Net to client
      73849  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed

As you can see, the query actually returns 100,000 rows. The estimate of 10,000 is badly wrong thanks to the correlation between the n1 and n2 columns. So let’s check the effect of creating a column group on t10:


begin
        dbms_stats.gather_table_stats(
                user,
                't10',
                method_opt => 'for all columns size 1 for columns (n1,n2) size 1'
        );
end;
/

At this point you might think that the optimizer’s sanity check might say something like: t8 table: 64 combinations, t10 table column group 10 combinations so use the 64 which is now the greater num_distinct. It doesn’t – maybe it will in some future version, but at present the optimizer code doesn’t seem to recognize this as a possibility. (I won’t bother to reprint the unchanged execution plan.)

But, at this point, I could create an index on t8(n1,n2) and run the query again:


create index t8_i1 on t8(n1, n2);

select
        t8.v1, t10.v1
from
        t8,t10
where
        t10.n1 = t8.n1
and     t10.n2 = t8.n2
/

Index created.


100000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 216880280

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100K|    35M|    12  (17)| 00:00:01 |
|*  1 |  HASH JOIN         |      |   100K|    35M|    12  (17)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T8   |  1000 |   182K|     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T10  |  1000 |   182K|     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T10"."N1"="T8"."N1" AND "T10"."N2"="T8"."N2")

Alternatively I could create a column group at the t8 table:



drop index t8_i1;

begin
        dbms_stats.gather_table_stats(
                user,
                't8',
                method_opt => 'for all columns size 1 for columns (n1,n2) size 1'
        );
end;
/

select  
        t8.v1, t10.v1 
from
        t8,t10
where
        t10.n1 = t8.n1
and     t10.n2 = t8.n2
/

Index dropped.


PL/SQL procedure successfully completed.


100000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 216880280

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100K|    35M|    12  (17)| 00:00:01 |
|*  1 |  HASH JOIN         |      |   100K|    35M|    12  (17)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T8   |  1000 |   182K|     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T10  |  1000 |   182K|     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T10"."N1"="T8"."N1" AND "T10"."N2"="T8"."N2")


If you’re wondering why I’ve not picked up this “both ends” detail in the past – it’s because I’ve usually been talking about replacing indexes with column groups and my examples have probably started with indexes at both ends of the join before I replaced one index with a column group. (The other examples I’ve given of column groups are typically about single-table access rather than joins.)
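To check whether a join has matching information at both ends, the following sketch reports the column groups and the multi-column index statistics for the two tables in the demonstration. It assumes that the hidden column supporting each column group appears in user_tab_col_statistics under its extension name.

```sql
-- Column groups and their number of distinct combinations
select
        e.table_name, e.extension, s.num_distinct
from
        user_stat_extensions    e,
        user_tab_col_statistics s
where
        s.table_name  = e.table_name
and     s.column_name = e.extension_name
and     e.table_name in ('T8', 'T10')
order by
        e.table_name
;

-- Indexes that could supply the same information through distinct_keys
select
        table_name, index_name, distinct_keys
from
        user_indexes
where
        table_name in ('T8', 'T10')
order by
        table_name, index_name
;
```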

 

November 5, 2015

Column Groups

Filed under: CBO,extended stats,Oracle,Statistics — Jonathan Lewis @ 6:48 am GMT Nov 5,2015

I think the “column group” variant of extended stats can be amazingly useful in helping the optimizer to generate good execution plans because of the way they supply better details about cardinality; unfortunately we’ve already seen a few cases (don’t forget to check the updates and comments) where the feature is disabled, and another example of this appeared on OTN very recently.

Modifying the example from OTN to make a more convincing demonstration of the issue, here’s some SQL to prepare a demonstration:


create table t1 ( col1 number, col2 number, col3 date);

insert  into t1
select 1 ,1 ,to_date('03-Nov-2015') from dual
union all
select 1, 2, to_date('03-Nov-2015')  from dual
union all
select 1, 1, to_date('03-Nov-2015')  from dual
union all
select 2, 2, to_date('03-Nov-2015')  from dual
union all   
select 1 ,1 ,null  from dual
union all  
select 1, 1, null  from dual
union all  
select 1, 1, null  from dual
union all
select 1 ,1 ,to_date('04-Nov-2015')  from dual
union all  
select 1, 1, to_date('04-Nov-2015')  from dual
union all  
select 1, 1, to_date('04-Nov-2015')  from dual
;

begin
        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         => 'T1',
                method_opt      => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         => 'T1',
                method_opt      => 'for columns (col1, col2, col3)'
        );
end;
/

I’ve collected stats in a slightly unusual fashion because I want to make it clear that I’ve got “ordinary” stats on the table, with a histogram on the column group (col1, col2, col3). You’ll notice that this combination is a bit special – of the 10 rows in the table there are three with the values (1,1,null) and three with the values (1,1,'04-Nov-2015'), so some very clear skew to the data which results in Oracle gathering a frequency histogram on the table.

These two combinations are remarkably similar, so what happens when we execute a query to find them? Since there are no indexes the plan will be a tablescan, but what will we see as the cardinality estimate? Surely it will be the same for both combinations:


select  count(*)
from    t1
where
        col1 = 1
and     col2 = 1
and     col3 = '04-Nov-2015'
;

select  count(*)
from    t1
where
        col1 = 1
and     col2 = 1
and     col3 is null
;

Brief pause for thought …

and here are the execution plans, including predicate section – in the same order (from 11.2.0.4 and 12.1.0.2):


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     3 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("COL1"=1 AND "COL2"=1 AND "COL3"=TO_DATE(' 2015-11-04
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |    12 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("COL3" IS NULL AND "COL1"=1 AND "COL2"=1)

The predictions are different – the optimizer has used the histogram on the column group for the query with “col3 = to_date()”, but not for the query with “col3 is null”. That’s a bit of a shame really because there are bound to be cases where some queries would benefit enormously from having a column group used even when some of its columns are subject to “is null” tests.

Analysis

The demonstration above isn’t sufficient to prove the point, of course; it merely shows an example of a suspiciously bad estimate. Here are a few supporting details – first we show that both the NULL and the '04-Nov-2015' combinations do appear in the histogram. We do this by checking the column stats, the histogram stats, and the values that would be produced by the hashing function for the critical combinations:


set null "n/a"

select distinct
        col3,
        mod(sys_op_combined_hash(col1, col2, col3), 9999999999)
from    t1
where
        col3 is null
or      col3 = to_date('04-Nov-2015')
order by
        2
;

column endpoint_actual_value format a40
column column_name           format a32
column num_buckets           heading "Buckets"

select
        column_name,
        num_nulls, num_distinct, density,
        histogram, num_buckets
from
        user_tab_cols
where
        table_name = 'T1'
;

break on column_name skip 1

select
        column_name,
        endpoint_number, endpoint_value,
        endpoint_actual_value -- , endpoint_repeat_count
from
        user_tab_histograms
where
        table_name = 'T1'
and     column_name not like 'COL%'
order by
        table_name, column_name, endpoint_number
;

(For an explanation of the sys_op_combined_hash() function see this URL).

Here’s the output from the three queries:


COL3      MOD(SYS_OP_COMBINED_HASH(COL1,COL2,COL3),9999999999)
--------- ----------------------------------------------------
04-NOV-15                                           5347969765
n/a                                                 9928298503

COLUMN_NAME                       NUM_NULLS NUM_DISTINCT    DENSITY HISTOGRAM          Buckets
-------------------------------- ---------- ------------ ---------- --------------- ----------
COL1                                      0            2         .5 NONE                     1
COL2                                      0            2         .5 NONE                     1
COL3                                      3            2         .5 NONE                     1
SYS_STU2IZIKAO#T0YCS1GYYTTOGYE            0            5        .05 FREQUENCY                5


COLUMN_NAME                      ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
-------------------------------- --------------- -------------- ----------------------------------------
SYS_STU2IZIKAO#T0YCS1GYYTTOGYE                 1      465354344
                                               4     5347969765
                                               6     6892803587
                                               7     9853220028
                                              10     9928298503

As you can see, there’s a histogram only on the combination and Oracle has found 5 distinct values for the combination. At endpoint 4 you can see the combination that includes 4th Nov 2015 (with the interval 1 – 4 indicating a frequency of 3 rows) and at endpoint 10 you can see the combination that includes the null (again with an interval indicating 3 rows). The stats are perfect to get the job done, and yet the optimizer doesn’t seem to use them.
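The frequencies can be derived from the cumulative endpoint numbers by subtracting the previous endpoint number from each one. A minimal sketch, using the same filter on column_name as the histogram query above:

```sql
select
        column_name,
        endpoint_value,
        endpoint_number,
        endpoint_number -
                lag(endpoint_number, 1, 0) over (
                        partition by column_name
                        order by endpoint_number
                )                               frequency
from
        user_tab_histograms
where
        table_name = 'T1'
and     column_name not like 'COL%'
order by
        column_name, endpoint_number
;
```

For the output shown above this reports a frequency of 3 at endpoint 4 (4 minus 1) and 3 at endpoint 10 (10 minus 7).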

If we examine the optimizer trace file (event 10053) we can see concrete proof that this is the case when we examine the “Single Table Access Path” sections for the two queries – here’s a very short extract from each trace file, the first for the query with “col3 = to_date()”, the second for “col3 is null”:


ColGroup (#1, VC) SYS_STU2IZIKAO#T0YCS1GYYTTOGYE
  Col#: 1 2 3    CorStregth: 1.60
ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.3000


ColGroup (#1, VC) SYS_STU2IZIKAO#T0YCS1GYYTTOGYE
  Col#: 1 2 3    CorStregth: 1.60
ColGroup Usage:: PredCnt: 2  Matches Full:  Partial:

Apparently “col3 is null” is not a predicate!

The column group can be used only if you have equality predicates on all the columns. This is a little sad – the only time that the sys_op_combined_hash() will return a null is (I think) when all its inputs are null, so there is one very special case for null handling with column groups – and even then the num_nulls for the column group would tell the optimizer what it needed to know. As it is, we have exactly the information we need to get a good cardinality estimate for the second query, but the optimizer is not going to use it.
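To reproduce trace extracts like the ones above, the optimizer trace can be switched on around an explain plan call. The tracefile identifier is an arbitrary label chosen for this sketch, and the query against v$diag_info assumes the session has the privilege to read that view.

```sql
alter session set tracefile_identifier = 'colgroup_null';
alter session set events '10053 trace name context forever, level 1';

explain plan for
select  count(*)
from    t1
where
        col1 = 1
and     col2 = 1
and     col3 is null
;

alter session set events '10053 trace name context off';

-- Name of the trace file written by this session
select  value
from    v$diag_info
where   name = 'Default Trace File'
;
```

Searching the trace file for “ColGroup Usage” should lead to the lines of interest.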

Summary

If you create a column group to help the optimizer with cardinality calculations it will not be used for queries where any of the underlying columns is used in an “is null” predicate. This is coded into the optimizer; it doesn’t appear to be an accident.

Reference script: column_group_null.sql

November 2, 2015

Clustering_factor

Filed under: Indexing,Oracle,RAC,Statistics — Jonathan Lewis @ 10:27 am GMT Nov 2,2015

I had a recent conversation at Oracle OpenWorld 2015 about a locking anomaly in a 3-node RAC system which was causing unexpected deadlocks. Coincidentally, this conversation came about shortly after I had been listening to Martin Widlake talking about using the procedure dbms_stats.set_table_prefs() to adjust the way that Oracle calculates the clustering_factor for indexes. The juxtaposition of these two topics made me realise that the advice I had given in “Cost Based Oracle – Fundamentals” 10 years ago was (probably) incomplete, and needed some verification. The sticking point was RAC.

In my original comments about setting the “table_cached_blocks” preference (as it is now known) I had pointed out that the effect of ASSM (with its bitmap space management blocks) was to introduce a small amount of random scattering as rows were inserted by concurrent sessions and this would adversely affect the clustering_factor of any indexes on the table, so a reasonable default value for the table_cached_blocks parameter would be 16.

I had overlooked the fact that in RAC each instance tries to acquire ownership of its own level 1 (L1) bitmap block in an attempt to minimise the amount of global cache contention.  If each instance uses a different L1 bitmap block to allocate data blocks then (for tables and their partitions) they won’t be using the same data blocks for inserts, and they won’t even have to pass the bitmap blocks between instances when searching for free space. The consequence of this, though, is that if N separate instances are inserting data into a single table there are typically 16 * N different blocks into which sessions could be inserting concurrently, so the “most recent” data could be scattered across 16N blocks, which means the appropriate value for table_cached_blocks is 16N.

To demonstrate the effect of RAC and multiple L1 blocks, here’s a little demonstration code from a 12c RAC database with 3 active instances.


create tablespace test_8k_assm_auto
datafile size 67108864
logging online permanent
blocksize 8192
extent management local autoallocate default
nocompress segment space management auto
;

create table t1 (n1 number, c1 char(1000)) storage (initial 8M next 8M);

The code above simply creates a tablespace using locally managed extents with system allocated extent sizes, then creates a table in that tablespace with a starting requirement of 8MB. Without this specification of initial the first few extents for the table would have been 64KB thanks to the system allocation algorithm, and that would have spoiled the demonstration because the table would have started by allocating a single extent of 64KB, with just one L1 bitmap block; slightly different effects would also have appeared with an extent size of 1MB – with 2 L1 bitmap blocks – which is the second possible extent size for system allocation.

Having created the table I connected one session to each of the three instances and inserted one row, with commit, from each instance. Then I ran a simple SQL statement to show me the file and block numbers of the rows inserted:


select
        dbms_rowid.rowid_relative_fno(rowid)    file_no,
        dbms_rowid.rowid_block_number(rowid)    block_no,
        count(*)                                rows_in_block
from
        t1
group by
        dbms_rowid.rowid_relative_fno(rowid),
        dbms_rowid.rowid_block_number(rowid)
order by
        dbms_rowid.rowid_relative_fno(rowid),
        dbms_rowid.rowid_block_number(rowid)
;


   FILE_NO   BLOCK_NO ROWS_IN_BLOCK
---------- ---------- -------------
        19        518             1
        19        745             1
        19       2157             1

As you can see, each row has gone into a separate block; more significantly, though, those blocks are a long way apart from each other. They are in completely different sets of 16 blocks: each instance is working with its own L1 block (there are 16 of them to choose from in an 8MB extent), and has formatted 16 blocks associated with that L1 for its own use.

In fact this simple test highlighted an anomaly that I need to investigate further. In my first test, after inserting just 3 rows into the table I found that Oracle had formatted 288 blocks (18 groups of 16) across 2 extents, far more than seems reasonable. The effect looks hugely wasteful, but that’s mainly because I’ve implied that I have a “large” table into which I’ve then inserted very little data – nevertheless something a little odd has happened. In my second test it got worse because Oracle formatted 16 blocks on the first insert,  took that up to 288 blocks on the second insert, then went up to 816 blocks (using a third extent) on the third insert; then in my third test Oracle behaved as I had assumed it ought to, formatting 3 chunks of 16 blocks each in a single extent – but that might have been because I did a truncate rather than a drop and recreate.

Summary

Whatever else is going on, the key point of this note is that if you’re trying to get Oracle to give you a better estimate for the clustering_factor in a RAC system then “16 * instance-count” is probably a good starting point for setting the table preference known as table_cached_blocks.
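
As a minimal sketch of that starting point, assuming the three-instance system used in the test above and a table called T1 in the current schema (adjust both to your own system), the preference could be set like this before the index stats are next gathered:

```sql
-- Assumption: 3 RAC instances, so 16 * 3 = 48 as a first guess
begin
        dbms_stats.set_table_prefs(
                ownname => user,
                tabname => 'T1',
                pname   => 'table_cached_blocks',
                pvalue  => '48'
        );
end;
/

-- Confirm the setting that will be used for T1
select
        dbms_stats.get_prefs('TABLE_CACHED_BLOCKS', user, 'T1') pref
from
        dual
;
```

The clustering_factor itself won't change until the stats on the relevant indexes have been gathered again.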

The anomaly of data being scattered extremely widely with more extents being allocated than you might expect is probably a boundary condition that you don’t have to worry about – until I’ve had time to look at it a little more closely.

 

October 23, 2015

Histogram Limit

Filed under: Bugs,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:03 pm GMT Oct 23,2015

A surprising question came up on OTN a couple of days ago:

Why does a query for “column = 999999999999999999” run slower than a query for “column > 999999999999999998” (that’s 18 digit numbers, if you don’t want to count them). With the equality predicate the query is very slow; with the range-based predicate performance is good.

In the absence of further information there are various reasons why this is possible – but the example in question was about a “versioning” table where the single very large value was used as the “not yet ended” value for the history of an id so, at a minimum, the table held columns (id, nstart, nend, other), and each id could appear many times with pairs of start and end values that supplied non-overlapping, covering ranges and one row that had the very large number as the end value.

Let’s jot down a few ideas about what the data (and stats) might look like.

Assuming every id appears “a few” times and every id has to have a “still valid” row, this means that a very large fraction (say 10% to 25%, if “a few” means 4 to 9) of the rows hold the value 999999999999999999.

If you gather stats without a histogram then you should get the low and high, finding that the high is 999999999999999999 and that the range is enormous, and so the predicates “column = {high value}” and “column > {high value} – 1” should give very similar cardinalities.

If you collect stats with a histogram you should find the very popular high value even in a very small sample set (which is what happens with histogram collection in 11g, and even in 12c for hybrid histograms). In this case the histogram should spot the significance of the high value and again the two predicates should have very similar cardinalities.

At first sight there doesn’t seem to be a feasible way that the two cardinalities could be sufficiently different to cause a problem – so maybe there’s something about character conversion or maybe bind variable usage that hasn’t been mentioned. So to test a couple of the less likely ideas I built a data set using 11.2.0.4 – and found a bug:


create table t1 (
        id              number(18),
        nend            number(18),
        n1              number(18),
        small_vc        varchar2(10)
)
nologging
;

insert /*+ append */ into t1
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum -1               id,
        case mod(rownum - 1,4)
                when 0  then 999999999999999999
--              when 0  then 999999999999999
                        else mod(rownum - 1, 250000)
        end                     nend,
        rownum - 1              n1,
        rpad('x',10,'x')        small_vc
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

commit;

select * from t1 where nend = 0;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size auto'
        );
end;
/

I have an nend column that is set to 999999999999999999 every 4th row in the table and otherwise has 4 rows per value for 187,500 other (relatively low) values. It’s probably a reasonable initial model of the original data. I’ve run a query with a predicate referencing nend before gathering stats so that the (default) auto option will build a histogram for nend. Then I’ve checked the execution plans for two critical queries:


explain plan for
select  *
from    t1
where   nend = 999999999999999999
;

select * from table(dbms_xplan.display);

explain plan for
select  *
from    t1
where   nend > 999999999999999998
;

select * from table(dbms_xplan.display);

And here’s the surprise – the two plans, in order:

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     4 |   108 |   625   (9)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T1   |     4 |   108 |   625   (9)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("NEND"=999999999999999999)


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250K|  6591K|   627   (9)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T1   |   250K|  6591K|   627   (9)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("NEND">999999999999999998)


Clearly this should not happen – the equality test is way off, the inequality test is correct. The obvious first guess is that something funny has happened with the statistics so let’s see what they look like – the column stats (user_tab_cols) and the histogram stats (user_tab_histograms) seem like a good starting point:


select
        sample_size, num_distinct, histogram, num_buckets,
        substr(low_value,1,26) low_value, substr(high_value,1,26) high_value
from
        user_tab_cols
where
        table_name = 'T1'
and     column_name = 'NEND'
;

column endpoint_value format 999,999,999,999,999,999,999

select
        endpoint_number, endpoint_value -- , endpoint_repeat_count
from
        user_tab_histograms
where
        table_name = 'T1'
and     column_name = 'NEND'
order by
        endpoint_number
;

And here are the results (with a couple of hundred uninteresting rows eliminated from the histogram):


      Sample     Distinct HISTOGRAM          Buckets LOW_VALUE                  HIGH_VALUE
------------ ------------ --------------- ---------- -------------------------- --------------------------
       5,541      190,176 HEIGHT BALANCED        254 C102                       C9646464646464646464


ENDPOINT_NUMBER               ENDPOINT_VALUE
--------------- ----------------------------
              0                           19
              1                        1,225
              2                        2,503
              3                        3,911
              4                        4,806
...
            188                      247,479
            189                      248,754
            190                      249,862
            254    1,000,000,000,000,000,000

Oracle will have started with a 100% sample to collect stats on all the columns, but taken a small sample to test the need for a histogram on the nend column – and that’s why the sample size of 5,541 has appeared, but that’s not relevant to the problem in hand. The big question comes from endpoint_number 254 – why is the highest value in the histogram 1e18 when we know (and the column stats show) that the highest value is actually 999999999999999999?
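
If you want to confirm what the raw high_value in the column stats represents, here's a small sketch using utl_raw.cast_to_number() to decode the internal format. The column aliases and format masks are just my choice for readability:

```sql
column low_val  format 999,999,999,999,999,999,999
column high_val format 999,999,999,999,999,999,999

select
        utl_raw.cast_to_number(low_value)       low_val,
        utl_raw.cast_to_number(high_value)      high_val
from
        user_tab_cols
where
        table_name  = 'T1'
and     column_name = 'NEND'
;
```

The raw value C9646464646464646464 should decode to the 18-digit value that was loaded into the table, not the value one higher that appears in the histogram.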

It’s a question to which I don’t have an answer – but I do know that

  • if your high value is 15 digits long (all 9s) then the histogram shows the right high value
  • if your high value is more than 15 9s then the histogram shows the high value plus 1
  • the value collected by the query that Oracle runs is the actual value (i.e. 18 9s)
  • if you use set_column_stats to set 18 9s as the high value you still get 1e18 in the histogram

Once you see the stats you can understand why the OP sees the odd performance problem. The histogram identifies 1e18 as a (very) popular value, leaving 999999999999999999 as an “average” value with only 4 rows; on the other hand, the query for values greater than 999999999999999998 can see that there really are 250K rows with higher values.
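
For reference, a quick check of the actual row counts for the two predicates (a sketch against the t1 table built above; the column aliases are mine):

```sql
select
        count(case when nend = 999999999999999999 then 1 end)  eq_rows,
        count(case when nend > 999999999999999998 then 1 end)  gt_rows
from
        t1
;
```

By construction every 4th row of the 1,000,000 holds the high value, so both counts should be 250,000, which is what the optimizer estimated for the range-based predicate but not for the equality.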

Footnote:

Interestingly 12c does the same with the stats – introducing the 1e18 in the histogram – but still manages, somehow, to calculate the correct cardinality in the equality case. (There is one slight difference in 12c: the histogram is a hybrid histogram, not a height-balanced histogram.)

Update 24th Oct 2015:

I realised late last night that I had written about this behaviour before – though possibly just as a response on OTN or in some unpublished notes. According to the notes in a script called histogram_numeric_bug.sql that I found on my laptop this is “Bug 18514507 : WRONG CARDINALITY ESTIMATES WHEN NUMERIC VALUE IS LONGER THAN 15 BYTES” (except that the base bugs that that one is linked to are probably not the same bug).

I suspect the problem is related to the way that character histograms are built based on a numeric representation of the string that takes the first few characters of the string, treats that resulting N bytes as a hex number, converts to decimal and then applies round(N,-14) to restrict the precision stored. The effect with strings is that (broadly speaking) you get fifteen digits precision – which is exactly what I seem to be seeing with numbers.
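
To illustrate the arithmetic only (this is not Oracle's internal code, and the idea that numeric endpoints are restricted to 15 significant digits is just my hypothesis), here's what rounding to 15 significant digits does to the two high values I tested:

```sql
column n15 format 999,999,999,999,999,999,999
column n18 format 999,999,999,999,999,999,999

select
        round(999999999999999,     0)   n15,    -- 15 digits: nothing to lose
        round(999999999999999999, -3)   n18     -- 18 digits: keep 15, round away 3
from
        dual
;
```

The 15-digit value survives unchanged, while the 18-digit value rounds up to 1,000,000,000,000,000,000, which matches the endpoint_value reported for endpoint_number 254.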

The reason that 12c can get the right answers despite storing the wrong endpoint_value is that it’s also storing to_char() of the right value as the endpoint_actual_value – possibly doing this any time it has had to round the endpoint_value (for character strings the endpoint_actual_value was only populated if two entries in the endpoint_value were the same).
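
On 12c you could look at the extra columns with something like the following sketch. The filter on endpoint_value is simply my way of skipping the uninteresting low-value buckets in this particular data set:

```sql
column endpoint_value        format 999,999,999,999,999,999,999
column endpoint_actual_value format a30

select
        endpoint_number, endpoint_value,
        endpoint_actual_value, endpoint_repeat_count
from
        user_tab_histograms
where
        table_name     = 'T1'
and     column_name    = 'NEND'
and     endpoint_value > 249999
order by
        endpoint_number
;
```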

 

 

 

 

September 30, 2015

Estimate_percent

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 5:10 pm GMT Sep 30,2015

Here’s a live one from OTN – here are a couple of extracts from the problem statement:

We’re experiencing an issue where it seems that the query plan changes from day to day for a particular procedure that runs once a night.
It’s resulting in a performance variance of 10 second completion time vs 20 minutes (nothing in between).
It started occurring about 2 months ago and now it’s becoming more prevalent where the bad query plan is coming up more often.
I noticed that the query plans vary for a simple query.
We do run gather statistics every night. (DBMS_STATS.GATHER_SCHEMA_STATS (ownname=>sys_context( 'userenv', 'current_schema' ), estimate_percent => 1);)

The query and two execution plans look like this:

select count(*) from cs_bucket_member_v2 where bucket_type='P' and sec_id > 0 and order_id=0;

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |     1 |    12 |   155   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE              |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| CS_BUCKET_MEMBER_V2 |  1148 | 13776 |   155   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | CS_BUCKET_MEMBER_N1 |  1272 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("BUCKET_TYPE"='P' AND "SEC_ID">0)
   3 - access("ORDER_ID"=0)


------------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |     1 |    12 | 11215   (2)| 00:01:41 |
|   1 |  SORT AGGREGATE    |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| CS_BUCKET_MEMBER_V2 |  1522K|    17M| 11215   (2)| 00:01:41 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=0 AND "SEC_ID">0 AND "BUCKET_TYPE"='P')

There are a couple of bits of information that would be useful – such as the database version, the number of rows in the table, the number of distinct values in each column, and whether any of the columns have histograms – but there are a couple of reasonable guesses that we might make about the problem. Notice particularly that the number of rows estimated from the index range scan is 1272 and only a small volume is then eliminated by the table filter predicates on sec_id and bucket_type. This suggests that the optimizer has information that tells it that most of the rows in the table have sec_id > 0 and bucket_type = 'P', and you might note that that suggests that there’s a histogram on bucket_type.
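
For what it's worth, the missing information could be collected with a few queries like these. This is a sketch that assumes the OP is connected as the owner of the table; the choice of columns to report is mine:

```sql
select banner from v$version;

select
        num_rows, blocks, sample_size, last_analyzed
from
        user_tables
where
        table_name = 'CS_BUCKET_MEMBER_V2'
;

select
        column_name, num_distinct, density, num_nulls,
        histogram, num_buckets, sample_size
from
        user_tab_col_statistics
where
        table_name  = 'CS_BUCKET_MEMBER_V2'
and     column_name in ('BUCKET_TYPE', 'SEC_ID', 'ORDER_ID')
order by
        column_name
;
```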

Rather than stating the most obvious guesses about the problem, though, I’ll start by creating a data set and emulating the problem, starting from an empty schema on 11.2.0.4:

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual
        connect by 
                level <= 1e4
)
select
        rownum                  sec_id,
        case
                when mod(rownum,1000) = 0
                        then 'X'
                        else 'P'
        end                     bucket_type,
        case
                when rownum < 1e6 - 50000 
                        then mod(rownum-1,1e5)
                        else 1000
        end                     order_id,
        lpad(rownum,10,'0')     id_vc,
        rpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create index t1_i1 on t1(order_id) nologging; 

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

The column names in the table match those needed by the query, and the bucket_type column has a very skewed distribution that will eliminate very little data; the sec_id column is also not going to eliminate data, but it’s very evenly distributed with no large gaps so not a good candidate for a histogram in any case. The order_id has 50,000 rows out of 1,000,000 (5%) set to a single value, and most of those special rows are at the end of the table – it’s a pretty good candidate for a histogram (if Oracle spots it, and if we actually write queries to access that data).
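
A simple way to see the skew in order_id is to report only the values that appear far more often than the rest. This is a sketch; the threshold of 100 rows is an arbitrary figure I've picked for this data set:

```sql
select
        order_id, count(*) row_count
from
        t1
group by
        order_id
having
        count(*) > 100
order by
        count(*) desc
;
```

This should pick out order_id = 1000 with roughly 50,000 rows, while every other value appears only a handful of times.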

I’ve run a query that references all three columns so that the default method_opt of “for all columns size auto” will apply to them when I gather stats. So here’s the code that gathers stats and checks the resulting execution plans, first for “auto_sample_size” then for the 1% used by the OP:


set autotrace traceonly explain

begin
        dbms_stats.gather_schema_stats(
/*              estimate_percent => 1, */
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

begin
        dbms_stats.gather_schema_stats(
                estimate_percent => 1,
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

set autotrace off

And here are the two plans – in the same order:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |  2333   (4)| 00:00:12 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 51063 |   598K|  2333   (4)| 00:00:12 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=1000 AND "SEC_ID">1000 AND "BUCKET_TYPE"='P')


--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |    12 |    23   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |       |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| T1    |    20 |   240 |    23   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |    20 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):  
---------------------------------------------------
   2 - filter("SEC_ID">1000 AND "BUCKET_TYPE"='P')
   3 - access("ORDER_ID"=1000)


[Update: Following on from a question in the comments, I’ve expanded this section, and wandered a little off-topic]

I don’t know why, but with a 1% sample (which really did sample 10,000 rows) the optimizer didn’t spot the need for a histogram on order_id, but with the auto_sample_size (which sampled 5,500 – yes, half as many rows) the optimizer spotted the need for the histogram. Checking the trace files the only difference visible in the sample SQL was the presence in the 1% sample of the id_vc and padding columns which were not present in the auto_sample_size trace.
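
One way to see the difference for yourself is to run a query like this after each of the two calls to gather_schema_stats and compare the histogram and sample_size columns (a sketch, the selection of columns is mine):

```sql
select
        column_name, sample_size, num_distinct,
        histogram, num_buckets
from
        user_tab_col_statistics
where
        table_name = 'T1'
order by
        column_name
;
```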

According to the manuals when the method_opt is “for all columns size auto”, then

“Oracle determines the columns on which to collect histograms based on data distribution and the workload of the columns.”

There is nothing in the manuals to suggest that there is a deliberate link between the auto_sample_size and estimate_percent, and there is room for ambiguity in how we interpret this bit of text in the manual, so the difference in the SQL used and the effects thereof require (a) some hand-waving, and/or (b) lots more experimentation. At the moment I’m prepared to go for hand-waving:

Hypothesis 1: auto_sample_size did not sample the id_vc and padding columns because the (100%) sample taken had given Oracle enough information to decide that the data distribution of those columns was not skewed enough to merit further consideration; but it sampled the three columns that had been used in a fashion that might be helped by a histogram. This sampling spotted the benefit of a histogram on order_id and bucket_type but decided that sec_id didn’t need a histogram

Hypothesis 2: the 1% sample got pretty close to the same results in its estimates of number of distinct values for id_vc and padding as the (100%) auto_sample_size, but still decided to do a sampled test for the data distribution (the manual seems to suggest that the histograms will only be considered if there has been some use of the columns in predicates, but doesn’t explicitly preclude the possibility of creating the histogram on the basis of just the data distribution). After doing the 1% sample to analyze the data for suitability of a histogram the result suggested that only the histogram on bucket_type would be beneficial. (In fact, after the first sample Oracle took a second 1% histogram sample on just the order_id before deciding that a histogram on order_id wasn’t needed.)

Bottom line on this: I don’t know if the auto_sample_size “accidentally” eliminated a couple of columns from histogram sampling and if a larger fixed sample size (say 50%, or even 100%) might result in Oracle eliminating a few columns from the histogram; or maybe the code path for histogram samples with auto_sample_size in place is actually a different code path. The only thing I can say is that the two sets of events that appeared from my demonstration don’t seem to be entirely self-consistent, but it would probably take most of a day doing experiments to narrow down the variation in behaviour to a few “best guess” ideas of what’s going on behind the scenes – though unwrapping the code might lead to a more accurate answer more quickly.

Moral

Histograms are tricky things – and you can only make things worse in 11g by NOT using the auto_sample_size.
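
In practical terms that means dropping the estimate_percent from the nightly call or, if you want to be explicit about it, something like this sketch (which assumes the job runs as the schema owner):

```sql
begin
        dbms_stats.gather_schema_stats(
                ownname          => user,
                estimate_percent => dbms_stats.auto_sample_size
        );
end;
/
```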

Footnote

Based on previous experience – my “obvious” guess about the OP’s data was that there was a special-case value for order_id, that the rows for that value were fairly well clustered, probably towards the end of the table, and constituted a small percentage of the table, and that the rest of the data reported “a few” rows per value. That’s why I built the model you see above.
