Oracle Scratchpad

December 29, 2015

Column Groups

Filed under: extended stats,Oracle,Statistics,Tuning — Jonathan Lewis @ 1:13 pm GMT Dec 29,2015

I think the “column group” variant of extended stats is a wonderful addition to the Oracle code base, but there’s a very important detail about using the feature that I hadn’t really noticed until a question came up on the OTN database forum recently about a very bad join cardinality estimate.

The point is this: if you have a multi-column equality join and the optimizer needs some help to get a better estimate of join cardinality then column group statistics may help if you create matching stats at both ends of the join. There is a variation on this directive that helps to explain why I hadn’t noticed it before – multi-column indexes (with exactly the correct columns) have the same effect and, most significantly, the combination of  one column group and a matching multi-column index will do the trick.

Here’s some code to demonstrate the effect:

create table t8
as
select
        trunc((rownum-1)/125)   n1,
        trunc((rownum-1)/125)   n2,
        rpad(rownum,180)        v1
from
        all_objects
where
        rownum <= 1000
;

create table t10
as
select
        trunc((rownum-1)/100)   n1,
        trunc((rownum-1)/100)   n2,
        rpad(rownum,180)        v1
from
        all_objects
where
        rownum <= 1000
;
begin
        dbms_stats.gather_table_stats(
                user,
                't8',
                method_opt => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                user,
                't10',
                method_opt => 'for all columns size 1'
        );
end;
/

set autotrace traceonly

select
        t8.v1, t10.v1
from
        t8,t10
where
        t10.n1 = t8.n1
and     t10.n2 = t8.n2
/

set autotrace off

Table t8 has eight distinct values for n1 and n2, and 8 combinations (though the optimizer will assume there are 64 combinations); table t10 has ten distinct values for n1 and n2, and ten combinations (though the optimizer will assume there are 100 combinations). In the absence of any column group stats (or histograms, or indexes) and with no filter predicates on either table, the join cardinality will be “{Cartesian Join cardinality} * {join selectivity}”, and in the absence of any nulls the join selectivity – thanks to the “multi-column sanity check” – will be 1/(greater number of distinct combinations). So we get 1,000,000 / 100 = 10,000.

Here’s the output from autotrace in 11.2.0.4 to prove the point:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |  3652K|    11  (10)| 00:00:01 |
|*  1 |  HASH JOIN         |      | 10000 |  3652K|    11  (10)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T8   |  1000 |   182K|     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T10  |  1000 |   182K|     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T10"."N1"="T8"."N1" AND "T10"."N2"="T8"."N2")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        835  consistent gets
          0  physical reads
          0  redo size
   19965481  bytes sent via SQL*Net to client
      73849  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed

As you can see, the query actually returns 100,000 rows. The estimate of 10,000 is badly wrong thanks to the correlation between the n1 and n2 columns. So let’s check the effect of creating a column group on t10:


begin
        dbms_stats.gather_table_stats(
                user,
                't10',
                method_opt => 'for all columns size 1 for columns (n1,n2) size 1'
        );
end;
/

At this point you might think that the optimizer’s sanity check might say something like: t8 table: 64 combinations, t10 table column group 10 combinations so use the 64 which is now the greater num_distinct. It doesn’t – maybe it will in some future version, but at present the optimizer code doesn’t seem to recognize this as a possibility. (I won’t bother to reprint the unchanged execution plan.)

But, at this point, I could create an index on t8(n1,n2) and run the query again:


create index t8_i1 on t8(n1, n2);

select
        t8.v1, t10.v1
from
        t8,t10
where
        t10.n1 = t8.n1
and     t10.n2 = t8.n2
/

Index created.


100000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 216880280

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100K|    35M|    12  (17)| 00:00:01 |
|*  1 |  HASH JOIN         |      |   100K|    35M|    12  (17)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T8   |  1000 |   182K|     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T10  |  1000 |   182K|     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T10"."N1"="T8"."N1" AND "T10"."N2"="T8"."N2")

Alternatively I could create a column group at the t8 table:



drop index t8_i1;

begin
        dbms_stats.gather_table_stats(
                user,
                't8',
                method_opt => 'for all columns size 1 for columns (n1,n2) size 1'
        );
end;
/

select  
        t8.v1, t10.v1 
from
        t8,t10
where
        t10.n1 = t8.n1
and     t10.n2 = t8.n2
/

Index dropped.


PL/SQL procedure successfully completed.


100000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 216880280

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   100K|    35M|    12  (17)| 00:00:01 |
|*  1 |  HASH JOIN         |      |   100K|    35M|    12  (17)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T8   |  1000 |   182K|     5   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T10  |  1000 |   182K|     5   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T10"."N1"="T8"."N1" AND "T10"."N2"="T8"."N2")


If you’re wondering why I’ve not picked up this “both ends” detail in the past – it’s because I’ve usually been talking about replacing indexes with column groups and my examples have probably started with indexes at both end of the join before I replaced one index with a column group. (The other examples I’ve given of column groups are typically about single-table access rather than joins.)

 

November 5, 2015

Column Groups

Filed under: CBO,extended stats,Oracle,Statistics — Jonathan Lewis @ 6:48 am GMT Nov 5,2015

I think the “column group” variant of extended stats can be amazingly useful in helping the optimizer to generate good execution plans because of the way they supply better details about cardinality; unfortunately we’ve already seen a few cases (don’t forget to check the updates and comments) where the feature is disabled, and another example of this appeared on OTN very recently.

Modifying the example from OTN to make a more convincing demonstration of the issue, here’s some SQL to prepare a demonstration:


create table t1 ( col1 number, col2 number, col3 date);

insert  into t1
select 1 ,1 ,to_date('03-Nov-2015') from dual
union all
select 1, 2, to_date('03-Nov-2015')  from dual
union all
select 1, 1, to_date('03-Nov-2015')  from dual
union all
select 2, 2, to_date('03-Nov-2015')  from dual
union all   
select 1 ,1 ,null  from dual
union all  
select 1, 1, null  from dual
union all  
select 1, 1, null  from dual
union all
select 1 ,1 ,to_date('04-Nov-2015')  from dual
union all  
select 1, 1, to_date('04-Nov-2015')  from dual
union all  
select 1, 1, to_date('04-Nov-2015')  from dual
;

begin
        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         => 'T1',
                method_opt      => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         => 'T1',
                method_opt      => 'for columns (col1, col2, col3)'
        );
end;
/

I’ve collected stats in a slightly unusual fashion because I want to make it clear that I’ve got “ordinary” stats on the table, with a histogram on the column group (col1, col2, col3). You’ll notice that this combination is a bit special – of the 10 rows in the table there are three with the values (1,1,null) and three with the values (1,1,’04-Nov-2015′), so some very clear skew to the data which results in Oracle gathering a frequency histogram on the table.

These two combinations are remarkably similar, so what happens when we execute a query to find them – since there are no indexes the plan will be a tablescan, but what will we see as the cardinality estimate ? Surely it will be the same for both combinations:


select  count(*)
from    t1
where
        col1 = 1
and     col2 = 1
and     col3 = '04-Nov-2015'
;

select  count(*)
from    t1
where
        col1 = 1
and     col2 = 1
and     col3 is null

Brief pause for thought …

and here are the execution plans, including predicate section – in the same order (from 11.2.0.4 and 12.1.0.2):


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     3 |    36 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("COL1"=1 AND "COL2"=1 AND "COL3"=TO_DATE(' 2015-11-04
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |    12 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("COL3" IS NULL AND "COL1"=1 AND "COL2"=1)

The predictions are different – the optimizer has used the histogram on the column group for the query with “col3 = to_date()”, but not for the query with “col3 is null”. That’s a bit of a shame really because there are bound to be cases where some queries would benefit enormously from having a column group used even when some of its columns are subject to “is null” tests.

Analysis

The demonstration above isn’t sufficient to prove the point, of course; it merely shows an example of a suspiciously bad estimate. Here are a few supporting details – first we show that both the NULL and the ’04-Nov-2015′ combinations do appear in the histogram. We do this by checking the column stats, the histogram stats, and the values that would be produced by the hashing function for the critical combinations:


set null "n/a"

select distinct
        col3,
        mod(sys_op_combined_hash(col1, col2, col3), 9999999999)
from    t1
where
        col3 is null
or      col3 = to_date('04-Nov-2015')
order by
        2
;

column endpoint_actual_value format a40
column column_name           format a32
column num_buckets           heading "Buckets"

select
        column_name,
        num_nulls, num_distinct, density,
        histogram, num_buckets
from
        user_tab_cols
where
        table_name = 'T1'

break on column_name skip 1

select
        column_name,
        endpoint_number, endpoint_value,
        endpoint_actual_value -- , endpoint_repeat_count
from
        user_tab_histograms
where
        table_name = 'T1'
and     column_name not like 'COL%'
order by
        table_name, column_name, endpoint_number
;

(For an explanation of the sys_op_combined_hash() function see this URL).

Here’s the output from the three queries:


COL3      MOD(SYS_OP_COMBINED_HASH(COL1,COL2,COL3),9999999999)
--------- ----------------------------------------------------
04-NOV-15                                           5347969765
n/a                                                 9928298503

COLUMN_NAME                       NUM_NULLS NUM_DISTINCT    DENSITY HISTOGRAM          Buckets
-------------------------------- ---------- ------------ ---------- --------------- ----------
COL1                                      0            2         .5 NONE                     1
COL2                                      0            2         .5 NONE                     1
COL3                                      3            2         .5 NONE                     1
SYS_STU2IZIKAO#T0YCS1GYYTTOGYE            0            5        .05 FREQUENCY                5


COLUMN_NAME                      ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
-------------------------------- --------------- -------------- ----------------------------------------
SYS_STU2IZIKAO#T0YCS1GYYTTOGYE                 1      465354344
                                               4     5347969765
                                               6     6892803587
                                               7     9853220028
                                              10     9928298503

As you can see, there’s a histogram only on the combination and Oracle has found 5 distinct values for the combination. At endpoint 4 you can see the combination that includes 4th Nov 2015 (with the interval 1 – 4 indicating a frequency of 3 rows) and at endpoint 10 you can see the combination that includes the null (again with an interval indicating 3 rows). The stats are perfect to get the job done, and yet the optimizer doesn’t seem to use them.

If we examine the optimizer trace file (event 10053) we can see concrete proof that this is the case when we examine the “Single Table Access Path” sections for the two queries – here’s a very short extract from each trace file, the first for the query with “col3 = to_date()”, the second for “col3 is null”:


ColGroup (#1, VC) SYS_STU2IZIKAO#T0YCS1GYYTTOGYE
  Col#: 1 2 3    CorStregth: 1.60
ColGroup Usage:: PredCnt: 3  Matches Full: #1  Partial:  Sel: 0.3000


ColGroup (#1, VC) SYS_STU2IZIKAO#T0YCS1GYYTTOGYE
  Col#: 1 2 3    CorStregth: 1.60
ColGroup Usage:: PredCnt: 2  Matches Full:  Partial:

Apparently “col3 is null” is not a predicate!

The column group can be used only if you have equality predicates on all the columns. This is a little sad – the only time that the sys_op_combined_hash() will return a null is (I think) when all its input are null, so there is one very special case for null handling with column groups – and even then the num_nulls for the column group would tell the optimizer what it needed to know. As it is, we have exactly the information we need to get a good cardinality estimate for the second query, but the optimizer is not going to use it.

Summary

If you create a column group to help the optimizer with cardinality calculations it will not be used for queries where any of the underlying columns is used in an “is null” predicate. This is coded into the optimizer, it doesn’t appear to be an accident.

Reference script: column_group_null.sql

November 2, 2015

Clustering_factor

Filed under: Indexing,Oracle,RAC,Statistics — Jonathan Lewis @ 10:27 am GMT Nov 2,2015

I had a recent conversation at Oracle OpenWorld 2015 about a locking anomaly in a 3-node RAC system which was causing unexpected deadlocks. Coincidentally, this conversation came about shortly after I had been listening to Martin Widlake talking about using the procedure dbms_stats.set_table_prefs() to adjust the way that Oracle calculates the clustering_factor for indexes. The juxtaposition of these two topics made me realise that the advice I had given in “Cost Based Oracle – Fundamentals” 10 years ago was (probably) incomplete, and needed some verification. The sticking point was RAC.

In my original comments about setting the “table_cached_blocks” preference (as it is now known) I has pointed out that the effect of ASSM (with its bitmap space management blocks) was to introduce a small amount of random scattering as rows were inserted by concurrent sessions and this would adversely affect the clustering_factor of any indexes on the table, so a reasonable default value for the table_cached_blocks parameter would be 16.

I had overlooked the fact that in RAC each instance tries to acquire ownership of its own level 1 (L1) bitmap block in an attempt to minimise the amount of global cache contention.  If each instance uses a different L1 bitmap block to allocate data blocks then (for tables and their partitions) they won’t be using the same data blocks for inserts, and they won’t even have to pass the bitmap blocks between instances when searching for free space. The consequence of this, though, is that if N separate instances are inserting data into a single table there are typically 16 * N different blocks into which sessions could be inserting concurrently, so the “most recent” data could be scattered across 16N blocks, which means the appropriate value table_cached_blocks is 16N.

To demonstrate the effect of RAC and multiple L1 blocks, here’s a little demonstration code from a 12c RAC database with 3 active instances.


create tablespace test_8k_assm_auto
datafile size 67108864
logging online permanent
blocksize 8192
extent management local autoallocate default
nocompress segment space management auto
;

create table t1 (n1 number, c1 char(1000)) storage (initial 8M next 8M);

The code above simply creates a tablespace using locally managed extents with system allocated extent sizes, then creates a table in that tablespace with a starting requirement of 8MB. Without this specification of initial the first few extents for the table would have been 64KB thanks to the system allocation algorithm, and that would have spoiled the demonstration because the table would have started by allocating a single extent of 64KB, with just one L1 bitmap block; slightly different effects would also have appeared with an extent size of 1MB – with 2 L1 bitmap blocks – which is the second possible extent size for system allocation.

Having created the table I connected one session to each of the three instances and inserted one row, with commit, from each instance. Then I ran a simple SQL statement to show me the file and block numbers of the rows inserted:


select
        dbms_rowid.rowid_relative_fno(rowid)    file_no,
        dbms_rowid.rowid_block_number(rowid)    block_no,
        count(*)                                rows_in_block
from
        t1
group by
        dbms_rowid.rowid_relative_fno(rowid),
        dbms_rowid.rowid_block_number(rowid)
order by
        dbms_rowid.rowid_relative_fno(rowid),
        dbms_rowid.rowid_block_number(rowid)
;


   FILE_NO   BLOCK_NO ROWS_IN_BLOCK
---------- ---------- -------------
        19        518             1
        19        745             1
        19       2157             1

As you can see, each row has gone into a separate block – more significantly, though, those blocks are a long way apart from each other – they are in completely different sets of 16 block – each instance is working with its own L1 block (there are 16 of them to choose from in an 8MB extent), and has formatted 16 blocks associated with that L1 for its own use.

In fact this simple test highlighted an anomaly that I need to investigate further. In my first test, after inserting just 3 rows into the table I found that Oracle had formatted 288 blocks (18 groups of 16) across 2 extents, far more than seems reasonable. The effect looks hugely wasteful, but that’s mainly because I’ve implied that I have a “large” table into which I’ve then inserted very little data – nevertheless something a little odd has happened. In my second test it got worse because Oracle formatted 16 blocks on the first insert,  took that up to 288 blocks on the second insert, then went up to 816 blocks (using a third extent) on the third insert; then in my third test Oracle behaved as I had assumed it ought to, formatting 3 chunks of 16 blocks each in a single extent – but that might have been because I did a truncate rather than a drop and recreate.

Summary

Whatever else is going on, the key point of this note is that if you’re trying to get Oracle to give you a better estimate for the clustering_factor in a RAC system then “16 * instance-count” is probably a good starting point for setting the table preference known as table_cached_blocks.

The anomaly of data being scattered extremely widely with more extents being allocated than you might expect is probably a boundary condition that you don’t have to worry about – until I’ve had time to look at it a little more closely.

 

October 23, 2015

Histogram Limit

Filed under: Bugs,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:03 pm GMT Oct 23,2015

A surprising question came up on OTN a couple of days ago:

Why does a query for “column = 999999999999999999” run slower than a query for “column > 999999999999999998” (that’s 18 digit numbers, if you don’t want to count them). With the equality predicate the query is very slow, with the range-based predicate perfomance is good.

In the absence of further information there are various reasons why this is possible – but the example in question was about a “versioning” table where the single very large value was used as the “not yet ended” value for the history of an id so, at a minimum, the table held columns (id, nstart, nend, other), and each id could appear many times with pairs of start and end values that supplied non-overlapping, covering ranges and one row that had the very large number as the end value.

Let’s jot down a few ideas about what the data (and stats) might look like.

Assuming every id appears “a few” time and every id has to have a “still valid” row this means that a very large fraction (say 10% to 25%, if “a few” means 4 to 9) of the rows hold the value 999999999999999999.

If you gather stats without a histogram then you should get the low and high, finding that the high is 999999999999999999 and that the range is enormous, and so the predicates “column = {high value}” and “column > {high value} – 1” should give very similar cardinalities.

If you collect stats with a histogram you should find the very popular high value even in a very small sample set (which is what happens with histogram collection in 11g, and even in 12c for hybrid histograms). In this case the histogram should spot the significance of the high value and again the two predicates should have very similar cardinalities.

At first sight there doesn’t seem to be a feasible way that the two cardinalities could be sufficiently different to cause a problem – so maybe there’s something about character conversion or maybe bind variable usage that hasn’t been mentioned. So to test a couple of the less likely ideas I built a data set using 11.2.0.4 – and found a bug:


create table t1 (
        id              number(18),
        nend            number(18),
        n1              number(18),
        small_vc        varchar2(10)
)
nologging
;

insert /*+ append */ into t1
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum -1               id,
        case mod(rownum - 1,4)
                when 0  then 999999999999999999
--              when 0  then 999999999999999
                        else mod(rownum - 1, 250000)
        end                     nend,
        rownum - 1              n1,
        rpad('x',10,'x')        small_vc
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 ; commit; select * from t1 where nend = 0; begin dbms_stats.gather_table_stats( ownname => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size auto'
        );
end;
/

I have an nend column that is set to 999999999999999999 every 4th row in the table and otherwise has 4 rows per value for 187,500 other (relatively low) values. It’s probably a reasonable initial model of the original data. I’ve run a query with a predicate referencing nend before gathering stats so that the (default) auto option will build a histogram for nend. Then I’ve checked the execution plans for two critical queries:


explain plan for
select  *
from    t1
where   nend = 999999999999999999
;

select * from table(dbms_xplan.display);

explain plan for
select  *
from    t1
where   nend > 999999999999999998
;

select * from table(dbms_xplan.display);

And here’s the surprise – the two plans, in order:

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     4 |   108 |   625   (9)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T1   |     4 |   108 |   625   (9)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("NEND"=999999999999999999)


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   250K|  6591K|   627   (9)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T1   |   250K|  6591K|   627   (9)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("NEND">999999999999999998)


Clearly this should not happen – the equality test is way off, the inequality test is correct. The obvious first guess is that something funny has happened with the statistics so let’s see what they look like – the column stats (user_tab_cols) and the histogram stats (user_tab_histograms) seem like a good starting point:


select
        sample_size, num_distinct, histogram, num_buckets,
        substr(low_value,1,26) low_value, substr(high_value,1,26) high_value
from
        user_tab_cols
where
        table_name = 'T1'
and     column_name = 'NEND'
;

column endpoint_value format 999,999,999,999,999,999,999

select
        endpoint_number, endpoint_value -- , endpoint_repeat_count
from
        user_tab_histograms
where
        table_name = 'T1'
and     column_name = 'NEND'
order by
        endpoint_number
;

And here are the results (with a couple of hundred uninteresting rows eliminated from the histogram):


      Sample     Distinct HISTOGRAM          Buckets LOW_VALUE                  HIGH_VALUE
------------ ------------ --------------- ---------- -------------------------- --------------------------
       5,541      190,176 HEIGHT BALANCED        254 C102                       C9646464646464646464


ENDPOINT_NUMBER               ENDPOINT_VALUE
--------------- ----------------------------
              0                           19
              1                        1,225
              2                        2,503
              3                        3,911
              4                        4,806
...
            188                      247,479
            189                      248,754
            190                      249,862
            254    1,000,000,000,000,000,000

Oracle will have started with a 100% sample to collect stats on all the columns, but taken a small sample to test the need for a histogram on the nend column – and that’s why the sample size of 5,541 has appeared, but that’s not relevant to the problem in hand. The big question comes from endpoint_number 254 – why is the highest value in the histogram 1e19 when we know (and the column stats show) that the highest value is actually 999999999999999999 ?!

It’s a question to which I don’t have an answer – but I do know that

  • if your high value is 15 digits long (all 9s) then the histogram shows the right high value
  • if your high value is more that 15 9s then the histogram shows the high value plus 1
  • the value collected by the query that Oracle runs is the actual value (i.e. 18 9s)
  • if you use set_column stats to set 18 9s as the high value you still get 1e19 in the histogram

Once you see the stats you can understand why the OP sees the odd performance problem. If the histogram identifies 1e19 as a (very) popular value, leaving 999999999999999 as an “average” value with only 4 rows; on the other hand the query for greater than 999999999999999998 can see that there really are 250K rows with higher values.

Footnote:

Interestingly 12c does the same with the stats – introducing the 1e19 in the histogram – but still manages, somehow, to calculate the correct cardinality in the equality case. (There is one slight difference in 12c, the histogram is a hybrid histogram, not a height-balanced histogram).

Update 24th Oct 2015:

I realised late last night that I had written about this behaviour before – though possible just as a response on OTN or in some unpublished notes. According to the notes in a script called histogram_numeric_bug.sql that I found on my laptop this is “Bug 18514507 : WRONG CARDINALITY ESTIMATES WHEN NUMERIC VALUE IS LONGER THAN 15 BYTES”. (except that the base bugs that that one is linked to are probably not the same bug).

I suspect the problem is related to the way that character histograms are built based on a numeric representation of the string that takes the first few characters of the string, treats that resulting N bytes as a hex number, converts to decimal and then applies round(N,-14) to restrict the precision stored. The effect with strings is that (broadly speaking) you get fifteen digits precision – which is exactly what I seem to be seeing with numbers.

The reason that 12c can get the right answers despite storing the wrong endpoint_value is that it’s also storing to_char() of the right value as the endpoint_actual_value – possibly doing this any time it has had to round the endpoint_value (for character strings the endpoint_actual_value was only populated if two entries in the endpoint_value were the same).

 

 

 

 

September 30, 2015

Estimate_percent

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 5:10 pm GMT Sep 30,2015

Here’s a live one from OTN – here are a couple of extracts from the problem statement:

We’re experiencing an issue where it seems that the query plan changes from day to day for a particular procedure that runs once a night.
It’s resulting in a performance variance of 10 second completion time vs 20 minutes (nothing in between).
It started occurring about 2 months ago and now it’s becoming more prevalent where the bad query plan is coming up more often.
I noticed that the query plans vary for a simple query.
We do run gather statistics every night. (DBMS_STATS.GATHER_SCHEMA_STATS (ownname=>sys_context( ‘userenv’, ‘current_schema’ ), estimate_percent => 1);)

The query and two execution plans look like this:

select count(*) from cs_bucket_member_v2 where bucket_type='P' and sec_id > 0 and order_id=0;

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |     1 |    12 |   155   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE              |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| CS_BUCKET_MEMBER_V2 |  1148 | 13776 |   155   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | CS_BUCKET_MEMBER_N1 |  1272 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("BUCKET_TYPE"='P' AND "SEC_ID">0)
   3 - access("ORDER_ID"=0)


------------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |     1 |    12 | 11215   (2)| 00:01:41 |
|   1 |  SORT AGGREGATE    |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| CS_BUCKET_MEMBER_V2 |  1522K|    17M| 11215   (2)| 00:01:41 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=0 AND "SEC_ID">0 AND "BUCKET_TYPE"='P')

There are a couple of bits of information that would be useful – such as the database version, the number of rows in the table, the number of distinct values in each column, and whether any of the columns have histograms – but there are a couple of reasonable guesses that we might make about the problem. Notice particularly that the number of rows estimated from the index ranges scan is 1272 and only a small volume is then eliminated by the table filter predicates on sec_id and bucket_type. This suggests that the optimizer has information that tells it that most of the rows in the table have sec_id > 0 and bucket_type = ‘P’, and you might note that that suggests that there’s a histogram on bucket_type.

Rather than stating the most obvious guesses about the problem, though, I’ll start by creating a data set and emulating the problem, starting from an empty schema on 11.2.0.4:

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual
        connect by 
                level <= 1e4
)
select
        rownum                  sec_id,
        case
                when mod(rownum,1000) = 0
                        then 'X'
                        else 'P'
        end                     bucket_type,
        case
                when rownum < 1e6 - 50000 
                        then mod(rownum-1,1e5)
                        else 1000
        end                     order_id,
        lpad(rownum,10,'0')     id_vc,
        rpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6

create index t1_i1 on t1(order_id) nologging; 

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

The column names in the table match those needed by the query, and the bucket_p column has a very skewed distribution that will eliminate very little data; the sec_id column is also not going to eliminate data, but it’s very evenly distributed with no large gaps so not a good candidate for a histogram in any case. The order_id has 50,000 rows out of 1,000,000 (5%) set of a single value, and most of those special rows are at the end of the table – it’s a pretty good candidate for a histogram (if Oracle spots it, and if we actually write queries to access that data).

I’ve run a query that references all three columns so that the default method_opt of “for all columns size auto” will apply to them when I gather stats. So here’s the code that gathers stats and checks the resulting execution plans, first for “auto_sample_size” then for the 1% used by the OP:


set autotrace traceonly explain

begin
        dbms_stats.gather_schema_stats(
/*              estimate_percent => 1, */
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

begin
        dbms_stats.gather_schema_stats(
                estimate_percent => 1,
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

set autotrace off

And here are the two plans – in the same order:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |  2333   (4)| 00:00:12 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 51063 |   598K|  2333   (4)| 00:00:12 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=1000 AND "SEC_ID">1000 AND "BUCKET_TYPE"='P')


--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |    12 |    23   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |       |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| T1    |    20 |   240 |    23   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |    20 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):  
---------------------------------------------------
   2 - filter("SEC_ID">1000 AND "BUCKET_TYPE"='P')
   3 - access("ORDER_ID"=1000)


[Update: Following on from a question in the comments, I’ve expanded this section, and wandered a little off-topic]

I don’t know why, but with a 1% sample (which really did sample 10,000 rows) the optimizer didn’t spot the need for a histogram on order_id, but with the auto_sample_size (which sampled 5,500 – yes, half as many rows) the optimizer spotted the need for the histogram. Checking the trace files the only difference visible in the sample SQL was the presence in the 1% sample of the id_vc and padding columns which were not present in the auto_sample_size trace.

According to the manuals when the method_opt is “for all columns size auto”, then

“Oracle determines the columns on which to collect histograms based on data distribution and the workload of the columns.”

There is nothing in the manuals to suggest that there is a deliberate link between the auto_sample_size and estimate_percent, and there is room for ambiguity in how we interpret this bit of text in the manual so the difference in the SQL used and the effects thereof requires (a) some hand-waving, and/or (b) lots more experimentation.  At the moment I’m prepared to go for hand-waving:

Hypothesis 1: auto_sample_size did not sample the id_vc and padding columns because the (100%) sample taken had given Oracle enough information to decide that the data distribution of those columns was not skewed enough to merit further consideration; but it sampled the three columns that had been used in a fashion that might be helped by a histogram. This sampling spotted the benefit of a histogram on order_id and bucket_type but decided that sec_id didn’t need a histogram

Hypothesis 2: the 1% sample got pretty close to the same results in its estimates of number of distinct values for id_vc and padding as the (100%) auto_sample_size, but still decided to do a sampled test for the data distribution (the manual seems to suggest that the histograms will only be considered if there has been some use of the columns in predicates, but doesn’t explicitly preclude the possibility of creating the histogram on the basis of just the data distribution). After doing the 1% sample to analyze the data for suitability of a histogram the result suggested that only the histogram on bucket_type would be beneficial.  (In fact, after the first sample Oracle took a second 1% histogram sample on just the order_id before deciding that it a histogram on order_id wasn’t needed.)

Bottom line on this: I don’t know if the auto_sample_size “accidentally” eliminated a couple of columns from histogram sampling and if a larger fixed sample size (say 50%, or even 100%) might result in Oracle eliminating a few columns from the histogram; or maybe the code path for histogram samples with auto_sample_size in place is actually a different code path. The only thing I can say is that the two sets of events that appeared from my demonstration don’t seem to be entirely self-consistent, but it would probably take most of a day doing experiments to narrow down the variation in behaviour to a few “best guess” ideas of what’s going on behind the scenes – though unwrapping the code might lead to a more accurate answer more quickly.

Moral

Histograms are tricky things – and you can only make things worse in 11g by NOT using the auto_sample_size.

Footnote

Based on previous experience – my “obvious” guess about the OP’s data was that there was a special-case value for order_id, that the rows for that value were fairly well clustered, probably towards the end of the table, and constituted a small percentage of the table, and that the rest of the data reported “a few” rows per value. That’s why I built the model you see above.

September 4, 2015

Histogram Tip

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:32 am GMT Sep 4,2015

I’ve just responded to the call for items for the “IOUG Quick Tips” booklet for 2015 – so it’s probably about time to post the quick tip that I put into the 2014 issue. It’s probably nothing new to most readers of the blog, but sometimes an old thing presented in a new way offers fresh insights or better comprehension.

Histogram Tips

A histogram, created in the right way, at the right time, and supported by the correct client-side code, can be a huge benefit to the optimizer; but if you don’t create and use them wisely they can easily become a source of inconsistent performance, and the automatic statistics gathering can introduce an undesirable overhead during the overnight batch. This note explains how you can create histograms very cheaply on the few columns where they are most likely to have a beneficial effect.

set_column_stats

The dbms_stats package have many procedures and functions built into it that allow us to see (get) and manipulate (set) the stored statistics; in particular it holds two functions get_column_stats() and set_column_stats(), and we can use these procedures to create a histogram very cheaply whenever we want at very low cost. Here’s an example that could be modified to suit a character column in a table where you’ve previously collected some stats. It uses a copy of all_objects, limited to exactly 10,000 rows.


create table t1 as
select
	*
from
	all_objects
where
	rownum <= 10000 ; begin dbms_stats.gather_table_stats( ownname => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 1'
	);
end;
/


declare

	m_distcnt		number;		-- num_distinct
	m_density		number;		-- density
	m_nullcnt		number;		-- num_nulls
	m_avgclen		number;		-- avg_col_len

	srec			dbms_stats.statrec;
	c_array		dbms_stats.chararray;

begin

	dbms_stats.get_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'object_type', 
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec			=> srec,
		avgclen		=> m_avgclen
	); 

	c_array		:= dbms_stats.chararray('A', 'B', 'C', 'X', 'Y');
	srec.bkvals	:= dbms_stats.numarray (  2,   2,   2, 500, 494);
--	srec.rpcnts	:= dbms_stats.numarray (  0,   0,   0,   0,   0);
	srec.epc := 5;

	dbms_stats.prepare_column_values(srec, c_array);

	m_distcnt	:= 5;
	m_density	:= 1/(5000);

	dbms_stats.set_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'object_type', 
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec			=> srec,
		avgclen		=> m_avgclen
	); 

end;
/

Key features of the code: as you can see, the two calls have identical parameters which identify the table and column name (there is an optional parameter for a (sub) partition name), and most of the basic statistics about the column. The histogram (or low and high values) are accessed through a special record type, and we can populate that record type by supplying an ordered list of values, a matching list of frequencies, and a count of how many values we have supplied.

Since my code is fixing stats on a varchar2() column I’ve declared an array of type dbms_stats.chararray to hold the list of values I want to see in a frequency histogram – there are other array types for dates, raw, number, etc. I’ve then used the structure of the stats record I had declared to hold the list of frequencies (srec.bkvals – possibly a short name for “bucket values”) and the count (srec.epc“end-point count”).

The call to dbms_stats.prepare_column_stats() takes my two arrays and massages them into the correct format for storage as a histogram that I can then write into the data dictionary with the closing call to dbms_stats.set_column_stats(). Before making that call, though, I’ve also set the “num_distinct” variable to tell the optimizer that there are 5 distinct values for the column (it makes sense, but isn’t absolutely necessary, for the num_distinct to match the number of values in the array), and set the “density” to a value that I would like the optimizer to use in it calculations if someone asks for a value that is not in my list.

I’ve included (but commented out) a line that’s relevant to the new histogram mechanisms in 12c –the srec.rpcnts (“repeat counts”) array is used in “hybrid histograms”. It’s not relevant to my example where I’m trying to create a pure frequency histogram, but if I don’t set the array I get an Oracle error: “ORA-06532: Subscript outside of limit”.

Results

There’s one important point to the method that isn’t instantly visible in the code – I created my table with 10,000 rows and there will be no nulls for object_type; but if you sum the array of frequencies it comes to exactly 1,000. This apparent contradiction is not a problem – the optimizer will compare the histogram figures to the total number of non-null entries it has recorded (in other words user_tables.num_rowsuser_tab_columns.num_nulls), and scale up the histogram accordingly. This means that a query for ‘A’ should return an estimated row count of 20 (rather than 2), ‘X’ should return 5,000 (rather than 500) and ‘D’ should return 2 (10,000 rows * 1/5000, the selectivity I had set for non-existent values).

With a little editing to save space, here’s a cut-n-paste from an SQL*Plus session running against 12c:


SQL> set autotrace traceonly explain
SQL> select * from t1 where object_type = 'A';

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    20 |  2100 |    22   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    20 |  2100 |    22   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_TYPE"='A')

SQL> select * from t1 where object_type = 'X';

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  5000 |   512K|    22   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  5000 |   512K|    22   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_TYPE"='X')

SQL> select * from t1 where object_type = 'D';

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   105 |    22   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   105 |    22   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_TYPE"='D')

Conclusion

It is very easy to create truly representative histograms (if you know your data) and the resources required to do so are minimal. If you see instability due to bad luck, or bad timing, gathering a histogram then you benefit enormously from writing code to construct histograms.

Footnote

In 12c the introduction of the “approximate NDV” strategy to collecting frequency histograms, and the introduction of the “Top-frequency” histogram has made automatic gathering of histograms on columns with a relatively small number of distinct values much faster and safer – but timing may still be an issue, and the resources needed to gather a safe hybrid histogram may still justify a hand-coded approach.

 

August 17, 2015

Index Usage

Filed under: extended stats,Indexing,Oracle,Tuning — Jonathan Lewis @ 4:25 pm GMT Aug 17,2015

The question of how to identify indexes that could be dropped re-appeared (yet again) on the OTN database forum last week. It’s not really surprising that it recurs so regularly – the problem isn’t an easy one to solve but new (and even less new) users keep hoping that there’s a quick and easy solution.

There are, however, strategies and pointers that can help you to optimise the trade-off between effort, risk, and reward. Broadly the idea is to spend a small amount of effort finding a relatively small number of “expensive” indexes that might be safe to drop, so that when you do the detailed analysis you have a good chance that the time spent will be rewarded by a positive result.

Before we get to some results posted on OTN, it’s worth thinking about the global impact and what we’re trying to achieve, and the threats that go with our attempt to achieve it.

The key detail, of course, is that index maintenance is an expensive process. We could insert 1,000 rows into a table at a cost of writing about 25 table blocks plus a few undo blocks plus something like half a megabyte of redo (assuming, for the purposes of illustration that each row is about 200 bytes on insert). Add one index to the table and we might have to locate and modify 1,000 separate index leaf blocks. The increment on the redo might be about quarter of a megabyte and we may have to access 1,000 different undo blocks for read consistency reasons, but the simple fact that we may need 1,000 buffers to be able to maintain that index is likely to be a significant extra cost on the insert. Make that 10 indexes, or 70 (as one unhappy DBA once told me) and the probability of being able to do high-speed inserts becomes rather low.

Of course we hope that our indexes will allow our queries to operate efficiently with great precision, but inevitably we get to a point where the benefit of precision is outweighed by the cost of maintenance. Our target, then, is to design the set of indexes that makes it possible for the optimizer to find good paths for all the important queries and “good enough” paths for the rest. By the time the system is live, though, it’s too late for “proper design”, and the only option is for damage limitation, a bit of guesswork, and some live testing with fingers crossed (thank goodness for invisible indexes).

The starting point is usually an attempt to identify “the indexes we are not using”, which is typically translated into “the indexes that do not appear in execution plans” – but that’s not actually a good target, for various reasons:

  • Problem 1: If we are using an index it’s possible that we shouldn’t be and that there’s an alternative index available that ought to be more efficient. A corollary to this is that if you do identify and drop such an index you may find that the optimizer doesn’t use the alternative index you were expecting it to use until you take some action to help the optimizer recognise that the alternative is a good choice.
  • Problem 2: if we aren’t using a particular index then perhaps we should be using it and would use it if we dropped one of the other indexes on the table. (And there’s always the possibility that we didn’t happen to use it during the interval we were checking but do use it at some other times)
  • Problem 3: the optimizer is capable of using information about the number of distinct keys in a multi-column index to select an executon plan even though it may not use that index in the plan it finally chooses. We may be able to work around this problem in current versions of Oracle by creating a column group (extended statistics) that matches the definition of each index we drop – but there’s a limit of 20 column groups per table (and we may have to find the “opposite end” of each join where we use the index stats and create a matching column group there).
  • Problem 4: There are some indexes we might not be using but which must exist to avoid the “foreign key locking” problem. It should be easy enough to check, before dropping an index, whether it has to exist to match a foreign key; and even then it may be possible to show that nothing in the application would cause the locking problem to appear – and as a safety measure you could disable locks on the (child) table to ensure that the application doesn’t grind to a halt because of foreign key locking problems.

Provided you remember that problems like these exist, and think carefully about the indexes that your strategy suggests, there are various ways you could approach the problem of identifying indexes that don’t get into execution plans.

v$object_usage

The ink had barely dried on the manual pages for this view before several people (including me) had written notes explaining why this view wasn’t particularly helpful. (I think I even said something about this in Practical Oracle 8i). I won’t repeat the discussion here but it revolves around the fact that an index is flagged as “used” even if it has only been used once in a single execution of a single statement – so you don’t get any idea of the real importance of the index.

v$sql_plan et. al.

If you review the set of in-memory execution plans (and the AWR or Statspack equivalents) you can identify indexes which definitely have been used – but (a) it’s expensive to scan v$sql_plan frequently and (b) the AWR/Statspack repositories only capture a subset of the more expensive plans, so it’s easy to miss indexes which have been used and are relatively important but aren’t in the repository and don’t happen to be in memory at the moments you look.

Review the definitions

If you examine the index definitions you may spot indexes where look very similar. If one index starts with the same columns, in the same order, as another index, there is a good chance that you could reduce two indexes to one – especially if the whole of one of the indexes is the “leading edge” of the other – for example:

  • (dp_datetime_date)
  • (dp_datetime_date, dp_compid)

Even if the leading edges match and the trailing edges differ we might be able to collapse two indexes into one – depending on how selective the leading columns are and how the indexes are used – for example:

  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, dp_datetime_date)
  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, pnr_cfrqsj_date)

which could perhaps be replaced by one of :

  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, dp_datetime_date, pnr_cfrqsj_date)

or

  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, pnr_cfrqsj_date, dp_datetime_date)

Guessing about the use of a typical date column, though, it’s possible that in this example the current trailing date columns are used with a range-based predicate, so it’s possible that this strategy won’t be effective for this pair of indexes.

Even if the order of later columns in the index doesn’t match you may still find that a pair of indexes could be reduced to a single index – for example the pair:

  • (dp_datetime_date, dp_compid)
  • (dp_datetime_date, ddzdt, dp_compid, ct_nxrdh, ct_smsmobilno)

which could perhaps be replaced by just:

  • (dp_datetime_date, dp_compid, ddzdt, ct_nxrdh, ct_smsmobilno)

As a safety measure, of course, you would probably create a new index, then make the subject indexes invisible, and wait for at least a week to see whether any performance problems appear (remembering that one automatic performance threat would be the increase in workload as yet another index – temporarily – has to be maintained).

The difficulty of eliminating indexes by examination is that it takes a lot of effort to investigate all the possibilities, so you really need some way of choosing a relatively small subset of indexes that might be worth the effort. This brings me to the principle topic of this posting – using segment statistics to help you pick which indexes might be worth the effort.

v$segstat / v$segment_statistics

Oracle records a number of workload statistics for each object in memory. The view v$segstat is an efficient version of these statistics, and v$segment_statistics is a friendlier version that joins v$segstat to tables user$, obj$ and ts$, with a filter against ind$ to turn meaningless numbers into names.

SQL&amp;gt; desc V$segstat
 Name                    Null?    Type
 ----------------------- -------- ----------------
 TS#                              NUMBER
 OBJ#                             NUMBER
 DATAOBJ#                         NUMBER
 STATISTIC_NAME                   VARCHAR2(64)
 STATISTIC#                       NUMBER
 VALUE                            NUMBER

SQL&amp;gt; desc V$segment_statistics
 Name                    Null?    Type
 ----------------------- -------- ----------------
 OWNER                            VARCHAR2(30)
 OBJECT_NAME                      VARCHAR2(30)
 SUBOBJECT_NAME                   VARCHAR2(30)
 TABLESPACE_NAME                  VARCHAR2(30)
 TS#                              NUMBER
 OBJ#                             NUMBER
 DATAOBJ#                         NUMBER
 OBJECT_TYPE                      VARCHAR2(18)
 STATISTIC_NAME                   VARCHAR2(64)
 STATISTIC#                       NUMBER
 VALUE                            NUMBER

For each segment Oracle records the following statistics (according to v$segstat_name – but there are a couple more hidden statistics reported in the underlying x$ksolsstat object):

NAME                             SAMPLED
-------------------------------- -------
logical reads                    YES
buffer busy waits                NO
gc buffer busy                   NO
db block changes                 YES
physical reads                   NO
physical writes                  NO
physical read requests           NO
physical write requests          NO
physical reads direct            NO
physical writes direct           NO
optimized physical reads         NO
optimized physical writes        NO
gc cr blocks received            NO
gc current blocks received       NO
ITL waits                        NO
row lock waits                   NO
space used                       NO
space allocated                  NO
segment scans                    NO

Both Statspack (at level 7) and the AWR report have several “Top N” sections for segment statistics. If we examine these stats for all the indexes on a given table we can get some clues about which indexes are likely to be worth further investigation to see if they could be dropped.

One very simple measure is the number of “physical reads” (which, for indexes, will generally be very similar to “physical read requests”). Since a (real) physical read is generally going to take a significant amount of time, segments with very large numbers of physical reads could be contributing a lot of of time to the total database time – so it’s worth knowing why it’s responsible for so many physical reads and worth cross-checking with v$sql_plan (and its historic equivalents) which statements seem to be using or modifying this index.

Even if it turns out that the index is absolutely necessary, you might still be able to spot opportunities to improve efficiency. If it is subject to a significant number of physical reads it may be that the index is just very large – could you make it smaller by rebuilding it with compression on some of the leading columns, is it an index which (for some reason you can identify) tends to degenerate over time and waste a lot of space and should you rebuild it occasionally. It might be possible (depending on the predicates used) to re-arrange the column order in such a way that the activity is focused onto a particular section of the index rather than being spread across the entire index – or you could even find that by careful choice of global partitioning (which is legal on even a non-partitioned table) you might be able to isolate the activity to a small section of the index.

A more interesting measure, though, comes from comparing the “logical reads” with the number of “db block changes”; and that’s the point of this posting – except that I’ve spent so much time on it already that I’m going to have to write part 2 some time next week.

 

July 17, 2015

Descending Indexes

Filed under: CBO,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 8:42 am GMT Jul 17,2015

I’ve written about optimizer defects with descending indexes before now but a problem came up on the OTN database forum a few days ago that made me decide to look very closely at an example where the arithmetic was clearly defective. The problem revolves around a table with two indexes, one on a date column (TH_UPDATE_TIMESTAMP) and the other a compound index which starts with the same column in descending order (TH_UPDATE_TIMESTAMP DESC, TH_TXN_CODE). The optimizer was picking the “descending” index in cases where it was clearly the wrong index (even after the statistics had been refreshed and all the usual errors relating to date-based indexes had been discounted). Here’s an execution plan from the system which shows that there’s something wrong with the optimizer:


SELECT COUNT(*) FROM TXN_HEADER WHERE TH_UPDATE_TIMESTAMP BETWEEN SYSDATE-30 AND SYSDATE;

---------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                  |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE    |                  |     1 |     8 |            |          |
|*  2 |   FILTER           |                  |       |       |            |          |
|*  3 |    INDEX RANGE SCAN| TXN_HEADER_IDX17 |  1083K|  8462K|     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

There are two clues here: first, Oracle has used the (larger) two-column index when the single column is (almost guaranteed to be) the better choice simply because it will be smaller; secondly, we have a cost of 4 to acquire 1M rowids through an (implicitly b-tree) index range scan, which would require at least 250,000 index entries per leaf block to keep the cost that low (or 2,500 if you set the optimizer_index_cost_adj to 1; so it might just be possible if you had a 32KB block size).

The OP worked hard to supply information we asked for, and to follow up any suggestions we made; in the course of this we got the information that the table had about 90M rows and this “timestamp” column had about 45M distinct values ranging from 6th Sept 2012 to 2nd July 2015 with no nulls.

Based on this information I modelled the problem in an instance of 11.2.0.4 (the OP was using 11.2.0.3).


create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                                                          id,
        to_date('06-SEP-2012 15:13:00','dd-mon-yyyy hh24:mi:ss') +
                trunc((rownum - 1 )/4) / (24 * 60)                      upd_timestamp,
        mod(rownum - 1,10)                                              txn_code,
        rpad('x',50)                                                    padding
from
        generator       v1,
        generator       v2
where
        rownum <= 4 * 1030  *  24 * 60
;

create index t1_asc on t1(upd_timestamp) nologging;
create index t1_desc on t1(upd_timestamp desc) nologging;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 1',
                cascade          => false
        );
end;
/

My data set has 4 rows per minute from 6th Sept 2012 to 3rd July 2015, with both an ascending and descending index on the upd_timestamp column. For reference, here are the statistics about the two indexes:


INDEX_NAME               SAMPLE     BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ---------- ----------- -----------------
T1_DESC                 5932800          2       17379            154559
T1_ASC                  5932800          2       15737             59825

The ascending index is smaller with a significantly better clustering_factor, so for queries where either index would be a viable option the ascending index is the one (we think) that the optimizer should choose. In passing, the 5.9M index entries is exactly the number of rows in the table – these stats were computed automatically as the indexes were created.

Here’s a simple query with execution plan (with rowsource execution stats):

select max(id) from t1 where upd_timestamp between
to_date('01-jun-2015','dd-mon-yyyy')                and
to_date('30-jun-2015','dd-mon-yyyy')

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |      1 |        |    29 (100)|      1 |00:00:01.29 |    4710 |
|   1 |  SORT AGGREGATE              |         |      1 |      1 |            |      1 |00:00:01.29 |    4710 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1      |      1 |    167K|    29   (0)|    167K|00:00:00.98 |    4710 |
|*  3 |    INDEX RANGE SCAN          | T1_DESC |      1 |    885 |     5   (0)|    167K|00:00:00.35 |     492 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."SYS_NC00005$">=HEXTORAW('878CF9E1FEF8FEFAFF')  AND
              "T1"."SYS_NC00005$"<=HEXTORAW('878CF9FEF8FEF8FF') )
       filter((SYS_OP_UNDESCEND("T1"."SYS_NC00005$")>=TO_DATE(' 2015-06-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND SYS_OP_UNDESCEND("T1"."SYS_NC00005$")<=TO_DATE(' 2015-06-30 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))

The optimizer’s index estimate of 885 rowids is a long way off the actual rowcount of 167,000 – even though I have perfect stats describing uniform data and the query is simple enough that the optimizer should be able to get a good estimate. Mind you, the predicate section looks a lot messier than you might expect, and the table estimate is correct (though not consistent with the index estimate, of course).

Here’s the plan I get when I make the descending index invisible:


--------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |  2146 (100)|      1 |00:00:01.25 |    2134 |
|   1 |  SORT AGGREGATE              |        |      1 |      1 |            |      1 |00:00:01.25 |    2134 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1     |      1 |    167K|  2146   (1)|    167K|00:00:00.94 |    2134 |
|*  3 |    INDEX RANGE SCAN          | T1_ASC |      1 |    167K|   453   (2)|    167K|00:00:00.31 |     446 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("UPD_TIMESTAMP">=TO_DATE(' 2015-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "UPD_TIMESTAMP"<=TO_DATE(' 2015-06-30 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

With the descending index invisible we can see that the cardinality estimate for the ascending index correct and notice how much higher this makes the cost. It’s not surprising that the optimizer picked the wrong index with such a difference in cost. The question now is why did the optimizer get the index cardinality (i.e. selectivity, hence cost) so badly wrong.

The answer is that the optimizer has made the same mistake that applications make by storing dates as character strings. It’s using the normal range-based calculation for the sys_op_descend() values recorded against the virtual column and has lost all understanding of the fact that these values represent dates. I can demonstrate this most easily by creating one more table, inserting a couple of rows, gathering stats, and showing you what the internal storage of a couple of “descendomg” dates looks like.


drop table t2;

create table t2 (d1 date);
create index t2_i1 on t2(d1 desc);

insert into t2 values('01-Jun-2015');
insert into t2 values('30-Jun-2015');
commit;

select d1, sys_nc00002$ from t2;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T2',
                method_opt       => 'for all columns size 1',
                cascade          => false
        );
end;
/

column endpoint_value format 999,999,999,999,999,999,999,999,999,999,999,999
break on table_name skip 1

select
        table_name, column_name, endpoint_number, endpoint_value, to_char(endpoint_value,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') end_hex
from
        user_tab_histograms
where
        table_name in ('T1','T2')
and     column_name like 'SYS%'
order by
        table_name, column_name, endpoint_number
;

I’ve put the dates 1st June 2015 and 30th June 2015 into the table because those were the values I used in the where clause of my query. Here are the results showing the internal representation of the (descending) index column and the stored low and high values for the virtual columns in both t1 and t2.


D1        SYS_NC00002$
--------- ------------------------
01-JUN-15 878CF9FEF8FEF8FF
30-JUN-15 878CF9E1FEF8FEFAFF

TABLE_NAME           COLUMN_NAME          ENDPOINT_NUMBER                                   ENDPOINT_VALUE END_HEX
-------------------- -------------------- --------------- ------------------------------------------------ ---------------------------------
T1                   SYS_NC00005$                       0  703,819,340,111,320,000,000,000,000,000,000,000    878CF8FCEFF30BCEBC8823F7000000
                     SYS_NC00005$                       1  703,880,027,955,346,000,000,000,000,000,000,000    878FF6F9EFF20256F3B404F7400000

T2                   SYS_NC00002$                       0  703,819,411,001,549,000,000,000,000,000,000,000    878CF9E1FEF8F24C1C7CED46200000
                     SYS_NC00002$                       1  703,819,419,969,389,000,000,000,000,000,000,000    878CF9FEF8FEEED28AC5B22A200000

If you check the t2 data (sys_nc00002$) values against the end_hex values in the histogram data you’ll see they match up to the first 6 bytes (12 digits). Oracle has done its standard processing – take the first 6 bytes of the column, covert to decimal, round to the most significant 15 digits (technically round(value, -21)), convert and store the result as hex.

So let’s do some arithmetic. The selectivity of a range scan that is in-bounds is (informally): “range we want” / “total range”. I’ve set up t2 to show us the values we will need to calculate the range we want, and I’ve reported the t1 values to allow us to calculate the total range, we just have to subtract the lower value (endpoint number 0) from the higher value for the two sys_nc0000N$ columns. So (ignoring the 21 zeros everywhere) our selectivity is:

  • (703,819,419,969,389 – 703,819,411,001,549) / ( 703,880,027,955,346 – 703,819,340,111,320) = 0.00014777
  • We have 5.9M rows in the table, so the cardinality estimate should be about: 5,932,800 * 0.00014777 = 876.69

The actual cardinality estimate was 885 – but we haven’t allowed for the exact form of the range-based predicate: we should add 1/num_distinct at both ends because the range is a closed range (greater than or EQUAL to, less than or EQUAL to) – which takes the cardinality estimate up to 884.69 which rounds to the 885 that the optimizer produced.

Conclusion

This note shows that Oracle is (at least for dates) losing information about the underlying data when dealing with the virtual columns associated with descending columns in indexes. As demonstrated that can lead to extremely bad selectivity estimates for predicates based on the original columns, and these selectivity estimates make it possible for Oracle to choose the wrong index and introduce a spuriously low cost into the calculation of the full execution plan.

Footnote

This note seems to match bug note 11072246 “Wrong Cardinality estimations for columns with DESC indexes”, but that bug is reported as fixed in 12.1, while this test case reproduces in 12.1.0.2

 

January 16, 2015

Spatial space

Filed under: Infrastructure,Oracle,Statistics,Troubleshooting — Jonathan Lewis @ 1:00 pm GMT Jan 16,2015

One thing you (ought to) learn very early on in an Oracle career is that there are always cases you haven’t previously considered. It’s a feature that is frequently the downfall of “I found it on the internet” SQL.  Here’s one (heavily paraphrased) example that appeared on the OTN database forum a few days ago:

select table_name,round((blocks*8),2)||’kb’ “size” from user_tables where table_name = ‘MYTABLE’;

select table_name,round((num_rows*avg_row_len/1024),2)||’kb’ “size” from user_tables where table_name = ‘MYTABLE’;

The result from the first query is 704 kb,  the result from the second is 25.4 kb … fragmentation, rebuild, CTAS etc. etc.

The two queries are perfectly reasonable approximations (for an 8KB block size, with pctfree of zero) for the allocated space and actual data size for a basic heap table – and since the two values here don’t come close to matching it’s perfectly reasonable to consider doing something like a rebuild or shrink space to reclaim space and (perhaps) to improve performance.

In this case it doesn’t look as if the space reclaimed is likely to be huge (less than 1MB), on the other hand it’s probably not going to take much time to rebuild such a tiny table; it doesn’t seem likely that the rebuild could make a significant difference to performance (though apparently it did), but the act of rebuilding might cause execution plans to change for the better because new statistics might appear as the rebuild took place. The figures came from a test system, though, so maybe the table on the production system was much larger and the impact would be greater.

Being cautious about wasting time and introducing risk, I made a few comments about the question –  and learned that one of the columns was of type SDO_GEOMETRY. This makes a big difference about what to do next, because dbms_stats.gather_table_stats() doesn’t process such columns correctly, which results in a massive under-estimate for the avg_row_len (which is basically the sum of avg_col_len for the table). Here’s an example (run on 12c, based on some code taken from the 10gR2 manuals):


drop table cola_markets purge;

CREATE TABLE cola_markets (
  mkt_id NUMBER,
  name VARCHAR2(32),
  shape SDO_GEOMETRY);

INSERT INTO cola_markets VALUES(
  1,
  'cola_a',
  SDO_GEOMETRY(
    2003,  -- two-dimensional polygon
    NULL,
    NULL,
    SDO_ELEM_INFO_ARRAY(1,1003,3), -- one rectangle (1003 = exterior)
    SDO_ORDINATE_ARRAY(1,1, 5,7) -- only 2 points needed to
          -- define rectangle (lower left and upper right) with
          -- Cartesian-coordinate data
  )
);

insert into cola_markets select * from cola_markets;
/
/
/
/
/
/
/
/
/

execute dbms_stats.gather_table_stats(user,'cola_markets')
select
	avg_row_len, num_rows, blocks,
	round(avg_row_len * num_rows / 7200,0) expected_blocks
from user_tables where table_name = 'COLA_MARKETS';

analyze table cola_markets compute statistics;
select
	avg_row_len, num_rows, blocks,
	round(avg_row_len * num_rows / 7200,0) expected_blocks
from user_tables where table_name = 'COLA_MARKETS';

If you care to count the number of times I execute the “insert as select” it’s 10, so the table ends up with 2^10 = 1024 rows. The 7,200 in the calculated column converts bytes to approximate blocks on the assumption of 8KB blocks and pctfree = 10. Here are the results following the two different methods for generating object statistics:


PL/SQL procedure successfully completed.

AVG_ROW_LEN   NUM_ROWS     BLOCKS EXPECTED_BLOCKS
----------- ---------- ---------- ---------------
         14       1024        124               2

Table analyzed.

AVG_ROW_LEN   NUM_ROWS     BLOCKS EXPECTED_BLOCKS
----------- ---------- ---------- ---------------
        109       1024        124              16

Where does the difference in Expected_blocks come from ? (The Blocks figures is 124 because I’ve used 1MB uniform extents – 128 block – under ASSM (which means 4 space management blocks at the start of the first extent.)

Here are the column lengths after the call to dbms_stats: as you can see the avg_row_len is the sum of avg_col_len.


select column_name, data_type, avg_col_len
from   user_tab_cols
where  table_name = 'COLA_MARKETS'
order by
        column_id
;

COLUMN_NAME          DATA_TYPE                AVG_COL_LEN
-------------------- ------------------------ -----------
MKT_ID               NUMBER                             3
NAME                 VARCHAR2                           7
SYS_NC00010$         SDO_ORDINATE_ARRAY
SHAPE                SDO_GEOMETRY
SYS_NC00008$         NUMBER                             0
SYS_NC00004$         NUMBER                             4
SYS_NC00005$         NUMBER                             0
SYS_NC00006$         NUMBER                             0
SYS_NC00007$         NUMBER                             0
SYS_NC00009$         SDO_ELEM_INFO_ARRAY

The figures from the analyze command are only slightly different, but fortunately the analyze command uses the row directory pointers to calculate the actual row allocation, so picks up information about the impact of inline varrays, LOBs, etc. that the dbms_stats call might not be able to handle.


COLUMN_NAME          DATA_TYPE                AVG_COL_LEN
-------------------- ------------------------ -----------
MKT_ID               NUMBER                             2
NAME                 VARCHAR2                           6
SYS_NC00010$         SDO_ORDINATE_ARRAY
SHAPE                SDO_GEOMETRY
SYS_NC00008$         NUMBER                             1
SYS_NC00004$         NUMBER                             3
SYS_NC00005$         NUMBER                             1
SYS_NC00006$         NUMBER                             1
SYS_NC00007$         NUMBER                             1
SYS_NC00009$         SDO_ELEM_INFO_ARRAY

As a basic reminder – whenever you do anything slightly non-trivial (e.g. something you couldn’t have done in v5, say) then remember that all those dinky little script things you find on the Internet might not actually cover your particular case.

November 20, 2014

Quantum Data

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 11:30 am GMT Nov 20,2014

That’s data that isn’t there until you look for it, sort of, from the optimizer’s perspective.

Here’s some code to create a sample data set:


create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum					id,
	mod(rownum-1,200)			mod_200,
	mod(rownum-1,10000)			mod_10000,
	lpad(rownum,50)				padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 1'
	);
end;
/

Now derive the execution plans for a couple of queries noting, particularly, that we are using queries that are NOT CONSISTENT with the current state of the data (or more importantly the statistics about the data) – we’re querying outside the known range.


select * from t1 where mod_200  = 300;
select * from t1 where mod_200 >= 300;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  2462 |   151K|  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  2462 |   151K|  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=300)

SQL> select * from t1 where mod_200 >=300;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  2462 |   151K|  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  2462 |   151K|  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200">=300)

The predicted cardinality for mod_200 = 300 is the same as that for mod_200 >= 300. So, to be self-consistent, the optimizer really ought to report no rows (or a token 1 row) for any value of mod_200 greater than 300 – but it doesn’t.


SQL> select * from t1 where mod_200 = 350;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1206 | 75978 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  1206 | 75978 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=350)

SQL> select * from t1 where mod_200 =360;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   955 | 60165 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   955 | 60165 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=360)

SQL> select * from t1 where mod_200 =370;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   704 | 44352 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   704 | 44352 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=370)

SQL> select * from t1 where mod_200 =380;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   452 | 28476 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   452 | 28476 |  1246   (5)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=380)

SQL> select * from t1 where mod_200 in (350, 360, 370, 380);

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  3317 |   204K|  1275   (7)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  3317 |   204K|  1275   (7)| 00:00:07 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("MOD_200"=350 OR "MOD_200"=360 OR "MOD_200"=370 OR "MOD_200"=380)

The IN-list results are consistent with the results for the individual values – but the result for the IN-list is NOT consistent with the result for the original mod_200 >= 300. The optimizer uses a “linear decay” strategy for handling predicates that go out of range, but not in a consistent way. It seems that, as far as out-of-range, range-based predicates are concerned, the data doesn’t exist until the wave front collapses.

Footnote:

This type of anomaly COULD affect some execution plans if your statistics haven’t been engineered to avoid the problems of “out of range” traps.

November 19, 2014

Comparisons

Filed under: Histograms,humour,Oracle,Statistics — Jonathan Lewis @ 12:47 pm GMT Nov 19,2014

“You can’t compare apples with oranges.”

Oh, yes you can! The answer is 72,731,533,037,581,000,000,000,000,000,000,000.


SQL> 
SQL> create table fruit(v1 varchar2(30));
SQL> 
SQL> insert into fruit values('apples');
SQL> insert into fruit values('oranges');
SQL> commit;
SQL> 
SQL> 
SQL> begin
  2  	     dbms_stats.gather_table_stats(
  3  		     ownname	      => user,
  4  		     tabname	      =>'FRUIT',
  5  		     method_opt       => 'for all columns size 2'
  6  	     );
  7  end;
  8  /
SQL> 
SQL> select
  2  	     endpoint_number,
  3  	     endpoint_value,
  4  	     to_char(endpoint_value,'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') hex_value
  5  from
  6  	     user_tab_histograms
  7  where
  8  	     table_name = 'FRUIT'
  9  order by
 10  	     endpoint_number
 11  ;

ENDPOINT_NUMBER                                   ENDPOINT_VALUE HEX_VALUE
--------------- ------------------------------------------------ -------------------------------
              1  505,933,332,254,715,000,000,000,000,000,000,000  6170706c65731ad171a7dca6e00000
              2  578,664,865,292,296,000,000,000,000,000,000,000  6f72616e67658acc6c9dcaf5000000
SQL> 
SQL> 
SQL> 
SQL> select
  2  	     max(endpoint_value) - min(endpoint_value) diff
  3  from
  4  	     user_tab_histograms
  5  where
  6  	     table_name = 'FRUIT'
  7  ;

                                            DIFF
------------------------------------------------
  72,731,533,037,581,000,000,000,000,000,000,000
SQL> 
SQL> spool off


October 17, 2014

sreadtim

Filed under: Oracle,Statistics,System Stats,Troubleshooting — Jonathan Lewis @ 1:22 pm GMT Oct 17,2014

Here’s a question that appeared in my email a few days ago:

 

Based on the formula: “sreadtim = ioseektim + db_block_size/iotrfrspeed” sreadtim should always bigger than ioseektim.

But I just did a query on my system, find it otherwise, get confused,

SQL> SELECT * FROM SYS.AUX_STATS$;<

SNAME                          PNAME                               PVAL1 PVAL2
------------------------------ ------------------------------ ---------- --------------------
SYSSTATS_INFO                  STATUS                                    COMPLETED
SYSSTATS_INFO                  DSTART                                    10-08-2014 10:45
SYSSTATS_INFO                  DSTOP                                     10-10-2014 10:42
SYSSTATS_INFO                  FLAGS                                   1
SYSSTATS_MAIN                  CPUSPEEDNW                     680.062427
SYSSTATS_MAIN                  IOSEEKTIM                              10
SYSSTATS_MAIN                  IOTFRSPEED                           4096
SYSSTATS_MAIN                  SREADTIM                            4.716
SYSSTATS_MAIN                  MREADTIM                            2.055
SYSSTATS_MAIN                  CPUSPEED                             1077
SYSSTATS_MAIN                  MBRC                                    4
SYSSTATS_MAIN                  MAXTHR                          956634112
SYSSTATS_MAIN                  SLAVETHR                           252928

How do we explain this ?

 

This question highlights two points – one important, the other only slightly less so.

The really important point is one of interpretation.  Broadly speaking we could reasonably say that the (typical) time required to perform a single block read is made up of the (typical) seek time plus the transfer time which, using the names of the statistics above, would indeed give us the relationship: sreadtim = ioseektim + db_block_size/iotfrspeed; but we have to remember that we are thinking of a simplified model of the world. The values that we capture for sreadtim include the time it takes for a request to get from Oracle to the O/S, through the various network software and hardware layers and back again, the formula ignores those components completely and, moreover, doesn’t allow for the fact that some “reads” could actually come from one of several caches without any physical disc access taking place; similarly we should be aware that the time for an actual I/O seek would vary dramatically with the current position  of the read head, the radial position of the target block, the speed and current direction of movement of the read head, and the rotational distance to the target block. The formula is not attempting to express a physical law, it is simply expressing an approximation that we might use in a first line estimate of performance.

In fact we can see in the figures above that multi-block reads (typically of 4 blocks)  were faster than single block reads on this hardware for the duration of the sampling period – and that clearly doesn’t fit with the simple view embedded in our formula of how disc drives work.  (It’s a fairly typical effect of SANs, of course, that large read requests make the SAN software start doing predictive read-ahead, so that the next read request from Oracle may find that the SAN has already loaded the required data into its cache.)

There is, however, the second point that these figures highlight – but you have to be in the know to spot the detail: whatever the complexities introduced by SAN caching, we’re not looking at the right figures. The ioseektim and iotfrspeed shown here are the default values used by Oracle. It looks as if the user has called dbms_stats.gather_system_stats() with a 48 hour workload (8th Oct to 10th Oct), but hasn’t yet executed the procedure using the ‘noworkload’ option. Perhaps the ioseektim and iotfrspeed figures from a noworkload call would look a little more reasonable when compared with the 4.716 milliseconds of the workload single block read. There may still be a large gap between the model and the reality, but until the two sets of figures we’re using come from the same place we shouldn’t even think about comparing them.

July 4, 2014

Adjusting Histograms

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 8:32 pm GMT Jul 4,2014

This is a quick response to a question on an old blog post asking how you can adjust the high value if you’ve already got a height-balanced histogram in place. It’s possible that someone will come up with a tidier method, but this was just a quick sample I created and tested on 11.2.0.4 in a few minutes.  (Note – this is specifically for height-balanced histograms,  and it’s not appropriate for 12c which has introduced hybrid histograms that will require me to modify my “histogram faking” code a little).

rem
rem	Script:		adjust_histogram.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Jun 2014
rem	Purpose:
rem
rem	Last tested
rem		11.2.0.4
rem	Not tested
rem		12.1.0.1
rem		11.1.0.7
rem		10.2.0.5
rem	Outdated
rem		 9.2.0.8
rem		 8.1.7.4	no WITH subquery
rem
rem	Notes:
rem	Follow-on from a query on my blog about setting the high value
rem	when you have a histogram.  We could do this by hacking, or by
rem	reading the user_tab_histogram values and doing a proper prepare
rem

start setenv
set timing off

execute dbms_random.seed(0)

drop table t1;

begin
	begin		execute immediate 'purge recyclebin';
	exception	when others then null;
	end;

	begin
		dbms_stats.set_system_stats('MBRC',16);
		dbms_stats.set_system_stats('MREADTIM',10);
		dbms_stats.set_system_stats('SREADTIM',5);
		dbms_stats.set_system_stats('CPUSPEED',1000);
	exception
		when others then null;
	end;
/*
	begin		execute immediate 'begin dbms_stats.delete_system_stats; end;';
	exception	when others then null;
	end;

	begin		execute immediate 'alter session set "_optimizer_cost_model"=io';
	exception	when others then null;
	end;

	begin		execute immediate 'alter session set "_optimizer_gather_stats_on_load" = false';
	exception	when others then null;
	end;
*/

	begin		execute immediate  'begin dbms_space_admin.materialize_deferred_segments(''TEST_USER''); end;';
	exception	when others then null;
	end;

end;
/

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	trunc(sysdate,'YYYY') + trunc(dbms_random.normal * 100,1)	d1
from
	generator	v1,
	generator	v2
where
	rownum <= 1e4
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 32'
	);

end;
/

spool adjust_histogram.lst

prompt	==================
prompt	Current High Value
prompt	==================

select to_char(max(d1),'dd-Mon-yyyy hh24:mi:ss') from t1;

prompt	==============================
prompt	Initial Histogram distribution
prompt	==============================

select
	endpoint_number,
	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val,
	endpoint_value,
	lag(endpoint_value,1) over(order by endpoint_number) lagged_epv,
	endpoint_value -
		lag(endpoint_value,1) over(order by endpoint_number)  delta
from	user_tab_histograms
where
	table_name = 'T1'
and	column_name = 'D1'
;

rem
rem	Note - we can't simply overwrite the last srec.novals
rem	because that doesn't adjust the stored high_value.
rem	We have to make a call to prepare_column_values,
rem	which means we have to turn the stored histogram
rem	endpoint values into their equivalent date types.
rem

prompt	==================
prompt	Hacking the values
prompt	==================

declare

	m_distcnt		number;
	m_density		number;
	m_nullcnt		number;
	srec			dbms_stats.statrec;
	m_avgclen		number;

	d_array			dbms_stats.datearray := dbms_stats.datearray();
	ct			number;

begin

	dbms_stats.get_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'd1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	); 

	ct := 0;
	for r in (
		select	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val
		from	user_tab_histograms
		where	table_name = 'T1'
		and	column_name = 'D1'
		order by endpoint_number
	) loop

		ct := ct + 1;
		d_array.extend;
		d_array(ct) := r.d_val;
		if ct = 1 then
			srec.bkvals(ct) := 0;
		else
			srec.bkvals(ct) := 1;
		end if;

	end loop;

	d_array(ct) := to_date('30-Jun-2015','dd-mon-yyyy');

	dbms_stats.prepare_column_values(srec, d_array);

	dbms_stats.set_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'd1',
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec		=> srec,
		avgclen		=> m_avgclen
	);
end;
/

prompt	============================
prompt	Final Histogram distribution
prompt	============================

select
	endpoint_number,
	to_date(to_char(trunc(endpoint_value)),'J') + mod(endpoint_value,1) d_val,
	endpoint_value,
	lag(endpoint_value,1) over(order by endpoint_number) lagged_epv,
	endpoint_value -
		lag(endpoint_value,1) over(order by endpoint_number)  delta
from	user_tab_histograms
where
	table_name = 'T1'
and	column_name = 'D1'
;

spool off

doc

#

May 5, 2014

endpoint_value

Filed under: Histograms,Oracle,Statistics — Jonathan Lewis @ 2:58 pm GMT May 5,2014

I wrote a note a few years ago about translating the endpoint_value for histograms on character columns, and in that note I left the casual comment that you would: fiddle with this bit of code to handle multibyte character sets”. For anyone who has never found time to do that fiddling, here’s a solution.  In fact, it’s also a better solution for the original character problem than the one I originally published – I have no idea why I didn’t use the utl_raw package to simplify the code years ago.

First, a brief explanation of the problem. To store a character value in the numeric endpoint_value column of a histogram, Oracle:

  • Takes the first 15 bytes of the string (after padding the string with zeros (for varchar2) or spaces (for char))
  • Treats the result as a 15 byte (30 digit) hexadecimal number, and converts to decimal
  • Rounds to 15 significant digits (why?!) and stores the result as the endpoint_value

There are two problems with my old code when trying to convert back to a (multi-byte) character format; after converting back to a hexadecimal format it doesn’t know how many consecutive bytes should be used to make up each character of the string – this problem can be circumvented by using the utl_raw.cast_to_[n]varchar2() function. (Note: the dbms_stats package has a number of convert_raw_value() procedures which could also be used, but to take advantage of them I’d have to create a wrapper function for each procedure.)

The second problem is one that’s easy to overlook (and possibly survive with) for some time – some multi-byte characters start with a zero byte – or perhaps more than a single zero byte.  If the first character of a string that’s going into the histogram data starts with such a character then conversion to a number loses the leading zeros, which means you have to find a way of recovering the right number of zeros before doing the conversion from raw to character. The solution to this one is also fairly straightforward: we know that the endpoint_value when converted to a hexadecimal number should be 15 bytes, which means in its hex string presentation it will be 30 characters: if it’s not 30 characters than pad it on the left with zeros up to 30 characters, then call utl_raw.cast_to_[n]varchar2() on the result. Here’s a dataset to demonstrate the point (the results show came from 11.2.0.4 with the varchar2() using WE8MSWIN1252 and nvarchar2() using AL16UTF16):

create table t1
as
select
	object_type				v_type,
	cast(object_type as nvarchar2(19))	n_type
from
	all_objects
where
	rownum <= 10000
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 254'
	);
end;
/

select 
	column_name,
	endpoint_number,
	lpad(to_char(endpoint_value,'fmxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'),30,'0')
from 
	user_tab_histograms
where 
	table_name = 'T1' 
order by
	column_name desc,
	endpoint_number
;

Column                 EP no LPAD(TO_CHAR(ENDPOINT_VALUE,'F
-------------------- ------- ------------------------------
V_TYPE                    10 434c5553544566083af75039400000
                          12 434f4e5445585b0be13e9077800000
                          13 45444954494f54acac5b71e6c00000
                          65 46554e4354494bfd6e91eac9000000
                       1,218 494e444557ffec1b8cf7b386200000
                       1,284 494e44455820524115d2f593c00000
                       1,287 4f5045524154487b7a4542a0400000
                       1,459 5041434b41476016d731ed1a200000
                       1,473 50524f43454439c0c7fd90d5c00000
                       1,475 53455155454e581a4a938544400000
                       4,576 53594e4f4e59319bbd81e9d6a00000
                       5,613 5441424c45001de17dde9b57c00000
                       5,685 5441424c45204dd13d0c1786c00000
                       5,717 5441424c45204dd13d0c1786c00000
                       6,425 54595044ffffeda95d6a0f02200000
                      10,000 56494556fffffb4cec1ab3e3600000

N_TYPE                    10 0043004c00550091d5132d65900000
                          12 0043004f004e00699afbb463980000
                          13 0045004400490039662489da280000
                          65 00460055004e0032efa0cbcb200000
                       1,218 0049004e0044000ee521e524f00000
                       1,284 0049004e0044000ee521e524f00000
                       1,287 004f00500045007d7a2958e3400000
                       1,459 0050004100430082c9d92def100000
                       1,473 00500052004f003864f4c314680000
                       1,475 005300450051001bb6610d60c00000
                       4,576 00530059004e00250bbf32dc380000
                       5,613 0054004100420046e721e46a700000
                       5,685 0054004100420046e721e46a700000
                       5,717 0054004100420046e721e46a700000
                       6,425 0054005900500056f579391d680000
                      10,000 0056004900450054a090433f680000


32 rows selected.

As you can see, we’ve got a frequency histogram on both columns. If you’re good with ASCII codes you’ll be able to translate the first row for the v_type data: 43 = C, 4c = L, 55 = U, 53 = S, 54 = T, 45 = E, 66 = B, and then we run into funny characters; the ‘B’ and everything after it is part of the rounding error due to the algorithm Oracle is using. You can also check the first row for the n_type: it’s coming from a fixed width character set, 2 bytes per character: 0043 = C, 004c = L, 0055 = U, 0091 = {rounding error} – the rounding means we get much less information from the nvarchar2() histogram.

Rather than depending on knowing our code pages, though, we can get Oracle to do the translation work:


select
	column_name,
	endpoint_number,
	substr(lpad(to_char(endpoint_value,'fmxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'),30,'0'),1,30)  hex_value,
	decode(
		column_name,
			'V_TYPE',
				utl_raw.cast_to_varchar2(
					substr(lpad(to_char(endpoint_value,'fmxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'),30,'0'),1,14) 
				),
			'N_TYPE',
				utl_raw.cast_to_nvarchar2(
					substr(lpad(to_char(endpoint_value,'fmxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'),30,'0'),1,16) 
				)
	)	char_value,
	endpoint_value,
	endpoint_actual_value
from
	user_tab_histograms
where
	table_name = 'T1'
order by
	column_name desc,
	endpoint_number
;

--
--	Clear graphics mode
--

execute dbms_output.put_line(chr(15))

Column       EP no HEX_VALUE                      CHAR_VALUE                                   ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
---------- ------- ------------------------------ ---------- ------------------------------------------------ --------------------------------
V_TYPE          10 434c5553544566083af75039400000 CLUSTEf     349,432,112,834,658,000,000,000,000,000,000,000 CLUSTER
                12 434f4e5445585b0be13e9077800000 CONTEX[     349,492,405,757,772,000,000,000,000,000,000,000 CONTEXT
                13 45444954494f54acac5b71e6c00000 EDITIOT     359,653,496,833,182,000,000,000,000,000,000,000 EDITION
                65 46554e4354494bfd6e91eac9000000 FUNCTIK     365,190,985,547,816,000,000,000,000,000,000,000 FUNCTION
             1,218 494e444557ffec1b8cf7b386200000 INDEWyi     380,625,107,598,029,000,000,000,000,000,000,000 INDEX
             1,284 494e44455820524115d2f593c00000 INDEX R     380,625,107,598,182,000,000,000,000,000,000,000 INDEX PARTITION
             1,287 4f5045524154487b7a4542a0400000 OPERATH     411,819,536,792,506,000,000,000,000,000,000,000 OPERATOR
             1,459 5041434b41476016d731ed1a200000 PACKAG`     416,707,436,884,205,000,000,000,000,000,000,000 PACKAGE
             1,473 50524f43454439c0c7fd90d5c00000 PROCED9     417,053,186,114,358,000,000,000,000,000,000,000 PROCEDURE
             1,475 53455155454e581a4a938544400000 SEQUENX     432,366,569,392,218,000,000,000,000,000,000,000 SEQUENCE
             4,576 53594e4f4e59319bbd81e9d6a00000 SYNONY1     432,771,978,053,825,000,000,000,000,000,000,000 SYNONYM
             5,613 5441424c45001de17dde9b57c00000 TABLE      437,476,545,404,166,000,000,000,000,000,000,000 TABLE
             5,685 5441424c45204dd13d0c1786c00000 TABLE M     437,476,545,404,318,000,000,000,000,000,000,000 TABLE PARTITION
             5,717 5441424c45204dd13d0c1786c00000 TABLE M     437,476,545,404,318,000,000,000,000,000,000,000 TABLE SUBPARTITION
             6,425 54595044ffffeda95d6a0f02200000 TYPDyyi     437,964,430,179,117,000,000,000,000,000,000,000 TYPE
            10,000 56494556fffffb4cec1ab3e3600000 VIEVyyu     448,023,639,403,471,000,000,000,000,000,000,000 VIEW

N_TYPE          10 0043004c00550091d5132d65900000 CLU?          1,358,944,964,706,820,000,000,000,000,000,000  C L U S T E R
                12 0043004f004e00699afbb463980000 CONi          1,358,945,893,128,790,000,000,000,000,000,000  C O N T E X T
                13 0045004400490039662489da280000 EDI9          1,399,507,307,977,370,000,000,000,000,000,000  E D I T I O N
                65 00460055004e0032efa0cbcb200000 FUN2          1,419,794,978,849,800,000,000,000,000,000,000  F U N C T I O N
             1,218 0049004e0044000ee521e524f00000 IND          1,480,640,041,218,460,000,000,000,000,000,000  I N D E X
             1,284 0049004e0044000ee521e524f00000 IND          1,480,640,041,218,460,000,000,000,000,000,000  I N D E X   P A R T I T I O N
             1,287 004f00500045007d7a2958e3400000 OPE}          1,602,335,117,815,120,000,000,000,000,000,000  O P E R A T O R
             1,459 0050004100430082c9d92def100000 PAC?          1,622,612,885,134,180,000,000,000,000,000,000  P A C K A G E
             1,473 00500052004f003864f4c314680000 PRO8          1,622,618,146,436,010,000,000,000,000,000,000  P R O C E D U R E
             1,475 005300450051001bb6610d60c00000 SEQ          1,683,461,351,951,280,000,000,000,000,000,000  S E Q U E N C E
             4,576 00530059004e00250bbf32dc380000 SYN%          1,683,467,541,637,310,000,000,000,000,000,000  S Y N O N Y M
             5,613 0054004100420046e721e46a700000 TABF          1,703,742,523,544,060,000,000,000,000,000,000  T A B L E
             5,685 0054004100420046e721e46a700000 TABF          1,703,742,523,544,060,000,000,000,000,000,000  T A B L E   P A R T I T I O N
             5,717 0054004100420046e721e46a700000 TABF          1,703,742,523,544,060,000,000,000,000,000,000  T A B L E   S U B P A R T I T I
             6,425 0054005900500056f579391d680000 TYPV          1,703,749,951,250,410,000,000,000,000,000,000  T Y P E
            10,000 0056004900450054a090433f680000 VIET          1,744,309,818,645,610,000,000,000,000,000,000  V I E W

32 rows selected.

I’ve limited the raw conversion to substr(,1,14) for the v_code and substr(,1,16) for the n_code because this translates into 7 and 4 characters respectively – and basically you’re lucky if you always get 6 / 3 characters coming to the right values. I can’t show you exactly what my output was like because there were various hidden characters (line feed, backspaces, “shift” and so on) that made a bit of a mess of the results; that’s why the dbms_output.put_line(chr(15)) is there at the end of the script – I needed it to switch my terminal out of graphic mode.

I’ve included endpoint_actual_value in my output. Because “TABLE PARTITION” and “TABLE SUBPARTITION” came to the same endpoint_value Oracle captured the first 32 bytes of the actual values it had found. (Note (a) it is bytes not characters, (b) this goes up to 64 in 12c, and (c) it’s being stored in a varchar2() column, which is why the values for column n_type appear to have alternating spaces – that’s the impact of all the zeros.

 

May 4, 2014

Extended stats

Filed under: 12c,extended stats,Histograms,Oracle,Statistics — Jonathan Lewis @ 1:24 pm GMT May 4,2014

Like the recent article on deleting histograms this is another draft that I rediscovered while searching for some notes I had written on a different topic – so I’ve finally finished it off and published it.

Here’s a quirky little detail of extended stats that came up in an OTN thread earlier on this week [ed: actually 8th Jan 2014]. When you create column group stats, Oracle uses an undocumented function sys_op_combined_hash() to create a hash value, and if you gather simple stats on the column (i.e. no histogram) you can get some idea of the range of values that Oracle generates through the hash function. For example:


create table t1 as
select  1 n1, 2 n2
from dual
connect by level<=5000
union all
select  2, 1
from dual
connect by level<=5000 ; 
 
select dbms_stats.create_extended_stats(user,'t1','(n1, n2)') name from dual; 

begin 
	dbms_stats.gather_table_stats( 
		ownname		=> user,
		tabname		=>'T1',
		method_opt	=> 'for all columns size 1'
	);
end;
/

break on column_name skip 1

column column_name format a32
column endpoint_number heading "EP No."
column endpoint_value  heading "Value" format 999,999,999,999,999,999,999

select
	column_name, endpoint_number, endpoint_value
from
	user_tab_histograms
where
	table_name = 'T1'
order by
	column_name, endpoint_value
;

COLUMN_NAME                          EP No.                        Value
-------------------------------- ---------- ----------------------------
N1                                        0                            1
                                          1                            2

N2                                        0                            1
                                          1                            2

SYS_STUBZH0IHA7K$KEBJVXO5LOHAS            0      298,332,787,864,733,000
                                          1    8,095,781,421,167,520,000

I could have selected low_value and high_value from user_tab_cols, using utl_raw.cast_to_number() to display them in numeric format, but the view user_tab_histograms display the low and high as a two-bucket histogram if there is no actual histogram data for the column in the histogram (histgrm$) table.

We probably don’t need to worry about what the low and high values might be because taking hash values destroys any meaning that a range might have (the optimizer can’t use column group stats in range-based predicates, only in equality predicates). However, we might collect a frequency histogram (or Top-N histogram in 12c) on the column group because there might be some data skew in the sets of values that we need to tell the optimizer about – so let’s gather a histogram with 2 buckets on our sample data set and see what we get:


begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 2'
	);
end;
/

select
	column_name, endpoint_number, endpoint_value
from
	user_tab_histograms
where
	table_name = 'T1'
order by
	column_name, endpoint_value
;

COLUMN_NAME                          EP No.                        Value
-------------------------------- ---------- ----------------------------
N1                                     5000                            1
                                      10000                            2

N2                                     5000                            1
                                      10000                            2

SYS_STUBZH0IHA7K$KEBJVXO5LOHAS         5000                1,977,102,303
                                      10000                7,894,566,276

The histogram values have changed! As one of the posters on the OTN thread points out, what Oracle has actually stored in this case is mod(sys_op_combined_hash() ,9999999999).

So far I haven’t done any exhaustive testing to work out whether the change in the strategy for storing numbers makes any difference to the typical optimizer arithmetic – but I do have at least one case (relating to “missing values” behaviour where the presence or absence of a column group histogram does make a difference to the estimated cardinality in a way that seems inconsistent with other patterns of behaviour: I doubt if it’s actually due to the change in what’s stored, and one day I may come across a client where I actually need to work out what’s going on and how best to work with the anomaly.

Bonus thought:

In 12c Oracle collects column stats automatically as it loads data into an empty table; but not if it’s got extended stats defined on it.  (This is bug number 18425876, labelled as fixed in 12.2 [ed: actually fixed in 12.1.0.2]). Here’s some code modelling a client scenario where we truncate and reload a table every day. In the first part of the demonstration I’ve loaded the table twice to show that after truncating and reloading I get new stats on the table – the first load is 10,000 rows, the second is 20,000 rows and the stats reflect this automatically. In the second part of the code, after adding a set of column group stats, truncating and loading 5,000 rows, the stats from the previous cycle are still in place. (The code is only relevant to 12c, of course)

create table t1 (n1 number, n2 number);

insert	/*+ append */
into	t1
select	object_id, data_object_id
from	all_objects
where
	rownum <= 10000
;
commit;

-- stats have appeared without a call to dbms_stats to gather them.

select blocks, num_rows from user_tables where table_name = 'T1';
select column_name, num_distinct, num_nulls, density, low_value, high_value from user_tab_cols where table_name = 'T1';

truncate table t1;

insert	/*+ append */
into	t1
select	object_id, data_object_id
from	all_objects
where
	rownum <= 20000
;
commit;

-- Stats now show the latest data 

select blocks, num_rows from user_tables where table_name = 'T1';
select column_name, num_distinct, num_nulls, density, low_value, high_value from user_tab_cols where table_name = 'T1';

-- Add a column group to the stats 

select dbms_stats.create_extended_stats(user,'t1','(n1, n2)') name from dual;

truncate table t1;

insert	/*+ append */
into	t1
select	object_id, data_object_id
from	all_objects
where
	rownum <= 5000
;
commit;

-- The stats have not been updated to reflect the new data, and the column group stats are empty

select blocks, num_rows from user_tables where table_name = 'T1';
select column_name, num_distinct, num_nulls, density, low_value, high_value from user_tab_cols where table_name = 'T1';

--------------------------------------------------------
-- Here are the three consecutive sets of results
--------------------------------------------------------

    BLOCKS   NUM_ROWS
---------- ----------
        16      10000

COLUMN_NAME                      NUM_DISTINCT  NUM_NULLS    DENSITY LOW_VALUE                  HIGH_VALUE
-------------------------------- ------------ ---------- ---------- -------------------------- --------------------------
N1                                      10000          0      .0001 C103                       C3020C60
N2                                       2534       7429 .000394633 C103                       C30B2929

    BLOCKS   NUM_ROWS
---------- ----------
        32      20000

COLUMN_NAME                      NUM_DISTINCT  NUM_NULLS    DENSITY LOW_VALUE                  HIGH_VALUE
-------------------------------- ------------ ---------- ---------- -------------------------- --------------------------
N1                                      20000          0     .00005 C103                       C30A4553
N2                                       3115      16848 .000321027 C103                       C30B2929

    BLOCKS   NUM_ROWS
---------- ----------
        32      20000

COLUMN_NAME                      NUM_DISTINCT  NUM_NULLS    DENSITY LOW_VALUE                  HIGH_VALUE
-------------------------------- ------------ ---------- ---------- -------------------------- --------------------------
N1                                      20000          0     .00005 C103                       C30A4553
N2                                       3115      16848 .000321027 C103                       C30B2929
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS

The workaround given in the bug is “add the extended stats after loading the table” – but if you’re constantly truncating and reloading that means you have to drop and add the extended stats and do a tablescan to gather the column group stats every time you reload.

Note: the limitation applies whether you create a column group, “ordinary” extended stats, a virtual column, or an implicit virtual column underlying a function-based index.

Just as a little aside – when I first wrote the demo script I forgot to put in the commit; after the insert/append – which meant I was trying to create column group stats on a table which should have given me Oracle error: “ORA-12838: cannot read/modify an object after modifying it in parallel”; instead this has been trapped by the dbms_stats package and shows up as a slightly confusing:


select dbms_stats.create_extended_stats(user,'t1','(n1, n2)') name from dual
       *
ERROR at line 1:
ORA-20001: Error when processing extension -  resource busy and acquire with NOWAIT specified or timeout expired
ORA-06512: at "SYS.DBMS_STATS", line 12977
ORA-06512: at "SYS.DBMS_STATS", line 44967
ORA-06512: at "SYS.DBMS_STATS", line 44986

Next Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 5,934 other followers