Oracle Scratchpad

October 9, 2015

PL/SQL Functions

Filed under: 12c,Oracle,Performance — Jonathan Lewis @ 6:17 pm GMT Oct 9,2015

Assuming everything else has been tuned to perfection, what’s the best you can do while calling PL/SQL functions from SQL ? Here’s a little code to create a table with some data, and a function that we can use to start an investigation:


create table t1
nologging
pctfree 0
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1,1001))        n1
from
        generator       v1,
        generator       v2
where
        rownum <= 3e5
;

create or replace function func_normal(i_in number)
return number
as
begin
        return round(exp(ln(i_in)));
end;
/

That’s 300,000 rows in the table and a silly little function to use up some CPU to get nowhere. There are 1,000 distinct integer values scattered uniformly through the table, and the function returns the value it’s called with – but it does it a hard way.

Here’s some test code – with “set timing on” – followed by the results:


select count(distinct(n1)) from t1;

select 
        count(distinct(func_normal(n1)))
from    t1
;



COUNT(DISTINCT(N1))
-------------------
               1000

1 row selected.

Elapsed: 00:00:00.03

COUNT(DISTINCT(FUNC_NORMAL(N1)))
--------------------------------
                            1000

1 row selected.

Elapsed: 00:00:11.39

This is running on 12.1.0.2 on a machine with a CPU speed of about 3 GHz – you may want to adjust the number of rows in the table for your own testing.

The question is, what options can we use to improve the efficiency of the calls to the PL/SQL function? One option, of course, is to use the new 12c “with PL/SQL” clause – we could embed the function in the SQL like this:


with
        function func_with (
                i_in    number
        ) return number
        is
        begin
                return round(exp(ln(i_in)));
        end;
select
        count(distinct(func_with(n1)))
from    t1
/

COUNT(DISTINCT(FUNC_WITH(N1)))
------------------------------
                          1000

1 row selected.

Elapsed: 00:00:09.77

So, for the cost of copying the function into the SQL we get roughly a 14% improvement in performance – which we could reasonably attribute to an improved efficiency in the call mechanism. There are arguments for and against copying code like this, of course, and my improvement was only about 1.6 seconds for 300,000 calls, but you may decide that the benefit of the “With-PL/SQL” method is sufficient to justify the effort.

If you don’t want to copy a PL/SQL function into the SQL, though, there is another alternative – the pragma UDF (for user-defined function) – which has the effect of reducing the cost of the so-called “context switch” as you call from SQL to PL/SQL or vice versa.

Technically I think what’s happening is that the stack formats for SQL and PL/SQL are different and the context switch is the work needed to reformat the stack as you pass from one environment to the other – by declaring the function as UDF you probably set it up to deal with the incoming SQL stack itself.

Here’s how to add the UDF pragma to the function, and the results showing the effects:


create or replace function func_udf(i_in number)
return number
as
pragma UDF;
begin
        return round(exp(ln(i_in)));
end;
/

COUNT(DISTINCT(FUNC_UDF(N1)))
-----------------------------
                         1000

1 row selected.

Elapsed: 00:00:09.55

With pragma UDF the standalone function call is slightly faster than the in-line “WITH” function. I did wonder whether the pragma UDF would make the function call slower if I simply called it from a loop in a PL/SQL block, but there didn’t seem to be any significant difference between the normal function and the UDF function.
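For reference, the loop test I’m describing could be set up along the following lines – this is just a sketch of the kind of harness I mean, not the exact script I used:

set timing on

declare
        m_n     number := 0;
begin
        -- 300,000 calls to the standalone function
        for i in 1..3e5 loop
                m_n := m_n + func_normal(mod(i,1000) + 1);
        end loop;
end;
/

declare
        m_n     number := 0;
begin
        -- 300,000 calls to the pragma UDF version
        for i in 1..3e5 loop
                m_n := m_n + func_udf(mod(i,1000) + 1);
        end loop;
end;
/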

Initially, then, it looks like UDF is faster than WITH, which is faster than basic; but there are other considerations. My sample data has only 1,000 possible inputs to the function – and Oracle has three different methods for caching that I might be able to benefit from:

  • Declaring the function as deterministic
  • Putting the function into the PL/SQL result cache
  • Modifying the SQL to take advantage of scalar subquery caching

Here’s what the function declaration looks like if I want to use the PL/SQL function cache:


create or replace function func_cached(i_in number)
return number
result_cache
as
begin
        return round(exp(ln(i_in)));
end;
/

Changing my query to use func_cached(), the query completed in 1.65 seconds – a clear winner, but can anything else get close?

To make the functions deterministic, I just have to add the word “deterministic” after the declaration of the return type:


create or replace function func_normal(i_in number) 
return number
deterministic
as
begin
        return round(exp(ln(i_in)));
end;
/

We can also add the deterministic keyword to the function defined in the WITH clause. Before reporting the results of testing the functions with determinism there’s one more strategy to consider: remove the deterministic keyword from the functions and introduce a scalar subquery into the test query, e.g.:


select
        count(distinct(select func_normal(n1) from dual))
from    t1
;

Here’s a table of results:

Method     Initial   Deterministic   Scalar Subquery
Basic        11.39            4.30              4.32
With          9.77            9.72              3.60
UDF           9.55            9.57              3.57
Cached       1.65             0.71              0.72

Summary

Before stating any conclusions it’s worth remembering that the sample data is based on a relatively small number of distinct input values. It is the repetition that allows us to benefit from things like caching and determinism. On top of that we need to consider the scale of the time-saving in light of the number of calls made (or not made).

Having said that, the PL/SQL function cache is clearly the thing that gives us the most benefit in return for a simple implementation. We should remember that the cache is “public” – i.e. stored in the SGA – and each input value takes up another bit of the public result cache: on the plus side this means that everyone else calling this function gets the benefit of our prior calculation; on the minus side it means that if we use the result cache for the wrong function we could take up a lot of space in the cache for very little benefit. Since the relevant result cache latch has no child latches it’s also possible to end up with latch contention if too many sessions take advantage of the result cache too frequently.
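If you want to keep an eye on how much of the result cache a function is using you can query a couple of the dynamic performance views – something like the following (a suggested check rather than part of the original test; the like predicate is just a guess at how the cached entries will be named):

select  name, value
from    v$result_cache_statistics
;

select  type, status, count(*), sum(block_count)
from    v$result_cache_objects
where   name like '%FUNC_CACHED%'
group by
        type, status
;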

Although we might expect a deterministic function to give us a big benefit (in the “very repetitive inputs” case), we find that the deterministic keyword has no effect (as at 12.1.0.2) in functions declared in a WITH clause or declared as standalone with pragma UDF. Fortunately scalar subquery caching (which seems to use the same hashing algorithm as the deterministic caching algorithm) still works with WITH functions or UDF functions, and (slightly surprisingly) standalone functions declared with pragma UDF seem to have a very small edge over WITH functions.

Both of the latter two approaches use a local cache with a strictly limited size. The benefit is that the locality means they won’t interfere with other sessions or end up hogging an extreme amount of a public memory area; the drawback is that the size limit (which can be adjusted with a hidden parameter) means that you can get unlucky with hash collisions and end up with extremely variable performance from day to day because of a small change in the data being processed, or even a change in the order in which an unchanged data set is processed.

Footnote

I’ll leave you with one thought. Without declaring the original function as deterministic, I ran the following query to maximise the benefit from the scalar subquery caching algorithm:


select
        count(distinct(select func_normal(n1) from dual))
from    (
        select /*+ no_eliminate_oby */  n1 from t1 order by n1
        )
;

The query completed in 0.13 seconds: the subquery was called 1,000 times (once for each distinct value – see this ancient URL), and the benefit of eliminating the function calls outweighed the cost of having to sort the data set.

Ref: 12c_function_options.sql

September 30, 2015

Estimate_percent

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 5:10 pm GMT Sep 30,2015

Here’s a live one from OTN, with a couple of extracts from the problem statement:

We’re experiencing an issue where it seems that the query plan changes from day to day for a particular procedure that runs once a night.
It’s resulting in a performance variance of 10 second completion time vs 20 minutes (nothing in between).
It started occurring about 2 months ago and now it’s becoming more prevalent where the bad query plan is coming up more often.
I noticed that the query plans vary for a simple query.
We do run gather statistics every night. (DBMS_STATS.GATHER_SCHEMA_STATS (ownname=>sys_context( ‘userenv’, ‘current_schema’ ), estimate_percent => 1);)

The query and two execution plans look like this:

select count(*) from cs_bucket_member_v2 where bucket_type='P' and sec_id > 0 and order_id=0;

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |     1 |    12 |   155   (0)| 00:00:02 |
|   1 |  SORT AGGREGATE              |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| CS_BUCKET_MEMBER_V2 |  1148 | 13776 |   155   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN          | CS_BUCKET_MEMBER_N1 |  1272 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("BUCKET_TYPE"='P' AND "SEC_ID">0)
   3 - access("ORDER_ID"=0)


------------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |     1 |    12 | 11215   (2)| 00:01:41 |
|   1 |  SORT AGGREGATE    |                     |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| CS_BUCKET_MEMBER_V2 |  1522K|    17M| 11215   (2)| 00:01:41 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=0 AND "SEC_ID">0 AND "BUCKET_TYPE"='P')

There are a couple of bits of information that would be useful – such as the database version, the number of rows in the table, the number of distinct values in each column, and whether any of the columns have histograms – but there are a couple of reasonable guesses we might make about the problem. Notice particularly that the number of rows estimated from the index range scan is 1272 and only a small volume is then eliminated by the table filter predicates on sec_id and bucket_type. This suggests that the optimizer has information telling it that most of the rows in the table have sec_id > 0 and bucket_type = ‘P’, which in turn suggests that there’s a histogram on bucket_type.

Rather than stating the most obvious guesses about the problem, though, I’ll start by creating a data set and emulating the problem, starting from an empty schema on 11.2.0.4:

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual
        connect by 
                level <= 1e4
)
select
        rownum                  sec_id,
        case
                when mod(rownum,1000) = 0
                        then 'X'
                        else 'P'
        end                     bucket_type,
        case
                when rownum < 1e6 - 50000 
                        then mod(rownum-1,1e5)
                        else 1000
        end                     order_id,
        lpad(rownum,10,'0')     id_vc,
        rpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create index t1_i1 on t1(order_id) nologging; 

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

The column names in the table match those needed by the query, and the bucket_type column has a very skewed distribution that will eliminate very little data; the sec_id column is also not going to eliminate data, but it’s very evenly distributed with no large gaps so not a good candidate for a histogram in any case. The order_id column has 50,000 rows out of 1,000,000 (5%) set to a single value, and most of those special rows are at the end of the table – it’s a pretty good candidate for a histogram (if Oracle spots it, and if we actually write queries to access that data).
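If you want to confirm the shape of the data before gathering stats a couple of quick queries will do it (these are just sanity checks I’d suggest, not part of the original test):

select  bucket_type, count(*)
from    t1
group by
        bucket_type
;

select  count(*)
from    t1
where   order_id = 1000
;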

I’ve run a query that references all three columns so that the default method_opt of “for all columns size auto” will apply to them when I gather stats. So here’s the code that gathers stats and checks the resulting execution plans, first for “auto_sample_size” then for the 1% used by the OP:


set autotrace traceonly explain

begin
        dbms_stats.gather_schema_stats(
/*              estimate_percent => 1, */
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

begin
        dbms_stats.gather_schema_stats(
                estimate_percent => 1,
                ownname          => user
        );
end;
/

select count(*) from t1 where order_id = 1000 and bucket_type = 'P' and sec_id > 1000;

set autotrace off

And here are the two plans – in the same order:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |  2333   (4)| 00:00:12 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 51063 |   598K|  2333   (4)| 00:00:12 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ORDER_ID"=1000 AND "SEC_ID">1000 AND "BUCKET_TYPE"='P')


--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |    12 |    23   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |       |     1 |    12 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| T1    |    20 |   240 |    23   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |    20 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):  
---------------------------------------------------
   2 - filter("SEC_ID">1000 AND "BUCKET_TYPE"='P')
   3 - access("ORDER_ID"=1000)


[Update: Following on from a question in the comments, I’ve expanded this section, and wandered a little off-topic]

I don’t know why, but with a 1% sample (which really did sample 10,000 rows) the optimizer didn’t spot the need for a histogram on order_id, but with the auto_sample_size (which sampled 5,500 – yes, half as many rows) the optimizer spotted the need for the histogram. Checking the trace files, the only difference visible in the sample SQL was the presence in the 1% sample of the id_vc and padding columns, which were not present in the auto_sample_size trace.
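A simple way of seeing which columns ended up with histograms after each call to gather_schema_stats() is to query the data dictionary – something like this (a suggested check, not reproduced from the original test):

select  column_name, num_distinct, num_nulls, histogram, num_buckets, sample_size
from    user_tab_cols
where   table_name = 'T1'
order by
        column_id
;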

According to the manuals when the method_opt is “for all columns size auto”, then

“Oracle determines the columns on which to collect histograms based on data distribution and the workload of the columns.”

There is nothing in the manuals to suggest that there is a deliberate link between the auto_sample_size and estimate_percent, and there is room for ambiguity in how we interpret this bit of text in the manual, so the difference in the SQL used and the effects thereof requires (a) some hand-waving, and/or (b) lots more experimentation. At the moment I’m prepared to go for hand-waving:

Hypothesis 1: auto_sample_size did not sample the id_vc and padding columns because the (100%) sample taken had given Oracle enough information to decide that the data distribution of those columns was not skewed enough to merit further consideration; but it sampled the three columns that had been used in a fashion that might be helped by a histogram. This sampling spotted the benefit of a histogram on order_id and bucket_type but decided that sec_id didn’t need a histogram.

Hypothesis 2: the 1% sample got pretty close to the same results in its estimates of number of distinct values for id_vc and padding as the (100%) auto_sample_size, but still decided to do a sampled test for the data distribution (the manual seems to suggest that the histograms will only be considered if there has been some use of the columns in predicates, but doesn’t explicitly preclude the possibility of creating the histogram on the basis of just the data distribution). After doing the 1% sample to analyze the data for suitability of a histogram, the result suggested that only the histogram on bucket_type would be beneficial.  (In fact, after the first sample Oracle took a second 1% histogram sample on just the order_id before deciding that a histogram on order_id wasn’t needed.)

Bottom line on this: I don’t know whether the auto_sample_size “accidentally” eliminated a couple of columns from histogram sampling, or whether a larger fixed sample size (say 50%, or even 100%) might result in Oracle eliminating a few columns from histogram sampling; or maybe the code path for histogram samples with auto_sample_size in place is actually a different code path. The only thing I can say is that the two sets of events that appeared from my demonstration don’t seem to be entirely self-consistent, but it would probably take most of a day doing experiments to narrow down the variation in behaviour to a few “best guess” ideas of what’s going on behind the scenes – though unwrapping the code might lead to a more accurate answer more quickly.

Moral

Histograms are tricky things – and you can only make things worse in 11g by NOT using the auto_sample_size.

Footnote

Based on previous experience – my “obvious” guess about the OP’s data was that there was a special-case value for order_id, that the rows for that value were fairly well clustered, probably towards the end of the table, and constituted a small percentage of the table, and that the rest of the data reported “a few” rows per value. That’s why I built the model you see above.

September 29, 2015

Master Notes

Filed under: Oracle — Jonathan Lewis @ 6:52 pm GMT Sep 29,2015

MoS has a number of “Master Note” documents which pop up from time to time while I’m checking for known problems or solutions; they get extended from time to time (and, conversely, link to some articles which are clearly no longer relevant to current versions). This is just a tidied up list of a few of the master notes that I’ve jotted down over time. The Data Dictionary overview appeared in my daily “Hot Topics” report today, and that prompted me to publish the list I’d got so far.

  • Oracle Learning Library (home page)
  • Oracle Online Learning advanced search (hundreds of free short videos)
  • Overview of Data Dictionary (1500058.1)
  • Privileges And Roles (1347470.1)
  • Overview of Oracle Segment Storage (1491960.1)
  • Overview of Oracle Temporary Tablespaces (1498442.1)
  • Troubleshooting Oracle Temporary Tablespaces (1524594.1)
  • Overview of Oracle Background Processes (1503146.1)
  • Troubleshooting Oracle Background Processes (1509616.1)
  • Overview for SCN issues (1503937.1)
  • Transaction management
  • Troubleshooting Database Startup/Shutdown Problems (851057.1)
  • Overview of Oracle Data Definition Language (DDL) (1501399.1)
  • Partitioning (1312352.1)
  • Parallel Execution (203238.1)
  • Table Compression 11g (1223705.1)
  • Resource Manager and DBMS_RESOURCE_MANAGER (1484302.1)
  • Materialized Views (1353040.1)
  • Collections (1147234.1)
  • Plan Stability Features (Including SQL Plan Management (SPM)) (1359841.1)
  • Flashback Technologies (1138253.1)
  • Database Vault (1195205.1)
  • Oracle Text Overview of New Features in Release 10g (Doc ID 249991.1)
  • Top 5 RAC issues (1373500.1)
  • Diagnosing ORA-600 (1092832.1)
  • Database Corruption Issues (1088018.1)

Quick Tip: If you search MoS with either of the expressions “overview of” or “Master Note” and a word describing the feature you’re researching then there’s a fair chance that you’ll find a document that catalogues a number of articles about the topic you’re interested in.

September 28, 2015

Result Cache 2

Filed under: 12c,Flashback,Oracle,Performance — Jonathan Lewis @ 8:50 am GMT Sep 28,2015

Following on from my earlier posting of problems with temporary tables and the PL/SQL result cache (a combination which the manuals warn you against) here’s another problem – again, to a large degree, self-inflicted.

Imagine you have a complex report involving a large number of financial transactions with a need to include calculations about current exchange rates. Unfortunately the rules about calculating the appropriate exchange rate for any transaction are complex and you find you have a choice between adding 6 tables with outer joins and a couple of aggregate (max) subqueries to the base query or calling a PL/SQL function to calculate the exchange rate for each row. I’m going to create an extremely simplified model of this requirement:

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        case mod(rownum,2) when 0 then 'USD' else 'GBP' end v1,
        case mod(rownum,2) when 0 then 'GBP' else 'USD' end v2
from
        generator       v1
where
        rownum <= 1e3
;

create table t2 (
        v1      varchar2(3),
        v2      varchar2(3),
        cvt     number(10,6),
        constraint t2_pk primary key(v1,v2)
)
organization index
;

insert into t2 values('USD','GBP',0);
insert into t2 values('GBP','USD',1);

commit;

create or replace function plsql_ordinary(
        from_cur        in varchar2,
        to_cur          in varchar2
)
return number is
        m_ret number;
begin
        select /*+ ordinary trace_this */
                cvt
        into    m_ret
        from    t2
        where   v1 = from_cur
        and     v2 = to_cur
        ;

        return m_ret;

end plsql_ordinary;
/

execute dbms_stats.gather_table_stats(user,'t2')

My t1 table represents the set of transactions but only has to worry about two exchange rates, the dollar/sterling and its inverse. My t2 table is an exchange rate table and I’ve loaded it with the two exchange rates I’m interested in. My function plsql_ordinary() takes two currency codes as inputs and returns the exchange rate.

Here’s the modelling query, with a little infrastructure to examine the workload. Note that this will only run on 12c because of the inline PL/SQL function I’ve used for slowing the query down.

set pagesize 20
set arraysize 6

set serveroutput off
alter system flush shared_pool;
alter session set statistics_level = all;

with
        function wait_row_n (
                i_secs          number,
                i_return        number default -1
        ) return number
        is
        begin
                dbms_lock.sleep(i_secs);
                return i_return;
        end wait_row_n;
select
        /*+ driver trace_this */
        wait_row_n(0.3,id),
        plsql_ordinary(v1,v2),
        (select /*+ scalar trace_this */ t2.cvt from t2 where t2.v1 = t1.v1 and t2.v2 = t1.v2) scalar_sub
from
        t1
where
        rownum <= 50
;

set pagesize 40

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

select  sql_id, executions, fetches, rows_processed, sql_text
from    v$sql
where   lower(sql_text) like 'select%trace_this%'
and     lower(sql_text) not like '%v$sql%'
;

The query includes a scalar subquery in the select list to get the same data as the PL/SQL function, and you’ll see the point of that in a while. Because of the arraysize and input parameters to wait_row_n() the query will produce output in batches of 6 rows roughly every two seconds for a total of about 18 seconds – which will give me plenty of time to play around in another session. Before I try to do any damage, though, let’s check the execution plan of the report and the statistics of the queries with “trace_this” in their text:


--------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |      1 |        |     50 |00:00:00.01 |      11 |
|*  1 |  INDEX UNIQUE SCAN | T2_PK |      2 |      1 |      2 |00:00:00.01 |       2 |
|*  2 |  COUNT STOPKEY     |       |      1 |        |     50 |00:00:00.01 |      11 |
|   3 |   TABLE ACCESS FULL| T1    |      1 |     50 |     50 |00:00:00.01 |      11 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."V1"=:B1 AND "T2"."V2"=:B2)
   2 - filter(ROWNUM<=50)


SQL_ID        EXECUTIONS    FETCHES ROWS_PROCESSED
------------- ---------- ---------- --------------
SQL_TEXT
-----------------------------------------------------------------------------
f1bz07bk5rbth         50         50             50
SELECT /*+ ordinary trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

Looking at operation 1 in the plan you can see that the scalar subquery has started just twice (once for each distinct combination of currency codes).
Looking at the results from v$sql you can see that the query in the PL/SQL function was executed 50 times – once for each row. Functions like this can be a massive drain of resources (typically CPU, but also through latch contention on buffer cache and shared pool latches).

Conveniently we realise that in our system the derived exchange rates change very slowly – so how about telling Oracle that the exchange rate function is deterministic (it’s nearly true) or, better still perhaps, experimenting with the PL/SQL Function Result Cache?

(Note very carefully, however, that the Database Administrators’ Manual for 11.2 makes the following comment about using the deterministic keyword with PL/SQL functions)

DETERMINISTIC

Tells the optimizer that the function returns the same value whenever it is invoked with the same parameter values (if this is not true, then specifying DETERMINISTIC causes unpredictable results).

Look carefully at that “unpredictable” – it’s true … but you might not realise it until too late. Our PL/SQL function is NOT deterministic – after all, a function that queries the database to produce a result may produce a different result every time it executes if someone keeps changing the underlying data – but we might wave our hands a bit and say that the table isn’t going to change while we’re running our report, so it’s okay to pretend it’s deterministic; we might even make it a read-only table for the duration. Similar thoughts should cross our minds about declaring a PL/SQL function to the result cache – even though the manual doesn’t say anything quite so explicitly threatening about the result cache. But what the heck, let’s just do it and see what happens:

create or replace function plsql_result_cache(
        from_cur        in varchar2,
        to_cur          in varchar2
)
return number
result_cache
is
        m_ret number;
begin
        select /*+ result cache trace_this */
                cvt
        into    m_ret
        from    t2
        where   v1 = from_cur
        and     v2 = to_cur
        ;

        return m_ret;

end plsql_result_cache;
/

create or replace function plsql_deterministic(
        from_cur        in varchar2,
        to_cur          in varchar2
)
return number
deterministic
is
        m_ret number;
begin
        select /*+ det trace_this */
                cvt
        into    m_ret
        from    t2
        where   v1 = from_cur
        and     v2 = to_cur
        ;

        return m_ret;

end plsql_deterministic;
/

...
select
        /*+ driver trace_this */
        wait_row_n(0.3,id),
        plsql_ordinary(v1,v2),
        plsql_deterministic(v1,v2),
        plsql_result_cache(v1,v2),
        (select /*+ scalar trace_this */ t2.cvt from t2 where t2.v1 = t1.v1 and t2.v2 = t1.v2) scalar_sub
from
...

All three functions returned the same set of results as the scalar subquery – and here’s the output from v$sql showing the impact of declaring a deterministic function and a result cache function (note that “result cache” is not a hint in the first statement, it’s just a convenient label):


SQL_ID        EXECUTIONS    FETCHES ROWS_PROCESSED
------------- ---------- ---------- --------------
SQL_TEXT
-----------------------------------------------------------------------------------
49buxp3gba3cg          2          2              2
SELECT /*+ result cache trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

2sh7bm59dkwhw         18         18             18
SELECT /*+ det trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

f1bz07bk5rbth         50         50             50
SELECT /*+ ordinary trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1


The simple function ran 50 times, the deterministic function ran 18 times, and the result cache function ran twice. It required just two executions to get the two distinct results needed, after which the session used the result cache rather than calling the function again.

The deterministic function only remembers its results for the duration of a single database call – in this case the fetch – so on each fetch the session has to re-populate the session’s “deterministic cache”, which takes 2 calls for each fetch, a total of 9 fetches * 2 calls = 18 calls.

Clearly, if I can get away with it safely, the PL/SQL function result cache looks like a real winner, with the deterministic option coming a close second (given that real life ought to be using a significantly larger fetch arraysize). So what could possibly go wrong ? Two things – first, the results … and if the results can go wrong there’s not much point in talking about the second thing at the moment.

My query runs for 18 seconds, I’m going to start another session while it runs and update one of the rows in the t2 table a few seconds after my long-running query starts. Here’s the SQL I’ve run, an extract from the query output, and the results from v$sql:


update  t2 set
        cvt = 2
where   v1 = 'USD' 
;

commit;


WAIT_ROW_N(0.3,ID) PLSQL_ORDINARY(V1,V2) PLSQL_DETERMINISTIC(V1,V2) PLSQL_RESULT_CACHE(V1,V2) SCALAR_SUB
------------------ --------------------- -------------------------- ------------------------- ----------
                 1                     1                          1                         1          1
                 2                     0                          0                         0          0
                 3                     1                          1                         1          1
                 4                     0                          0                         0          0
                 5                     1                          1                         1          1
                 6                     0                          0                         0          0
                 7                     1                          1                         1          1
                 8                     0                          0                         0          0
                 9                     1                          1                         1          1
                10                     2                          0                         2          0
                11                     1                          1                         1          1
                12                     2                          0                         2          0
                13                     1                          1                         1          1
                14                     2                          2                         2          0
                15                     1                          1                         1          1
                16                     2                          2                         2          0


SQL_ID        EXECUTIONS    FETCHES ROWS_PROCESSED
------------- ---------- ---------- --------------
SQL_TEXT
-----------------------------------------------------------------------------------
49buxp3gba3cg          4          4              4
SELECT /*+ result cache trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

2sh7bm59dkwhw         18         18             18
SELECT /*+ det trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

f1bz07bk5rbth         50         50             50
SELECT /*+ ordinary trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

The most important point is that we’ve got results that are not self-consistent – except for the scalar subquery results.

The SQL statements that are executed inside the PL/SQL functions do not run at the SCN of the driving query, each individual statement executes at its own starting SCN. This is an important point that is often overlooked when people write PL/SQL functions that are then called from SQL. The inline scalar subquery, on the other hand, always runs as at the start SCN of the driving query no matter how many times or how late in the lifetime of the driving query it runs.

If we examine the results we can see that the ordinary PL/SQL function and the result cache PL/SQL function changed their output the moment the commit took place (you’ll have to take my word on that, of course), but the deterministic function seemed to delay slightly. We can also see that the number of executions for the ordinary and deterministic functions didn’t change, but the result cache function doubled its executions.

Because of the way I’ve created my data and defined the function, the ordinary function executes its SQL once for every row while the deterministic function executes its SQL twice on every fetch (once for each pair of input values – though the first fetch from SQL*Plus is a special case) and then remembers the outputs for the rest of the fetch. This is why there is a delay before the deterministic function changes its output and why the change doesn’t introduce any extra calls to the SQL – it was going to re-execute for both values on the fetch starting at id 13 whatever went on around it. The result cache function, on the other hand, gets an invalidation warning the moment the other session commits, so it re-executes its SQL as many times as needed to repopulate the bits of the cache that have been discarded. That’s why we see the number of SQL calls doubling: the relevant part of the cache was identified by some sort of hash value for the statement with SQL_ID = ’49buxp3gba3cg’, so both results were discarded and reloaded even though only one of them actually had to change.
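If you want to watch the invalidation happening you could poll v$result_cache_objects (from a third session, say) around the time of the commit – something like the following, where the like predicates are just a guess at how the relevant entries will be named:

select  id, type, status, invalidations, name
from    v$result_cache_objects
where   name like '%PLSQL_RESULT_CACHE%'
or      name like '%T2%'
order by
        id
;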

Critically every execution of the recursive statements runs at the then-current SCN – so when the underlying data changes our report sees those changes, the report is not self-consistent.

Fortunately there’s something we can do about this – if we want the whole report to operate at the same SCN all we need to do is freeze our session at a specific point in time with the command “set transaction read only;”. This has to be executed as the first statement of a transaction but if we can slip it in just before we start running our report we can ensure that all our SQL statements (including the recursive ones) run as at the same SCN until we issue a commit. When I repeated the data-changing experiment after setting the transaction read only the report ran to completion showing the correct results.
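Stripped of the instrumentation, the shape of the fix is simply the following (a sketch only – the slow-down function and scalar subquery have been omitted):

commit;                         -- make sure a new transaction starts here
set transaction read only;      -- freeze the session at the current SCN

select
        plsql_ordinary(v1,v2),
        plsql_deterministic(v1,v2),
        plsql_result_cache(v1,v2)
from
        t1
where
        rownum <= 50
;

commit;                         -- release the read-only state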

But there was a significant change in the output from v$sql:


SQL_ID        EXECUTIONS    FETCHES ROWS_PROCESSED
------------- ---------- ---------- --------------
SQL_TEXT
-----------------------------------------------------------------------------------
49buxp3gba3cg         44         44             44
SELECT /*+ result cache trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

2sh7bm59dkwhw         18         18             18
SELECT /*+ det trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

f1bz07bk5rbth         50         50             50
SELECT /*+ ordinary trace_this */ CVT FROM T2 WHERE V1 = :B2 AND V2 = :B1

I did my update just after the first batch of rows had appeared on screen – notice how the result cache SQL has executed 44 times instead of 2 (or 4) times. When I set my transaction to read only it looks as if my session stopped using the result cache the moment the other session committed – and that’s a GOOD THING. If my session were able to continue using the result cache that would mean one of two things: either I would be seeing a result created by another user – which would be too new for me – or every other session would be seeing the results from my session – which would (typically) be out of date for them. The session seems to have protected itself from the risk of a result cached PL/SQL function producing inconsistent results – but the workload changed the moment another session committed a change to the data we were interested in.

At that point I stopped investigating “set transaction read only” until a couple of days later when I realised that there was one more thing I had to test – when I changed the data from another session I didn’t check to see what that session would see when it executed the cached function, so I modified the code run by the other session to do the following:


update  t2 set 
        cvt = 2 
where   v1 = 'USD' 
;

commit;

execute dbms_lock.sleep(6)
execute dbms_output.put_line(plsql_result_cache('USD','GBP'))

The other session goes to sleep for a while (to let the reporting session get through a little more work) and then calls the function. I was very pleased to see that the session returned the correct result – the value 2 that it had just written to the table. But what I got from the reporting session wasn’t so good:

WAIT_ROW_N(0.3,ID) PLSQL_ORDINARY(V1,V2) PLSQL_DETERMINISTIC(V1,V2) PLSQL_RESULT_CACHE(V1,V2) SCALAR_SUB
------------------ --------------------- -------------------------- ------------------------- ----------
                 1                     1                          1                         1          1
                 2                     0                          0                         0          0
                 3                     1                          1                         1          1
                 4                     0                          0                         0          0
                 5                     1                          1                         1          1
                 6                     0                          0                         0          0
                 7                     1                          1                         1          1
                 8                     0                          0                         0          0
...
                24                     0                          0                         0          0
                25                     1                          1                         1          1
                26                     0                          0                         0          0
                27                     1                          1                         1          1
                28                     0                          0                         0          0
                29                     1                          1                         1          1
                30                     0                          0                         2          0
                31                     1                          1                         1          1
                32                     0                          0                         2          0

SQL_ID        EXECUTIONS    FETCHES ROWS_PROCESSED SQL_TEXT
------------- ---------- ---------- -------------- --------------------------------
49buxp3gba3cg         32         32             32 SELECT /*+ result cache trace_th
                                                   is */ CVT FROM T2 WHERE V1 = :B2
                                                    AND V2 = :B1

49buxp3gba3cg          1          1              1 SELECT /*+ result cache trace_th
                                                   is */ CVT FROM T2 WHERE V1 = :B2
                                                    AND V2 = :B1

2sh7bm59dkwhw         18         18             18 SELECT /*+ det trace_this */ CVT
                                                    FROM T2 WHERE V1 = :B2 AND V2 = 
                                                    :B1    

f1bz07bk5rbth         50         50             50 SELECT /*+ ordinary trace_this * 
                                                   / CVT FROM T2 WHERE V1 = :B2 AND
                                                    V2 = :B1

I changed t2 just after the first batch of rows had appeared (just after id 6), then called the function a few seconds later – and as I called the function from the other session it queried the data (the second child to 49buxp3gba3cg, executed just once above) and reloaded the result cache. At that moment (visible at id 30) the first session found data in the result cache and stopped re-executing its queries. When my session went read only it protected other sessions from the past by NOT re-populating the result cache as it ran its queries – but if it found something in the result cache it used it (notice how it has recorded 32 executions of the query rather than 44 – it found something in the result cache on the last 12 calls of the function). The protection only goes one way.

Observations

Using PL/SQL functions in the select list to encapsulate complex calculations that query the database is not something you can do casually. You have no guarantee that you will end up with a self-consistent result unless you take action to deal with the threats introduced by concurrent activity – ideally all tables involved should be set to read-only (which became possible from 11.1 onwards [Ed: see comment below], though you can set a whole tablespace read only in earlier versions: neither strategy is commonly viable). If you decide that you can work around those threats you still have the performance threat implied by the need to do some complex work for each driving row of your result set. For a report the simple solution to consistency is to “freeze” the session SCN by setting your session (transaction) into read only mode.

Once you’ve dealt with the consistency problem you can start to address the performance problem by declaring your functions deterministic. You might be tempted to use the PL/SQL Result Cache to give you an even larger performance boost, but if you do you really have to make the table (or tablespace) read-only to be protected from read-consistency problems. The deterministic strategy may not be as dramatic in its effects as the result cache strategy but, given a reasonably sized fetch array, the performance benefit you get may be all you really need.

Whatever else you do, there’s an important underlying threat to remember. The moment you create a PL/SQL function that uses the result cache or deterministic option you have to ensure that nobody uses that function without ensuring that their code has handled the consistency threat properly. It’s too easy to forget, with the passing of time, that certain procedures have to be adopted when particular coding strategies are used.

Left as Exercise

I was planning to write a further article going into some detail about using dbms_flashback.enable_at_time(systimestamp) instead of “set transaction read only” – a mechanism that might be used to achieve the same read-consistency requirement though, perhaps, used less frequently than the older, less flexible option. I was also planning to include notes in the same article about the effect of “alter session set isolation_level = serializable” which some systems probably use to get read-consistency across multiple statements while writing results back to the database.

Both strategies run into the same issue as “set transaction read only”, with the added effect that your session (the one that has tried to “fix” its SCN) will repopulate the cache, so not only could you see newer results from other sessions in the cache; other sessions could see older results because you put them into the cache.

I’m not going to write up these demonstrations (which require fairly simple modifications to the code supplied above) as all the phenomena have been recorded as bugs on MoS (along with the GTT problem from my previous post, and a documentation bug for the PL/SQL manual to clarify the various threats):

  • Bug 21905592 : PL/SQL RESULT CACHE MUST BE BYPASSSED WHEN SESSION SCN NOT CURRENT
  • Bug 21905695 : BAD X-SESSION RESULTS W/ RESULT-CACHED PL/SQL FN AND “SET TXN READ ONLY”
  • Bug 21907155 : PL/SQL RC FN SHOWS ONE SESSION THE CONTENTS OF ANOTHER SESSION’S GTT
  • Bug 21885173 : THE LNPLS SECTION “RESTRICTIONS ON RESULT-CACHED FUNCTIONS” NEEDS RE-WRITE

Footnote

I haven’t mentioned it here, but another strategy for reducing the number of PL/SQL calls is simply to wrap the function call inside a scalar subquery, along the lines of:

select
       (select plsql_ordinary(v1, v2) from dual),
       ...

Provided you don’t have many combinations of (v1,v2) to handle, and with a little luck with Oracle’s internal hashing function, you could find that scalar subquery caching reduces your execution count from once per row to once per combination. Note that the function is the “ordinary” function, not the “fake deterministic” one; internally Oracle uses the same hashing strategy for remembering the results, but the duration of the scalar subquery cache is the statement rather than the fetch.
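In terms of the model above the driving query might be rewritten like this (a sketch only – I haven’t reproduced timing or v$sql figures for it):

with
        function wait_row_n (
                i_secs          number,
                i_return        number default -1
        ) return number
        is
        begin
                dbms_lock.sleep(i_secs);
                return i_return;
        end wait_row_n;
select
        /*+ driver trace_this */
        wait_row_n(0.3,id),
        (select plsql_ordinary(t1.v1, t1.v2) from dual) wrapped_ordinary
from
        t1
where
        rownum <= 50
/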


September 23, 2015

Forget-me-nots

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 8:17 am GMT Sep 23,2015

Here’s a little note that I drafted (according to its date stamp) in January 2013 and then forgot to post. (Which adds a little irony to the title.)

============================================================

Here’s an object lesson in (a) looking at what’s in front of you, and (b) how hard it is to remember all the details.

I ran a script today [ED: i.e. some time early Jan 2013] that I’d had no problems with in earlier versions of Oracle, but today I was running it against 11.2.0.3 for the first time, and hit a problem with autotrace:

SQL> set autotrace on
ERROR:
ORA-28002: the password will expire within 5 days

SP2-0619: Error while connecting
SP2-0611: Error enabling STATISTICS report

Anyone who reads the preceding text closely will see immediately what the problem is – but I saw the bottom line and immediately decided that I had forgotten to set up the plustrace role in this database. So I logged on as SYS and ran the script to create it ($ORACLE_HOME/sqlplus/admin/plustrce.sql) – and discovered that the role already existed. So I decided that I hadn’t granted plustrace to my test_user role, but that didn’t help; so I decided that this was clearly a case of a role that had to be assigned directly to a user, and that didn’t help.
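In other words I’d gone through something like this (a sketch, from memory, of the standard plustrace setup):

SQL> connect / as sysdba
SQL> @?/sqlplus/admin/plustrce.sql
SQL> grant plustrace to test_user;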

THEN I read the error message properly, changed my password, and everything worked as expected.

That’s part (a) of the lesson, here’s part (b): I wrote about running into the same issue nearly two years ago (only that time I had a little excuse for not spotting the problem instantly).

============================================================

September 22, 2015

Result Cache

Filed under: 12c,Bugs,Infrastructure,Oracle,Performance — Jonathan Lewis @ 10:11 am GMT Sep 22,2015

Yesterday I thought I’d spend half an hour before breakfast creating a little demonstration of a feature; some time about midnight I felt it was time to stop because I’d spent enough time chasing around a couple of bugs that produced wrong results in a variety of ways. Today’s short post is just a little warning: be VERY careful what you do with the PL/SQL result cache – if you use the results of database queries in the cache you may end up with inconsistent results in your application. Here’s one very simple example of what can go wrong, starting with a little script:


alter system flush shared_pool;

truncate table gtt1;
drop table gtt1;
create global temporary table gtt1 (n1 number)
on commit preserve rows
;

insert into gtt1 values(1);
commit;

execute dbms_stats.gather_table_stats(user,'gtt1');


create or replace function f_cache
return number 
result_cache 
-- relies_on (gtt1)
is
        m_ret number;
begin
        select  max(n1)
        into    f_cache.m_ret
        from    gtt1
        ;

        return f_cache.m_ret;
end;
/

execute dbms_output.put_line(f_cache)

Here’s the output from a session that’s just connected and run this script (the table already existed from an earlier run):

SQL> @temp

System altered.


Table truncated.


Table dropped.


Table created.


1 row created.


Commit complete.


PL/SQL procedure successfully completed.


Function created.

1

PL/SQL procedure successfully completed.

SQL> 

Let’s call this session A, and I’m going to connect with another session which I’ll call session B. The following extracts are cut-and-pastes as I alternate between sessions:

Session B:

SQL> execute dbms_output.put_line(f_cache);
1

PL/SQL procedure successfully completed.

SQL> insert into gtt1 values(0);

1 row created.

SQL> execute dbms_output.put_line(f_cache);
0

PL/SQL procedure successfully completed.


Session B has just seen the data inserted into a global temporary table by session A; but after inserting a row into the table it now sees what it perceives to be the correct answer.

Session A:

SQL> truncate table gtt1;

Table truncated.

SQL> execute dbms_output.put_line(f_cache);
1

PL/SQL procedure successfully completed.

Despite truncating the table, session A still sees the data which has been eliminated unrecoverably.

Session B (where I hadn’t yet committed):


SQL> commit;

Commit complete.

SQL> 

Session A (where I’ve done nothing new):

SQL> execute dbms_output.put_line(f_cache);


PL/SQL procedure successfully completed.

SQL> 

The row has finally “disappeared” because session B committed.

Session B (where I haven’t done anything since committing):


SQL> execute dbms_output.put_line(f_cache);


PL/SQL procedure successfully completed.

SQL> 

Session B no longer sees the data because it’s now seeing what session A has just seen.

Warning.

This is just one of several ways I managed to get surprising inconsistencies when using the (cross-session) PL/SQL Result Cache. Oracle (in 12c) is supposed to know what the cache relies on and react accordingly, but it doesn’t manage to do it properly (even if you include the relies_on clause) – if you query the database in your PL/SQL you may find strange things happen.
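If you want to see what the result cache thinks it depends on while running the demonstration you could query the relevant dynamic performance views – for example (a suggested check, not output I’ve reproduced here):

select  id, type, status, name
from    v$result_cache_objects
order by
        id
;

select  result_id, depend_id, object_no
from    v$result_cache_dependency
;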

The most important point to note in this example is that a session that wasn’t necessarily doing anything wrong got the wrong results because of the actions of another session that Oracle should have protected it from.

The good thing about this example is that it’s documented (sort of) – as it says in the manual: “to be result cached … does not reference … temporary tables ..”. So that should warn people off copying my example; on the other hand the problem I’m seeing arises because Oracle seems to be trying to use the result cache when the manual suggests it shouldn’t be.

Conclusion

Do not mix the pl/sql result cache with database queries. The cache is public but (unlike the buffer cache) it is not guaranteed to give you read-consistency.

If you think this was an unrealistic example and you don’t need to worry about it – I’ll post a couple more examples in the next couple of weeks. They’ll be slightly special cases again, but I find the Oracle world is full of special cases.

Update

This behaviour is now visible on MoS as “Bug 21907155 : PL/SQL RC FN SHOWS ONE SESSION THE CONTENTS OF ANOTHER SESSION’S GTT”

September 4, 2015

Histogram Tip

Filed under: CBO,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:32 am GMT Sep 4,2015

I’ve just responded to the call for items for the “IOUG Quick Tips” booklet for 2015 – so it’s probably about time to post the quick tip that I put into the 2014 issue. It’s probably nothing new to most readers of the blog, but sometimes an old thing presented in a new way offers fresh insights or better comprehension.

Histogram Tips

A histogram, created in the right way, at the right time, and supported by the correct client-side code, can be a huge benefit to the optimizer; but if you don’t create and use them wisely they can easily become a source of inconsistent performance, and the automatic statistics gathering can introduce an undesirable overhead during the overnight batch. This note explains how you can create histograms very cheaply on the few columns where they are most likely to have a beneficial effect.

set_column_stats

The dbms_stats package has many procedures and functions built into it that allow us to see (get) and manipulate (set) the stored statistics; in particular it holds the two procedures get_column_stats() and set_column_stats(), and we can use these to create a histogram whenever we want at very low cost. Here’s an example that could be modified to suit a character column in a table where you’ve previously collected some stats. It uses a copy of all_objects, limited to exactly 10,000 rows.


create table t1 as
select
        *
from
        all_objects
where
        rownum <= 10000
;

begin
        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         => 'T1',
                method_opt      => 'for all columns size 1'
        );
end;
/


declare

	m_distcnt		number;		-- num_distinct
	m_density		number;		-- density
	m_nullcnt		number;		-- num_nulls
	m_avgclen		number;		-- avg_col_len

	srec			dbms_stats.statrec;
	c_array		dbms_stats.chararray;

begin

	dbms_stats.get_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'object_type', 
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec			=> srec,
		avgclen		=> m_avgclen
	); 

	c_array		:= dbms_stats.chararray('A', 'B', 'C', 'X', 'Y');
	srec.bkvals	:= dbms_stats.numarray (  2,   2,   2, 500, 494);
--	srec.rpcnts	:= dbms_stats.numarray (  0,   0,   0,   0,   0);
	srec.epc := 5;

	dbms_stats.prepare_column_values(srec, c_array);

	m_distcnt	:= 5;
	m_density	:= 1/(5000);

	dbms_stats.set_column_stats(
		ownname		=> user,
		tabname		=> 't1',
		colname		=> 'object_type', 
		distcnt		=> m_distcnt,
		density		=> m_density,
		nullcnt		=> m_nullcnt,
		srec			=> srec,
		avgclen		=> m_avgclen
	); 

end;
/

Key features of the code: as you can see, the two calls have identical parameters which identify the table and column name (there is an optional parameter for a (sub) partition name), and most of the basic statistics about the column. The histogram (or low and high values) is accessed through a special record type, and we can populate that record type by supplying an ordered list of values, a matching list of frequencies, and a count of how many values we have supplied.

Since my code is fixing stats on a varchar2() column I’ve declared an array of type dbms_stats.chararray to hold the list of values I want to see in a frequency histogram – there are other array types for dates, raw, number, etc. I’ve then used the structure of the stats record I had declared to hold the list of frequencies (srec.bkvals – possibly a short name for “bucket values”) and the count of entries (srec.epc – “end-point count”).

The call to dbms_stats.prepare_column_values() takes my two arrays and massages them into the correct format for storage as a histogram that I can then write into the data dictionary with the closing call to dbms_stats.set_column_stats(). Before making that call, though, I’ve also set the “num_distinct” variable to tell the optimizer that there are 5 distinct values for the column (it makes sense, but isn’t absolutely necessary, for the num_distinct to match the number of values in the array), and set the “density” to a value that I would like the optimizer to use in its calculations if someone asks for a value that is not in my list.

I’ve included (but commented out) a line that’s relevant to the new histogram mechanisms in 12c – the srec.rpcnts (“repeat counts”) array is used in “hybrid histograms”. It’s not relevant to my example where I’m trying to create a pure frequency histogram, but if I don’t set the array I get an Oracle error: “ORA-06532: Subscript outside of limit”.
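
If you want to see what has actually been stored, a quick query against the dictionary should show the five endpoints with their cumulative frequencies – 2, 4, 6, 506 and 1,000 for ‘A’, ‘B’, ‘C’, ‘X’ and ‘Y’ respectively. This is just a sanity-check sketch, not part of the original script, and the endpoint_actual_value column may behave differently across versions for character data:


select
        endpoint_actual_value, endpoint_number
from
        user_tab_histograms
where
        table_name  = 'T1'
and     column_name = 'OBJECT_TYPE'
order by
        endpoint_number
;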

Results

There’s one important point to the method that isn’t instantly visible in the code – I created my table with 10,000 rows and there will be no nulls for object_type; but if you sum the array of frequencies it comes to exactly 1,000. This apparent contradiction is not a problem – the optimizer will compare the histogram figures to the total number of non-null entries it has recorded (in other words user_tables.num_rows – user_tab_columns.num_nulls), and scale up the histogram accordingly. This means that a query for ‘A’ should return an estimated row count of 20 (rather than 2), ‘X’ should return 5,000 (rather than 500) and ‘D’ should return 2 (10,000 rows * 1/5000, the selectivity I had set for non-existent values).
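
Just to spell out the scaling arithmetic, here’s a trivial check you could run from SQL*Plus (not part of the original script):


select
        10000 * 2/1000          est_a,          -- 'A': 2 out of 1,000, scaled to 10,000 rows
        10000 * 500/1000        est_x,          -- 'X': 500 out of 1,000, scaled to 10,000 rows
        10000 * 1/5000          est_d           -- 'D': num_rows * density for a value not in the list
from
        dual
;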

With a little editing to save space, here’s a cut-n-paste from an SQL*Plus session running against 12c:


SQL> set autotrace traceonly explain
SQL> select * from t1 where object_type = 'A';

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    20 |  2100 |    22   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    20 |  2100 |    22   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_TYPE"='A')

SQL> select * from t1 where object_type = 'X';

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  5000 |   512K|    22   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  5000 |   512K|    22   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_TYPE"='X')

SQL> select * from t1 where object_type = 'D';

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   105 |    22   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   105 |    22   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_TYPE"='D')

Conclusion

It is very easy to create truly representative histograms (if you know your data) and the resources required to do so are minimal. If you see instability due to bad luck or bad timing when gathering a histogram, then you can benefit enormously from writing code to construct the histograms yourself.

Footnote

In 12c the introduction of the “approximate NDV” strategy for collecting frequency histograms, and the introduction of the “Top-frequency” histogram, have made automatic gathering of histograms on columns with a relatively small number of distinct values much faster and safer – but timing may still be an issue, and the resources needed to gather a safe hybrid histogram may still justify a hand-coded approach.

 

September 2, 2015

IN/EXISTS bugs

Filed under: 12c,Bugs,CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 8:11 am GMT Sep 2,2015

Here’s a simple data set – I’m only interested in three of the columns in the work that follows, but it’s a data set that I use for a number of different models:


execute dbms_random.seed(0)

create table t1
nologging
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 1e4
)
select
	trunc(dbms_random.value(0,1000))	n_1000,
	trunc(dbms_random.value(0,750))		n_750,
	trunc(dbms_random.value(0,600))		n_600,
	trunc(dbms_random.value(0,400))		n_400,
	trunc(dbms_random.value(0,90))		n_90,
	trunc(dbms_random.value(0,72))		n_72,
	trunc(dbms_random.value(0,40))		n_40,
	trunc(dbms_random.value(0,3))		n_3
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;
create table t2 nologging 
as
select * from t1
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T2',
		method_opt 	 => 'for all columns size 1'
	);
end;
/

The columns I want to consider are n_3, n_400, and n_1000. As their names suggest the columns have 3, 400, and 1000 distinct values respectively and since I’ve used the dbms_random.value() function to generate the data the distinct values are fairly evenly spread across the million rows of the table.

Consider, then, the following two queries:


select
        *
from
        t1
where
        exists (
                select  null
                from    t2
                where   n_1000 = 0
                and     t2.n_400 = t1.n_400
                and     t2.n_3 = t1.n_3
        )
;


select
        *
from
        t1
where
        (t1.n_400, t1.n_3) in (
                select  t2.n_400, t2.n_3
                from    t2
                where   t2.n_1000 = 0
        )
;

The first point to check is that these two queries are logically equivalent.

Once you’re happy with that idea we can work out, informally, how many rows we should expect the queries to return: there are 1,200 combinations for (n_400, n_3) so each combination should return roughly 833 rows; if we pick 1,000 rows from the 1 million available we can expect to see 679 of those combinations (that’s Alberto Dell’Era’s “selection without replacement” formula that Oracle uses for adjusting num_distinct to allow for filter predicates). So we might reasonably estimate the final number of rows as 833 * 679 = 565,607. It turns out that that’s a pretty good estimate – when I ran the query the result was actually 567,018 rows.
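
If you want to check the arithmetic without working through the original paper, a commonly quoted approximation to the “selection without replacement” result is D * (1 - (1 - n/N)^(N/D)) for a sample of n rows taken from N rows holding D distinct values. The following query is only a sketch of that approximation, not the optimizer’s exact code, but it reproduces the 679 combinations and the 565,607 rows quoted above:


select
        round(1200 * (1 - power(1 - 1000/1e6, 1e6/1200)))       expected_combinations,
        round(1e6/1200) *
        round(1200 * (1 - power(1 - 1000/1e6, 1e6/1200)))       expected_rows
from
        dual
;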

So what does Oracle produce for the two execution plans – here are the results from 12c (EXISTS first, then IN):


===================
Multi-column EXISTS
===================
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   920K|    34M|  1259  (11)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT SEMI|      |   920K|    34M|  1259  (11)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | T2   |  1000 | 11000 |   610   (8)| 00:00:01 |
|   3 |   TABLE ACCESS FULL  | T1   |  1000K|    26M|   628  (11)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."N_400"="T1"."N_400" AND "T2"."N_3"="T1"."N_3")
   2 - filter("N_1000"=0)

===================
Equivalent IN query
===================
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   833K|    30M|  1259  (11)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT SEMI|      |   833K|    30M|  1259  (11)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | T2   |  1000 | 11000 |   610   (8)| 00:00:01 |
|   3 |   TABLE ACCESS FULL  | T1   |  1000K|    26M|   628  (11)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."N_400"="T2"."N_400" AND "T1"."N_3"="T2"."N_3")
   2 - filter("T2"."N_1000"=0)

The first thing to note is that the shape of the plans is identical, and the predicate sections are identical – but the final cardinalities are different. Clearly at least one of the cardinalities has to be wrong by a significant amount (7.5% or 10.4%, depending which way round you want to look at it). If you run the test on 11.2.0.4 you find that both plans give the same estimated row count – and it’s the 920,000 rows; so arguably 12c has “fixed” the IN subquery calculation, bringing it closer to a reasonable prediction, but it hasn’t fixed the EXISTS subquery calculation. That 833K prediction, by the way, is what you would expect to see with this data with a basic join – and a semi-join shouldn’t be able to produce more data than  a join.

But both predictions are way off the (informal) expectation, so how have they appeared ?

Working backwards it’s easy to spot that 833K = 833 * 1,000: Oracle is behaving as if every single row identified in the subquery will produce a separate combination of (n_400, n_3). If we reverse engineer 920K we get: 920K / 833 = 1,104 – it would appear that the optimizer thinks the 1,000 rows produced by the subquery will produce 1,104 distinct combinations of (n_400, n_3). So how did the impossible 1,104 appear in the arithmetic ?

If you apply the “selection without replacement” formula to picking 1,000 rows with 400 distinct values from 1,000,000 rows the expected number of distinct values (with rounding) will be 368; if you apply the formula for picking 1,000 rows with 3 distinct values from 1,000,000 rows the expected number will be 3. And 3 * 368 = 1,104. (Remember that in my original estimate I applied the formula after multiplying out the combination of distinct values). The optimizer is using its standard methods, but using intermediate results in an unsuitable fashion.

It’s impossible to say what the impact of this particular code path – and the change on the upgrade – might be. The optimizer has over-estimated by 47% in one case and 62% in the other but (a) there may be something about my data that exaggerated an effect that few people will see in the wild and (b) in many cases getting in the right ballpark is enough to get a reasonable plan, and a factor of 2 is the right ballpark.

Of course, a few people will be unlucky with a few queries on the upgrade where the estimate changes – after all a single row difference in the estimate can cause the optimizer to flip between a hash join and a nested loop – but at least you’ve got a little extra information that might help when you see a bad estimate on an important semi-join.

So is there a workaround ? Given that I’ve got 12c, the obvious thing to try is to create a column group at both ends of the semi-join and see what happens. It shouldn’t really make any difference because column groups are targeted at the problems of correlated columns – but we might as well try it:


execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for columns (n_400,n_3) size 1')
execute dbms_stats.gather_table_stats(user,'t2',method_opt=>'for columns (n_400,n_3) size 1')

Unfortunately when I did this the final cardinality estimate for both queries dropped to just 833 (the absence of a K on the end isn’t a typo!).

Manually unnesting got me closer:


select
        *
from
        (
        select  distinct n_3, n_400
        from    t2
        where   n_1000 = 0
        )       sq,
        t1
where   
        sq.n_400 = t1.n_400
and     sq.n_3 = t1.n_3
;

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |   649K|    33M|  1260  (11)| 00:00:01 |
|*  1 |  HASH JOIN           |      |   649K|    33M|  1260  (11)| 00:00:01 |
|   2 |   VIEW               |      |   779 | 20254 |   612   (8)| 00:00:01 |
|   3 |    HASH UNIQUE       |      |   779 |  8569 |   612   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| T2   |  1000 | 11000 |   610   (8)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | T1   |  1000K|    26M|   628  (11)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("SQ"."N_400"="T1"."N_400" AND "SQ"."N_3"="T1"."N_3")
   4 - filter("N_1000"=0)

The cardinality of 649K is (allowing for rounding) 833 * 779; so we need to know where the 779 came from. It’s the optimizer’s standard arithmetic for “distinct” – multiply the N individual selectivities together then divide by sqrt(2) “N-1” times. So we apply the “selection without replacement” formula twice (there’s a quick SQL check of the result after the list):

  • adjusted selectivity of n_400 = 367.21
  • adjusted selectivity of n_3 = 3
  • 367.21 * 3 / sqrt(2) = 779
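
As a quick sanity check on that arithmetic (again just a sketch of the approximation, not the optimizer’s exact code path):


select
        round(
                  400 * (1 - power(1 - 1000/1e6, 1e6/400))      -- adjusted n_400, about 367.21
                *   3 * (1 - power(1 - 1000/1e6, 1e6/3))        -- adjusted n_3, effectively 3
                / sqrt(2)
        )       expected_distinct
from
        dual
;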

If you create column group statistics for (n_400, n_3) this doesn’t change the optimizer’s estimate for the number of distinct combinations after selection – maybe that’s another enhancement in the pipeline – but, at least in this case, the manual unnesting has got us a little closer to the right estimates without any statistical intervention.

Footnote:

Just for the sake of completeness, here are the plans (with yet more cardinality predictions) that you get if you block the unnesting:


select 
	*
from 
	t1 
where 
	exists (
		select	
			/*+ no_unnest */
			null  
		from	t2 
		where	n_1000 = 0 
		and	t2.n_400 = t1.n_400 
		and	t2.n_3 = t1.n_3
	)
;



---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  1179 | 33012 |   766K (12)| 00:00:30 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1   |  1000K|    26M|   632  (11)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |    11 |   638  (12)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T2" "T2" WHERE
              "N_1000"=0 AND "T2"."N_400"=:B1 AND "T2"."N_3"=:B2))
   3 - filter("N_1000"=0 AND "T2"."N_400"=:B1 AND "T2"."N_3"=:B2)



=====================================
Unnesting blocked and subquery pushed
=====================================
select 
	*
from 
	t1 
where 
	exists (
		select	
			/*+ no_unnest push_subq */
			null  
		from	t2 
		where	n_1000 = 0 
		and	t2.n_400 = t1.n_400 
		and	t2.n_3 = t1.n_3
	)
;

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 50000 |  1367K|  1271  (12)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL | T1   | 50000 |  1367K|   632  (11)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T2   |     1 |    11 |   638  (12)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ PUSH_SUBQ NO_UNNEST */ 0 FROM "T2"
              "T2" WHERE "N_1000"=0 AND "T2"."N_400"=:B1 AND "T2"."N_3"=:B2))
   2 - filter("N_1000"=0 AND "T2"."N_400"=:B1 AND "T2"."N_3"=:B2)

The 1179 comes from the magic of sqrt(2):  1179 = 1,000,000 / (400 * 3 / sqrt(2)).

The 50,000 is just the basic “I dunno, let’s call it 5%”.

 

Reference script: aggregate_selectivity_c.sql

 

September 1, 2015

Index Usage – 4

Filed under: CBO,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 6:41 pm GMT Sep 1,2015

Here’s a thought that came to me while I was writing up a note about identifying redundant indexes a few minutes ago. Sometimes you end up supporting applications with unexpected duplication of data and indexes and need to find ways to reduce overheads. Here’s some code modelling a scenario that I’ve seen more often than I like (actually, just once would be more often than I’d like):


create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e5
)
select
        rownum                                          id,
        trunc(sysdate,'MM') + (rownum-1)/1440           date_time,
        trunc(sysdate,'MM') + trunc((rownum-1)/1440)    date_only,
        rpad('x',100)                                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

I’ve got a table holding one row per minute since the start of the month; there’s a column which holds the date and time accurate to the minute, and another column which is supposed to hold just the date part. Is it possible to create a single index that allows Oracle to handle queries relatively efficiently whether they refer to date_time or date_only ? As a starting step could we get an index range scan on the same index for both of the following queries:


select
        max(id)
from
        t1
where
        date_only between sysdate-1 and sysdate
;


select
        max(id)
from
        t1
where
        date_time between sysdate-1 and sysdate
;

As Bob the Builder likes to say: “yes we can”.

There are a few lines of SQL between the table creation and the stats gathering that I didn’t show you. The first creates the constraint that describes the relationship between date_time and date_only – one is the truncated version of the other; the second defines the index we need, and the third (unfortunately) has to be there to declare the date_time column as a mandatory column:

alter table t1
        add constraint t1_trunc_date
        check(
                  date_only = trunc(date_time)
              and (   (date_only is null and date_time is null)
                   or (date_only is not null and date_time is not null)
              )
        )
;

create index t1_i1 on t1(trunc(date_time)) nologging;

alter table t1 modify (date_time not null);

(Given the requirement for date_time to be not null to get my indexing strategy to work, we could simplify the t1_trunc_date constraint to just (date_only = trunc(date_time)) if we declared date_only to be not null as well).
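
For the record, if you were building the table from scratch the simplified declarations might look something like this – a sketch only, and I haven’t repeated the full test with it:


alter table t1 modify (date_time not null);
alter table t1 modify (date_only not null);

alter table t1
        add constraint t1_trunc_date
        check (date_only = trunc(date_time))
;

create index t1_i1 on t1(trunc(date_time)) nologging;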

With the extra lines of SQL included here are the resulting execution plans for the two queries (running on 11.2.0.4, but you get the same plans on 12.1.0.2):


=======================================
date_only between sysdate-1 and sysdate
=======================================

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |    21 |    92   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE               |       |     1 |    21 |            |          |
|*  2 |   FILTER                      |       |       |       |            |          |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T1    |  4306 | 90426 |    92   (2)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T1_I1 |  4306 |       |    13   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SYSDATE@!>=SYSDATE@!-1)
   3 - filter("DATE_ONLY"<=SYSDATE@! AND "DATE_ONLY">=SYSDATE@!-1)
   4 - access(TRUNC(INTERNAL_FUNCTION("DATE_TIME"))>=SYSDATE@!-1 AND
              TRUNC(INTERNAL_FUNCTION("DATE_TIME"))<=SYSDATE@!)
=======================================
date_time between sysdate-1 and sysdate
=======================================

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |    21 |    92   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE               |       |     1 |    21 |            |          |
|*  2 |   FILTER                      |       |       |       |            |          |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T1    |  1442 | 30282 |    92   (2)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T1_I1 |  4306 |       |    13   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SYSDATE@!>=SYSDATE@!-1)
   3 - filter("DATE_TIME"=SYSDATE@!-1)
   4 - access(TRUNC(INTERNAL_FUNCTION("DATE_TIME"))>=TRUNC(SYSDATE@!-1) AND
              TRUNC(INTERNAL_FUNCTION("DATE_TIME"))>=TRUNC(SYSDATE@!))

The optimizer has managed to generate extra predicates in both cases by applying transitive closure to the critical constraint to produce queries that can be addressed (with some inefficiencies) through the single index.

Within limits, therefore, I can reduce two indexes to a single index. The strategy isn’t ideal but it may be appropriate in a few special cases. There are several problems that should be considered carefully:

  • The date_time column has to be declared not null for this optimization strategy to appear – that’s going to limit its applicability.
  • You may have more complex code where the transformation simply can’t be made to appear.
  • The introduction of the trunc() function may change the optimizer’s arithmetic in ways that cause plans to change for the worse
  • (Most important) The index range scan is always a multiple of 24 hours, with the excess data discarded after you reach the table. If you have lots of time-based queries for short time intervals (e.g. less than 8 hours) then the extra work done may outweigh the benefit of reducing the number of indexes – especially if all the excess table visits turn into randomly scattered single block reads.

Despite these drawbacks you may decide that you have a case where the strategy is “good enough” to help you reduce the workload on your system at some critical times during the day or night.

 

Index Usage – 3

Filed under: Indexing,Oracle,Tuning — Jonathan Lewis @ 5:52 pm GMT Sep 1,2015

In my last note on index usage I introduced the idea of looking at v$segstat (or v$segment_statistics) and comparing the “logical reads” statistic with the “db block changes” statistic as an indicator of whether or not the index was used in execution plans. This week I’ll explain the idea and show you some results – with a little commentary – from a production system that was reported on the OTN database forum.

The idea is fairly simple (and simplistic). If you update a typical index you will traverse three blocks (root, branch, leaf) to find the index entry that has to be updated, so if the only reason you use an index is to find out which index entry has to be updated then the number of “db block changes” for that index will be (we hope) roughly one-third of the number of “session logical I/Os” of the index.

We can do some testing of this hypothesis with some simple SQL:


create table t1 nologging as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                                  id,
        trunc(dbms_random.value(0,333333))      n1,
        rpad('x',100)                           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

alter table t1 add constraint t1_pk primary key(id) using index nologging;
create index t1_i1 on t1(n1) nologging;

So I’ve got a table with a million rows, a primary key, and an index on a column of randomly generated data. Now all I need to do is run the following little script a few thousand times and check the segment stats – I’ve avoided using a pl/sql script because of all the special buffer-handling optimisations that could appear if I did:


exec :b1 := trunc(dbms_random.value(1,1000001))

update t1
        set n1 = trunc(dbms_random.value(0,333333))
        where   id = :b1;

commit;

There are various ways of checking the segment stats: you could simply launch an AWR snapshot (or statspack snapshot at level 7) before and after the test – the results from the “Segments by …” sections of the report should tell you all you need to know; or you could run a simple piece of SQL like the following before and after the test and then do some arithmetic:

select
        object_name, statistic_name, value 
from
       v$segment_statistics
where
       owner = {your user name here}
and    object_name in ('T1','T1_PK','T1_I1')
and    statistic_name in (
              'db block changes',
              'logical reads'
)
and     value != 0
order by
        object_name,
        statistic_name
;

I happen to have some snapshot code in a little procedure that does the job I need, so my testbed code looks like this:

execute snap_my_stats.start_snap
execute snap_segstat.start_snap

set termout off
set serveroutput off

variable b1 number

@start_10000    -- invoke my script 10,000 times

spool test

set serveroutput on
set termout on

execute snap_segstat.end_snap
execute snap_my_stats.end_snap

spool off

The question is, what do we expect the results to look like, and what do they actually look like ? Given we have 10,000 updates going on we might expect something like the following:

  • T1_PK – index access by primary key, 10,000 * 3 logical I/Os
  • T1 – 10,000 logical I/Os as we find the rows then 10,000 db block changes
  • T1_I1 – index access to find entry to be deleted (10,000 * 3 logical I/Os), repeated to find leaf block for insertion of new entry (10,000 * 3 logical I/Os), with 10,000 * 2 db block changes for the delete/insert actions.

Here are a few results from 12.1.0.2 – if I don’t include a commit in the update script:


12.1.0.2 with no commit
Segment stats
=======================
T1
logical reads                               20,016
db block changes                            19,952

T1_PK
logical reads                               30,016
physical reads                                  19
physical read requests                          19

T1_I1
logical reads                               60,000
db block changes                            21,616

Session Stats
=============
Name                                         Value
----                                         -----
session logical reads                      110,919
consistent gets                             30,051
consistent gets examination                 30,037
db block gets                               80,868
db block changes                            81,989

Some of the figures match the predictions very nicely – in particular the logical reads and db block changes on the T1_I1 index are amazing (so good I feel I have to promise that I didn’t fake them, or wait until after the test to make my prediction;)

There are, however, some anomalies: why have I got 20,000 logical reads and db block changes on the table when I did only 10,000 updates ? I was surprised by this, but it is something I’ve seen before: Oracle was locking each row before updating it, so generating two changes and two redo entries (Op Codes 11.4 and 11.5). In the past I’d noticed this as a side effect of setting the audit_trail to DB, but it was happening here with audit_trail = none. (Something to add to my “todo” list – why is this happening, and when did it appear ?)

You’ll also notice that the session level stat for logical reads nearly matches the sum of the table and index figures (20K + 30K + 60K = ca. 110K) while the db block changes stats are out by a factor of 2. Don’t forget that for each change to a table or index we make a change to an undo block describing how to reverse that change, so the 40,000 data changes are matched by a further 40,000 undo block changes; and on top of this every time we get the next undo block we change our transaction table entry in the undo segment header we’re using, and that accounts for most of the rest. The discrepancy in the number of logical reads is small because while we keep getting and releasing the table and index blocks, we pin the undo block from the moment we acquire it to the moment it’s full, so we don’t record extra logical reads each time we modify it.

Big observation

Based on the figures above, we could probably say that, for an index with a blevel = 2 (height = 3), if the number of db block changes recorded is close to one-third of the logical reads recorded, then that index is a good candidate for review as it may be an index that is not used to access data, it may be an index that does nothing except use up resources to keep itself up to date.
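
If you want a starting point for that review, a query along the following lines would list the indexes in your schema where the changes are at least one-third of the logical reads. It’s only a sketch – you may want to change the owner, add a minimum activity threshold, and allow for (sub)partitioned objects:


select
        object_name,
        sum(case when statistic_name = 'logical reads'    then value else 0 end)       logical_reads,
        sum(case when statistic_name = 'db block changes' then value else 0 end)       db_block_changes
from
        v$segment_statistics
where
        owner       = user
and     object_type like 'INDEX%'
and     statistic_name in ('logical reads', 'db block changes')
group by
        object_name
having
        sum(case when statistic_name = 'db block changes' then value else 0 end) >=
        sum(case when statistic_name = 'logical reads'    then value else 0 end) / 3
order by
        object_name
;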

Big problem

Take a look at the statistics when I included the commit in my test case:

12.1.0.2 with commit
Segment Stats
====================
T1
logical reads                               20,000

T1_PK
logical reads                               30,000

T1_I1
logical reads                                  512
db block changes                               160

Session Stats
=============
Name                                         Value
----                                         -----
session logical reads                       80,625
consistent gets                             30,106
consistent gets examination                 30,039
db block gets                               50,519
db block changes                            60,489

Apparently my session has made 60,000 changes – but none of them applied to the table or index! In fact I haven’t even accessed the T1_I1 index! The segment statistics have to be wrong. Moreover, if I commit every update I ought to change an undo segment header block at the start and end of every update, which means I should see at least 20,000 more db block changes in the session (not 20,000 less); and since I’m not pinning undo blocks for a long transaction I should see about 10,000 extra logical reads because of the undo block I have to acquire at the start of each short transaction. The session statistics have to be wrong as well!

A quick check on the redo stream shows exactly the change vectors I expect to see for these transactions:

  • 11.4 – lock row piece (table)
  • 5.2 – start transaction (update undo segment header)
  • 11.5 – update row piece (table)
  • 10.4 – delete leaf row (index)
  • 10.2 – insert leaf row (index)
  • 5.4 – commit (update undo segment header)
  • 5.1 – update undo block (op 11.1 – undo table row operation)
  • 5.1 – update undo block (op 11.1 – undo table row operation)
  • 5.1 – update undo block (op 10.22 – undo leaf operation)
  • 5.1 – update undo block (op 10.22 – undo leaf operation)

That’s a total of 10 changes per transaction – which means 100,000 db block changes  in total, not 60,000.

This anomaly is so large that it HAS to make my suggested use of the segment stats suspect.  Fortunately, though, the error is in a direction that, while sapping our confidence, doesn’t make checking the numbers a completely pointless exercise.  If the error is such that we lose sight of the work done in modifying the index then the figures remaining are such that they increase our perception of the index as one that is being used for queries as well – in other words the error doesn’t make an index that’s used for queries look like an index that’s only used for self-maintenance.

Case Study

The following figures were the results from the OTN database forum posting that prompted me to write this note and the previous one:

OTN

The poster has some code which gives a report of the indexes on a table (all 26 of them in this case) with their column definition and segment statistics. What (tentative) clues do we get about these indexes as far as this article is concerned ?

Conveniently the code arranges the indexes in order of “change percentage”, and we can see very easily that the first nine indexes in the list show “db block changes” > one-third of “logical reads”, the cut-off point for the article, so it’s worth taking a quick look at those indexes to see if they are suitable candidates for dropping. Inevitably the moment you start looking closely there are a number of observations to add to this starting point.

  1. Look at the number of changes in the first 12 indexes, notice how frequently numbers around 300,000 appear – perhaps that’s indicative of about 300,000 inserts taking place in the interval, in which case the first and 14th indexes (on (zcid) and (ps_spdh) respectively) must be on columns which are very frequently null and are therefore much smaller than the rest of the indexes. Even though the index on (zcid) is reported at 39%, perhaps this is an index with a blevel of 1 (height = 2) in which case its cut-off point would be 50% rather than 33% – which means it could well be used for a lot of queries.
  2. The tenth index on (dp_datetime) reports a “change percentage” of 26%, which is below the cut-off, but it’s worth noting that there are three other indexes (12, 13 and 21) on that table that start with a column called dp_datetime_date. Is dp_datetime_date the truncated value of dp_datetime, and is it a real column or a virtual column ? Given my comments about the optimizer’s clever trick with indexes on trunc(date_column) in the second post in this series perhaps there’s scope here for getting rid of the dp_datetime index even though the simple numeric suggests that it probably is used for some queries.
  3. Of the three indexes starting with dp_datetime_date, one consists of just that single column – so perhaps (as suggested in the first post in this series) we could simply drop that too. Then, when we look at the other two (indexes 12 and 13) we note that index 13 is subject to five times as much change as index 12 (is that one insert plus two updates, given that an update means two changes), but fifteen times as much logical I/O. The extra LIO may be because the index is larger (so many more columns), it may be because the index is used very inefficiently – either way, we might look very carefully at the column ordering to see if index 13 could be rearranged to start the same way as index 12, and then drop index 12.  On top of everything else we might also want to check whether we have the right level of compression on the index – if it’s not very effective until we’ve selected on many columns then it must be subject to a lot of repetition in the first few columns.
  4. I gave a few examples in part one of reasons for dropping indexes based on similarity of columns used – the examples came from this output so I won’t repeat them, but if you refer back to them you will note that the desirability of some of the suggestions in the earlier article is reinforced by the workload statistics – for example: the similarity of indexes 24 and 24, with an exact ordered match on the first 4 columns, suggests that we consider combining the two indexes into a single index: the fact that both indexes were subject to 2.7 million changes makes this look like a highly desirable target.

Summary

There are a lot of indexes on this table but it looks as if we might be able to drop nearly half of them, although we will have to be very careful before we do so and will probably want to make a couple at a time invisible (and we can make the change “online” in 12c) for a while before dropping them.

Remember, though, that everything I’ve said in this note is guesswork based on a few simple numbers, and I want to emphasise an important point – this note wasn’t trying to tell you how to decide if an index could be dropped, it was pointing out that there’s a simple way to focus your attention on a few places where you’re most likely to find some indexes that are worth dropping.  Run a report like this against the five biggest tables or the five busiest tables or the five tables with the most indexes and you’ll probably find a few easy wins as far as redundant indexes are concerned.

Footnote

While writing up my comments about the optimizer’s tricks with columns like dp_datetime and a virtual dp_datetime_date I had a sudden sneaky thought about how we could play games with the optimizer if both columns were real columns that were kept in synch with each other. If it works out I’ll write it up in a further blog.

August 29, 2015

Index Usage – 2

Filed under: 12c,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 11:33 am GMT Aug 29,2015

I’ve been a little slow in the follow-up to my previous posting on possibly redundant indexes. Before going into the slightly more complex stuff, there’s another peripheral point (but a very important one) that’s worth raising about how clever the optimizer can be. Here’s some code for 11.2.0.4 to demonstrate the point:

create table t1
nologging
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 1e4
)
select
	rownum					id,
	trunc(sysdate,'MM') + (rownum-1)/1440	date_time,
	rpad('x',100)				padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5
;


alter table t1 
add (
        date_only
	generated always as (trunc(date_time)) virtual 
)
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

create index t1_i1 on t1(date_only) nologging;

So, in a two-step process, I’ve got an indexed virtual column that holds the value of the date_time column truncated to just the date. Would you expect the optimizer to use the index to execute the following query efficiently:


select
        max(id)
from
        t1
where
        date_time between sysdate-1 and sysdate
;

Note that the query references the real date_time column not the virtual column date_only, and it’s not using the expression that defines the index – yet the plan reads as follows:


-----------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |     1 |    21 |    86   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE                       |       |     1 |    21 |            |          |
|*  2 |   FILTER                              |       |       |       |            |          |
|*  3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T1    |  1442 | 30282 |    86   (2)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | T1_I1 |  4306 |       |    13   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SYSDATE@!>=SYSDATE@!-1)
   3 - filter("DATE_TIME"<=SYSDATE@! AND "DATE_TIME">=SYSDATE@!-1)
   4 - access("T1"."DATE_ONLY">=TRUNC(SYSDATE@!-1) AND
              "T1"."DATE_ONLY"<=TRUNC(SYSDATE@!))

It’s a little odd that even though the optimizer in the newer versions of Oracle treats many simple expressions on sysdate as constants it still checks (operation 2) that “sysdate >= sysdate – 1” but perhaps that’s just a case of a piece of generic code that isn’t worth the risk or effort of changing.

The key point, of course, is that Oracle has managed to generate some extra predicates that allow it to use the “wrong” index to get a first approximation of the result set fairly efficiently, and then used the original predicate to reduce the approximation down to the correct result set.

If you want a quick sanity check on the access predicates used for operation 4:

  • If date_time >= sysdate-1, then trunc(date_time) >= trunc(sysdate-1)
  • If date_time <= sysdate, then trunc(date_time) <= trunc(sysdate)

This style of predicate manipulation also works for numeric data types, but I think its greatest benefit (or convenience) is likely to come from date data types where the data has been created with a time component but there are frequent “date-only” queries. The days of creating two indexes as a workaround for handling generated code that wants to deal with both date_time and trunc(date_time) predicates should be numbered.
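
If you want to test the numeric case for yourself, a sketch along the following lines is the sort of thing I’d start with. The table and column names are made up for the demonstration, and you should check the resulting execution plan rather than take the transformation for granted:


create table t_num
nologging
as
select
        rownum/100              n1,             -- numeric data with a fractional part
        rpad('x',100)           padding
from
        dual
connect by
        level <= 1e4
;

alter table t_num
add (
        n1_trunc
        generated always as (trunc(n1)) virtual
)
;

create index t_num_i1 on t_num(n1_trunc) nologging;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T_NUM',
                method_opt       => 'for all columns size 1'
        );
end;
/

select  max(padding)
from    t_num
where   n1 between 17.25 and 42.75
;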

Footnote:

This enhancement probably appeared in 11.2.0.2, and I first saw it described in October 2013 in this blog note by Mohamed Houri; but 12c offers a delightful little enhancement – here’s what my table looks like in the 12c version of the code:


SQL> desc t1
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ID                                     NUMBER
 DATE_TIME                              DATE
 PADDING                                VARCHAR2(100)

SQL> 

Where’s the virtual column ? The 12c version of my code had a slightly different definition for it:


alter table t1  
add (
        date_only
        invisible
        generated always as (trunc(date_time)) virtual
)
;

The transformation still works even when the virtual column is invisible. So (subject to searching for anomalies, boundary conditions and bugs) it looks as if you can change the table definition, and get the benefits of two indexes for the price of one without the application realising that anything has changed.

August 25, 2015

Truncate – 2

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 6:25 pm GMT Aug 25,2015

Following on from my earlier comments about how a truncate works in Oracle, the second oldest question about truncate (and other DDL) appeared on the OTN database forum: “Why isn’t a commit required for DDL?”

Sometimes the answer to “Why” is simply “that’s just the way it is” – and that’s what it is in this case, I think.  There may have been some historic reason why Oracle Corp. implemented DDL the way they did (commit any existing transaction the session is running, then auto-commit when complete), but once the code has been around for a few years – and accumulated lots of variations – it can be very difficult to change a historic decision, no matter how silly it may now seem.

This posting isn’t about answering the question “why”, though; it’s about a little script I wrote in 2003 in response to a complaint from someone who wanted to truncate a table in the middle of a transaction without committing the transaction. Don’t ask why – you really shouldn’t be executing DDL as part of a transactional process (though tasks like dropping and recreating indexes as part of a batch process is a reasonable strategy).

So if DDL always commits the current transaction how do you truncate a table without committing ? Easy – use an autonomous transaction. First a couple of tables with a little data, then a little procedure to do my truncate:


create table t1 (n1 number);
insert into t1 values(1);

create table t2 (n1 number);
insert into t2 values(1);

create or replace procedure truncate_t1
as
        pragma autonomous_transaction;
begin
        execute immediate 'truncate table t1';
end;
/

Then the code to demonstrate the effect:


prompt  ======================================
prompt  In this example we end up with no rows
prompt  in t1 and only the original row in t2,
prompt  the truncate didn't commit the insert.
prompt  ======================================

insert into t2 values(2);

execute truncate_t1;
rollback;

select * from t1;
select * from t2;


According to my notes, the last time I ran this code was on 9.2.0.3 but I’ve just tested it on 12.1.0.2 and it behaves in exactly the same way.

I’ve only tested the approach with “truncate” and “create table”, apparently, and I haven’t made any attempt to see if it’s possible to cause major disruption with cunningly timed concurrent activity; but if you want to experiment you have a mechanism which Oracle could have used to avoid committing the current transaction – and you may be able to find out why it doesn’t, and why DDL is best “auto-committed”.

Truncate

Filed under: Infrastructure,Oracle,redo,undo — Jonathan Lewis @ 8:39 am GMT Aug 25,2015

The old question about truncate and undo (“does a truncate generate undo or not”) appeared on the OTN database forum over the week-end, and then devolved into “what really happens on a truncate”, and then carried on.

The quick answer to the traditional question is essentially this: the actual truncate activity typically generates very little undo (and redo) compared to a full delete of all the data because all it does is tidy up any space management blocks and update the data dictionary; the undo and redo generated is only about the metadata, not about the data itself.

Of course, a reasonable response to the quick answer is: “how do you prove that?” – so I suggested that all you had to do was “switch logfile, truncate a table, dump logfile”. Unfortunately I realised that I had never bothered to do this myself and, despite having far more useful things to do, I couldn’t resist wasting some of my evening doing it. Here’s the little script I wrote to help:


create table t2 (v1 varchar2(32));
insert into t2 values (rpad('A',32));
commit;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id, 
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

create index t1_i1 on t1(id);
alter system flush buffer_cache;
execute dbms_lock.sleep(3)

alter system switch logfile;

insert into t2 values(rpad('X',32));

truncate table t1;

insert into t2 values(rpad('Y',32));
commit;

execute dump_log
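
dump_log is a little procedure from my own library; if you want something to run the test with, a minimal sketch along the following lines should be close enough. It assumes select privileges on v$log and v$logfile, and the privilege to issue “alter system”:


create or replace procedure dump_log
as
        m_member        varchar2(512);
begin
        select  f.member
        into    m_member
        from    v$log l, v$logfile f
        where   f.group# = l.group#
        and     l.status = 'CURRENT'
        and     rownum   = 1
        ;

        execute immediate
                'alter system dump logfile ''' || m_member || '''';
end;
/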

Procedure dump_log (see the sketch above) simply dumps the current log file. The call to switch logfile keeps the dumped log file as small as possible; and I’ve flushed the buffer cache and added a three second sleep to minimise the number of misleading “Block Written Record” entries that might otherwise appear in the log file after the truncate. There were all sorts of interesting little details in the resulting activity when I tested this on 12.1.0.2 – here’s one that’s easy to spot before you even look at the trace file:


SQL> select object_id, data_object_id, object_name from user_objects where object_name like 'T1%';

 OBJECT_ID DATA_OBJECT_ID OBJECT_NAME
---------- -------------- --------------------
    108705         108706 T1_I1
    108704         108707 T1

Notice how the data_object_id of the index is smaller than that of the table after the truncate ? Oracle truncates (and renumbers) the index before truncating the table.

The truncate activity was pretty much as I had assumed it would be – with one significant variation. The total number of change vectors reported was 272, in 183 redo records (your numbers may vary slightly if you try to reproduce the example), and here’s a summary of the redo OP codes that showed up in those change vectors in order of frequency:


Change operations
=================
  1 OP:10.25    Format root block
  1 OP:11.11    Insert multiple rows (table)
  1 OP:24.1     DDL
  1 OP:4.1      Block cleanout record
  2 OP:10.4     Delete leaf row
  2 OP:13.28    HWM on segment header block
  3 OP:10.2     Insert leaf row
  3 OP:17.28    standby metadata cache invalidation
  4 OP:11.19    Array update (index)
  4 OP:11.5     Update row (index)
 10 OP:13.24    Bitmap Block state change (Level 2)
 11 OP:23.1     Block written record
 12 OP:14.1     redo: clear extent control lock
 12 OP:22.5     File BitMap Block Redo
 14 OP:14.2     redo - lock extent (map)
 14 OP:14.4     redo - redo operation on extent map
 14 OP:5.4      Commit / Rollback
 15 OP:18.3     Reuse record (object or range)
 15 OP:22.16    File Property Map Block (FPM)
 22 OP:13.22    State on Level 1 bitmap block
 24 OP:22.2     File Space Header Redo
 29 OP:5.2      Get undo header
 58 OP:5.1      Update undo block

The line that surprised me was the 14 commit/rollback codes – a single truncate appears to have operated as 14 separate (recursive) transactions. I did start to walk through the trace file to work out the exact order of operation, but it’s really messy, and a tedious task, so I just did a quick scan to get the picture. I may have made a couple of mistakes in the following, but I think the steps were:

  • Start transaction
  • Lock the extent map for the index — no undo needed
  • Lock each bitmap (space management) block  — no undo needed
  • Reset each bitmap block — undo needed to preserve space management information
  • Reset highwater marks where relevant on bitmap and segment header block — undo needed
  • Clear segment header block — undo needed
  • Write all the updated space management blocks to disc (local write waits)
    • Log file records “Block Written Record”.
  • For each space management block in turn
    • Update space management blocks with new data object_id — undo needed
    • Write the updated block to disc (local write wait)
    • Log file records one “Block Written Record” for each block
  • Repeat all the above for the TABLE segment.
  • Start a recursive transaction
    • Insert a row into mon_mod$ — undo needed
    • recursive commit
  • Set DDL marker in redo log (possibly holding the text of the DDL statement, but it’s not visible in the dump)
  • Set object reuse markers in the redo log
  • update tab$  — needs undo, it’s just DML
  • update ind$ — needs undo, it’s just DML
  • update seg$  — needs undo, it’s just DML (twice – once for table once for index)
  • update obj$ — needs undo, it’s just DML (twice – ditto)
  • COMMIT — at last, with a change vector for a “Standby metadata cache invalidation” marker

The remaining 12 transactions look like things that could be delayed to tidy up things like space management blocks for the files and tablespaces and releasing “block locks”.

This first, long transaction is the thing that has to happen as an atomic event to truncate the table – and you can imagine that if the database crashed (or you crashed the session) in the middle of a very slow truncate then there seems to be enough information recorded in the undo to allow the database to roll forward an incomplete truncate, and then roll back to before the truncate.

It would be possible to test whether or not this would actually work – but I wouldn’t want to do it on a database that anyone else was using.

August 19, 2015

Oops

Filed under: Infrastructure,Oracle,Partitioning — Jonathan Lewis @ 10:02 am GMT Aug 19,2015

I made a mistake a few days ago following up a question on the OTN database forum. The question was about a problem creating a hash/list composite partitioned table, and one of the respondents suggested that perhaps the problem appeared because hash/list wasn’t a legal combination.

Spot on: so I confirmed that observation and supplied a link to the official Oracle white paper that listed the combinations that were legal in 11.2 for composite partitioning.  In fact, although I was fairly sure that hash/list wasn’t legal, I had even run up a quick test to check that the attempt would fail before I’d searched online for the document.

Fortunately other people had different ideas about the original failure, and one of them supplied a link to a thread on AskTom which actually included some SQL to create a hash/list composite partitioned table. Naturally I tested it (even though it was from Tom Kyte) and obviously it worked (after all it was from Tom Kyte) – and then I spotted the syntax error in the example I had created for myself.

Trust but verify … and then …

I had fallen into two traps – and one of them was documented in my own “Trust” posting from 2006.

The white paper was dated September 2009 (Tom’s example was dated June 2013) and as it says in my Trust note:

  • If its date is more than about 18 months old – don’t assume it’s (still) true
  • If it’s not your exact version number – don’t assume it’s (still) true

The second trap was an example of confirmation bias, I was fairly sure that my test was supposed to fail with Oracle error “ORA-00922: missing or invalid option”, so when it failed with exactly that error I didn’t check why it had failed and didn’t notice that I had swapped the order of a couple of clauses in the create table statement. It’s very easy to think you’ve done enough when testing – especially when your test results match your expectation.

Update – later that day

So I’ve had an email asking me how I got the ORA-00922.  Here’s the SQL as I wrote it – don’t give the answer in the comments, but how quickly can you spot what I did wrong ?

create table t1
(
        object_name,
        object_type,
        created
)
partition by hash(object_name) partitions 4
subpartition by list(object_type)
subpartition template (
        subpartition sp1 values ('TABLE'),
        subpartition sp2 values ('INDEX'),
        subpartition sp3 values (default)
)
as
select  object_name, object_type, created
from    all_objects
;

It’s a deceptive error (to me, at least) because even though I know it’s wrong it still looks right.

August 17, 2015

Index Usage

Filed under: extended stats,Indexing,Oracle,Tuning — Jonathan Lewis @ 4:25 pm GMT Aug 17,2015

The question of how to identify indexes that could be dropped re-appeared (yet again) on the OTN database forum last week. It’s not really surprising that it recurs so regularly – the problem isn’t an easy one to solve but new (and even less new) users keep hoping that there’s a quick and easy solution.

There are, however, strategies and pointers that can help you to optimise the trade-off between effort, risk, and reward. Broadly the idea is to spend a small amount of effort finding a relatively small number of “expensive” indexes that might be safe to drop, so that when you do the detailed analysis you have a good chance that the time spent will be rewarded by a positive result.

Before we get to some results posted on OTN, it’s worth thinking about the global impact and what we’re trying to achieve, and the threats that go with our attempt to achieve it.

The key detail, of course, is that index maintenance is an expensive process. We could insert 1,000 rows into a table at a cost of writing about 25 table blocks plus a few undo blocks plus something like half a megabyte of redo (assuming, for the purposes of illustration that each row is about 200 bytes on insert). Add one index to the table and we might have to locate and modify 1,000 separate index leaf blocks. The increment on the redo might be about quarter of a megabyte and we may have to access 1,000 different undo blocks for read consistency reasons, but the simple fact that we may need 1,000 buffers to be able to maintain that index is likely to be a significant extra cost on the insert. Make that 10 indexes, or 70 (as one unhappy DBA once told me) and the probability of being able to do high-speed inserts becomes rather low.

Of course we hope that our indexes will allow our queries to operate efficiently with great precision, but inevitably we get to a point where the benefit of precision is outweighed by the cost of maintenance. Our target, then, is to design the set of indexes that makes it possible for the optimizer to find good paths for all the important queries and “good enough” paths for the rest. By the time the system is live, though, it’s too late for “proper design”, and the only option is for damage limitation, a bit of guesswork, and some live testing with fingers crossed (thank goodness for invisible indexes).

The starting point is usually an attempt to identify “the indexes we are not using”, which is typically translated into “the indexes that do not appear in execution plans” – but that’s not actually a good target, for various reasons:

  • Problem 1: If we are using an index it’s possible that we shouldn’t be and that there’s an alternative index available that ought to be more efficient. A corollary to this is that if you do identify and drop such an index you may find that the optimizer doesn’t use the alternative index you were expecting it to use until you take some action to help the optimizer recognise that the alternative is a good choice.
  • Problem 2: If we aren’t using a particular index then perhaps we should be using it, and would use it if we dropped one of the other indexes on the table. (And there’s always the possibility that we didn’t happen to use it during the interval we were checking but do use it at other times.)
  • Problem 3: The optimizer is capable of using information about the number of distinct keys in a multi-column index to select an execution plan even though it may not use that index in the plan it finally chooses. We may be able to work around this problem in current versions of Oracle by creating a column group (extended statistics) that matches the definition of each index we drop (see the sketch after this list) – but there’s a limit of 20 column groups per table (and we may have to find the “opposite end” of each join where we use the index stats and create a matching column group there).
  • Problem 4: There are some indexes we might not be using but which must exist to avoid the “foreign key locking” problem. It should be easy enough to check, before dropping an index, whether it has to exist to match a foreign key; and even then it may be possible to show that nothing in the application would cause the locking problem to appear – and as a safety measure you could disable locks on the (child) table to ensure that the application doesn’t grind to a halt because of foreign key locking problems.
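
For problems 3 and 4 the protective steps might look something like the following sketch – the schema, table, and column names here are hypothetical, and the column group has to match the column list of the index you intend to drop:

-- create a column group to preserve the "distinct keys" information that the
-- optimizer could previously pick up from the multi-column index
select
        dbms_stats.create_extended_stats(
                ownname   => 'MY_SCHEMA',
                tabname   => 'CHILD',
                extension => '(dp_compid,ddzt,cirmhcx)'
        )
from    dual
;

-- gather stats so that the new (hidden, virtual) column gets a num_distinct
begin
        dbms_stats.gather_table_stats(
                ownname => 'MY_SCHEMA',
                tabname => 'CHILD'
        );
end;
/

-- and if an index existed only to avoid the foreign key locking problem you
-- could (with due care) stop the problem locks being taken at all
alter table child disable table lock;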

Provided you remember that problems like these exist, and think carefully about the indexes that your strategy suggests, there are various ways you could approach the problem of identifying indexes that don’t get into execution plans.

v$object_usage

The ink had barely dried on the manual pages for this view before several people (including me) had written notes explaining why this view wasn’t particularly helpful. (I think I even said something about this in Practical Oracle 8i). I won’t repeat the discussion here but it revolves around the fact that an index is flagged as “used” even if it has only been used once in a single execution of a single statement – so you don’t get any idea of the real importance of the index.
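
For reference, the mechanism itself is trivial to use – switch monitoring on for an index (the index name below is hypothetical), let the application run, then check the flag:

alter index t1_i1 monitoring usage;

-- ... let the application run for a representative period ...

select  index_name, table_name, monitoring, used,
        start_monitoring, end_monitoring
from    v$object_usage
where   index_name = 'T1_I1'
;

alter index t1_i1 nomonitoring usage;

The used column is just a YES/NO – which is why a single use in the whole monitoring period is enough to make an index look as if it matters.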

v$sql_plan et al.

If you review the set of in-memory execution plans (and the AWR or Statspack equivalents) you can identify indexes which definitely have been used – but (a) it’s expensive to scan v$sql_plan frequently and (b) the AWR/Statspack repositories only capture a subset of the more expensive plans, so it’s easy to miss indexes which have been used and are relatively important but aren’t in the repository and don’t happen to be in memory at the moments you look.
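
Here’s a sketch of the sort of check you might run against the in-memory plans (the schema name is an assumption), with the obvious variation against dba_hist_sql_plan for the AWR copies – remembering that the repository only holds a subset of the plans:

select  object_owner, object_name, operation, options,
        count(distinct sql_id)          distinct_statements
from    v$sql_plan
where   object_owner = 'MY_SCHEMA'
and     object_type like 'INDEX%'
group by
        object_owner, object_name, operation, options
order by
        object_owner, object_name
;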

Review the definitions

If you examine the index definitions you may spot indexes which look very similar. If one index starts with the same columns, in the same order, as another index, there is a good chance that you could reduce two indexes to one – especially if the whole of one of the indexes is the “leading edge” of the other – for example:

  • (dp_datetime_date)
  • (dp_datetime_date, dp_compid)

Even if the leading edges match and the trailing edges differ we might be able to collapse two indexes into one – depending on how selective the leading columns are and how the indexes are used – for example:

  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, dp_datetime_date)
  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, pnr_cfrqsj_date)

which could perhaps be replaced by one of :

  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, dp_datetime_date, pnr_cfrqsj_date)

or

  • (dp_compid, ddzt, cirmhcx, ct_nxr_mhcx, pnr_cfrqsj_date, dp_datetime_date)

Guessing about the use of a typical date column, though, it’s possible that in this example the current trailing date columns are used with range-based predicates, in which case this strategy probably won’t be effective for this pair of indexes.

Even if the order of later columns in the index doesn’t match you may still find that a pair of indexes could be reduced to a single index – for example the pair:

  • (dp_datetime_date, dp_compid)
  • (dp_datetime_date, ddzdt, dp_compid, ct_nxrdh, ct_smsmobilno)

which could perhaps be replaced by just:

  • (dp_datetime_date, dp_compid, ddzdt, ct_nxrdh, ct_smsmobilno)
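
A simple query against the data dictionary makes this kind of visual comparison easier by listing each index’s columns as a single string (the schema and table name are assumptions; listagg() needs 11.2 or later):

select  index_name,
        listagg(column_name, ',')
                within group (order by column_position)         column_list
from    dba_ind_columns
where   table_owner = 'MY_SCHEMA'
and     table_name  = 'MY_TABLE'
group by
        index_name
order by
        column_list
;

Sorting by the generated column list puts indexes with a common leading edge next to each other in the output.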

As a safety measure, of course, you would probably create a new index, then make the subject indexes invisible, and wait for at least a week to see whether any performance problems appear (remembering that one automatic performance threat would be the increase in workload as yet another index – temporarily – has to be maintained).
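
In outline the safety measure looks like this (the index names are mine) – the invisible indexes are still maintained, they just stop being candidates for the optimizer:

create index t1_new_idx on t1(dp_datetime_date, dp_compid, ddzdt, ct_nxrdh, ct_smsmobilno);

alter index t1_old_idx1 invisible;
alter index t1_old_idx2 invisible;

-- ... wait at least a week, watching for performance problems ...

drop index t1_old_idx1;
drop index t1_old_idx2;

-- or, if things go wrong, back out quickly:
-- alter index t1_old_idx1 visible;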

The difficulty of eliminating indexes by examination is that it takes a lot of effort to investigate all the possibilities, so you really need some way of choosing a relatively small subset of indexes that might be worth the effort. This brings me to the principal topic of this posting – using segment statistics to help you pick that small subset.

v$segstat / v$segment_statistics

Oracle records a number of workload statistics for each object in memory. The view v$segstat is an efficient version of these statistics, and v$segment_statistics is a friendlier version that joins v$segstat to tables user$, obj$ and ts$, with a filter against ind$ to turn meaningless numbers into names.

SQL> desc V$segstat
 Name                    Null?    Type
 ----------------------- -------- ----------------
 TS#                              NUMBER
 OBJ#                             NUMBER
 DATAOBJ#                         NUMBER
 STATISTIC_NAME                   VARCHAR2(64)
 STATISTIC#                       NUMBER
 VALUE                            NUMBER

SQL> desc V$segment_statistics
 Name                    Null?    Type
 ----------------------- -------- ----------------
 OWNER                            VARCHAR2(30)
 OBJECT_NAME                      VARCHAR2(30)
 SUBOBJECT_NAME                   VARCHAR2(30)
 TABLESPACE_NAME                  VARCHAR2(30)
 TS#                              NUMBER
 OBJ#                             NUMBER
 DATAOBJ#                         NUMBER
 OBJECT_TYPE                      VARCHAR2(18)
 STATISTIC_NAME                   VARCHAR2(64)
 STATISTIC#                       NUMBER
 VALUE                            NUMBER

For each segment Oracle records the following statistics (according to v$segstat_name – but there are a couple more hidden statistics reported in the underlying x$ksolsstat object):

NAME                             SAMPLED
-------------------------------- -------
logical reads                    YES
buffer busy waits                NO
gc buffer busy                   NO
db block changes                 YES
physical reads                   NO
physical writes                  NO
physical read requests           NO
physical write requests          NO
physical reads direct            NO
physical writes direct           NO
optimized physical reads         NO
optimized physical writes        NO
gc cr blocks received            NO
gc current blocks received       NO
ITL waits                        NO
row lock waits                   NO
space used                       NO
space allocated                  NO
segment scans                    NO

Both Statspack (at level 7) and the AWR report have several “Top N” sections for segment statistics. If we examine these stats for all the indexes on a given table we can get some clues about which indexes are likely to be worth further investigation to see if they could be dropped.
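
A starting point (the owner and table name here are assumptions) is simply to list the non-zero segment statistics for every index on the table you’re interested in:

select  ss.object_name, ss.statistic_name, ss.value
from    v$segment_statistics    ss
where   ss.owner       = 'MY_SCHEMA'
and     ss.object_type = 'INDEX'
and     (ss.owner, ss.object_name) in (
                select  owner, index_name
                from    dba_indexes
                where   table_owner = 'MY_SCHEMA'
                and     table_name  = 'MY_TABLE'
        )
and     ss.value != 0
order by
        ss.object_name, ss.statistic_name
;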

One very simple measure is the number of “physical reads” (which, for indexes, will generally be very similar to “physical read requests”). Since a (real) physical read is generally going to take a significant amount of time, segments with very large numbers of physical reads could be contributing a lot of time to the total database time – so it’s worth knowing why the index is responsible for so many physical reads and worth cross-checking with v$sql_plan (and its historic equivalents) which statements seem to be using or modifying it.

Even if it turns out that the index is absolutely necessary, you might still be able to spot opportunities to improve efficiency. If it is subject to a significant number of physical reads it may be that the index is just very large – could you make it smaller by rebuilding it with compression on some of the leading columns? Is it an index which (for some reason you can identify) tends to degenerate over time and waste a lot of space, and should you rebuild it occasionally? It might be possible (depending on the predicates used) to re-arrange the column order in such a way that the activity is focused onto a particular section of the index rather than being spread across the entire index – or you could even find that by careful choice of global partitioning (which is legal even on a non-partitioned table) you might be able to isolate the activity to a small section of the index.
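
As a sketch of the first of those options (the index name is an assumption), index_stats will tell you how many leading columns Oracle thinks are worth compressing and the space you’d save – but remember that validating the structure locks the underlying table while it runs:

analyze index my_schema.t1_big_idx validate structure;

select  opt_cmpr_count, opt_cmpr_pctsave
from    index_stats
;

-- if the saving looks worthwhile, rebuild using the reported
-- opt_cmpr_count (assumed here to be 2)
alter index my_schema.t1_big_idx rebuild compress 2;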

A more interesting measure, though, comes from comparing the “logical reads” with the number of “db block changes”; and that’s the point of this posting – except that I’ve spent so much time on it already that I’m going to have to write part 2 some time next week.
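
As a taste of what’s coming, the mechanics of the comparison are easy enough – the interpretation is what part 2 will have to cover. The owner below is an assumption; a high ratio of changes to reads is the first hint that an index may be costing more to maintain than it gives back to queries:

select
        rd.object_name,
        rd.value                logical_reads,
        chg.value               db_block_changes
from
        v$segment_statistics    rd,
        v$segment_statistics    chg
where
        rd.owner           = 'MY_SCHEMA'
and     rd.object_type     = 'INDEX'
and     rd.statistic_name  = 'logical reads'
and     chg.obj#           = rd.obj#
and     chg.statistic_name = 'db block changes'
order by
        chg.value / greatest(rd.value, 1) desc
;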

 
