Oracle Scratchpad

May 11, 2015

Efficiency

Filed under: Oracle,Performance — Jonathan Lewis @ 9:24 pm BST May 11,2015

Here’s a question to which I don’t know the answer, and for which I don’t think anyone is likely to need the answer; but it might be an entertaining little puzzle for the curious.

Assume you’re doing a full tablescan against a simple heap table of no more than 255 columns (and not using RAC, Exadata, In-memory, etc. etc. etc.), and the query is something like:


select  {columns 200 to 250}
from    t1
where   column_255 = {constant}
;

To test the predicate Oracle has to count its way along each row column by column to find column 255. Will it:

  1. transfer columns 200 to 250 to local memory as it goes, then check column_255 — which seems to be a waste of CPU if the predicate doesn’t evaluate to TRUE
  2. evaluate the predicate, then walk the row again to transfer columns 200 to 250 to local memory if the predicate evaluates to TRUE — which also seems to be a waste of CPU
  3. one or other of the above depending on which columns are needed, how long they are, and the selectivity of the predicate

How would you test your hypothesis?
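
One way to attack it – and this is only a sketch, with every object name invented for the test – is to build a 255-column table in which the last column lets you dial the selectivity of the predicate, then compare the CPU cost of the same (late, long) select list as the predicate swings from rarely true to almost always true. If hypothesis 1 is right the CPU usage should be much the same in both cases; if hypothesis 2 is right the CPU should drop noticeably when the predicate rarely evaluates to TRUE.

-- Sketch only: generate a 255-column table dynamically;
-- col255 is 0 for 1% of the rows and 1 for the rest.

declare
	v_cols	varchar2(32767);
	v_exprs	varchar2(32767);
begin
	for i in 1 .. 254 loop
		v_cols	:= v_cols  || 'col' || to_char(i,'fm000') || ' number, ';
		v_exprs	:= v_exprs || 'rownum, ';
	end loop;

	execute immediate
		'create table t255 (' || v_cols || 'col255 number)';

	execute immediate
		'insert into t255 select ' || v_exprs ||
		'case when mod(rownum,100) = 0 then 0 else 1 end ' ||
		'from dual connect by level <= 1e5';

	commit;
end;
/

-- Snapshot "CPU used by this session" (centiseconds) before and
-- after each of the two test queries, then compare the deltas.

select	ms.value	cpu_cs
from	v$statname sn, v$mystat ms
where	ms.statistic# = sn.statistic#
and	sn.name = 'CPU used by this session';

select	max(col200), max(col210), max(col220), max(col230), max(col240), max(col250)
from	t255
where	col255 = 0		-- rarely true
;

select	max(col200), max(col210), max(col220), max(col230), max(col240), max(col250)
from	t255
where	col255 = 1		-- almost always true
;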

April 15, 2015

Cartesian join

Filed under: Oracle,Performance,Tuning — Jonathan Lewis @ 6:40 pm BST Apr 15,2015

Some time ago I pulled off the apocryphal “from 2 hours to 10 seconds” trick for a client using a technique that is conceptually very simple but, like my example from last week, falls outside the pattern of generic SQL. The problem (with some camouflage) is as follows: we have a data set with 8 “type” attributes which are all mandatory columns. We have a “types” table with the same 8 columns together with two more columns that are used to translate a combination of attributes into a specific category and “level of relevance”. The “type” columns in the types table are, however, allowed to be null although each row must have at least one column that is not null – i.e. there is no row where every “type” column is null.

The task is to match each row in the big data set with all “sufficiently similar” rows in the types table and then pick the most appropriate of the matches – i.e. the match with the largest “level of relevance”. The data table had 500,000 rows in it, the types table had 900 rows. Here’s a very small data set representing the problem client data (cut down from 8 type columns to just 4 type columns):


create table big_table(
	id		number(10,0)	primary key,
	v1		varchar2(30),
	att1		number(6,0),
	att2		number(6,0),
	att3		number(6,0),
	att4		number(6,0),
	padding		varchar2(4000)
);

create table types(
	att1		number(6,0),
	att2		number(6,0),
	att3		number(6,0),
	att4		number(6,0),
	category	varchar2(12)	not null,
	relevance	number(4,0)	not null
);

insert into big_table values(1, 'asdfllkj', 1, 1, 2, 1, rpad('x',4000));
insert into big_table values(2, 'rirweute', 1, 3, 1, 4, rpad('x',4000));

insert into types values(   1, null, null, null, 'XX',  10);
insert into types values(   1, null, null,    1, 'YY',  20);
insert into types values(   1, null,    1, null, 'ZZ',  20);

commit;

A row from the types table is similar to a source row if it matches on all the non-null columns. So if we look at the first row in big_table, it matches the first row in types because att1 = 1 and all the other attN columns are null; it matches the second row because att1 = 1 and att4 = 1 and the other attN columns are null, but it doesn’t match the third row because types.att3 = 1 and big_table.att3 = 2.

Similarly, if we look at the second row in big_table, it matches the first row in types, doesn’t match the second row because types.att4 = 1 and big_table.att4 = 4, but does match the third row. Here’s how we can express the matching requirement in SQL:


select
	bt.id, bt.v1,
	ty.category,
	ty.relevance
from
	big_table	bt,
	types		ty
where
	nvl(ty.att1(+), bt.att1) = bt.att1
and	nvl(ty.att2(+), bt.att2) = bt.att2
and	nvl(ty.att3(+), bt.att3) = bt.att3
and	nvl(ty.att4(+), bt.att4) = bt.att4
;

You’ll realise, of course, that essentially we have to do a Cartesian merge join between the two tables. Since there’s no guaranteed matching column that we could use to join the two tables we have to look at every row in types for every row in big_table … and we have 500,000 rows in big_table and 900 in types, leading to an intermediate workload of 450,000,000 rows (with, in the client case, 8 checks for each of those rows). Runtime for the client was about 2 hours, at 100% CPU.

When you have to do a Cartesian merge join there doesn’t seem to be much scope for reducing the workload; however I didn’t actually know what the data really looked like, so I ran a couple of queries to analyse it. The first was a simple “select count(distinct)” query to see how many different combinations of the 8 attributes existed in the client’s data set. It turned out to be slightly less than 400.
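
For the cut-down four-column model above that check is no more than the following sketch (the real check was the 8-column equivalent):

select	count(*)
from	(
	select	distinct att1, att2, att3, att4
	from	big_table
	)
;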

Problem solved – get a list of the distinct combinations, join that to the types table to translate to categories, then join the intermediate result set back to the original table. This, of course, is just applying two principles that I’ve discussed before: (a) be selective about using a table twice to reduce the workload, (b) aggregate early if you can reduce the scale of the problem.

Here’s my solution:


with main_data as (
	select
		/*+ materialize */
		id, v1, att1, att2, att3, att4
	from
		big_table
),
distinct_data as (
	select
		/*+ materialize */
		distinct att1, att2, att3, att4
	from	main_data
)
select
	md.id, md.v1, ty.category, ty.relevance
from
	distinct_data	dd,
	types		ty,
	main_data	md
where
	nvl(ty.att1(+), dd.att1) = dd.att1
and	nvl(ty.att2(+), dd.att2) = dd.att2
and	nvl(ty.att3(+), dd.att3) = dd.att3
and	nvl(ty.att4(+), dd.att4) = dd.att4
and	md.att1 = dd.att1
and	md.att2 = dd.att2
and	md.att3 = dd.att3
and	md.att4 = dd.att4
;

And here’s the execution plan.


---------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |    12 |  2484 |    11  (10)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                            |       |       |            |          |
|   2 |   LOAD AS SELECT           | SYS_TEMP_0FD9D6619_8FE93F1 |       |       |            |          |
|   3 |    TABLE ACCESS FULL       | BIG_TABLE                  |     2 |   164 |     2   (0)| 00:00:01 |
|   4 |   LOAD AS SELECT           | SYS_TEMP_0FD9D661A_8FE93F1 |       |       |            |          |
|   5 |    HASH UNIQUE             |                            |     2 |   104 |     3  (34)| 00:00:01 |
|   6 |     VIEW                   |                            |     2 |   104 |     2   (0)| 00:00:01 |
|   7 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6619_8FE93F1 |     2 |   164 |     2   (0)| 00:00:01 |
|*  8 |   HASH JOIN                |                            |    12 |  2484 |     6   (0)| 00:00:01 |
|   9 |    NESTED LOOPS OUTER      |                            |     6 |   750 |     4   (0)| 00:00:01 |
|  10 |     VIEW                   |                            |     2 |   104 |     2   (0)| 00:00:01 |
|  11 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D661A_8FE93F1 |     2 |   104 |     2   (0)| 00:00:01 |
|* 12 |     TABLE ACCESS FULL      | TYPES                      |     3 |   219 |     1   (0)| 00:00:01 |
|  13 |    VIEW                    |                            |     2 |   164 |     2   (0)| 00:00:01 |
|  14 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6619_8FE93F1 |     2 |   164 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access("MD"."ATT1"="DD"."ATT1" AND "MD"."ATT2"="DD"."ATT2" AND
              "MD"."ATT3"="DD"."ATT3" AND "MD"."ATT4"="DD"."ATT4")
  12 - filter("DD"."ATT1"=NVL("TY"."ATT1"(+),"DD"."ATT1") AND
              "DD"."ATT2"=NVL("TY"."ATT2"(+),"DD"."ATT2") AND
              "DD"."ATT3"=NVL("TY"."ATT3"(+),"DD"."ATT3") AND
              "DD"."ATT4"=NVL("TY"."ATT4"(+),"DD"."ATT4"))

Critically I’ve taken a Cartesian join that had a source of 500,000 and a target of 900 possible matches, and reduced it to a join between the 400 distinct combinations and the 900 possible matches. Clearly we can expect this to take something like one twelve-hundred-and-fiftieth (400/500,000) of the work of the original join – bringing 7,200 seconds down to roughly 6 seconds. Once this step is complete we have an intermediate result set which is the 4 non-null type columns combined with the matching category and relevance columns – and can use this in a simple and efficient hash join with the original data set.

Logic dictated that the old and new results would be the same – but we did run the two hour query to check that the results matched.

Footnote: I was a little surprised that the optimizer produced a nested loops outer join rather than a Cartesian merge in the plan above – but that’s probably an artefact of the very small data sizes in my test. There’s presumably little point in transferring the data into the PGA when the volume is so small.

Footnote 2: I haven’t included the extra steps in the SQL to reduce the intermediate result to just “the most relevant” – but that’s just an inline view with an analytic function, as sketched below. (The original code actually selected the data with an order by clause and used a client-side filter to eliminate the excess!).
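
For anyone who wants to see the shape of that step, here’s a sketch of my query with the analytic filter added – note that ties on relevance are broken arbitrarily here; the client’s real tie-breaking rule would have to go into the order by:

with main_data as (
	select
		/*+ materialize */
		id, v1, att1, att2, att3, att4
	from
		big_table
),
distinct_data as (
	select
		/*+ materialize */
		distinct att1, att2, att3, att4
	from	main_data
),
matched as (
	select
		md.id, md.v1, ty.category, ty.relevance,
		row_number() over (
			partition by md.id
			order by ty.relevance desc
		)	rn
	from
		distinct_data	dd,
		types		ty,
		main_data	md
	where
		nvl(ty.att1(+), dd.att1) = dd.att1
	and	nvl(ty.att2(+), dd.att2) = dd.att2
	and	nvl(ty.att3(+), dd.att3) = dd.att3
	and	nvl(ty.att4(+), dd.att4) = dd.att4
	and	md.att1 = dd.att1
	and	md.att2 = dd.att2
	and	md.att3 = dd.att3
	and	md.att4 = dd.att4
)
select	id, v1, category, relevance
from	matched
where	rn = 1
;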

Footnote 3: The application was a multi-company application – and one of the other companies had not yet gone live on the system because they had a data set of 5 million rows to process and this query had never managed to run to completion in the available time window.  I’ll have to get back to the client some day and see if the larger data set also collapsed to a very small number of distinct combinations and how long the rewrite took with that data set.


April 13, 2015

Not Exists

Filed under: CBO,Execution plans,Oracle,Performance — Jonathan Lewis @ 12:51 pm BST Apr 13,2015

The following requirement appeared recently on OTN:


=========================================================================================================
I have a following query and want to get rid of the "NOT EXISTS' clause without changing the end results.

SELECT   A.c,
         A.d,
         A.e,
         A.f
  FROM   A
WHERE   NOT EXISTS (SELECT   1
                       FROM   B
                      WHERE   B.c = A.c AND B.d = A.d AND B.e = A.e);
===========================================================================================================

Inevitably this wasn’t the problem query, and almost inevitably the OP was asking us how to implement a solution which wasn’t appropriate for a problem that shouldn’t have existed. Despite this it’s worth spending a little time to take the request at its face value and look at the sort of thing that could be going on.

First, of course, you cannot get rid of the “not exists” clause, although you may be able to make it look different. If you want “all the rows in A that are not referenced in B” then you HAVE to examine all the rows in A, and you have to do some sort of check for each row to see whether or not it exists in B. The only option you’ve got for doing something about the “not exists” clause is to find a way of making it as cheap as possible to implement.

A couple of people came up with suggestions for rewriting the query to make it more efficient. One suggested writing it as a “NOT IN” subquery, but it’s worth remembering that the optimizer may cheerfully transform a “NOT IN” subquery to a “NOT EXISTS” subquery if it’s legal, and a manual rewrite may overlook the problem of NULLs; another suggested rewriting the query as an outer join, but again it’s worth remembering that the optimizer may transform a “NOT EXISTS” subquery to an “ANTI-JOIN” – which is a bit like an outer join with filter, only more efficient. So, before suggesting a rewrite, it’s worth looking at the execution plan to see what the optimizer is doing just in case it’s doing something silly. There are two options – anti-join or filter subquery.
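
For reference, here are sketches of the two rewrites, using the OP’s table names. Remember the caveats: the NOT IN version can silently lose rows if any of b.c, b.d, b.e can be null, and the outer join version needs a filter column that is guaranteed non-null in matched rows – b.rowid is a safe choice:

-- "NOT IN" form: null comparisons can silently eliminate rows
select	a.c, a.d, a.e, a.f
from	a
where	(a.c, a.d, a.e) not in (
		select b.c, b.d, b.e from b
	)
;

-- outer join form: keep only the rows of A with no match in B
select	a.c, a.d, a.e, a.f
from	a
left outer join
	b
on	b.c = a.c
and	b.d = a.d
and	b.e = a.e
where	b.rowid is null
;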

Here, with code I’ve run under 10.2.0.5 to match the OP, is a demonstration data set, with the two plans you might expect to see – first, some of the data:


execute dbms_random.seed(0)

create table t1
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(0,4))           c,
        trunc(dbms_random.value(0,5))           d,
        trunc(dbms_random.value(0,300))         e,
        rownum                                  f,
        rpad('x',100)                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table t2
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(0,4))           c,
        trunc(dbms_random.value(0,5))           d,
        trunc(dbms_random.value(0,300))         e,
        rownum                                  f,
        rpad('x',100)                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 24000
;

create index t1_i1 on t1(c,d,e);
create index t2_i1 on t2(c,d,e);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

The OP had followed up their original query with a claim that “Table A holds 100 million rows and table B holds 24,000” – that’s a lot of checks (if true) and you ought to be asking how quickly the OP expects the query to run and how many of the 100M rows are going to survive the check. I’ve set up just 1M rows with 6,000 distinct values for the column combination (c,d,e), and a reference table with 24,000 rows which are likely to include most, but not all, of those 6,000 combinations.

Rather than generate a very large output, I’ve written a query that generates the required data set, then counts it:


select
        max(f), count(*)
from (
        SELECT   /*+ no_merge */
                 A.c,
                 A.d,
                 A.e,
                 A.f
          FROM   t1 A
        WHERE   NOT EXISTS (SELECT   /* no_unnest */
                                      1
                               FROM   t2 B
                              WHERE   B.c = A.c AND B.d = A.d AND B.e = A.e)
)
;

This took about 0.35 seconds to run – aggregating roughly 14,500 rows from 1M. The plan was (as I had expected) based on a (right) hash anti join:


---------------------------------------------------------------------------------
| Id  | Operation               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |       |     1 |    13 |  2183   (5)| 00:00:11 |
|   1 |  SORT AGGREGATE         |       |     1 |    13 |            |          |
|   2 |   VIEW                  |       |   999K|    12M|  2183   (5)| 00:00:11 |
|*  3 |    HASH JOIN RIGHT ANTI |       |   999K|    23M|  2183   (5)| 00:00:11 |
|   4 |     INDEX FAST FULL SCAN| T2_I1 | 24000 |   234K|    11  (10)| 00:00:01 |
|   5 |     TABLE ACCESS FULL   | T1    |  1000K|    14M|  2151   (4)| 00:00:11 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("B"."C"="A"."C" AND "B"."D"="A"."D" AND "B"."E"="A"."E")

Oracle has built an in-memory hash table from the 24,000 rows in t2, then scanned the t1 table, probing the hash table with each row in turn. That’s 1M probes in less than 0.35 seconds. You ought to infer from this that most of the time spent in the original query would have been spent scanning the 100M rows, with only a relatively small increment appearing due to the “not exists” clause.

You’ll notice, though, that there was a comment in my subquery with the /* no_unnest */ hint embedded – if I change this from a comment to a hint (/*+ */) I should get a plan with a filter subquery, and maybe that’s what’s happening to the OP for some odd reason. Here’s the plan:


------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    13 | 15166   (1)| 00:01:16 |
|   1 |  SORT AGGREGATE      |       |     1 |    13 |            |          |
|   2 |   VIEW               |       |   999K|    12M| 15166   (1)| 00:01:16 |
|*  3 |    FILTER            |       |       |       |            |          |
|   4 |     TABLE ACCESS FULL| T1    |  1000K|    14M|  2155   (4)| 00:00:11 |
|*  5 |     INDEX RANGE SCAN | T2_I1 |     4 |    40 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter( NOT EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T2" "B"
              WHERE "B"."E"=:B1 AND "B"."D"=:B2 AND "B"."C"=:B3))
   5 - access("B"."C"=:B1 AND "B"."D"=:B2 AND "B"."E"=:B3)

The query took 1.65 seconds to complete. (And re-running with rowsource execution statistics enabled, I found that the subquery had executed roughly 914,000 times in that 1.65 seconds). Even if the original query had used the filter subquery plan the subquery shouldn’t have made much difference to the overall performance. Of course if T2 didn’t have that index on (c,d,e) then the filter subquery plan would have been much more expensive – but then, we would really have expected to see the hash anti-join.
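
If you want to reproduce that check, here’s a sketch of one way to capture rowsource execution statistics (the same mechanism appears again in the “Bitmap Counts” post below) – the Starts column against the index range scan in the FILTER plan reports how many times the subquery actually ran:

set serveroutput off
alter session set statistics_level = all;

select
        max(f), count(*)
from (
        select  /*+ no_merge */
                a.c, a.d, a.e, a.f
        from    t1 a
        where   not exists (select /*+ no_unnest */ 1
                            from   t2 b
                            where  b.c = a.c and b.d = a.d and b.e = a.e)
)
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));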

If you’re wondering why the subquery ran 914,000 times instead of 1M times, you’ve forgotten “scalar subquery caching”.  The session caches a limited number of results from subquery execution as a query runs and may be able to use cached results (or simply a special “previous-execution” result) to minimise the number of executions of the subquery.
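
If you want to see the caching at work, here’s a sketch using a (purely illustrative) counter package to track how often a PL/SQL function is called – with only 4 distinct values of t1.c, wrapping the call in a scalar subquery should drop the call count from one per row to a handful (exact figures vary with version):

create or replace package pkg_counter as
	g_calls	number := 0;
end;
/

create or replace function f_probe(i_val number) return number as
begin
	pkg_counter.g_calls := pkg_counter.g_calls + 1;
	return i_val;
end;
/

set serveroutput on

-- direct call: executed once per row
select count(*) from t1 where f_probe(c) >= 0;
execute dbms_output.put_line('Calls: ' || pkg_counter.g_calls)

-- scalar subquery: results cached by the value of the correlation column c
execute pkg_counter.g_calls := 0
select count(*) from t1 where (select f_probe(c) from dual) >= 0;
execute dbms_output.put_line('Calls: ' || pkg_counter.g_calls)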

Did you notice the index I created on t1(c,d,e)? If I drive the query through this index I’ll access all the rows for a given combination of (c,d,e) one after the other and only have to run the subquery once for the set. To make this happen, though, I’ll have to declare one of the columns to be NOT NULL, or add a suitable “column is not null” predicate to the query; and then I’ll probably have to hint the query anyway:


select
        max(f)
from (
        SELECT   /*+ no_merge index(a) */
                 A.c,
                 A.d,
                 A.e,
                 A.f
          FROM   t1 A
        WHERE   NOT EXISTS (SELECT   /*+ no_unnest */
                                      1
                               FROM   t2 B
                              WHERE   B.c = A.c AND B.d = A.d AND B.e = A.e)
        and     c is not null
)
;

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |    13 | 65706   (1)| 00:05:29 |
|   1 |  SORT AGGREGATE               |       |     1 |    13 |            |          |
|   2 |   VIEW                        |       |   999K|    12M| 65706   (1)| 00:05:29 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1    | 50000 |   732K| 52694   (1)| 00:04:24 |
|*  4 |     INDEX FULL SCAN           | T1_I1 | 50000 |       |  2869   (2)| 00:00:15 |
|*  5 |      INDEX RANGE SCAN         | T2_I1 |     4 |    40 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("C" IS NOT NULL AND  NOT EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM
              "T2" "B" WHERE "B"."E"=:B1 AND "B"."D"=:B2 AND "B"."C"=:B3))
   5 - access("B"."C"=:B1 AND "B"."D"=:B2 AND "B"."E"=:B3)

Re-running this code with rowsource execution statistics enabled showed that the subquery ran just 6,000 times (as expected) – for a total run time that was slightly faster than the hash anti-join method (0.17 seconds – but I do have a new laptop using SSD only, with a 3.5GHz CPU and lots of memory).

Every which way, if we can get reasonable performance from the underlying table access there’s no way that introducing a “NOT EXISTS” ought to be a disaster. The worst case scenario is that, for some reason, Oracle chooses to run a filter subquery plan when the appropriate index hasn’t been created to support it.

Footnote:

Of course, table A didn’t really exist, it was a three table join; and it didn’t produce 100M rows, it produced anything between zero and 5 million rows, and the effect of the subquery (which correlated back to two of the joined tables) was to leave anything between 0 and 5 million rows. And (apparently) the query was quick enough in the absence of the subquery (producing, for example, 1 million rows in only 5 minutes), but too slow with the subquery in place.

But that’s okay. Because of our tests we know that once we’ve produced a few million rows it takes fractions of a second more to pass them through a hash table with an anti-join to deal with the “not exists” subquery; and I doubt if we have to play silly games to push the data through a filter subquery plan in the right order to squeeze a few extra hundredths of a second from the query.

If the OP is happy with the basic select statement before the “not exists” subquery, all he has to do is take advantage of a no_merge hint:


select  {list of columns}
from
        (
        select /*+ no_merge */ .... rest of original query
        )    v1
where
        not exists (
                select  null
                from    b
                where   b.c = v1.c and b.d = v1.d and b.e = v1.e
        )
;

You’re probably wondering why the OP currently sees a performance problem as the subquery is added. The best guess is that the subquery has introduced a “magic 5% fudge factor” to the arithmetic (did you notice the cardinality of t1 dropping from 1M to 50,000 in the plan above?) and made the optimizer pick a worse execution plan for the rest of the query. We can’t tell, though, since the OP hasn’t yet given us the information that would allow us to see what’s going wrong.

April 10, 2015

Counting

Filed under: Execution plans,Indexing,Oracle,Performance — Jonathan Lewis @ 5:27 pm BST Apr 10,2015

There’s a live example on OTN at the moment of an interesting class of problem that can require some imaginative thinking. It revolves around a design that uses a row in one table to hold the low and high values for a range of values in another table. The problem is then simply to count the number of rows in the second table that fall into the range given by the first table. There’s an obvious query you can write (a join with inequality) but if you have to join each row in the first table to several million rows in the second table, then aggregate to count them, that’s an expensive strategy.  Here’s the query (with numbers of rows involved) that showed up on OTN; it’s an insert statement, and the problem is that it takes 7 hours to insert 37,600 rows:


    INSERT INTO SA_REPORT_DATA
    (REPORT_ID, CUTOFF_DATE, COL_1, COL_2, COL_3)
    (
    SELECT 'ISRP-734', to_date('&DateTo', 'YYYY-MM-DD'),
           SNE.ID AS HLR
    ,      SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER AS NUMBER_RANGE
    ,      COUNT(M.MSISDN) AS AVAILABLE_MSISDNS
    FROM
           SA_NUMBER_RANGES SNR          -- 10,000 rows
    ,      SA_SERVICE_SYSTEMS SSS        --  1,643 rows
    ,      SA_NETWORK_ELEMENTS SNE       --    200 rows
    ,      SA_MSISDNS M                  --    72M rows
    WHERE
           SSS.SEQ = SNR.SRVSYS_SEQ
    AND    SSS.SYSTYP_ID = 'OMC HLR'
    AND    SNE.SEQ = SSS.NE_SEQ
    AND    SNR.ID_TYPE = 'M'
    AND    M.MSISDN  >= SNR.FROM_NUMBER
    AND    M.MSISDN  <= SNR.TO_NUMBER
    AND    M.STATE  = 'AVL'
    GROUP BY
           SNE.ID,SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER
    )  

The feature here is that we are counting ranges of MSISDN: we take 10,000 number ranges (SNR) and join with inequality to a 72M row table. It’s perfectly conceivable that at some point the data set expands (not necessarily all at once) to literally tens of billions of rows that are then aggregated down to the 37,600 that are finally inserted.

The execution plan shows the optimizer joining the first three tables before doing a merge join between that result set and the relevant subset of the MSISDNs table – which means the MSISDNs have to be sorted and buffered (with a probable spill to disc) before they can be used. It would be interesting to see the rowsource execution stats for the query – partly to see how large the generated set became, but also to see if the ranges involved were so large that most of the time went in constantly re-reading the sorted MSISDNs from the temporary tablespace.

As far as optimisation is concerned, there are a couple of trivial things around the edges we can examine: we have 10,000 number ranges but insert 37,600 results, and the last stages of the plan generated those results so we’ve scanned and aggregated the sorted MSISDNs 37,600 times. Clearly we could look for a better table ordering that eliminated any redundant number ranges early, then did the minimal number of joins to MSISDN, aggregated, then scaled up to 37,600: with the best join order we might reduce the run time by a factor of 3 or more. (But that’s still a couple of hours run time.)

What we really need to do to make a difference is change the infrastructure in some way – preferably invisibly to the rest of the application. There are a number of specific details relating to workload, read-consistency, timing, concurrency, etc. that will need to be considered, but broadly speaking, we need to take advantage of a table that effectively holds the “interesting” MSISDNs in sorted order. I’ve kept the approach simple here; it needs a few modifications for a production system. The important bit of the report is the bit that produces the count, so I’m only going to worry about a two-table join – number ranges and msisdns; here’s some model data:


execute dbms_random.seed(0)

create table msisdns
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      msisdn
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table number_ranges
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      from_number,
        trunc(dbms_random.value(1e9,1e10))      to_number
from
        generator       v1
where
        rownum  <= 1000
;

update number_ranges set
        from_number = to_number,
        to_number = from_number
where
        to_number < from_number
;

commit;

I’ve created a table of numbers with values between 1e9 and 1e10 to represent 1 million MSISDNs, and a list of 1,000 number ranges – making sure that the FROM number is not greater than the TO number. (The swap in the update works in a single statement because the expressions on the right-hand side of the assignments are evaluated against the pre-update values of the row.) Now I need a “summary” table of the MSISDNs, which I’m going to create as an index-organized table:


create table tmp_msisdns (
        msisdn,
        counter,
        constraint tmp_pk primary key (msisdn, counter)
)
organization index
as
select
        msisdn,
        row_number() over(order by msisdn)      counter
from
        msisdns
;

This is only a demonstration so I haven’t bothered with production-like code to check that the MSISDNs I had generated were unique (they were); and I’ve casually included the row_number() as part of the primary key as a performance fiddle even though it’s something that could, technically, allow some other program to introduce bad data if I made the table available for public use rather than task specific.

Finally we get down to the report. To find out how many MSISDN values there are between the FROM and TO number in a range I just have to find the lowest and highest MSISDNs from tmp_msisdns in that range and find the difference between their counter values, and add 1. And there’s a very fast way to find the lowest or highest values when you have the appropriate index – the min/max range scan – but you have to access the table twice, once for the low, once for the high. Here’s the necessary SQL, with execution plan from 12.1.0.2:


select
        nr.from_number, nr.to_number,
--      fr1.msisdn, fr1.counter,
--      to1.msisdn, to1.counter,
        1 + to1.counter - fr1.counter range_count
from
        number_ranges   nr,
        tmp_msisdns     fr1,
        tmp_msisdns     to1
where
        fr1.msisdn = (
                select min(msisdn) from tmp_msisdns where tmp_msisdns.msisdn >= nr.from_number
        )
and     to1.msisdn = (
                select max(msisdn) from tmp_msisdns where tmp_msisdns.msisdn <= nr.to_number
        )
;

-------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |               |       |       |  4008 (100)|          |
|   1 |  NESTED LOOPS                   |               |  1000 | 38000 |  4008   (1)| 00:00:01 |
|   2 |   NESTED LOOPS                  |               |  1000 | 26000 |  2005   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL            | NUMBER_RANGES |  1000 | 14000 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN             | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   5 |     SORT AGGREGATE              |               |     1 |     7 |            |          |
|   6 |      FIRST ROW                  |               |     1 |     7 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN (MIN/MAX)| TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
|*  8 |   INDEX RANGE SCAN              | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   9 |    SORT AGGREGATE               |               |     1 |     7 |            |          |
|  10 |     FIRST ROW                   |               |     1 |     7 |     3   (0)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN (MIN/MAX) | TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("FR1"."MSISDN"=)
   7 - access("TMP_MSISDNS"."MSISDN">=:B1)
   8 - access("TO1"."MSISDN"=)
  11 - access("TMP_MSISDNS"."MSISDN"<=:B1)

Execution time – with 1 million MSISDNs and 1,000 ranges: 0.11 seconds.

For comparative purposes, and to check that the code is producing the right answers, here’s the basic inequality join method:


select
        nr.from_number, nr.to_number, count(*) range_count
from
        number_ranges   nr,
        msisdns         ms
where
        ms.msisdn >= nr.from_number
and     ms.msisdn <= nr.to_number
group by
        nr.from_number, nr.to_number
order by
        nr.from_number
;

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |       |       |       |   472K(100)|          |
|   1 |  HASH GROUP BY        |               |   707K|    14M|  6847M|   472K (17)| 00:00:19 |
|   2 |   MERGE JOIN          |               |   255M|  5107M|       | 13492  (77)| 00:00:01 |
|   3 |    SORT JOIN          |               |  1000 | 14000 |       |     3  (34)| 00:00:01 |
|   4 |     TABLE ACCESS FULL | NUMBER_RANGES |  1000 | 14000 |       |     2   (0)| 00:00:01 |
|*  5 |    FILTER             |               |       |       |       |            |          |
|*  6 |     SORT JOIN         |               |  1000K|  6835K|    30M|  3451   (7)| 00:00:01 |
|   7 |      TABLE ACCESS FULL| MSISDNS       |  1000K|  6835K|       |   245  (14)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("MS"."MSISDN"<="NR"."TO_NUMBER")
   6 - access("MS"."MSISDN">="NR"."FROM_NUMBER")
       filter("MS"."MSISDN">="NR"."FROM_NUMBER")

The two queries produced the same results (apart from ordering); but the second query took 2 minutes 19.4 seconds to complete.


Update:

In a moment of idle curiosity I recreated the data with 40 million rows in the MSISDNs table to get some idea of how fast the entire report process could go when re-engineered (remember the OP has 72M rows, but selects only the subset flagged as ‘AVL’). It took 1 minute 46 seconds to create the IOT – after which the report for 1,000 number ranges still took less than 0.2 seconds.


February 19, 2015

255 columns

Filed under: Infrastructure,Oracle,Performance — Jonathan Lewis @ 12:45 am BST Feb 19,2015

You all know that having more than 255 columns in a table is a Bad Thing™ – and surprisingly you don’t even have to get to 255 to hit the first bad thing about wide tables. If you’ve ever wondered what sorts of problems you can have, here are a few:

  • If you’re still running 10g and gather stats on a table with more than roughly 165 columns then the query Oracle uses to collect the stats will only handle about 165 of them at a time; so you end up doing multiple (possibly sampled) tablescans to gather the stats. The reason why I can’t give you an exact figure for the number of columns is that it depends on the type and nullity of the columns – Oracle knows that some column types are fixed length (e.g. date types, char() types) and if any columns are declared not null then Oracle doesn’t have to worry about counting nulls – so for some of the table columns Oracle will be able to eliminate one or two of the related columns it normally includes in the stats-gathering SQL statement – which means it can gather stats on a few more table columns.  The 165-ish limit doesn’t apply in 11g – though I haven’t checked to see if there’s a larger limit before the same thing happens.
  • If you have more than 255 columns in a row Oracle will split it into multiple row pieces of 255 columns each plus one row piece for “the rest”; but the split counts from the end, so if you have a table with 256 columns the first row-piece has one column and the second row-piece has 255 columns. This is bad news for all sorts of operations because Oracle will have to expend extra CPU chasing the row pieces to make use of any column not in the first row piece. The optimists among you might have expected “the rest” to be in the last row piece. If you want to be reminded how bad row-chaining can get for wide tables, just have a look at an earlier blog note of mine (starting at this comment).
  • A particularly nasty side effect of the row split comes with direct path tablescans – and that’s what Oracle does automatically when the table is large. In many cases all the row pieces for a row will be in the same block; but they might not be, and if a continuation row-piece is in a different block Oracle will do a “db file sequential read” to read that block into the buffer cache and it won’t be cached (see 1st comment below). As an indication of how badly this can affect performance, the results I got at a recent client site showed “select count(col1) from wide_table” taking 10 minutes while “select count(column40) from wide_table” took 22 minutes because roughly one row in a hundred required a single block read to follow the chain.
  • An important side effect of the split point is that you really need to put the columns you’re going to index near the start of the table to minimise the risk of this row chaining overhead when you create or rebuild an index.
  • On top of everything else, of course, it takes a surprisingly large amount of extra CPU to load a large table if the rows are chained. Another client test reported 140 CPU seconds to load 5M rows of 256 columns, but only 20 CPU seconds to load 255.

If you are going to have tables with more than 255 columns, think very carefully about column order – if you can get all the columns that are almost always null at the end of the row you may get lucky and find that you never need to create a secondary row piece. A recent client had about 290 columns in one table of 16M rows, and 150 columns were null for all 16M rows – unfortunately they had a mandatory “date_inserted” column at the end of the row, but with a little column re-arrangement they eliminated row chaining and saved (more than) 150 bytes storage per row. Of course, if they have to add and back-fill a non-null column to the table they’re going to have to rebuild the table to insert the column “in the middle”; otherwise all new data will be chained and waste 150 bytes per row, and any old data that gets updated will suffer a row migration/chain catastrophe.
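
If you want to check whether an existing table is already suffering from this, one cheap approach (a sketch – wide_table and column40 here are stand-ins for your own table and a column late in the row) is to snapshot the “table fetch continued row” session statistic around a query that has to reach a late column:

select	ms.value
from	v$statname sn, v$mystat ms
where	ms.statistic# = sn.statistic#
and	sn.name = 'table fetch continued row';

select count(column40) from wide_table;

select	ms.value
from	v$statname sn, v$mystat ms
where	ms.statistic# = sn.statistic#
and	sn.name = 'table fetch continued row';

The difference between the two snapshots tells you how many times the query had to follow a pointer to a continuation row piece.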

February 8, 2015

Functions & Subqueries

Filed under: Oracle,Performance,Subquery Factoring,Tuning — Jonathan Lewis @ 4:12 am BST Feb 8,2015

I think the “mini-series” is a really nice blogging concept – it can pull together a number of short articles to offer a much better learning experience for the reader than they could get from the random collection of sound-bites that so often typifies an internet search; so here’s my recommendation for this week’s mini-series: a set of articles by Sayan Malakshinov a couple of years ago comparing the behaviour of Deterministic Functions and Scalar Subquery Caching.

http://orasql.org/2013/02/10/deterministic-function-vs-scalar-subquery-caching-part-1/

http://orasql.org/2013/02/11/deterministic-function-vs-scalar-subquery-caching-part-2/

http://orasql.org/2013/03/13/deterministic-function-vs-scalar-subquery-caching-part-3/

Footnote:
Although I’ve labelled it as “this week’s” series, I wouldn’t want you to assume that I’ll be trying to find a new mini-series every week.

Footnote 2:
I had obviously expected to publish this note a long time ago – but must have forgotten about it. I was prompted to search my blog for “deterministic” very recently thanks to a recent note on the OTN database forum and discovered both this note and an incomplete note about improving the speed of creating function-based indexes by tweaking hidden parameters – which I might yet publish, although if you read all of Sayan’s articles you’ll find the solution anyway.


January 19, 2015

Bitmap Counts

Filed under: bitmaps,Indexing,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 12:15 pm BST Jan 19,2015

In an earlier (not very serious) post about count(*) I pointed out how the optimizer sometimes does a redundant “bitmap conversion to rowid” when counting. In the basic count(*) example I showed this wasn’t a realistic issue unless you had set cursor_sharing to “force” (or the now-deprecated “similar”). There are, however, some cases where the optimizer can do this in more realistic circumstances and this posting models a scenario I came across a few years ago. The exact execution path has changed over time (i.e. version) but the anomaly persists, even in 12.1.0.2.

First we create a “fact” table and a dimension table, with a bitmap index on the fact table and a corresponding primary key on the dimension table:


create table area_sales (
	area		varchar2(10)	not null,
	dated		date		not null,
	category	number(3)	not null,
	quantity	number(8,0),
	value		number(9,2),
	constraint as_pk primary key (dated, area),
	constraint as_area_ck check (area in ('England','Ireland','Scotland','Wales'))
)
;

insert into area_sales
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 3000
)
select
	decode(mod(rownum,4),
		0,'England',
		1,'Ireland',
		2,'Scotland',
		3,'Wales'
	),
	sysdate + 0.0001 * rownum,
	mod(rownum-1,300),
	rownum,
	rownum
from
	generator,
	generator
where
	rownum <= 1e6
;

create bitmap index as_bi on area_sales(category) pctfree 0;

create table dim (
	id	number(3) not null,
	padding	varchar2(40)
)
;

alter table dim add constraint dim_pk primary key(id);

insert into dim
select
	distinct category, lpad(category,40,category)
from	area_sales
;

commit;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'AREA_SALES',
		method_opt 	 => 'for all columns size 1',
		cascade		 => true
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'DIM',
		method_opt 	 => 'for all columns size 1',
		cascade		 => true
	);
end;
/

Now we run a few queries and show their execution plans with rowsource execution statistics. First a query to count the number of distinct categories used in the area_sales table, then a query to list the IDs from the dim table that appear in the area_sales table, then the same query hinted to run efficiently.


set trimspool on
set linesize 156
set pagesize 60
set serveroutput off

alter session set statistics_level = all;

select
	distinct category
from
	area_sales
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

==========================================
select  distinct category from  area_sales
==========================================
---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |      1 |        |    300 |00:00:00.01 |     306 |       |       |          |
|   1 |  HASH UNIQUE                 |       |      1 |    300 |    300 |00:00:00.01 |     306 |  2294K|  2294K| 1403K (0)|
|   2 |   BITMAP INDEX FAST FULL SCAN| AS_BI |      1 |   1000K|    600 |00:00:00.01 |     306 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------

As you can see, Oracle is able to check the number of distinct categories very quickly by scanning the bitmap index and extracting ONLY the key values from each of the 600 index entries that make up the whole index (the E-rows figure effectively reports the number of rowids identified by the index, but Oracle doesn’t evaluate them to answer the query).


=======================================================================
select  /*+   qb_name(main)  */  dim.* from dim where  id in (   select
   /*+     qb_name(subq)    */    distinct category   from
area_sales  )
========================================================================

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |      1 |        |    300 |00:00:10.45 |     341 |       |       |          |
|*  1 |  HASH JOIN SEMI               |       |      1 |    300 |    300 |00:00:10.45 |     341 |  1040K|  1040K| 1260K (0)|
|   2 |   TABLE ACCESS FULL           | DIM   |      1 |    300 |    300 |00:00:00.01 |      23 |       |       |          |
|   3 |   BITMAP CONVERSION TO ROWIDS |       |      1 |   1000K|    996K|00:00:02.64 |     318 |       |       |          |
|   4 |    BITMAP INDEX FAST FULL SCAN| AS_BI |      1 |        |    599 |00:00:00.01 |     318 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

What we see here is that (unhinted) Oracle has converted the IN subquery to an EXISTS subquery then to a semi-join which it has chosen to operate as a HASH semi-join. But in the process of generating the probe (second) table Oracle has converted the bitmap index entries into a set of rowids – all 1,000,000 of them in my case – introducing a lot of redundant work. In the original customer query (version 9 or 10, I forget which) the optimizer unnested the subquery and converted it into an inline view with a distinct – but still performed a redundant bitmap conversion to rowids. In the case of the client, with rather more than 1M rows, this wasted a lot of CPU.


=====================================================================
select  /*+   qb_name(main)  */  dim.* from (  select   /*+
qb_name(inline)    no_merge    no_push_pred   */   distinct category
from   area_sales  ) sttv,  dim where  dim.id = sttv.category
=====================================================================

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |      1 |        |    300 |00:00:00.02 |     341 |       |       |          |
|*  1 |  HASH JOIN                     |       |      1 |    300 |    300 |00:00:00.02 |     341 |  1969K|  1969K| 1521K (0)|
|   2 |   VIEW                         |       |      1 |    300 |    300 |00:00:00.01 |     306 |       |       |          |
|   3 |    HASH UNIQUE                 |       |      1 |    300 |    300 |00:00:00.01 |     306 |  2294K|  2294K| 2484K (0)|
|   4 |     BITMAP INDEX FAST FULL SCAN| AS_BI |      1 |   1000K|    600 |00:00:00.01 |     306 |       |       |          |
|   5 |   TABLE ACCESS FULL            | DIM   |      1 |    300 |    300 |00:00:00.01 |      35 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------

By introducing a manual unnest in the original client code I avoided the bitmap conversion to rowid, and the query executed much more efficiently. As you can see the optimizer has predicted the 1M rowids in the inline view, but used only the key values from the 600 index entries. In the case of the client it really was a case of manually unnesting a subquery that the optimizer was automatically unnesting – but without introducing the redundant rowids.

In my recent 12c test I had to include the no_merge and no_push_pred hints. In the absence of the no_merge hint Oracle did a join then group by, doing the rowid expansion in the process; if I added the no_merge hint without the no_push_pred hint then Oracle did a very efficient nested loop semi-join into the inline view. Although this still did the rowid expansion (predicting 3,333 rowids per key) it “stops early” thanks to the “semi” nature of the join so ran very quickly:


=========================================================================
select  /*+   qb_name(main)  */  dim.* from (  select   /*+
qb_name(inline)    no_merge   */   distinct category  from   area_sales
 ) sttv,  dim where  dim.id = sttv.category
=========================================================================

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |      1 |        |    300 |00:00:00.02 |     348 |
|   1 |  NESTED LOOPS SEMI            |       |      1 |    300 |    300 |00:00:00.02 |     348 |
|   2 |   TABLE ACCESS FULL           | DIM   |      1 |    300 |    300 |00:00:00.01 |      35 |
|   3 |   VIEW PUSHED PREDICATE       |       |    300 |   3333 |    300 |00:00:00.01 |     313 |
|   4 |    BITMAP CONVERSION TO ROWIDS|       |    300 |   3333 |    300 |00:00:00.01 |     313 |
|*  5 |     BITMAP INDEX SINGLE VALUE | AS_BI |    300 |        |    300 |00:00:00.01 |     313 |
-------------------------------------------------------------------------------------------------

Bottom line on all this – check your execution plans that use bitmap indexes: if you see a “bitmap conversion to rowids” in cases where you don’t then visit the table, it may be a redundant conversion, and it may be expensive. If you suspect that this is happening then dbms_xplan.display_cursor() may confirm that you are doing a lot of CPU-intensive work to produce a very large number of rowids that you don’t need.

January 7, 2015

Most Recent

Filed under: Execution plans,Oracle,Performance,subqueries — Jonathan Lewis @ 6:21 pm BST Jan 7,2015

There’s a thread on the OTN database forum at present asking for advice on optimising a query that’s trying to find “the most recent price” for a transaction given that each transaction is for a stock item on a given date, and each item has a history of prices where each historic price has an effective start date. This means the price for a transaction is the price as at the most recent date prior to the transaction date.

There is an absolutely standard way of expressing “the most recent occurrence” in SQL. Assume we have a table of (item_code, effective_date, price) with the obvious primary key of (item_code, effective_date); then a requirement to find “the most recent price for item XXXX as at 25th Dec 2014” would give us code like the following (note – all the examples in this note were run against Oracle 11.2.0.4):


select  *
from    prices  pri1
where   item_code = 'XXXX'
and     effective_date = (
                select  max(effective_date)
                from    prices  pri2
                where   pri2.item_code = 'XXXX'
                and     pri2.effective_date <= date'2014-12-25'
        )
/

The ideal execution plan that we should expect to see for this query is as follows (with a small variation if you had created the prices table as an index-organized table – which would probably be sensible in many cases):


-----------------------------------------------------------------------------------------
| Id  | Operation                      | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |     1 |    52 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID   | PRICES |     1 |    52 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN            | PRI_PK |     1 |       |     1   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE              |        |     1 |    32 |            |          |
|   4 |     FIRST ROW                  |        |     1 |    32 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN (MIN/MAX)| PRI_PK |     1 |    32 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ITEM_CODE"='XXXX' AND "EFFECTIVE_DATE"= (SELECT
              MAX("EFFECTIVE_DATE") FROM "PRICES" "PRI2" WHERE
              "PRI2"."EFFECTIVE_DATE"<=TO_DATE(' 2014-12-25 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "PRI2"."ITEM_CODE"='XXXX'))

   5 - access("PRI2"."ITEM_CODE"='XXXX' AND "PRI2"."EFFECTIVE_DATE"<=
             TO_DATE('2014-12-25 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

As you can see, this plan is using the “driving subquery” approach – the order of operation is 5, 4, 3, 2, 1, 0: we do an index min/max range scan in line 5 to find the maximum effective date for the item, then pass that up through the (essentially redundant) First Row and Sort Aggregate operations to use as an input to the index unique scan at operation 2 which passes the rowid up to operation 1 to find the specific row. In my case this was 2 consistent gets for the range scan, 2 more for the unique scan, and one for the table access.

You might point out that my example uses the item_code ‘XXXX’ twice, once in the main query, once in the subquery; and you might decide that this was in very poor taste since we should clearly be using a correlated subquery – the correlating predicate ought to be: pri2.item_code = pri1.item_code. Here’s the execution plan I got when I made that change:


----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     1 |    78 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |         |     1 |    78 |     3   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |         |     1 |    78 |     3   (0)| 00:00:01 |
|   3 |    VIEW                      | VW_SQ_1 |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |     FILTER                   |         |       |       |            |          |
|   5 |      HASH GROUP BY           |         |     1 |    32 |     2   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN       | PRI_PK  |     1 |    32 |     2   (0)| 00:00:01 |
|*  7 |    INDEX UNIQUE SCAN         | PRI_PK  |     1 |       |     0   (0)| 00:00:01 |
|   8 |   TABLE ACCESS BY INDEX ROWID| PRICES  |     1 |    52 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("PRI2"."ITEM_CODE"='XXXX')
   6 - access("PRI2"."ITEM_CODE"='XXXX' AND "PRI2"."EFFECTIVE_DATE"<=
              TO_DATE('2014-12-25 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   7 - access("ITEM_CODE"='XXXX' AND "EFFECTIVE_DATE"="MAX(EFFECTIVE_DATE)")

The plan changes dramatically: the optimizer has unnested the subquery. In my case this didn’t make any difference to the overall performance as my data set was small, I only had one or two prices per item code, and the query was very basic; but in most other cases the change could be catastrophic.
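
If the unnested plan does turn out to be the catastrophe, a sketch of one way to push the optimizer back towards the driving-subquery plan (assuming the hint is honoured in your version) is simply to block the transformation:

select  *
from    prices  pri1
where   item_code = 'XXXX'
and     effective_date = (
                select  /*+ no_unnest */
                        max(effective_date)
                from    prices  pri2
                where   pri2.item_code = pri1.item_code
                and     pri2.effective_date <= date'2014-12-25'
        )
/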

The Problem Query

The requirement on OTN had a stock transactions (xo_stock_trans) table and a prices (xo_prices) table, and the OP had supplied some code to create and populate these tables with 6.4 million and 4.5 million rows respectively. Unfortunately the xo_prices table didn’t have a suitable unique constraint on it and ended up with lots of items having multiple prices for the same date.  The OP had created a function to return a price for an item given a driving date and price_type, and had a query that called that function three times per row (once for each of three price types); but this did not perform very well and the OP wanted to know if there was a way of addressing the requirement efficiently using pure SQL; (s)he had already tried the following:


select tr.item, tr.trans_date, tr.quantity
    , pr.gross_price
    , pr.net_price
    , pr.special_price
from xo_stock_trans tr
join xo_prices pr on pr.item = tr.item
                and pr.price_date = (select max(pr2.price_date)
                                     from xo_prices pr2
                                     where pr2.item = pr.item
                                       and pr2.price_date <= tr.trans_date
                                     )
where tr.trans_date between '01-AUG-2014' and '31-AUG-2014';  

That was SO close – it’s clearly implementing the right sort of strategy: but it didn’t perform well, so let’s check the execution plan:

------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                    |     1 |    70 |       |   168M(100)|234:06:13 |
|   1 |  NESTED LOOPS                 |                    |     1 |    70 |       |   168M(100)|234:06:13 |
|   2 |   NESTED LOOPS                |                    |     9 |    70 |       |   168M(100)|234:06:13 |
|   3 |    NESTED LOOPS               |                    |     9 |   450 |       |   168M(100)|234:06:13 |
|   4 |     VIEW                      | VW_SQ_1            |   286 | 10010 |       |   168M(100)|234:06:11 |
|   5 |      HASH GROUP BY            |                    |   286 |  7722 |       |   168M(100)|234:06:11 |
|   6 |       MERGE JOIN              |                    |   456G|    11T|       |  9153K(100)| 12:42:50 |
|   7 |        SORT JOIN              |                    |   202K|  2960K|       |   548   (2)| 00:00:03 |
|*  8 |         INDEX RANGE SCAN      | XO_STOCK_TRANS_IX2 |   202K|  2960K|       |   548   (2)| 00:00:03 |
|*  9 |        SORT JOIN              |                    |  4045K|    46M|   154M| 19043   (6)| 00:01:36 |
|* 10 |         INDEX FAST FULL SCAN  | XO_PRICES_IX1      |  4045K|    46M|       |  1936  (10)| 00:00:10 |
|* 11 |     TABLE ACCESS BY USER ROWID| XO_STOCK_TRANS     |     1 |    15 |       |     1   (0)| 00:00:01 |
|* 12 |    INDEX RANGE SCAN           | XO_PRICES_IX1      |     1 |       |       |     2   (0)| 00:00:01 |
|  13 |   TABLE ACCESS BY INDEX ROWID | XO_PRICES          |     1 |    20 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   9 - access(INTERNAL_FUNCTION("PR2"."PRICE_DATE")<=INTERNAL_FUNCTION("TR"."TRANS_DATE"))
       filter(INTERNAL_FUNCTION("PR2"."PRICE_DATE")<=INTERNAL_FUNCTION("TR"."TRANS_DATE"))
  10 - filter("PR2"."PRICE_DATE"<=TO_DATE(' 2014-08-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  11 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  12 - access("ITEM_1"="PR"."ITEM" AND "PR"."PRICE_DATE"="MAX(PR2.PRICE_DATE)")
       filter("PR"."ITEM"="TR"."ITEM")

The query was limited to August 2014, which was about 198,000 rows in my table, so we might expect some signs of a brute-force approach (tablescans and hash joins rather than indexes and nested loops) – but what we get ends up with a high-precision approach with a very bad cardinality estimate after a brute-force unnesting of the “max(price_date)” subquery. The unnesting has done a range scan over 200,000 stock_trans rows, and an index fast full scan on 4.5 million prices to do a merge join and hash aggregation to find the maximum price_date for each target row in the xo_stock_trans table. (See my earlier posting on table duplication for a variation and explanation of what Oracle has done here). This step is a lot of work, but the optimizer thinks it’s going to produce only 286 rows in the aggregated result, so the next steps in the plan are indexed nested loops – which actually operate 198,000 times.

With the clue from my initial description, we need to aim for a strategy where Oracle doesn’t unnest that subquery – so let’s experiment with a basic /*+ no_unnest */ hint in the subquery and see what happens. Here’s the resulting execution plan:


--------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                |   527 | 18445 |       |  6602M  (1)|999:59:59 |
|*  1 |  FILTER                       |                |       |       |       |            |          |
|*  2 |   HASH JOIN                   |                |  3423M|   111G|  5336K| 76973  (90)| 00:06:25 |
|*  3 |    TABLE ACCESS FULL          | XO_STOCK_TRANS |   202K|  2960K|       |  2531  (13)| 00:00:13 |
|   4 |    TABLE ACCESS FULL          | XO_PRICES      |  4571K|    87M|       |  2275  (11)| 00:00:12 |
|   5 |   SORT AGGREGATE              |                |     1 |    12 |       |            |          |
|   6 |    FIRST ROW                  |                |     1 |    12 |       |     3   (0)| 00:00:01 |
|*  7 |     INDEX RANGE SCAN (MIN/MAX)| XO_PRICES_IX1  |     1 |    12 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("PR"."PRICE_DATE"= (SELECT /*+ NO_UNNEST */ MAX("PR2"."PRICE_DATE") FROM
              "XO_PRICES" "PR2" WHERE "PR2"."PRICE_DATE"<=:B1 AND "PR2"."ITEM"=:B2))
   2 - access("PR"."ITEM"="TR"."ITEM")
   3 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
   7 - access("PR2"."ITEM"=:B1 AND "PR2"."PRICE_DATE"<=:B2)

The subquery now survives, and we can see a min/max range scan in the plan – but the subquery is a filter() subquery and is applied to the result of joining the 200,000 transactions to every price that applies for the item in each transaction. The optimizer thinks that this join will produce roughly 3.4 million rows but in fact with the sample data set (which had many prices per item) the join resulted in 4.4 billion rows. The min/max subquery is as efficient as it can be, but it’s running far too often; ideally we would like it to run at most once per transaction, so why is it running late ? We could try adding the /*+ push_subq */ hint to the subquery but if we do the plan doesn’t change.

Our rapid “most recent occurrence” revolved around accessing the prices table by index while “pre-querying” for the date using a min/max subquery that knew the relevant item code already. In this case, though, we’re doing a full tablescan of the xo_prices table so the method doesn’t apply. So let’s manipulate the query to force an indexed access path for the join to the xo_prices table by adding the hints /*+ leading(tr pr) use_nl(pr) index(pr) */ to the main body of the query. This is the resulting plan:


--------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                |   527 | 18445 |  6614M  (1)|999:59:59 |
|   1 |  NESTED LOOPS                   |                |  3413K|   113M|    11M  (2)| 16:29:13 |
|   2 |   NESTED LOOPS                  |                |  3413K|   113M|    11M  (2)| 16:29:13 |
|*  3 |    TABLE ACCESS FULL            | XO_STOCK_TRANS |   202K|  2960K|  2531  (13)| 00:00:13 |
|*  4 |    INDEX RANGE SCAN             | XO_PRICES_IX1  |    16 |       |    52   (2)| 00:00:01 |
|   5 |     SORT AGGREGATE              |                |     1 |    12 |            |          |
|   6 |      FIRST ROW                  |                |     1 |    12 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN (MIN/MAX)| XO_PRICES_IX1  |     1 |    12 |     3   (0)| 00:00:01 |
|   8 |   TABLE ACCESS BY INDEX ROWID   | XO_PRICES      |    17 |   340 |    59   (2)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   4 - access("PR"."ITEM"="TR"."ITEM")
       filter("PR"."PRICE_DATE"= (SELECT /*+ NO_UNNEST */ MAX("PR2"."PRICE_DATE") FROM
              "XO_PRICES" "PR2" WHERE "PR2"."PRICE_DATE"<=:B1 AND "PR2"."ITEM"=:B2))
   7 - access("PR2"."ITEM"=:B1 AND "PR2"."PRICE_DATE"<=:B2)

We’re nearly there: the shape of the execution plan – lines 4 to 7, at any rate – matches the shape of the very simple example at the start of this article, and we seem to be driving from the min/max subquery at line 7; unfortunately when we look at the predicate section of line 4 of the plan we can see that the subquery is still a filter() subquery not an access() subquery – it’s (nominally) being performed for every index entry in the range scan of the xo_prices index that we do for each xo_stock_trans row. What we want to see is an access() subquery – and checking the SQL we can see how to get there: the subquery currently correlates the item back to the xo_prices table, not to the xo_stock_trans table, so let’s correct that correlation. Here’s our final query (though not formatted to my preference) with execution plan:


select /*+ leading(tr pr) use_nl(pr) index(pr) */  -- hint added
       tr.item, tr.trans_date, tr.quantity
    , pr.gross_price
    , pr.net_price
    , pr.special_price
from xo_stock_trans tr
join xo_prices pr on pr.item = tr.item
                and pr.price_date = (select /*+ no_unnest */  -- hint added
                                         max(pr2.price_date)
                                     from xo_prices pr2
                                     where pr2.item = tr.item  -- correlate to tr, not pr
                                       and pr2.price_date <= tr.trans_date
                                     )
where tr.trans_date between '01-AUG-2014' and '31-AUG-2014'
;

--------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                |  3423M|   111G|  1824K  (1)| 02:32:02 |
|   1 |  NESTED LOOPS                   |                |  3413K|   113M|  1824K  (1)| 02:32:02 |
|   2 |   NESTED LOOPS                  |                |  3413K|   113M|  1824K  (1)| 02:32:02 |
|*  3 |    TABLE ACCESS FULL            | XO_STOCK_TRANS |   202K|  2960K|  2531  (13)| 00:00:13 |
|*  4 |    INDEX RANGE SCAN             | XO_PRICES_IX1  |    16 |       |     2   (0)| 00:00:01 |
|   5 |     SORT AGGREGATE              |                |     1 |    12 |            |          |
|   6 |      FIRST ROW                  |                |     1 |    12 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN (MIN/MAX)| XO_PRICES_IX1  |     1 |    12 |     3   (0)| 00:00:01 |
|   8 |   TABLE ACCESS BY INDEX ROWID   | XO_PRICES      |    17 |   340 |     9   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("TR"."TRANS_DATE">=TO_DATE(' 2014-08-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "TR"."TRANS_DATE"<=TO_DATE(' 2014-08-31 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   4 - access("PR"."ITEM"="TR"."ITEM" AND "PR"."PRICE_DATE"= (SELECT /*+ NO_UNNEST */
              MAX("PR2"."PRICE_DATE") FROM "XO_PRICES" "PR2" WHERE "PR2"."PRICE_DATE"<=:B1 AND
              "PR2"."ITEM"=:B2))
   7 - access("PR2"."ITEM"=:B1 AND "PR2"."PRICE_DATE"<=:B2)

Finally we can see (from the predicate for line 4) that we run the subquery at most once for each row from xo_stock_trans and we use the result of each subquery execution to drive the index range scan to pick up the matching rows from xo_prices with no further filtering. The order of operation is: 3, 7, 6, 5, 4, 2, 8, 1, 0

The only thing we can do now is decide whether the strategy for indexing into the xo_prices table 200,000 times (for our 30 day requirement) is better than a brute force approach that does a massive join and sort, or a data duplication approach that puts a “price end date” on each xo_prices row to avoid the need to check all prices for an item to find the appropriate one. Ultimately the choice may depend on trading off the human development resources against the machine run-time resources, with an eye on the number of times the query runs and the size of the date range typically involved.
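For a flavour of the data duplication option, the “price end date” can even be generated on the fly with an analytic function before you commit to changing the data model. A sketch – assuming the duplicate prices per item and date mentioned above have been eliminated, and using an arbitrary high date for the current price:


select
        item,
        price_date,
        lead(price_date, 1, date '9999-12-31')
                over (partition by item order by price_date)    price_end_date,
        gross_price,
        net_price,
        special_price
from
        xo_prices
;

With a price_end_date in place (stored, or from a materialized view) the correlated subquery disappears and the join condition becomes a simple range test: tr.trans_date >= pr.price_date and tr.trans_date < pr.price_end_date.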

Footnote:

There’s plenty more I could say about this query and how to handle it – but there are too many questions about the correctness of the data definition and content to make it worth pursuing in detail. You will note, however, that the various execution plans which logically should be returning the same data report dramatically different cardinalities for the final row source; if nothing else this should warn you that maybe the optimizer is going to have trouble producing a good plan because its model produced a bad cardinality estimate at some point in a series of transformations.

In fact, when I first saw this query I converted it to traditional Oracle syntax (anticipating, incorrectly, a need to do something messy with hints), corrected the subquery correlation to the “obvious” choice, and put in a cardinality hint /*+ cardinality(tr 100) */ for the xo_stock_trans table, and got the execution plan that I’ve managed to produce as the final plan above.

Tactically the correlation column is the really important bit – if that can be set up suitably we just have to work around the optimizer’s arithmetic assumptions.


January 6, 2015

Count (*)

Filed under: Oracle,Performance — Jonathan Lewis @ 6:04 pm BST Jan 6,2015

The old chestnut about comparing speeds of count(*), count(1), count(non_null_column) and count(pk_column) has come up in the OTN database forum (at least) twice in the last couple of months. The standard answer is to point out that they will all execute the same code, and that the corroborating evidence for that claim is that, for a long time, the 10053 trace files have had a rubric reporting: CNT – count(col) to count(*) transformation or, for an even longer time, that the error message file (oraus.msg for the English Language version) has had an error code 10122 which produced (from at least Oracle 8i, if not 7.3):


SQL> execute dbms_output.put_line(sqlerrm(-10122))
ORA-10122: Disable transformation of count(col) to count(*)

But the latest repetition of the question prompted me to check whether a more recent version of Oracle had an even more compelling demonstration, and it does. I extracted the following lines from a 10053 trace file generated by 11.2.0.4 (and I know 10gR2 is similar) in response to selecting count(*), count(1) and count({non-null column}) respectively:


Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(*)" FROM "TEST_USER"."SAVED_ASH" "SAVED_ASH"

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(1)" FROM "TEST_USER"."SAVED_ASH" "SAVED_ASH"

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(SAMPLE_ID)" FROM "TEST_USER"."SAVED_ASH" "SAVED_ASH"

As you can see, Oracle has transformed all three select lists into count(*), hiding the transformation behind the original column alias. As an outsider’s proof of what’s going on, I don’t think you could get a more positive indicator than that.
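If you want to check this on your own system, here’s a minimal sketch for generating the trace – substitute any convenient table of your own (the statements have to be hard parsed to appear in the trace file):


alter session set events '10053 trace name context forever, level 1';

select count(*)         from saved_ash;
select count(1)         from saved_ash;
select count(sample_id) from saved_ash;

alter session set events '10053 trace name context off';

Then search the session’s trace file for the lines starting “Final query after transformations”.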


January 5, 2015

Re-optimization

Filed under: 12c,Oracle,Performance — Jonathan Lewis @ 12:54 pm BST Jan 5,2015

The spelling is with a Z rather than an S because it’s an Oracle thing.

Tim Hall has just published a set of notes on Adaptive Query Optimization, so I thought I’d throw in one extra little detail.

When the optimizer decides that a query execution plan involves some guesswork the run-time engine can monitor the execution of the query and collect some information that may allow the optimizer to produce a better execution plan. The interaction between all the re-optimization mechanisms can get very messy, so I’m not going to try to cover all the possibilities – read Tim’s notes for that – but one of the ways in which this type of information can be kept is now visible in a dynamic performance view.


SQL> select hash_value, sql_id, child_number, hint_text from V$sql_reoptimization_hints;

HASH_VALUE SQL_ID        CHILD_NUMBER HINT_TEXT
---------- ------------- ------------ ----------------------------------------------------------------
3680288808 d5k20s7dpth18            0 OPT_ESTIMATE (@"SEL$6E65FD6A" GROUP_BY ROWS=28.000000 )

1 row selected.

SQL> select sql_id, child_number, is_reoptimizable, sql_text from v$sql where sql_id = 'd5k20s7dpth18';

SQL_ID        CHILD_NUMBER I SQL_TEXT
------------- ------------ - --------------------------------------------------
d5k20s7dpth18            0 Y select * from V$OPTIMIZER_PROCESSING_RATE
d5k20s7dpth18            1 N select * from V$OPTIMIZER_PROCESSING_RATE

2 rows selected.

To get the results above I had flushed the shared pool and then (using the SYS schema) executed the indicated query twice. Pulling the execution plans from memory I got (with a little cosmetic cleaning) these results:


SQL> select * from table(dbms_xplan.display_cursor('d5k20s7dpth18',null));

SQL_ID  d5k20s7dpth18, child number 0
-----------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                              |       |       |     1 (100)|          |
|   1 |  VIEW                | GV$OPTIMIZER_PROCESSING_RATE |    28 |  2072 |     1 (100)| 00:00:01 |
|   2 |   HASH GROUP BY PIVOT|                              |    28 |  1568 |     1 (100)| 00:00:01 |
|*  3 |    FIXED TABLE FULL  | X$OPTIM_CALIB_STATS          |    28 |  1568 |     0   (0)|          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("INST_ID"=USERENV('INSTANCE'))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

SQL_ID  d5k20s7dpth18, child number 1
-----------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                              |       |       |     1 (100)|          |
|   1 |  VIEW                | GV$OPTIMIZER_PROCESSING_RATE |    28 |  2072 |     1 (100)| 00:00:01 |
|   2 |   HASH GROUP BY PIVOT|                              |    28 |  1568 |     1 (100)| 00:00:01 |
|*  3 |    FIXED TABLE FULL  | X$OPTIM_CALIB_STATS          |    28 |  1568 |     0   (0)|          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("INST_ID"=USERENV('INSTANCE'))

Note
-----
   - statistics feedback used for this statement

The Note section for child zero tells us that we used “dynamic statistics” (and the other_xml column in v$sql_plan still calls it dynamic sampling) as part of the statement optimization; but the Note section for child one tells us that we used “statistics feedback” (and the other_xml column in v$sql_plan still calls it cardinality feedback) – in other words we didn’t do dynamic sampling, we re-used the sample (saved as that in-memory opt_estimate hint) from the previous child.
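If you want to see those labels for yourself, you can pull the other_xml column from v$sql_plan for the two children – a quick sketch, assuming the cursors are still in memory:


select
        child_number, other_xml
from
        v$sql_plan
where
        sql_id = 'd5k20s7dpth18'
and     other_xml is not null
;
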

There’s no great significance in the SQL statement I chose to demonstrate this point – I just happened to be doing some investigation into the Optimizer Processing Rates when I paused to read Tim’s notes, so I based this post on one of the statements I had run a few minutes before.

December 23, 2014

Just in case

Filed under: Infrastructure,Oracle,Performance,redo — Jonathan Lewis @ 10:37 am BST Dec 23,2014

For those who don’t read Oracle-l and haven’t found Nikolay Savvinov’s blog, here’s a little note pulling together a recent question on Oracle-L and a relevant (and probably unexpected) observation from the blog. The question (paraphrased) was:

The developers/data modelers are creating all the tables with varchar2(4000) as standard by default “Just in case we need it”. What do you think of this idea?

The general answer was that it’s a bad idea (and unnecessary, anyway) and one specific threat that got mentioned was the problem of creating indexes and Oracle error ORA-01450; but coincidentally Nikolay Savvinov had written about a performance-related “bug” in August this year, which turned out, only last week, to be expected behaviour. You can read his articles for the details, but since he used a 4KB block size to demonstrate it I thought I’d repeat the exercise using an 8KB block size.




drop table t1 purge;
create table t1 (id number(6), v1 varchar(40), v2 varchar2(40), v3 varchar2(40));
create unique index t1_i1 on t1(id);

execute snap_redo.start_snap

insert into t1 
select	object_id, object_name, object_name, object_name 
from	all_objects
where	rownum <= 10000
;

execute snap_redo.end_snap


drop table t1 purge;
create table t1 (id number(6), v1 varchar(4000), v2 varchar2(4000), v3 varchar2(4000));
create unique index t1_i1 on t1(id);

execute snap_redo.start_snap

insert into t1 
select	object_id, object_name, object_name, object_name 
from	all_objects
where	rownum <= 10000
;

execute snap_redo.end_snap

I’ve dropped and created the same table twice, once with varchar2(40) columns and once with varchar2(4000) columns.

I’ve created an index on a (non-character) column – the specific results vary depending on whether the index is unique or non-unique, and whether or not you have the index, and whether or not the table already holds data, and the effective clustering on the index columns etc. etc. but the key difference between the two sets of results doesn’t go away.

I’ve inserted object_name values (maximum usage 32 bytes) into the varchar2() columns, inserting 10,000 rows.

The snap_redo package is one of my simple pre/post packages that calculates changes in values in some dynamic performance view – in this case it’s looking at v$sysstat (system statistics) for statistics relating to redo generation, which means you need to run this test on an otherwise idle instance. Here are the two sets of results from an instance of 11.2.0.4:


Name                                                                     Value
----                                                                     -----
messages sent                                                               11
messages received                                                           11
calls to kcmgcs                                                            313
calls to kcmgas                                                             37
calls to get snapshot scn: kcmgss                                           74
redo entries                                                               769
redo size                                                            1,317,008
redo wastage                                                             3,888
redo writes                                                                 11
redo blocks written                                                      2,664
redo write time                                                             10
redo blocks checksummed by FG (exclusive)                                2,242
redo ordering marks                                                          1
redo subscn max counts                                                       1
redo synch time                                                              7
redo synch time (usec)                                                  88,875
redo synch time overhead (usec)                                          1,810
redo synch time overhead count (<2 msec)                                    11
redo synch writes                                                           11
redo write info find                                                        11
undo change vector size                                                261,136
rollback changes - undo records applied                                      2
IMU undo allocation size                                                17,184


Name                                                                     Value
----                                                                     -----
messages sent                                                                8
messages received                                                            8
calls to kcmgcs                                                            222
calls to kcmgas                                                             56
calls to get snapshot scn: kcmgss                                           52
redo entries                                                            20,409
redo size                                                            5,606,872
redo buffer allocation retries                                               1
redo wastage                                                             1,248
redo writes                                                                  6
redo blocks written                                                     11,324
redo write time                                                             26
redo blocks checksummed by FG (exclusive)                                  571
redo ordering marks                                                         32
redo subscn max counts                                                       1
redo synch time                                                              6
redo synch time (usec)                                                  60,230
redo synch time overhead (usec)                                            159
redo synch time overhead count (<2 msec)                                     1
redo synch writes                                                            1
redo write info find                                                         1
undo change vector size                                              1,590,520
IMU undo allocation size                                                   144

Notice, particularly, the great change in the number of redo entries and the total redo size when the character columns are defined as varchar2(4000). The number of redo entries is roughly “2 * number of rows inserted” – for each row that’s one for the row and one for the index entry. You can check the redo log content by dumping the log file, of course (and Nikolay did), or you can take my word for it that Oracle is doing the equivalent of single row processing in the varchar2(4000) case and array processing in the varchar2(40) case.

When Oracle calculates that the row length is larger than the block size it falls back to single row processing; this can increase your redo generation significantly and, since the rate at which you can pump out redo ultimately limits the rate at which you can load data, it could have a significant impact on your data loading times. Declaring character columns as varchar2(4000) “just in case” is a bad idea.
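Footnote: if you don’t have a snapshot package of your own to hand, a minimal SQL*Plus stand-in for the redo figure looks like this – a sketch that captures just the single “redo size” statistic rather than the full set:


column redo_start new_value m_redo_start

select  value   redo_start
from    v$sysstat
where   name = 'redo size'
;

-- run the insert test here

select  value - &m_redo_start   redo_size_generated
from    v$sysstat
where   name = 'redo size'
;
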


November 28, 2014

Line Numbers

Filed under: Oracle,Performance,Troubleshooting — Jonathan Lewis @ 12:49 pm BST Nov 28,2014

One of the presentations I went to at the DOAG conference earlier on this month was called “PL/SQL Tuning, finding the perf. bottleneck with hierarchical profiler” by Radu Parvu from Finland. If you do a lot of PL/SQL programming and haven’t noticed the dbms_hprof package yet make sure you take a good look at it.

A peripheral question that came up at the end of the session asked about problems with line numbers in pl/sql procedures; why, when you get a run-time error, does the reported line number sometimes look wrong, and how do you find the right line. I can answer (or give at least one reason for) the first part, but not the second part; Julian Dontcheff had an answer for the second bit, but unfortunately I failed to take a note of it.

Here’s the SQL to create, run and list a very simple (and silly) procedure.


define m_value = 3
set timing on

create or replace procedure silly
as
        m_n number;
begin
        for i in 1..1000000 loop
                m_n := exp(sqrt(ln(&m_value)));
        end loop;
end;
/

execute silly

list

Here’s the output I got from running this, and the thing I want you to note is the time it takes to run, and the line number reported for the assignment:


Procedure created.

Elapsed: 00:00:00.01

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
  1  create or replace procedure silly
  2  as
  3          m_n number;
  4  begin
  5          for i in 1..1000000 loop
  6                  m_n := exp(sqrt(ln(&m_value)));
  7          end loop;
  8* end;

It seems to take my instance 0.00 seconds to perform 1,000,000 evaluations of the exponent of the square root of the natural logarithm of 3 at line 6 of the code. But let’s make two changes; first, let’s try that with the value -1 (which is a little careless if you know your logarithms).


Procedure created.

Elapsed: 00:00:00.02
BEGIN silly; END;

*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "TEST_USER.SILLY", line 5
ORA-06512: at line 1

Notice that the 6502 error is reported at line 5, not line 6.

Now let’s go back to the value 3, but start the script with the command: alter session set plsql_optimize_level = 0; (the default level is 2, the range is 0 to 3):


Session altered.

Elapsed: 00:00:00.00

Procedure created.

Elapsed: 00:00:00.01
BEGIN silly; END;

*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "TEST_USER.SILLY", line 6
ORA-06512: at line 1

Reducing the pl/sql optimisation level to zero (or one) results in the error message reporting the problem at line 6 – which matches our original code. So let’s check the effect of running the code at level zero with a valid number for the input.


Session altered.

Elapsed: 00:00:00.00

Procedure created.

Elapsed: 00:00:00.02

PL/SQL procedure successfully completed.

Elapsed: 00:00:24.56


The run time for our pl/sql call has gone up from 0.00 seconds to 24.56 seconds.

Like all good compilers the pl/sql compiler had recognised (at level 2) that we were assigning a constant inside a loop, so it had rewritten our code to move the assignment outside the loop – effectively swapping lines 6 and 5 – storing the change in the “object” level code, but not in the database copy of the source. The benefit we get (from the default settings) is a saving of some 25 seconds of CPU; the (small) penalty we pay is that the lines reported for run-time errors aren’t always going to identify the text that caused the problem.
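If you want to check the level an existing procedure was compiled with, the settings are visible in the data dictionary – a quick sketch:


select  name, plsql_optimize_level, plsql_code_type
from    user_plsql_object_settings
where   name = 'SILLY'
;
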

August 24, 2014

In-memory Aggregation

Filed under: 12c,in-memory,Oracle,Performance — Jonathan Lewis @ 8:05 pm BST Aug 24,2014

The title of this piece is the name given to a new feature in 12.1.0.2, and since I’ve recently blogged about a limitation of the in-memory option I thought I’d pick this feature as the next obvious thing to blog about. This is a bit of a non sequitur, though, as the feature seems to have nothing whatsoever to do with the in-memory option; instead it’s a cunning mechanism combining aspects of the star-transformation (but without the bitmap indexes), Bloom filters, and “group-by” placement to minimise the cost of aggregation over high-volume joins.

Here’s a small data set I’ll use to demonstrate the feature:

create table towns
as
select
        rownum                                          id,
        trunc(dbms_random.value(1,51))                  id_state,
        rpad(dbms_random.string('U',3),12)              name,
        cast (rpad('x',trunc(dbms_random.value(50,60)),'x') as varchar2(60))    padding
from
        all_objects
where
        rownum <= 2000
;

alter table towns add constraint to_pk primary key(id);
create index to_i1 on towns(name);

create table people(
        id_town_work    number(6,0)     not null
                constraint pe_fk_wo references towns,
        id_town_home    number(6,0)     not null
                constraint pe_fk_ho references towns,
        dummy1          varchar2(10),
        dummy2          varchar2(10),
        padding         varchar2(110)
);

insert /*+ append */  into people
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1,2001)),
        trunc(dbms_random.value(1,2001)),
        lpad(rownum,10),
        lpad(rownum,10),
        cast (rpad('x',trunc(dbms_random.value(50,60)),'x') as varchar2(60))    padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'Towns',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'People',
                method_opt       => 'for all columns size 1'
        );
end;
/

I have a “large” table of people, and people can live in one town and work in another. Towns are in states and I’m interested in a report about people who live in one specific state but work in another (e.g. New Hampshire vs. Massachusetts). There are a couple of “padding” columns to represent the data associated with each town and person that I might want in a report. To keep things simple I haven’t extended the query out to select the name of the state. Here’s the query I might use to get the report I want:

select
        wt.padding,
        ht.padding,
        max(pe.padding)
from
        towns   wt,
        towns   ht,
        people  pe
where
        wt.id_state     = 1
and     pe.id_town_work = wt.id
and     ht.id_state     = 2
and     pe.id_town_home = ht.id
group by
        wt.padding,
        ht.padding
;

You might expect something like the following as the execution plan:

-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |    40 |  7600 |   179   (6)| 00:00:01 |
|   1 |  HASH GROUP BY       |        |    40 |  7600 |   179   (6)| 00:00:01 |
|*  2 |   HASH JOIN          |        |    40 |  7600 |   178   (6)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL | TOWNS  |    40 |  2520 |     5   (0)| 00:00:01 |
|*  4 |    HASH JOIN         |        |  2000 |   248K|   173   (6)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| TOWNS  |    40 |  2520 |     5   (0)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| PEOPLE |   100K|  6250K|   165   (4)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PE"."ID_TOWN_HOME"="HT"."ID")
   3 - filter("HT"."ID_STATE"=2)
   4 - access("PE"."ID_TOWN_WORK"="WT"."ID")
   5 - filter("WT"."ID_STATE"=1)

The order of operation (row source generation) is: 3,5,6,4,2,1 – we build a hash table from the towns in state 2; build a hash table from the towns in state 1; scan the people table and probe the state 1 hash table, any row that survives is used to probe the state 2 hash table, and the rows that survive the second probe are aggregated to produce the answer.

When you do this type of thing with very large data sets one of the potential performance threats comes from the volume of data you have to aggregate. As we’ve joined the three tables the row length grows significantly before we finally aggregate (admittedly my data set is small, and the number of rows we’re going to aggregate also appears to be very small according to the predictions). There’s also (in the early stages at least) the potential for passing a very large number of rows from the fact table through the first (and possibly subsequent) hash join, doing a lot of work to eliminate the rows you don’t need.

In 12c the optimizer can choose to minimise both these threat points using “vector transformation”. (The name may also reflect the possibility that the code path will take advantage of vector processing (SIMD) operations if they’re available in the CPU.) Here’s the execution path I got when I added the /*+ vector_transform(@sel$1) */ hint to my query – it’s not sensible for this tiny data set, of course, but the hint is a way of learning what Oracle can do:

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                           |    71 | 15975 |   184   (6)| 00:00:01 |

|   1 |  TEMP TABLE TRANSFORMATION    |                           |       |       |            |          |

|   2 |   LOAD AS SELECT              | SYS_TEMP_0FD9D6661_31399B |       |       |            |          |
|   3 |    VECTOR GROUP BY            |                           |    10 |   670 |     6  (17)| 00:00:01 |
|   4 |     KEY VECTOR CREATE BUFFERED| :KV0000                   |    40 |  2680 |     6  (17)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL        | TOWNS                     |    40 |  2520 |     5   (0)| 00:00:01 |

|   6 |   LOAD AS SELECT              | SYS_TEMP_0FD9D6662_31399B |       |       |            |          |
|   7 |    VECTOR GROUP BY            |                           |    10 |   670 |     6  (17)| 00:00:01 |
|   8 |     KEY VECTOR CREATE BUFFERED| :KV0001                   |    40 |  2680 |     6  (17)| 00:00:01 |
|*  9 |      TABLE ACCESS FULL        | TOWNS                     |    40 |  2520 |     5   (0)| 00:00:01 |

|  10 |   HASH GROUP BY               |                           |    71 | 15975 |   172   (6)| 00:00:01 |
|* 11 |    HASH JOIN                  |                           |    71 | 15975 |   171   (5)| 00:00:01 |
|  12 |     TABLE ACCESS FULL         | SYS_TEMP_0FD9D6662_31399B |    10 |   670 |     2   (0)| 00:00:01 |
|* 13 |     HASH JOIN                 |                           |    71 | 11218 |   169   (5)| 00:00:01 |
|  14 |      TABLE ACCESS FULL        | SYS_TEMP_0FD9D6661_31399B |    10 |   670 |     2   (0)| 00:00:01 |
|  15 |      VIEW                     | VW_VT_C444E4CB            |    71 |  6461 |   167   (5)| 00:00:01 |
|  16 |       HASH GROUP BY           |                           |    71 |  5112 |   167   (5)| 00:00:01 |
|  17 |        KEY VECTOR USE         | :KV0000                   |    71 |  5112 |   167   (5)| 00:00:01 |
|  18 |         KEY VECTOR USE        | :KV0001                   |  2000 |   132K|   167   (5)| 00:00:01 |
|* 19 |          TABLE ACCESS FULL    | PEOPLE                    |   100K|  6250K|   165   (4)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("WT"."ID_STATE"=1)
   9 - filter("HT"."ID_STATE"=2)
  11 - access("ITEM_10"=INTERNAL_FUNCTION("C0") AND "ITEM_11"="C2")
  13 - access("ITEM_8"=INTERNAL_FUNCTION("C0") AND "ITEM_9"="C2")
  19 - filter(SYS_OP_KEY_VECTOR_FILTER("PE"."ID_TOWN_HOME",:KV0001) AND
              SYS_OP_KEY_VECTOR_FILTER("PE"."ID_TOWN_WORK",:KV0000))

There are three critical components to this plan: first, we create a couple of “Key Vectors” from the towns table, then we use those key vectors while scanning the people table and aggregate a minimal data set, finally we join back to the data associated with the key vectors. Reprising my introductory paragraph: the creation and use of the key vectors is similar to the Bloom filter approach; the final join-back is similar to the strategy used in Star Transformations (especially the ones where temp tables appear), and the key vector allows the high-volume fact data to be aggregated as much as possible before adding extra row-length from the dimensions.

In outline Oracle does the following:

  • scan the towns table to extract the id and padding columns for id_state = 1 / work town – this produced 50 rows with my data set
  • manipulate the result to extract the distinct values of padding, and give each value a unique numeric identifier – this is the information that goes into the temp table (with one extra column) – this produced 10 rows
  • manipulate the result again to produce an in-memory array of (town.id, temp_table.identifier) – this is the key vector, containing 50 elements.

The second temp table and key vector for (id_state = 2 / home town) will be created in the same way.

As the fact table is scanned Oracle can apply the key vectors very efficiently (we hope) to pick out the people rows that would be involved in the final aggregate and associate with each relevant row the two padding identifiers that belong to that row (this step is a bit like doing 2 hash joins – but presumably much more efficient; Bloom filtering does something very similar). After selecting the minimum number of rows we can aggregate them on the  two padding identifiers (an example of the “aggregate early”/”place group by” principle – aggregate before joining); finally we join back to the two temporary tables to translate the short padding identifiers into the long padding values (just as we do in star transformations with temporary table transformation).

Strangely we aggregate again after the join-back. I don’t think it’s necessary in this case because I’m fairly sure that the join back is on a unique set of columns – but perhaps this is a generic strategy allowing for variations in the mechanism, including such things as cases where the vector transform is only applied to a subset of the dimension tables.

Technically you could almost emulate this strategy in any version of Oracle (and I probably have at various times over the last few years) with the major limitation that the “KEY VECTOR USE” operations at lines 17 and 18 would have to be replaced with hash joins; no doubt, though, the major CPU saving of this approach is the difference between consecutive hash joins and what appears to be (from the execution stats) concurrent vector filtering. At some point – if a client needs the extra performance edge before they get to 12c – I’ll have to see if I can engineer an example in 11g that emulates the whole plan but uses Bloom filtering to approximate the key vector filtering.
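In the meantime, here’s a rough sketch of the “aggregate early, join back late” part of the strategy written as ordinary SQL that will run on older versions – it uses IN subqueries and hash joins where 12c uses key vectors, so it can only approximate the effect:


select
        wt.padding,
        ht.padding,
        max(v.max_padding)
from
        (
        select  /*+ no_merge */
                pe.id_town_work,
                pe.id_town_home,
                max(pe.padding) max_padding
        from    people  pe
        where   pe.id_town_work in (select id from towns where id_state = 1)
        and     pe.id_town_home in (select id from towns where id_state = 2)
        group by
                pe.id_town_work, pe.id_town_home
        )       v,
        towns   wt,
        towns   ht
where
        wt.id = v.id_town_work
and     ht.id = v.id_town_home
group by
        wt.padding,
        ht.padding
;
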


August 19, 2014

LOB Length

Filed under: Infrastructure,LOBs,Oracle,Performance — Jonathan Lewis @ 6:06 pm BST Aug 19,2014

It’s funny how you can make little savings in work all over the place in Oracle if you’re prepared to look a little closely at what’s going on. Here’s a quirky little example with LOBs and function calls that might just have some greater relevance in other situations. Here’s a little data set, and two queries that I might run against it:


create table tbl(
	c1      clob
)
lob (c1) store as c_lob(
	disable storage in row
	nocache nologging
)
;

begin
	for i in 1..128 loop
		insert into tbl values(rpad('x',4000));
		commit;
	end loop;
end;
/

commit;

-- collect stats, prime dictionary cache and library cache
-- run both queries twice and check stats on second run

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		/*+ no_merge */
		dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		rownum rn, dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

The question that you might ask yourselves when you see these queries is: will they do similar amounts of work ? Of course, I wouldn’t be asking the question if the answer were yes. Despite the no_merge() hint, which you might think would have the same effect as the rownum approach, Oracle seems to execute the call to dbms_lob.getlength() twice for each row in the first query, but only once per row for the second query. Here are the stats for the second run of each query with autotrace enabled:


Statistics (for no_merge)
----------------------------------------------------------
         40  recursive calls
          0  db block gets
        271  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Statistics (for rownum)
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        131  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

As you can see, the consistent gets for the no_merge() approach is roughly double that for the rownum approach – and since we have 128 rows/LOBs in the table that looks suspiciously like 2 gets vs. 1 get per LOB depending on the approach – which suggests two calls to the function. This is further corroborated by the execution plans, and especially by the predicate sections (how often have I said “always check the predicates”) which show that the predicate has been pushed inside the view that’s been hinted to be non-mergeable, but it hasn’t been pushed inside the view that uses the rownum instantiation trick:


Execution Plan for no_merge()
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    13 |            |          |
|   2 |   VIEW              |      |     6 |    78 |     2   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| TBL  |     6 |   522 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("DBMS_LOB"."GETLENGTH"("C1")>3960)

Execution Plan for rownum
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE      |      |     1 |    13 |            |          |
|*  2 |   VIEW               |      |   128 |  1664 |     2   (0)| 00:00:01 |
|   3 |    COUNT             |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| TBL  |   128 | 11136 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("LEN">3960)

My first thought on seeing this difference was to apply the /*+ no_push_pred */ hint to block predicate pushing – but then I remembered that the hint is about join predicate pushing and this is a simple filter push. A quick search of the hidden parameters, though, revealed this:

_optimizer_filter_pushdown : enable/disable filter predicate pushdown

Setting this parameter to false – either through a call to ‘alter session’ or through an /*+ opt_param('_optimizer_filter_pushdown' , 'false') */ hint – allowed the no_merge approach to produce the same plan and resource usage as the rownum approach. Of course, for a production system, I’d probably use the rownum approach rather than mess around with hidden parameters.
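For reference, the two forms look like this (with the usual warnings about using hidden parameters on production systems without Oracle support’s blessing):


alter session set "_optimizer_filter_pushdown" = false;

select  /*+ opt_param('_optimizer_filter_pushdown','false') */
        round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from    (
        select  /*+ no_merge */
                dbms_lob.getlength(c1) len
        from    tbl
        )
where   len > 3960
;
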

Footnote:

I don’t know why the code with the no_merge() approach reported 40 recursive calls (on its first execution with autotrace). A couple of variations on the experiment suggested that it had something to do with the number of rows (or consequential buffer visits) that survived the predicate call – for a sufficiently small number of rows the recursive call count happened to drop to zero; but the phenomenon needs further investigation.

June 19, 2014

Delete Costs

Filed under: Bugs,CBO,Execution plans,Hints,Indexing,Oracle,Performance — Jonathan Lewis @ 6:18 pm BST Jun 19,2014

One of the quirky little anomalies of the optimizer is that it’s not allowed to select rows from a table after doing an index fast full scan (index_ffs) even if it is obviously the most efficient (or, perhaps, least inefficient) strategy. For example:


create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum			id,
	mod(rownum,100)		n1,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5
;

create index t1_i1 on t1(id, n1);
alter table t1 modify id not null;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

explain plan for
select /*+ index_ffs(t1) */ max(padding) from t1 where n1 = 0;

select * from table(dbms_xplan.display(null,null,'outline -note'));

In this case we can see that there are going to be 1,000 rows where n1 = 0 spread evenly across the whole table, so a full tablescan is likely to be the most efficient strategy for the query; but we can tell the optimizer to do an index fast full scan with the hint that I’ve shown, and if the hint is legal (which means the index has to include at least one column declared as not null) the optimizer should obey it. So here’s the plan my hinted query produced:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   104 |   207   (4)| 00:00:02 |
|   1 |  SORT AGGREGATE    |      |     1 |   104 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  1000 |   101K|   207   (4)| 00:00:02 |
---------------------------------------------------------------------------

We’d have to examine the 10053 trace file to be certain, but it seems the optimizer won’t consider doing an index fast full scan followed by a trip to the table for a select statement (in passing, Oracle would have obeyed the skip scan – index_ss() – hint). It’s a little surprising then that the optimizer will obey the hint for a delete:


explain plan for
delete /*+ index_ffs(t1) cluster_by_rowid(t1) */ from t1 where n1 = 0;

select * from table(dbms_xplan.display(null,null,'outline -note'));

-------------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | DELETE STATEMENT      |       |  1000 |  8000 |    38  (11)| 00:00:01 |
|   1 |  DELETE               | T1    |       |       |            |          |
|*  2 |   INDEX FAST FULL SCAN| T1_I1 |  1000 |  8000 |    38  (11)| 00:00:01 |
-------------------------------------------------------------------------------

You might note three things from this plan. First, the optimizer can consider a fast full scan followed by a table visit (so why can’t we do that for a select); secondly that the cost of the delete statement is only 38 whereas the cost of the full tablescan in the earlier query was much larger at 207 – surprisingly Oracle had to be hinted to consider this fast full scan path, despite the fact that the cost was cheaper than the cost of the tablescan path it would have taken if I hadn’t included the hint; finally you might note the cluster_by_rowid() hint in the SQL – there’s no matching “Sort cluster by rowid” operation in the plan, even though this plan came from 11.2.0.4 where the mechanism and hint are available.

The most interesting of the three points is this: there is a bug recorded for the second one (17908541: CBO DOES NOT CONSIDER INDEX_FFS) reported as fixed in 12.2 – I wonder if this means that an index fast full scan followed by table access by rowid will also be considered for select statements in 12.2.

Of course, there is a trap – and something to be tested when the version (or patch) becomes available. Why is the cost of the delete so low (only 38, the cost of the index fast full scan) when the number of rows to be deleted is 1,000 and they’re spread evenly through the table ? It’s because the cost of a delete is actually calculated as the cost of the query: “select the rowids of the rows I want to delete but don’t worry about the cost of going to the rows to delete them (or the cost of updating the indexes that will have to be maintained, but that’s a bit irrelevant to the choice anyway)”.
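You can check that interpretation by costing the rowid-only query – a sketch (if my description of the arithmetic is correct the cost of this query should match the 38 of the delete plan):


explain plan for
select /*+ index_ffs(t1) */ rowid from t1 where n1 = 0;

select * from table(dbms_xplan.display);
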

So when Oracle does do a delete following an index fast full scan in 12.2, will it be doing it because it’s the right thing to do, or because it’s the wrong thing ?

To be continued … (after the next release/patch).
