Oracle Scratchpad

January 6, 2015

Count (*)

Filed under: Oracle,Performance — Jonathan Lewis @ 6:04 pm GMT Jan 6,2015

The old chestnut about comparing the speeds of count(*), count(1), count(non_null_column) and count(pk_column) has come up on the OTN database forum (at least) twice in the last couple of months. The standard answer is to point out that they will all execute the same code. The corroborating evidence for that claim is that, for a long time, the 10053 trace files have had a rubric reporting: CNT – count(col) to count(*) transformation; and, for even longer, the error message file (oraus.msg for the English language version) has had an error code 10122 which produces (from at least Oracle 8i, if not 7.3):


SQL> execute dbms_output.put_line(sqlerrm(-10122))
ORA-10122: Disable transformation of count(col) to count(*)

But the latest repetition of the question prompted me to check whether a more recent version of Oracle had an even more compelling demonstration, and it does. I extracted the following lines from a 10053 trace file generated by 11.2.0.4 (and I know 10gR2 is similar) in response to selecting count(*), count(1) and count({non-null column}) respectively:


Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(*)" FROM "TEST_USER"."SAVED_ASH" "SAVED_ASH"

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(1)" FROM "TEST_USER"."SAVED_ASH" "SAVED_ASH"

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(SAMPLE_ID)" FROM "TEST_USER"."SAVED_ASH" "SAVED_ASH"

As you can see, Oracle has transformed all three select lists into count(*), hiding the transformation behind the original column alias. As an outsider’s proof of what’s going on, I don’t think you could get a more positive indicator than that.
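If you want to repeat the check, here’s a minimal sketch of the steps (the table name comes from my example – any table will do – and you’ll need a hard parse, so flush the shared pool or change the query text slightly before re-running):


alter session set tracefile_identifier = 'count_test';
alter session set events '10053 trace name context forever, level 1';

select count(1) from saved_ash;

alter session set events '10053 trace name context off';

-- then search the session trace file for "Final query after transformations"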

 

January 5, 2015

Re-optimization

Filed under: 12c,Oracle,Performance — Jonathan Lewis @ 12:54 pm GMT Jan 5,2015

The spelling is with a Z rather than an S because it’s an Oracle thing.

Tim Hall has just published a set of notes on Adaptive Query Optimization, so I thought I’d throw in one extra little detail.

When the optimizer decides that a query execution plan involves some guesswork the run-time engine can monitor the execution of the query and collect some information that may allow the optimizer to produce a better execution plan. The interaction between all the re-optimization mechanisms can get very messy, so I’m not going to try to cover all the possibilities – read Tim’s notes for that – but one of the ways in which this type of information can be kept is now visible in a dynamic performance view.


SQL> select hash_value, sql_id, child_number, hint_text from V$sql_reoptimization_hints;

HASH_VALUE SQL_ID        CHILD_NUMBER HINT_TEXT
---------- ------------- ------------ ----------------------------------------------------------------
3680288808 d5k20s7dpth18            0 OPT_ESTIMATE (@"SEL$6E65FD6A" GROUP_BY ROWS=28.000000 )

1 row selected.

SQL> select sql_id, child_number, is_reoptimizable, sql_text from v$sql where sql_id = 'd5k20s7dpth18';

SQL_ID        CHILD_NUMBER I SQL_TEXT
------------- ------------ - --------------------------------------------------
d5k20s7dpth18            0 Y select * from V$OPTIMIZER_PROCESSING_RATE
d5k20s7dpth18            1 N select * from V$OPTIMIZER_PROCESSING_RATE

2 rows selected.

To get the results above I had flushed the shared pool and then (using the SYS schema) executed the indicated query twice. Pulling the execution plans from memory I got (with a little cosmetic cleaning) these results:


SQL> select * from table(dbms_xplan.display_cursor('d5k20s7dpth18',null));

SQL_ID  d5k20s7dpth18, child number 0
-----------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                              |       |       |     1 (100)|          |
|   1 |  VIEW                | GV$OPTIMIZER_PROCESSING_RATE |    28 |  2072 |     1 (100)| 00:00:01 |
|   2 |   HASH GROUP BY PIVOT|                              |    28 |  1568 |     1 (100)| 00:00:01 |
|*  3 |    FIXED TABLE FULL  | X$OPTIM_CALIB_STATS          |    28 |  1568 |     0   (0)|          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("INST_ID"=USERENV('INSTANCE'))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

SQL_ID  d5k20s7dpth18, child number 1
-----------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                              |       |       |     1 (100)|          |
|   1 |  VIEW                | GV$OPTIMIZER_PROCESSING_RATE |    28 |  2072 |     1 (100)| 00:00:01 |
|   2 |   HASH GROUP BY PIVOT|                              |    28 |  1568 |     1 (100)| 00:00:01 |
|*  3 |    FIXED TABLE FULL  | X$OPTIM_CALIB_STATS          |    28 |  1568 |     0   (0)|          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("INST_ID"=USERENV('INSTANCE'))

Note
-----
   - statistics feedback used for this statement

The Note section for child zero tells us that we used “dynamic statistics” (and the other_xml column in v$sql_plan still calls it dynamic sampling) as part of the statement optimization; but the Note section for child one tells us that we used “statistics feedback” (and the other_xml column in v$sql_plan still calls it cardinality feedback) – in other words we didn’t do dynamic sampling, we re-used the sample (saved as that in-memory opt_estimate hint) from the previous child.
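If you want to look at the raw information yourself, a quick check of the other_xml column shows the relevant entries (the dynamic sampling entry for child 0, the cardinality feedback entry for child 1):


select
        child_number, other_xml
from
        v$sql_plan
where
        sql_id = 'd5k20s7dpth18'
and     other_xml is not null
;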

There’s no great significance in the SQL statement I chose to demonstrate this point – I just happened to be doing some investigation into the Optimizer Processing Rates when I paused to read Tim’s notes, so I based this post on one of the statements I had run a few minutes before.

December 23, 2014

Just in case

Filed under: Infrastructure,Oracle,Performance,redo — Jonathan Lewis @ 10:37 am GMT Dec 23,2014

For those who don’t read Oracle-l and haven’t found Nikolay Savvinov’s blog, here’s a little note pulling together a recent question on Oracle-L and a relevant (and probably unexpected) observation from the blog. The question (paraphrased) was:

The developers/data modelers are creating all the tables with varchar2(4000) as standard by default “Just in case we need it”. What do you think of this idea?

The general answer was that it’s a bad idea (and unnecessary, anyway) and one specific threat that got mentioned was the problem of creating indexes and hitting Oracle error ORA-01450; but coincidentally Nikolay Savvinov had written about a performance-related “bug” in August this year, which turned out, only last week, to be expected behaviour. You can read his articles for the details, but since he used a 4KB block size to demonstrate it I thought I’d repeat the exercise using an 8KB block size.

 



drop table t1 purge;
create table t1 (id number(6), v1 varchar2(40), v2 varchar2(40), v3 varchar2(40));
create unique index t1_i1 on t1(id);

execute snap_redo.start_snap

insert into t1 
select	object_id, object_name, object_name, object_name 
from	all_objects
where	rownum <= 10000
;

execute snap_redo.end_snap


drop table t1 purge;
create table t1 (id number(6), v1 varchar2(4000), v2 varchar2(4000), v3 varchar2(4000));
create unique index t1_i1 on t1(id);

execute snap_redo.start_snap

insert into t1 
select	object_id, object_name, object_name, object_name 
from	all_objects
where	rownum <= 10000
;

execute snap_redo.end_snap

I’ve dropped and created the same table twice, once with varchar2(40) columns and once with varchar2(4000) columns.

I’ve created an index on a (non-character) column. The specific results vary with several details – whether the index is unique or non-unique, whether the index exists at all, whether or not the table already holds data, the effective clustering of the index columns, and so on – but the key difference between the two sets of results doesn’t go away.

I’ve inserted object_name values (maximum usage 32 bytes) into the varchar2() columns, inserting 10,000 rows.

The snap_redo package is one of my simple pre/post packages that calculate changes in values in some dynamic performance view – in this case it’s looking at v$sysstat (system statistics) for statistics relating to redo generation – which means you need to run this test on an otherwise idle instance.
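If you don’t have a snapshot package to hand, here’s a rough stand-in (a minimal sketch, not the package I actually use – it snapshots the whole of v$sysstat into a global temporary table and reports the deltas):


create global temporary table sysstat_snap
on commit preserve rows
as
select name, value from v$sysstat where 1 = 0;

-- "start_snap": record the current values
delete from sysstat_snap;
insert into sysstat_snap select name, value from v$sysstat;

-- "end_snap": report what changed while the test ran
select
        s.name, s.value - p.value delta
from
        v$sysstat s, sysstat_snap p
where
        p.name  = s.name
and     s.value != p.value
;

Here are the two sets of results from an instance of 11.2.0.4: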


Name                                                                     Value
----                                                                     -----
messages sent                                                               11
messages received                                                           11
calls to kcmgcs                                                            313
calls to kcmgas                                                             37
calls to get snapshot scn: kcmgss                                           74
redo entries                                                               769
redo size                                                            1,317,008
redo wastage                                                             3,888
redo writes                                                                 11
redo blocks written                                                      2,664
redo write time                                                             10
redo blocks checksummed by FG (exclusive)                                2,242
redo ordering marks                                                          1
redo subscn max counts                                                       1
redo synch time                                                              7
redo synch time (usec)                                                  88,875
redo synch time overhead (usec)                                          1,810
redo synch time overhead count (<2 msec)                                    11
redo synch writes                                                           11
redo write info find                                                        11
undo change vector size                                                261,136
rollback changes - undo records applied                                      2
IMU undo allocation size                                                17,184


Name                                                                     Value
----                                                                     -----
messages sent                                                                8
messages received                                                            8
calls to kcmgcs                                                            222
calls to kcmgas                                                             56
calls to get snapshot scn: kcmgss                                           52
redo entries                                                            20,409
redo size                                                            5,606,872
redo buffer allocation retries                                               1
redo wastage                                                             1,248
redo writes                                                                  6
redo blocks written                                                     11,324
redo write time                                                             26
redo blocks checksummed by FG (exclusive)                                  571
redo ordering marks                                                         32
redo subscn max counts                                                       1
redo synch time                                                              6
redo synch time (usec)                                                  60,230
redo synch time overhead (usec)                                            159
redo synch time overhead count (<2 msec)                                     1
redo synch writes                                                            1
redo write info find                                                         1
undo change vector size                                              1,590,520
IMU undo allocation size                                                   144

Notice, particularly, the great change in the number of redo entries and the total redo size when the character columns are defined as varchar2(4000). Note, too, that the number of redo entries is roughly “2 * number of rows inserted” – for each row that’s one entry for the row and one for the index entry. You can check the redo log content by dumping the log file, of course (and Nikolay did), or you can take my word for it that Oracle is doing the equivalent of single row processing in the varchar2(4000) case and array processing in the varchar2(40) case.
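If you want to do the dump yourself, here’s a hedged sketch (the file name is illustrative – check v$log and v$logfile for the log that was current while the insert ran; the dump appears in your session trace file):


select  f.member
from    v$log l, v$logfile f
where   f.group# = l.group#
and     l.status = 'CURRENT'
;

alter system dump logfile '/u01/app/oracle/oradata/orcl/redo01.log';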

When Oracle calculates that the row length is larger than the block size it falls back to single row processing; this can increase your redo generation significantly, and since the rate at which you can pump out redo is the ultimate rate at which you can load data, this could have a significant impact on your data loading times. Declaring character columns as varchar2(4000) “just in case” is a bad idea.

 

November 28, 2014

Line Numbers

Filed under: Oracle,Performance,Troubleshooting — Jonathan Lewis @ 12:49 pm GMT Nov 28,2014

One of the presentations I went to at the DOAG conference earlier on this month was called “PL/SQL Tuning, finding the perf. bottleneck with hierarchical profiler” by Radu Parvu from Finland. If you do a lot of PL/SQL programming and haven’t noticed the dbms_hprof package yet make sure you take a good look at it.
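As a quick (hedged) reminder of the basic mechanics – the directory object, file name, and the “call your code here” step are illustrative:


create or replace directory hprof_dir as '/tmp';

begin
        dbms_hprof.start_profiling(location => 'HPROF_DIR', filename => 'profiled_run.trc');
        -- call the pl/sql you want to profile here
        dbms_hprof.stop_profiling;
end;
/

You can then load and report the raw trace with dbms_hprof.analyze(), or format it with the plshprof command-line utility.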

A peripheral question that came up at the end of the session asked about problems with line numbers in pl/sql procedures: why, when you get a run-time error, does the reported line number sometimes look wrong, and how do you find the right line? I can answer (or give at least one reason for) the first part, but not the second part; Julian Dontcheff had an answer for the second bit, but unfortunately I failed to take a note of it.

Here’s the SQL to create, run and list a very simple (and silly) procedure.


define m_value = 3
set timing on

create or replace procedure silly
as
        m_n number;
begin
        for i in 1..1000000 loop
                m_n := exp(sqrt(ln(&m_value)));
        end loop;
end;
/

execute silly

list

Here’s the output I got from running this, and the thing I want you to note is the time it takes to run, and the line number reported for the assignment:


Procedure created.

Elapsed: 00:00:00.01

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
  1  create or replace procedure silly
  2  as
  3          m_n number;
  4  begin
  5          for i in 1..1000000 loop
  6                  m_n := exp(sqrt(ln(&m_value)));
  7          end loop;
  8* end;

It seems to take my instance 0.00 seconds to perform 1,000,000 evaluations of the exponent of the square root of the natural logarithm of 3 at line 6 of the code. But let’s make two changes; first, let’s try that with the value -1 (which is a little careless if you know your logarithms).


Procedure created.

Elapsed: 00:00:00.02
BEGIN silly; END;

*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "TEST_USER.SILLY", line 5
ORA-06512: at line 1

Notice that the 6502 error is reported at line 5, not line 6.

Now let’s go back to the value 3, but start the script with the command: alter session set plsql_optimize_level = 0; (the default level is 2, the range is 0 to 3):


Session altered.

Elapsed: 00:00:00.00

Procedure created.

Elapsed: 00:00:00.01
BEGIN silly; END;

*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "TEST_USER.SILLY", line 6
ORA-06512: at line 1

Reducing the pl/sql optimisation level to zero (or one) results in the error message reporting the problem at line 6 – which matches our original code. So let’s check the effect of running the code at level zero with a valid number for the input.


Session altered.

Elapsed: 00:00:00.00

Procedure created.

Elapsed: 00:00:00.02

PL/SQL procedure successfully completed.

Elapsed: 00:00:24.56


The run time for our pl/sql call has gone up from 0.00 seconds to 24.56 seconds.

Like all good compilers the pl/sql compiler had recognised (at level 2) that we were assigning a constant inside a loop, so it had effectively rewritten our code to move the assignment outside the loop – in effect swapping lines 6 and 5 – storing the change in the “object” level code, but not in the database copy of the source. The benefit we get (from the default settings) is a saving of 25 seconds of CPU; the (small) penalty we pay is that the lines reported for run-time errors aren’t always going to identify the text that caused the problem.
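If you want to check the optimisation level that a stored unit was last compiled with, the data dictionary records it:


select  name, type, plsql_optimize_level
from    user_plsql_object_settings
where   name = 'SILLY'
;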

August 24, 2014

In-memory Aggregation

Filed under: 12c,in-memory,Oracle,Performance — Jonathan Lewis @ 8:05 pm GMT Aug 24,2014

The title of this piece is the name given to a new feature in 12.1.0.2, and since I’ve recently blogged about a limitation of the in-memory option I thought I’d pick this feature as the next obvious thing to blog about. This is a bit of a non sequitur, though, as the feature seems to have nothing whatsoever to do with the in-memory option; instead it’s a cunning mechanism combining aspects of the star-transformation (but without the bitmap indexes), Bloom filters, and “group-by” placement to minimise the cost of aggregation over high-volume joins.

Here’s a small data set I’ll use to demonstrate the feature:

create table towns
as
select
        rownum                                          id,
        trunc(dbms_random.value(1,51))                  id_state,
        rpad(dbms_random.string('U',3),12)              name,
        cast (rpad('x',trunc(dbms_random.value(50,60)),'x') as varchar2(60))    padding
from
        all_objects
where
        rownum <= 2000
;

alter table towns add constraint to_pk primary key(id);
create index to_i1 on towns(name);

create table people(
        id_town_work    number(6,0)     not null
                constraint pe_fk_wo references towns,
        id_town_home    number(6,0)     not null
                constraint pe_fk_ho references towns,
        dummy1          varchar2(10),
        dummy2          varchar2(10),
        padding         varchar2(110)
);

insert /*+ append */  into people
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1,2001)),
        trunc(dbms_random.value(1,2001)),
        lpad(rownum,10),
        lpad(rownum,10),
        cast (rpad('x',trunc(dbms_random.value(50,60)),'x') as varchar2(60))    padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'Towns',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'People',
                method_opt       => 'for all columns size 1'
        );
end;
/

I have a “large” table of people, and people can live in one town and work in another. Towns are in states and I’m interested in a report about people who live in one specific state but work in another (e.g. New Hampshire vs. Massachusetts). There are a couple of “padding” columns to represent the data associated with each town and person that I might want in a report. To keep things simple I haven’t extended the query out to select the name of the state. Here’s the query I might use to get the report I want:

select
        wt.padding,
        ht.padding,
        max(pe.padding)
from
        towns   wt,
        towns   ht,
        people  pe
where
        wt.id_state     = 1
and     pe.id_town_work = wt.id
and     ht.id_state     = 2
and     pe.id_town_home = ht.id
group by
        wt.padding,
        ht.padding
;

You might expect something like the following as the execution plan:

-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |    40 |  7600 |   179   (6)| 00:00:01 |
|   1 |  HASH GROUP BY       |        |    40 |  7600 |   179   (6)| 00:00:01 |
|*  2 |   HASH JOIN          |        |    40 |  7600 |   178   (6)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL | TOWNS  |    40 |  2520 |     5   (0)| 00:00:01 |
|*  4 |    HASH JOIN         |        |  2000 |   248K|   173   (6)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| TOWNS  |    40 |  2520 |     5   (0)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| PEOPLE |   100K|  6250K|   165   (4)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PE"."ID_TOWN_HOME"="HT"."ID")
   3 - filter("HT"."ID_STATE"=2)
   4 - access("PE"."ID_TOWN_WORK"="WT"."ID")
   5 - filter("WT"."ID_STATE"=1)

The order of operation (row source generation) is: 3,5,6,4,2,1 – we build a hash table from the towns in state 2; build a hash table from the towns in state 1; scan the people table and probe the state 1 hash table, any row that survives is used to probe the state 2 hash table, and the rows that survive the second probe are aggregated to produce the answer.

When you do this type of thing with very large data sets one of the potential performance threats comes from the volume of data you have to aggregate. As we’ve joined the three tables the row length grows significantly before we finally aggregate (admittedly my data set is small, and the number of rows we’re going to aggregate also appears to be very small according to the predictions). There’s also (in the early stages at least) the potential for passing a very large number of rows from the fact table through the first (and possibly subsequent) hash join, doing a lot of work to eliminate the rows you don’t need.

In 12c the optimizer can choose to minimise both these threat points using “vector transformation”. (The name may also reflect the possibility that the code path will take advantage of vector processing (SIMD) operations if they’re available in the CPU.) Here’s the execution path I got when I added the /*+ vector_transform(@sel$1) */ hint to my query – it’s not sensible for this tiny data set, of course, but the hint is a way of learning what Oracle can do:

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                           |    71 | 15975 |   184   (6)| 00:00:01 |

|   1 |  TEMP TABLE TRANSFORMATION    |                           |       |       |            |          |

|   2 |   LOAD AS SELECT              | SYS_TEMP_0FD9D6661_31399B |       |       |            |          |
|   3 |    VECTOR GROUP BY            |                           |    10 |   670 |     6  (17)| 00:00:01 |
|   4 |     KEY VECTOR CREATE BUFFERED| :KV0000                   |    40 |  2680 |     6  (17)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL        | TOWNS                     |    40 |  2520 |     5   (0)| 00:00:01 |

|   6 |   LOAD AS SELECT              | SYS_TEMP_0FD9D6662_31399B |       |       |            |          |
|   7 |    VECTOR GROUP BY            |                           |    10 |   670 |     6  (17)| 00:00:01 |
|   8 |     KEY VECTOR CREATE BUFFERED| :KV0001                   |    40 |  2680 |     6  (17)| 00:00:01 |
|*  9 |      TABLE ACCESS FULL        | TOWNS                     |    40 |  2520 |     5   (0)| 00:00:01 |

|  10 |   HASH GROUP BY               |                           |    71 | 15975 |   172   (6)| 00:00:01 |
|* 11 |    HASH JOIN                  |                           |    71 | 15975 |   171   (5)| 00:00:01 |
|  12 |     TABLE ACCESS FULL         | SYS_TEMP_0FD9D6662_31399B |    10 |   670 |     2   (0)| 00:00:01 |
|* 13 |     HASH JOIN                 |                           |    71 | 11218 |   169   (5)| 00:00:01 |
|  14 |      TABLE ACCESS FULL        | SYS_TEMP_0FD9D6661_31399B |    10 |   670 |     2   (0)| 00:00:01 |
|  15 |      VIEW                     | VW_VT_C444E4CB            |    71 |  6461 |   167   (5)| 00:00:01 |
|  16 |       HASH GROUP BY           |                           |    71 |  5112 |   167   (5)| 00:00:01 |
|  17 |        KEY VECTOR USE         | :KV0000                   |    71 |  5112 |   167   (5)| 00:00:01 |
|  18 |         KEY VECTOR USE        | :KV0001                   |  2000 |   132K|   167   (5)| 00:00:01 |
|* 19 |          TABLE ACCESS FULL    | PEOPLE                    |   100K|  6250K|   165   (4)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("WT"."ID_STATE"=1)
   9 - filter("HT"."ID_STATE"=2)
  11 - access("ITEM_10"=INTERNAL_FUNCTION("C0") AND "ITEM_11"="C2")
  13 - access("ITEM_8"=INTERNAL_FUNCTION("C0") AND "ITEM_9"="C2")
  19 - filter(SYS_OP_KEY_VECTOR_FILTER("PE"."ID_TOWN_HOME",:KV0001) AND
              SYS_OP_KEY_VECTOR_FILTER("PE"."ID_TOWN_WORK",:KV0000))

There are three critical components to this plan: first, we create a couple of “Key Vectors” from the towns table, then we use those key vectors while scanning the people table and aggregate a minimal data set, finally we join back to the data associated with the key vectors. Reprising my introductory paragraph: the creation and use of the key vectors is similar to the Bloom filter approach; the final join-back is similar to the strategy used in Star Transformations (especially the ones where temp tables appear), and the key vector allows the high-volume fact data to be aggregated as much as possible before adding extra row-length from the dimensions.

In outline Oracle does the following:

  • scan the towns table to extract the id and padding columns for id_state = 1 (the work towns) – this produced 50 rows with my data set
  • manipulate the result to extract the distinct values of padding, and give each value a unique numeric identifier – this is the information that goes into the temp table (with one extra column) – this produced 10 rows
  • manipulate the result again to produce an in-memory array of (town.id, temp_table.identifier) pairs – this is the key vector, containing 50 elements.

The second temp table and key vector (for id_state = 2, the home towns) will be created in the same way.

As the fact table is scanned Oracle can apply the key vectors very efficiently (we hope) to pick out the people rows that would be involved in the final aggregate and associate with each relevant row the two padding identifiers that belong to that row (this step is a bit like doing 2 hash joins – but presumably much more efficient; Bloom filtering does something very similar). After selecting the minimum number of rows we can aggregate them on the two padding identifiers (an example of the “aggregate early”/”place group by” principle – aggregate before joining); finally we join back to the two temporary tables to translate the short padding identifiers into the long padding values (just as we do in star transformations with temporary table transformation).

Strangely we aggregate again after the join-back. I don’t think it’s necessary in this case because I’m fairly sure that the join back is on a unique set of columns – but perhaps this is a generic strategy allowing for variations in the mechanism, including such things as cases where the vector transform is only applied to a subset of the dimension tables.

Technically you could almost emulate this strategy in any version of Oracle (and I probably have at various times over the last few years) with the major limitation that the “KEY VECTOR USE” operations at lines 17 and 18 would have to be replaced with hash joins; no doubt, though, the major CPU saving of this approach is the difference between consecutive hash joins and what appears to be (from the execution stats) concurrent vector filtering. At some point – if a client needs the extra performance edge before they get to 12c – I’ll have to see if I can engineer an example in 11g that emulates the whole plan but uses Bloom filtering to approximate the key vector filtering.
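For what it’s worth, here’s a hedged sketch of the sort of manual rewrite I have in mind – group-by placement done by hand, with ordinary joins and IN subqueries standing in for the key vectors (so no claim that this reproduces the 12c internals):


with wt as (
        select id, padding from towns where id_state = 1
),
ht as (
        select id, padding from towns where id_state = 2
)
select
        wt.padding, ht.padding, max(v.max_padding)
from
        (
        select
                pe.id_town_work, pe.id_town_home,
                max(pe.padding) max_padding
        from    people pe
        where   pe.id_town_work in (select id from wt)
        and     pe.id_town_home in (select id from ht)
        group by
                pe.id_town_work, pe.id_town_home
        ) v,
        wt, ht
where
        v.id_town_work = wt.id
and     v.id_town_home = ht.id
group by
        wt.padding, ht.padding
;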

 

August 19, 2014

LOB Length

Filed under: Infrastructure,LOBs,Oracle,Performance — Jonathan Lewis @ 6:06 pm GMT Aug 19,2014

It’s funny how you can make little savings in work all over the place in Oracle if you’re prepared to look a little closely at what’s going on. Here’s a quirky little example with LOBs and function calls that might just have some greater relevance in other situations. Here’s a little data set, and two queries that I might run against it:


create table tbl(
	c1      clob
)
lob (c1) store as c_lob(
	disable storage in row
	nocache nologging
)
;

begin
	for i in 1..128 loop
		insert into tbl values(rpad('x',4000));
		commit;
	end loop;
end;
/

commit;

-- collect stats, prime dictionary cache and library cache
-- run both queries twice and check stats on second run

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		/*+ no_merge */
		dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		rownum rn, dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

The question that you might ask yourselves when you see these queries is: will they do similar amounts of work? Of course, I wouldn’t be asking the question if the answer were yes. Despite the no_merge() hint, which you might think would have the same effect as the rownum approach, Oracle seems to execute the call to dbms_lob.getlength() twice for each row in the first query, but only once per row for the second query. Here are the autotrace statistics from the second run of each query:


Statistics (for no_merge)
----------------------------------------------------------
         40  recursive calls
          0  db block gets
        271  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Statistics (for rownum)
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        131  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

As you can see, the consistent gets for the no_merge() approach are roughly double those for the rownum approach – and since we have 128 rows/LOBs in the table that looks suspiciously like 2 gets vs. 1 get per LOB depending on the approach, which suggests two calls to the function. This is further corroborated by the execution plans, and especially by the predicate sections (how often have I said “always check the predicates”), which show that the predicate has been pushed inside the view that’s been hinted to be non-mergeable, but hasn’t been pushed inside the view that uses the rownum instantiation trick:


Execution Plan for no_merge()
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    13 |            |          |
|   2 |   VIEW              |      |     6 |    78 |     2   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| TBL  |     6 |   522 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("DBMS_LOB"."GETLENGTH"("C1")>3960)

Execution Plan for rownum
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE      |      |     1 |    13 |            |          |
|*  2 |   VIEW               |      |   128 |  1664 |     2   (0)| 00:00:01 |
|   3 |    COUNT             |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| TBL  |   128 | 11136 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("LEN">3960)

My first thought on seeing this difference was to apply the /*+ no_push_pred */ hint to block predicate pushing – but then I remembered that the hint is about join predicate pushing and this is a simple filter push. A quick search of the hidden parameters, though, revealed this:

_optimizer_filter_pushdown : enable/disable filter predicate pushdown

Setting this parameter to false – either through a call to ‘alter session’ or through an /*+ opt_param('_optimizer_filter_pushdown','false') */ hint – allowed the no_merge approach to produce the same plan and resource usage as the rownum approach. Of course, for a production system, I’d probably use the rownum approach rather than mess around with hidden parameters.
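For reference, here’s a sketch of the hinted version of the first query (hidden parameters belong on test systems, not production):


select
	/*+ opt_param('_optimizer_filter_pushdown','false') */
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		/*+ no_merge */
		dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;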

Footnote:

I don’t know why the code with the no_merge() approach reported 40 recursive calls (on its first execution with autotrace). A couple of variations on the experiment suggested that it had something to do with the number of rows (or consequential buffer visits) that survived the predicate call – for a sufficiently small number of rows the recursive call count happened to drop to zero; but the phenomenon needs further investigation.

Reference Script: lob_oddity.sql

June 19, 2014

Delete Costs

Filed under: Bugs,CBO,Execution plans,Hints,Indexing,Oracle,Performance — Jonathan Lewis @ 6:18 pm GMT Jun 19,2014

One of the quirky little anomalies of the optimizer is that it’s not allowed to select rows from a table after doing an index fast full scan (index_ffs) even if it is obviously the most efficient (or, perhaps, least inefficient) strategy. For example:


create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum			id,
	mod(rownum,100)		n1,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5
;

create index t1_i1 on t1(id, n1);
alter table t1 modify id not null;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

explain plan for
select /*+ index_ffs(t1) */ max(padding) from t1 where n1 = 0;

select * from table(dbms_xplan.display(null,null,'outline -note'));

In this case we can see that there are going to be 1,000 rows where n1 = 0 spread evenly across the whole table, so a full tablescan is likely to be the most efficient strategy for the query; but we can tell the optimizer to do an index fast full scan with the hint that I’ve shown, and if the hint is legal (which means at least one column in the index has to be declared not null) the optimizer should obey it. So here’s the plan my hinted query produced:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   104 |   207   (4)| 00:00:02 |
|   1 |  SORT AGGREGATE    |      |     1 |   104 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  1000 |   101K|   207   (4)| 00:00:02 |
---------------------------------------------------------------------------

We’d have to examine the 10053 trace file to be certain, but it seems the optimizer won’t consider doing an index fast full scan followed by a trip to the table for a select statement (in passing, Oracle would have obeyed the skip scan – index_ss() – hint). It’s a little surprising, then, that the optimizer will obey the hint for a delete:


explain plan for
delete /*+ index_ffs(t1) cluster_by_rowid(t1) */ from t1 where n1 = 0;

select * from table(dbms_xplan.display(null,null,'outline -note'));

-------------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | DELETE STATEMENT      |       |  1000 |  8000 |    38  (11)| 00:00:01 |
|   1 |  DELETE               | T1    |       |       |            |          |
|*  2 |   INDEX FAST FULL SCAN| T1_I1 |  1000 |  8000 |    38  (11)| 00:00:01 |
-------------------------------------------------------------------------------

You might note three things from this plan. First, the optimizer can consider a fast full scan followed by a table visit (so why can’t we do that for a select?). Secondly, the cost of the delete statement is only 38 whereas the cost of the full tablescan in the earlier query was much larger at 207 – surprisingly, Oracle had to be hinted to consider this fast full scan path despite the fact that it was cheaper than the tablescan path it would have taken if I hadn’t included the hint. Finally, you might note the cluster_by_rowid() hint in the SQL – there’s no matching “Sort cluster by rowid” operation in the plan, even though this plan came from 11.2.0.4 where the mechanism and hint are available.

The most interesting of the three points is this: there is a bug recorded for the second one (17908541: CBO DOES NOT CONSIDER INDEX_FFS) reported as fixed in 12.2 – I wonder if this means that an index fast full scan followed by table access by rowid will also be considered for select statements in 12.2.

Of course, there is a trap – and something to be tested when the version (or patch) becomes available. Why is the cost of the delete so low (only 38, the cost of the index fast full scan) when the number of rows to be deleted is 1,000 and they’re spread evenly through the table ? It’s because the cost of a delete is actually calculated as the cost of the query: “select the rowids of the rows I want to delete but don’t worry about the cost of going to the rows to delete them (or the cost of updating the indexes that will have to be maintained, but that’s a bit irrelevant to the choice anyway)”.
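You can see that correspondence with a quick (hedged) check – cost the “rowids only” query on its own and compare it with the cost of the hinted delete:


explain plan for
select /*+ index_ffs(t1) */ rowid from t1 where n1 = 0;

select * from table(dbms_xplan.display);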

So when Oracle does do a delete following an index fast full scan in 12.2, will it be doing it because it’s the right thing to do, or because it’s the wrong thing ?

To be continued … (after the next release/patch).

 

April 3, 2014

Cache anomaly

Filed under: Bugs,Oracle,Performance — Jonathan Lewis @ 1:27 pm GMT Apr 3,2014

Just a quick heads-up for anyone who likes to play around with the Keep and Recycle caches.

In 11g Oracle introduced the option for serial direct path reads for tablescans on tables that were sufficiently large – which meant larger than the small_table_threshold – provided the table wasn’t already sufficiently well cached.  (The rules mean that the choice of mechanism can appear to be a little random in the production environment for tables that are near the threshold size – but if you try testing by doing “alter system flush buffer_cache” you find that you always get direct path reads in testing.)

I’ve just discovered a little oddity about this, though.  I have a table of about 50MB which is comfortably over the threshold for direct path reads. But if I create a KEEP cache (db_keep_cache_size) that is a little larger than the table and then assign the table to the KEEP cache (alter table xxx storage(buffer_pool keep)) then 11.2.0.4 stops doing direct path reads, and caches the table.
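A minimal sketch of the sort of test involved (the table name and cache size are illustrative, and setting db_keep_cache_size may steal memory from the default cache):


alter system set db_keep_cache_size = 64m scope = memory;
alter table t1 storage (buffer_pool keep);

select /*+ full(t1) */ count(*) from t1;

-- count the table's blocks currently in the buffer cache
select  count(*)
from    v$bh b, user_objects o
where   o.object_name = 'T1'
and     b.objd = o.data_object_id
;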

Now this doesn’t seem unreasonable – if you’ve assigned an object to the KEEP cache you probably want it (or once wanted it) to be kept in cache as much as possible; so using the KEEP cache if it’s defined and specified makes sense. The reason I mention this as an oddity, though, is that it doesn’t reproduce in 11.1.0.7.

I think I saw a bug note about this combination a few months ago – I was looking for something else at the time and, almost inevitably, I can’t find it when I want it – but I don’t remember whether it was the 11.1 or 11.2 behaviour that was deemed to be correct.

Update

See comments 1 and 2 below.  I’ve written about this previously, and the caching behaviour is the correct behaviour. The patch is in 11.2.0.2 and backports are available for 11.1.0.7 and 11.2.0.1. The patch ensures that the table will be read into the cache if it is smaller than the db_keep_cache_size.  (Although we might want to check – see Tanel’s notes – whether this is based on the high water mark recorded in the segment header or on the optimizer stats for the table; and I suppose it might be worth checking whether the same feature applies to index fast full scans.) From the MoS document:

With the bug fix applied, any object in the KEEP buffer pool, whose size is less than DB_KEEP_CACHE_SIZE, is considered as a small or medium sized object. This will cache the read blocks and avoid subsequent direct read for these objects.

 

 

January 23, 2014

Optimisation

Filed under: Execution plans,Oracle,Performance — Jonathan Lewis @ 6:05 pm GMT Jan 23,2014

Here’s a recent request from the OTN database forum – how do you make this query go faster (tkprof output supplied):

 select a.rowid
   from  a, b
   where A.MARK IS NULL
     and a.cntry_code = b.cntry_code and b.dir_code='XX' and b.numb_type='XXX'
     and upper(Trim(replace(replace(replace(replace(replace(replace(replace(a.co_name,'*'),'&'),'-'),'/'),')'),'('),' '))) like
         upper(Trim(substr(replace(replace(replace(replace(replace(replace(replace(b.e_name,'*'),'&'),'-'),'/'),')'),'('),' '),1,25)))||'%';

(more…)

January 14, 2014

Single block reads

Filed under: Infrastructure,Oracle,Performance — Jonathan Lewis @ 6:52 pm GMT Jan 14,2014

When a “cache read” tablescan (or index fast full scan) takes place we generally expect to see waits on “db file scattered read” as Oracle performs multi-block reads to do the scan. But we all know that Oracle will skip over blocks that are already in the cache and can therefore end up doing multi-block reads of many different sizes, even down to the point where it does single block reads (waiting for “db file sequential read”).

A quirky little question came up on OTN a little while ago: “for a large table we expect multiblock reads to be positioned at the end of the LRU for prompt re-use; but when Oracle does a single block read as part of a tablescan does it go to the end of the LRU (because it’s part of a large tablescan) or does it go to the mid-point of the LRU (because it’s a single block read)?”

The description of how blocks are treated in a tablescan has been simplified, of course, but the question is still valid – so what’s the answer, and how (without going into an extreme level of detail) would you demonstrate it ?

 

January 8, 2014

CR Trivia

Filed under: Infrastructure,Oracle,Performance — Jonathan Lewis @ 6:44 am GMT Jan 8,2014

Everybody “knows” that when you do a tablescan of a table it starts with two buffer gets on the segment header, though older versions (ca. 8i and earlier) of Oracle used to do 4 buffer gets on the segment header. The upshot of this is that many people still say that if you create a table and insert a single row then you’re going to get 3 buffer gets when you tablescan it: two for the segment header and one for the data block:

So here’s a test, with the second set of autotrace stats which, for reasons I’ll describe shortly, may not be immediately reproducible on your system:

(more…)

January 1, 2014

NVL()

Filed under: Oracle,Performance — Jonathan Lewis @ 6:11 pm GMT Jan 1,2014

Here’s a point that’s probably fairly well-known, but worth repeating – nvl() and coalesce() aren’t identical in behaviour, but you may have some cases where you’re using nvl() when coalesce() would be a more efficient option.

The reason for this is “short-circuiting”. The expression nvl(expr1, expr2) will return expr2 if expr1 is null, otherwise it will return expr1; the expression coalesce(expr1, expr2, …, exprN) will return the first non-null expression in the list so, in particular, coalesce(expr1, expr2) will give the same result as nvl(expr1, expr2); the big difference is that nvl() will evaluate both expressions, while coalesce() will evaluate expr2 only if it needs to (i.e. only if expr1 evaluates to null). In many cases the difference in performance will be insignificant – but consider the following type of construct (t1 is a table with a single, numeric, column n1 and a single row):

(more…)
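(The construct itself is behind the link; as a stand-alone sketch of the short-circuit effect – slow_f(), the table setup, and the loop count are illustrative, not taken from the original post – the difference should show up in the timing:)


create table t1(n1 number);
insert into t1 values(1);
commit;

create or replace function slow_f return number
as
begin
        -- burn some CPU so the evaluation cost is visible
        for i in 1..1e7 loop
                null;
        end loop;
        return 0;
end;
/

set timing on

-- n1 is not null, so coalesce() never calls slow_f(); nvl() calls it anyway
select nvl(n1, slow_f()) from t1;
select coalesce(n1, slow_f()) from t1;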

December 11, 2013

Null Quiz

Filed under: Oracle,Performance,Troubleshooting,Tuning — Jonathan Lewis @ 6:42 pm GMT Dec 11,2013

Here’s an example I saw a few months ago of the confusion caused by NULL. As the owner of the problem put it: the first query, run from SQL*Plus for testing purposes, takes no time to complete; but when “put into a pl/sql cursor” (as shown in the second query) it takes ages to complete.

(more…)

November 11, 2013

Reverse Key

Filed under: Indexing,Oracle,Performance — Jonathan Lewis @ 3:19 pm GMT Nov 11,2013

Here’s one of those little details which I would have said just couldn’t be true – except it’s in the manuals, and the manuals happen to be right.

(more…)

September 18, 2013

Distributed Sets

Filed under: distributed,Oracle,Performance,sorting — Jonathan Lewis @ 6:14 pm GMT Sep 18,2013

In an earlier post I’ve described how a distributed query can operate at a remote site if it’s a simple select but has to operate at the local site if it’s a CTAS (create table as select) or insert as select. There’s (at least) one special case where this turns out to be untrue … provided you write the query in the correct fashion. I discovered this only as a result of doing a few experiments in response to a question on the OTN database forum.

(more…)
