Dynamic Sampling

February 23, 2010

Dynamic Sampling

Filed under: CBO,Hints,Oracle — Jonathan Lewis @ 7:41 pm GMT Feb 23,2010

If you read the manual pages about “dynamic sampling” it’s easy to get just a little lost in the detail; so this is a brief overview of the variations in the strategies used.

You can set the parameter optimizer_dynamic_sampling to a level between 0 (no sample) and 10 (“sample” whole table).

You can use the dynamic_sampling() hint in two different ways in an SQL statement.

If you use the hint in the “cursor-level” form: /*+ dynamic_sampling ({level}) */ e.g. /*+ dynamic_sampling(4) */, this is equivalent to setting the parameter optimizer_dynamic_sampling to that level for the duration of that query.
If you use the hint in the “table-level” form /*+ dynamic_sampling({alias} {level}) */ e.g. /*+ dynamic_sampling(ord 4) */ you are directing Oracle to sample a specific table unconditionally. In this form, the sample size for levels 1 to 9 is 2^level-1 * “basic sample size” (which is set to 32 blocks by default by parameter _optimizer_dyn_smp_blks.) You can have multiple hints of this form in a single query, one for each table you want sampled.

If you’ve used the parameter setting or the “cursor-level” setting, then there are rules associated with levels 1 to 4 that the optimizer applies to each table in the query to decide whether or not to sample that table. Level 3 attempts to address the problems of “guesses” such as the effects of functions applied to columns, and level 4 attempts to deal with the “dependent predicates” problem; the sample size for these early levels will be 32 or 64 blocks (depending on the parameter _optimizer_dyn_smp_blks and the presence, or absence, of existing stats on the table being tested). From level 4 onwards, the sample size tends to multiply by a factor of 4 for each level, until the sample sizes for the cursor-level and table-level versions of the hints are the same.

Follow-up:

The morning after I wrote this note a related question appeared on the OTN database forum asking why Oracle hadn’t done dynamic sampling in a particular case. I’ve modelled the example with the code below, running on 10.2.0.3, locally managed tablespace, 1MB uniform extent size, 8KB block size:

rem
rem     Script: dynamic_sampling_blog.sql
rem     Author: J P.Lewis
rem     Dated:  Feb 2010
rem

drop table t1;

create table t1
as
select	*
from	all_objects
where   rownum  comment to avoid WordPress format mess
;

begin
        dbms_stats.gather_table_stats(
                user, 't1',method_opt => 'for all columns size 1'
        );
end;
/

select
	blocks, num_rows, sample_size
from
	user_tables
where
	table_name = 'T1'
;

delete from t1 where mod(object_id,10) != 0;
commit;

set autotrace on explain

select count(*) from t1;

select /*+ dynamic_sampling(t1 2) */ count(*) from t1;

set autotrace off

It’s a very simple query, on a very simple structure, and it uses autotrace in an attempt to make the point about dynamic sampling. Here’s a cut-n-paste of the output (starting just after the stats collection):

    BLOCKS   NUM_ROWS SAMPLE_SIZE
---------- ---------- -----------
       128      10000       10000

1 row selected.

9002 rows deleted.

Commit complete.

  COUNT(*)
----------
       998

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    36   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |    36   (0)| 00:00:01 |
-------------------------------------------------------------------

  COUNT(*)
----------
       998

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    36   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |    36   (0)| 00:00:01 |
-------------------------------------------------------------------

We can see that we have statistics.
We can see that we delete 9002 rows
We can see that we have 998 rows left
We can see that the plan (and especially the cardinality of the full tablescan) doesn’t change even though we included a table-level hint to do dynamic sampling.
Moreoever – we can’t see the usual note that you get when the plan is dependent on a dynamic sample (” – dynamic sampling used for this statement”).

It looks as if dynamic sampling hasn’t happened.

However, I “know” that dynamic sampling is supposed to happen unconditionally when you use a table-level hint – so I’m not going to stop at this point. There are cases where you just have to move on from explain plan (or autotrace) and look at the 10053 trace. This is what I found (with a few line-wraps added) when I enabled the 10053 trace for the hinted query:

SINGLE TABLE ACCESS PATH
*** 2010-02-24 10:14:04.671
** Performing dynamic sampling initial checks. **
** Dynamic sampling initial checks returning TRUE (level = 2).
*** 2010-02-24 10:14:04.671
** Generated dynamic sampling query:
    query text :
SELECT /* OPT_DYN_SAMP */
/*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false')
NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (
SELECT /*+ NO_PARALLEL("T1") FULL("T1") NO_PARALLEL_INDEX("T1") */ 1 AS C1, 1 AS C2
FROM "T1" SAMPLE BLOCK (49.218750 , 1) SEED (1) "T1") SAMPLESUB
*** 2010-02-24 10:14:04.671
** Executed dynamic sampling query:
    level : 2
    sample pct. : 49.218750
    actual sample size : 547
    filtered sample card. : 547
    orig. card. : 10000
    block cnt. table stat. : 128
    block cnt. for sampling: 128
    max. sample block cnt. : 64
    sample block cnt. : 63
    min. sel. est. : -1.00000000
** Not using dynamic sampling for single table sel. or cardinality.
DS Failed for : Current SQL statement for this session:
select /*+ dynamic_sampling(t1 2) */ count(*) from t1

So the optimizer did do dynamic sampling, but then decided that it wasn’t going to use the results for this query.

Footnote: I pointed out that for the lower levels of sampling Oracle will sample 32 or 64 blocks. For reasons relating (I think) to rounding errors and the “sample block (percentage)” strategy, the actual sample often seems to fall short by one block.

Update:

The most recent versions of Oracle allow you to specify level 11 for dynamic sampling – a level which allows the optimizer to decide for itself how much sampling to do. The manuals for the most recent versions also seem to introduce a few errors relating to the effect of the hints and the number of blocks sampled.

Comments (31)

31 Comments »

Can you explain, why CBO decided that it wasn`t going to use the results of dynamic sampling?

Comment by Yuri — February 24, 2010 @ 11:51 am GMT Feb 24,2010 | Reply
- Yuri,
  
  I have no idea why the optimizer ignored the sample results in this case. As far as I’m concerned there was technically no point in sampling at all since there was no choice of execution paths that could have been affected by the sample. Possibly the whole thing is just a coding structure that is occasionally redundant because any alternative introduces too much complexity in the feedback loop (i.e. the alternative is to work out possible paths before deciding that it’s not worth sampling – but then you’d have to work out the possible paths again.)
  
  The optimizer clearly saw that the sample was a lot smaller than expected (the trace shows 49.2% of the table showing 547 rows when the statistics would have predicated 4920 rows for that sample percentage). But the trace also suggests that the optimizer rejected the sample BEFORE choosing an execution plan.
  
  Comment by Jonathan Lewis — February 28, 2010 @ 12:45 pm GMT Feb 28,2010 | Reply
Jonathan,
Which 9002 rows did you delete? I mean, we don’t see your delete statement in the script – did you wipe the “first” 90% of the blocks (where rownum<=9002), or 90% of rows in each block, or something completely different? That should affect the quality of the sampling, right? Just curious.
Cheers,
Flado

Comment by Flado — February 26, 2010 @ 10:42 am GMT Feb 26,2010 | Reply
- Flado,
  
  Thanks for pointing out the error – now corrected. It’s the second time I’ve done that this month, cutting out some of the operational code along with the lines of debug and tracing code that weren’t needed for the demo. (A process known in colloquial English as throwing out the baby with the bath water).
  
  As you can see I’ve done a simple mod(id,10) to delete nine out of every 10 rows, and the way I created the data means the rows were “ordered”, so I’ve deleted 90% of every block evenly through the table.
  
  Your comment about patterns in the data affecting the quality of the sampling is very important though. It’s a point I’ve mentioned and demonstrated in my chapter of the latest Oak Table book from Apress. Oracle 11g has better dynamic sampling than 10g , but it still has at least one weak point.
  
  Comment by Jonathan Lewis — February 28, 2010 @ 12:29 pm GMT Feb 28,2010 | Reply
[…] 21-How does different types of dynamic sampling hint work? Jonathan Lewis- Dynamic Sampling […]

Pingback by Blogroll Report 19/02/2010 – 26/02/2010 « Coskan’s Approach to Oracle — March 21, 2010 @ 3:30 am GMT Mar 21,2010 | Reply
Hi,

I ran this testcase on an 11g database and with dynamic sampling at level 10, and despite this, dynamic sampling is still not taking into account by the CBO. When I use the “where rownum <= 9002" clause for deleting the rows, the CBO is mistaken too.

Comment by Ahmed AANGOUR — March 31, 2010 @ 9:26 am BST Mar 31,2010 | Reply
Dear Jonathan ,

I have a scenario for you. In our database we have tables of 50 to 70 GB in size. I have a SQL query that is executing very slow if there are statistics on that tables. After analyzing the AWR Report, i can only find that the the first case that is ( with statistics on the table) the SQL is doing full table scan and there is a huge wait event for Direct Path Read . In the latter (After deleting the statistics) the query is not doing FTS but executing through Nested Loop Join and dynamic sampling is used.

Time taken with before delting statistics is : 00.00.06.05

Time taken with before delting statistics is : 00.21.02.46

I also posted this in OTN formums , however i get any good response on this.

Your Response is appreciated

Regards,

MA

Comment by MA — June 21, 2010 @ 7:53 am BST Jun 21,2010 | Reply
- MA
  
  Time taken with before delting statistics is : 00.00.06.05
  Time taken with before delting statistics is : 00.21.02.46
  
  Did you mean “after deleting” on the first line ?
  
  In principle, this is what dynamic sampling is supposed to do for you.
  
  The optimizer had collected some stats that did not help it generate a good plan; but with dynamic sampling it probably came up with a different view of the number of rows selected from each table and the effect of the join condition.
  
  Comment by Jonathan Lewis — June 21, 2010 @ 7:32 pm BST Jun 21,2010 | Reply
Dear Jonathan

Sorry it was a type , it should be :

Time taken after deleting statistics is : 00.00.06.05
Time before deleting statistics is : 00.21.02.46

The command i used to gather statistisc is :

EXEC DBMS_STATS.gather_table_stats(‘NATGEO’, ‘GEONOM’, estimate_percent => dbms_stats.auto_sample_size,Cascade=> true);

this i have done for all the tables in schema.

I have a doubt , is it fine to have tables with out statistisc ? I tried other ways of gathering statistics but still the performance is dead slow if i have statistics on the tables. Dynamic Smpling seems to be the best in our case.

Could you please tell me if i am gathering the statistisc in a wrong way ? If dynamic sampling is the best , then why does Oracle recommends gathering statistics or is it a buy in my database ?

My Oracle Verion is : 11.1.0.7 on Windows 2003.

Regards

MA

Comment by MA — June 22, 2010 @ 12:15 pm BST Jun 22,2010 | Reply
- MA
  
  Sorry about the delay – I failed to notice your response when you sent it originally.
  
  It is generally better to have gathered statistics tham to use dynamic sampling because the samples tend to be either quite small, or quite expensive. I find, though, that it’s a safer strategy to block general histogram collection and then create histograms only on the columns I know really need them.
  
  You have to remember, though, that there are cases where even 100% sample sizes produce statistics that are “locally” 100% accurate but still don’t allow Oracle to derive correct estimates of volume (the ‘correlated columns’ issue is an example of this) – at which point you need to take some special action, and dynamic sampling can be the ideal solution.
  
  Since you are on 11g, by the way, you may find cases where creating “extended statistics” or “virtual columns” will allow you do pinpoint and address some of your problems using normal stat collection.
  
  Comment by Jonathan Lewis — August 12, 2010 @ 8:22 am BST Aug 12,2010 | Reply
Dear Jonathan,

The behaviour of this hint is a little strange for me in some case.
Example:

/*create test table*/
CREATE TABLE MYTESTTABLE
(
DDGTT_ID NUMBER(22)
);

/*look for stat*/
select table_name,num_rows,blocks,empty_blocks,avg_row_len,at.SAMPLE_SIZE,at.LAST_ANALYZED from all_tables at where table_name = ‘MYTESTTABLE’;
–> no stats

/*simple explain*/
EXPLAIN PLAN SET STATEMENT_ID = ‘abc’ FOR
select * from MYTESTTABLE ddgtt;
SELECT * FROM TABLE(dbms_xplan.display(‘PLAN_TABLE’,’abc’,’ALL’));
–> cardinality=1
–> dynamic_sampling used
–OK

/*same select with dynamic sampling hint*/
EXPLAIN PLAN SET STATEMENT_ID = ‘abc’ FOR
select /*+ dynamic_sampling(ddgtt 2)*/ * from MYTESTTABLE ddgtt;
SELECT * FROM TABLE(dbms_xplan.display(‘PLAN_TABLE’,’abc’,’ALL’));
–> cardinality=1
–> dynamic_sampling used
–OK

/*fix stat*/
EXEC dbms_stats.set_table_stats(‘CKERVOAZOU’,’MYTESTTABLE’,numrows=>1000);
/*look for stat*/
select table_name,num_rows,blocks,empty_blocks,avg_row_len,at.SAMPLE_SIZE,at.LAST_ANALYZED from all_tables at where table_name = ‘MYTESTTABLE’
–> num_rows = 1000 blocks=100 avg_row_len=100

/*simple explain and stats fixed*/
EXPLAIN PLAN SET STATEMENT_ID = ‘abc’ FOR
select * from MYTESTTABLE ddgtt;
SELECT * FROM TABLE(dbms_xplan.display(‘PLAN_TABLE’,’abc’,’ALL’));
–> cardinality=1000
–> dynamic_sampling NOT used
–>OK

/*same select with dynamic sampling hint and stats fixed*/
EXPLAIN PLAN SET STATEMENT_ID = ‘abc’ FOR
select /*+ dynamic_sampling(ddgtt 2)*/ *from MYTESTTABLE ddgtt;
SELECT * FROM TABLE(dbms_xplan.display(‘PLAN_TABLE’,’abc’,’ALL’));
–> cardinality=1
–> dynamic_sampling used
–>Why not !! but it’s not the expected behaviour according the documentation(as per my comprehension)

/*insert some rows*/
insert into MYTESTTABLE select OBJECT_ID from all_objects where rownum cardinality=1000
–> Dynamic sampling NOT used
–> OK

/*same select with dynamic sampling hint and stats fixed*/
EXPLAIN PLAN SET STATEMENT_ID = ‘abc’ FOR
select /*+ dynamic_sampling(ddgtt 2)*/ *from MYTESTTABLE ddgtt;
SELECT * FROM TABLE(dbms_xplan.display(‘PLAN_TABLE’,’abc’,’ALL’));
–> cardinality=1000 !!!!!
–> Dynamic sampling NOT used
–>very strange for me

Oracle 10.2.0.4 with optimizer_dynamic_sampling=2

Workaround is the cardinality hint??

thanks

Comment by christophe Kervoazou — August 9, 2010 @ 3:20 pm BST Aug 9,2010 | Reply
Christophe,

Unless I’ve missed something, you have two complaints:

/*same select with dynamic sampling hint and stats fixed*/
EXPLAIN PLAN SET STATEMENT_ID = ‘abc’ FOR
select /*+ dynamic_sampling(ddgtt 2)*/ * from MYTESTTABLE ddgtt;
SELECT * FROM TABLE(dbms_xplan.display(‘PLAN_TABLE’,’abc’,’ALL’));
–> cardinality=1
–> dynamic_sampling used
–> Why not !! but it’s not the expected behaviour according the documentation(as per my comprehension)

You think that dynamic sampling should not be used in this case. But the second bullet point in my article describes exactly what you’ve done:
If you use the hint in the “table-level” form /*+ dynamic_sampling({alias} {level}) */ e.g. /*+ dynamic_sampling(ord 4) */ you are directing Oracle to sample a specific table unconditionally

Your second complaint is a case where you say that Oracle has not used dynamic sampling when it should:

/*same select with dynamic sampling hint and stats fixed*/
EXPLAIN PLAN SET STATEMENT_ID = ‘abc’ FOR
select /*+ dynamic_sampling(ddgtt 2)*/ *from MYTESTTABLE ddgtt;
SELECT * FROM TABLE(dbms_xplan.display(‘PLAN_TABLE’,’abc’,’ALL’));
–> cardinality=1000 !!!!!
–> Dynamic sampling NOT used
–>very strange for me

But a large section of my article is about a 10053 trace file that shows Oracle taking a dynamic sampling and then deciding not to use it – which is probably what has happened in your case. Given that you inserted 1,000 rows into the table and had statistics which said there were 1,000 rows the perfect agreement of sample and stats would be the perfect reason for Oracle to disregard the sample.

Comment by Jonathan Lewis — August 12, 2010 @ 8:33 am BST Aug 12,2010 | Reply
- Jonathan,
  
  Really sorry for my mistake. It’s true that in this case, there is nothing special :).
  In fact, my lack of understanding appears when I insert a very differnt amount of rows (10 000 rows or more by example)in the table.
  In this case, there is no sampling(confirm by the note in xplan) and statistic seems to be used(cardinality 1000) in spite of the hint.
  When the table contains data, behaviour of the hint is not the same?
  
  You are right it’s probably the same “problem” but in my case,no delete , CBO should not be mistaken.
  
  Sorry again
  
  Regards
  
  Comment by Christophe Kervoazou — August 12, 2010 @ 6:16 pm BST Aug 12,2010 | Reply
[…] first workaround is to add a hint to force Oracle to take a dynamic sample of the critical table. In this case I’ve instructed Oracle to use a 64 block sample. For […]

Pingback by Cardinalilty One « Oracle Scratchpad — August 22, 2010 @ 6:37 pm BST Aug 22,2010 | Reply

Does the dynamic sampling supports AS OF SCN.

tst_pre_eod@MIFEX3&gt; SELECT ParticipantID, BoardID, InstrumentID, MMListAction FROM
         TIBEX_MEMMHybridAdmView as of scn 6148947776 WHERE MEGroupID = 'ME1'
         ORDER BY Timestamp ASC  2    3
  4  /

Execution Plan
----------------------------------------------------------
Plan hash value: 39089053

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                             |     1 |   159 |    88   (3)| 00:00:02 |
|   1 |  SORT ORDER BY                |                             |     1 |   159 |    88   (3)| 00:00:02 |
|   2 |   NESTED LOOPS                |                             |     1 |   159 |    87   (2)| 00:00:02 |
|*  3 |    HASH JOIN                  |                             |    60 |  3780 |    86   (2)| 00:00:02 |
|*  4 |     TABLE ACCESS FULL         | TIBEX_INSTRUMENT            |    47 |  1598 |    59   (0)| 00:00:01 |
|   5 |     SORT UNIQUE               |                             |  2361 | 68469 |    26   (0)| 00:00:01 |
|*  6 |      INDEX FAST FULL SCAN     | XPKTIBEX_ADMINACK           |  2361 | 68469 |    26   (0)| 00:00:01 |
|*  7 |    TABLE ACCESS BY INDEX ROWID| TIBEX_HYBRIDMMINSTRADMIN    |     1 |    96 |     1   (0)| 00:00:01 |
|*  8 |     INDEX UNIQUE SCAN         | XPKTIBEX_HYBRIDMMINSTRADMIN |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."MEGROUPID"="C"."SERVERID")
   4 - filter("B"."MEGROUPID"='ME1')
   6 - filter("C"."SERVERID"='ME1')
   7 - filter("A"."INSTRUMENTID"="B"."INSTRUMENTID")
   8 - access("A"."ADMINID"="C"."ADMINID")

tst_pre_eod@MIFEX3&gt; SELECT ParticipantID, BoardID, InstrumentID, MMListAction FROM
         TIBEX_MEMMHybridAdmView  WHERE MEGroupID = 'ME1'
         ORDER BY Timestamp ASC;  2    3

Execution Plan
----------------------------------------------------------
Plan hash value: 1706923895

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                          |     1 |   159 |     5  (20)| 00:00:01 |
|   1 |  SORT ORDER BY                 |                          |     1 |   159 |     5  (20)| 00:00:01 |
|   2 |   NESTED LOOPS SEMI            |                          |     1 |   159 |     4   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |                          |     1 |   130 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL          | TIBEX_HYBRIDMMINSTRADMIN |     1 |    96 |     2   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS BY INDEX ROWID| TIBEX_INSTRUMENT         |     1 |    34 |     1   (0)| 00:00:01 |
|*  6 |      INDEX UNIQUE SCAN         | XPKTIBEX_INSTRUMENT      |     1 |       |     0   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN            | XPKTIBEX_ADMINACK        |   134K|  3799K|     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("B"."MEGROUPID"='ME1')
   6 - access("A"."INSTRUMENTID"="B"."INSTRUMENTID")
   7 - access("A"."ADMINID"="C"."ADMINID" AND "C"."SERVERID"='ME1')
       filter("B"."MEGROUPID"="C"."SERVERID")

Note
-----
   - dynamic sampling used for this statement

Comment by Narasimha — September 15, 2010 @ 2:39 pm BST Sep 15,2010 | Reply

Hi Jonathan,

Do you know a trick to force the CBO to use Dynamic sampling results instead of object statistics?
I would like to use dynamic sampling on a table but just for one query and without having to delete statistics.
Unless I use Dynamic Sampling at level 10, I can’t reach this goal. With level <10 Dynamic sampling is performed but is not taken into account in the selectivity calculation.

SQL> explain plan for
  2  select /*+ dynamic_sampling(m 10) */ count(*) from  MUT_OBJECTS m where OBJ_PARENT_ID = 79249575;
  
 
Explained
 
SQL> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2416190520
-------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost  |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |     1 |     6 |     1 |
|   1 |  SORT AGGREGATE   |               |     1 |     6 |       |
|*  2 |   INDEX RANGE SCAN| IDX_PARENT_ID |  1069 |  6414 |     1 |
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJ_PARENT_ID"=79249575)
Note
-----
   - cpu costing is off (consider enabling it)
   - dynamic sampling used for this statement


SQL> explain plan for select /*+ dynamic_sampling(m 9) */ count(*) from  MUT_OBJECTS m where OBJ_PARENT_ID = 79249575;
 
Explained
 
SQL> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2416190520
-------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost  |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |     1 |     6 |     1 |
|   1 |  SORT AGGREGATE   |               |     1 |     6 |       |
|*  2 |   INDEX RANGE SCAN| IDX_PARENT_ID |     3 |    18 |     1 |
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJ_PARENT_ID"=79249575)
Note
-----
   - cpu costing is off (consider enabling it)
 
18 rows selected

Thanks for your help

Comment by Ahmed AANGOUR — March 30, 2011 @ 12:48 pm BST Mar 30,2011 | Reply

Ahmed,

It looks as if you may be interested in values which are only “a little more popular” than average – which means Oracle doesn’t notice the special cases until you take a very large sample (which – in your case – has to be 100%).

However, you could try adding the hint dynamic_sampling_est_cdn() to your code. This was documented in the 9.2 manuals, but seems to have disappeared from the 10g manuals.

Comment by Jonathan Lewis — March 31, 2011 @ 4:34 pm BST Mar 31,2011 | Reply

Hi Jonathan,
I came across an issue when using dynamic_sampling on partitioned tables where the optimizer selectivity estimation resulted to be very poor.
After detailed analysis I came up with a guess that I explain below.
I was thinking that if you have a chance to read it you might be able to confirm or rectify this theory.

Regards,
Davide

DYNAMIC_SAMPLING BEHAVIOUR ON PARTITIONED TABLES

In the case of a partitioned table the dynamic_sampling engine behaves in a way that will be described below and that can be harmful in specific cases.
The dinamic_sampling level, set at query level using the dynamic_sampling hint or at database level setting the parameter optimizer_dynamic_sampling, indicates in addition to other information how many blocks need to be sampled.
Let’s call this value as N.
I have observed tracing the 10053 event that the actual sampled blocks in the case of a non-partitioned table is N-1, so I am assuming that one of the sampled blocks is the segment header that is probably needed to be read to randomly choose the actual blocks to be sampled.

In the case of a partitioned table I have observed the following behavior of the optimizer.
Assuming that “p” is the number of partitons hit by the query and “N” is the number of blocks to be sampled as indicated by the dynamic_sampling level, the optimizer will calculate the actual number of blocks to be sampled (“n”) as follows:

if p n=(N-p)
if p >= N –> n=N/2

I have explained this with the following theory:
when oracle uses the SAMPLE clause in a select statement (which is used in the sample query generated by the optimizer) it will not consider the segment header, which I assume is needed to be read to randomly choose the sampled blocks, to be included in the required percentage.
In the case of a partitioned table the number of segment headers read is equal to the number of the partitions hit by the query.
When using dynamic_sampling the optimizer doesn’t want to read extra blocks other than what indicated by the dynamic_sampling level, so it will calculate the percentage to be used in the SAMPLE clause considering the number of actual sampled blocks, which is the number of blocks indicated by the dynamic_sampling level (N) less the number of segment headers to be read.
In this case it will meake sure than the sample query will not read more than N blocks (segment headers included).
In the previous formulae if p = N the calculation N-p would give a number less or equal to 0 leading to an inappropriate value to be used in the SAMPLE clause. That’s why it defaults to a number of actual sampled blocks equal to N/2 (and indeed the sample query generated is an union of different SELECT statements hitting different partitons to a total of N/2).

Below there are some excerpts from the various 10053 trace files.
In all the cases the database parameter optimizer_dynamic_sampling was set to 4, which indicates that 32 is the number of blocks to be sampled:

1. Query hitting 1 partition:
total partitions : 1
partitions for sampling : 1
…
max. sample block cnt. : 32
sample block cnt. : 31

2. Query hitting 2 partitions:
total partitions : 1932
partitions for sampling : 2
…
max. sample block cnt. : 32
sample block cnt. : 30

3. Query hitting 31 partitions:
total partitions : 1932
partitions for sampling : 31
…
max. sample block cnt. : 32
sample block cnt. : 1

4. Query hitting 32 partitions:
total partitions : 1932
partitions for sampling : 32
partitions actually sampled from : 16
partitioning pct. : 1.655458
…
max. sample block cnt. : 32
sample block cnt. : 16

5. Query hitting 45 partitions:
total partitions : 1932
partitions for sampling : 45
partitions actually sampled from : 16
partitioning pct. : 2.329193
…
max. sample block cnt. : 32
sample block cnt. : 16

In the above excerpts “max. sample block cnt.” is the number of blocks to be sampled set by the dynamic_samplic level (our “N” variable), the “partitions for sampling” is the number of partitions hit by the query (our “p” variable) and “sample block cnt.” is the actual sampled blocks (our “n” variable).

You will see that when p is very close to N (i.e. excerpt number 3) we have a very small blocks actually used in the sample query, and this can unfortunately lead to an inaccurate estimation by the optimizer.

Comment by Davide Gislon — April 8, 2011 @ 4:27 pm BST Apr 8,2011 | Reply
- Davide,
  
  I don’t have any better knowledge than you do on this aspect of sampling. Your notes prompted me to set up a table with 1,000 partitions and try a few quick tests using the dynamic_sampling hint running through the levels 1 to 9 (which – in the hint form – start Oracle with 32 blocks and then keep doubling up).
  
  The results were interesting – variations on large percentages of a selection of partiitons to very small percentages of the whole table. And some of the results certainly looked consistent with your suggestions.
  
  It could be useful to work out more detail, but I think that I’d probably want to force larger sample sizes anyway when looking at large partitioned tables, so working at the bottom end of the sample size might be counterproductive.
  
  Comment by Jonathan Lewis — April 20, 2011 @ 2:00 pm BST Apr 20,2011 | Reply
Hi Jonathan,
I forgot to mention that the Oracle version where I ran my testing is 10.2.0.5, using ASSM.

Thanks,
Davide

Comment by Davide Gislon — April 11, 2011 @ 9:18 am BST Apr 11,2011 | Reply
[…] https://jonathanlewis.wordpress.com/2010/02/23/dynamic-sampling/ http://oracle-randolf.blogspot.com/2011/06/dynamic-sampling-public-synonyms-and.html Share this:TwitterFacebookLike this:LikeBe the first to like this post. […]

Pingback by Dynamic sampling pitfalls « savvinov — December 21, 2011 @ 10:57 am GMT Dec 21,2011 | Reply
[…] Source:- https://jonathanlewis.wordpress.com/2010/02/23/dynamic-sampling/ […]

Pingback by Dynamic Sampling: Use,Levels,10g-11g behavior, « SureshGandhi — October 7, 2012 @ 7:31 am BST Oct 7,2012 | Reply
[…] another option, though: get Oracle to use dynamic sampling. (You probably guessed that from the title of the post.) So which level would you use to make this […]

Pingback by Dynamic Sampling – 2 | Oracle Scratchpad — May 23, 2013 @ 12:47 pm BST May 23,2013 | Reply
[…] a couple of extracts from a trace file after I’ve set optimizer_dynamic_sampling to level 3. I’ve run two, very similar, SQL statements that both require dynamic sampling […]

Pingback by Cursor Sharing | Oracle Scratchpad — July 7, 2013 @ 5:49 pm BST Jul 7,2013 | Reply
[…] which shows it selected 63 blocks – this should have been 64 but Jonathan Lewis has pointed out that discrepancy already […]

Pingback by 11.2.0.4 new features and a OPTIMIZER_DYNAMIC_SAMPLING change « Oracle DBA – A lifelong learning experience — December 9, 2013 @ 9:14 pm GMT Dec 9,2013 | Reply
[…] just worth highlighing a little detail that is often overlooked, though: there are two variations on the “dynamic sampling” hint, the cursor level and the table level, and the number of blocks sampled at a particular level is […]

Pingback by Dynamic Sampling | Oracle Scratchpad — June 15, 2015 @ 9:41 pm BST Jun 15,2015 | Reply
[…] Source:- https://jonathanlewis.wordpress.com/2010/02/23/dynamic-sampling/ […]

Pingback by Dynamic Sampling: Use,Levels,10g-11g behavior, | Geeks Insights — August 18, 2015 @ 1:08 am BST Aug 18,2015 | Reply
[…] hint的方式进行。针对这个hint， Jonathan lewis在dynamic-sampling-1中有很详细描述，一种是cursor级别(/* +dynamic_sampling(4) */),一种是表级别(/* […]

Pingback by dynamic sampling（动态采样） | LEO Notes — August 17, 2016 @ 8:23 am BST Aug 17,2016 | Reply
[…] too small. So maybe the key to the problem is in the Note section that tells us that Oracle has (successfully) used dynamic sampling for this statement. In other words, all the theory of how the optimizer calculates selectivity may […]

Pingback by Cardinality Puzzle | Oracle Scratchpad — July 12, 2018 @ 12:58 pm BST Jul 12,2018 | Reply
[…] written about dynamic sampling in the past, but here’s a little wrinkle that’s easy to miss. How do you get the […]

Pingback by Dynamic Sampling – 2 | Oracle Scratchpad — January 29, 2019 @ 12:10 pm GMT Jan 29,2019 | Reply
[…] […]

Pingback by Как форсировать применение DYNAMIC SAMPLING на примерах LiveSQL версии 19c | Oracle mechanics — March 10, 2019 @ 9:30 pm GMT Mar 10,2019 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome. Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Oracle Scratchpad

February 23, 2010