Here’s a little script I wrote a few years ago to make a point about using the dbms_stats package. I’ve just re-run it on 10.2.0.3 to see if it still behaves the way it used to – and it does. If you want to be just a little bit baffled, set up a database with an 8KB blocks size, a tablespace that is locally managed, uniform extent size of 1MB, using freelist management, then run the script:
drop table t2;
drop table t1;
begin
begin execute immediate 'purge recyclebin';
exception when others then null;
end;
begin execute immediate 'execute dbms_stats.delete_system_stats';
exception when others then null;
end;
begin execute immediate 'alter session set "_optimizer_cost_model"=io';
exception when others then null;
end;
end;
/
alter session set workarea_size_policy = manual;
alter session set hash_area_size = 1048576;
create table t1
as
with generator as (
select --+ materialize
rownum id
from all_objects
where rownum <= 5000
)
select
rownum id,
rownum n1,
lpad(rownum,10,'0') v00,
lpad(rownum,10,'0') v01,
lpad(rownum,10,'0') v02,
lpad(rownum,10,'0') v03,
lpad(rownum,10,'0') v04,
lpad(rownum,10,'0') v05,
lpad(rownum,10,'0') v06,
lpad(rownum,10,'0') v07,
lpad(rownum,10,'0') v08,
lpad(rownum,10,'0') v09,
lpad(rownum,10,'0') v10,
lpad(rownum,10,'0') v11,
lpad(rownum,10,'0') v12,
lpad(rownum,10,'0') v13,
lpad(rownum,10,'0') v14,
lpad(rownum,10,'0') v15,
lpad(rownum,10,'0') v16,
lpad(rownum,10,'0') v17,
lpad(rownum,10,'0') v18,
lpad(rownum,10,'0') v19,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 10000
;
create table t2
as
with generator as (
select --+ materialize
rownum id
from all_objects
where rownum <= 5000
)
select
rownum id,
rownum n1,
lpad(rownum,200,'0') v100,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 10000
;
analyze table t1 compute statistics;
analyze table t2 compute statistics;
set autotrace traceonly explain
prompt =============
prompt Using Analyze
prompt =============
select
t1.v00,
t1.v01,
t1.v02,
t1.v03,
t1.v04,
t1.v05,
t1.v06,
t1.v07,
t1.v08,
t1.v09,
t1.v10,
t1.v11,
t1.v12,
t1.v13,
t1.v14,
t1.v15,
t1.v16,
t1.v17,
t1.v18,
t1.v19,
t2.v100
from
t1,
t2
where
t1.id = t2.id
;
set autotrace off
begin
dbms_stats.gather_table_stats(
ownname => user,
tabname =>'t1',
estimate_percent => 100,
method_opt => 'for all columns size 1'
);
dbms_stats.gather_table_stats(
ownname => user,
tabname =>'t2',
estimate_percent => 100,
method_opt => 'for all columns size 1'
);
end;
/
set autotrace traceonly explain
prompt ================
prompt Using dbms_stats
prompt ================
select
t1.v00,
t1.v01,
t1.v02,
t1.v03,
t1.v04,
t1.v05,
t1.v06,
t1.v07,
t1.v08,
t1.v09,
t1.v10,
t1.v11,
t1.v12,
t1.v13,
t1.v14,
t1.v15,
t1.v16,
t1.v17,
t1.v18,
t1.v19,
t2.v100
from
t1,
t2
where
t1.id = t2.id
;
set autotrace off
Note that the script runs the same query twice – check the two execution plans carefully, they may not be the same.

Isn’t the analyze command deprecated for gathering stats for the CBO?
Comment by Colin 't Hart — March 31, 2010 @ 9:01 pm UTC Mar 31,2010 |
Colin,
Read the first sentence – and then ask yourself what point I might have been trying to make a few years ago.
Comment by Jonathan Lewis — March 31, 2010 @ 9:35 pm UTC Mar 31,2010 |
I tested on my machine:
10.1.0.2.0 on Windows 32bit:
Analyze ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=349 Card=10000 Bytes=4060000) 1 0 HASH JOIN (Cost=349 Card=10000 Bytes=4060000) 2 1 TABLE ACCESS (FULL) OF 'T1' (TABLE) (Cost=76 Card=10000 Bytes=2030000) 3 1 TABLE ACCESS (FULL) OF 'T2' (TABLE) (Cost=73 Card=10000 Bytes=2030000) DBMS_STATS ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=365 Card=10000 Bytes=4290000) 1 0 HASH JOIN (Cost=365 Card=10000 Bytes=4290000) 2 1 TABLE ACCESS (FULL) OF 'T2' (TABLE) (Cost=73 Card=10000 Bytes=2050000) 3 1 TABLE ACCESS (FULL) OF 'T1' (TABLE) (Cost=76 Card=10000 Bytes=2240000)So looks like for analyze CBO will think the row length in the two tables are same but for DBMS_STATS will also consider the columns count, right?
Comment by dhhb — April 1, 2010 @ 6:08 am UTC Apr 1,2010 |
ANALYZE is not counting the length bytes?
The point being, bad statistics can have real, detrimental impact (in this case, slightly wrong hash join order) on execution plan quality even if we don’t use CPU costing?
By the way, extent sizing policy (auto vs. 1M) and free space management (ASSM vs. freelist) didn’t make any difference in my test (10.2.0.3 under RedHat).
Cheers,
Flado
Comment by Flado — April 1, 2010 @ 8:50 am UTC Apr 1,2010 |
[...] Statistics — Jonathan Lewis @ 7:58 pm UTC Apr 1,2010 If you’ve run the scripts from Analyze This, I hope you found that the query gave you two different execution plans. This is my output from the [...]
Pingback by Analyze This – 2 « Oracle Scratchpad — April 1, 2010 @ 8:00 pm UTC Apr 1,2010 |
I do this test on my test db.
os: Red Hat Enterprise Linux Server release 5.2 (Tikanga)
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – Production
--After analyze
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 3964K| | 292 |
|* 1 | HASH JOIN | | 10000 | 3964K| 3506K| 292 |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 1982K| | 47 |
| 3 | TABLE ACCESS FULL| T2 | 10000 | 1982K| | 45 |
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."ID"="T2"."ID")
Note
-----
- cpu costing is off (consider enabling it)
--After using dbms_stats
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 4189K| | 308 |
|* 1 | HASH JOIN | | 10000 | 4189K| 3662K| 308 |
| 2 | TABLE ACCESS FULL| T2 | 10000 | 2001K| | 45 |
| 3 | TABLE ACCESS FULL| T1 | 10000 | 2187K| | 47 |
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."ID"="T2"."ID")
Note
-----
- cpu costing is off (consider enabling it)
Comment by jametong — April 3, 2010 @ 1:18 pm UTC Apr 3,2010 |
[...] to use DBMS_STATS instead of analyze ? Jonathan Lewis-Analyze this Jonathan Lewis-Analyze this [...]
Pingback by Blogroll Report 26/03 /2010 – 02/04/2010 « Coskan’s Approach to Oracle — May 23, 2010 @ 10:21 pm UTC May 23,2010 |