Oracle Scratchpad

October 30, 2014

Quiz night

Filed under: CBO,Oracle — Jonathan Lewis @ 6:43 pm GMT Oct 30,2014

Here’s a little puzzle that came up on OTN recently.  (No prizes for following the URL to find the answer) (Actually, no prizes anyway). There’s more in the original code sample than was really needed, so although I’ve done a basic cut and paste from the original I’ve also eliminated a few lines of the text:


execute dbms_random.seed(0)

create table t
as
select rownum as id,
       100+round(ln(rownum/3.25+2)) aS val2,
       dbms_random.string('p',250) aS pad
from dual
connect by level <= 1000
order by dbms_random.value;

begin
  dbms_stats.gather_table_stats(ownname          => user,
                                tabname          => 'T',
                                method_opt       => 'for all columns size 254'
  );
end;
/

column endpoint_value format 9999
column endpoint_number format 999999
column frequency format 999999

select endpoint_value, endpoint_number,
       endpoint_number - lag(endpoint_number,1,0)
                  OVER (ORDER BY endpoint_number) AS frequency
from user_tab_histograms
where table_name = 'T'
and column_name = 'VAL2'
order by endpoint_number
;

alter session set optimizer_mode = first_rows_100;

explain plan set statement_id '101' for select * from t where val2 = 101;
explain plan set statement_id '102' for select * from t where val2 = 102;
explain plan set statement_id '103' for select * from t where val2 = 103;
explain plan set statement_id '104' for select * from t where val2 = 104;
explain plan set statement_id '105' for select * from t where val2 = 105;
explain plan set statement_id '106' for select * from t where val2 = 106;

select statement_id, cardinality from plan_table where id = 0;

The purpose of the method_opt in the gather_table_stats() call is to ensure we get a frequency histogram on val2; and the query against the user_tab_columns view should give the following result:


ENDPOINT_VALUE ENDPOINT_NUMBER FREQUENCY
-------------- --------------- ---------
           101               8         8
           102              33        25
           103             101        68
           104             286       185
           105             788       502
           106            1000       212

Given the perfect frequency histogram, the question then arises why the optimizer seems to calculate incorrect cardinalities for some of the queries; the output from the last query is as follows:


STATEMENT_ID                   CARDINALITY
------------------------------ -----------
101                                      8
102                                     25
103                                     68
104                                    100           -- expected prediction 185
105                                    100           -- expected prediction 502
106                                    100           -- expected prediction 212

I’ve disabled comments so that you can read the answer at OTN if you want to – but see if you can figure out the reason before reading it. (This reproduces on 11g and 12c – and probably on earlier versions all the way back to 9i).

I haven’t done anything extremely cunning with hidden parameters, materialized views, query rewrite, hidden function calls, virtual columns or any other very dirty tricks, by the way.

TrackBack URI

Blog at WordPress.com.