Oracle Scratchpad

May 2, 2014

Costing Bug

Filed under: Bugs,CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 8:53 am BST May 2,2014

It’s amazing how you can find little bugs (or anomalies) as soon as you start to look closely at how things work in Oracle. I started to write an article for All Things Oracle last night about execution plans with subqueries, so wrote a little script to generate some sample data, set up the first sample query, checked the execution plan, and stopped because the final cost didn’t make sense. Before going on I should point out that this probably doesn’t matter and probably wouldn’t cause a change in the execution plan if the calculation were corrected – but it is just an interesting indication of the odd things that can happen when sections of modular code are combined in an open-ended way. Here’s the query (running on 11.2.0.4) with execution plan:


update t1 set 
	n1 = (
		select	max(mod100)
		from	t2
		where	t2.id = t1.id
	),
	n2 = (
		select	max(trunc100)
		from	t3
		where	t3.id = t1.id
	)
where
	id between 101 and 200
;

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT              |       |   101 |  1212 |   812  (25)| 00:00:05 |
|   1 |  UPDATE                       | T1    |       |       |            |          |
|*  2 |   INDEX RANGE SCAN            | T1_I1 |   101 |  1212 |     2   (0)| 00:00:01 |
|   3 |   SORT AGGREGATE              |       |     1 |     7 |            |          |
|   4 |    FIRST ROW                  |       |     1 |     7 |     3   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN (MIN/MAX)| T2_I1 |     1 |     7 |     3   (0)| 00:00:01 |
|   6 |   SORT AGGREGATE              |       |     1 |     7 |            |          |
|   7 |    FIRST ROW                  |       |     1 |     7 |     3   (0)| 00:00:01 |
|*  8 |     INDEX RANGE SCAN (MIN/MAX)| T3_I1 |     1 |     7 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID">=101 AND "ID"<=200)
   5 - access("T2"."ID"=:B1)
   8 - access("T3"."ID"=:B1)

So the cost of running each of the subqueries is 3 – there are two of them, and we expect to run each of the 101 times: for a total cost of 606. So how do we get to 812 as the total cost of the query ?

Further testing:

  • the cost of the plan for updating the two columns with constants is just 4.
  • rebuild the indexes with different values for pctfree to see how the cost changes
  • vary the number of columns updated by subquery
  • check the 10053 trace – for issues or presentation vs. rounding, particularly

Ultimately I decided that for each column updated by subquery the optimizer added 1 to the cost of accessing the table for each row; or, to view it another way, the optimizer used “sum(subquery costs + 1) * number of rows to be updated” so (4 + 4) * 101 + a little bit for the driving table access =  812. This doesn’t seem entirely reasonable – given that a cost is essentially equivalent to assuming that a single block visit is a disk read when we know that when we update multiple columns of the same row we need only read the block into memory at most once. As I said at the start, though this anomaly in costing probably doesn’t matter – there are no further steps to be taken after the update so there’s nothing the optimizer might do differently if the cost of the update had been calculated as 612 rather then 812.

Footnote:

If you want to play about with this query, here’s the code to create the tables – with one proviso, the plan above happens to be one I produced after rebuilding the indexes on t2 and t3 with pctfree 99


create table t1
as
with generator as (
	select  --+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum				id,	
	mod(rownum-1,100)		mod100,
	trunc((rownum - 1)/100)		trunc100,
	rownum				n1,
	rownum				n2,
	lpad(rownum,6,'0')		vc1,
	rpad('x',100)			padding
from
	generator
where
	rownum <= 10000
;

create table t2 as select * from t1;
create table t3 as select * from t1;

create index t1_i1 on t1(id);
create index t2_i1 on t2(id,mod100);
create index t3_i1 on t3(id,trunc100);

begin
	dbms_stats.gather_table_stats(user,'t1');
	dbms_stats.gather_table_stats(user,'t2');
	dbms_stats.gather_table_stats(user,'t3');
end;
/

3 Comments »

  1. Jonathan, could you describe, please, how scalar subqueries costed in 12c?

    It was interesting for me to compare scalar subqueries costing in 12c, since they are accounted in the total cost.
    I have compared your update with “select” with scalar subqueries:

    select
        n1,
        n2,
        (   select--+ no_unnest
              max(mod100)
            from    t2
            where   t2.id = t1.id
        ),
        (   select--+ no_unnest
               max(trunc100)
            from    t3
            where   t3.id = t1.id
        )
    from t1
    where
        id between 101 and 200;
    

    And I have got strange results:

    SQL> explain plan for
      2  update t1 set
      3      n1 = (
      4          select  max(mod100)
      5          from    t2
      6          where   t2.id = t1.id
      7      ),
      8      n2 = (
      9          select  max(trunc100)
     10          from    t3
     11          where   t3.id = t1.id
     12      )
     13  where
     14      id between 101 and 200;
    
    Explained.
    
    SQL> select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------
    Plan hash value: 324944520
    
    ---------------------------------------------------------------------------------------
    | Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT              |       |   101 |  1212 |   812  (25)| 00:00:01 |
    |   1 |  UPDATE                       | T1    |       |       |            |          |
    |*  2 |   INDEX RANGE SCAN            | T1_I1 |   101 |  1212 |     2   (0)| 00:00:01 |
    |   3 |   SORT AGGREGATE              |       |     1 |     7 |            |          |
    |   4 |    FIRST ROW                  |       |     1 |     7 |     3   (0)| 00:00:01 |
    |*  5 |     INDEX RANGE SCAN (MIN/MAX)| T2_I1 |     1 |     7 |     3   (0)| 00:00:01 |
    |   6 |   SORT AGGREGATE              |       |     1 |     7 |            |          |
    |   7 |    FIRST ROW                  |       |     1 |     7 |     3   (0)| 00:00:01 |
    |*  8 |     INDEX RANGE SCAN (MIN/MAX)| T3_I1 |     1 |     7 |     3   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("ID">=101 AND "ID"<=200)
       5 - access("T2"."ID"=:B1)
       8 - access("T3"."ID"=:B1)
    
    22 rows selected.
    
    SQL> explain plan for
      2  select
      3      n1,
      4      n2,
      5      (   select--+ no_unnest
      6            max(mod100)
      7          from    t2
      8          where   t2.id = t1.id
      9      ),
     10      (   select--+ no_unnest
     11             max(trunc100)
     12          from    t3
     13          where   t3.id = t1.id
     14      )
     15  from t1
     16  where
     17      id between 101 and 200;
    
    Explained.
    
    SQL> select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------------------------
    Plan hash value: 144489374
    
    ---------------------------------------------------------------------------------------------
    | Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                    |       |   101 |  1212 |   307   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE                     |       |     1 |     7 |            |          |
    |   2 |   FIRST ROW                         |       |     1 |     7 |     3   (0)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN (MIN/MAX)       | T2_I1 |     1 |     7 |     3   (0)| 00:00:01 |
    |   4 |  SORT AGGREGATE                     |       |     1 |     7 |            |          |
    |   5 |   FIRST ROW                         |       |     1 |     7 |     3   (0)| 00:00:01 |
    |*  6 |    INDEX RANGE SCAN (MIN/MAX)       | T3_I1 |     1 |     7 |     3   (0)| 00:00:01 |
    |   7 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   101 |  1212 |     4   (0)| 00:00:01 |
    |*  8 |   INDEX RANGE SCAN                  | T1_I1 |   101 |       |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("T2"."ID"=:B1)
       6 - access("T3"."ID"=:B1)
       8 - access("ID">=101 AND "ID"<=200)
    
    22 rows selected.
    

    and I’ve got cost twice less than I expected.
    I have tested also with one subquery only:

    SQL> explain plan for
      2  select
      3      n1,n2
      4     ,(   select--+ no_unnest
      5            max(mod100)
      6          from    t2
      7          where   t2.id = t1.id
      8      )
      9  from t1
     10  where
     11      id between 101 and 200;
    
    ---------------------------------------------------------------------------------------------
    | Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                    |       |   101 |  1212 |   156   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE                     |       |     1 |     7 |            |          |
    |   2 |   FIRST ROW                         |       |     1 |     7 |     3   (0)| 00:00:01 |
    |*  3 |    INDEX RANGE SCAN (MIN/MAX)       | T2_I1 |     1 |     7 |     3   (0)| 00:00:01 |
    |   4 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   101 |  1212 |     4   (0)| 00:00:01 |
    |*  5 |   INDEX RANGE SCAN                  | T1_I1 |   101 |       |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("T2"."ID"=:B1)
       5 - access("ID">=101 AND "ID"<=200)
    
    

    I don’t understand, why every subquery costed twice less than just multipication 3*101? Why 151.5 instead of 101*3?
    You can see 10053 trace I got for query with 4 subqueries: http://orasql.org/files/xtsql_ora_5748_subq.trc

    Comment by Sayan Malakshinov — May 2, 2014 @ 9:50 pm BST May 2,2014 | Reply

  2. Hello Jonathan,

    I’m running Oracle 11.2.0.3.0 and I’m actually getting a cost much closer to what you would expect: 610.

    Kind regards,
    Arnaud

    Plan hash value: 3456992656
     
    ---------------------------------------------------------------------------------------
    | Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT              |       |   101 |  1212 |   610  (34)| 00:00:07 |
    |   1 |  UPDATE                       | T1    |       |       |            |          |
    |*  2 |   INDEX RANGE SCAN            | T1_I1 |   101 |  1212 |     2   (0)| 00:00:01 |
    |   3 |   SORT AGGREGATE              |       |     1 |     7 |            |          |
    |   4 |    FIRST ROW                  |       |     1 |     7 |     2   (0)| 00:00:01 |
    |*  5 |     INDEX RANGE SCAN (MIN/MAX)| T2_I1 |     1 |     7 |     2   (0)| 00:00:01 |
    |   6 |   SORT AGGREGATE              |       |     1 |     7 |            |          |
    |   7 |    FIRST ROW                  |       |     1 |     7 |     2   (0)| 00:00:01 |
    |*  8 |     INDEX RANGE SCAN (MIN/MAX)| T3_I1 |     1 |     7 |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access("ID"&gt;=101 AND "ID"&lt;=200)
       5 - access(&quot;T2&quot;.&quot;ID&quot;=:B1)
       8 - access(&quot;T3&quot;.&quot;ID&quot;=:B1)
    

    Comment by Arnaud Wallon — May 5, 2014 @ 1:36 pm BST May 5,2014 | Reply

  3. […] is a little oddity with the cost of this plan (see also: http://jonathanlewis.wordpress.com/2014/05/02/costing-bug/) – the total cost of 610 seems to come from summing the cost of executing the two subqueries 101 […]

    Pingback by Execution Plans part 5: First Child Variations – All Things Oracle — May 28, 2014 @ 3:26 pm BST May 28,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,268 other followers