It’s amazing how you can find little bugs (or anomalies) as soon as you start to look closely at how things work in Oracle. I started to write an article for All Things Oracle last night about execution plans with subqueries, so wrote a little script to generate some sample data, set up the first sample query, checked the execution plan, and stopped because the final cost didn’t make sense. Before going on I should point out that this probably doesn’t matter and probably wouldn’t cause a change in the execution plan if the calculation were corrected – but it is just an interesting indication of the odd things that can happen when sections of modular code are combined in an open-ended way. Here’s the query (running on 11.2.0.4) with execution plan:
update t1 set n1 = ( select max(mod100) from t2 where t2.id = t1.id ), n2 = ( select max(trunc100) from t3 where t3.id = t1.id ) where id between 101 and 200 ; --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | UPDATE STATEMENT | | 101 | 1212 | 812 (25)| 00:00:05 | | 1 | UPDATE | T1 | | | | | |* 2 | INDEX RANGE SCAN | T1_I1 | 101 | 1212 | 2 (0)| 00:00:01 | | 3 | SORT AGGREGATE | | 1 | 7 | | | | 4 | FIRST ROW | | 1 | 7 | 3 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN (MIN/MAX)| T2_I1 | 1 | 7 | 3 (0)| 00:00:01 | | 6 | SORT AGGREGATE | | 1 | 7 | | | | 7 | FIRST ROW | | 1 | 7 | 3 (0)| 00:00:01 | |* 8 | INDEX RANGE SCAN (MIN/MAX)| T3_I1 | 1 | 7 | 3 (0)| 00:00:01 | --------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID">=101 AND "ID"<=200) 5 - access("T2"."ID"=:B1) 8 - access("T3"."ID"=:B1)
So the cost of running each of the subqueries is 3 – there are two of them, and we expect to run each of the 101 times: for a total cost of 606. So how do we get to 812 as the total cost of the query ?
Further testing:
- the cost of the plan for updating the two columns with constants is just 4.
- rebuild the indexes with different values for pctfree to see how the cost changes
- vary the number of columns updated by subquery
- check the 10053 trace – for issues or presentation vs. rounding, particularly
Ultimately I decided that for each column updated by subquery the optimizer added 1 to the cost of accessing the table for each row; or, to view it another way, the optimizer used “sum(subquery costs + 1) * number of rows to be updated” so (4 + 4) * 101 + a little bit for the driving table access = 812. This doesn’t seem entirely reasonable – given that a cost is essentially equivalent to assuming that a single block visit is a disk read when we know that when we update multiple columns of the same row we need only read the block into memory at most once. As I said at the start, though this anomaly in costing probably doesn’t matter – there are no further steps to be taken after the update so there’s nothing the optimizer might do differently if the cost of the update had been calculated as 612 rather then 812.
Footnote:
If you want to play about with this query, here’s the code to create the tables – with one proviso, the plan above happens to be one I produced after rebuilding the indexes on t2 and t3 with pctfree 99
create table t1 as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 ) select rownum id, mod(rownum-1,100) mod100, trunc((rownum - 1)/100) trunc100, rownum n1, rownum n2, lpad(rownum,6,'0') vc1, rpad('x',100) padding from generator where rownum <= 10000 ; create table t2 as select * from t1; create table t3 as select * from t1; create index t1_i1 on t1(id); create index t2_i1 on t2(id,mod100); create index t3_i1 on t3(id,trunc100); begin dbms_stats.gather_table_stats(user,'t1'); dbms_stats.gather_table_stats(user,'t2'); dbms_stats.gather_table_stats(user,'t3'); end; /
Jonathan, could you describe, please, how scalar subqueries costed in 12c?
It was interesting for me to compare scalar subqueries costing in 12c, since they are accounted in the total cost.
I have compared your update with “select” with scalar subqueries:
And I have got strange results:
and I’ve got cost twice less than I expected.
I have tested also with one subquery only:
I don’t understand, why every subquery costed twice less than just multipication 3*101? Why 151.5 instead of 101*3?
You can see 10053 trace I got for query with 4 subqueries: http://orasql.org/files/xtsql_ora_5748_subq.trc
Comment by Sayan Malakshinov — May 2, 2014 @ 9:50 pm BST May 2,2014 |
Hello Jonathan,
I’m running Oracle 11.2.0.3.0 and I’m actually getting a cost much closer to what you would expect: 610.
Kind regards,
Arnaud
Comment by Arnaud Wallon — May 5, 2014 @ 1:36 pm BST May 5,2014 |
[…] is a little oddity with the cost of this plan (see also: https://jonathanlewis.wordpress.com/2014/05/02/costing-bug/) – the total cost of 610 seems to come from summing the cost of executing the two subqueries 101 […]
Pingback by Execution Plans part 5: First Child Variations – All Things Oracle — May 28, 2014 @ 3:26 pm BST May 28,2014 |