Oracle Scratchpad

August 6, 2012

Arithmetic

Filed under: Oracle — Jonathan Lewis @ 6:27 pm BST Aug 6,2012

Here’s an amusing little question that appeared on OTN a short while ago:

From the performance point of view, which of the following is better and why :

(1) select FILE_NAME, TABLESPACE_NAME, SUM(BYTES)/1024/1024/1024 from dba_data_files

(2)  select FILE_NAME, TABLESPACE_NAME, SUM(BYTES)/(1024*1024*1024) from dba_data_files

Assuming we add the necessary “group by” clause to the end of the queries, how could we find out if there is any difference (other than testing the queries on a system with a very large number of data files to see if we can spot a difference in the CPU usage caused the by change in the arithmetic expression ?

We could check to see if dbms_xplan gives us any clues – perhaps we would see some difference in the “projection” section of the output. (We don’t.)

How about checking the cpu_cost column from the plan table for the various steps of the two different plans – they may differ. (They dont’.)

But there probably is a difference (at least on 11.1.0.7, which is the version I happened to have in front of me at the time), and we can get a clue about it if we change the query a little. Try the following:

explain plan for
select 
	file_name, tablespace_name, sum(bytes)/(1024*1024*1024) 
from 
	dba_data_files 
group by 
	file_name, tablespace_name
having 
	sum(bytes)/(1024*1024*1024) > 0
;

select * from table(dbms_xplan.display);

Note that I’ve added a having clause that uses the same expression that I used in the select list. Here’s how it reappears in the predicate section of the output – Oracle has precalculated the result of the multiplication steps:

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(SUM("BYTES")/1073741824>0)

If we try the same trick with the alternative expresion, we get the following – Oracle hasn’t been able to do the same type of manipulation:

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(SUM("BYTES")/1024/1024/1024>0)

I wouldn’t take this as conclusive, of course, but it looks as if Oracle may have to perform three separate division operations for every single row returned in the second case but just one division per row in the first which might, therefore, use a little less CPU and be a little more efficient.

As a (slightly) corroborative piece of evidence, if you include both predicates in the having clause then the optimizer isn’t able to recognise that they are identical and eliminate one of them. Here’s the resulting predicate:

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(SUM("BYTES")/1073741824>0 AND SUM("BYTES")/1024/1024/1024>0)

4 Comments »

  1. And it turns out that my preferred expression “sum(bytes)/power(1024,3)” also gets pre-calculated and not eliminated (in 11.2.0.2):

    1 – filter(SUM(“BYTES”)/1073741824>0 AND SUM(“BYTES”)/1073741824>0)

    As a matter of fact, even exactly duplicated (before any transformation or pre-calculation) predicates survive. Must check whether they affect any cardinality estimates though.

    Comment by Flado — August 7, 2012 @ 7:22 am BST Aug 7,2012 | Reply

    • Flado,

      Do you have a couple of simple examples of duplicated preidcates surviving ? There have been various changes in the optimizer over the last few versions (10g onwards) to deal with different scenarios, but I hadn’t noticed any cases of duplicates surviving recently. Every version has to be checked in detail, of course, but as a general guideline (and ignoring predicates generated by RLS/FGAC in the very latest versions, I think) if a predicate appears twice it’s selectivity is applied twice.

      Contrary to your observation, I actually have an example in 11.2.0.3 where I would like Oracle to keep a duplicated predicate because that would allow it to use a very efficient execution path that is otherwise not available.

      Comment by Jonathan Lewis — August 7, 2012 @ 5:30 pm BST Aug 7,2012 | Reply

      • Sure – I only added (“having”) predicates to your example and worked on a materialised copy of dba_data_files to keep the plans short. I noticed some interesting things which – in hindsight – were only to be expected:
        1. duplicated predicates do affect the cardinality estimates when statistics are available.
        2. dynamic sampling saves the day: it samples with all the predicates and is therefore not affected by the redundancy or the function in the predicate.
        The version is 11.2.0.2.0 – 64bit on SLES 11.
        Here’s the script (two identical predicates and dynamic sampling):

        set echo off timi off
        drop table my_data_files purge;
        create table my_data_files as 
        select * from dba_data_files union all
        select * from dba_data_files union all
        select * from dba_data_files
        ;
        -- exec dbms_stats.gather_table_stats(null,'my_data_files')
        explain plan for
        select 
        	file_name, tablespace_name, sum(bytes)/(1024*1024*1024) 
        from 
        	my_data_files 
        group by 
        	file_name, tablespace_name
        having 
              sum(bytes)/(1024*1024*1024) > 0 
          and sum(bytes)/(1024*1024*1024) > 0 
        --  and sum(bytes)/power(1024,3) > 0
        ;
        
        select * from table(dbms_xplan.display);
        

        And the output:

        11:51:59 dbread@pd12m>@predicates
        
        Table dropped.
        
        
        Table created.
        
        
        Explained.
        
        
        PLAN_TABLE_OUTPUT
        ----------------------------------
        Plan hash value: 3621239192
        
        -------------------------------------------------------------------------------------
        | Id  | Operation           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
        -------------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT    |               |    66 | 19008 |     4  (25)| 00:00:01 |
        |*  1 |  FILTER             |               |       |       |            |          |
        |   2 |   HASH GROUP BY     |               |    66 | 19008 |     4  (25)| 00:00:01 |
        |   3 |    TABLE ACCESS FULL| MY_DATA_FILES |    66 | 19008 |     3   (0)| 00:00:01 |
        -------------------------------------------------------------------------------------
        
        Predicate Information (identified by operation id):
        ---------------------------------------------------
        
           1 - filter(SUM("BYTES")/1073741824>0 AND SUM("BYTES")/1073741824>0)
        
        Note
        -----
           - dynamic sampling used for this statement (level=2)
        
        19 rows selected.
        
        
        

        Hope this helps.

        Comment by Flado — August 8, 2012 @ 9:54 am BST Aug 8,2012 | Reply

        • Flado,

          Thanks for that – the impact of the HAVING clause is a detail I haven’t looked at for a very long time.
          I suspect the effects are version dependent (when, for example) can the optimizer push the having clause arithmetic down into the WHERE clause arithmetic).

          I’ve just done a couple of quick tests with a larger data set, and come up with two cases – one where the arithmetic is applied twice, and one where it’s applied once. (The predicate still appears twice in the execution plan output, though). I’ll have to write up the details some time.

          Comment by Jonathan Lewis — August 8, 2012 @ 10:31 am BST Aug 8,2012


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,257 other followers