Oracle Scratchpad

October 19, 2012

count(*)

Filed under: Execution plans,Oracle,Performance — Jonathan Lewis @ 6:48 pm BST Oct 19,2012

I came across a nice bit of code on OTN recently that someone had written several years ago (in 2007, in fact)  to demonstrate the different ways in which the optimizer would handle “select count({something}) from table;”. If you want to copy and repeat the test code, you may need to adjust it slightly – it references a type vc2s, which I changed to dbms_stats.chararray, and it references a plan table called toad_plan_table, which I replaced with references to the standard plan_table (getting rid of the truncate as I did so).

The code simply executes a call to explain plan for different statements, then extracts the projection information for the “sort aggregate” line of the execution plan (examining the operation and options columns to do so). The thing that particularly surprised me was that there was a difference between the following two queries:

select count( 1) from t1;
select count(-1) from t1;

According to the test, the first query is converted to count(*), but the second query really does count a load ‘-1’s (whatever that means).

This didn’t seem entirely reasonable, but even though the code was over 5 years old it had been tested against 10gR2. I decided to repeat the test (using 11.2.0.3), and found that there was a slight flaw in the test because Oracle wasn’t quite telling the truth. (The implication of the test is still correct, by the way, because the two queries do, surprisingly, act differently). To highlight the anomaly, I’ve created a data set, and generated the full execution plan with projection for the two queries – and there’s an important detail in the data set that makes it easy to see the difference:

create table t1
as
select  *
from    all_objects
where   rownum <= 20000 ;  create bitmap index t1_b1 on t1(owner);  begin         dbms_stats.gather_table_stats(                 ownname          => user,
                tabname          =>'T1',
                estimate_percent => 100,
                method_opt       => 'for all columns size 1'
        );
end;
/

Note, particularly that I’ve created a bitmap index on the owner column; with stats in place the fastest way of counting the rows in this table will be through a fast full scan on the (tiny) bitmap index. Here are the two plans – count(1) first, then count(-1):

-------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE               |       |     1 |            |          |
|   2 |   BITMAP CONVERSION COUNT     |       | 20000 |     1   (0)| 00:00:01 |
|   3 |    BITMAP INDEX FAST FULL SCAN| T1_B1 |       |            |          |
-------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) COUNT(*)[22]
   2 - COUNT(*)[22]
   3 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 7920]

-------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE               |       |     1 |            |          |
|   2 |   BITMAP CONVERSION TO ROWIDS |       | 20000 |     1   (0)| 00:00:01 |
|   3 |    BITMAP INDEX FAST FULL SCAN| T1_B1 |       |            |          |
-------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - (#keys=0) COUNT((-1))[22]
   2 - "T1".ROWID[ROWID,10]
   3 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 7920]

There are two things to notice here. For count(1) the optimizer has a special operation on bitmap indexes which is the bitmap conversion count – it’s basically counting set bits. For count(-1) the optimizer doesn’t recognise that the same strategy could be used, instead it does a load of arithmetic to convert bits to rowids (that’s why the projection in line 2 is t1.rowid[rowid,10]), so at this point the “-1″ simply doesn’t exist which means the sort aggregate in line 1 is actually counting rowids, even though it seems to be claiming that what it had counted was a load of ‘-1’s. The little lie at the end doesn’t really matter, of course – but I have occasionally seen variations on the bit in the middle (the unnecessary conversion to rowids) producing significant performance side effects in production code.

8 Comments »

  1. Jonathan, thank you for publishing this interesting bit of research!
    You may consider this trivial, but just to clarify: Is there any good reason why one should not do count(*) but count(1) instead?
    Count(*) should be just fine, I thought.

    Kind regards
    Uwe

    Comment by Uwe Hesse — October 21, 2012 @ 4:01 pm BST Oct 21,2012 | Reply

    • Uwe,

      Count(*) should be fine and probably gives Oracle the best possible chance of finding the cheapest strategy.
      The only reason to count anything else would be if the thing you are counting includes nulls (which would be excluded from the count).

      Comment by Jonathan Lewis — December 13, 2012 @ 11:17 am BST Dec 13,2012 | Reply

  2. interesting !

    autotrace also shows another plan for counting unique 1’s as for counting unique -1’s:

    sokrates@11.2 > set autotrace on explain
    sokrates@11.2 > select count(unique +1) from t1;
    
    COUNT(UNIQUE+1)
    ---------------
                  1
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1606705146
    
    ---------------------------------------------------------------------------------
    | Id  | Operation                     | Name    | Rows  | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |         |     1 |     1   (0)| 00:00:01 |
    |   1 |  SORT GROUP BY                |         |     1 |            |          |
    |   2 |   BITMAP CONVERSION TO ROWIDS |         | 20000 |     1   (0)| 00:00:01 |
    |   3 |    BITMAP INDEX FAST FULL SCAN| T1_BIDX |       |            |          |
    ---------------------------------------------------------------------------------
    
    sokrates@11.2 > select count(unique -1) from t1;
    
    COUNT(UNIQUE-1)
    ---------------
                  1
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3992192035
    
    -------------------------------------------------------------------------------------------
    | Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT               |          |     1 |     3 |     1   (0)| 00:00:01 |
    |   1 |  SORT AGGREGATE                |          |     1 |     3 |            |          |
    |   2 |   VIEW                         | VW_DAG_0 | 20000 | 60000 |     1   (0)| 00:00:01 |
    |   3 |    SORT GROUP BY NOSORT        |          | 20000 |       |     1   (0)| 00:00:01 |
    |   4 |     BITMAP INDEX FAST FULL SCAN| T1_BIDX  |       |       |            |          |
    -------------------------------------------------------------------------------------------
    

    Comment by Matthias Rogel — October 21, 2012 @ 6:46 pm BST Oct 21,2012 | Reply

    • Matthias,
      Interesting – when you use dbms_xplan and look at the projection it’s the opposite to the original example. Count(distinct 1) projects the rowids, count(distinct -1) projects (-1’s)

      I tend to use “distinct” rather than “unique”, by the way. I have an idea that “unique” in this context was deprecated by Oracle a few versions ago (ca. v6).

      Comment by Jonathan Lewis — December 13, 2012 @ 11:23 am BST Dec 13,2012 | Reply

      • I’ve just found on our russian forum another magic behaviour for negative numbers: http://bit.ly/P5E2e9

        -- bug:
        select x, sum(x) over (order by 1 range between 1/3 preceding and 1/3 following) s 
        from (select 1 x from dual union all select 2 from dual);
        
                 X          S
        ---------- ----------
                 1          3 
                 2   -6.8E+57 
        
        -- workaround #1:
        select x, sum(x) over (order by 1 range between 1/3 preceding and round(1/3,30) following) s 
        from (select 1 x from dual union all select 2 from dual);
        
                 X          S
        ---------- ----------
                 1          3 
                 2          3 
        
        -- workaround #2 - with negative numbers:
        SQL> select x, sum(x) over (order by -1 range between 1/3 preceding and 1/3 following) s
          2  from (select 1 x from dual union all select 2 from dual);
        
                 X          S
        ---------- ----------
                 1          3
                 2          3
        

        Comment by Sayan Malakshinov — March 6, 2014 @ 9:00 pm BST Mar 6,2014 | Reply

        • sokrates@12.1 > select x, sum(x) over (order by cast(null as number) range between 1/3 preceding and 1/3 following) s from (select 1 x from dual union all select 2 from dual)
            2  /
          
          	 X	    S
          ---------- ----------
          	 2	    3
          	 1	    3
          
          sokrates@12.1 > select x, 1e50*sum(x) over (order by 0 range between 1/3 preceding and 1/3 following) s from (select 1 x from dual union all select 2 from dual)
            2  /
          ERROR:
          ORA-03113: end-of-file on communication channel
          Process ID: 6691
          Session ID: 38 Serial number: 41
          
          
          
          no rows selected
          
          

          Comment by Matthias Rogel — March 6, 2014 @ 9:16 pm BST Mar 6,2014

  3. I have no intents to start a discussion on count(*) vs count(1). But I have observed that In case of Parent – Child relationship with FK not indexed, if you delete a record from Parent table, select count(1) query is executed in background to find if any child record exists.

    Below is the test case, you will have to activate 10046 to see the query executed against child table. Tested on 11.2.0.1.0 and 10.2.0.5.0 on AIX.

    create table parent(col1 number not null,col2 varchar2(100));
    
    insert into parent select rownum rn,rpad('x',100,'x') from dual connect by level <=10000;
    
    alter table parent add constraint pk_parent primary key (col1);
    
    create table child(col1 number not null,col2 number not null,col3 varchar2(100));
    
    insert into child select rownum rn, mod(rownum,10000)+1,rpad('x',100,'x') from dual connect by level <=100000;
    
    alter table child add constraint fk_child foreign key (col2) references parent(col1);
    
    alter session set timed_statistics = true;
    
    alter session set max_dump_file_size = unlimited;
    
    alter session set statistics_level=all;
    
    alter session set tracefile_identifier = '&1';
    
    alter session set events '10046 trace name context forever, level &2';
    
    delete from child where col2=10;
    
    delete from parent where col1=10;
    
    commit;
    
    disconnect
    
    

    Regards,
    Vijay Sehgal.

    Comment by vijaysehgal — November 5, 2012 @ 10:41 am BST Nov 5,2012 | Reply

    • Vijay,

      Indeed, though when you pull the plan, with projection, from memory, you see the following:

      
      
      PLAN_TABLE_OUTPUT
      ------------------------------------------------------------------------------------------------------------------------
      SQL_ID  dh9mmzvc7h62y, child number 0
      -------------------------------------
       select /*+ all_rows */ count(1) from "TEST_USER"."CHILD" where "COL2"
      = :1
      
      Plan hash value: 1546491375
      
      ----------------------------------------------------------------------------
      | Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
      ----------------------------------------------------------------------------
      |   0 | SELECT STATEMENT   |       |       |       |   212 (100)|          |
      |   1 |  SORT AGGREGATE    |       |     1 |    13 |            |          |
      |*  2 |   TABLE ACCESS FULL| CHILD |   962 | 12506 |   212   (4)| 00:00:02 |
      ----------------------------------------------------------------------------
      
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      
         2 - filter("COL2"=:1)
      
      Column Projection Information (identified by operation id):
      -----------------------------------------------------------
      
         1 - (#keys=0) COUNT(*)[22]
      
      
      

      Note the count(*) that has actually been used.
      That piece of code for the referential integrity check probably dates back to v6 or v7.

      Comment by Jonathan Lewis — December 13, 2012 @ 11:32 am BST Dec 13,2012 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,990 other followers