I came across a nice bit of code on OTN recently that someone had written several years ago (in 2007, in fact) to demonstrate the different ways in which the optimizer would handle “select count({something}) from table;”. If you want to copy and repeat the test code, you may need to adjust it slightly – it references a type vc2s, which I changed to dbms_stats.chararray, and it references a plan table called toad_plan_table, which I replaced with references to the standard plan_table (getting rid of the truncate as I did so).
The code simply executes a call to explain plan for different statements, then extracts the projection information for the “sort aggregate” line of the execution plan (examining the operation and options columns to do so). The thing that particularly surprised me was that there was a difference between the following two queries:
select count( 1) from t1; select count(-1) from t1;
According to the test, the first query is converted to count(*), but the second query really does count a load ’-1′s (whatever that means).
This didn’t seem entirely reasonable, but even though the code was over 5 years old it had been tested against 10gR2. I decided to repeat the test (using 11.2.0.3), and found that there was a slight flaw in the test because Oracle wasn’t quite telling the truth. (The implication of the test is still correct, by the way, because the two queries do, surprisingly, act differently). To highlight the anomaly, I’ve created a data set, and generated the full execution plan with projection for the two queries – and there’s an important detail in the data set that makes it easy to see the difference:
create table t1
as
select *
from all_objects
where rownum <= 20000 ; create bitmap index t1_b1 on t1(owner); begin dbms_stats.gather_table_stats( ownname => user,
tabname =>'T1',
estimate_percent => 100,
method_opt => 'for all columns size 1'
);
end;
/
Note, particularly that I’ve created a bitmap index on the owner column; with stats in place the fastest way of counting the rows in this table will be through a fast full scan on the (tiny) bitmap index. Here are the two plans – count(1) first, then count(-1):
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | BITMAP CONVERSION COUNT | | 20000 | 1 (0)| 00:00:01 | | 3 | BITMAP INDEX FAST FULL SCAN| T1_B1 | | | | ------------------------------------------------------------------------------- Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - (#keys=0) COUNT(*)[22] 2 - COUNT(*)[22] 3 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 7920] ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | BITMAP CONVERSION TO ROWIDS | | 20000 | 1 (0)| 00:00:01 | | 3 | BITMAP INDEX FAST FULL SCAN| T1_B1 | | | | ------------------------------------------------------------------------------- Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - (#keys=0) COUNT((-1))[22] 2 - "T1".ROWID[ROWID,10] 3 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 7920]
There are two things to notice here. For count(1) the optimizer has a special operation on bitmap indexes which is the bitmap conversion count – it’s basically counting set bits. For count(-1) the optimizer doesn’t recognise that the same strategy could be used, instead it does a load of arithmetic to convert bits to rowids (that’s why the projection in line 2 is t1.rowid[rowid,10]), so at this point the “-1″ simply doesn’t exist which means the sort aggregate in line 1 is actually counting rowids, even though it seems to be claiming that what it had counted was a load of ‘-1′s. The little lie at the end doesn’t really matter, of course – but I have occasionally seen variations on the bit in the middle (the unnecessary conversion to rowids) producing significant performance side effects in production code.

Jonathan, thank you for publishing this interesting bit of research!
You may consider this trivial, but just to clarify: Is there any good reason why one should not do count(*) but count(1) instead?
Count(*) should be just fine, I thought.
Kind regards
Uwe
Comment by Uwe Hesse — October 21, 2012 @ 4:01 pm UTC Oct 21,2012 |
Uwe,
Count(*) should be fine and probably gives Oracle the best possible chance of finding the cheapest strategy.
The only reason to count anything else would be if the thing you are counting includes nulls (which would be excluded from the count).
Comment by Jonathan Lewis — December 13, 2012 @ 11:17 am UTC Dec 13,2012 |
interesting !
autotrace also shows another plan for counting unique 1′s as for counting unique -1′s:
sokrates@11.2 > set autotrace on explain sokrates@11.2 > select count(unique +1) from t1; COUNT(UNIQUE+1) --------------- 1 Execution Plan ---------------------------------------------------------- Plan hash value: 1606705146 --------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | SORT GROUP BY | | 1 | | | | 2 | BITMAP CONVERSION TO ROWIDS | | 20000 | 1 (0)| 00:00:01 | | 3 | BITMAP INDEX FAST FULL SCAN| T1_BIDX | | | | --------------------------------------------------------------------------------- sokrates@11.2 > select count(unique -1) from t1; COUNT(UNIQUE-1) --------------- 1 Execution Plan ---------------------------------------------------------- Plan hash value: 3992192035 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 3 | 1 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | VIEW | VW_DAG_0 | 20000 | 60000 | 1 (0)| 00:00:01 | | 3 | SORT GROUP BY NOSORT | | 20000 | | 1 (0)| 00:00:01 | | 4 | BITMAP INDEX FAST FULL SCAN| T1_BIDX | | | | | -------------------------------------------------------------------------------------------Comment by Matthias Rogel — October 21, 2012 @ 6:46 pm UTC Oct 21,2012 |
Matthias,
Interesting – when you use dbms_xplan and look at the projection it’s the opposite to the original example. Count(distinct 1) projects the rowids, count(distinct -1) projects (-1′s)
I tend to use “distinct” rather than “unique”, by the way. I have an idea that “unique” in this context was deprecated by Oracle a few versions ago (ca. v6).
Comment by Jonathan Lewis — December 13, 2012 @ 11:23 am UTC Dec 13,2012 |
I have no intents to start a discussion on count(*) vs count(1). But I have observed that In case of Parent – Child relationship with FK not indexed, if you delete a record from Parent table, select count(1) query is executed in background to find if any child record exists.
Below is the test case, you will have to activate 10046 to see the query executed against child table. Tested on 11.2.0.1.0 and 10.2.0.5.0 on AIX.
create table parent(col1 number not null,col2 varchar2(100)); insert into parent select rownum rn,rpad('x',100,'x') from dual connect by level <=10000; alter table parent add constraint pk_parent primary key (col1); create table child(col1 number not null,col2 number not null,col3 varchar2(100)); insert into child select rownum rn, mod(rownum,10000)+1,rpad('x',100,'x') from dual connect by level <=100000; alter table child add constraint fk_child foreign key (col2) references parent(col1); alter session set timed_statistics = true; alter session set max_dump_file_size = unlimited; alter session set statistics_level=all; alter session set tracefile_identifier = '&1'; alter session set events '10046 trace name context forever, level &2'; delete from child where col2=10; delete from parent where col1=10; commit; disconnectRegards,
Vijay Sehgal.
Comment by vijaysehgal — November 5, 2012 @ 10:41 am UTC Nov 5,2012 |
Vijay,
Indeed, though when you pull the plan, with projection, from memory, you see the following:
PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID dh9mmzvc7h62y, child number 0 ------------------------------------- select /*+ all_rows */ count(1) from "TEST_USER"."CHILD" where "COL2" = :1 Plan hash value: 1546491375 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 212 (100)| | | 1 | SORT AGGREGATE | | 1 | 13 | | | |* 2 | TABLE ACCESS FULL| CHILD | 962 | 12506 | 212 (4)| 00:00:02 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("COL2"=:1) Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - (#keys=0) COUNT(*)[22]Note the count(*) that has actually been used.
That piece of code for the referential integrity check probably dates back to v6 or v7.
Comment by Jonathan Lewis — December 13, 2012 @ 11:32 am UTC Dec 13,2012 |