There’s a very old mantra about the use of the “having” clause that tells us that if it’s valid (i.e. will always give the same results) then any predicate that could be moved from the having clause to the where clause should be moved. In recent versions of Oracle the optimizer will do this for itself in some cases but (for reasons that I’m not going to mention) I came across a silly example recently where a little manual editing produced a massive performance improvement.
Here’s a quick demo:
rem rem Script: where_having.sql rem Author: Jonathan Lewis rem Dated: Oct 2018 rem Purpose: rem rem Last tested rem 19.3.0.0 (Apr 2020) rem 18.3.0.0 rem 12.2.0.1 rem 11.2.0.4 rem create table t1 as select * from all_objects where rownum <= 50000 -- > comment to avoid WordPress format issue ; spool where_having.lst set serveroutput off select /*+ gather_plan_statistics */ object_type, count(*) from t1 group by object_type having count(*) > 0 and 1 = 2 ; select * from table(dbms_xplan.display_cursor(null,null,'allstats last')) ;
The big question is: will Oracle do a full tablescan of t1, or will it apply a “null is not null” filter early to bypass that part of the plan. Here’s the plan pulled from memory, with run-time statistics (all versions from 11g to 18c):
-------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem | -------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.02 | 957 | 955 | | | | |* 1 | FILTER | | 1 | | 0 |00:00:00.02 | 957 | 955 | | | | | 2 | HASH GROUP BY | | 1 | 1 | 27 |00:00:00.02 | 957 | 955 | 1186K| 1186K| 1397K (0)| | 3 | TABLE ACCESS FULL| T1 | 1 | 50000 | 50000 |00:00:00.01 | 957 | 955 | | | | -------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter((COUNT(*)>0 AND 1=2))
As you can see, the filter at operation 1 includes the contradiction “1=2”, but Oracle tests this only after doing the full tablescan and aggregation. If you move the “1=2” into the where clause the tablescan doesn’t happen.
Interestingly, if you write the query with an in-line view and trailing where clause:
select /*+ gather_plan_statistics */ * from ( select object_type, count(*) from t1 group by object_type having count(*) > 0 ) where 1 = 2 ;
The optimizer is clever enough to push the final predicate inside the view (where you might expect it to become part of the having clause) and push it all the way down into a where clause on the base table.
----------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | |* 1 | FILTER | | 1 | | 0 |00:00:00.01 | | 2 | HASH GROUP BY | | 1 | 1 | 0 |00:00:00.01 | |* 3 | FILTER | | 1 | | 0 |00:00:00.01 | | 4 | TABLE ACCESS FULL| T1 | 0 | 50000 | 0 |00:00:00.01 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(COUNT(*)>0) 3 - filter(NULL IS NOT NULL)
A quirky case of the optimizer handling the (apparently) more complex query better than it does the simpler query. But there’s a rationale that says CVM (complex view merging has an existing code path and this is just using it, while the original count(*) is following a generic “aggregate” codepath and would need special handling for the case that is “obvious” to the human eye.
Update (April 2020)
No change in 19c (19.3.0.0)
Comments and related questions are welcome.