Here’s a little detail about how the optimizer can handle the nvl() function that I hadn’t noticed before (and it goes back to at least 8i). This is running on 11.2.0.3, and table t1 is just all_objects where rownum <= 20000:
SQL> select count(*) from t1 where nvl(subobject_name,'x') = 'x'; Execution Plan ---------------------------------------------------------- Plan hash value: 3724264953 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2 | 38 (6)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 2 | | | |* 2 | TABLE ACCESS FULL| T1 | 19812 | 39624 | 38 (6)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(NVL("SUBOBJECT_NAME",'x')='x') SQL> select count(*) from t1 where subobject_name is null; Execution Plan ---------------------------------------------------------- Plan hash value: 3724264953 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2 | 37 (3)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 2 | | | |* 2 | TABLE ACCESS FULL| T1 | 19812 | 39624 | 37 (3)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("SUBOBJECT_NAME" IS NULL)
The optimizer seems to have worked out that my nvl() predicate is basically equivalent to my column is null predicate!
In fact it’s a little more sophisticated than this (which is good, since the two predicates aren’t really equivalent), my nvl() predicate was using a value that I knew couldn’t be in the table and was so far outside the known range of values that its impact on the optimizer;s cardinality calculation was negligible. For a value inside the known range the cardinality was equivalent to the sum of the cardinalities of “column IS NULL” and “column = {constant}”. In other words, it’s the cardinality for the disjunct “column is null or column = {constant}”.
I suppose it’s not really surprising given the way the optimizer can deal with the over-used predicate “columnX = nvl(:bind, columnX)”
Jonathan,
very interesting. In the last years I frequently opted to use COALESCE or CASE instead of NVL for their benefit of short-circuit evaluation, but apparently the CBO prefers NVL – falling back to the standard 1 percent selectivity for functions for the other options:
Sometimes it would be nice if the answer to questions of feature usage was not: it depends …
Martin
Comment by Martin Preiss — August 13, 2013 @ 9:43 pm BST Aug 13,2013 |
Martin,
Thanks for the observation.
Interestingly if you use decode() you get the same effect as nvl(), though you have to be careful to do it right:
If you leave off the final reference to subobject_name you get the 1% again.
Comment by Jonathan Lewis — August 13, 2013 @ 10:10 pm BST Aug 13,2013 |
create table t2 as select * from all_objects where rownum <= 20000;
select (select count(*) from t2) – (select count(*) from t2 where subobject_name is not null) from dual;
Comment by Atul Gupta (@atulgg) — August 14, 2013 @ 9:02 am BST Aug 14,2013 |
Atul Gupta,
Could you explain what you want to demonstrate with this comment, please.
Comment by Jonathan Lewis — August 14, 2013 @ 9:44 am BST Aug 14,2013 |
So, if CBO calculate the cardinality with these predicates(_or_expand_nvl_predicate), it is a pity that, in contrast to the “nvl(:x, column) = column”, we can not get a concatenation in a plan like that:
Comment by Sayan Malakshinov — August 14, 2013 @ 10:55 am BST Aug 14,2013 |