Oracle Scratchpad

February 13, 2018

Coalesce v. NVL

Filed under: CBO,Oracle — Jonathan Lewis @ 11:23 am GMT Feb 13,2018

“Modern” SQL should use the coalesce() function rather than the nvl() function – or so the story goes but do you always want to do that to an Oracle database? The answer is “maybe not”.

Although the coalesce() function can emulate the nvl() function (in many cases) there are significant differences in behaviour, some that suggest it’s a good idea to use the substitution and others that suggest otherwise. Different decisions may be appropriate for different circumstances and this note highlights one case against the substitution.

We’ll start with a simple data set:

rem
rem     Script:         nvl_coalesce_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Feb 2018
rem
rem     Last tested
rem             19.3.0.0
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

create table t1
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4    -- > comment to avoid wordpress format issue
)
select
        rownum                          id,
        case mod(rownum,4)
                when 0  then 'Y'
                        else 'N'
        end                             yes_no,
        case mod(rownum,5)
                when 0  then 'Y'
                when 1  then null
                        else 'N'
        end                             yes_null_no,
        lpad('x',100,'x')               padding
from
        generator
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1 for columns size 5 yes_no yes_null_no'
        );
end;
/

I’ve created a table with 10,000 rows and two columns with a highly skewed data distribution. Because I know that the skew is supposed to have a significant effect I’ve used a non-standard method_opt when gathering stats. (In a production system I would have used the packaged procedure dbms_stats.set_table_prefs() to fix this in place for the table.)

The difference between the yes_no and the yes_null_no columns is that the latter is null for a significant fraction of the rows.

  • yes_no has: 7,500 N, 2,500 Y
  • yes_null_no has: 6,000 N, 2,000 null, 2,000 Y

Let’s now try to count the “N or null” rows using two different functions and see what estimates the optimizer produces for the counts. First counting the yes_no column – using nvl() then coalesce()


set autotrace traceonly explain

select * from t1 where nvl(yes_no,'N') = 'N';
select * from t1 where coalesce(yes_no,'N') = 'N';

set autotrace off

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  7500 |   798K|    24   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  7500 |   798K|    24   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NVL("YES_NO",'N')='N')

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   100 | 10900 |    25   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   100 | 10900 |    25   (8)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(COALESCE("YES_NO",'N')='N')

The estimate for the nvl() is accurate; the estimate for the coalesce() query is 100 rows.

Let’s repeat the test using the yes_null_no column, again starting with nvl() followed by coalesce():


set autotrace traceonly explain

select * from t1 where nvl(yes_null_no,'N') = 'N';
select * from t1 where coalesce(yes_null_no,'N') = 'N';

set autotrace off

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  8000 |   851K|    24   (5)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  8000 |   851K|    24   (5)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NVL("YES_NULL_NO",'N')='N')

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   100 | 10900 |    25   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   100 | 10900 |    25   (8)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(COALESCE("YES_NULL_NO",'N')='N')

Again we get the right result for the nvl() estimate (8,000 = 6,000 N + 2,000 null) and 100 for the coalesce() estimate.

By now you’ve probably realised that the coalesce() estimate is simply the “1% guess for equality” that applies to most cases of function(column). So, as we saw in the previous post, coalesce() gives us the benefits of “short-circuiting” but now we see it also threatens us with damaged cardinality estimates. The latter is probably less important than the former in many cases (especially since we might ne able to address the problem very efficiently using virtual columns), but it’s probably worth remembering.

10 Comments »

  1. Yesterday I happened to look again at the old “where somecolumn = nvl(p_someparam,somecolumn)” optimisation, which gives a plan with filters for null and not null values for the variable, and it only works for NVL, not COALESCE.

    Comment by William Robertson — February 13, 2018 @ 3:21 pm GMT Feb 13,2018 | Reply

    • William,

      Thanks for the comment – it gives me an excuse to point back to the post I did on this just over 10 years ago.

      Worth doing
      a) because it’s always useful to rerun test cases after a change in version
      b) to highlight the fact that there’s a special case of decode() that does the same
      c) to note that the nvl2() function also “fails” to play clever games.
      d) to note that the “proper” predicate “n1 = :n1 or :n1 is null” doesn’t take the cunning plan – even in 12.2.0.1

      Comment by Jonathan Lewis — February 14, 2018 @ 11:15 am GMT Feb 14,2018 | Reply

  2. […] item is, by a roundabout route, a follow-up to yesterday’s note on a critical difference in cardinality estimates that appeared if you used the coalesce() function […]

    Pingback by Join Factorization | Oracle Scratchpad — February 14, 2018 @ 3:39 pm GMT Feb 14,2018 | Reply

  3. […] Coalesce v. NVL – Jonathan Lewis […]

    Pingback by Report query with optional parameters | Jeff Kemp on Oracle — February 15, 2018 @ 2:17 am GMT Feb 15,2018 | Reply

  4. […] that must have been hidden in the view to nvl(DFG,’N’) then Oracle would be able to “or expand” the nvl() and use a more appropriate selectivity for that part of the […]

    Pingback by Cardinality Puzzle | Oracle Scratchpad — July 12, 2018 @ 12:58 pm BST Jul 12,2018 | Reply

  5. […] Di più qui: jonathanlewis.wordpress.com/2018/02/13/coalesce-v-nvl […]

    Pingback by sql - Oracle Differenze tra NVL e Coalesce — November 30, 2018 @ 7:26 am GMT Nov 30,2018 | Reply

  6. […] Más aquí: jonathanlewis.wordpress.com/2018/02/13/coalesce-v-nvl […]

    Pingback by coalesce - Oracle Diferencias entre NVL y se Unen — July 27, 2019 @ 4:20 am BST Jul 27,2019 | Reply

  7. Did anybody ever report this to oracle as a simple bug?

    Comment by Anonymous — December 9, 2019 @ 11:04 am GMT Dec 9,2019 | Reply

    • Anonymous,
      I think it has appeared a few times because of the unexpected performance impact of calling a slow function to supply a value, but there’s a response document on MOS that points out that this just is the way it works. Then there’s nothing in the formal documentation to say that it shouldnt do otherwise.

      Comment by Jonathan Lewis — December 9, 2019 @ 1:42 pm GMT Dec 9,2019 | Reply

  8. […] More here: jonathanlewis.wordpress.com/2018/02/13/coalesce-v-nvl […]

    Pingback by Oracle Differences between NVL and Coalesce — December 17, 2022 @ 5:43 am GMT Dec 17,2022 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: