Oracle Scratchpad

January 17, 2020

Group by Elimination

Filed under: 12c,18c,Bugs,Oracle — Jonathan Lewis @ 12:57 pm GMT Jan 17,2020

Here’s a bug that was highlighted a couple of days ago on the Oracle Developer Community forum; it may be particularly worth thinking about if you haven’t yet got up to Oracle 12c, as it comes from an optimizer feature that appeared in 12.2 and hasn’t been completely fixed even in the latest release of 19c (currently 19.6).

Oracle introduced “aggregate group by elimination” in 12.2, protected by the hidden parameter “_optimizer_aggr_groupby_elim”. The notes about the feature on MOS tell us that Oracle can eliminate a group by operation from a query block if a unique key from every table in the query block appears in the group by clause. Unfortunately there were a couple of gaps in the 12.2 implementation that can produce wrong results. Here’s some code to model the problem.

rem
rem     Script:         group_by_elim_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2020
rem

create table ref_clearing_calendar(
        calendar_name   char(17),
        business_date   date,
        update_ts       timestamp (6) default systimestamp,
        constraint pk_ref_clearing_calendar 
                        primary key (business_date)
)
/

insert into ref_clearing_calendar (business_date)
select
        sysdate + 10 * rownum
from 
        all_objects 
where 
        rownum <= 40 -- > comment to avoid wordpress format issue
/

commit;

execute dbms_stats.gather_table_stats(null,'ref_clearing_calendar',cascade=>true)

set autotrace on explain

select
        to_char(business_date,'YYYY') , count(*)
from
        ref_clearing_calendar
group by 
        to_char(business_date,'YYYY')
order by 
        to_char(business_date,'YYYY')
/

set autotrace off

I’ve created a table with a primary key on a date column, and then inserted 40 rows spaced every ten days from the current date; this ensures that I will have a few dates in each of two consecutive years (future-proofing the example!). Then I’ve aggregated to count the rows per year, using to_char({date column},'YYYY') to extract the year from the date. (Side note: the table definition doesn’t follow my normal pattern as the example started life in the ODC thread.)

If you run this query on Oracle 12.2 you will find that it returns 40 (non-unique) rows and displays the following execution plan:


---------------------------------------------------------------------------------------------
| Id  | Operation        | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                          |    40 |   320 |     2  (50)| 00:00:01 |
|   1 |  SORT ORDER BY   |                          |    40 |   320 |     2  (50)| 00:00:01 |
|   2 |   INDEX FULL SCAN| PK_REF_CLEARING_CALENDAR |    40 |   320 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

The optimizer has applied “aggregate group by elimination” because it hasn’t detected that the primary key column that appears in the group by clause has been massaged in a way that means the resulting value is no longer unique.
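
By way of contrast, if the primary key appears in the group by clause untouched then every group can hold at most one row and the elimination is safe. A sketch of such a query against the same table:

select
        business_date, count(*)
from
        ref_clearing_calendar
group by
        business_date
/

In that case every count(*) can only ever be 1, so dropping the aggregation step can't change the result.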

Fortunately this problem with to_char() is fixed in Oracle 18.1 where the query returns two rows using the following execution plan (which I’ve reported from an instance of 19.5):

---------------------------------------------------------------------------------------------
| Id  | Operation        | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                          |    40 |   320 |     2  (50)| 00:00:01 |
|   1 |  SORT GROUP BY   |                          |    40 |   320 |     2  (50)| 00:00:01 |
|   2 |   INDEX FULL SCAN| PK_REF_CLEARING_CALENDAR |    40 |   320 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Unfortunately there is still at least one gap in the implementation. Change the to_char(business_date,'YYYY') to extract(year from business_date) at all three points in the query, and even in 19.6 you’re back to the wrong results – inappropriate aggregate group by elimination and 40 rows returned.
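
For reference, here’s the modified statement; it is simply the original query with the function swapped at all three points:

select
        extract(year from business_date) , count(*)
from
        ref_clearing_calendar
group by 
        extract(year from business_date)
order by 
        extract(year from business_date)
/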

There are a couple of workarounds. One is to set the hidden parameter _optimizer_aggr_groupby_elim to false at the system or session level, or through an opt_param() hint at the statement level (possibly injected through an SQL_Patch). The other is to set a fix_control, again at the system, session, or statement level – but there seems to be little point in using the fix_control approach (which might be a little obscure for the next developer to read the code) when it appears to do the same thing as the explicitly named hidden parameter.

select
        /*+ opt_param('_optimizer_aggr_groupby_elim','false') */
        extract(year from business_date) , count(*)
from ...

select
        /*+ opt_param('_fix_control','23210039:0') */
        extract(year from business_date) , count(*)
from ...
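
If you want to disable the feature for a whole session rather than hint individual statements, the equivalent settings should look something like this (a sketch; the fix_control number is the one used in the hint above):

alter session set "_optimizer_aggr_groupby_elim" = false;

alter session set "_fix_control" = '23210039:0';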

One final thought about this “not quite fixed” bug. It’s the type of “oversight” error that gives you the feeling that there may be other special cases that have been overlooked. The key question would be: are there any other functions (and not necessarily datetime functions) that might be applied (perhaps implicitly) to a primary or unique key and produce duplicate results from distinct inputs – and, if so, has the code that checks the validity of eliminating the aggregate operation been written to notice the threat?

Footnote

The problem with extract() has been raised as a bug on MOS, but it was not public at the time of writing this note.

Update (about 60 seconds after publication)

Re-reading my comment about “other functions” it occurred to me that to_nchar() might, or might not, behave the same way as to_char() – so I tested it in 19c … and got the wrong results.
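
A sketch of the to_nchar() variant (assuming the same 'YYYY' format mask):

select
        to_nchar(business_date,'YYYY') , count(*)
from
        ref_clearing_calendar
group by 
        to_nchar(business_date,'YYYY')
order by 
        to_nchar(business_date,'YYYY')
/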

5 Comments »

  1. Hi Jonathan,

    I ran into that issue after upgrading one of my databases to 12.2.
    I applied a patch for Bug 26588069 – Wrong Result (Duplicate Rows Produced) Using Group By After Upgrading To 12.2.0.1 (Doc ID 26588069.8) https://support.oracle.com/rs?type=doc&id=26588069.8 to fix that.
    There is also the NO_ELIM_GROUPBY hint that can be used to disable this transformation:

    select  /*+ no_elim_groupby(@sel$1)*/
            to_char(business_date,'YYYY') , count(*)
    from
            ref_clearing_calendar
    group by 
            to_char(business_date,'YYYY')
    order by 
            to_char(business_date,'YYYY')
    /
    
    TO_C   COUNT(*)
    ---- ----------
    2020         68
    2021         12
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 3544265650
    
    ---------------------------------------------------------------------------------------------
    | Id  | Operation        | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT |                          |    80 |   640 |     2  (50)| 00:00:01 |
    |   1 |  SORT GROUP BY   |                          |    80 |   640 |     2  (50)| 00:00:01 |
    |   2 |   INDEX FULL SCAN| PK_REF_CLEARING_CALENDAR |    80 |   640 |     1   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------
    

    Regards,
    Mikhail.

    Comment by Mikhail Velikikh — January 17, 2020 @ 1:29 pm GMT Jan 17,2020

    • Mikhail,

      Thanks for the note, I knew there was a third option but I just couldn’t remember it.
      The proper hint is obviously a far better idea than the two opt_param() options I mentioned.

      You might like to try the variant that uses extract(year from business_date) , and the one that uses to_nchar().
      Both get the wrong results in 19.6 so it would be interesting to see if the patch you’ve got for 12.2 fixes them.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — January 17, 2020 @ 1:46 pm GMT Jan 17,2020

      • Jonathan,

        Indeed, both to_nchar() and extract(year from business_date) return wrong results in 12.2 even with the patch 26588069, which is disappointing since to_char() works fine.

        Initially I encountered this issue against a query that used the length() function, similar to the example below:

        create table gby_elim(x varchar2(10));
         
        insert into gby_elim values ('a');
         
        insert into gby_elim values ('b');
         
        select length(x), count(*)
          from gby_elim
         group by length(x);
        

        The last query produced two rows in 12.2 without the patch 26588069 applied.

        Apparently, Oracle has not fixed this issue with all functions.

        Regards,
        Mikhail.

        Comment by Mikhail Velikikh — January 17, 2020 @ 2:20 pm GMT Jan 17,2020

  2. Hi Jonathan,

    thanks for posting that. There is another, not yet fixed, bug I discovered yesterday, related to “_optimizer_aggr_groupby_elim” and possibly to _add_col_optim_enabled=TRUE.

    drop table test_tab;
    
    create table test_tab (
    id_person NUMBER(10) not null,
    upd_ts DATE default SYSDATE not null,
    status NUMBER(1) default 9 not null
    );
    
    alter table test_tab add ins_ts DATE default SYSDATE not null;
    alter table test_tab add ins_uid NUMBER(10) not null;
    
    select ap.ins_ts as upd_ts_sort
    from test_tab ap
    where ap.id_person = 5212
    and ap.status = 9
    group by ap.upd_ts; 
    
    

    You will get ORA-7445 [qecgoc2].

    After disabling group-by elimination, you should get the regular ORA-00979 instead.

    alter session set "_optimizer_aggr_groupby_elim"=false;
    

    Regards,
    Stanislav

    Comment by Stanislav Studený — January 17, 2020 @ 2:35 pm GMT Jan 17,2020

  3. Hello Jonathan,

    A very surprising finding …
    At the risk of sounding “cynical”, I would say that Oracle would do a better job by giving up these attempts to force such aggressive and dangerous optimizations by all means …

    For this specific case, it is kind of natural that a GROUP BY that can be eliminated correctly based on the uniqueness of the group by columns should most probably not have been there in the first place … so I would rather let the developer discover it as a logical problem and correct the code …
    It is unnatural to start hunting after each and every possible function, available now or in the future, that might return the same result for different arguments, and to try to avoid applying the various optimizations in each such case …

    It would be more constructive to fully document all the available hints, with full examples, and let developers use them more confidently and by their own decision, after fully understanding their behavior, in cases that prove hard for Oracle to optimize automatically.

    Thanks a lot & Best Regards,
    Iudith

    Comment by Iudith Mentzel — January 17, 2020 @ 3:02 pm GMT Jan 17,2020

