Oracle Scratchpad

July 15, 2020

Fetch First vs. Rownum

Filed under: Execution plans,Oracle,Partitioning,Performance,Problem Solving,Tuning — Jonathan Lewis @ 10:11 am BST Jul 15,2020

I’ve pointed out fairly frequently that if you’re running Standard Edition but would like to take advantage of a few features of the Partitioning option then you might be able to do something appropriate with Partition Views (but I’ve just discovered while searching my blog for a suitable item to link to that I haven’t published any of my PV notes on the blog).

I’ve also pointed out that while 12c allows you to use “fetch first N rows” instead of “where rownum <= N” there’s a hidden threat to using the feature because “fetch first N” turns into a hidden row_number() over() analytic function.

Today’s note is a combination of these two topics, prompted by a request to solve a particular coding issue that has appeared a few times on the Oracle Developer Forum and is likely to be recognisable to a number of people.

I’ll start with a very simple model demonstrateing the simplest use of “fetch first N”:


rem
rem     Script:         fetch_first_union.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem

create table t_odd
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        'O'                             flag,
        chr(65 + mod(rownum,26))        class,
        2 * rownum - 1                  id,
        lpad(2 * rownum,10,'0')         v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4   -- > comment to avoid WordPress format issue
;

alter table t_odd modify(flag not null, class not null, id not null);
alter table t_odd add constraint to_chk_odd check (flag = 'O');

create index to_i1 on t_odd(class, id);

With this data set I want to write a query that selects rows for class A where id > 9500, ordered by id – but I only want the first two rows. Here’s a very simple query that gets the result I want, followed by the execution plan from 12.2.0.1 (the A-Rows and E-Rows from 19.3 are slightly different):

set serveroutput off
set linesize 180

alter session set statistics_level = all;

select  /*+ index(t_odd (class, id)) */
        *
from
        t_odd
where
        class = 'A'
and     id > 9500
order by
        class, id
fetch 
        first 2 rows only
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost'));

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |      1 |        |    95 (100)|      2 |00:00:00.01 |       6 |
|*  1 |  VIEW                         |       |      1 |      2 |    95   (0)|      2 |00:00:00.01 |       6 |
|*  2 |   WINDOW NOSORT STOPKEY       |       |      1 |    202 |    95   (0)|      2 |00:00:00.01 |       6 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T_ODD |      1 |    202 |    95   (0)|      3 |00:00:00.01 |       6 |
|*  4 |     INDEX RANGE SCAN          | TO_I1 |      1 |    202 |     2   (0)|      3 |00:00:00.01 |       3 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "T_ODD"."CLASS","T_ODD"."ID")<=2)
   4 - access("CLASS"='A' AND "ID">9500)

The plan – with its stats – shows us that we’ve done an index range scan of an index which will return the data in exactly the order we want, and the “fetch first 2 rows” has been translated into the row_number() over() that we expected; but to our great joy the “window sort stopkey” makes the processing stop very early because Oracle recognises that the base data is arriving in the right order so it isn’t necessary to fetch all of it and sort it. The A-Rows column confirms this interpretation of what has happened.

You might notice, by the way, that the optimizer has costed the query as if it were fetching all the rows even though it “knows” that it’s going to fetch only the first two rows. That’s why I had to include the index hint to make the optimizer use the obvious index – a popular alternative is to use the /*+ first_rows(N) */ hint where N matches, or is similar to, the number of rows you want to fetch. If I had omitted the hint the optimizer would have done a full tablescan and then applied a “window sort pushed rank” operation to sort and limit the result to 2 rows.

So now we come to the real problem: the user has a “current” table and an identical “history” table, and would like to replace the table reference with a reference to a union all view for their clients on Standard Edition, or to a partitioned table for clients running Enterprise Edition – and they don’t really want to do any other code changes. So let’s see what happens when we model the union all. I started with a table called t_odd that held only odd values for id, so I’m going to add a table called t_even that holds only even values for id.

create table t_even
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        'E'                             flag,
        chr(65 + mod(rownum,26))        class,
        2 * rownum                      id,
        lpad(2 * rownum,10,'0')         v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4   -- > comment to avoid WordPress format issue
;

alter table t_even modify(flag not null, class not null, id not null);
alter table t_even add constraint te_chk_even check (flag = 'E');

create index te_i1 on t_even(class, id);

create or replace view v_bare 
as
select * from t_odd
union all
select * from t_even
/

select
        /*+ 
                index(vw.t_odd  (class, id)) 
                index(vw.t_even (class, id)) 
        */
        *
from
        v_bare vw
where
        class = 'A'
and     id > 9500
order by
        class, id
fetch 
        first 2 rows only
;

As you can see t_even is an identically structured table with similar data, and I’ve created a union all view on top of the two tables, changing the query to reference the view rather than referencing a table. Thanks to the costing anomaly (combined with the small size of the tables) I’ve had to supply a couple of “global” hints to tell the optimizer to use the indexes to access the two tables. So how well does the optimizer do its job when we have a union all view?


----------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |        |      1 |        |   192 (100)|      2 |00:00:00.01 |     190 |       |       |          |
|*  1 |  VIEW                                   |        |      1 |      2 |   192   (2)|      2 |00:00:00.01 |     190 |       |       |          |
|*  2 |   WINDOW SORT PUSHED RANK               |        |      1 |    404 |   192   (2)|      2 |00:00:00.01 |     190 |  2048 |  2048 | 2048  (0)|
|   3 |    VIEW                                 | V_BARE |      1 |    404 |   191   (1)|    404 |00:00:00.01 |     190 |       |       |          |
|   4 |     UNION-ALL                           |        |      1 |        |            |    404 |00:00:00.01 |     190 |       |       |          |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| T_ODD  |      1 |    202 |    95   (0)|    202 |00:00:00.01 |      95 |       |       |          |
|*  6 |       INDEX RANGE SCAN                  | TO_I1  |      1 |    202 |     2   (0)|    202 |00:00:00.01 |       2 |       |       |          |
|   7 |      TABLE ACCESS BY INDEX ROWID BATCHED| T_EVEN |      1 |    202 |    95   (0)|    202 |00:00:00.01 |      95 |       |       |          |
|*  8 |       INDEX RANGE SCAN                  | TE_I1  |      1 |    202 |     2   (0)|    202 |00:00:00.01 |       2 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "VW"."CLASS","VW"."ID")<=2)
   6 - access("CLASS"='A' AND "ID">9500)
   8 - access("CLASS"='A' AND "ID">9500)

Answer: Bad luck, the optimizer isn’t smart enought to find a cheap way through this query. It’s fetched all the relevant data from the two tables before applying the window sort (which it does with some efficiency – the pushed rank) to produce the right answer. As you can see from the A-Rows column, though, it’s had to acquire a couple of hundred rows from each table before getting down to the 2 rows we wanted.

Partitioned Tables

So let’s try to solve the problem by buying into the partitioning option and creating a list-partitioned table with two partitions, one flagged for current data and one flagged for history data – or ‘O’dd and ‘E’ven data as I’ve created them in my model.



create table t_pt (
        flag,
        class,
        id,
        v1,
        padding
)
partition by list (flag) (
        partition pO values('O'),
        partition pE values('E')
)
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        'O'                             flag,
        chr(65 + mod(rownum,26))        class,
        2 * rownum - 1                  id,
        lpad(2 * rownum,10,'0')         v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4   -- > comment to avoid WordPress format issue
;

insert into t_pt
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        'E'                             flag,
        chr(65 + mod(rownum,26))        class,
        2 * rownum                      id,
        lpad(2 * rownum,10,'0')         v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4   -- > comment to avoid WordPress format issue
;


create index tp_i1 on t_pt(class, id) local;
alter table t_pt modify (flag not null, class not null, id not null);

execute dbms_stats.gather_table_stats(user,'t_pt',method_opt=>'for all columns size 1', cascade=>true, granularity=>'ALL')

Note particularly that I have created a local index on this partitioned table – so there’s a very close correspondance between the two tables in the previous example and the two partitions in this example. Here’s the plan when I query the partitioned table for the first two rows:


select  /*+ index(t_pt (class, id)) */
        *
from
        t_pt
where
        class = 'A'
and     id > 9500
order by
        class, id
fetch 
        first 2 rows only
;


--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |       |      1 |        |   190 (100)|      2 |00:00:00.01 |     189 |       |       |          |
|*  1 |  VIEW                                        |       |      1 |      2 |   190   (2)|      2 |00:00:00.01 |     189 |       |       |          |
|*  2 |   WINDOW SORT PUSHED RANK                    |       |      1 |    404 |   190   (2)|      2 |00:00:00.01 |     189 |  2048 |  2048 | 2048  (0)|
|   3 |    PARTITION LIST ALL                        |       |      1 |    404 |   189   (1)|    404 |00:00:00.01 |     189 |       |       |          |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T_PT  |      2 |    404 |   189   (1)|    404 |00:00:00.01 |     189 |       |       |          |
|*  5 |      INDEX RANGE SCAN                        | TP_I1 |      2 |    404 |     4   (0)|    404 |00:00:00.01 |       4 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "T_PT"."CLASS","T_PT"."ID")<=2)
   5 - access("CLASS"='A' AND "ID">9500)

The optimizer has let us down again. The plan shows us that we have to acquire all the relevant data from the two partitions before applying the row_number() analytic function and discarding all but the two rows we want. (Again we can check the A-Rows column to see that we have started by fetching a total of 404 rows from the table.)

But what happens if we fall back to the good old-fashioned (non-standard) rownum method:


select
        *
from    (
        select  /*+ index(t_pt (class, id)) */
                *
        from
                t_pt
        where
                class = 'A'
        and     id > 9500
        order by
                class, id
        )
where
        rownum <= 2
;

----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |       |      1 |        |   190 (100)|      2 |00:00:00.01 |       6 |       |       |          |
|*  1 |  COUNT STOPKEY                                 |       |      1 |        |            |      2 |00:00:00.01 |       6 |       |       |          |
|   2 |   VIEW                                         |       |      1 |    404 |   190   (2)|      2 |00:00:00.01 |       6 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY                       |       |      1 |    404 |   190   (2)|      2 |00:00:00.01 |       6 |  2048 |  2048 | 2048  (0)|
|   4 |     PARTITION LIST ALL                         |       |      1 |    404 |   189   (1)|      4 |00:00:00.01 |       6 |       |       |          |
|*  5 |      COUNT STOPKEY                             |       |      2 |        |            |      4 |00:00:00.01 |       6 |       |       |          |
|   6 |       TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T_PT  |      2 |    404 |   189   (1)|      4 |00:00:00.01 |       6 |       |       |          |
|*  7 |        INDEX RANGE SCAN                        | TP_I1 |      2 |    404 |     4   (0)|      4 |00:00:00.01 |       4 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=2)
   3 - filter(ROWNUM<=2)
   5 - filter(ROWNUM<=2)
   7 - access("CLASS"='A' AND "ID">9500)

Oracle really knows how to use rownum well – notice how there is a count stopkey operation as a child to the partition list all operation, and that’s where our rownum <= 2 predicate is first applied. For each partition Oracle finds “the first two rows” and after it has collected two rows from every partition it sorts them (again with a stopkey) to find the top two in that subset. Check the A-Rows column – we selected a total of 4 rows from the table (2 per partition) and reduced that to 2 rows at operation 3.

Conclusion

There are some patterns of processing where partitioned tables can be a lot friendlier to coders than partition views; if you do have to stick with Standard Edition you can usually get what you want but the coding investment may be significantly higher. Even with partitioned tables, though, there are some “old-fashioned” Oracle methods that do a much nicer job than some of the new-fangled “ANSI” mechanisms.

Footnote

Part of the problem presented here revolves around the desire to keep a pattern of SQL generation that already exists, doing nothing more than replacing a table name with a view (or partitioned table) name.

As we’ve seen, if you start with a simple heap table and try to replace it with a partitioned table you have to use the rownum mechanism rather than the fetch first N rows mechanism.

If you’re running with Standard Edition you can’t do anything simple to replace a table name with the name of a union all view; you’d have to change your code generator to apply all the predicates twice (once for each table) and then apply the rownum predicate or fetch first directive again outside the union all. In other words you have to emulate exactly what Oracle EE manages to do with partitioned tables and rownum.

select
        flag, class, id, v1
from
        (
        select
                /*+ index(t_odd (class, id)) */
                flag, class, id, v1
        from
                t_odd
        where
                class = 'A'
        and     id > 9500
        order by
                class, id
        fetch
                first 2 rows only
        )
        union all
        (
        select
                /*+ index(t_even (class, id)) */
                flag, class, id, v1
        from
                t_even
        where
                class = 'A'
        and     id > 9500
        order by
                class, id
        fetch
                first 2 rows only
        )
order by
        class, id
fetch 
        first 2 rows only
;

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name   | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |        |      1 |        |   192 (100)|      2 |00:00:00.01 |       8 |       |       |          |
|*  1 |  VIEW                              |        |      1 |      2 |   192   (2)|      2 |00:00:00.01 |       8 |       |       |          |
|*  2 |   WINDOW SORT PUSHED RANK          |        |      1 |      4 |   192   (2)|      2 |00:00:00.01 |       8 |  2048 |  2048 | 2048  (0)|
|   3 |    VIEW                            |        |      1 |      4 |   191   (1)|      4 |00:00:00.01 |       8 |       |       |          |
|   4 |     UNION-ALL                      |        |      1 |        |            |      4 |00:00:00.01 |       8 |       |       |          |
|   5 |      VIEW                          |        |      1 |      2 |    95   (0)|      2 |00:00:00.01 |       4 |       |       |          |
|*  6 |       VIEW                         |        |      1 |      2 |    95   (0)|      2 |00:00:00.01 |       4 |       |       |          |
|*  7 |        WINDOW NOSORT STOPKEY       |        |      1 |    202 |    95   (0)|      2 |00:00:00.01 |       4 | 73728 | 73728 |          |
|   8 |         TABLE ACCESS BY INDEX ROWID| T_ODD  |      1 |    202 |    95   (0)|      3 |00:00:00.01 |       4 |       |       |          |
|*  9 |          INDEX RANGE SCAN          | TO_I1  |      1 |    202 |     2   (0)|      3 |00:00:00.01 |       2 |       |       |          |
|* 10 |      VIEW                          |        |      1 |      2 |    95   (0)|      2 |00:00:00.01 |       4 |       |       |          |
|* 11 |       WINDOW NOSORT STOPKEY        |        |      1 |    202 |    95   (0)|      2 |00:00:00.01 |       4 | 73728 | 73728 |          |
|  12 |        TABLE ACCESS BY INDEX ROWID | T_EVEN |      1 |    202 |    95   (0)|      3 |00:00:00.01 |       4 |       |       |          |
|* 13 |         INDEX RANGE SCAN           | TE_I1  |      1 |    202 |     2   (0)|      3 |00:00:00.01 |       2 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_007"."rowlimit_$$_rownumber"<=2)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "from$_subquery$_006"."CLASS","from$_subquery$_006"."ID")<=2)
   6 - filter("from$_subquery$_003"."rowlimit_$$_rownumber"<=2)
   7 - filter(ROW_NUMBER() OVER ( ORDER BY "CLASS","ID")<=2)
   9 - access("CLASS"='A' AND "ID">9500)
  10 - filter("from$_subquery$_005"."rowlimit_$$_rownumber"<=2)
  11 - filter(ROW_NUMBER() OVER ( ORDER BY "CLASS","ID")<=2)
  13 - access("CLASS"='A' AND "ID">9500)


As you can see, the E-Rows still predicts a lot of work, but the A-Rows tells us the work was kept to the minimum we want.

 

6 Comments »

  1. Hi Jonathan,
    i’ve tried your first script on three environments, on 19.3 Enterprise edition, 18.3 enterprise edition on our internal servers and on a RDS instance 19.5 standard edition but in all three cases i’ve this:
    select * from table(dbms_xplan.display_cursor(null,null,’allstats last cost’));

    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SQL_ID  374taghwrx3aa, child number 0
    -------------------------------------
    select  /*+ index(t_odd (class, id)) */         * from         t_odd
    where         class = 'A' and     id > 9500 order by         class, id
    fetch          first 2 rows only
    
    Plan hash value: 1238738181
    
    ----------------------------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                             | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
    ----------------------------------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                      |       |      1 |        |    96 (100)|      2 |00:00:00.01 |      95 |      1 |       |       |          |
    |*  1 |  VIEW                                 |       |      1 |      2 |    96   (2)|      2 |00:00:00.01 |      95 |      1 |       |       |          |
    |*  2 |   WINDOW SORT PUSHED RANK             |       |      1 |    202 |    96   (2)|      2 |00:00:00.01 |      95 |      1 |  2048 |  2048 | 2048  (0)|
    |   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T_ODD |      1 |    202 |    95   (0)|    202 |00:00:00.01 |      95 |      1 |       |       |          |
    |*  4 |     INDEX RANGE SCAN                  | TO_I1 |      1 |    202 |     2   (0)|    202 |00:00:00.01 |       2 |      1 |       |       |          |
    ----------------------------------------------------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)   -- > edit to avoid format issue
       2 - filter(ROW_NUMBER() OVER ( ORDER BY "T_ODD"."CLASS","T_ODD"."ID")<=2)  -- > edit to avoid format issue
       4 - access("CLASS"='A' AND "ID">9500)
    

    A-Rows are always 202 and not 3 as in your test. Am i missing something?

    Comment by Cristian Cudizio — July 21, 2020 @ 2:31 pm BST Jul 21,2020 | Reply

    • Christian,

      Thanks for raising the problem.

      I don’t know why you’re seeing that result but there is one tiny clue. The E-rows column varies with version for this query – for 12.1.0.2 and 12.2.0.1 the E-rows column reports 202 rows for operations 2, 3 and 4. In 19.3 it’s only operation 4 that reports E-rows = 202. Also, in 19.3 with optimizer_features_enable = 18.1.0 it’s operations 2,3 and 4.

      This may be a hint that you’ve got some parameter settings that are pushing the optimizer behaviour backwards to an earlier version than the headline version. Perhaps adding the ‘outline’ format to the call to dbms_xplan will give us a clue.

      In passing (and I’ll update the article to clarify the source of the plan I’ve shown) the A-rows also varies: the plan I’ve shown is from 12.2.0.1 with A-Rows = 3 for operations 3 and 4. In 19.3 I get A-rows = 2 for operations 3 and 4.

      Regards
      Jonathan Lewis

      Update: I’ve just run the starting SQL on LiveSQL (19.5) – had to mess about a bit to get the SQL_ID – and it gives the plan I’ve shown with A-rows = 2 for all operations.

      Comment by Jonathan Lewis — July 21, 2020 @ 5:25 pm BST Jul 21,2020 | Reply

  2. Thank you Jonathan! it seems to be NLS_LANGUAGE settting. With ITALIAN i got:

    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SQL_ID  374taghwrx3aa, child number 1
    -------------------------------------
    select  /*+ index(t_odd (class, id)) */         * from         t_odd
    where         class = 'A' and     id > 9500 order by         class, id
    fetch          first 2 rows only
    
    Plan hash value: 1238738181
    
    -------------------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                             | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
    -------------------------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                      |       |      1 |        |    96 (100)|      2 |00:00:00.01 |      95 |       |       |          |
    |*  1 |  VIEW                                 |       |      1 |      2 |    96   (2)|      2 |00:00:00.01 |      95 |       |       |          |
    |*  2 |   WINDOW SORT PUSHED RANK             |       |      1 |    202 |    96   (2)|      2 |00:00:00.01 |      95 |  2048 |  2048 | 2048  (0)|
    |   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T_ODD |      1 |    202 |    95   (0)|    202 |00:00:00.01 |      95 |       |       |          |
    |*  4 |     INDEX RANGE SCAN                  | TO_I1 |      1 |    202 |     2   (0)|    202 |00:00:00.01 |       2 |       |       |          |
    -------------------------------------------------------------------------------------------------------------------------------------------------
    
    Outline Data
    -------------
    
      /*+
          BEGIN_OUTLINE_DATA
          IGNORE_OPTIM_EMBEDDED_HINTS
          OPTIMIZER_FEATURES_ENABLE('19.1.0')
          DB_VERSION('19.1.0')
          ALL_ROWS
          OUTLINE_LEAF(@"SEL$1")
          OUTLINE_LEAF(@"SEL$2")
          NO_ACCESS(@"SEL$2" "from$_subquery$_002"@"SEL$2")
          INDEX_RS_ASC(@"SEL$1" "T_ODD"@"SEL$1" ("T_ODD"."CLASS" "T_ODD"."ID"))
          BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$1" "T_ODD"@"SEL$1")
          END_OUTLINE_DATA
      */
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)   -- > edit to avoid format issue
       2 - filter(ROW_NUMBER() OVER ( ORDER BY "T_ODD"."CLASS","T_ODD"."ID")<=2)   -- > edit to avoid format issue
       4 - access("CLASS"='A' AND "ID">9500)
    

    With NLS_LANGUAGE ENGLISH i got:

    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SQL_ID  374taghwrx3aa, child number 3
    -------------------------------------
    select  /*+ index(t_odd (class, id)) */         * from         t_odd
    where         class = 'A' and     id > 9500 order by         class, id
    fetch          first 2 rows only
    
    Plan hash value: 3023014569
    
    --------------------------------------------------------------------------------------------------------------
    | Id  | Operation                     | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
    --------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT              |       |      1 |        |     3 (100)|      2 |00:00:00.01 |       3 |
    |*  1 |  VIEW                         |       |      1 |      2 |     3   (0)|      2 |00:00:00.01 |       3 |
    |*  2 |   WINDOW NOSORT STOPKEY       |       |      1 |      2 |     3   (0)|      2 |00:00:00.01 |       3 |
    |   3 |    TABLE ACCESS BY INDEX ROWID| T_ODD |      1 |      2 |     3   (0)|      2 |00:00:00.01 |       3 |
    |*  4 |     INDEX RANGE SCAN          | TO_I1 |      1 |    202 |     2   (0)|      2 |00:00:00.01 |       2 |
    --------------------------------------------------------------------------------------------------------------
    
    Outline Data
    -------------
    
      /*+
          BEGIN_OUTLINE_DATA
          IGNORE_OPTIM_EMBEDDED_HINTS
          OPTIMIZER_FEATURES_ENABLE('19.1.0')
          DB_VERSION('19.1.0')
          ALL_ROWS
          OUTLINE_LEAF(@"SEL$1")
          OUTLINE_LEAF(@"SEL$2")
          NO_ACCESS(@"SEL$2" "from$_subquery$_002"@"SEL$2")
          INDEX_RS_ASC(@"SEL$1" "T_ODD"@"SEL$1" ("T_ODD"."CLASS" "T_ODD"."ID"))
          END_OUTLINE_DATA
      */
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=2)  -- > edit to avoid format issue
       2 - filter(ROW_NUMBER() OVER ( ORDER BY "T_ODD"."CLASS","T_ODD"."ID")<=2)  -- > edit to avoid format issue
       4 - access("CLASS"='A' AND "ID">9500)
    

    Comment by Cristian Cudizio — July 21, 2020 @ 9:56 pm BST Jul 21,2020 | Reply

  3. Criistian,

    Thanks for that.

    I’ve just done a little test on 19.3. When I change the nls_language to ITALIAN I get the same result as you – all 202 rows accessed and sorted. But when I checked the NLS_SORT parameter it, too, had changed (as had a couple of other NLS parameters). The NLS_SORT by default was BINARY but had changed to WEST_EUROPEAN; when I changed the nls_sort back to BINARY the results went back to the original 19.3 results – i.e. only 2 rows fetched and “sorted”.

    Binary is the default nls_sort for the index, so when the “order by” is (implicitly) west_european the index can’t be used to optimise the sort, hence the behaviour.

    Regards
    Jonathan Lewis

    Comment by Jonathan Lewis — July 22, 2020 @ 8:41 am BST Jul 22,2020 | Reply

  4. I can confirm that NLS_SORT is indeed the cause, but it is not as logical as it may seem.
    The first index column, class must match exact value ‘A’, and the sorted column, ID, is of type number, which sort order is not depending on linguistics..
    Removing class from the order by does not solve the issue.

    Comment by Dirk Vanhaute — July 24, 2020 @ 4:12 pm BST Jul 24,2020 | Reply

    • Dirk,

      Thanks for the follow-up.
      It’s a warning about how messy the whole NLS thing can get if you don’t work it all out at the very start.

      There’s a third parameter that (for this example, I think) completes the picture: NLS_COMP.
      When NLS_LANGUAGE changes to Italian the NLS_SORT changes to WEST_EUROPEAN, but the NLS_COMP stays at BINARY.

      So the predicate:

      class = 'A' and id > 9500
      

      is using a BINARY comparison to compare the column with the literal, which means it can use the index; then it passes class up to the table access, which passes class up to the window operation. We can see this if we add the ‘projection’ option to the call to dbms_xplan.display_cursor()

      But the window operation is sorting, so it needs the value nlssort(class,’nls_sort=Italian’) – which is the result of a function call and by the time the optimizer allows for sorting by the result of a function call it’s “forgotten” that the input to the function was the same for all rows – so it introduces a real sort. (Again we can see this from the projection information – and I don’t know why it’s not nlssort(class,’nls_sort=west_european’) at that point.

      If we set the nls_language to Italian, leave the nls_sort at West_European, but changes the nls_comp to linguistic then changes the class predicate to:

      NLSSORT("CLASS",'nls_sort=''WEST_EUROPEAN''')=HEXTORAW('14000100'))
      

      which means the index can’t be used and we do a full tablescan. We still project just class, and still to a Window Sort. But if we now define the index as:

      create index to_i1 on t_odd(NLSSORT("CLASS",'nls_sort=WEST_EUROPEAN'),id);
      

      then the optimizer does an index range scan picks up only two rows, uses a Window Nosort Stopkey … and then sorts the resulting 2 rows for the “sort order by”.
      So, finally, we change the “order by” clause to:

      order by nlssort(class,'nls_sort=west_european'), id 
      

      and we manage to get a two-row ranges scan, window nosort stopkey, and avoid a sort order by !

      That was messing about with 19.3 – and I think I’ll have to spend a little more time getting it clear in my own mind before doing a proper write-up. There’s also a possibility of introducing a “real” virtual column rather than the hidden virtual column that comes from the “function-based” index – but that might introduce the need to change the literal to nlssort(‘A’,’nls_sort=west_european’).

      Regards
      Jonathan Lewis

      P.S. I’m sure it’s logical – for some value of the word “logical” (as Terry Pratchett would say).

      Comment by Jonathan Lewis — July 24, 2020 @ 6:44 pm BST Jul 24,2020 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: