Oracle Scratchpad

July 10, 2013

12c First N

Filed under: 12c,Oracle,trace files,Tuning — Jonathan Lewis @ 5:59 pm BST Jul 10,2013

There have been a couple of nice posts about the “Top N” (or Fetch First N / Next N)  syntax that has appeared in 12c, here and here for example. I particularly like the first set of examples because they include some execution plans that give you a good idea of what’s  going on under the covers. “Under the covers” is important, because if you don’t actually have a large data set to test on you might not realise what impact a “Fetch First” query might have on a full-size production data set.

You can do just a little better than inference from the execution plan, so I thought I’d push the envelope slightly with a small example and appeal to the 10053 trace file. Here’s my test query, with resulting execution plan:

select
        /*+ track this */
        *
from
        t1
order by
        n1
        offset 10 rows
fetch 
        next 1 percent rows with ties
/

SQL_ID  957mk29yc8d6h, child number 0
-------------------------------------
select  /*+ track this */  * from  t1 order by  n1  offset 10 rows
fetch  next 1 percent rows with ties

Plan hash value: 2273146475

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |       |  3789 (100)|          |
|*  1 |  VIEW               |      |   100K|    14M|       |  3789   (1)| 00:00:01 |
|   2 |   WINDOW SORT       |      |   100K|    11M|    13M|  3789   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   |   100K|    11M|       |   678   (2)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("from$_subquery$_002"."rowlimit_$$_rownumber">10 AND
              "from$_subquery$_002"."rowlimit_$$_rank"<=CASE WHEN (10>=0) THEN 10 ELSE 0
              END +CEIL("from$_subquery$_002"."rowlimit_$$_total"*1/100)))

Note that I’ve asked Oracle to skip the first ten rows then report the next 1 percent of the data – based on a given ordering – but to include any rows beyond the 1 percent where the ordering values still match the last row of the 1 percent (Tim Hall’s post includes an example showing the difference between “with ties” and “rows only”).

Line 2 of the execution plan is very revealing – we have a “Window Sort” operation which tells you that the optimizer has to be using some sort of analytic function to get the result set.

So here’s the “Unparsed Query” (extracted with a bit of re-formatting from the 10053 trace file) that Oracle generated from the query as it transformed it to a form it could optimize:

SELECT 
        from$_subquery$_002.ID ID,
        from$_subquery$_002.N1 N1,
        from$_subquery$_002.SMALL_VC SMALL_VC,
        from$_subquery$_002.PADDING PADDING 
FROM    (
        SELECT 
                T1.ID ID,
                T1.N1 N1,
                T1.SMALL_VC SMALL_VC,
                T1.PADDING PADDING,
                T1.N1 "rowlimit_$_  0",
                ROW_NUMBER() OVER ( ORDER BY T1.N1) rowlimit_$$_rownumber,
                RANK()       OVER ( ORDER BY T1.N1) rowlimit_$$_rank,
                COUNT(*)     OVER () rowlimit_$$_total 
        FROM 
                TEST_USER.T1 T1
        ) from$_subquery$_002 
WHERE 
        from$_subquery$_002.rowlimit_$$_rank <= -- > comment added to avoid wordpress format issue
                CASE  
                        WHEN (10>=0) 
                                THEN FLOOR(TO_NUMBER(10)) 
                                ELSE 0 
                        END + 
                CEIL(from$_subquery$_002.rowlimit_$$_total*1/100) 
AND     from$_subquery$_002.rowlimit_$$_rownumber > 10 
ORDER BY 
        from$_subquery$_002."rowlimit_$_0"
;

As expected we can see the query has been transformed to include analytic functions – three of them. Oracle is basically taking a nice, easily readable, syntax and converting it behind the scenes to the sort of SQL we would once have written by hand. (That’s a good thing, of course – clarity of code does protect you from accidents and it’s best if messy rewrites are handled by machines rather than error-prone people).

Although the SQL shows three analytic functions Oracle needs only one window sort since the count(*) is over() the whole data set (no partitioning or ordering required), and the rank() and row_number() functions are partitioned and ordered by the same expression. Nevertheless my particular query has, behind the scenes, sorted the entire data set to get the 1% that I wanted.

Footnote:

The rank() function appears because of the “with ties” option. If we had chosen the “rows only” option Oracle wouldn’t have needed to discover rows with the same rank() as the last of the 1 percent.

For completeness, here’s the script to generate my data set:


rem
rem     Script:         12c_fetch_first_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2012
rem

execute dbms_random.seed(0)

create table t1
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual 
        connect by 
                level <= 10000 -- > comment to avoid wordpress format issue
)
select
        rownum                  id,
        trunc(dbms_random.value(0,5000))        n1,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 100000 -- > comment to avoid wordpress format issue
/

Note that you don’t need to gather stats – 12c does that automatically on CTAS and “insert as select” into an empty table – though the stats collection doesn’t include histograms (rarely a great loss) or index stats.

Update Oct 2013

In response to a comment below, it’s worth mentioning that some simple forms of the “fetch first N” can use the operation “WINDOW SORT PUSHED RANK“, this operation (like SORT ORDER BY STOPKEY) is able to limit the data volume sorted, but uses the older, less efficient, “Version 1” sort to achieve this and may still have to generate the entire pre-sorted data set. If you don’t have the PUSHED RANK (or STOPKEY) then I believe the entire incoming rowsource has to be sorted, but Oracle will be using the newer (10g), more efficient, “Version 2” sort.

Update Aug 2022

Retested on Oracle 19.11 this example did show a slight change in the “Unparsed Query” but only in a change to one expression, highlighted in lines 21-25 above, which turned into the much tidier:

        greatest(floor(to_number(10)), 0) +

27 Comments »

  1. Does it push down the top-n for parallel execution plans?

    Comment by Greg Rahn (@GregRahn) — July 11, 2013 @ 7:25 am BST Jul 11,2013 | Reply

  2. […] comment from Greg Rahn in response to my posting yesterday prompted me to do a quick follow-up (test time ca. 3 minutes, write-up time, ca. 50 minutes – […]

    Pingback by 12c Top N (px) | Oracle Scratchpad — July 11, 2013 @ 9:04 am BST Jul 11,2013 | Reply

  3. Somehow automated stats not working and it is going for a dynamic stats. may be i am missing something.

    SQL> drop table t1 purge;
    
    Table dropped.
    
    SQL> create table t1
      2  as
      3  with generator as (
      4  select --+ materialize
      5  rownum id
      6  from dual
      7  connect by
      8  level <= 10000
      9  )
     10  select
     11  rownum id,
     12  trunc(dbms_random.value(0,5000)) n1,
     13  lpad(rownum,10,'0') small_vc,
     14  rpad('x',100) padding
     15  from
     16  generator v1,
     17  generator v2
     18  where
     19  rownum <= 100000;
    
    SQL> select count(*) from t1;
    
      COUNT(*)
    ----------
        100000
    
    SQL> select table_name,num_rows,last_analyzed from user_tables where table_name=
    'T1';
    
    TABLE_NAME
    --------------------------------------------------------------------------------
    
      NUM_ROWS LAST_ANAL
    ---------- ---------
    T1
    
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 2273146475
    
    ------------------------------------------------------------------------------------
    | Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT    |      | 96859 |    12M|       |  2384   (1)| 00:00:01 |
    |*  1 |  VIEW               |      | 96859 |    12M|       |  2384   (1)| 00:00:01 |
    |   2 |   WINDOW SORT       |      | 96859 |  8040K|  9576K|  2384   (1)| 00:00:01 |
    |   3 |    TABLE ACCESS FULL| T1   | 96859 |  8040K|       |   478   (1)| 00:00:01 |
    ------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber">10 AND
                  "from$_subquery$_002"."rowlimit_$$_rank"=0) THEN 10 ELSE 0
                  END +CEIL("from$_subquery$_002"."rowlimit_$$_total"*1/100))
    
    Note
    -----
       - dynamic statistics used: dynamic sampling (level=2)
    
    
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
           1760  consistent gets
              0  physical reads
              0  redo size
          34904  bytes sent via SQL*Net to client
           1281  bytes received via SQL*Net from client
             69  SQL*Net roundtrips to/from client
              1  sorts (memory)
              0  sorts (disk)
           1006  rows processed
    

    Comment by raova — July 17, 2013 @ 2:45 pm BST Jul 17,2013 | Reply

  4. Got it. I created the table in SYS schema and that is the reason it has not gathered stats. Following limitations appy for online statistics gathering for bulk loads.

    Currently, statistics gathering does not happen for bulk load statements when any of the following conditions apply to the target table
    It is in an Oracle-owned schema such as SYS.
    It is a nested table.
    It is an index-organized table (IOT).
    It is an external table.
    It is a global temporary table defined as ON COMMIT DELETE ROWS.
    It has virtual columns.
    It has a PUBLISH preference set to FALSE.
    It is partitioned, INCREMENTAL is set to true, and extended syntax is not used

    http://docs.oracle.com/cd/E16655_01/server.121/e15858/tgsql_statscon.htm#BABHEJCD

    Ananda

    Comment by raova — July 17, 2013 @ 4:37 pm BST Jul 17,2013 | Reply

    • Ananda,

      Thanks for the follow-up and the reference.

      SYS is always catching people out – but I’ll bet a few people (me included) get caught out by the virtual column limitation over the next couple of years.

      I’ve tried to fix up the formatting – not entirely successfully.

      Comment by Jonathan Lewis — July 24, 2013 @ 12:28 pm BST Jul 24,2013 | Reply

  5. Tried to use OFFSET…FETCH in PL/SQL and immediately got an issue. It seems to me that passing a variable to FETCH is not supported:

    SQL> declare
      2      type table_tt is table of employees%rowtype;
      3      v_tt table_tt;
      4
      5      v_limit_nr number:=10;
      6      v_counter_nr number:=0;
      7  begin
      8      select *
      9      bulk collect into v_tt
     10      from employees
     11      offset v_counter_nr*v_limit_nr rows
     12      fetch next v_limit_nr rows only; -- variable 
     13  end;
     14  /
        type table_tt is table of employees%rowtype;
                                  *
    ERROR at line 2:
    ORA-03113: end-of-file on communication channel
    Process ID: 3060
    Session ID: 20 Serial number: 35307
    SQL>
    
    Doing the same with hard-coded value works fine
    
     ... 
     12      fetch next 10 rows only; -- hard-code
     13  end;
     14  /
    PL/SQL procedure successfully completed.
    SQL>
    

    Couldn’t find any reference to this anywhere, so it looks like a bug. Before I log it, could anybody confirm/deny the issue for different environment?

    I am running:
    – the official release of Oracle 12c
    – on MS Windows 2012 (64-bit)
    – on the top of VMWare ESXi 5.1
    – VMX-9 version of the file

    Best regards!

    Comment by Michael "Misha" Rosenblum — August 26, 2013 @ 7:28 pm BST Aug 26,2013 | Reply

    • declare
      type TAB is table of number;
      x tab;
      begin
      select sal bulk collect into x from EMP
      order by SAL desc
      OFFSET 3 rows
      FETCH next 5 rows only ;
      for I in x.first..x.last LOOP
      DBMS_OUTPUT.PUT_LINE(‘x value is ‘||X(I));
      end loop;
      end;

      Comment by venkat — August 22, 2014 @ 9:37 pm BST Aug 22,2014 | Reply

      • Let me reiterate: the problem is with passing “NEXT N” as a variable and not as a constant (5 in your case) – it is a bug #17404511 (I logged it later than made this comment here). Just checked its status on MOS – it had not been delivered in 12.1.0.2 (scheduled for 12.2)

        Regards,
        Michael

        Comment by Michael "Misha" Rosenblum — August 22, 2014 @ 9:49 pm BST Aug 22,2014 | Reply

        • Michael,

          Thanks for the follow-up.
          It’s nice to know that the enhancement/fix is intended.

          Comment by Jonathan Lewis — August 22, 2014 @ 10:16 pm BST Aug 22,2014

        • Another update. For unknown reasons I entered #17404511 in the MOS search – and it came back to me with the pointer to “12.1.0.2 Patch Set – List of Bug Fixes by Problem Type (Doc ID 1683802.1)” – it seems to be that the mentioned bug is listed under the category “Undocumented Oracle Server”. H-m-m-m… Maybe they’ve pushed a fix earlier than expected and just didn’t update the bug itself? I don’t have 12.1.0.2 installed yet, but maybe somebody can do a quick check and post results?

          Comment by Michael "Misha" Rosenblum — August 25, 2014 @ 5:04 pm BST Aug 25,2014

        • Michael,

          Just tried a simple test from SQL*PLus in 12.1.0.2

          variable m_off number
          variable m_nxt number

          execute :m_off := 10
          execute :m_nxt := 10

          select
          *
          from
          t1
          order by
          n1
          offset
          :m_off rows
          fetch
          next :m_nxt rows only
          ;

          This returns the expected data.

          Wrapping it into pl/sql with bulk collect, table of records, and local variables for offset and count also works.

          Comment by Jonathan Lewis — August 25, 2014 @ 9:07 pm BST Aug 25,2014

  6. Curiously the execution plan produced by the new row limiter is different to the previous “SORT ORDER BY STOPKEY” method. I wonder which one is faster, the old method, or the analytic functions. Answer: The old method by a hair.

    SQL>  create table pwi as select trunc(dbms_random.value(0,5000)) n1 
    from dual connect by level  <= 10000
    
    explain plan for
    select * from (select * from pwi order by 1) where rownum  <= 5
    
    select * from table (dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------
    Plan hash value: 1721279410
    
    --------------------------------------------------------------------------------
    | Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT        |      |     5 |    65 |     3   (0)| 00:00:01 |
    |*  1 |  COUNT STOPKEY          |      |       |       |            |          |
    |   2 |   VIEW                  |      |  1000 | 13000 |     3   (0)| 00:00:01 |
    |*  3 |    SORT ORDER BY STOPKEY|      |  1000 |  4000 |     3   (0)| 00:00:01 |
    |   4 |     TABLE ACCESS FULL   | PWI  |  1000 |  4000 |     3   (0)| 00:00:01 |
    --------------------------------------------------------------------------------
    
    SQL> set timing on
    
    SQL> select * from pwi order by 1 fetch first 5 rows only;
    
            N1
    ----------
             0
             0
             0
             0
             0
    
    Elapsed: 00:00:00.12
    
    SQL> select * from (select * from pwi order by 1) where rownum <=5;
    
            N1
    ----------
             0
             0
             0
             0
             0
    
    Elapsed: 00:00:00.09
    

    NOTE: I executed the above multiple times to get a stable value.

    Comment by Peter Wiseman — October 28, 2013 @ 2:42 am GMT Oct 28,2013 | Reply

    • Peter,

      WordPress messed up your example because of the “less than” symbols. I think I’ve fixed it up correctly.

      I’m slightly surprised that you see such a difference in your example: 3/100 of a second is quite large if that’s a consistent difference over such a small amount of work. In your example Oracle is able to use WINDOW SORT PUSHED RANK for the 12c variant, so both versions of the SQL need only sort a very small part of the data, and the 10032 trace shows that the both use a version 1 sort, with the same number of comparisons and output rows (See this post: https://jonathanlewis.wordpress.com/2009/12/28/short-sorts/ ) . I guess that means the generic strategy for the analytic approach is the main culprit.

      Comment by Jonathan Lewis — October 28, 2013 @ 12:13 pm GMT Oct 28,2013 | Reply

      • Correction – I was using 1,000,000 rows for my elapsed timing information. I failed to updated the create table statement and capture new execution plan statistics.

        Comment by Peter Wiseman — October 28, 2013 @ 1:20 pm GMT Oct 28,2013 | Reply

  7. […] just added a little update to my posting on the 12c “First N” clause, pointing out that there are variations in the WINDOW SORT operation that can make a difference to […]

    Pingback by First N | Oracle Scratchpad — October 28, 2013 @ 12:50 pm GMT Oct 28,2013 | Reply

  8. Jonathan, what is /*+ track this */? I wasn’t able to find it among the valid hints:

    1 select * from V$SQL_HINT
    2* where lower(name) like ‘%track%’
    SQL> /

    no rows selected

    Comment by Mladen Gogala — October 28, 2013 @ 5:58 pm GMT Oct 28,2013 | Reply

    • Mladen,

      It’s not a hint, it’s a piece of text that I can use to search v$sql if I need to. I tend to use the hint form (rather than the comment form) because some platforms strip out comments before passing the SQL to the database.

      Comment by Jonathan Lewis — October 28, 2013 @ 6:11 pm GMT Oct 28,2013 | Reply

  9. Hi Jonathan.
    I wondered if the new dbms_utility.expand_sql_text procedure would show the same “unparsed query” you got from the 10053 trace.
    Well, it does, just with different table aliases:

    var x clob
    BEGIN
      dbms_utility.expand_sql_text(
        input_sql_text  => 
          'select /*+ track this */ * 
           from t1 
           order by n1
           offset 10 rows
           fetch next 1 percent rows with ties',
        output_sql_text => :x);
    END;
    /
    
    print x
    
    SELECT "A1"."ID"       "ID",
    	   "A1"."N1"       "N1",
    	   "A1"."SMALL_VC" "SMALL_VC",
    	   "A1"."PADDING"  "PADDING"
    FROM   (SELECT "A2"."ID" "ID",
    			   "A2"."N1" "N1",
    			   "A2"."SMALL_VC" "SMALL_VC",
    			   "A2"."PADDING" "PADDING",
    			   "A2"."N1" "rowlimit_$_  0",
    			   row_number() over(ORDER BY "A2"."N1") "rowlimit_$$_rownumber",
    			   rank() over(ORDER BY "A2"."N1") "rowlimit_$$_rank",
    			   COUNT(*) over() "rowlimit_$$_total"
    		FROM   "HR"."T1" "A2") "A1"
    WHERE  "A1"."rowlimit_$$_rank" = 0) <=          
    CASE            WHEN (10>=0)
    THEN floor(to_number(10)) ELSE 0 END + ceil("A1"."rowlimit_$$_total" * 1 / 100)
    AND "A1"."rowlimit_$$_rownumber" > 10
    ORDER  BY "A1"."rowlimit_$_  0";
    

    Thanks,
    Oren.

    Comment by Oren Nakdimon @DBoriented — November 1, 2013 @ 2:09 pm GMT Nov 1,2013 | Reply

  10. […] error out of the way we can start to think about the query.  It’s using the (fairly new) “Fetch first N rows” syntax, which means we may have to find a lot of data and sort it before returning a subset: […]

    Pingback by Index Engineering | Oracle Scratchpad — January 20, 2020 @ 4:53 pm GMT Jan 20,2020 | Reply

  11. […] etc. The error doesn’t seem to be particularly relevant, of course, until you remember that “fetch first” creates an inline view using the analytic row_number() under the […]

    Pingback by Fetch First Update | Oracle Scratchpad — June 3, 2020 @ 1:49 pm BST Jun 3,2020 | Reply

  12. […] also pointed out that while 12c allows you to use “fetch first N rows” instead of “where rownum <= N” there’s a hidden threat to using the feature […]

    Pingback by Fetch First vs. Rownum | Oracle Scratchpad — July 15, 2020 @ 10:12 am BST Jul 15,2020 | Reply

  13. […] jonathanlewis.wordpress.com docs.oracle.com […]

    Pingback by Cláusula de limitación de filas para consultas Top-N en Oracle Database 12c versión 1 (12.1) - DBandTech.com — October 4, 2020 @ 12:53 am BST Oct 4,2020 | Reply

  14. […] It would be nice to see it in Oracle as well as it could make an enormous difference to “fetch first N” […]

    Pingback by I wish | Oracle Scratchpad — November 9, 2020 @ 12:01 pm GMT Nov 9,2020 | Reply

  15. […] a thread on the MOSC (needs an account) database tuning forumat the moment asking why a “fetch first N” query to fetch next 41 rows with an offset of 8602 rows takes longer to run than the same query […]

    Pingback by Pagination cost | Oracle Scratchpad — July 21, 2022 @ 3:57 pm BST Jul 21,2022 | Reply

  16. […] Fetch First under the covers (July 2013): How does Oracle transform Fetch First / Top-N before optimising – should be retested for newer versions. […]

    Pingback by Execution Plans Catalogue | Oracle Scratchpad — August 12, 2022 @ 11:00 am BST Aug 12,2022 | Reply

  17. […] Fetch First under the covers (July 2013): How does Oracle transform Fetch First / Top-N before optimising – should be retested for newer versions. […]

    Pingback by Troubleshooting catalogue | Oracle Scratchpad — August 12, 2022 @ 11:01 am BST Aug 12,2022 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.