12c First N

July 10, 2013

12c First N

Filed under: 12c,Oracle,trace files,Tuning — Jonathan Lewis @ 5:59 pm BST Jul 10,2013

There have been a couple of nice posts about the “Top N” (or Fetch First N / Next N) syntax that has appeared in 12c, here and here for example. I particularly like the first set of examples because they include some execution plans that give you a good idea of what’s going on under the covers. “Under the covers” is important, because if you don’t actually have a large data set to test on you might not realise what impact a “Fetch First” query might have on a full-size production data set.

You can do just a little better than inference from the execution plan, so I thought I’d push the envelope slightly with a small example and appeal to the 10053 trace file. Here’s my test query, with resulting execution plan:

select
        /*+ track this */
        *
from
        t1
order by
        n1
        offset 10 rows
fetch 
        next 1 percent rows with ties
/

SQL_ID  957mk29yc8d6h, child number 0
-------------------------------------
select  /*+ track this */  * from  t1 order by  n1  offset 10 rows
fetch  next 1 percent rows with ties

Plan hash value: 2273146475

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |       |  3789 (100)|          |
|*  1 |  VIEW               |      |   100K|    14M|       |  3789   (1)| 00:00:01 |
|   2 |   WINDOW SORT       |      |   100K|    11M|    13M|  3789   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   |   100K|    11M|       |   678   (2)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("from$_subquery$_002"."rowlimit_$$_rownumber">10 AND
              "from$_subquery$_002"."rowlimit_$$_rank"<=CASE WHEN (10>=0) THEN 10 ELSE 0
              END +CEIL("from$_subquery$_002"."rowlimit_$$_total"*1/100)))

Note that I’ve asked Oracle to skip the first ten rows then report the next 1 percent of the data – based on a given ordering – but to include any rows beyond the 1 percent where the ordering values still match the last row of the 1 percent (Tim Hall’s post includes an example showing the difference between “with ties” and “rows only”).

Line 2 of the execution plan is very revealing – we have a “Window Sort” operation which tells you that the optimizer has to be using some sort of analytic function to get the result set.

So here’s the “Unparsed Query” (extracted with a bit of re-formatting from the 10053 trace file) that Oracle generated from the query as it transformed it to a form it could optimize:

SELECT 
        from$_subquery$_002.ID ID,
        from$_subquery$_002.N1 N1,
        from$_subquery$_002.SMALL_VC SMALL_VC,
        from$_subquery$_002.PADDING PADDING 
FROM    (
        SELECT 
                T1.ID ID,
                T1.N1 N1,
                T1.SMALL_VC SMALL_VC,
                T1.PADDING PADDING,
                T1.N1 "rowlimit_$_  0",
                ROW_NUMBER() OVER ( ORDER BY T1.N1) rowlimit_$$_rownumber,
                RANK()       OVER ( ORDER BY T1.N1) rowlimit_$$_rank,
                COUNT(*)     OVER () rowlimit_$$_total 
        FROM 
                TEST_USER.T1 T1
        ) from$_subquery$_002 
WHERE 
        from$_subquery$_002.rowlimit_$$_rank <= -- > comment added to avoid wordpress format issue
                CASE  
                        WHEN (10>=0) 
                                THEN FLOOR(TO_NUMBER(10)) 
                                ELSE 0 
                        END + 
                CEIL(from$_subquery$_002.rowlimit_$$_total*1/100) 
AND     from$_subquery$_002.rowlimit_$$_rownumber > 10 
ORDER BY 
        from$_subquery$_002."rowlimit_$_0"
;

As expected we can see the query has been transformed to include analytic functions – three of them. Oracle is basically taking a nice, easily readable, syntax and converting it behind the scenes to the sort of SQL we would once have written by hand. (That’s a good thing, of course – clarity of code does protect you from accidents and it’s best if messy rewrites are handled by machines rather than error-prone people).

Although the SQL shows three analytic functions Oracle needs only one window sort since the count(*) is over() the whole data set (no partitioning or ordering required), and the rank() and row_number() functions are partitioned and ordered by the same expression. Nevertheless my particular query has, behind the scenes, sorted the entire data set to get the 1% that I wanted.

Footnote:

The rank() function appears because of the “with ties” option. If we had chosen the “rows only” option Oracle wouldn’t have needed to discover rows with the same rank() as the last of the 1 percent.

For completeness, here’s the script to generate my data set:


rem
rem     Script:         12c_fetch_first_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2012
rem

execute dbms_random.seed(0)

create table t1
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual 
        connect by 
                level <= 10000 -- > comment to avoid wordpress format issue
)
select
        rownum                  id,
        trunc(dbms_random.value(0,5000))        n1,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',100)           padding
from
        generator       v1,
        generator       v2
where
        rownum <= 100000 -- > comment to avoid wordpress format issue
/

Note that you don’t need to gather stats – 12c does that automatically on CTAS and “insert as select” into an empty table – though the stats collection doesn’t include histograms (rarely a great loss) or index stats.

Update Oct 2013

In response to a comment below, it’s worth mentioning that some simple forms of the “fetch first N” can use the operation “WINDOW SORT PUSHED RANK“, this operation (like SORT ORDER BY STOPKEY) is able to limit the data volume sorted, but uses the older, less efficient, “Version 1” sort to achieve this and may still have to generate the entire pre-sorted data set. If you don’t have the PUSHED RANK (or STOPKEY) then I believe the entire incoming rowsource has to be sorted, but Oracle will be using the newer (10g), more efficient, “Version 2” sort.

Update Aug 2022

Retested on Oracle 19.11 this example did show a slight change in the “Unparsed Query” but only in a change to one expression, highlighted in lines 21-25 above, which turned into the much tidier:

        greatest(floor(to_number(10)), 0) +

Comments (27)

27 Comments »

Does it push down the top-n for parallel execution plans?

Comment by Greg Rahn (@GregRahn) — July 11, 2013 @ 7:25 am BST Jul 11,2013 | Reply
- Greg,
  
  Answered (at least, in part – I can’t guarantee variations, obviously) in the article referenced by the next pingback.
  
  12c Top N (px)
  
  Comment by Jonathan Lewis — July 24, 2013 @ 12:17 pm BST Jul 24,2013 | Reply
[…] comment from Greg Rahn in response to my posting yesterday prompted me to do a quick follow-up (test time ca. 3 minutes, write-up time, ca. 50 minutes – […]

Pingback by 12c Top N (px) | Oracle Scratchpad — July 11, 2013 @ 9:04 am BST Jul 11,2013 | Reply

Somehow automated stats not working and it is going for a dynamic stats. may be i am missing something.

SQL> drop table t1 purge;

Table dropped.

SQL> create table t1
  2  as
  3  with generator as (
  4  select --+ materialize
  5  rownum id
  6  from dual
  7  connect by
  8  level <= 10000
  9  )
 10  select
 11  rownum id,
 12  trunc(dbms_random.value(0,5000)) n1,
 13  lpad(rownum,10,'0') small_vc,
 14  rpad('x',100) padding
 15  from
 16  generator v1,
 17  generator v2
 18  where
 19  rownum <= 100000;

SQL> select count(*) from t1;

  COUNT(*)
----------
    100000

SQL> select table_name,num_rows,last_analyzed from user_tables where table_name=
'T1';

TABLE_NAME
--------------------------------------------------------------------------------

  NUM_ROWS LAST_ANAL
---------- ---------
T1


Execution Plan
----------------------------------------------------------
Plan hash value: 2273146475

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 96859 |    12M|       |  2384   (1)| 00:00:01 |
|*  1 |  VIEW               |      | 96859 |    12M|       |  2384   (1)| 00:00:01 |
|   2 |   WINDOW SORT       |      | 96859 |  8040K|  9576K|  2384   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   | 96859 |  8040K|       |   478   (1)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber">10 AND
              "from$_subquery$_002"."rowlimit_$$_rank"=0) THEN 10 ELSE 0
              END +CEIL("from$_subquery$_002"."rowlimit_$$_total"*1/100))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1760  consistent gets
          0  physical reads
          0  redo size
      34904  bytes sent via SQL*Net to client
       1281  bytes received via SQL*Net from client
         69  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       1006  rows processed

Comment by raova — July 17, 2013 @ 2:45 pm BST Jul 17,2013 | Reply

Got it. I created the table in SYS schema and that is the reason it has not gathered stats. Following limitations appy for online statistics gathering for bulk loads.

Currently, statistics gathering does not happen for bulk load statements when any of the following conditions apply to the target table
It is in an Oracle-owned schema such as SYS.
It is a nested table.
It is an index-organized table (IOT).
It is an external table.
It is a global temporary table defined as ON COMMIT DELETE ROWS.
It has virtual columns.
It has a PUBLISH preference set to FALSE.
It is partitioned, INCREMENTAL is set to true, and extended syntax is not used

http://docs.oracle.com/cd/E16655_01/server.121/e15858/tgsql_statscon.htm#BABHEJCD

Ananda

Comment by raova — July 17, 2013 @ 4:37 pm BST Jul 17,2013 | Reply
- Ananda,
  
  Thanks for the follow-up and the reference.
  
  SYS is always catching people out – but I’ll bet a few people (me included) get caught out by the virtual column limitation over the next couple of years.
  
  I’ve tried to fix up the formatting – not entirely successfully.
  
  Comment by Jonathan Lewis — July 24, 2013 @ 12:28 pm BST Jul 24,2013 | Reply
Tried to use OFFSET…FETCH in PL/SQL and immediately got an issue. It seems to me that passing a variable to FETCH is not supported:
```
SQL> declare
  2      type table_tt is table of employees%rowtype;
  3      v_tt table_tt;
  4
  5      v_limit_nr number:=10;
  6      v_counter_nr number:=0;
  7  begin
  8      select *
  9      bulk collect into v_tt
 10      from employees
 11      offset v_counter_nr*v_limit_nr rows
 12      fetch next v_limit_nr rows only; -- variable 
 13  end;
 14  /
    type table_tt is table of employees%rowtype;
                              *
ERROR at line 2:
ORA-03113: end-of-file on communication channel
Process ID: 3060
Session ID: 20 Serial number: 35307
SQL>

Doing the same with hard-coded value works fine

 ... 
 12      fetch next 10 rows only; -- hard-code
 13  end;
 14  /
PL/SQL procedure successfully completed.
SQL>
```
Couldn’t find any reference to this anywhere, so it looks like a bug. Before I log it, could anybody confirm/deny the issue for different environment?

I am running:
– the official release of Oracle 12c
– on MS Windows 2012 (64-bit)
– on the top of VMWare ESXi 5.1
– VMX-9 version of the file

Best regards!

Comment by Michael "Misha" Rosenblum — August 26, 2013 @ 7:28 pm BST Aug 26,2013 | Reply
- declare
  type TAB is table of number;
  x tab;
  begin
  select sal bulk collect into x from EMP
  order by SAL desc
  OFFSET 3 rows
  FETCH next 5 rows only ;
  for I in x.first..x.last LOOP
  DBMS_OUTPUT.PUT_LINE(‘x value is ‘||X(I));
  end loop;
  end;
  
  Comment by venkat — August 22, 2014 @ 9:37 pm BST Aug 22,2014 | Reply
  - Let me reiterate: the problem is with passing “NEXT N” as a variable and not as a constant (5 in your case) – it is a bug #17404511 (I logged it later than made this comment here). Just checked its status on MOS – it had not been delivered in 12.1.0.2 (scheduled for 12.2)
    
    Regards,
    Michael
    
    Comment by Michael "Misha" Rosenblum — August 22, 2014 @ 9:49 pm BST Aug 22,2014 | Reply
    - Michael,
      
      Thanks for the follow-up.
      It’s nice to know that the enhancement/fix is intended.
      
      Comment by Jonathan Lewis — August 22, 2014 @ 10:16 pm BST Aug 22,2014
    - Another update. For unknown reasons I entered #17404511 in the MOS search – and it came back to me with the pointer to “12.1.0.2 Patch Set – List of Bug Fixes by Problem Type (Doc ID 1683802.1)” – it seems to be that the mentioned bug is listed under the category “Undocumented Oracle Server”. H-m-m-m… Maybe they’ve pushed a fix earlier than expected and just didn’t update the bug itself? I don’t have 12.1.0.2 installed yet, but maybe somebody can do a quick check and post results?
      
      Comment by Michael "Misha" Rosenblum — August 25, 2014 @ 5:04 pm BST Aug 25,2014
    - Michael,
      
      Just tried a simple test from SQL*PLus in 12.1.0.2
      
      variable m_off number
      variable m_nxt number
      
      execute :m_off := 10
      execute :m_nxt := 10
      
      select
      *
      from
      t1
      order by
      n1
      offset
      :m_off rows
      fetch
      next :m_nxt rows only
      ;
      
      This returns the expected data.
      
      Wrapping it into pl/sql with bulk collect, table of records, and local variables for offset and count also works.
      
      Comment by Jonathan Lewis — August 25, 2014 @ 9:07 pm BST Aug 25,2014

Curiously the execution plan produced by the new row limiter is different to the previous “SORT ORDER BY STOPKEY” method. I wonder which one is faster, the old method, or the analytic functions. Answer: The old method by a hair.

SQL>  create table pwi as select trunc(dbms_random.value(0,5000)) n1 
from dual connect by level  <= 10000

explain plan for
select * from (select * from pwi order by 1) where rownum  <= 5

select * from table (dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1721279410

--------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     5 |    65 |     3   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY          |      |       |       |            |          |
|   2 |   VIEW                  |      |  1000 | 13000 |     3   (0)| 00:00:01 |
|*  3 |    SORT ORDER BY STOPKEY|      |  1000 |  4000 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL   | PWI  |  1000 |  4000 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------

SQL> set timing on

SQL> select * from pwi order by 1 fetch first 5 rows only;

        N1
----------
         0
         0
         0
         0
         0

Elapsed: 00:00:00.12

SQL> select * from (select * from pwi order by 1) where rownum <=5;

        N1
----------
         0
         0
         0
         0
         0

Elapsed: 00:00:00.09

NOTE: I executed the above multiple times to get a stable value.

Comment by Peter Wiseman — October 28, 2013 @ 2:42 am GMT Oct 28,2013 | Reply

Peter,

WordPress messed up your example because of the “less than” symbols. I think I’ve fixed it up correctly.

I’m slightly surprised that you see such a difference in your example: 3/100 of a second is quite large if that’s a consistent difference over such a small amount of work. In your example Oracle is able to use WINDOW SORT PUSHED RANK for the 12c variant, so both versions of the SQL need only sort a very small part of the data, and the 10032 trace shows that the both use a version 1 sort, with the same number of comparisons and output rows (See this post: https://jonathanlewis.wordpress.com/2009/12/28/short-sorts/ ) . I guess that means the generic strategy for the analytic approach is the main culprit.

Comment by Jonathan Lewis — October 28, 2013 @ 12:13 pm GMT Oct 28,2013 | Reply
- Correction – I was using 1,000,000 rows for my elapsed timing information. I failed to updated the create table statement and capture new execution plan statistics.
  
  Comment by Peter Wiseman — October 28, 2013 @ 1:20 pm GMT Oct 28,2013 | Reply

[…] just added a little update to my posting on the 12c “First N” clause, pointing out that there are variations in the WINDOW SORT operation that can make a difference to […]

Pingback by First N | Oracle Scratchpad — October 28, 2013 @ 12:50 pm GMT Oct 28,2013 | Reply
Jonathan, what is /*+ track this */? I wasn’t able to find it among the valid hints:

1 select * from V$SQL_HINT
2* where lower(name) like ‘%track%’
SQL> /

no rows selected

Comment by Mladen Gogala — October 28, 2013 @ 5:58 pm GMT Oct 28,2013 | Reply
- Mladen,
  
  It’s not a hint, it’s a piece of text that I can use to search v$sql if I need to. I tend to use the hint form (rather than the comment form) because some platforms strip out comments before passing the SQL to the database.
  
  Comment by Jonathan Lewis — October 28, 2013 @ 6:11 pm GMT Oct 28,2013 | Reply

Hi Jonathan.
I wondered if the new dbms_utility.expand_sql_text procedure would show the same “unparsed query” you got from the 10053 trace.
Well, it does, just with different table aliases:

var x clob
BEGIN
  dbms_utility.expand_sql_text(
    input_sql_text  => 
      'select /*+ track this */ * 
       from t1 
       order by n1
       offset 10 rows
       fetch next 1 percent rows with ties',
    output_sql_text => :x);
END;
/

print x

SELECT "A1"."ID"       "ID",
	   "A1"."N1"       "N1",
	   "A1"."SMALL_VC" "SMALL_VC",
	   "A1"."PADDING"  "PADDING"
FROM   (SELECT "A2"."ID" "ID",
			   "A2"."N1" "N1",
			   "A2"."SMALL_VC" "SMALL_VC",
			   "A2"."PADDING" "PADDING",
			   "A2"."N1" "rowlimit_$_  0",
			   row_number() over(ORDER BY "A2"."N1") "rowlimit_$$_rownumber",
			   rank() over(ORDER BY "A2"."N1") "rowlimit_$$_rank",
			   COUNT(*) over() "rowlimit_$$_total"
		FROM   "HR"."T1" "A2") "A1"
WHERE  "A1"."rowlimit_$$_rank" = 0) <=          
CASE            WHEN (10>=0)
THEN floor(to_number(10)) ELSE 0 END + ceil("A1"."rowlimit_$$_total" * 1 / 100)
AND "A1"."rowlimit_$$_rownumber" > 10
ORDER  BY "A1"."rowlimit_$_  0";

Thanks,
Oren.

Comment by Oren Nakdimon @DBoriented — November 1, 2013 @ 2:09 pm GMT Nov 1,2013 | Reply

[…] error out of the way we can start to think about the query. It’s using the (fairly new) “Fetch first N rows” syntax, which means we may have to find a lot of data and sort it before returning a subset: […]

Pingback by Index Engineering | Oracle Scratchpad — January 20, 2020 @ 4:53 pm GMT Jan 20,2020 | Reply
[…] etc. The error doesn’t seem to be particularly relevant, of course, until you remember that “fetch first” creates an inline view using the analytic row_number() under the […]

Pingback by Fetch First Update | Oracle Scratchpad — June 3, 2020 @ 1:49 pm BST Jun 3,2020 | Reply
[…] also pointed out that while 12c allows you to use “fetch first N rows” instead of “where rownum <= N” there’s a hidden threat to using the feature […]

Pingback by Fetch First vs. Rownum | Oracle Scratchpad — July 15, 2020 @ 10:12 am BST Jul 15,2020 | Reply
[…] jonathanlewis.wordpress.com docs.oracle.com […]

Pingback by Cláusula de limitación de filas para consultas Top-N en Oracle Database 12c versión 1 (12.1) - DBandTech.com — October 4, 2020 @ 12:53 am BST Oct 4,2020 | Reply
[…] It would be nice to see it in Oracle as well as it could make an enormous difference to “fetch first N” […]

Pingback by I wish | Oracle Scratchpad — November 9, 2020 @ 12:01 pm GMT Nov 9,2020 | Reply
[…] a thread on the MOSC (needs an account) database tuning forumat the moment asking why a “fetch first N” query to fetch next 41 rows with an offset of 8602 rows takes longer to run than the same query […]

Pingback by Pagination cost | Oracle Scratchpad — July 21, 2022 @ 3:57 pm BST Jul 21,2022 | Reply
[…] Fetch First under the covers (July 2013): How does Oracle transform Fetch First / Top-N before optimising – should be retested for newer versions. […]

Pingback by Execution Plans Catalogue | Oracle Scratchpad — August 12, 2022 @ 11:00 am BST Aug 12,2022 | Reply
[…] Fetch First under the covers (July 2013): How does Oracle transform Fetch First / Top-N before optimising – should be retested for newer versions. […]

Pingback by Troubleshooting catalogue | Oracle Scratchpad — August 12, 2022 @ 11:01 am BST Aug 12,2022 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome. Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Oracle Scratchpad

July 10, 2013

12c First N

Footnote:

Update Oct 2013

Update Aug 2022

27 Comments »

Comments and related questions are welcome. Cancel reply

Click on the motto below

Everything Changes Eventually

Search this blog

Categories

Special Links

Recent Posts

Recent Comments

Popular articles

Popular References

Archives

Blogroll

Posts by RSS

Oracle Scratchpad

July 10, 2013

12c First N

Footnote:

Update Oct 2013

Update Aug 2022

Share this:

Related

27 Comments »

Comments and related questions are welcome. Cancel reply

Click on the motto below

Everything Changes Eventually

Search this blog

Categories

Special Links

Recent Posts

Recent Comments

Popular articles

Popular References

Archives

Blogroll

Posts by RSS