Oracle Scratchpad

January 31, 2011

ANSI Outer

Filed under: ANSI Standard,CBO,Execution plans,lateral view,Oracle — Jonathan Lewis @ 6:59 pm GMT Jan 31,2011

Here’s an example of ANSI SQL that does something in a fashion that arguably looks a little tidier than the strategy you have to adopt in Oracle. As so often when I compare Oracle syntax and ANSI syntax it’s an example that relates to an outer join. We start with two tables – as usual I have locally managed tablespaces, 8KB blocks, 1MB uniform extents and freelist management. I’ve also disabled system statistics (CPU costing):


rem
rem     Script:         ansi_outer_selective.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2011
rem

create table t1 
as
select
        rownum - 1                      id,
        mod(rownum - 1,20)              n1,
        lpad(rownum - 1,10,'0')         v1,
        rpad('x',100)                   padding
from
        all_objects
where
        rownum <= 4000  -- > comment to avoid WordPress format issue
;

create table t2
as
select
        rownum - 1                      id,
        mod(rownum - 1,20)              n1,
        lpad(rownum - 1,10,'0')         v1,
        rpad('x',100)                   padding
from
        all_objects
where
        rownum <= 4000  -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(id);
create index t2_i1 on t2(id);


begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                estimate_percent => 100,
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                estimate_percent => 100,
                method_opt       => 'for all columns size 1'
        );

end;
/


If you’re familiar with ANSI SQL you won’t need more than a couple of moments to interpret the following query – but I have to admit that I had to think about it carefully before I figured out what it was trying to achieve:


select
        /*+ gather_plan_statistics */
        t1.id,
        t1.n1,
        t1.v1,
        t2.n1
from
        t1
left join
        t2
on
        t2.id = t1.n1
and     t1.n1 in (7, 11, 13)
where
        t1.id = 15
;

        ID         N1 V1                 N1
---------- ---------- ---------- ----------
        15         15 0000000015

--------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   1 |  NESTED LOOPS OUTER            |       |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID  | T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  3 |    INDEX RANGE SCAN            | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
|   4 |   VIEW                         |       |      1 |      1 |      0 |00:00:00.01 |       0 |
|*  5 |    FILTER                      |       |      1 |        |      0 |00:00:00.01 |       0 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |      0 |00:00:00.01 |       0 |
|*  7 |      INDEX RANGE SCAN          | T2_I1 |      0 |      1 |      0 |00:00:00.01 |       0 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"=15)
   5 - filter(("T1"."N1"=7 OR "T1"."N1"=11 OR "T1"."N1"=13))
   7 - access("T2"."ID"="T1"."N1")
       filter(("T2"."ID"=7 OR "T2"."ID"=11 OR "T2"."ID"=13))

Look carefully at the query. The ON clause includes a reference to the t1 table that is NOT a join condition. This condition means that Oracle will try to find a matching row in t2 only if the n1 value in t1 is in 7,11, or 13.

Since the join is left outer, any t1 row where id = 15 will be returned, but the n1 column from t2 will be reported only if the t1.n1 value is in 7,11, or 13 and there is a t2 row with a matching id value.

How would you express the same requirment in standard Oracle syntax ? Here’s one possibility:

select
        /*+ gather_plan_statistics */
        t1.id,
        t1.n1,
        t1.v1,
        t2.n1
from
        t1, t2
where
        t1.id = 15
and     t2.id(+) = case
                when t1.n1 not in (7, 11, 13) 
                        then null
                        else t1.n1
        end
;


        ID         N1 V1                 N1
---------- ---------- ---------- ----------
        15         15 0000000015

------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------
|   1 |  NESTED LOOPS OUTER          |       |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2    |      1 |      1 |      0 |00:00:00.01 |       0 |
|*  5 |    INDEX RANGE SCAN          | T2_I1 |      1 |      1 |      0 |00:00:00.01 |       0 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"=15)
   5 - access("T2"."ID"=CASE  WHEN (("T1"."N1"<>7) AND ("T1"."N1"<>11) AND
              ("T1"."N1"<>13)) THEN NULL ELSE "T1"."N1" END )

You will have noticed that I used the /*+ gather_plan_statistics */ hint in both queries, and the plans I’ve shown are the ones that I pulled from memory with their last execution statistics included.

The plans are clearly just a little different from each other – but are they equally efficient ?

Both plans start the same way – for each relevant row in t1 they call line 4 – and in both cases there is only one relevant row in t1, hence one call to line 4.

In the first plan we call the subroutine to create a rowsource (the VIEW operator), and this calls the FILTER operation. The filter operation is an example of a “conditional” filter – i.e. if the test in line 5 is true then line 6 is called – and in this case the test is false so line 6 is never called (Starts = 0) and Oracle doesn’t try to access table t2. So we fetch one row from t1, then call two subroutines that between them do a “case” test but access no data blocks.

In the second plan line 4 calls the table access operation, which calls the index range scan operation in line 5 – but line 5 starts with a call to the case statement that returns NULL – so even though we call the index range scan operation, we don’t access any data blocks, which means we don’t pass any rowids to the table access in line 4, which means that that operation doesn’t access any data blocks. The net workload is that lines 4 and 5 in the second plan also represent nothing more than two subroutine calls and a “case” test.

The two plans are virtually identical in resource usage – so your choice of which to use probably comes down to aesthetic appeal and comfort factor. Thanks to my previous habits I think the Oracle variant is “obviously” much easier to understand – but the client probably felt the opposite because they spent most of their time writing ANSI.

Is there another way to write the query – yes, there is, but for some reason it’s not supported. If you look at the 10053 trace file for the ANSI example you’ll find that Oracle has rewritten it with a LATERAL subquery before optimising it. So, presumably, we can appply the same transformation manually:


select
        /*+ gather_plan_statistics */
        t1.id,
        t1.n1,
        t1.v1,
        t2.n1
from
        t1,
        lateral (
                (
                select
                        t2.n1
                from
                        t2
                where
                        t1.n1 in (7, 11, 13)
                and     t2.id = t1.n1
                )
        )(+) t2
where
        t1.id = 15
;


        lateral (
                *
ERROR at line 9:
ORA-00933: SQL command not properly ended

On second thoughts perhaps we can’t – but it was a nice idea.

Oracle uses the table() operator with collections (or pipelined functions) as the equivalent of the ANSI LATERAL() operator, but doesn’t let you use it with queries (you get Oracle error “ORA-22905: cannot access rows from a non-nested table item” if you try).

The concept is simple – the table() or lateral() operator allows you to introduce a subquery in the FROM clause that references columns from a table that appears earlier in the same FROM clause.

I don’t know why Oracle doesn’t support the lateral() operator in end-user code – but actually, for experimental purposes only of course, if you want to play with it you can always set event 22829:


alter session set events '22829 trace name context forever';

-- execute lateral query, and get this plan

--------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   1 |  NESTED LOOPS OUTER            |       |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID  | T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  3 |    INDEX RANGE SCAN            | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
|   4 |   VIEW                         |       |      1 |      1 |      0 |00:00:00.01 |       0 |
|*  5 |    FILTER                      |       |      1 |        |      0 |00:00:00.01 |       0 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |      0 |00:00:00.01 |       0 |
|*  7 |      INDEX RANGE SCAN          | T2_I1 |      0 |      1 |      0 |00:00:00.01 |       0 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"=15)
   5 - filter(("T1"."N1"=7 OR "T1"."N1"=11 OR "T1"."N1"=13))
   7 - access("T2"."ID"="T1"."N1")
       filter(("T2"."ID"=7 OR "T2"."ID"=11 OR "T2"."ID"=13))

The plan is identical to the plan for the ANSI after transformation. I’ll leave it to you to decide whether the code is easier to understand than the ANSI or ordinary Oracle versions – but I’d like to see it made legal, even if I didn’t find many cases where I needed it.

23 Comments »

  1. Oracle uses the table() operator with collections (or pipelined functions) as the equivalent of the ANSI LATERAL() operator, but doesn’t let you use it with queries (you get Oracle error “ORA-22905: cannot access rows from a non-nested table item” if you try).

    Jonathan,

    it’s doable:

    create or replace type t_array as table of number;
    /
    
    select
        /*+ gather_plan_statistics */
        t1.id,
        t1.n1,
        t1.v1,
        t2.column_value
    from
        t1,
        table (cast(multiset(
            select
                t2.n1
            from
                t2
            where
                t1.n1 in (7, 11, 13)
            and    t2.id = t1.n1
                  ) as t_array)
        )(+) t2
    where
        t1.id = 15
    ;
    
                      ID                   N1 V1                                               COLUMN_VALUE
    -------------------- -------------------- ---------------------------------------- --------------------
                      15                   15 0000000015                               NULL
    
    1 row selected.
    
    SQL> @x
    
    PLAN_TABLE_OUTPUT
    -------------------------------------------------------------------------------------------------------
    SQL_ID  7fq13c8ynxujz, child number 0
    -------------------------------------
    select     /*+ gather_plan_statistics */     t1.id,     t1.n1,
    t1.v1,     t2.column_value from     t1,     table (cast(multiset(
      select             t2.n1         from             t2         where
             t1.n1 in (7, 11, 13)         and    t2.id = t1.n1
     ) as t_array)     )(+) t2 where     t1.id = 15
    
    Plan hash value: 483134546
    
    -------------------------------------------------------------------------------------------------------
    | Id  | Operation                           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    -------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                    |       |      1 |        |      1 |00:00:00.01 |       4 |
    |   1 |  NESTED LOOPS OUTER                 |       |      1 |   8168 |      1 |00:00:00.01 |       4 |
    |   2 |   TABLE ACCESS BY INDEX ROWID       | T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
    |*  3 |    INDEX RANGE SCAN                 | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
    |   4 |   COLLECTION ITERATOR SUBQUERY FETCH|       |      1 |   8168 |      0 |00:00:00.01 |       0 |
    |*  5 |    FILTER                           |       |      1 |        |      0 |00:00:00.01 |       0 |
    |   6 |     TABLE ACCESS BY INDEX ROWID     | T2    |      0 |      1 |      0 |00:00:00.01 |       0 |
    |*  7 |      INDEX RANGE SCAN               | T2_I1 |      0 |      1 |      0 |00:00:00.01 |       0 |
    -------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("T1"."ID"=15)
       5 - filter((:B1=7 OR :B2=11 OR :B3=13))
       7 - access("T2"."ID"=:B1)

    Comment by Timur Akhmadeev — January 31, 2011 @ 8:26 pm GMT Jan 31,2011 | Reply

    • Timur,

      Very cunning – but you only get half marks for that one because it is still using a collection operator, and (in general) you would have to create both a scalar and a table type to support each subquery you wanted to use, and it’s got the 8,168 (or blocksize – 24) cardinality issue to address.

      Comment by Jonathan Lewis — January 31, 2011 @ 10:23 pm GMT Jan 31,2011 | Reply

  2. I personaly prefer the ANSI syntax but always try to avoid “ON clauses that are NOT join conditions”. I’m sure you are not the only one that “had to think about it carefully before figuring out what it was trying to achieve”. It is the same with me, each and every time I see such thing.

    I found a way to write such queries that is very clear and understandable to me. We just need to understand that the “ON clauses that are NOT join conditions” are actually “WHERE clauses” for the table being outer joined. This is visible in the predicate of line 7 in your very first query:

    7 - access("T2"."ID"="T1"."N1")
        filter(("T2"."ID"=7 OR "T2"."ID"=11 OR "T2"."ID"=13))
    

    So we can rew rewrite the query using this understanding. Here is an example with Oracle syntax (but the same can be done with ANSI):

    
    select
        t1.id,
        t1.n1,
        t1.v1,
        t2.n1
    from
        t1,
        (
          select
              t2.id,
              t2.n1
          from
              t2
          where
              t2.id in (7, 11, 13)
         ) t2
    where
        t1.id     = 15
    and t2.id (+) = t1.n1;
    
    
            ID         N1 V1                 N1
    ---------- ---------- ---------- ----------
            15         15 0000000015
    
    

    Comment by Todor Botev — February 2, 2011 @ 1:27 am GMT Feb 2,2011 | Reply

    • Todor,

      Nicely done, but this is a special case where transitive closure allows you to move a predicate on t1 to become a predicate on t2. Can you extend your method to a query like:

      select  
          t1.id,  
          t1.n1,  
          t1.v1,  
          t2.n1  
      from  
          t1  
      left join  
          t2  
      on  
          t2.id = t1.n1  
      and T1.COLX in (7, 11, 13)  -- different t1 column (yes I know it wasn't in the original definition)
      where  
          t1.id = 15  
      ;  
      
      

      Comment by Jonathan Lewis — February 2, 2011 @ 1:18 pm GMT Feb 2,2011 | Reply

      • Yes, you are right – the solution cannot be applied in a more general case.

        Comment by Todor Botev — February 2, 2011 @ 1:53 pm GMT Feb 2,2011 | Reply

      • Let me try.

        alter table t1 add colx number;
        update t1 set colx = n1;
        commit;
        

        When select statement is written like this

        select
            t1.id,
            t1.n1,
            t1.v1,
            t2.n1
        from
            t1
        ,
            t2
        where
            t1.id = 15
        and t2.id(+) = t1.n1
        and T1.COLX in (7, 11, 13)
        

        then Oracle treats “T1.COLX in (7, 11, 13)” as post-join predicate and we don’t achieve desirable recordset.

        So the main point is: write condition “T1.COLX in (7, 11, 13)” so that Oracle treats it as pre-join predicate:

        SQL> select
          2      t1.id,
          3      t1.n1,
          4      t1.v1,
          5      t2.n1
          6  from
          7      t1
          8  ,
          9      t2
         10  where
         11      t1.id = 15
         12  and t2.id(+) = t1.n1
         13  and decode(T1.COLX,7,0,11,0,13,0)=nvl2(t2.id(+),0,0);
        
                ID         N1 V1                 N1
        ---------- ---------- ---------- ----------
                15         15 0000000015
        
        SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
        
        PLAN_TABLE_OUTPUT
        -------------------
        SQL_ID  du7ky61ubqxxv, child number 0
        -------------------------------------
        select     t1.id,     t1.n1,     t1.v1,     t2.n1 from     t1 ,     t2
        where     t1.id = 15 and t2.id(+) = t1.n1 and
        decode(T1.COLX,7,0,11,0,13,0)=nvl2(t2.id(+),0,0)
        
        Plan hash value: 3024481811
        
        ------------------------------------------------------------------------------------------------
        | Id  | Operation                    | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
        ------------------------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT             |       |      1 |        |      1 |00:00:00.01 |       6 |
        |   1 |  NESTED LOOPS OUTER          |       |      1 |      1 |      1 |00:00:00.01 |       6 |
        |   2 |   TABLE ACCESS BY INDEX ROWID| T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
        |*  3 |    INDEX RANGE SCAN          | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
        |   4 |   TABLE ACCESS BY INDEX ROWID| T2    |      1 |      1 |      0 |00:00:00.01 |       2 |
        |*  5 |    INDEX RANGE SCAN          | T2_I1 |      1 |      1 |      0 |00:00:00.01 |       2 |
        ------------------------------------------------------------------------------------------------
        
        Predicate Information (identified by operation id):
        ---------------------------------------------------
        
           3 - access("T1"."ID"=15)
           5 - access("T2"."ID"="T1"."N1")
               filter(DECODE("T1"."COLX",7,0,11,0,13,0)=NVL2("T2"."ID",0,0))
        
        
        26 rows selected.
        

        Of course, the plan is little bit different than yours with “T1.COLX in (7, 11, 13)” condition. But we can see both filter and access predicates. :-)

        Comment by sqlmdx — July 7, 2011 @ 6:18 pm BST Jul 7,2011 | Reply

      • And another one approach. Just for fun.

        select
            /*+ gather_plan_statistics */
            t1.id,
            t1.n1,
            t1.v1,
            t2.column_value n1
        from
            t1,
            table (
                (
                select
                    collect(t2.n1)
                from
                    t2
                where
                    t1.colx in (7, 11, 13)
                and t2.id = t1.n1
                )
            )(+) t2
        where
            t1.id = 15
        ;
        

        This approach has the same issues as Timur’s one (shown above).
        PS. Sorry, but I can’t figure out how to format code in your blog.

        Comment by sqlmdx — July 7, 2011 @ 6:23 pm BST Jul 7,2011 | Reply

  3. usefulness(“ANSI SQL”, “Oracle”) = usefulness(“Shakespeare’s English”, “battlefield”)

    The optimizer “had to think” about it. Doesn’t that tell you something.

    Comment by Alex Nedoboi — February 2, 2011 @ 12:54 pm GMT Feb 2,2011 | Reply

  4. Correct if I’m wrong. But I have heard that the CBO cannot optimize a query that uses an ANSI Outer Join syntax. The CBO must convert it to Oracle’s outer join syntax before the query can be optimized. Is that correct?

    Comment by Jimmy — February 2, 2011 @ 4:10 pm GMT Feb 2,2011 | Reply

    • Jimmy,

      Essentially the optimizer will transform almost any SQL to suit its purposes; ANSI is only slightly special in this respect; but you are right that in effect Oracle transforms from ANSI style to Oracle style before optimising in most cases. As far as I know there are only two classes of ANSI SQL that have a “native optimization” code path – the full outer join and the partitioned outer join.

      Comment by Jonathan Lewis — February 2, 2011 @ 7:44 pm GMT Feb 2,2011 | Reply

  5. […] metadata to the table by partition identifier – except you would probably need to use a laterval view, which Oracle doesn’t support, and make the partition extended syntax part of the lateral […]

    Pingback by Partitioned Bitmaps « Oracle Scratchpad — July 1, 2011 @ 5:21 pm BST Jul 1,2011 | Reply

  6. […] table’s metadata to the table by partition identifier – except you would probably need to use a laterval view, which Oracle doesn’t support, and make the partition extended syntax part of the lateral […]

    Pingback by Partitioned Bitmaps « Ukrainian Oracle User Group — July 4, 2011 @ 9:49 pm BST Jul 4,2011 | Reply

  7. I use to think in rowsets, that is to say, reduce any complex SQL to its sets of rowset over which I apply operators in order to produce the desire output. This leads to the usage of ‘with v as …, v1 as ()… ‘ and piling of ‘select … from ( ) ‘. Thought less elegant, it lis a faster delivery method of complex SQL where would-be-lateral are just another rowset to produce and integrate. The downside is rather the optimizer behaviour when the layers piles, however this is balanced by the fact that rowset-building-approach allows a gradual build-and-tune of the query while you are integrating rowset after rowset.

    Comment by Bernard Polarski — July 13, 2011 @ 7:15 am BST Jul 13,2011 | Reply

  8. […] solution and followed by window functions, scalar subqueries, and lateral derived tables. Note that event 22829 needs to be set before using lateral derived tables; thanks to Jonathan Lewis for that […]

    Pingback by Part 1: The Hitchhiker’s Guide to SQL: Lateral derived tables and other alternatives to GROUP BY « So Many Oracle Manuals, So Little Time — April 16, 2012 @ 1:36 am BST Apr 16,2012 | Reply

  9. […] instead of an index fast full scan. (You’ll need to set event 22829 – as shown in this posting – to check the […]

    Pingback by ANSI Outer 2 « Oracle Scratchpad — July 16, 2012 @ 4:55 pm BST Jul 16,2012 | Reply

  10. […] 4 and 15. If you tried to run this SQL from the command line (after fiddling the event to enable lateral() views), or even just the simple select running from lines 13 to 26, you would get error ORA-00923: […]

    Pingback by ROWID | Oracle Scratchpad — June 4, 2013 @ 9:09 am BST Jun 4,2013 | Reply

  11. […] in 11g (at least) and could even be used by the end user by setting some event, as described here by Jonathan […]

    Pingback by DB Oriented — August 10, 2013 @ 4:25 pm BST Aug 10,2013 | Reply

  12. […] Lateral inline views היו בשימוש פנימי על ידי ה-optimizer כבר בגרסה 11g (לפחות), ואף היו ניתנים לשימוש על ידי משתמשי הקצה ע"י הדלקת event מסויים, כפי שמתואר כאן. […]

    Pingback by DB Oriented — August 17, 2013 @ 9:15 pm BST Aug 17,2013 | Reply

  13. just a small and late addition: in 12c the lateral() operator is available for user queries: http://docs.oracle.com/cd/E16655_01/server.121/e17209/statements_10002.htm#SQLRF55632.

    Comment by Martin Preiss — May 23, 2014 @ 7:51 am BST May 23,2014 | Reply

  14. […] Reading on ANSI: Oracle Optimizer Blog Jonathan Lewis on ANSI Outer Jonathan Lewis on […]

    Pingback by Outer Join with OR and Lateral View Decorrelation | OraStory — July 6, 2016 @ 5:33 pm BST Jul 6,2016 | Reply

  15. […] Whilst this is generally not true and performance is equivalent, there are edge cases where it is true, mainly related to outer joins. The more recent our version of Oracle the fewer and more edgy those cases become. It’s not an argument for using the older syntax, just something to bear in mind. Find out more. […]

    Pingback by 1z0-071 oracle exam question which answers is correct? – SQL — February 4, 2021 @ 10:12 am GMT Feb 4,2021 | Reply

  16. […] “for each row do two high-precision lookups”. I implemented this by joining two lateral views of the lookup table to the measures table. Since I was sitting in front of a copy of 11.2.0.4 at […]

    Pingback by Case Study | Oracle Scratchpad — April 5, 2021 @ 3:36 pm BST Apr 5,2021 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.