Here’s an example of ANSI SQL that does something in a fashion that arguably looks a little tidier than the strategy you have to adopt in Oracle. As so often when I compare Oracle syntax and ANSI syntax it’s an example that relates to an outer join. We start with two tables – as usual I have locally managed tablespaces, 8KB blocks, 1MB uniform extents and freelist management. I’ve also disabled system statistics (CPU costing):
rem rem Script: ansi_outer_selective.sql rem Author: Jonathan Lewis rem Dated: Jan 2011 rem create table t1 as select rownum - 1 id, mod(rownum - 1,20) n1, lpad(rownum - 1,10,'0') v1, rpad('x',100) padding from all_objects where rownum <= 4000 -- > comment to avoid WordPress format issue ; create table t2 as select rownum - 1 id, mod(rownum - 1,20) n1, lpad(rownum - 1,10,'0') v1, rpad('x',100) padding from all_objects where rownum <= 4000 -- > comment to avoid WordPress format issue ; create index t1_i1 on t1(id); create index t2_i1 on t2(id); begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', estimate_percent => 100, method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'T2', estimate_percent => 100, method_opt => 'for all columns size 1' ); end; /
If you’re familiar with ANSI SQL you won’t need more than a couple of moments to interpret the following query – but I have to admit that I had to think about it carefully before I figured out what it was trying to achieve:
select /*+ gather_plan_statistics */ t1.id, t1.n1, t1.v1, t2.n1 from t1 left join t2 on t2.id = t1.n1 and t1.n1 in (7, 11, 13) where t1.id = 15 ; ID N1 V1 N1 ---------- ---------- ---------- ---------- 15 15 0000000015 -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | -------------------------------------------------------------------------------------------------- | 1 | NESTED LOOPS OUTER | | 1 | 1 | 1 |00:00:00.01 | 4 | | 2 | TABLE ACCESS BY INDEX ROWID | T1 | 1 | 1 | 1 |00:00:00.01 | 4 | |* 3 | INDEX RANGE SCAN | T1_I1 | 1 | 1 | 1 |00:00:00.01 | 3 | | 4 | VIEW | | 1 | 1 | 0 |00:00:00.01 | 0 | |* 5 | FILTER | | 1 | | 0 |00:00:00.01 | 0 | | 6 | TABLE ACCESS BY INDEX ROWID| T2 | 0 | 1 | 0 |00:00:00.01 | 0 | |* 7 | INDEX RANGE SCAN | T2_I1 | 0 | 1 | 0 |00:00:00.01 | 0 | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("T1"."ID"=15) 5 - filter(("T1"."N1"=7 OR "T1"."N1"=11 OR "T1"."N1"=13)) 7 - access("T2"."ID"="T1"."N1") filter(("T2"."ID"=7 OR "T2"."ID"=11 OR "T2"."ID"=13))
Look carefully at the query. The ON clause includes a reference to the t1 table that is NOT a join condition. This condition means that Oracle will try to find a matching row in t2 only if the n1 value in t1 is in 7,11, or 13.
Since the join is left outer, any t1 row where id = 15 will be returned, but the n1 column from t2 will be reported only if the t1.n1 value is in 7,11, or 13 and there is a t2 row with a matching id value.
How would you express the same requirment in standard Oracle syntax ? Here’s one possibility:
select /*+ gather_plan_statistics */ t1.id, t1.n1, t1.v1, t2.n1 from t1, t2 where t1.id = 15 and t2.id(+) = case when t1.n1 not in (7, 11, 13) then null else t1.n1 end ; ID N1 V1 N1 ---------- ---------- ---------- ---------- 15 15 0000000015 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------------------ | 1 | NESTED LOOPS OUTER | | 1 | 1 | 1 |00:00:00.01 | 4 | | 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 4 | |* 3 | INDEX RANGE SCAN | T1_I1 | 1 | 1 | 1 |00:00:00.01 | 3 | | 4 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 1 | 0 |00:00:00.01 | 0 | |* 5 | INDEX RANGE SCAN | T2_I1 | 1 | 1 | 0 |00:00:00.01 | 0 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("T1"."ID"=15) 5 - access("T2"."ID"=CASE WHEN (("T1"."N1"<>7) AND ("T1"."N1"<>11) AND ("T1"."N1"<>13)) THEN NULL ELSE "T1"."N1" END )
You will have noticed that I used the /*+ gather_plan_statistics */ hint in both queries, and the plans I’ve shown are the ones that I pulled from memory with their last execution statistics included.
The plans are clearly just a little different from each other – but are they equally efficient ?
Both plans start the same way – for each relevant row in t1 they call line 4 – and in both cases there is only one relevant row in t1, hence one call to line 4.
In the first plan we call the subroutine to create a rowsource (the VIEW operator), and this calls the FILTER operation. The filter operation is an example of a “conditional” filter – i.e. if the test in line 5 is true then line 6 is called – and in this case the test is false so line 6 is never called (Starts = 0) and Oracle doesn’t try to access table t2. So we fetch one row from t1, then call two subroutines that between them do a “case” test but access no data blocks.
In the second plan line 4 calls the table access operation, which calls the index range scan operation in line 5 – but line 5 starts with a call to the case statement that returns NULL – so even though we call the index range scan operation, we don’t access any data blocks, which means we don’t pass any rowids to the table access in line 4, which means that that operation doesn’t access any data blocks. The net workload is that lines 4 and 5 in the second plan also represent nothing more than two subroutine calls and a “case” test.
The two plans are virtually identical in resource usage – so your choice of which to use probably comes down to aesthetic appeal and comfort factor. Thanks to my previous habits I think the Oracle variant is “obviously” much easier to understand – but the client probably felt the opposite because they spent most of their time writing ANSI.
Is there another way to write the query – yes, there is, but for some reason it’s not supported. If you look at the 10053 trace file for the ANSI example you’ll find that Oracle has rewritten it with a LATERAL subquery before optimising it. So, presumably, we can appply the same transformation manually:
select /*+ gather_plan_statistics */ t1.id, t1.n1, t1.v1, t2.n1 from t1, lateral ( ( select t2.n1 from t2 where t1.n1 in (7, 11, 13) and t2.id = t1.n1 ) )(+) t2 where t1.id = 15 ; lateral ( * ERROR at line 9: ORA-00933: SQL command not properly ended
On second thoughts perhaps we can’t – but it was a nice idea.
Oracle uses the table() operator with collections (or pipelined functions) as the equivalent of the ANSI LATERAL() operator, but doesn’t let you use it with queries (you get Oracle error “ORA-22905: cannot access rows from a non-nested table item” if you try).
The concept is simple – the table() or lateral() operator allows you to introduce a subquery in the FROM clause that references columns from a table that appears earlier in the same FROM clause.
I don’t know why Oracle doesn’t support the lateral() operator in end-user code – but actually, for experimental purposes only of course, if you want to play with it you can always set event 22829:
alter session set events '22829 trace name context forever'; -- execute lateral query, and get this plan -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | -------------------------------------------------------------------------------------------------- | 1 | NESTED LOOPS OUTER | | 1 | 1 | 1 |00:00:00.01 | 4 | | 2 | TABLE ACCESS BY INDEX ROWID | T1 | 1 | 1 | 1 |00:00:00.01 | 4 | |* 3 | INDEX RANGE SCAN | T1_I1 | 1 | 1 | 1 |00:00:00.01 | 3 | | 4 | VIEW | | 1 | 1 | 0 |00:00:00.01 | 0 | |* 5 | FILTER | | 1 | | 0 |00:00:00.01 | 0 | | 6 | TABLE ACCESS BY INDEX ROWID| T2 | 0 | 1 | 0 |00:00:00.01 | 0 | |* 7 | INDEX RANGE SCAN | T2_I1 | 0 | 1 | 0 |00:00:00.01 | 0 | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("T1"."ID"=15) 5 - filter(("T1"."N1"=7 OR "T1"."N1"=11 OR "T1"."N1"=13)) 7 - access("T2"."ID"="T1"."N1") filter(("T2"."ID"=7 OR "T2"."ID"=11 OR "T2"."ID"=13))
The plan is identical to the plan for the ANSI after transformation. I’ll leave it to you to decide whether the code is easier to understand than the ANSI or ordinary Oracle versions – but I’d like to see it made legal, even if I didn’t find many cases where I needed it.
Jonathan,
it’s doable:
Comment by Timur Akhmadeev — January 31, 2011 @ 8:26 pm GMT Jan 31,2011 |
Timur,
Very cunning – but you only get half marks for that one because it is still using a collection operator, and (in general) you would have to create both a scalar and a table type to support each subquery you wanted to use, and it’s got the 8,168 (or blocksize – 24) cardinality issue to address.
Comment by Jonathan Lewis — January 31, 2011 @ 10:23 pm GMT Jan 31,2011 |
I personaly prefer the ANSI syntax but always try to avoid “ON clauses that are NOT join conditions”. I’m sure you are not the only one that “had to think about it carefully before figuring out what it was trying to achieve”. It is the same with me, each and every time I see such thing.
I found a way to write such queries that is very clear and understandable to me. We just need to understand that the “ON clauses that are NOT join conditions” are actually “WHERE clauses” for the table being outer joined. This is visible in the predicate of line 7 in your very first query:
So we can rew rewrite the query using this understanding. Here is an example with Oracle syntax (but the same can be done with ANSI):
Comment by Todor Botev — February 2, 2011 @ 1:27 am GMT Feb 2,2011 |
Todor,
Nicely done, but this is a special case where transitive closure allows you to move a predicate on t1 to become a predicate on t2. Can you extend your method to a query like:
Comment by Jonathan Lewis — February 2, 2011 @ 1:18 pm GMT Feb 2,2011 |
Yes, you are right – the solution cannot be applied in a more general case.
Comment by Todor Botev — February 2, 2011 @ 1:53 pm GMT Feb 2,2011 |
Let me try.
When select statement is written like this
then Oracle treats “T1.COLX in (7, 11, 13)” as post-join predicate and we don’t achieve desirable recordset.
So the main point is: write condition “T1.COLX in (7, 11, 13)” so that Oracle treats it as pre-join predicate:
Of course, the plan is little bit different than yours with “T1.COLX in (7, 11, 13)” condition. But we can see both filter and access predicates. :-)
Comment by sqlmdx — July 7, 2011 @ 6:18 pm BST Jul 7,2011 |
And another one approach. Just for fun.
This approach has the same issues as Timur’s one (shown above).
PS. Sorry, but I can’t figure out how to format code in your blog.
Comment by sqlmdx — July 7, 2011 @ 6:23 pm BST Jul 7,2011 |
usefulness(“ANSI SQL”, “Oracle”) = usefulness(“Shakespeare’s English”, “battlefield”)
The optimizer “had to think” about it. Doesn’t that tell you something.
Comment by Alex Nedoboi — February 2, 2011 @ 12:54 pm GMT Feb 2,2011 |
Correct if I’m wrong. But I have heard that the CBO cannot optimize a query that uses an ANSI Outer Join syntax. The CBO must convert it to Oracle’s outer join syntax before the query can be optimized. Is that correct?
Comment by Jimmy — February 2, 2011 @ 4:10 pm GMT Feb 2,2011 |
Jimmy,
Essentially the optimizer will transform almost any SQL to suit its purposes; ANSI is only slightly special in this respect; but you are right that in effect Oracle transforms from ANSI style to Oracle style before optimising in most cases. As far as I know there are only two classes of ANSI SQL that have a “native optimization” code path – the full outer join and the partitioned outer join.
Comment by Jonathan Lewis — February 2, 2011 @ 7:44 pm GMT Feb 2,2011 |
[…] metadata to the table by partition identifier – except you would probably need to use a laterval view, which Oracle doesn’t support, and make the partition extended syntax part of the lateral […]
Pingback by Partitioned Bitmaps « Oracle Scratchpad — July 1, 2011 @ 5:21 pm BST Jul 1,2011 |
[…] table’s metadata to the table by partition identifier – except you would probably need to use a laterval view, which Oracle doesn’t support, and make the partition extended syntax part of the lateral […]
Pingback by Partitioned Bitmaps « Ukrainian Oracle User Group — July 4, 2011 @ 9:49 pm BST Jul 4,2011 |
I use to think in rowsets, that is to say, reduce any complex SQL to its sets of rowset over which I apply operators in order to produce the desire output. This leads to the usage of ‘with v as …, v1 as ()… ‘ and piling of ‘select … from ( ) ‘. Thought less elegant, it lis a faster delivery method of complex SQL where would-be-lateral are just another rowset to produce and integrate. The downside is rather the optimizer behaviour when the layers piles, however this is balanced by the fact that rowset-building-approach allows a gradual build-and-tune of the query while you are integrating rowset after rowset.
Comment by Bernard Polarski — July 13, 2011 @ 7:15 am BST Jul 13,2011 |
[…] solution and followed by window functions, scalar subqueries, and lateral derived tables. Note that event 22829 needs to be set before using lateral derived tables; thanks to Jonathan Lewis for that […]
Pingback by Part 1: The Hitchhiker’s Guide to SQL: Lateral derived tables and other alternatives to GROUP BY « So Many Oracle Manuals, So Little Time — April 16, 2012 @ 1:36 am BST Apr 16,2012 |
[…] instead of an index fast full scan. (You’ll need to set event 22829 – as shown in this posting – to check the […]
Pingback by ANSI Outer 2 « Oracle Scratchpad — July 16, 2012 @ 4:55 pm BST Jul 16,2012 |
[…] 4 and 15. If you tried to run this SQL from the command line (after fiddling the event to enable lateral() views), or even just the simple select running from lines 13 to 26, you would get error ORA-00923: […]
Pingback by ROWID | Oracle Scratchpad — June 4, 2013 @ 9:09 am BST Jun 4,2013 |
[…] in 11g (at least) and could even be used by the end user by setting some event, as described here by Jonathan […]
Pingback by DB Oriented — August 10, 2013 @ 4:25 pm BST Aug 10,2013 |
[…] Lateral inline views היו בשימוש פנימי על ידי ה-optimizer כבר בגרסה 11g (לפחות), ואף היו ניתנים לשימוש על ידי משתמשי הקצה ע"י הדלקת event מסויים, כפי שמתואר כאן. […]
Pingback by DB Oriented — August 17, 2013 @ 9:15 pm BST Aug 17,2013 |
just a small and late addition: in 12c the lateral() operator is available for user queries: http://docs.oracle.com/cd/E16655_01/server.121/e17209/statements_10002.htm#SQLRF55632.
Comment by Martin Preiss — May 23, 2014 @ 7:51 am BST May 23,2014 |
Martin,
Thanks for supplying that piece of information.
Comment by Jonathan Lewis — May 23, 2014 @ 8:03 am BST May 23,2014 |
[…] Reading on ANSI: Oracle Optimizer Blog Jonathan Lewis on ANSI Outer Jonathan Lewis on […]
Pingback by Outer Join with OR and Lateral View Decorrelation | OraStory — July 6, 2016 @ 5:33 pm BST Jul 6,2016 |
[…] Whilst this is generally not true and performance is equivalent, there are edge cases where it is true, mainly related to outer joins. The more recent our version of Oracle the fewer and more edgy those cases become. It’s not an argument for using the older syntax, just something to bear in mind. Find out more. […]
Pingback by 1z0-071 oracle exam question which answers is correct? – SQL — February 4, 2021 @ 10:12 am GMT Feb 4,2021 |
[…] “for each row do two high-precision lookups”. I implemented this by joining two lateral views of the lookup table to the measures table. Since I was sitting in front of a copy of 11.2.0.4 at […]
Pingback by Case Study | Oracle Scratchpad — April 5, 2021 @ 3:36 pm BST Apr 5,2021 |