Oracle Scratchpad

November 19, 2018

Table order

Filed under: ANSI Standard,Execution plans,Oracle,Tuning — Jonathan Lewis @ 1:30 pm GMT Nov 19,2018

Over the last few days I’ve highlighted on Twitter a couple of older posts showing how a change in the order that tables appear in the from clause could affect the execution plan of a query. In one case the note was purely theoretical describing a feature of the way the optimizer works with simple query blocks, in the other case the note was about an anomaly with table elimination that could appear with both “ANSI” and “traditional” Oracle syntax.

Here’s another note that might be more generally useful – an example of an odd side effect of ordering and “ANSI” syntax, with a suggestion for a pattern for writing ANSI SQL. It’s based on a test I wrote to play around with a problem that showed up on the Oracle database forum more than six years ago and shows a strange inconsistency. The setup is a little long-winded as the example involves 4 tables, so I’ll leave the script to create, load and index the tables to the end of the note. Here’s the query that introduced the problem; it’s a fairly straightforward 4 table join with two (left) outer joins:


select
        episode.episode_id , episode.cross_ref_id , episode.date_required ,
        product.number_required,
        request.site_id
from
        episode
left join
        request
on      episode.cross_ref_id = request.cross_ref_id
join
        product
ON      episode.episode_id = product.episode_id
left join
        product_sub_type
ON      product.prod_sub_type_id = product_sub_type.prod_sub_type_id
where
        episode.department_id = 2
and     product.status = 'I'
order by
        episode.date_required
;

And here’s the execution plan:


----------------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         | 33333 |  1725K|       | 17135   (4)| 00:00:01 |
|   1 |  SORT ORDER BY       |         | 33333 |  1725K|  2112K| 17135   (4)| 00:00:01 |
|*  2 |   HASH JOIN OUTER    |         | 33333 |  1725K|  1632K| 16742   (4)| 00:00:01 |
|*  3 |    HASH JOIN         |         | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| PRODUCT | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| EPISODE |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    TABLE ACCESS FULL | REQUEST |  4000K|    57M|       | 13542   (3)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EPISODE"."CROSS_REF_ID"="REQUEST"."CROSS_REF_ID"(+))
   3 - access("EPISODE"."EPISODE_ID"="PRODUCT"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

The first thing you’ll notice, of course, is that the plan reports a three table join. Thanks to various referential integrity constraints, the absence of the table in the final select list, and the nature of the join to that table, the optimizer has determined that the product_sub_type table could be eliminated from the join without changing the result set.

What you can’t tell from the plan is that there’s an index on the request table that holds all the columns needed to satisfy the query, and an index fast full scan on the index would be significantly more efficient than the tablescan that appears at operation 6.

Having noticed from the plan that product_sub_type is redundant, the obvious thing to do before investigating further is to rewrite the statement to remove the table . Here’s the resulting query, with execution plan:

----------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 33333 |  1725K|       |  5525   (6)| 00:00:01 |
|   1 |  SORT ORDER BY         |             | 33333 |  1725K|  2112K|  5525   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |             | 33333 |  1725K|  1632K|  5132   (7)| 00:00:01 |
|*  3 |    HASH JOIN           |             | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | PRODUCT     | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL  | EPISODE     |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    INDEX FAST FULL SCAN| IX4_REQUEST |  4000K|    57M|       |  1932   (7)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EPISODE"."CROSS_REF_ID"="REQUEST"."CROSS_REF_ID"(+))
   3 - access("EPISODE"."EPISODE_ID"="PRODUCT"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

So – when the optimizer removes the product_sub_type from the query the plan reports a tablescan of request, when we remove product_sub_type the plan reports an index fast full scan of an appropriate index – which appears to be roughly one seventh (1,932/13,542) of the size of the table. It’s a little surprising that the optimizer didn’t get it right by itself – but “ANSI” style SQL often displays quirky little side effects because of the way the optimizer transforms it into traditional Oracle style.

We could stop at that point, of course, but then you’d wonder about the significance of the title of the post. So let’s play around with the join order of the original query, without removing the product_sub_type table.

As a general strategy (though not an absolute rule) I tend to arrange code so that outer joins don’t appear before “inner” joins. In this example that means I would have written the original statement as follows:


select
        episode.episode_id, episode.cross_ref_id, episode.date_required,
        product.number_required,
        request.site_id
from
        episode
join
        product
ON      product.episode_id = episode.episode_id
left join
        product_sub_type
ON      product_sub_type.prod_sub_type_id = product.prod_sub_type_id
left join
        request
on      request.cross_ref_id = episode.cross_ref_id
where
        episode.department_id = 2
and     product.status        = 'I'
order by
        episode.date_required
;

All I’ve done is move the join between episode and product up the SQL, following it with the outer join to product_sub_type, finally closing with the outer join between episode and request. Here’s the execution plan – which you might expect to look exactly like the original plan:


----------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 33333 |  1725K|       |  5525   (6)| 00:00:01 |
|   1 |  SORT ORDER BY         |             | 33333 |  1725K|  2112K|  5525   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |             | 33333 |  1725K|  1632K|  5132   (7)| 00:00:01 |
|*  3 |    HASH JOIN           |             | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | PRODUCT     | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL  | EPISODE     |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    INDEX FAST FULL SCAN| IX4_REQUEST |  4000K|    57M|       |  1932   (7)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("REQUEST"."CROSS_REF_ID"(+)="EPISODE"."CROSS_REF_ID")
   3 - access("PRODUCT"."EPISODE_ID"="EPISODE"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

The product_sub_type table has been eliminated and we’re doing an index fast full scan of the ix4_request index instead of a tablescan of the much larger request table.

tl;dr

Changing the order of the tables in an ANSI join – especially when there are outer joins involved – could make a significant difference to the way the query is transformed and optimised. While it is nice to write the table ordering so that “chains” of joins are easily visible, bear in mind that re-ordering the join to postpone outer joins may be enough to help the optimizer produce a better execution plan.

Footnote

If you want to play around with the example, here’s the code to create and load the tables. The code doesn’t follow my usual style as most of it is cut-n-pasted from the Oracle forum thread:


rem
rem     script:         Ansi_outer_5.sql
rem     Dated:          July 2012
rem     Author:         Jonathan Lewis
rem
rem     Last tested
rem             18.3.0.0        iffs still not used by default
rem             12.2.0.1        iffs still not used by default
rem

create table episode (
        episode_id number (*,0),
        department_id number (*,0),
        date_required date,
        cross_ref_id varchar2 (11),
        padding varchar2 (80),
        constraint pk_episode primary key (episode_id)
)
;

create table product_sub_type (
        prod_sub_type_id number (*,0),
        sub_type_name varchar2 (20),
        units varchar2 (20),
        padding varchar2 (80),
        constraint pk_product_sub_type primary key (prod_sub_type_id)
)
;

create table product (
        product_id number (*,0),
        prod_type_id number (*,0),
        prod_sub_type_id number (*,0),
        episode_id number (*,0),
        status varchar2 (1),
        number_required number (*,0),
        padding varchar2 (80),
        constraint pk_product primary key (product_id),
        constraint nn_product_episode check (episode_id is not null) 
)
;

alter table product add constraint fk_product 
        foreign key (episode_id) references episode (episode_id)
;

alter table product add constraint fk_prod_sub_type
        foreign key (prod_sub_type_id) references product_sub_type (prod_sub_type_id)
;

create table request (
        request_id number (*,0),
        department_id number (*,0),
        site_id number (*,0),
        cross_ref_id varchar2 (11),
        padding varchar2 (80),
        padding2 varchar2 (80),
        constraint pk_request primary key (request_id),
        constraint nn_request_department check (department_id is not null),
        constraint nn_request_site_id check (site_id is not null)
)
;

prompt  ===================
prompt  Loading episode ...
prompt  ===================

insert /*+ append */ into episode
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, 2,
    sysdate + mod (r, 14),
    to_char (r, '0000000000'),
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ' || to_char (r, '000000')
  from generator g
where g.r <= 3e5
/ 

commit;

prompt  ============================
prompt  Loading product_sub_type ...
prompt  ============================

insert /*+ append */ into product_sub_type
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, 
       to_char (r, '000000'),
       to_char (mod (r, 3), '000000'),
       'ABCDE' || to_char (r, '000000')
  from generator g
where g.r <= 15
/ 

commit;

prompt  ===================
prompt  Loading product ...
prompt  ===================

insert /*+ append */ into product
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, mod (r, 12) + 1, mod (r, 15) + 1, mod (r, 300000) + 1,
       decode (mod (r, 3), 0, 'I', 1, 'C', 2, 'X', 'U'),
       dbms_random.value (1, 100), NULL
  from generator g
where g.r <= 1e5
/ 

commit;

prompt  ===================
prompt  Loading request ...
prompt  ===================

insert /*+ append */ into request
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e7
       ) 
select 
        r, mod (r, 4) + 1, 1, to_char (r, '0000000000'),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234567890123456789' || to_char (r, '000000'),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789012345678' || to_char (r, '000000')
  from generator g
where g.r <= 4e6
/ 

commit;

create index ix1_episode_cross_ref on episode (cross_ref_id);

create index ix1_product_episode on product (episode_id);
create index ix2_product_type on product (prod_type_id);

create index ix1_request_site on request (site_id);
create index ix2_request_dept on request (department_id);
create index ix3_request_cross_ref on request (cross_ref_id);
create index ix4_request on request (cross_ref_id, site_id);

exec dbms_stats.gather_schema_stats ('test_user')

Note that there is a call to gather_schema_stats() at the end, rather than a set of 4 calls to gather_table_stats(); you may want to change this. The entire data set, including indexes, will need about 1.5GB of free space.

 

January 4, 2016

ANSI bug

Filed under: ANSI Standard,Bugs,Oracle — Jonathan Lewis @ 1:12 pm GMT Jan 4,2016

In almost all cases the SQL you write using the ANSI (SQL-92) standard syntax is tranformed into a statement using Oracle’s original syntax before being optimised – and there are still odd cases where the translation is not ideal.  This can result in poor performance, it can result in wrong results. The following examples arrived in my in-tray a couple of weeks ago:

with
    table1 as ( select 1 my_number from dual ),
    table2 as ( select 1 my_number from dual )
select *
    from (
        select sum(table3.table2.my_number) the_answer
            from table1
            left join table2 on table1.my_number = table2.my_number
            group by table1.my_number
        );


with
    table1 as ( select 1 my_number from dual ),
    table2 as ( select 1 my_number from dual )
select sum(table3.table2.my_number) the_answer
    from table1
    left join table2 on table1.my_number = table2.my_number
    group by table1.my_number;

Notice the reference to table3.table2.my_number in the select list of both queries – where does the “table3” bit come from ? These queries should result in Oracle error ORA-00904: “TABLE3″.”TABLE2″.”MY_NUMBER”: invalid identifier.

If you’re running 11.2.0.4 (and, probably, earlier versions) both queries produce the following result:


THE_ANSWER
----------
         1

1 row selected.

If you’re running 12.1.0.2 the first query produces the ORA-00904 error that it should do, but the second query still survives to produce the same result as 11.2.0.4.

Update (8th July 2016)

I’ve just seen a reference to this blog note on the OTN database forum that prompted me to point out that the problem doesn’t occur if you switch to traditional Oracle syntax – the expected Oracle error message appears in 11.2.0.4:

SQL> ed temp

with
    table1 as ( select 1 my_number from dual ),
    table2 as ( select 1 my_number from dual )
select *
    from (
        select sum(table3.table2.my_number) the_answer
/*
            from table1
            left join table2 on table1.my_number = table2.my_number
*/
            from table1, table2
            where table2.my_number(+) = table1.my_number
            group by table1.my_number
        )
 15  ;
        select sum(table3.table2.my_number) the_answer
                   *
ERROR at line 6:
ORA-00904: "TABLE3"."TABLE2"."MY_NUMBER": invalid identifier


SQL> ed temp3

with
    table1 as ( select 1 my_number from dual ),
    table2 as ( select 1 my_number from dual )
select *
    from (
        select sum(table3.table2.my_number) the_answer
--            from table1
--            left join table2 on table1.my_number = table2.my_number
            from table1, table2
            where table2.my_number(+) = table1.my_number
            group by table1.my_number
        )
 13  ;
        select sum(table3.table2.my_number) the_answer
                   *
ERROR at line 6:
ORA-00904: "TABLE3"."TABLE2"."MY_NUMBER": invalid identifier

July 22, 2015

Invalidation

Filed under: ANSI Standard,Bugs,Oracle — Jonathan Lewis @ 12:45 pm GMT Jul 22,2015

Someone who attended my sessions at the Bucharest Oracle Summit earlier on this year sent me an example of a quirky little bug, possibly related to the newer “fine-grained” invalidation mechanisms, possibly related to ANSI syntax SQL, that’s very easy to reproduce. (That’s always nice for Oracle support – a perfect test case.)

All it takes is two tables and a packaged procedure that queries those tables. The package is coded to do something that should not be allowed in production code; but “should not” and “is not” are very different things. For anyone who wants to play with the example, here’s the script to create the necessary objects:


drop package pkg_test;
drop table t2 purge;
drop table t1 purge;

create table t1 (id1 number, val1 varchar2(10));
create table t2 (id2 number, val2 varchar2(10));

insert into t1 values(1,rpad('x',10,'x'));
insert into t2 values(1,rpad('x',10,'x'));

execute dbms_stats.gather_table_stats(user,'t1')
execute dbms_stats.gather_table_stats(user,'t2')

create or replace package pkg_test is
   procedure pr_call;
end pkg_test;
/

create or replace package body pkg_test as

   procedure pr_call is

      cursor cur_ids is
         select *                   -- Naughty !
           from t1
           join t2
             on t2.id2 = t1.id1
         ;

      rec_id cur_ids%rowtype := null;

   begin
      open cur_ids;
      fetch cur_ids into rec_id;
      close cur_ids;
      dbms_output.put_line(rec_id.val1 || '-' || rec_id.val2);

   exception
      when others then
         if cur_ids%isopen then
            close cur_ids;
         end if;
         raise;

   end pr_call;

end pkg_test;
/

Having created the procedure I’m now going to call it – and then add a column to table t1. What’s that going to do to a packaged procedure with a “select *”?

Pause for thought …

Here’s some SQL to run the test.


set serveroutput on

prompt   *** Make a first call to the procedure: no error ***
execute  pkg_test.pr_call

prompt   *** add a column to one of the tables
alter table t1 add col_test varchar2(20);

prompt   *** Make two more calls to the procedure: ouch! ***
execute  pkg_test.pr_call
execute  pkg_test.pr_call

prompt  *** Recompile before a third call ***
execute  dbms_ddl.alter_compile('package body', user, 'pkg_test')
execute  pkg_test.pr_call

Unless I’ve managed to cut-n-paste the wrong bits of code, you would have got the following error for the 2nd and 3rd calls to the package:


BEGIN
*
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected - got -
ORA-06512: at "TEST_USER.PKG_TEST", line 25
ORA-06512: at line 2

The package body should (I believe) have invalidated and recompiled itself for the second execution, and even if it failed on the first attempt surely it should have invalidated itself on the ORA-932 and recompiled itself and succeeded on the third execution.  (If you remove the exception clause you’ll find that the error is intially raised at the fetch, by the way).

If we change the “select *” to explicitly name the columns we want, viz:“select t1.id1, t1.val1, t2.id2, t2.val2” we don’t get the ORA-00932 errors (just as we would probably expect). What we might not expect is that the errors also disappear if we leave the “select *” in place but change the query from ANSI syntax to traditional Oracle syntax.

Footnote:

Obviously you shouldn’t use the lazy “*” notation in any production code – it can cause several different problems (including the dangers of “whoops, I didn’t mean to make that one invisible”) – but if you do you may find that you end up with packaged procedures that crash for no apparent reason until you recompile them. Perhaps ORA-00932 is the only possible error message, but maybe it’s possible to cause other errors to appear. Even worse, though I haven’t tried to force it yet, you may find that you can construct cases where the package reports no error but modifies the wrong data.

I’ve tested this code on versions 11.2.0.4 and 12.1.0.2 and see the same results on both.

March 27, 2015

ANSI expansion

Filed under: ANSI Standard,Oracle,Troubleshooting — Jonathan Lewis @ 10:46 am GMT Mar 27,2015

Here’s a quirky little bug that appeared on the OTN database forum in the last 24 hours which (in 12c, at least) produces an issue which I can best demonstrate with the following cut-n-paste:


rem
rem     Script:         12c_col_prob.sql
rem     Author:         Jonathan Lewis
rem     Dated:          March 2015
rem

SQL> desc purple
 Name                                Null?    Type
 ----------------------------------- -------- ------------------------
 G_COLUMN_001                        NOT NULL NUMBER(9)
 P_COLUMN_002                                 VARCHAR2(2)

SQL> select p.*
  2  from GREEN g
  3    join RED r on g.G_COLUMN_001 = r.G_COLUMN_001
  4    join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001;
  join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001
       *
ERROR at line 4:
ORA-01792: maximum number of columns in a table or view is 1000

SQL> select p.g_column_001, p.p_column_002
  2  from GREEN g
  3    join RED r on g.G_COLUMN_001 = r.G_COLUMN_001
  4    join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001;

no rows selected

A query that requires “star-expansion” fails with ORA-01792, but if you explicitly expand the ‘p.*’ to list all the columns it represents the optimizer is happy. (The posting also showed the same difference in behaviour when changing “select constant from  {table join}” to “select (select constant from dual) from {table join}”)

The person who highlighted the problem supplied code to generate the tables so you can repeat the tests very easily; one of the quick checks I did was to modify the code to produce tables with a much smaller number of columns and then expanded the SQL to see what Oracle would have done with the ANSI. So, with only 3 columns each in table RED and GREEN, this is what I did:

set serveroutput on
set long 20000

variable m_sql_out clob

declare
    m_sql_in    clob :=
                        '
                        select p.*
                        from GREEN g
                        join RED r on g.G_COLUMN_001 = r.G_COLUMN_001
                        join PURPLE p on g.G_COLUMN_001 = p.G_COLUMN_001
                        ';
begin

    dbms_utility.expand_sql_text(
        m_sql_in,
        :m_sql_out
    );

end;
/

column m_sql_out wrap word
print m_sql_out

The dbms_utility.expand_sql_text() function is new to 12c, and you’ll need the execute privilege on the dbms_utility package to use it; but if you want to take advantage of it in 11g you can also find it (undocumented) in a package called dbms_sql2.

Here’s the result of the expansion (you can see why I reduced the column count to 3):


M_SQL_OUT
--------------------------------------------------------------------------------
SELECT "A1"."G_COLUMN_001_6" "G_COLUMN_001","A1"."P_COLUMN_002_7" "P_COLUMN_002"
FROM  (SELECT "A3"."G_COLUMN_001_0" "G_COLUMN_001","A3"."G_COLUMN_002_1"
"G_COLUMN_002","A3"."G_COLUMN_003_2" "G_COLUMN_003","A3"."G_COLUMN_001_3"
"G_COLUMN_001","A3"."R_COLUMN__002_4" "R_COLUMN__002","A3"."R_COLUMN__003_5"
"R_COLUMN__003","A2"."G_COLUMN_001" "G_COLUMN_001_6","A2"."P_COLUMN_002"
"P_COLUMN_002_7" FROM  (SELECT "A5"."G_COLUMN_001"
"G_COLUMN_001_0","A5"."G_COLUMN_002" "G_COLUMN_002_1","A5"."G_COLUMN_003"
"G_COLUMN_003_2","A4"."G_COLUMN_001" "G_COLUMN_001_3","A4"."R_COLUMN__002"
"R_COLUMN__002_4","A4"."R_COLUMN__003" "R_COLUMN__003_5" FROM
"TEST_USER"."GREEN" "A5","TEST_USER"."RED" "A4" WHERE
"A5"."G_COLUMN_001"="A4"."G_COLUMN_001") "A3","TEST_USER"."PURPLE" "A2" WHERE
"A3"."G_COLUMN_001_0"="A2"."G_COLUMN_001") "A1"

Tidying this up:


SELECT
        A1.G_COLUMN_001_6 G_COLUMN_001,
        A1.P_COLUMN_002_7 P_COLUMN_002
FROM    (
        SELECT
                A3.G_COLUMN_001_0 G_COLUMN_001,
                A3.G_COLUMN_002_1 G_COLUMN_002,
                A3.G_COLUMN_003_2 G_COLUMN_003,
                A3.G_COLUMN_001_3 G_COLUMN_001,
                A3.R_COLUMN__002_4 R_COLUMN__002,
                A3.R_COLUMN__003_5 R_COLUMN__003,
                A2.G_COLUMN_001 G_COLUMN_001_6,
                A2.P_COLUMN_002 P_COLUMN_002_7
        FROM    (
                SELECT
                        A5.G_COLUMN_001 G_COLUMN_001_0,
                        A5.G_COLUMN_002 G_COLUMN_002_1,
                        A5.G_COLUMN_003 G_COLUMN_003_2,
                        A4.G_COLUMN_001 G_COLUMN_001_3,
                        A4.R_COLUMN__002 R_COLUMN__002_4,
                        A4.R_COLUMN__003 R_COLUMN__003_5
                FROM
                        TEST_USER.GREEN A5,
                        TEST_USER.RED A4
                WHERE
                        A5.G_COLUMN_001=A4.G_COLUMN_001
                ) A3,
                TEST_USER.PURPLE A2
        WHERE
                A3.G_COLUMN_001_0=A2.G_COLUMN_001
        ) A1

As you can now see, the A1 alias lists all the columns in GREEN, plus all the columns in RED, plus all the columns in PURPLE – totalling 3 + 3 + 2 = 8. (There is a little pattern of aliasing and re-aliasing that turns the join column RED.g_column_001 into G_COLUMN_001_3, making it look at first glance as if it has come from the GREEN table).

You can run a few more checks, increasing the number of columns in the RED and GREEN tables, but essentially when the total number of columns in those two tables goes over 998 then adding the two extra columns from PURPLE makes that intermediate inline view break the 1,000 column rule.

Here’s the equivalent expanded SQL if you identify the columns explicitly in the select list (even with several hundred columns in the RED and GREEN tables):


SELECT
        A1.G_COLUMN_001_2 G_COLUMN_001,
        A1.P_COLUMN_002_3 P_COLUMN_002
FROM    (
        SELECT
                A3.G_COLUMN_001_0 G_COLUMN_001,
                A3.G_COLUMN_001_1 G_COLUMN_001,
                A2.G_COLUMN_001 G_COLUMN_001_2,
                A2.P_COLUMN_002 P_COLUMN_002_3
        FROM    (
                SELECT
                        A5.G_COLUMN_001 G_COLUMN_001_0,
                        A4.G_COLUMN_001 G_COLUMN_001_1
                FROM
                        TEST_USER.GREEN A5,
                        TEST_USER.RED A4
                WHERE
                        A5.G_COLUMN_001=A4.G_COLUMN_001
                ) A3,
                TEST_USER.PURPLE A2
        WHERE
                A3.G_COLUMN_001_0=A2.G_COLUMN_001
        ) A1

As you can see, the critical inline view now holds only the original join columns and the columns required for the select list.

If you’re wondering whether this difference in expansion could affect execution plans, it doesn’t seem to; the 10053 trace file includes the following (cosmetically altered) output:


Final query after transformations:******* UNPARSED QUERY IS *******
SELECT
        P.G_COLUMN_001 G_COLUMN_001,
        P.P_COLUMN_002 P_COLUMN_002
FROM
        TEST_USER.GREEN   G,
        TEST_USER.RED     R,
        TEST_USER.PURPLE  P
WHERE
        G.G_COLUMN_001=P.G_COLUMN_001
AND     G.G_COLUMN_001=R.G_COLUMN_001

So it looks as if the routine to transform the syntax puts in a lot of redundant text then the optimizer takes it all out again.

The problem doesn’t exist with traditional Oracle syntax, by the way, it’s an artefact of Oracle’s expansion of the ANSI syntax and 11.2.0.4 is quite happy to handle the text generated by the ANSI transformation when there are well over 1,000 columns in the inline view.

June 4, 2013

ROWID

Filed under: ANSI Standard,Bugs,lateral view,Oracle — Jonathan Lewis @ 9:09 am GMT Jun 4,2013

Here’s a suggestion to help you avoid wasting time. If you ever include the rowid in a query – not that that should happen very commonly – make sure you give it an alias, especially if you’re using ANSI SQL. If you don’t, you may find yourself struggling to work out why you’re getting an irrational error message. Here’s an example that appeared recently on the OTN forum, with the output cut-n-pasted from a system running 11.1.0.7:

(more…)

July 16, 2012

ANSI Outer 2

Filed under: ANSI Standard,CBO,Execution plans,lateral view,Oracle — Jonathan Lewis @ 4:55 pm GMT Jul 16,2012

A comment on a recent post of mine pointed me to a question on the OTN SQL and PL/SQL Forum where someone had presented a well-written test case of an odd pattern of behaviour in ANSI SQL. I made a couple of brief comments on the thread, but thought it worth highlighting here as well. The scripts to create the required tables (plus a few extras) are all available on OTN. If you create only the four tables needed and all their indexes you will need about 1.3GB of space.

The core of the problem is this: there is a three table join which does a hash join involving an index fast full scan on a particular index; when you add a fourth table to the join this fast full scan turns into a full tablescan for no obvious reason. Here are the queries, with the plans that I got when running 10.2.0.3. (My final plan is slightly different from the plan shown on OTN – I have a right outer hash join to the last table where the OP had a nested loop outer – but the difference is not significant). The queries, with their execution plans, are below- the three table join first:
(more…)

January 31, 2011

ANSI Outer

Filed under: ANSI Standard,CBO,Execution plans,lateral view,Oracle — Jonathan Lewis @ 6:59 pm GMT Jan 31,2011

Here’s an example of ANSI SQL that does something in a fashion that arguably looks a little tidier than the strategy you have to adopt in Oracle. As so often when I compare Oracle syntax and ANSI syntax it’s an example that relates to an outer join. We start with two tables – as usual I have locally managed tablespaces, 8KB blocks, 1MB uniform extents and freelist management. I’ve also disabled system statistics (CPU costing):

(more…)

December 3, 2010

ANSI – argh

Filed under: ANSI Standard,CBO,Execution plans,Hints,Ignoring Hints — Jonathan Lewis @ 7:30 pm GMT Dec 3,2010

I’m not keen on ANSI standard SQL – even though it is, technically, the strategic option and even though you have to use it for full outer joins and partitioned outer joins.

One reason for disliking it is that it “separates join predicates from filter predicates” – a reason often given in praise of the syntax which, to my mind, claims a spurious distinction and introduces a mechanism that makes it harder to keep mental track of what’s going to happen as you walk  through the join order. (I have to admit that I was temporarily fooled into thinking it was quite a nice idea – in an abstract sort of way.)
(more…)

March 20, 2008

ANSI SQL

Filed under: ANSI Standard,CBO,Infrastructure,Troubleshooting — Jonathan Lewis @ 4:26 pm GMT Mar 20,2008

Someone sent me an email a little while ago about a problem they were having with two databases that were using different execution plans for the same query. 

But the two databases were believed to be identical, and the optimizer was running rule-based in both cases, so it shouldn’t have been possible to get different execution plans.

(more…)

Powered by WordPress.com.