Oracle Scratchpad

June 22, 2019

ANSI bug

Filed under: ANSI Standard,Bugs,Oracle — Jonathan Lewis @ 1:01 pm BST Jun 22,2019

The following note is about a script that I found on my laptop while I was searching for some details about a bug that appears when you write SQL using the ANSI style format rather than traditional Oracle style. The script is clearly one that I must have cut and pasted from somewhere (possibly the OTN/ODC database forum) many years ago without making any notes about its source or resolution. All I can say about it is that the file has a creation date of July 2012 and I can’t find any reference to a problem through Google searches – though the tables and even a set of specific insert statements appears in a number of pages that look like coursework for computer studies and MoS has a similar looking bug “fixed in 11.2”.

Here’s the entire script:

rem
rem     Script:         ansi_bug.sql
rem     Author:         ???
rem     Dated:          July 2012
rem

CREATE TABLE Student (
  sid INT PRIMARY KEY,
  name VARCHAR(20) NOT NULL,
  address VARCHAR(20) NOT NULL,
  major CHAR(2)
);

CREATE TABLE Professor (
  pid INT PRIMARY KEY,
  name VARCHAR(20) NOT NULL,
  department VARCHAR(10) NOT NULL
);

CREATE TABLE Course (
  cid INT PRIMARY KEY,
  title VARCHAR(20) NOT NULL UNIQUE,
  credits INT NOT NULL,
  area VARCHAR(5) NOT NULL
);

CREATE TABLE Transcript (
  sid INT,
  cid INT,
  pid INT,
  semester VARCHAR(9),
  year CHAR(4),
  grade CHAR(1) NOT NULL,
  PRIMARY KEY (sid, cid, pid, semester, year),
  FOREIGN KEY (sid) REFERENCES Student (sid),
  FOREIGN KEY (cid) REFERENCES Course (cid),
  FOREIGN KEY (pid) REFERENCES Professor (pid)
);

INSERT INTO Student (sid, name, address, major) VALUES (101, 'Nathan', 'Edinburg', 'CS');
INSERT INTO Student (sid, name, address, major) VALUES (105, 'Hussein', 'Edinburg', 'IT');
INSERT INTO Student (sid, name, address, major) VALUES (103, 'Jose', 'McAllen', 'CE');
INSERT INTO Student (sid, name, address, major) VALUES (102, 'Wendy', 'Mission', 'CS');
INSERT INTO Student (sid, name, address, major) VALUES (104, 'Maria', 'Pharr', 'CS');
INSERT INTO Student (sid, name, address, major) VALUES (106, 'Mike', 'Edinburg', 'CE');
INSERT INTO Student (sid, name, address, major) VALUES (107, 'Lily', 'McAllen', NULL);

INSERT INTO Professor (pid, name, department) VALUES (201, 'Artem', 'CS');
INSERT INTO Professor (pid, name, department) VALUES (203, 'John', 'CS');
INSERT INTO Professor (pid, name, department) VALUES (202, 'Virgil', 'MATH');
INSERT INTO Professor (pid, name, department) VALUES (204, 'Pearl', 'CS');
INSERT INTO Professor (pid, name, department) VALUES (205, 'Christine', 'CS');

INSERT INTO Course (cid, title, credits, area) VALUES (4333, 'Database', 3, 'DB');
INSERT INTO Course (cid, title, credits, area) VALUES (1201, 'Comp literacy', 2, 'INTRO');
INSERT INTO Course (cid, title, credits, area) VALUES (6333, 'Advanced Database', 3, 'DB');
INSERT INTO Course (cid, title, credits, area) VALUES (6315, 'Applied Database', 3, 'DB');
INSERT INTO Course (cid, title, credits, area) VALUES (3326, 'Java', 3, 'PL');
INSERT INTO Course (cid, title, credits, area) VALUES (1370, 'CS I', 4, 'INTRO');

INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (101, 4333, 201, 'Spring', '2009', 'A');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (101, 6333, 201, 'Fall', '2009', 'A');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (101, 6315, 201, 'Fall', '2009', 'A');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (103, 4333, 203, 'Summer I', '2010', 'B');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (102, 4333, 201, 'Fall', '2009', 'A');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (103, 3326, 204, 'Spring', '2008', 'A');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (104, 1201, 205, 'Fall', '2009', 'B');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (104, 1370, 203, 'Summer II', '2010', 'A');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (106, 1201, 205, 'Fall', '2009', 'C');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (106, 1370, 203, 'Summer II', '2010', 'C');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (105, 3326, 204, 'Spring', '2001', 'A');
INSERT INTO Transcript (sid, cid, pid, semester, year, grade) VALUES (105, 6315, 203, 'Fall', '2008', 'A');

SELECT 
        pid, 
        name, title
FROM 
        Professor 
NATURAL LEFT OUTER JOIN 
        (
                Transcript 
        NATURAL JOIN 
                Course
        )
;

SELECT 
        name, title
FROM 
        Professor 
NATURAL LEFT OUTER JOIN 
        (
                Transcript 
        NATURAL JOIN 
                Course
        )
;

SELECT 
        name, title
FROM 
        Professor 
NATURAL LEFT OUTER JOIN 
        (
                Transcript 
        NATURAL JOIN 
                Course
        )
order by pid
;

I’ve run three minor variations of the same query – the one in the middle selects two columns from a three table join using natural joins. The first query does the same but includes an extra column in the select list while the third query selects only the original columns but orders the result set by the extra column.

The middle query returns 60 rows – the first and third, with the “extra” column projected somewhere in the execution plan, return 13 rows.

I didn’t even have a note of the then-current version of Oracle when I copied this script, but I’ve just run it on 12.2.0.1, 18.3.0.0, and 19.2.0.0 (using LiveSQL), and the error reproduces on all three versions.

3 Comments »

  1. Hello Jonathan,

    I tried to play around a little with these examples in LiveSQL, and the 3 execution plans are as follows:

    ----------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                       | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
    ----------------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                |                 |      1 |        |     13 |00:00:00.01 |      29 |       |       |          |
    |*  1 |  HASH JOIN OUTER                |                 |      1 |     12 |     13 |00:00:00.01 |      29 |  3843K|  3843K| 3229K (0)|
    |   2 |   TABLE ACCESS STORAGE FULL     | PROFESSOR       |      1 |      5 |      5 |00:00:00.01 |       7 |  1025K|  1025K|          |
    |   3 |   VIEW                          |                 |      1 |     12 |     12 |00:00:00.01 |      22 |       |       |          |
    |*  4 |    HASH JOIN                    |                 |      1 |     12 |     12 |00:00:00.01 |      22 |  3363K|  3363K| 3743K (0)|
    |   5 |     TABLE ACCESS STORAGE FULL   | COURSE          |      1 |      6 |      6 |00:00:00.01 |       7 |  1025K|  1025K|          |
    |   6 |     INDEX STORAGE FAST FULL SCAN| SYS_C0015988681 |      1 |     12 |     12 |00:00:00.01 |      15 |  1025K|  1025K|          |
    ----------------------------------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       1 - access("PROFESSOR"."PID"="from$_subquery$_004"."PID")
       4 - access("TRANSCRIPT"."CID"="COURSE"."CID")
     
    
    
    -----------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                        | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
    -----------------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                 |                 |      1 |        |     60 |00:00:00.01 |      22 |       |       |          |
    |   1 |  MERGE JOIN OUTER                |                 |      1 |     60 |     60 |00:00:00.01 |      22 |       |       |          |
    |   2 |   TABLE ACCESS STORAGE FULL      | PROFESSOR       |      1 |      5 |      5 |00:00:00.01 |      11 |  1025K|  1025K|          |
    |   3 |   BUFFER SORT                    |                 |      5 |     12 |     60 |00:00:00.01 |      11 |  2048 |  2048 | 2048  (0)|
    |   4 |    VIEW                          | VW_LAT_13EA77AB |      1 |     12 |     12 |00:00:00.01 |      11 |       |       |          |
    |*  5 |     HASH JOIN                    |                 |      1 |     12 |     12 |00:00:00.01 |      11 |  3363K|  3363K| 3690K (0)|
    |   6 |      TABLE ACCESS STORAGE FULL   | COURSE          |      1 |      6 |      6 |00:00:00.01 |       7 |  1025K|  1025K|          |
    |   7 |      INDEX STORAGE FAST FULL SCAN| SYS_C0015988681 |      1 |     12 |     12 |00:00:00.01 |       4 |  1025K|  1025K|          |
    -----------------------------------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       5 - access("TRANSCRIPT"."CID"="COURSE"."CID")
     
    
    -----------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                        | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
    -----------------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                 |                 |      1 |        |     13 |00:00:00.01 |      18 |       |       |          |
    |   1 |  SORT ORDER BY                   |                 |      1 |     12 |     13 |00:00:00.01 |      18 |  2048 |  2048 | 2048  (0)|
    |*  2 |   HASH JOIN OUTER                |                 |      1 |     12 |     13 |00:00:00.01 |      18 |  3843K|  3843K| 3258K (0)|
    |   3 |    TABLE ACCESS STORAGE FULL     | PROFESSOR       |      1 |      5 |      5 |00:00:00.01 |       7 |  1025K|  1025K|          |
    |   4 |    VIEW                          |                 |      1 |     12 |     12 |00:00:00.01 |      11 |       |       |          |
    |*  5 |     HASH JOIN                    |                 |      1 |     12 |     12 |00:00:00.01 |      11 |  3363K|  3363K| 3749K (0)|
    |   6 |      TABLE ACCESS STORAGE FULL   | COURSE          |      1 |      6 |      6 |00:00:00.01 |       7 |  1025K|  1025K|          |
    |   7 |      INDEX STORAGE FAST FULL SCAN| SYS_C0015988681 |      1 |     12 |     12 |00:00:00.01 |       4 |  1025K|  1025K|          |
    -----------------------------------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - access("PROFESSOR"."PID"="from$_subquery$_004"."PID")
       5 - access("TRANSCRIPT"."CID"="COURSE"."CID")
     
    

    Strangely, for the second SELECT we have a MERGE JOIN OUTER as the second join, while in the other two SELECT-s
    we have a HASH join.

    In my opinion, Oracle is bugged by “not knowing” which columns to project from the innermost NATURAL join
    to the outermost NATURAL join, so it looks like it simply takes only those columns that appear to be used
    by the next level SELECT.

    So, for the 1st and 3rd SELECT it takes columns PID and TITLE, while for the 2nd SELECT it only takes TITLE,
    and then it has no common column names for the outermost NATURAL JOIN.

    If we explicitly use a SELECT for the innermost join, like below,
    then this query also correctly returns 13 rows:

    
    SELECT    
            name, title
    FROM 
            Professor 
    NATURAL LEFT OUTER JOIN 
            ( 
             SELECT * FROM
                    Transcript 
            NATURAL JOIN 
                    Course
            )
    ;
    
    

    It looks to me that NATURAL JOIN is better to be avoided at all, as it might completely mess results up,
    for example when new columns are added to the tables.

    Thanks a lot & Best Regards,
    Iudith Mentzel

    Comment by Iudith Mentzel — June 22, 2019 @ 4:24 pm BST Jun 22,2019 | Reply

    • Iudith,

      Thanks for the comment – I agree with your analysis regarding the innermost query “forgetting” that it might need to project columns that might be relevant to a later natural join.

      The merge join / hash join is probably a costing thing; it’s got to be a merge for the middle query because there is no proper join predicate, but the optimizer can choose between a merge and a hash (or NL) for the other two. In one of my tests (perhaps because of a different process id producing a different pattern in the data – ASSM headace) the optimizer chose the merge join for the query with the “order by”, bypassing the need to do a sort operation after the join.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — June 23, 2019 @ 12:45 pm BST Jun 23,2019 | Reply

  2. Hi Jonathan – the Oracle Real World Performance Group took a look and they have bugged it. It is unpublished bug# 30066014. Thanks.

    Comment by Nigel Bayliss — July 18, 2019 @ 11:10 am BST Jul 18,2019 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.