How do you trouble-shoot a problem ? It’s not an easy question to answer when posed in this generic fashion; but perhaps it’s possible to help people trouble-shoot by doing some examples in front of them. (This is why I’ve got so many statspack/AWR examples – just reading a collection of different problems helps you to get into the right mental habit.)
So here’s a problem someone sent me yesterday. Since it only took a few seconds to read, and included a complete build for a test case, with results, and since it clearly displayed an Oracle bug, I took a look at it. (I’ve trimmed the test a little bit, there were a few more queries leading up to the error):
create table person (id number(2), name varchar2(10)) ;

insert into person values (1, 'Alpha') ;
insert into person values (2, 'Bravo') ;
insert into person values (3, 'Charlie') ;
insert into person values (4, 'Charles') ;
insert into person values (5, 'Delta') ;

create or replace view vtest as
select id, 'C' as letter from person where name like 'C%' ;

select p.id, p.name, v.id, v.letter
from person p
left join vtest v on v.id = p.id
order by p.id ;
The problem was that 10.2.0.4 and 11.2.0.2 gave different results – and the 11.2.0.2 result was clearly wrong. So the question was: “is there something broken with outer joins on views, or possibly ANSI outer joins?” (The answer to the last question is always “probably” as far as I’m concerned, but I wouldn’t turn that into a “yes” without checking first.) Here are the two results:
10.2.0.4:
========
        ID NAME               ID L
---------- ---------- ---------- -
         1 Alpha
         2 Bravo
         3 Charlie             3 C
         4 Charles             4 C
         5 Delta

11.2.0.2
========
        ID NAME               ID L
---------- ---------- ---------- -
         1 Alpha                 C
         2 Bravo                 C
         3 Charlie             3 C
         4 Charles             4 C
         5 Delta                 C
Clearly the extra ‘C’s in the letter column are wrong.
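To see why they are wrong, it helps to query the view on its own. This is a small sketch using only the objects from the test case above; the "order by" is my addition to make the output order predictable.

```sql
-- The view only exposes rows whose name starts with 'C',
-- so only ids 3 (Charlie) and 4 (Charles) can ever supply a letter.
select id, letter
from   vtest
order by id ;

--         ID L
-- ---------- -
--          3 C
--          4 C
```

Since ids 1, 2 and 5 have no matching row in the view, the outer join has to report both v.id and v.letter as null for those rows.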
So what to do next ? Knowing that Oracle transforms ANSI SQL before evaluating an execution plan I decided to run the 10053 trace. Sometimes you get lucky and see the “unparsed SQL” in this trace file: a representation (though not necessarily a 100% exact image) of the statement for which Oracle will generate a plan. I was lucky, this was the unparsed SQL (cosmetically enhanced):
SELECT
        P.ID        ID,
        P.NAME      NAME,
        PERSON.ID   ID,
        CASE
                WHEN PERSON.ROWID IS NOT NULL THEN 'C'
                ELSE NULL
        END         LETTER
FROM
        TEST_USER.PERSON P,
        TEST_USER.PERSON PERSON
WHERE
        PERSON.ID (+) = P.ID
AND     PERSON.NAME(+) LIKE 'C%'
ORDER BY
        P.ID
;
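For anyone who wants to repeat the exercise, a trace like this can be generated roughly as follows. This is a sketch rather than a record of the exact commands used for this note: the tracefile identifier and the comment text inside the query are arbitrary labels of my own, and the v$diag_info lookup assumes 11g or later (on 10g the file appears under user_dump_dest instead).

```sql
-- Optional: tag the trace file name so it is easy to find (label is arbitrary)
alter session set tracefile_identifier = 'outer_join_case' ;

-- Switch on the optimizer trace for this session
alter session set events '10053 trace name context forever, level 1' ;

-- The trace is only written when the statement is optimized (hard parsed),
-- so change the text slightly (e.g. a new comment) if it has been run before
select /* trace_run_1 */
        p.id, p.name, v.id, v.letter
from    person p
left join vtest v on v.id = p.id
order by p.id ;

-- Switch the trace off again
alter session set events '10053 trace name context off' ;

-- 11g and later: report the name of the trace file for this session
select value from v$diag_info where name = 'Default Trace File' ;
```

The "unparsed SQL" does not appear for every statement or in every version, so treat its presence as a bonus rather than something to rely on.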
So I ran this query, and found that the same error appeared – so it wasn’t about ANSI or views. So possibly it’s something about the CASE expression and/or the ROWID in the CASE expression, which I tested by adding three extra columns to the query:
        person.name,
        person.rowid,
        CASE WHEN PERSON.name IS NOT NULL THEN 'C' ELSE NULL END LETTER
With these extra columns I got the following results from the query:
        ID NAME               ID NAME       ROWID              L L
---------- ---------- ---------- ---------- ------------------ - -
         1 Alpha                                                 C
         2 Bravo                                                 C
         3 Charlie             3 Charlie    AAAT7gAAEAAAAIjAAC C C
         4 Charles             4 Charles    AAAT7gAAEAAAAIjAAD C C
         5 Delta                                                 C
So the CASE did the right thing with the person.name column, but the wrong thing with the person.rowid column.
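Put together, the comparison looks something like the sketch below. The placement of the extra columns and the two distinct aliases (letter_by_name, letter_by_rowid) are my own choices to keep the two CASE expressions easy to tell apart, and the schema prefix has been dropped on the assumption that you are connected as the owner of the table.

```sql
select
        p.id,
        p.name,
        person.id,
        person.name,
        person.rowid,
        -- same test, but against an ordinary column of the outer-joined table
        case when person.name  is not null then 'C' else null end letter_by_name,
        -- the expression taken from the unparsed SQL
        case when person.rowid is not null then 'C' else null end letter_by_rowid
from
        person p,
        person person
where
        person.id (+) = p.id
and     person.name(+) like 'C%'
order by
        p.id
;
```

On a version without the bug the two CASE columns should agree on every row, which makes this a quick way to check whether a given system is affected.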
Time to get onto MOS (Metalink).
I searched the bug database with the key words: case rowid null
This gave me 2,887 hits, so I added the expression (with the double quotes in place) “outer join”
This gave me 110 hits, so from the “product category” I picked “Oracle Database Products”
This gave me 80 hits, and the first one on the list was:
Bug 10269193: WRONG RESULTS WITH OUTER JOIN AND CASE EXPRESSION OPTIMIZATION CONTAINING ROWID
The text matched my problem, so job done – except it’s reported as not fixed until 12.1
This isn’t a nice bug, of course, because the particular problem can be generated automatically in the transformation of ANSI outer joins to Oracle outer joins, so you can’t just change the code.
In passing, it’s taken me 31 minutes to write this note – that’s 10 minutes longer than it took to pin down the bug, but I have to say I got lucky on two counts: first, that the “unparsed SQL” was available, second that my choice of key words for MOS got me to the bug so quickly (which is where I usually find I waste most time).
Jonathan, it’s a really interesting test case, but I am surprised: it was working fine in 11.2.0.1
SQL> create table person (id number(2), name varchar2(10)) ;
Table created.
SQL>
SQL> insert into person values (1, 'Alpha') ;
1 row created.
SQL> insert into person values (2, 'Bravo') ;
1 row created.
SQL> insert into person values (3, 'Charlie') ;
1 row created.
SQL> insert into person values (4, 'Charles') ;
1 row created.
SQL> insert into person values (5, 'Delta') ;
1 row created.
SQL> commit;
Commit complete.
SQL> create or replace view vtest as
2 select id, 'C' as letter from person where name like 'C%' ;
View created.
SQL>
SQL> select p.id, p.name, v.id, v.letter
2 from person p
3 left join vtest v on v.id = p.id
4 order by p.id ;
        ID NAME               ID L
---------- ---------- ---------- -
         1 Alpha
         2 Bravo
         3 Charlie             3 C
         4 Charles             4 C
         5 Delta
SQL> select * from v$version;
BANNER
----------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
Sir, correct me if anything is missing from my end. I will try to test it on the version you mentioned and cross-check.
Comment by Pavan Kumar Kumar, August 3, 2011 @ 6:13 pm BST
Pavan Kumar,
11.2.0.1 doesn’t have the problem.
Comment by Jonathan Lewis, August 3, 2011 @ 8:44 pm BST
Jonathan ,
It was fixed in 11.2.0.2 (patchset 8 for Windows).
Bug number 10269193 is listed in the 2nd line from the bottom of the “Bugs Fixed”
Comment by Zahir Mohideen, August 3, 2011 @ 8:13 pm BST
This bug is fixed and is backported already to 11.2.0.2.
Linux x86-64 fix is available here http://aru.us.oracle.com:8080/ARU/ViewPatchRequest/process_form?aru=13290641
Other ports are also available.
Comment by Greg Rahn, August 3, 2011 @ 8:13 pm BST
Greg, Zahir,
Thanks for the update – once I’d found the note with its “fixed in 12.1” tag I didn’t carry on looking for patches. I should have realised that there might have been a backport or two.
Comment by Jonathan Lewis, August 3, 2011 @ 8:45 pm BST
Bugs can never be “fixed in” a shipped release or patch set — they are always fixed in the current “main” code branch, then backported if possible.
Comment by Greg Rahn, August 4, 2011 @ 1:35 pm BST
This is a very interesting article.
You make a good point about systematically analysing and trouble-shooting (Tanel will be pleased).
I didn’t know that Oracle actually interprets ANSI SQL into Oracle SQL.
I thought that both SQL standards were directly interpreted into the operations that Oracle performs.
Your article has now put a slight kibosh on my use of ANSI SQL in Oracle for fear of suffering future problems. How would you know you’re seeing the right results in a complex statement? A trust issue has been created, in my eyes.
Comment by Darryl Griffiths, August 4, 2011 @ 2:20 pm BST
Darryl,
I wouldn’t let the fact that ANSI (usually) has to be transformed to “Oracle” SQL before anything else happens worry you – there’s so much transformation available for ordinary Oracle SQL that a little bit more probably won’t make much difference to correctness. And to respond to your worry about right results – I first asked that question way back in 7.2 days when it became possible to run 10 second queries that summed millions of rows: how do you know the answer’s right? Take a look at the “Wrong Results” section of a patch release and look how many non-ANSI bugs there are that can get you the wrong answer.
If you want an argument against ANSI in Oracle, the one that gets to me is the one about hinting: https://jonathanlewis.wordpress.com/2010/12/03/ansi-argh/
Comment by Jonathan Lewis, August 6, 2011 @ 8:46 am BST
[…] premier Oracle Expert Jonathan Lewis blogs about not-so-easy question; How do you trouble-shoot a […]
Pingback by Log Buffer #232, A Carnival of the Vanities for DBAs | The Pythian Blog, August 5, 2011 @ 3:01 pm BST