The Oracle Developer Community forum often sees SQL that is hard to read – sometimes because it’s a brutal tangle of subqueries, sometimes because the formatting is bad, sometimes because the use of table and column aliases is non-existent or badly done. One particular “pattern” for the last weakness is code where the same table alias (typically a single letter) is used at several different points in the query.
I’ve often said that every table in a query should have a different alias and the alias should be used at every column usage in the query (the note at this URL includes a couple of refinements). I’ve just discovered another reason why this is a good idea and why you shouldn’t use the same alias twice in a query. Here’s a simplified demonstration of the threat – tested on 19.3.0.0 (though present on earlier versions):
rem Script: using_bug.sql rem Author: Jonathan Lewis rem Dated: Jan 2020 rem Purpose: rem rem Last tested rem 19.3.0.0 rem create table t2 as select rownum id, object_id, object_name, object_type, rpad('x',100) padding from all_objects where rownum <= 5000 --> comment to avoid wordpress format issue and mod(object_id,2) = 1 --> odd numbers only / create table t1 as select rownum id, object_id, object_name, object_type, rpad('x',100) padding from all_objects where rownum <= 5000 --> comment to avoid wordpress format issue and mod(object_id,2) = 0 --> even numbers only /
I’ve created two tables from the view all_objects, one of the tables holds rows where the object_id is even, the other where it is odd, so if I join these two tables on object_id the result set will be empty. So here are three queries that join the two tables – with the little twist that I’ve (accidentally) given both tables the same alias X in all three cases:
prompt ======================================= prompt Here's a query that might "confuse" the prompt optimizer when we try to explain it prompt ======================================= explain plan for select max(object_name) from t1 X join t2 X using (object_id); prompt ================================== prompt So might this one, but it doesn't. prompt ================================== explain plan for select max(object_id) from t1 X join t2 X using (object_id); select * from table(dbms_xplan.display); prompt =================================================== prompt With this one A-rows matches E-rows: and it's NOT 0 prompt =================================================== alter session set statistics_level = all; set serveroutput off set linesize 156 select count(*) from t1 X join t2 X using (object_id); select * from table(dbms_xplan.display_cursor(null,null,'cost allstats last'));
In the absence of the explicit aliases the first query should produce an execution plan; but when both tables are given the same alias the attempt to explain (or run) the query produced the error “ORA-00918: column ambiguously defined” – and that’s arguably a good thing.
The second query does better – or worse, depending on your point of view. Nominally the join is perfectly valid and the optimizer produces an execution plan for the query. But the plan predicts a Cartesian merge join with a result set of 25M rows – which doesn’t look like a good estimate – and the plan doesn’t have a Predicate Information section.
So we use a count(*) for the third query – just in case the result set is, somehow, 25M rows – and enable rowsource execution statistics, and acquire the plan from memory after running the query (which takes nearly 14 seconds of hammering the CPU to death). And here’s the output:
COUNT(*) ---------- 25000000 1 row selected. PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 85ygrcg4n3ymz, child number 0 ------------------------------------- select count(*) from t1 X join t2 X using (object_id) Plan hash value: 4259280259 ----------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 67358 (100)| 1 |00:00:13.38 | 200 | 198 | | | | | 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:13.38 | 200 | 198 | | | | | 2 | MERGE JOIN CARTESIAN| | 1 | 25M| 67358 (5)| 25M|00:00:10.55 | 200 | 198 | | | | | 3 | TABLE ACCESS FULL | T2 | 1 | 5000 | 15 (7)| 5000 |00:00:00.01 | 100 | 99 | | | | | 4 | BUFFER SORT | | 5000 | 5000 | 67343 (5)| 25M|00:00:04.54 | 100 | 99 | 133K| 133K| 118K (0)| | 5 | TABLE ACCESS FULL | T1 | 1 | 5000 | 13 (0)| 5000 |00:00:00.01 | 100 | 99 | | | | -----------------------------------------------------------------------------------------------------------------------------------------
Yes, instead of zero rows Oracle managed to produce 25M rows. The execution plan is a Cartesian merge join, and that’s because the optimizer has lost the join predicate (I didn’t supply a Predicate Information section because there wasn’t one – note the absence of any asterisks against any operations in the plan).
Interestingly the Query Block / Alias section of the plan (when I called for it) reported the two aliases as X_0001 and X_0002, so internally Oracle did manage to find two different aliases – but too late, presumably.
Conclusion
Give a little thought to using table aliases sensibly and safely. It’s trivial to fix this example, but some of the messy SQL that goes into production might end up running into the same issue without it being so easy to spot the source of the anomaly.
Footnote
This is Bug 25342699 : WRONG RESULTS WITH ANSI JOIN USING AND IDENTICAL TABLE ALIASES reported Jan 2017 against 12.1.0.2, not yet fixed.
Update (Feb 2021)
The bug is still present in 19.8 (can be tested here on LiveSQL).
Hi Jonathan,
In fact, CBO did not ‘loose’ the predicate, it simply took x.object_id=x.object_id as the join condition, and considered X on both sides to be T1, and so removed the predicate as always true. Can it be blamed for doing so? In the case of 2 tables only, one could argue that CBO should guess one X to be T1 and the other to be T2, but if you had a third table joined, even using another alias (e.g. join T3 X1 on (X.object_id=x1.object_id), which of T1 or T2 should be X?
Comment by Ghassan Salem — March 18, 2020 @ 12:21 pm GMT Mar 18,2020 |
Ghassan,
I like the explanation. I probably should have realised that when the first one failed it was actually complaining about the ambiguity of the object_id in select list rather than complaining about the ON clause.
I did point out, by the way, that the optimizer re-aliased the tables to X_0001 and X_0002 by the time the plan was complete, and that does suggest that a mechanism for resolving the ambiguity almost exists already.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — March 18, 2020 @ 12:58 pm GMT Mar 18,2020 |
I think the correct thing to do, would be to reject those queries as invalid (like most other databases do).
Comment by Hans — March 18, 2020 @ 12:27 pm GMT Mar 18,2020 |
Hans,
I agree – and that’s what the first example appeared to be doing.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — March 18, 2020 @ 12:48 pm GMT Mar 18,2020 |
Hi,
In both cases I obtain an
ERROR: table name “x” specified more than once
with pg11.
As Hans stated I don’t see that as a limitation I think it’s the desired behavior.
Regards,
Phil
Comment by Phil Florent — December 24, 2020 @ 1:43 pm GMT Dec 24,2020 |
Hello Jonathan,
thank you for pointing me here from the Oracle-L thread ANSI-Join-Rename-of-Alias-Leads-to-a-Different-Result.
I think that the example in the above mentioned thread more clearly demonstrates, that on some cases caused by the duplicated alias Oracle simple ignores
the column ambiguously defined error and silently take the first occurence of the column.
KR,
Jaromir D.B. Nemec
Comment by Jaromir D.B. Nemec — January 12, 2021 @ 10:39 pm GMT Jan 12,2021 |