Here’s a recent question from comp.databases.oracle.server:
Is there any documentation available on the default query result ordering in the absence of an ORDER BY clause? I looked through the Oracle docs and could not find much. I do not plan to depend on this behavior but I am curious on how it works.
Now, if you want the data to come out in a particular order, you have to put in an order by clause (and leave it up to the optimizer to decide how to implement that order by) but it’s an interesting question – and it’s good to be curious; so I thought I’d run up a two-table demonstration in 9i showing how the order of the result set can be affected by the join mechanism used to join the tables.
create table t1
as
select
rownum id,
mod(rownum + 2,4) n1,
lpad(rownum,10,'0') small_vc,
rpad('x',100) padding
from
all_objects
where rownum <= 4
;
/*
ID N1 SMALL_VC PADDING
---------- ---------- ---------- -------
1 3 0000000001 x
2 0 0000000002 x
3 1 0000000003 x
4 2 0000000004 x
*/
create table t2
as
select * from t1
union all
select * from t1
union all
select * from t1
union all
select * from t1
;
/*
ID N1 SMALL_VC PADDING
---------- ---------- ---------- -------
1 3 0000000001 x
2 0 0000000002 x
3 1 0000000003 x
4 2 0000000004 x
1 3 0000000001 x
2 0 0000000002 x
3 1 0000000003 x
4 2 0000000004 x
1 3 0000000001 x
2 0 0000000002 x
3 1 0000000003 x
4 2 0000000004 x
1 3 0000000001 x
2 0 0000000002 x
3 1 0000000003 x
4 2 0000000004 x
*/
-- collect statistics here
select
/*+ ordered use_nl(t2) */
/*+ ordered use_hash(t2) */
/*+ ordered use_merge(t2) */
t1.id t1_id,
t1.n1 t1_n1,
t2.id t2_id,
t2.n1 t2_n1
from
t1, t2
where
t2.n1 = t1.n1
;
Obviously you should only implement one of the three sets of hints at a time. The three sets of output are as follows.
===========
Nested loop
===========
T1_ID T1_N1 T2_ID T2_N1
---------- ---------- ---------- ----------
1 3 1 3
1 3 1 3
1 3 1 3
1 3 1 3
2 0 2 0
2 0 2 0
2 0 2 0
2 0 2 0
3 1 3 1
3 1 3 1
3 1 3 1
3 1 3 1
4 2 4 2
4 2 4 2
4 2 4 2
4 2 4 2
16 rows selected.
===========
Hash join
===========
T1_ID T1_N1 T2_ID T2_N1
---------- ---------- ---------- ----------
1 3 1 3
2 0 2 0
3 1 3 1
4 2 4 2
1 3 1 3
2 0 2 0
3 1 3 1
4 2 4 2
1 3 1 3
2 0 2 0
3 1 3 1
4 2 4 2
1 3 1 3
2 0 2 0
3 1 3 1
4 2 4 2
16 rows selected.
===========
Merge join
===========
T1_ID T1_N1 T2_ID T2_N1
---------- ---------- ---------- ----------
2 0 2 0
2 0 2 0
2 0 2 0
2 0 2 0
3 1 3 1
3 1 3 1
3 1 3 1
3 1 3 1
4 2 4 2
4 2 4 2
4 2 4 2
4 2 4 2
1 3 1 3
1 3 1 3
1 3 1 3
1 3 1 3
16 rows selected.
Note how the ordering for the nested loop is basically driven by the first table in the join; the ordering for the hash join is driven by the second table; and the order for the merge join is driven by the sort order of the join columns.
If you think back to my notes on filter subqueries, you will realise that the change in join order could easily affect the number of times a filter subquery gets called.
Will this close forever the discussion about the use of ORDER BY? ;-)
I think not, you have left out the test case with the GROUP BY, and somebody for sure will claim that Oracle always order the result with GROUP BY…
Thanks a lot for your precious effort!!!
Comment by Franco — March 5, 2007 @ 2:12 pm BST Mar 5,2007 |
I’ve heard that if the query is done in parallel, that can affect the order the records are returned in as well. Is that true?
Comment by Mike McAllister — March 5, 2007 @ 2:25 pm BST Mar 5,2007 |
Don’t forget indexes, multiple partition access, different versions, HINTS – what else?
Comment by Glenn — March 5, 2007 @ 2:51 pm BST Mar 5,2007 |
Group by can or can NOT do order by.
In Oracle 10 there’s a new group by mode (group by hash if I remember well) that made order by less sure…I’m sure about that :D
Comment by Antonio — March 5, 2007 @ 3:40 pm BST Mar 5,2007 |
Mike, parallel query was, indeed, one of the easy options for showing that you can’t guarantee a specific order without including an explicit order by.
Antonio, just to make life interesting, 10g can even eliminate an order by in a non-mergeable view if the optimizer thinks it will not change the result set.
Comment by Jonathan Lewis — March 5, 2007 @ 5:39 pm BST Mar 5,2007 |
I believe there’s one exception, is there not? A sorted hash cluster will (if I remember the documentation well enough!) return rows in the order of its SORT BY clause, without an ORDER BY clause being mentioned at query time.
Comment by dizwell — March 5, 2007 @ 9:55 pm BST Mar 5,2007 |
Howard, there is some truth in that statement – although it may not be entirely safe. My memory of the marketing was that the structure of the sorted hash cluster was supposed to ensure that inserts and queries would operate very efficiently without needing to perform an explicit sort to retrieve the data in order.
However my tests with a 10032, repeated a few moments ago, showed Oracle sorting data which was not in order to produce a sorted output whether or not I used an order by clause. So possibly there’s something a little underhand and dangerous going on there at present. I’d still put the clause in – whatever the marketingn literature might say.
Comment by Jonathan Lewis — March 6, 2007 @ 1:05 am BST Mar 6,2007 |
The Oracle manual documentation for the SELECT statement saids: Use the ORDER BY clause to order rows returned by the statement. Without an order_by_clause, no guarantee exists that the same query executed more than once will retrieve rows in the same order.
So without an ORDER BY clause the order of the rows may change and is not specified. The only way to guarantee the order of the rows is by using the ORDER BY clause.
Comment by DSanchez — March 6, 2007 @ 1:27 am BST Mar 6,2007 |
DSanchez, I don’t think you have to convince anyone who knows SQL that an order by clause is necessary to guarantee correct ordering – but I believe Howard could have been thinking of this descriptive comment from the section about sorted hash clusters in the 10.2 Database Admin Guide B-14231-01, p. 19-4:
Given the sort order of the data, the following query would return the call records for a specified hash key by oldest record first.
SELECT * WHERE telephone_number = 6505551212;
It’s always a little irritating when manuals contradict each other.
Comment by Jonathan Lewis — March 6, 2007 @ 3:09 am BST Mar 6,2007 |
I think it’s a good idea to include “select * ” from both tables at the beginning of your examples so that we could see the “raw order” of the rows. Although the raw order cann be guessed by the create statements, for me it would be good to see it – and it would make the point more clear.
Comment by Todor Botev — March 6, 2007 @ 8:55 am BST Mar 6,2007 |
Todor, I think you’re right – I’ve made the change, and also given the selected columns aliases to identify which table they cam from, and the visual impact is better.
Comment by Jonathan Lewis — March 6, 2007 @ 10:32 am BST Mar 6,2007 |
Thanks Jonathan – I think, too, that it’s better now.
Comment by Todor Botev — March 6, 2007 @ 11:18 am BST Mar 6,2007 |
I need to ask that :) – why did you add the other two columns – small_vc, padding – to the table t1? They seem to have nothing to do with the point. Just to simulate better a real situation – bacause we rarely have tables with only two columns in reality?
Comment by Todor Botev — March 6, 2007 @ 11:49 am BST Mar 6,2007 |
Todor, it’s a side effect of having a standard template for test cases. As you guessed, better simulation for general testing.
Comment by Jonathan Lewis — March 6, 2007 @ 1:22 pm BST Mar 6,2007 |
Jonathan, the entry about sorted hash clusters in the 10.2 Database Admin Guide B-14231-01, p. 19-4 has been reported as a documentation bug. See following discussion:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:74320098178823#74643668232655
Comment by DSanchez — March 6, 2007 @ 11:57 pm BST Mar 6,2007 |
Todor/Jonathan,
Even the concept of “raw order” seems dangerous, as that is often what people think they’re going to get from a single table FTS and what they think is equivalent to the order in which rows were inserted — I’m sure that someone could knock up an easy demonstration of this as fallacious reasoning, based on parallel query for example :D
Comment by David Aldridge — March 9, 2007 @ 4:48 pm BST Mar 9,2007 |
David,
Here’s an example:
create tablespace test
datafile ‘E:\Oracle\Bases\MIK8\TESTS_01.DBF’ size 10m,
‘E:\Oracle\Bases\MIK8\TESTS_02.DBF’ size 10m
default storage (initial 20k next 20k pctincrease 0);
drop table test;
create table test (id number, val char(2000)) parallel 3 tablespace test;
alter table test allocate extent (datafile ‘E:\Oracle\Bases\MIK8\TESTS_02.DBF’);
alter table test allocate extent (datafile ‘E:\Oracle\Bases\MIK8\TESTS_01.DBF’);
alter table test allocate extent (datafile ‘E:\Oracle\Bases\MIK8\TESTS_01.DBF’);
alter table test allocate extent (datafile ‘E:\Oracle\Bases\MIK8\TESTS_02.DBF’);
desc test
begin
for i in reverse 0..25 loop
insert into test values (i, chr(ascii(‘A’)+i));
end loop;
end;
/
commit;
SQL> select extent_id, file_id, block_id, blocks from dba_extents
2 where owner=user and segment_name=’TEST’;
EXTENT_ID FILE_ID BLOCK_ID BLOCKS
———- ———- ———- ———-
0 7 2 5
1 7 7 5
2 6 2 5
3 6 7 5
4 7 12 5
5 6 12 5
6 rows selected.
SQL> select id,
2 dbms_rowid.rowid_relative_fno(rowid) file#,
3 dbms_rowid.rowid_block_number(rowid) block#,
4 dbms_rowid.rowid_row_number(rowid) row#
5 from test
6 /
ID FILE# BLOCK# ROW# ExtNb BlockNb BlocksInExtent
———- ———- ———- ———- —– ——- ————–
16 6 2 0 — 2 0 5
14 6 4 0 — 2 2 5
13 6 5 0 — 2 3 5
22 7 6 0 — 0 4 5
20 7 8 0 — 1 1 5
10 6 8 0 — 3 2 5
19 7 9 0 — 1 2 5
7 6 11 0 — 3 4 5
1 6 12 0 — 5 0 5
0 6 13 0 — 5 1 5
3 7 15 0 — 4 1 5
25 7 3 0 — 0 1 5
24 7 4 0 — 0 2 5
23 7 5 0 — 0 3 5
21 7 7 0 — 1 0 5
18 7 10 0 — 1 3 5
6 7 12 0 — 4 0 5
5 7 13 0 — 4 1 5
4 7 14 0 — 4 2 5
15 6 3 0 — 2 1 5
12 6 6 0 — 2 4 5
11 6 7 0 — 3 0 5
9 6 9 0 — 3 2 5
8 6 10 0 — 3 3 5
17 7 11 0 — 1 4 5
2 7 16 0 — 4 4 5
26 rows selected.
SQL> /
ID FILE# BLOCK# ROW#
———- ———- ———- ———-
16 6 2 0
14 6 4 0
13 6 5 0
21 7 7 0
8 6 10 0
17 7 11 0
0 6 13 0
2 7 16 0
15 6 3 0
12 6 6 0
11 6 7 0
9 6 9 0
18 7 10 0
6 7 12 0
4 7 14 0
3 7 15 0
25 7 3 0
24 7 4 0
23 7 5 0
22 7 6 0
20 7 8 0
10 6 8 0
19 7 9 0
7 6 11 0
1 6 12 0
5 7 13 0
26 rows selected.
SQL> /
ID FILE# BLOCK# ROW#
———- ———- ———- ———-
25 7 3 0
24 7 4 0
23 7 5 0
20 7 8 0
1 6 12 0
0 6 13 0
16 6 2 0
13 6 5 0
22 7 6 0
21 7 7 0
9 6 9 0
7 6 11 0
18 7 10 0
6 7 12 0
4 7 14 0
15 6 3 0
14 6 4 0
12 6 6 0
11 6 7 0
10 6 8 0
8 6 10 0
19 7 9 0
17 7 11 0
5 7 13 0
3 7 15 0
2 7 16 0
26 rows selected.
Regards
Comment by Michel Cadot — March 9, 2007 @ 7:20 pm BST Mar 9,2007 |