Oracle Scratchpad

June 3, 2007

Sorting

Filed under: Infrastructure,sorting,Troubleshooting — Jonathan Lewis @ 8:45 pm BST Jun 3,2007

This queston came up on the Oracle newsgroup a few days ago:

I have a table (call it policy) with three columns a, b and c. The table has two rows, with column c having value zero for both rows. I run the following query

select * from policy order by c;

As both the rows have a value of zero, the result should be sorted ascending by rowid, but I see the opposite;  viz. the result set is sorted descending by rowid.

Is that an issue with the version of 10g server, I am using or is it some settings of the Oracle server?

Various people replied to say that you should never assume any ordering beyond the order you explicitly state in the order by clause. But the question does raise a couple of interesting points.

Let’s start by running the test (it’s not hard to write up a test case, so why not do so when you ask the question). The following is good enough – and I’ve appended the output of the query when running on 10.2.0.1:


drop table t1;
create table t1 (a number, b number, c number);        

insert into t1 values(1,1,0);
insert into t1 values(1,1,0);        

commit;        

select t1.*, t1.rowid from t1 order by c;
         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAATncAAGAAABSKAAB
         1          1          0 AAATncAAGAAABSKAAA        

2 rows selected.

Sure enough, the results are in the “wrong” order.

So what do you do next ? The first couple of ideas are: add a third, fourth and fifth row to see if the “descending order” observation is accurate; then try running the test on a different version of Oracle.

Here’s the output from 10.2.0.1, after adding more and more rows:


         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAATncAAGAAABSKAAA
         1          1          0 AAATncAAGAAABSKAAC
         1          1          0 AAATncAAGAAABSKAAB       

         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAATncAAGAAABSKAAA
         1          1          0 AAATncAAGAAABSKAAD
         1          1          0 AAATncAAGAAABSKAAC
         1          1          0 AAATncAAGAAABSKAAB       

         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAATncAAGAAABSKAAA
         1          1          0 AAATncAAGAAABSKAAB
         1          1          0 AAATncAAGAAABSKAAE
         1          1          0 AAATncAAGAAABSKAAD
         1          1          0 AAATncAAGAAABSKAAC

The results are NOT in descending order of rowid – it just looks that way in the very first case.

But here’s the output from the same test running on 9.2.0.8:


         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAALJkAAJAAABIKAAA
         1          1          0 AAALJkAAJAAABIKAAB      

         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAALJkAAJAAABIKAAA
         1          1          0 AAALJkAAJAAABIKAAB
         1          1          0 AAALJkAAJAAABIKAAC      

         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAALJkAAJAAABIKAAA
         1          1          0 AAALJkAAJAAABIKAAB
         1          1          0 AAALJkAAJAAABIKAAC
         1          1          0 AAALJkAAJAAABIKAAD      

         A          B          C ROWID
---------- ---------- ---------- ------------------
         1          1          0 AAALJkAAJAAABIKAAA
         1          1          0 AAALJkAAJAAABIKAAB
         1          1          0 AAALJkAAJAAABIKAAC
         1          1          0 AAALJkAAJAAABIKAAD
         1          1          0 AAALJkAAJAAABIKAAE

So is seems more likely that there is a sorting effect (possibly accidental) on rowids in 9.2.0.8.

The Answer

Oracle introduced a new sorting algorithm (sometimes known as the Version 2 sort, which is how it is labelled in the 10032 trace) in 10.2.

The previous algorithm was effectively building an in-memory index on the incoming data using a balanced binary tree and seeking to the righ (i.e. optimised towards data that appeared in the correct order and keeping such data in the order of appearance – hence the apparent sorting of rowids in our example in 9i).

The CPU and memory overheads for this algorithm are a bit fierce for large sorts, so the new algorithm does something completely different (possibly based on a variant of the heapsort, though it isn’t actually a heapsort) which is more efficient on memory and CPU. It has the side-effect though, of re-ordering incoming rows even when the data is not arriving out of order.

Someone who knew their sorting algorithms really well might even be able to infer the algorithm Oracle was using by extending the test case and watching the rowids re-ordering themselves as the result set grows. But I’m not going to volunteer for that task.

If you want to disable the new sorting mechanism, there is a hidden parameter to affect it. As usual, you shouldn’t use hidden parameters without first receiving confirmation from Oracle support that you need to, but the relevant parameter is: _newsort_enabled, which defaults to true in 10g.

6 Comments »

  1. Same set of data is working fine in Oracle 10.1.0.2, that is its giving next priority to rowid if the order by column has same values.

    Also , For same data if you use nvl(c,0) in the order by clause, than again some glitch in sorting.

    Comment by Neeraj — June 4, 2007 @ 1:00 pm BST Jun 4,2007 | Reply

  2. Neeraj, it is important to realise that neither vesion is ‘working fine’, or displaying a ‘glitch’.

    Any apparent ordering that has not been specified is simply a side-effect of implementation, and is therefore neither right nor wrong.

    10.1 shows the old behaviour because the new sort (like the new hash group by) was introduced in 10.2

    Comment by Jonathan Lewis — June 5, 2007 @ 7:18 am BST Jun 5,2007 | Reply

  3. I saw another case but am not sure if it’s related to the “newsort” (tried to disable it without luck):

    drop table p purge;
    drop table c purge;
    create table p (c1 varchar2(10));

    insert into p values(‘C001′);
    insert into p values(‘T001′);

    create index idx$p on p(c1);

    create table c (c1 varchar2(10), c2 number);

    insert into c values(‘C001′, 1);
    insert into c values(‘C001′, 2);
    insert into c values(‘C001′, 3);

    insert into c values(‘T001′, 3);
    insert into c values(‘T001′, 2);
    insert into c values(‘T001′, 1);

    create index idx$c on c(c1, c2);

    select p.c1, c.c2
    from p, c
    where p.c1 = c.c1
    order by 1
    /

    — From 8i: c2 shown in the inserted order
    C1 C2
    ———-
    C001 1
    C001 2
    C001 3
    T001 3
    T001 2
    T001 1

    — From 10g: c2 shown in the “asc” order
    C1 C2
    ———-
    C001 1
    C001 2
    C001 3
    T001 1
    T001 2
    T001 3

    Acutally, the query shows the same with or without “order by” clause in both 8i and 10gR2.

    Comment by Dan — July 31, 2007 @ 2:06 am BST Jul 31,2007 | Reply

  4. Dan,
    In a case like this you really can’t comment until you’ve checked the execution plan. A possible explanation for the difference could simply be that 8i used a hash join and 10g used a nested loop with index on the second table.

    Comment by Jonathan Lewis — August 2, 2007 @ 3:43 pm BST Aug 2,2007 | Reply

  5. I came across MetaLink Note#6817844.8 titled “Bug 6817844 – Multi pass sort with auto memory management even with plenty of PGA”

    I’ll try to build a test case if I have time tonight.

    However, would you care to comment on the Bug ?

    Comment by Hemant K Chitale — May 26, 2008 @ 6:47 am BST May 26,2008 | Reply

  6. Hermant,

    There’s not a lot that can be said, really.

    What I’d like to see is a little more detail of why it can happen – are there any features like “number of columns in a multi-column sort”, “critical byte length of the things being sorted”, “some aspect of number of concurrent sorts” – and so on.

    There has to be something more than “it happens at random”, or they wouldn’t be able to fix it. (And if it is a bug that means it can seem to happen at random I’d like to know that too).

    Comment by Jonathan Lewis — May 27, 2008 @ 9:54 am BST May 27,2008 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,267 other followers