Oracle Scratchpad

June 1, 2020

Order By

Filed under: Bugs,CBO,Indexing,Oracle — Jonathan Lewis @ 1:05 pm BST Jun 1,2020

This is a brief note with an odd history – and the history is more significant than the note.

While searching my library for an example of an odd costing effect for the “order by” clause I discovered a script that looked as if I’d written for 11.1.0.6 in 2008 to demonstrate a redundant sort operation appearing in an execution plan; and then I discovered a second script written for 11.2.0.4 in 2014 demonstrating a variant of the same thing (presumably because I’d not found the original script in 2014) and the second script referenced a MOS bug number

Bug 18701129 : SORT ORDER BY ISN’T AVOIDED WHEN ROWID IS ADDED TO ORDER BY CLAUSE

Whenever I “discover” an old bug test I tend to re-run it to check whether or not the bug has been fixed.  So that’s what I did, and found that the anomaly was still present in 19.3.0.0. The really odd thing, though, was that the bug note no longer existed – and even after I’d done a few searches involving the text in the description I couldn’t manage to find it!

For the record, here’s the original 2008 script (with a couple of minor edits)


rem
rem     Script:         order_by_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2008
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0        Still sorting
rem             12.2.0.1
rem             11.1.0.6
rem

set linesize 180
set pagesize 60

create table test 
as 
select  * 
from    all_objects 
where   rownum <= 10000 -- >  comment to avoid wordpress format issue
;

alter table test modify object_name not null;
create index i_test_1 on test(object_name);

analyze table test compute statistics;

set serveroutput off
alter session set statistics_level = all;

select  * 
from    (select * from test order by object_name) 
where 
        rownum < 11 -- > comment to avoid wordpress format issue
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));



select  * 
from    (select /*+ index(test) */ * from test order by object_name,rowid) 
where
        rownum < 11 -- > comment to avoid wordpress format issue
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

alter session set statistics_level = typical;
set serveroutput on

Yes, that is an analyze command – it’s a pretty old script and I must have been a bit lazy about writing it. (Or, possibly, it’s a script from an Oracle-l or Oracle forum posting and I hadn’t re-engineered it.)

I’ve run two queries – the first uses an inline view to impose an order on some data and then selects the first 10 rows. The second query does nearly the same thing but adds an extra column to the “order by” clause – except it’s not a real column it’s the rowid pseudo-column. Conveniently there’s an index on the table that is a perfect match for the “order by” clause and it’s on a non-null column so the optimizer can walk the index in order and stop after 10 rows.

Adding the rowid to the “order by” clause shouldn’t make any difference to the plan as the index Oracle is using is a single column non-unique index, which means that the internal representation makes it a two-column index where the rowid is (quite literally) stored as the second column. But here are the two execution plans:


----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |      1 |        |     10 |00:00:00.01 |       7 |
|*  1 |  COUNT STOPKEY                |          |      1 |        |     10 |00:00:00.01 |       7 |
|   2 |   VIEW                        |          |      1 |     10 |     10 |00:00:00.01 |       7 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TEST     |      1 |  10000 |     10 |00:00:00.01 |       7 |
|   4 |     INDEX FULL SCAN           | I_TEST_1 |      1 |     10 |     10 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<11)



----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |          |      1 |        |     10 |00:00:00.01 |    4717 |       |       |          |
|*  1 |  COUNT STOPKEY                         |          |      1 |        |     10 |00:00:00.01 |    4717 |       |       |          |
|   2 |   VIEW                                 |          |      1 |  10000 |     10 |00:00:00.01 |    4717 |       |       |          |
|*  3 |    SORT ORDER BY STOPKEY               |          |      1 |  10000 |     10 |00:00:00.01 |    4717 |  4096 |  4096 | 4096  (0)|
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| TEST     |      1 |  10000 |  10000 |00:00:00.01 |    4717 |       |       |          |
|   5 |      INDEX FULL SCAN                   | I_TEST_1 |      1 |  10000 |  10000 |00:00:00.01 |      44 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<11)
   3 - filter(ROWNUM<11)


When I add the rowid to the “order by” clause the optimizer no longer sees walking the index as an option for avoiding work; it wants to collect all the rows from the table, sort them, and then report the first 10. In fact walking the index became such an expensive option that I had to hint the index usage (hence the non-null declaration) to make the optimizer choose it, the default plan for 19.3 was a full tablescan and sort.

It’s just a little example of an edge case, of course. It’s a pity that the code doesn’t recognise the rowid as (effectively) a no-op addition to the ordering when the rest of the “order by” clause matches the index declaration, but in those circumstances the rowid needn’t be there at all and you wouldn’t expect anyone to include it.

As I said at the start – the interesting thing about this behaviour is that it was once described in a bug note that has since disappeared from public view.

 

1 Comment »

  1. Jonathan,

    In spite of this edge case, the sort calculation in Oracle is still much more sophisticated than in other products. MS SQL Server, for example, calculates the cost based on the maximum varchar size as defined in the table column instead of using the average size in the column statistics. Consequently, the cost will be massively overestimated in cases where the average data size is much smaller than the maximum data size in the column definition.

    Comment by Nenad Noveljic — June 1, 2020 @ 3:56 pm BST Jun 1,2020 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by WordPress.com.