I posted a little holiday quiz – timed to appear just before midnight (GMT) on 24th December – that asked about the number of rows sorted and the memory used for queries like:
select sortcode from ( select sortcode from t1 order by sortcode ) where rownum <= 10 ;
The number and variety of the responses was gratifying. It’s always interesting to see how many important little details appear as people start to tackle even fairly straight-forward questions like this.
The first question fired back at me was the obvious one – what was the type definition of sortcode ? Without this critical piece of information it’s not possible to say much about the memory requirements. The answer was char(6). (I hadn’t omitted this deliberately, it just happens that in the UK banking system sort-codes are 6 character fixed length codes – and that’s what the column was supposed to represent – not that anyone else would have known that, of course.)
But knowing that it’s char(6) isn’t enough – after all we have characters sets that are fixed single-byte at one extreme and variable length up to four bytes at the other. On top of that, if you want to estimate the total memory used you also need to know whether you are talking about 32-bit Oracle or 64-bit Oracle because the size of the pointers used in the sort mechanism varies with the size of CPU. Then there’s the problem of Oracle version – since 10gR2 introduced a new sorting mechanism which (if it gets used) consumes less overhead.
One respondent even suggested that I might have set the table up as an index-organized table (IOT), or as a table in an index cluster – thus allowing myself an indexed access path without explicitly “creating an index ont the table”. (A reasonable suggestion in a real-life scenario – but I would (probably) have given a little warning if I were trying to be tricky). A suggestion that didn’t appear, but may have been relevant, was the possibility that the table was range partitioned on the column I was querying – it’s possible (but I haven’t checked it ) that even in the absence of indexes the optimizer might have done something clever in some versions to use partition elimination. [Update: I see Charles Hooper thought of, and tested, that idea while I was typing up this note].
Various people then modelled the problem, using explain plan, event 10046 with tkprof, queries against v$sessstat and so on, so get an idea of what happens – and found some (apparently) contradictory results. In one case, checking v$sesstat showed 1,000,000 rows sorted, while a simultaneous check of the tkprof output reported the actual number of rows from the sort order by stopkey operation was 10. In another case the execution plan from explain plan for this query and the simpler “select sortcode from t1 order by sortcode” both showed the same results for rows, bytes and temporary (sort) space.
The one thing that didn’t get mentioned in any of the comments was event 10032 – the event to trace sort statistics. Running 10.2.0.3, this is the content of the tracefile I got for the query returning just 10 rows (the table held 1,048,576 rows rather than the 1,000,000 stated in the original problem):
---- Sort Parameters ------------------------------ sort_area_size 65536 sort_area_retained_size 65536 sort_multiblock_read_count 1 max intermediate merge width 3 ---- Sort Statistics ------------------------------ Input records 1048576 Output records 114 Total number of comparisons performed 1048968 Comparisons performed by in-memory sort 1048968 Total amount of memory used 8192 Uses version 1 sort Does not use asynchronous IO ---- End of Sort Statistics -----------------------
You’ll notice that (a) there are no statistics about disk I/O – this sort completed in memory; (b) the memory allocated as a workarea by the end of the sort was 65,536 bytes – I was using automatic workarea_size_policy, so the sort_area_size and sort_area_retained_size lines report the final memory allocation); and (c) the total memory used was only 8192 bytes.
It is a little odd, you may think, that the number of output rows was reported as 114 rather than the 10 we asked for; it’s also fairly important that Oracle has reported a version 1 sort – the mechanism that was replaced in 10gR2 by a new sorting algorithm.
In comparison look at the trace information for the query “select sortcode from t1 order by sortcode”
--- Sort Parameters ------------------------------ sort_area_size 19013632 sort_area_retained_size 19013632 sort_multiblock_read_count 1 max intermediate merge width 1159 *** 2009-12-23 14:41:30.062 ---- Sort Statistics ------------------------------ Input records 1048576 Output records 1048576 Total number of comparisons performed 11324792 Comparisons performed by in-memory sort 11324792 Total amount of memory used 19013632 Uses version 2 sort Does not use asynchronous IO
Note particularly that the memory demand is much higher, the sort is a version 2 sort, and the number of comparisons performed is much higher. Oracle is doing something a little bit clever when is handles the “short” sort, but it can only do it by using the version 1 sort.
I pointed out in Cost Based Oracle – Fundamentals that Oracle does not “sort” data in the way that a person would sort a deck of cards (say); basic sorting algorithms for computers involve copying the data into memory and then manipulating a set of pointers to the data so that walking through the pointers in order causes you to jump around the copied data, reading it back in sorted order.
For a version 1 sort, Oracle’s algorithm uses a “binary insertion tree”. This means it is basically building a balanced b-tree index on the data as the data is read into memory; moreover there are only two entries in each node of the tree (each node points to just two child nodes or, ultimately, just two items of data). This balanced binary tree algorithm is resource-intensive on memory and CPU – which is why Oracle finally introduced a better algorithm in 10gR2. (When I forced Oracle to use the version 1 sort on the full data set the memory required was 24MB with 20M comparisons compared to the 19MB and 11M comparisons above).
However, if you are building a binary index as you read the data an interesting advantage appears when you only want to report a tiny subset of the full sorted data set. And at this point, you uncover the ambiguity in the question: “How many rows does Oracle sort?”
Consider the sequence of steps, which must go something like this:
- we wish to report the top 10 rows, and the optimizer is aware of this and has special code to handle the task.
- we copy the first 10 rows from the table into memory and build the binary tree – and we keep track of the largest value we have linked into the binary tree.
- from this point onwards we read a value into memory and compare it with the known largest value in the tree
- if the value is too large we discard it and read the next value, otherwise we insert it into the binary tree and discard the previous largest value from the tree
The total number of comparisons we perform is made up of two components – comparing every row with the current largest (so N-1 comparisons if we have N rows), and doing the comparisons (log2(N) for each row) to insert a few “replacement” values into a “small” binary tree.
So how many rows have we sorted ? You could argue for three possibilities –
- the 10 rows we finally get
- the 1,000,000 rows we looked at
- or the “X” rows that we inserted into the binary tree as we went along
Notice in our “short sort” trace above that the number of output rows reports was 114 – I assume that this is the number of rows that were inserted into the binary tree. Since we have a binary tree holding 10 items it has a height of 4, so we should expect the total number of comparisons to be roughly 1048576 + 4 * 114 … and the actual result is pretty close.
As a final comment on how unfair my original question was, just think about the effect the order of the incoming data has on the efficiency of the algorithm. The number of inserts into the b-tree could vary dramatically with the pre-existing ordering of the data. Here’s an extreme data set to make the point.
execute dbms_random.seed(0) create table t1 as with generator as ( select --+ materialize rownum id from all_objects where rownum <= 5000 ) select rownum - 1 ordered, 1000000 - rownum reversed, trunc(dbms_random.value(1,1000000)) randomised from generator v1, generator v2 where rownum <= 1 * 1048576 ; select * from ( select * from t1 order by ordered -- select * from t1 order by reversed -- select * from t1 order by randomised ) where rownum <= 10 ;
Depending on your choice of column in the order by clause, the output rows, comparisons, and memory used are as follows:
Ordered ------- Output records 10 Total number of comparisons performed 1048575 Total amount of memory used 8192 Reversed -------- Output records 1048576 Total number of comparisons performed 5242856 Total amount of memory used 1130496 Randomised ---------- Output records 135 Total number of comparisons performed 1049032 Total amount of memory used 8192