Oracle Scratchpad

July 22, 2008

Sorted Hash Clusters – 2

Filed under: Infrastructure,Oracle,Performance — Jonathan Lewis @ 7:16 am BST Jul 22,2008

[Back to part 1][Forward to part 3]

In the first part of this series, I showed you how a query by hash key against a sorted hash cluster would return the data in an order dictated by the sort key – without showing a sort operation in the execution plan, even in the absence of an ‘order by’ clause.

The next step in the investigation is to look at a an example with a larger data set so that an interesting effect can become visible.  In this example I will use just one value for the hash key, and create a number of rather long rows.

create cluster sorted_hash_cluster (
	hash_value	number(4)	,
	sort_value	number(4)	sort
)
hashkeys 10
hash is hash_value
size 8000
;

create table sorted_hash_table (
	hash_value	number(4),
	sort_value	number(4),
	v1		varchar2(10),
	padding		varchar2(1000)
)
cluster sorted_hash_cluster (
	hash_value	,
	sort_value
)
;

begin
	for i in 1..100 loop
		insert into sorted_hash_table values(
			1,
			trunc(dbms_random.value(0,1000)),
			lpad(i,10),
			rpad('x',1000)
		);
		commit;
	end loop;
end;
/

After collecting stats, I can see that the data is spread across 15 blocks (I’m running this on 10.2.0.3, using an 8KB block size and freelist management).

Having established the size of the data, I run the following queries – checking v$sysstat and x$kcbsw/x$kcbwh as I do so, and enabling trace event 10032 (which is the sort statistics trace).

select
	hash_value, sort_value, rowid, v1
from
	sorted_hash_table
where
	hash_value = 1
order by
	sort_value
;

select
	hash_value, sort_value, rowid, v1, padding
from
	sorted_hash_table
where
	hash_value = 1
order by
	sort_value
;

Note that the second query is selecting the 1,000 character padding column – which means the result set is going to be sized at about 100KB.

The execution plan for both queries looks like the following one (no surprises – we access by hash key, and get an invisible sort taking place in both cases). However, although the rows and costs for the two queries are identical, they will show  different values for the bytes column. One shows 3000, the other shows 99K.

----------------------------------------------------------------------------
| Id  | Operation         | Name              | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                   |   100 |  3000 |     0   (0)|
|*  1 |  TABLE ACCESS HASH| SORTED_HASH_TABLE |   100 |  3000 |            |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("HASH_VALUE"=1)

The next step is to look at the results of the 10032 trace – which (with some cosmetic surgery) both look like this:

---- Sort Statistics ------------------------------
Input records                             100
Output records                            100
Total number of comparisons performed     542
  Comparisons performed by in-memory sort 542
Total amount of memory used               8192
Uses version 1 sort
Does not use asynchronous IO

Although one data set apparently was much larger than the other, we still used only 8KB to sort it ! So take a look at the critical output of x$kcbsw:

          Why0          Why1          Why2    Other Wait
          ----          ----          ----    ----------
             7             0             0             0 kdswh02: kdsgrp
             7             0             0             0 kdiwh06: kdifbk
             7             0             0             0 kdiwh07: kdifbk
             1             0             0             0 kdiwh08: kdiixs
           115             0             0             0 kdqs_get_block_where: kdsq_get_block
          ----          ----          ----    ----------
           137             0             0             0 Total: 5 rows

Note particularly the 115 calls to: kdqs_get_block_where: kdsq_get_block … that’s accesses to the hash cluster blocks: I’m fairly convinced that that’s 15 accesses to read the blocks on a first pass, and 100 accesses to pick up the rows by rowid in the right order on the second pass.

Oracle has used a two-pass approach. It’s used the hash key to pick up the rowid and sort key from every relevant row, then it has sorted the result set by sort key, then it’s used the rowids to revisit the row and collect the bulk of the data. This keeps the sorting to a minimum – but can push up the buffer gets and related latch activity.

For comparison, if you change the cluster into a simple hash cluster on just the hash key, the x$kcbsw/x$kcbwh results and sort statistics for the larger query look like this:

          Why0          Why1          Why2    Other Wait
          ----          ----          ----    ----------
             7             0             0             0 kdswh02: kdsgrp
            15             0             0             0 kdswh06: kdscgr
             7             0             0             0 kdiwh06: kdifbk
             7             0             0             0 kdiwh07: kdifbk
          ----          ----          ----    ----------
            36             0             0             0 Total: 4 rows

Input records                             100
Output records                            100
Total number of comparisons performed     639
  Comparisons performed by in-memory sort 639
Total amount of memory used               112640
Uses version 2 sort
Does not use asynchronous IO

Note that the main data acquisition is through 15 calls to kdswh06: kdscgr; then we need a larger amount of memory for sorting, and switch to the newer ‘version 2′ sort that appear in 10gR2.

Sorted hash clusters play some very interesting games; and part 3 will disclose the final interesting detail of how they work.

[Back to part 1][Forward to part 3]

4 Comments »

  1. Hi, Jonathan:
    I have a question about the exact page arrangement of the sorted hash cluster.

    In the example you listed in this article, if we create another table (e.g., sorted_hash_table2) in the sorted_hash_cluster. There might be two ways to store
    the data.

    1. Block 1-1000 are used for sorted_hash_table, Block 1001-2000 are used for
    sorted_hash_table2.
    2. Block 1-2000 are used for both sorted_hash_table and sorted_hash_table2.
    Within each block, the data from both tables are stored together.

    Solution 2 might be good for join performance, but lock will be a nightmare.

    Could you tell me what’s your thoughts?

    Thanks!

    Comment by tony — September 24, 2008 @ 6:29 pm BST Sep 24,2008 | Reply

  2. Tony,

    The point of clustering is to put into one place all the data items that go together – this is true of index clusters and hash clusters.

    The answer to your question is option 2.

    The hash cluster defines a key that is common to both tables, the volume of data per key and the number of key value you expect. Oracle decides how many blocks will be needed, and how many keys per block.

    When you add tables to the cluster you identify certain table columns with the cluster key. If two rows (whether they are from the same table or different tables) hold the same value in the key columns, then they are supposed to go in the same block.

    I don’t see why this should cause a particularly nightmarish locking problem.

    Comment by Jonathan Lewis — September 26, 2008 @ 1:45 pm BST Sep 26,2008 | Reply

  3. [...] Sorted Hash Clusters Filed under: Execution plans, Infrastructure, Tuning, trace files — Jonathan Lewis @ 9:25 pm UTC Jul 13,2008 [Forward to Part 2] [...]

    Pingback by Sorted Hash Clusters « Oracle Scratchpad — July 10, 2009 @ 12:35 pm BST Jul 10,2009 | Reply

  4. [...] this very specific trick that makes the sorted hash cluster particularly useful. I showed in another series of notes how a simple query against a single table in a sorted hash cluster minimised the volume of data [...]

    Pingback by Manual Optimisation 3 « Oracle Scratchpad — December 29, 2009 @ 11:20 am BST Dec 29,2009 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,877 other followers