Oracle Scratchpad

August 28, 2013

Sorted Hash Clusters RIP

Filed under: Bugs,Infrastructure,Oracle — Jonathan Lewis @ 8:22 am BST Aug 28,2013

Sorted Hash Clusters have been around for several years, but I’ve not yet seen them being used, or even investigated in detail. This is a bit of a shame, really, because they seem to be engineered to address a couple of interesting performance patterns.

The basic concept is that data items that look alike are stored together (clustered) by applying a hashing function to generate a block address; but on top of that, if you query the data by “hashkey”, the results are returned in sorted order of a pre-defined “sortkey” without any need for sorting. (On top of everything else, the manuals describing what happens and how it works are wrong).

Yesterday I had reason to take a closer look at them, and decided that perhaps the reason no one talks about them is that they simply aren’t safe.  Here’s a trivial demonstration, which I’ve run on 10.2.0.5, 11.2.0.3, and 12.1.0.1:


execute dbms_random.seed(0)

create cluster sorted_hash_cluster (
	hash_value	number(6,0),
	sort_value	varchar2(2)	sort
)
size 300
hashkeys 100
;

create table sorted_hash_table (
	hash_value	number(6,0),
	sort_value	varchar2(2),
	v1		varchar2(10),
	padding		varchar2(30)
)
cluster sorted_hash_cluster (
	hash_value, sort_value
)
;

begin
	for i in 1..5000 loop
		insert into sorted_hash_table values(
			trunc(dbms_random.value(0,99)),
			dbms_random.string('U',2),
			lpad(i,10),
			rpad('x',30,'x')
		);
		commit;
	end loop;
end;
/

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'sorted_hash_table'
	);
end;
/

select count(*) from sorted_hash_table where hash_value = 92;
select count(*) from sorted_hash_table where hash_value = 92 and sort_value is null;
select count(*) from sorted_hash_table where hash_value = 92 and sort_value is not null;

select * from sorted_hash_table where hash_value = 92 and sort_value >= 'YR';
select * from sorted_hash_table where hash_value = 92 and sort_value > 'YR';

I think the nature of the last two queries is exactly the type for which the feature has been invented – just check the results, which come from a cut-n-paste after setting echo on:


SQL> select count(*) from sorted_hash_table where hash_value = 92;

  COUNT(*)
----------
        60

1 row selected.

SQL> select count(*) from sorted_hash_table where hash_value = 92 and sort_value is null;

  COUNT(*)
----------
        60

1 row selected.

SQL> select count(*) from sorted_hash_table where hash_value = 92 and sort_value is not null;

  COUNT(*)
----------
        60

1 row selected.

SQL> select * from sorted_hash_table where hash_value = 92 and sort_value >= 'YR';

HASH_VALUE SO V1         PADDING
---------- -- ---------- ------------------------------
        92 YR       4773 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZF        250 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZJ       2046 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        92 ZT         65 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

4 rows selected.

SQL>
SQL> select * from sorted_hash_table where hash_value = 92 and sort_value > 'YR';

no rows selected

So: Null is not null, and ‘ZF’ is not greater than ‘YR’, it’s only greater than or equal to ‘YR’ !
I’d be interested to see the test cases that the developer used for this feature that allowed it to ship at all.

4 Comments »

  1. Probably the reason they aren’t save is that no one uses them ?
    Otherwise, there would be a filed bug ( and a fix, because such a fundamental misbehaviour which already existed in 10.2 would have been fixed, I would suppose ).

    Comment by Matthias Rogel — August 28, 2013 @ 9:12 am BST Aug 28,2013 | Reply

    • Matthias,

      I think you’re right – the more use a feature gets the more rapidly the bugs are found and fixed. The thing that surprises me about sorted hash clusters is that it looks like the sort of feature that was designed for a specific (big) customer – but if it was the customer decided not to use it.

      Comment by Jonathan Lewis — August 29, 2013 @ 6:16 am BST Aug 29,2013 | Reply

  2. Jonathan,

    I have filed public oracle bug 17373605 on your behalf to have this resolved.

    Bernard
    (Oracle Rdbms BDE)

    Comment by Bernard — August 28, 2013 @ 12:02 pm BST Aug 28,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,161 other followers