Oracle Scratchpad

September 21, 2014

Partitioned Clusters

Filed under: 12c,clusters,Infrastructure,Oracle — Jonathan Lewis @ 7:28 pm GMT Sep 21,2014

In case you hadn’t noticed it, partitioning has finally reached clusters in 12c – specifically 12.1.0.2. They’re limited to hash clusters with range partitioning, but it may be enough to encourage more people to use the technology. Here’s a simple example of the syntax:


create cluster pt_hash_cluster (
        id              number(8,0),
        d_date          date,
        small_vc        varchar2(8),
        padding         varchar2(100)
)
-- single table
hashkeys 10000
hash is id
size 700
partition by range (d_date) (
        partition p2011Jan values less than (to_date('01-Feb-2011','dd-mon-yyyy')),
        partition p2011Feb values less than (to_date('01-Mar-2011','dd-mon-yyyy')),
        partition p2011Mar values less than (to_date('01-Apr-2011','dd-mon-yyyy'))
)
;

I’ve been waiting for them to appear ever since 11.2.0.1 and the TPC-C benchmark that Oracle did with them – they’ve been a long time coming (check the partition dates – that gives you some idea of when I wrote this example).

Just to add choice (a.k.a. confusion) 12.1.0.2 has also introduce attribute clustering so you can cluster data in single tables without creating clusters – but only while doing direct path loads or table moves. The performance intent is similar, though the technology and circumstances of use are different.

June 17, 2014

Cluster Nulls

Filed under: clusters,Indexing,Infrastructure,NULL,Oracle — Jonathan Lewis @ 7:39 am GMT Jun 17,2014

Yesterday’s posting was a reminder that bitmap indexes, unlike B-tree indexes in Oracle,  do store entries where every column in the index is null. The same is true for cluster indexes – which are implemented as basic B-tree indexes. Here’s a test case I wrote to demonstrate the point.

drop table tc1;
drop cluster c including tables;

purge recyclebin;

create cluster c(val number);
create index c_idx on cluster c;
create table tc1 (val number, n1 number, padding varchar2(100)) cluster c(val);

insert into tc1
select
	decode(rownum,1,to_number(null),rownum), rownum, rpad('x',100)
from
	all_objects
where
	rownum <= 100
;

insert into tc1 select * from tc1;
insert into tc1 select * from tc1;
insert into tc1 select * from tc1;
insert into tc1 select * from tc1;
insert into tc1 select * from tc1;
commit;

analyze cluster c compute statistics;
execute dbms_stats.gather_table_stats(user,'tc1');

set autotrace traceonly explain

select * from tc1 where val = 2;
select * from tc1 where val is null;

set autotrace off

Here are the two execution plans from the above queries:


------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |    32 |  3424 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS CLUSTER| TC1   |    32 |  3424 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN  | C_IDX |     1 |       |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("VAL"=2)

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    32 |  3424 |    14   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TC1  |    32 |  3424 |    14   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("VAL" IS NULL)

Spot the problem: the second query doesn’t use the index. So, despite the fact that I said that fully null index entries are stored in cluster indexes you might (as the first obvious question) wonder whether or not I was right. So here’s a piece of the symbolic dump of the index.


kdxconro 100                -- Ed: 100 entries (rows) in the leaf block

row#0[8012] flag: -----, lock: 0, data:(8):  02 00 02 0b 00 00 01 00
col 0; len 2; (2):  c1 03

row#1[7999] flag: -----, lock: 0, data:(8):  02 00 02 0c 00 00 01 00
col 0; len 2; (2):  c1 04

...

row#98[6738] flag: -----, lock: 2, data:(8):  02 00 02 6d 00 00 01 00
col 0; len 2; (2):  c2 02

row#99[8025] flag: -----, lock: 0, data:(8):  02 00 02 0a 00 00 01 00
col 0; NULL

The NULL entry is right there – sorted as the high value – just as it is in a bitmap index. But the optimizer won’t use it, even if you hint it.

Of course, Oracle uses it internally when inserting rows (how else, otherwise, would it rapidly find which the heap block needed for the next NULL insertion) – but it won’ use it to retrieve the data, it always uses a full table (cluster) scan. That’s really a little annoying if you’ve made the mistake of allowing nulls into your cluster.

There is a workaround, of course – in this case (depending on version) you could create a virtual column with index or a function-based index that (for example) uses the nvl2() function to convert nulls to a know value and all non-null entries to null; this would give you an index on just the original nulls which would be as small as possible and also have a very good clustering_factor. You’d have to change the code from “column is not null” to a predicate that matched the index, though. For example:


create index tc1_null on tc1(nvl2(val,null,0));

execute dbms_stats.gather_table_stats(user,'tc1', method_opt=>'for all hidden columns size 1');

select * from tc1 where nvl2(val,null,0) = 0;

----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |    32 |  3456 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TC1      |    32 |  3456 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | TC1_NULL |    32 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(NVL2("VAL",NULL,0)=0)

I’ve included the compress keyword (which implicitly means “all columns” for a non-unique index) in the definition since you only need 4 repetitions of the single-byte entry that is the internal representation of zero before you start to win on the storage trade-off – but since it’s likely to be a small index anyway that’s not particularly important. (Reminder: although Oracle collects index stats on index creation, you still need to collect column stats on the hidden column underlying the function-based columns on the index to give the optimizer full information.)

February 2, 2014

Clustaghhh!

Filed under: clusters,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 10:27 pm GMT Feb 2,2014

It doesn’t matter which bit of Oracle technology you want to use, eventually someone, somewhere, runs into the special case where something nasty happens. Here’s an edge case for people using (index) clusters – Oracle Bug  17866999 ora-1499 for cluster following rman convert

It comes from a conversation on Oracle-L where Jack van Zanen reported a problem of inconsistent results after migrating data between platforms using rman to converts some tablespaces containing index clusters. This is the starting post where he shows a query that is clearly getting the wrong answer (select where channel_number = 503 obviously shouldn’t return data with channel_number 501).

(more…)

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,267 other followers