This note is just a little background for an impending article on the costs and effects of exchanging partitions, splitting partitions, merging partitions – or any other DDL on partitions except dropping them – in the light of “deferred global index maintenance”.
Summary Information: while a “typical” rowid is stored in a B-tree index using 6 bytes (relative file number, block within file, row within block), a rowid for a partitioned table is stored in a global, or globally partitioned, index as 10 bytes (table partition data object id, relative file number, block within file, row within block).
You may choose to skip the rest of the article which simply demonstrates this fact, but if you do read it then it may give you a better intuitive understanding of the consequences of partition maintenance commands.
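If you like to see the arithmetic spelled out, the 10-byte layout can be decoded with a few lines of Python. This is purely an illustration, not anything Oracle supplies; the bit widths used are the documented extended rowid layout (32-bit data_object_id, 10-bit relative file number, 22-bit block number, 16-bit row number), and the sample value is the first rowid from the leaf block dump that appears later in this note:

```python
def decode_extended_rowid(raw: bytes) -> dict:
    """Split a 10-byte extended rowid (as dumped from a global index)
    into its components: a 32-bit data_object_id, then the familiar
    6-byte restricted rowid packing a 10-bit relative file number,
    a 22-bit block number, and a 16-bit row number."""
    assert len(raw) == 10
    file_block = int.from_bytes(raw[4:8], "big")
    return {
        "data_object_id": int.from_bytes(raw[0:4], "big"),
        "file_no":  file_block >> 22,         # top 10 bits
        "block_no": file_block & 0x3FFFFF,    # bottom 22 bits
        "row_no":   int.from_bytes(raw[8:10], "big"),
    }

# First rowid from the leaf block dump shown later in this note:
dump = "00 02 97 47 09 00 04 04 00 31"
print(decode_extended_rowid(bytes.fromhex(dump.replace(" ", ""))))
```

The data_object_id decodes to 0x29747 (169,799) and the file number to 36, both of which you can check against the data dictionary output below.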
I’m going to start with a simple script to create a range partitioned table with a local and a global index.
rem
rem Script: global_indexes.sql
rem Author: Jonathan Lewis
rem Dated: Jan 2023
rem
rem Last tested
rem 19.11.0.0
rem
create table pt_range (
id not null,
grp,
small_vc,
padding
)
nologging
partition by range(id) (
partition p0200 values less than (200),
partition p0400 values less than (400),
partition p0600 values less than (600),
partition p0800 values less than (800),
partition p0100 values less than (1000)
)
as
select
rownum - 1 id,
mod(rownum,50) grp,
to_char(trunc(rownum/20)) small_vc,
rpad('x',100) padding
from
dual
connect by
level <= 1e3
;
alter table pt_range
add constraint pt_pk primary key(id)
using index (create unique index pt_pk on pt_range(id) local)
/

create index pt_i1 on pt_range(grp) nologging
/
Then I’m going to run a few queries against the data dictionary, and I’ll show you the results after each query. First a list of all the objects I’ve created in a previously empty schema. (I should point out that I’m running from SQL*Plus and I haven’t shown you all the column declarations in my login.sql.)
break on object_name skip 1
select
object_name, object_type,
object_id, to_char(object_id,'XXXXXXXX') hex_id,
subobject_name
from
user_objects
order by
1,2,3
;
OBJECT_NAME OBJECT_TYPE OBJECT_ID HEX_ID SUBOBJECT_NAME
-------------------- ----------------------- ---------- --------- ----------------------
PT_I1 INDEX 169810 29752
PT_PK INDEX 169804 2974C
INDEX PARTITION 169805 2974D P0200
INDEX PARTITION 169806 2974E P0400
INDEX PARTITION 169807 2974F P0600
INDEX PARTITION 169808 29750 P0800
INDEX PARTITION 169809 29751 P0100
PT_RANGE TABLE 169798 29746
TABLE PARTITION 169799 29747 P0200
TABLE PARTITION 169800 29748 P0400
TABLE PARTITION 169801 29749 P0600
TABLE PARTITION 169802 2974A P0800
TABLE PARTITION 169803 2974B P0100
I really ought to be reporting the data_object_id since that’s the column that Oracle generally uses for “physical” navigation: when you move an index, for example, its object_id stays the same but it gets a new data_object_id. Since the objects in my case are all brand new the data_object_id will match the object_id (though the INDEX and TABLE won’t have physical segments so won’t have a data_object_id at all). I’ve shown you the hexadecimal format of the (data_)object_id because that’s how the values will appear in the dump files that I’ll be producing later.
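As a quick sanity check on that hex_id column – it is simply the object_id rendered in hexadecimal, i.e. what to_char(object_id,'XXXXXXXX') produces – the same mapping can be reproduced anywhere (Python used here only as a convenient calculator; the IDs are copied from the listing above):

```python
# Table partition object_ids taken from the user_objects listing above;
# hex_id is simply the object_id rendered in hexadecimal.
table_partitions = {
    "P0200": 169799, "P0400": 169800, "P0600": 169801,
    "P0800": 169802, "P0100": 169803,
}
for name, object_id in table_partitions.items():
    print(f"{name}: {object_id} = 0x{object_id:X}")
```

Those hex values (29747 through 2974B) are the ones to keep in mind when reading the block dumps later on.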
rem object_id of the global index pt_i1, taken from the listing above
define m_global_id = 169810

column file_no  new_value m_file_no
column block_no new_value m_block_no
select
dbms_rowid.rowid_object(lbid) object_no,
dbms_rowid.rowid_relative_fno(lbid) file_no,
dbms_rowid.rowid_block_number(lbid) block_no,
ct
from (
select
lbid, count(*) ct
from
(
select
/*+ index_ffs(pt_range(grp)) */
sys_op_lbid(&m_global_id, 'L', pt_range.rowid) lbid
from
pt_range
where
grp >= 0
)
group by
lbid
)
/
OBJECT_NO FILE_NO BLOCK_NO CT
---------- ---------- ---------- ----------
169810 36 2310 398
169810 36 2311 202
169810 36 2309 400
I’ve run a messy little query using the sys_op_lbid() function to list all the index leaf blocks in the global index with their (tablespace-relative) file and block number (and number of index entries in the leaf block). I’m just going to pick the last reported leaf block and dump it:
alter system flush buffer_cache;
alter system dump datafile &m_file_no block &m_block_no;
If I examine the resulting trace file the bulk of the block will be made up of an ordered list of the index entries in the block.
row#0[8017] flag: -------, lock: 0, len=15
col 0; len 1; (1): 80
col 1; len 10; (10): 00 02 97 47 09 00 04 04 00 31
row#1[8002] flag: -------, lock: 0, len=15
col 0; len 1; (1): 80
col 1; len 10; (10): 00 02 97 47 09 00 04 05 00 24
row#2[7987] flag: -------, lock: 0, len=15
col 0; len 1; (1): 80
col 1; len 10; (10): 00 02 97 47 09 00 04 06 00 17
row#3[7972] flag: -------, lock: 0, len=15
col 0; len 1; (1): 80
col 1; len 10; (10): 00 02 97 47 09 00 04 07 00 0b
You’ll notice two things about this tiny extract:
- First, it appears to be a two-column index; that’s because Oracle handles a non-unique index by treating the table’s rowid as if it were an extra column in the index definition, so the “last column” of a non-unique index entry is the rowid of the row to which that entry points.
- Secondly, the col 1 values (the rowids) are 10 bytes rather than the “normal” 6.
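To make the first point concrete, here’s a toy model in Python – an illustration only, not how Oracle physically stores anything – of an index that treats the rowid as a trailing key column. Because the 10-byte rowid begins with the table partition’s data_object_id, entries for the same key value sort into partition order:

```python
# A toy model of a non-unique index entry: the rowid acts as a final,
# tie-breaking key column, so entries with the same key value end up
# sorted by rowid -- and since the rowid starts with the partition's
# data_object_id, equal keys cluster by table partition.
entries = [
    ("c114", bytes.fromhex("0002974b090006040012")),  # row in the 5th partition
    ("c114", bytes.fromhex("00029747090004040012")),  # row in the 1st partition
    ("c114", bytes.fromhex("00029749090005040012")),  # row in the 3rd partition
]
entries.sort()                       # sorts by (key, rowid), like the index
print([rid[:4].hex() for _, rid in entries])
# -> ['00029747', '00029749', '0002974b']
```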
If you review the code that generated the table and its global index you’ll see that the grp column should have 20 rows for each distinct value, so I’m going to pick out the 20 consecutive rows for one key value from this index leaf block, stripping out all but the rowid entries in the order they appear in the index:
col 0; len 2; (2): c1 14

col 1; len 10; (10): 00 02 97 47  09 00 04 04 00 12
col 1; len 10; (10): 00 02 97 47  09 00 04 05 00 05
col 1; len 10; (10): 00 02 97 47  09 00 04 05 00 37
col 1; len 10; (10): 00 02 97 47  09 00 04 06 00 2a

col 1; len 10; (10): 00 02 97 48  09 00 04 84 00 12
col 1; len 10; (10): 00 02 97 48  09 00 04 85 00 06
col 1; len 10; (10): 00 02 97 48  09 00 04 85 00 38
col 1; len 10; (10): 00 02 97 48  09 00 04 86 00 2c

col 1; len 10; (10): 00 02 97 49  09 00 05 04 00 12
col 1; len 10; (10): 00 02 97 49  09 00 05 05 00 06
col 1; len 10; (10): 00 02 97 49  09 00 05 05 00 38
col 1; len 10; (10): 00 02 97 49  09 00 05 06 00 2c

col 1; len 10; (10): 00 02 97 4a  09 00 05 84 00 12
col 1; len 10; (10): 00 02 97 4a  09 00 05 85 00 06
col 1; len 10; (10): 00 02 97 4a  09 00 05 85 00 38
col 1; len 10; (10): 00 02 97 4a  09 00 05 86 00 2c

col 1; len 10; (10): 00 02 97 4b  09 00 06 04 00 12
col 1; len 10; (10): 00 02 97 4b  09 00 06 05 00 06
col 1; len 10; (10): 00 02 97 4b  09 00 06 05 00 38
col 1; len 10; (10): 00 02 97 4b  09 00 06 06 00 2c
End dump data blocks tsn: 6 file#: 36 minblk 2309 maxblk 2309
I’ve split the rowids into 5 groups (I had 5 table partitions) of 4 rowids each, and I’ve split each rowid by putting a little extra space after the first 4 bytes. Take a look at those groups of 4 bytes and compare the values there with the hex_id values I’ve listed for the TABLE partitions above.
As I claimed at the start of this note, the first 4 bytes of the rowid are the (data_)object_id of the table partition.
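If you’d rather verify that claim mechanically than by eye, here’s a short Python sketch (not part of the original demo, just a checking tool) that groups the 20 dumped rowids by their leading 4 bytes:

```python
from collections import Counter

# The 20 rowids for key value "c1 14", copied from the leaf block dump
# (spaces removed), in the order they appear in the index:
rowids = [
    "00029747090004040012", "00029747090004050005",
    "00029747090004050037", "0002974709000406002a",
    "00029748090004840012", "00029748090004850006",
    "00029748090004850038", "0002974809000486002c",
    "00029749090005040012", "00029749090005050006",
    "00029749090005050038", "0002974909000506002c",
    "0002974a090005840012", "0002974a090005850006",
    "0002974a090005850038", "0002974a09000586002c",
    "0002974b090006040012", "0002974b090006050006",
    "0002974b090006050038", "0002974b09000606002c",
]

# Group by the first 4 bytes -- the table partition's data_object_id:
by_partition = Counter(r[:8] for r in rowids)
for hex_id, ct in sorted(by_partition.items()):
    print(f"data_object_id 0x{hex_id.upper().lstrip('0')}: {ct} rowids")
```

Five distinct leading values, four rowids each – exactly the five table partitions reported by the earlier data dictionary query.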
Consequences
For a global index the index entries for a given key value are likely to be grouped fairly well within each leaf block. The arrangement isn’t perfect, though; remember that index entries are piled on a “heap” as they arrive at a leaf block and they only appear to be in order in the dump because the leaf block “row directory” is maintained in real time to report them in order. Nevertheless, the rowids for a given key in a given table partition will all be in the same leaf block (or small set of adjacent blocks). It’s worth thinking about this when you start experimenting with the impact of partition maintenance.
For example – if you create a new table for exchange its data_object_id will be higher than any current data_object_id so when you exchange an existing partition with the new table you will (eventually) be deleting all the rowids at one part of each key’s range and inserting a batch of rowids at the “high end” of each key’s range. The overall workload due to that pattern of activity might vary dramatically with different distributions of data – and that might make you review the data you’re using in your tests.