Oracle Scratchpad

February 9, 2023

Global Rowids

Filed under: Indexing,Infrastructure,Oracle — Jonathan Lewis @ 4:39 pm GMT Feb 9,2023

This note is just a little background for an impending article on the costs and effects of exchanging partitions, splitting partitions, merging partitions – or any other DDL on partitions except dropping them – in the light ofr “deferred global index maintenance”.

Summary Information: while a “typical” rowid is stored in a B-tree index using 6 bytes (relative file number, block within file, row within block), a rowid for a partitioned table is stored in a global, or globally partitioned, index as 10 bytes (table partition data object id, relative file number, block within file, row within block).

You may choose to skip the rest of the article which simply demonstrates this fact, but if you do read it then it may give you a better intuitive understanding of the consequences of partition maintenance commands.

I’m going to start with a simple script to create a range partitioned table with a local and a global index.

rem
rem     Script:         global_indexes.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2023
rem
rem     Last tested 
rem             19.11.0.0
rem 

create table pt_range (
        id      not null,
        grp,
        small_vc,
        padding
)
nologging
partition by range(id) (
        partition p0200 values less than (200),
        partition p0400 values less than (400),
        partition p0600 values less than (600),
        partition p0800 values less than (800),
        partition p0100 values less than (1000)
)
as
select
        rownum - 1                      id,
        mod(rownum,50)                  grp,
        to_char(trunc(rownum/20))       small_vc,
        rpad('x',100)                   padding
from
        dual
connect by
        level <= 1e3
;

alter table pt_range 
        add constraint pt_pk primary key(id)
        using index (create unique index pt_pk on pt_range(id) local)
/


Then I’m going to run a few queries against the data dictionary, and I’ll show you the results after each query. First a list of all the objects I’ve created in a previously empty schema. (I should point out that I’m running from SQL*Plus and I haven’t shown you all the column declarations in my login.sql:

break on object_name skip 1

select
        object_name, object_type, 
        object_id, to_char(object_id,'XXXXXXXX') hex_id, 
        subobject_name
from
        user_objects 
order by
         1,2,3
;

OBJECT_NAME          OBJECT_TYPE              OBJECT_ID HEX_ID    SUBOBJECT_NAME
-------------------- ----------------------- ---------- --------- ----------------------
PT_I1                INDEX                       169810     29752

PT_PK                INDEX                       169804     2974C
                     INDEX PARTITION             169805     2974D P0200
                     INDEX PARTITION             169806     2974E P0400
                     INDEX PARTITION             169807     2974F P0600
                     INDEX PARTITION             169808     29750 P0800
                     INDEX PARTITION             169809     29751 P0100

PT_RANGE             TABLE                       169798     29746
                     TABLE PARTITION             169799     29747 P0200
                     TABLE PARTITION             169800     29748 P0400
                     TABLE PARTITION             169801     29749 P0600
                     TABLE PARTITION             169802     2974A P0800
                     TABLE PARTITION             169803     2974B P0100

I really ought to be reporting the data_object_id since that’s the column that Oracle generally uses for “physical” navigation: when you move an index, for example, its object_id stays the same but it gets a new data_object_id. Since the objects in my case are all brand new the data_object_id will match the object_id (though the INDEX and TABLE won’t have physical segments so won’t have a data_object_id at all). I’ve shown you the hexadecimal format of the (data_)object_id because that’s how the values will appear in the dump files that I’ll be producing later.

column file_no new_value m_file_no
colum  block_no new_value m_block_no

select
        dbms_rowid.rowid_object(lbid)           object_no,
        dbms_rowid.rowid_relative_fno(lbid)     file_no,
        dbms_rowid.rowid_block_number(lbid)     block_no,
        ct
from    (
        select
                lbid, count(*) ct
        from
                (
                select
                        /*+ index_ffs(pt_range(grp)) */
                        sys_op_lbid(&m_global_id, 'L', pt_range.rowid) lbid
                from
                        pt_range
                where
                        grp > = 0
                )
        group by
                lbid
        )
/


 OBJECT_NO    FILE_NO   BLOCK_NO         CT
---------- ---------- ---------- ----------
    169810         36       2310        398
    169810         36       2311        202
    169810         36       2309        400


I’ve run a messy little query using the sys_op_lbid() function to list all the index leaf blocks in the global index with their (tablespace-relative) file and block number (and number of index entries in the leaf block). I’m just going to pick the last reported leaf block and dump it:

alter system flush buffer_cache;
alter system dump datafile &m_file_no block &m_block_no;

If I examine the resulting trace file the bulk of the block will be made up of an ordered list of the index entries in the block.

row#0[8017] flag: -------, lock: 0, len=15
col 0; len 1; (1):  80
col 1; len 10; (10):  00 02 97 47 09 00 04 04 00 31

row#1[8002] flag: -------, lock: 0, len=15
col 0; len 1; (1):  80
col 1; len 10; (10):  00 02 97 47 09 00 04 05 00 24

row#2[7987] flag: -------, lock: 0, len=15
col 0; len 1; (1):  80
col 1; len 10; (10):  00 02 97 47 09 00 04 06 00 17

row#3[7972] flag: -------, lock: 0, len=15
col 0; len 1; (1):  80
col 1; len 10; (10):  00 02 97 47 09 00 04 07 00 0b


You’ll notice two things about this tiny extract:

  • First it appears to be a two-column index; that’s because Oracle handles a non-unique index by treating the table’s rowid as if it were an extra column in the index definition, so the “last column” of a non-unique index entry is the rowid of the row that that index value is pointing to.
  • Secondly that the col 1 values (the rowids) are 10 bytes rather than the “normal” 6.

If you review the code that generated the table and its global index you’ll see that the grp column should have 20 rows for each distinct value, so I’m going to pick out the 20 consecutive rows for one key value from this index leaf block, stripping out all but the rowid entries in the order they appear in the index:

col 0; len 2; (2):  c1 14

col 1; len 10; (10):   00 02 97 47    09 00 04 04 00 12
col 1; len 10; (10):   00 02 97 47    09 00 04 05 00 05
col 1; len 10; (10):   00 02 97 47    09 00 04 05 00 37
col 1; len 10; (10):   00 02 97 47    09 00 04 06 00 2a

col 1; len 10; (10):   00 02 97 48    09 00 04 84 00 12
col 1; len 10; (10):   00 02 97 48    09 00 04 85 00 06
col 1; len 10; (10):   00 02 97 48    09 00 04 85 00 38
col 1; len 10; (10):   00 02 97 48    09 00 04 86 00 2c

col 1; len 10; (10):   00 02 97 49    09 00 05 04 00 12
col 1; len 10; (10):   00 02 97 49    09 00 05 05 00 06
col 1; len 10; (10):   00 02 97 49    09 00 05 05 00 38
col 1; len 10; (10):   00 02 97 49    09 00 05 06 00 2c

col 1; len 10; (10):   00 02 97 4a    09 00 05 84 00 12
col 1; len 10; (10):   00 02 97 4a    09 00 05 85 00 06
col 1; len 10; (10):   00 02 97 4a    09 00 05 85 00 38
col 1; len 10; (10):   00 02 97 4a    09 00 05 86 00 2c

col 1; len 10; (10):   00 02 97 4b    09 00 06 04 00 12
col 1; len 10; (10):   00 02 97 4b    09 00 06 05 00 06
col 1; len 10; (10):   00 02 97 4b    09 00 06 05 00 38
col 1; len 10; (10):   00 02 97 4b    09 00 06 06 00 2c

End dump data blocks tsn: 6 file#: 36 minblk 2309 maxblk 2309

I’ve split the rowids into 5 ranges (I had 5 table partitions) of 4 rowids each. I’ve also split each rowid by putting a little extra space after the first 4 bytes. Take a look at those groups of 4 bytes and compare the values there with the hex_id values I’ve listed for the TABLE partitions above (and the first one is highlighted to make it easier.

As I claimed at the start of this note, the first 4 bytes of the rowid are the (data_)object_id of the table partition.

Consequences

For a global index the index entries for a given key value are likely to be grouped fairly well within each leaf block. The arrangement isn’t perfect, though; remember that index entries are piled on a “heap” as they arrive at a leaf block and they only appear to be in order in the dump because the leaf block “row directory” is maintained in real time to report them in order. Nevertheless, the rowids for a given key in a given table partition will all be in the same leaf block (or small set of adjacent blocks). It’s worth thinking about this when you start experimenting with the impact of partition maintenance.

For example – if you create a new table for exchange its data_object_id will be higher than any current data_object_id so when you exchange an existing partition with the new table you will (eventually) be deleting all the rowids at one part of each key’s range and inserting a batch of rowids at the “high end” of each key’s range. The overall workload due to that pattern of activity might vary dramatically with different distributions of data – and that might make you review the data you’re using in your tests.

3 Comments »

  1. […] Global Rowids […]

    Pingback by Partitioning Catalogue | Oracle Scratchpad — February 10, 2023 @ 12:16 pm GMT Feb 10,2023 | Reply

  2. […] Global Rowids […]

    Pingback by Indexing Catalogue | Oracle Scratchpad — February 10, 2023 @ 12:17 pm GMT Feb 10,2023 | Reply

  3. […] Global Rowids (Feb 2023): a note describing how rowids are stored in global indexes […]

    Pingback by Infrastructure Catalogue | Oracle Scratchpad — February 10, 2023 @ 12:25 pm GMT Feb 10,2023 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: