In a recent discussion in the comments of a blog item by Charles Hooper, I made a comment about how hard it is to be accurate (and unambiguous) when talking about the space needed for a rowid. So I thought I’d try to list all the possible options for rowids that you might find in the field. Frankly I’m not sure I’ll get it right first time. So, how much space do you need to store a rowid ?
If I am a row in a simple heap table (we’ll ignore index organized tables (IOTs) because their rows are located by primary key, not rowid), then I am sitting in a block, and the block records (just once) the object_id, relative file_id, and block_id – and this is most of the information that makes up my address – the only other part of the address I need is to tell me which entry in the block’s “row directory” I occupy and since the row directory can hold 4,096 entries that takes two bytes. So – from my perspective – the cost of my rowid is just two bytes.
But what about other people who need my address – how much space do they need to hold my rowid. Technically they will need the object_id (which will also allow them to find the tablespace number), relative file number, block number within file, and row number within row directory. That’s four bytes for the object_id, two bytes for the row number, then the block number and relative file number are squeezed into a further four bytes. There are two formats for this last piece of information: for “bigfile” tablespaces which guarantee just one file per tablespace the whole four bytes is assigned to the block number; for traditional tablespaces 10 bits (out of 32) are set aside for the relative file number to allow 1022 files per tablespace and the rest is for the block number within file. So the generic case for a rowid is ten bytes.
But different pieces of code may have to behave differently, some may be more space efficient, some less.
If my rowid is being stored in another table as a column of type rowid (which is almost always a bad idea), or is being stored temporarily in a program variable that has correctly been defined as type rowid then the table/variable has to store all ten bytes.
End-user representations of rowids have to be in human readable form so, for display purposes, the ten-byte rowid is re-coded as a string of 18 characters.
If my rowid has to be stored in an index on my table then things are a little variable. If the table is not partitioned then the internal code can infer my object_id from the fact that the index can only be pointing at my table’s single data segment, so the index doesn’t have to store the four bytes of the object_id. If the index is unique then the rowid is carried as “fixed size data” associated with the primary key and takes up six bytes. If the index is non-unique the rowid is appended as if it were an extra column at the end of the index, but this means it has to have a preceding “length byte” saying that it’s a column that happens to be six bytes long – so the storage requirement for the rowid is seven bytes.
If my table is a partitioned table then the same “infer the object_id” logic applies if the index is a local index – if the index is the Nth partition of the index then it must be pointing to the Nth partition of the partitioned table, that’s effectively the definition of local indexes. So again the storage is six (or seven) bytes. But if the index is global or globally partitioned then any entry in the index could be pointing to any partition in the partitioned table – so the index entry has to include the four byte object_id of the (sub)object the row is in. The same unique/non-unique argument applies so the storage requirement is either ten or eleven bytes.
I think that more or less wraps it up – but if you want to make life complicated you could start thinking about what goes on inside the index of an index cluster, and whether this introduces a special case of rowids; and this might get even more complicated when Oracle introduces the range-partitioned hash cluster.