Here’s a script I’ve used from time to time to get some ideas about how to hack through some x$ tables. Obviously it can only be run by the SYS schema, and it’s not something to apply arbitrarily to every x$ table as the side effects could be nasty; but, on a sandbox instance with enough CPUs and a large enough SGA, it might give you some interesting clues about how Oracle is defining or accessing some of its internal structures.
The first query simply walks through the specified x$ table in index (indx) order comparing the address (addr) of the preceding row with that of the current row and reporting the difference. Some of the x$ tables are likely to give you a random collection of large numbers, which means there is no well-structured row involved. Others will give very regular patterns – though not necessarily a constant value as you move from row to row (take a look at x$ksuru, for example).
The second query takes the lowest and highest addresses of rows in the table, finds the difference, and divides by one less than the number of rows in the table (allowing for the fact that the highest address is the start of the final row, not the end of a row). If this average row size matches the value reported by every other row in the first query’s output that’s a fairly strong clue that the x$ table may be a simple array of fixed-size rows.
rem
rem Script: x_rowsize.sql
rem Author: Jonathan Lewis
rem Dated: Dec 2002 (probably)
rem
rem Usage:
rem Connect as SYS (due to x$)
rem start x_rowsize {x tablename} {number of rows to list}
rem
set linesize 120
set pagesize 60
set trimspool on
set tab off
prompt ================================================
prompt Usage: @x_rowsize {table_name} {number_of_rows}
prompt ================================================
pause Press return to continue
define m_xtab = &1
define m_rowct = &2
column addr_dec format 999,999,999,999,999
column lag_addr_dec format 999,999,999,999,999
column row_size format 999,999,999,999,999
spool x_rowsize
select
addr,
to_number(addr,'XXXXXXXXXXXXXXXX') addr_dec,
to_number(lag(addr,1) over (order by indx),'xxxxxxxxxxxxxxxx') lag_addr_dec,
to_char(
to_number(addr,'XXXXXXXXXXXXXXXX') -
to_number(lag(addr,1) over (order by indx),'xxxxxxxxxxxxxxxx'),
'xxxxxxxxxxxxxxxx'
) row_size_hex,
to_number(addr,'XXXXXXXXXXXXXXXX') -
to_number(lag(addr,1) over (order by indx),'xxxxxxxxxxxxxxxx') row_size,
indx
from &m_xtab
where indx <= &m_rowct
order by
indx
/
prompt ========================================================
prompt Average row size will (almost certainly) not match the
prompt row gap listed in the above if the x$ table is segmented
prompt ========================================================
select
min(addr), max(addr), count(*),
(to_number(max(addr),'xxxxxxxxxxxxxxxx') -
to_number(min(addr),'xxxxxxxxxxxxxxxx')
) / (count(*) - 1) avg_row_size
from &m_xtab
/
spool off
It’s a good idea to start with a small sample size in case a complete scan of a structure crashes the session or even the instance – so here’s the result of calling the script with the parameters: x$ksuse 10
===============================================
Usage: @x_rowsize {table_name} {number_of_rows}
===============================================
Press return to continue
ADDR ADDR_DEC LAG_ADDR_DEC ROW_SIZE_HEX ROW_SIZE INDX
---------------- -------------------- -------------------- ----------------- -------------------- ----------
000000008E713678 2,389,784,184 1
000000008E715E60 2,389,794,400 2,389,784,184 27e8 10,216 2
000000008E718648 2,389,804,616 2,389,794,400 27e8 10,216 3
000000008E71AE30 2,389,814,832 2,389,804,616 27e8 10,216 4
000000008E71D618 2,389,825,048 2,389,814,832 27e8 10,216 5
000000008E71FE00 2,389,835,264 2,389,825,048 27e8 10,216 6
000000008E7225E8 2,389,845,480 2,389,835,264 27e8 10,216 7
000000008E724DD0 2,389,855,696 2,389,845,480 27e8 10,216 8
000000008E7275B8 2,389,865,912 2,389,855,696 27e8 10,216 9
000000008E729DA0 2,389,876,128 2,389,865,912 27e8 10,216 10
10 rows selected.
========================================================
Average row size will (almost certainly) not match the
row gap listed in the above if the x$ table is segmented
========================================================
MIN(ADDR) MAX(ADDR) COUNT(*) AVG_ROW_SIZE
---------------- ---------------- ---------- ------------
000000008E713678 000000008EDEDD70 704 10222.1451
1 row selected.
At first sight it looks as if the row size for an x$ksuse (v$session) row is 10,216 bytes – but then the average row size turns out to be different, so there’s some inconsistency somewhere in x$ksuse that we can’t see in the first few rows.
After running the report for the whole x$ksuse structure I realised that a quick and easy edit would report just the interesting bits. I folded the original query into an inline view and selected only the rows where “row_size != 10216”. Here’s the result:
ADDR ADDR_DEC LAG_ADDR_DEC ROW_SIZE_HEX ROW_SIZE INDX
---------------- -------------------- -------------------- ----------------- -------------------- ----------
000000008E8CAB98 2,391,583,640 2,391,571,984 2d88 11,656 177
000000008EA820B8 2,393,383,096 2,393,371,440 2d88 11,656 353
000000008EC395D8 2,395,182,552 2,395,170,896 2d88 11,656 529
That looks like an interesting pattern – especially if you’re good at mental arithmetic (or have a calculator handy) and I tell you that there are 704 rows in x$ksuse.
The way I’ve written my code, the row size reported for indx = n is the length of row n-1; so, out of the 704 rows in x$ksuse, the “interesting” rows are 176, 352, and 528, and when you check the arithmetic you realise:
- 176 = 704 / 4,
- 352 = 704 / 2,
- 528 = 704 * 3/4
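For anyone who’d rather not do the mental arithmetic, here’s a small Python sketch of the hypothesis as stated – nothing Oracle-specific, just the conjectured rounding of sessions up to a multiple of cpu_count and the equal split of the array (the 702/4 figures are the ones from this test):

```python
# Sketch of the hypothesis: Oracle rounds sessions up to a multiple of
# cpu_count and splits the x$ksuse array into cpu_count equal pieces.
# segment_boundaries() is my own illustrative function, not an Oracle API.

def segment_boundaries(sessions, cpu_count):
    # round sessions up to the next multiple of cpu_count
    rows = -(-sessions // cpu_count) * cpu_count
    piece = rows // cpu_count
    # first row of each piece after the first; remember the script reports
    # the enlarged gap on the *following* indx (row n shows the size of row n-1)
    return rows, [piece * i for i in range(1, cpu_count)]

rows, boundaries = segment_boundaries(702, 4)
print(rows)         # 704
print(boundaries)   # [176, 352, 528] -> big gaps reported at indx 177, 353, 529
```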
Maybe it’s not a coincidence that this instance is running on 4 CPUs; moreover, with that thought in mind, if I look at v$latch_children to see if there are any latches with 4 children I find (amongst others) the following latch names:
client/application info
session allocation
session idle bit
session switching
There may be other latches that are also relevant but these 4 latches have names that suggest you might want to acquire one of them if you wanted to modify the contents of a session-related row – in other words, the contents of a row in x$ksuse. So maybe Oracle has checked the cpu_count at startup and created the v$session/x$ksuse array in equal-sized pieces, one for each CPU, so that it can minimise contention when trying to modify data in x$ksuse. And following that thought I’ll just add that I had set sessions = 702 in my parameter file, and Oracle had rounded that up to a multiple of 4.
The hypothesis is easy to check – just bounce the database a few times, using a pfile rather than an spfile, changing the cpu_count and sessions parameters each time, and see if Oracle always rounds sessions up (if necessary) to a multiple of cpu_count and then splits the array into cpu_count equal pieces.
And that certainly seemed to be the case on my little machine. (Obviously I’m not in a position to see what would happen on a machine with 256 CPUs and sessions set to 16,000 – so there might be some sanity checks for extreme cases.)
Jumping to conclusions
Going back to the output from my initial setup, it looks as if each row in x$ksuse is going to demand 10,216 bytes, and that there may be 1,440 (viz: 11,656 – 10,216) bytes needed as some sort of “header” for each segment of the array, but what sort of header could need 1,440 bytes? I can’t help noticing that if you wanted to store 176 pointers of 8 bytes each that’s a total of 1,408 bytes – so maybe the “header” is mostly a list of pointers to the rows – just like the row directory in a data block!
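A quick sanity check on that arithmetic – the 10,216 and 11,656 byte figures are the observed gaps from the script’s output, and the “row directory” interpretation is only a guess:

```python
# Back-of-envelope check of the "row directory" guess. The two gap values
# come straight from the script's output; everything else is arithmetic
# on those observed numbers.

row_size   = 10_216          # normal gap between consecutive rows
big_gap    = 11_656          # gap reported at each segment boundary
piece_rows = 176             # rows per segment (704 rows / 4 CPUs)

header = big_gap - row_size
print(header)                    # 1440 bytes unaccounted for per segment
print(piece_rows * 8)            # 1408 bytes if it's 176 eight-byte pointers
print(header - piece_rows * 8)   # 32 bytes left over for any fixed header
```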
It’s a thought that might be heading in the right direction, but perhaps it’s a little premature. Fortunately, on one of my restarts, one of the gaps showed up as roughly minus 30,000,000; so I need to find an explanation that covers this case as well – but that’s okay because I soon came up with a hypothesis (that I haven’t yet confirmed with a reproducible model).
I know that my memory granule size is 16MB (select * from v$sgainfo). What if Oracle had allocated the first few sections of x$ksuse and then found that it didn’t have enough room left in the current granule for the next piece of the array? It would have to start using another granule. What if the granules were attached to the shared pool “from the top downwards” – the next granule would have a starting address that was 16MB lower than the current granule – so the step from the end of the current granule to the beginning of the next granule would be around 32MB backwards. Allowing for the fact that I was experimenting with numbers that produced sizes of about 2MB (200 rows @ 10,216 bytes) for each piece of the array, the hypothesis is reasonably consistent with the step size of minus 30 million.
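To see that this is in the right ballpark, here’s a toy calculation – the base address and the “14MB already used” figure are invented for illustration; only the 16MB granule size comes from v$sgainfo:

```python
# Rough model of the backwards jump, assuming 16MB granules handed to the
# shared pool top-down (each new granule starting 16MB below the previous
# one). The base address and usage figure are made up for illustration.

granule  = 16 * 1024 * 1024      # 16MB granule size (from v$sgainfo)
used     = 14 * 1024 * 1024      # say ~14MB of the current granule in use
cur_base = 0x9000_0000           # hypothetical start of the current granule

end_of_rows  = cur_base + used       # last row sits near the top of the granule
next_granule = cur_base - granule    # next granule starts 16MB *lower*
jump = next_granule - end_of_rows
print(jump)                          # roughly minus 30 million bytes
```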
Further experimentation is left as an exercise for the interested reader, but if I wanted to test the granule hypothesis I could probably experiment with carefully selected values for sessions, maybe starting the instance with a reduced granule size (parameter _ksmg_granule_size) to see if I could get two or more backwards jumps in the array. But I think I’ve pursued x$ksuse far enough for the moment. Except I have one more (generic) problem that it highlights.
It’s not that simple
If I query x$kqfta (the list of x$ tables) I see the following for x$ksuse:
SQL> execute print_table('select * from x$kqfta where kqftanam = ''X$KSUSE''')
ADDR : 00000000143D1DD0
INDX : 43
INST_ID : 1
CON_ID : 0
KQFTAOBJ : 4294951004
KQFTAVER : 42
KQFTANAM : X$KSUSE
KQFTATYP : 2
KQFTAFLG : 1
KQFTARSZ : 8
KQFTACOC : 113
-----------------
1 row(s) selected
The column kqftarsz is the “Row SiZe” and 8 is a long way off the 10,216 that I’ve derived from the addresses of consecutive rows in the table. So how does an 8 byte column turn into a 10,216 byte row? (I wonder if it’s describing the rows in the table of pointers to the rows, not the length of the rows themselves.) But (apparently) it’s neither 8 nor 10,216 anyway, because if I look at x$kqfco (the list of x$ columns) for x$ksuse I see the following:
select
co.indx,
co.kqfconam column_name,
co.kqfcosiz column_size,
co.kqfcooff column_offset
from
x$kqfta ta,
x$kqfco co
where
co.kqfcotab = ta.indx
and ta.kqftanam = 'X$KSUSE'
order by
co.kqfcooff,
co.indx
/
INDX COLUMN_NAME COLUMN_SIZE COLUMN_OFFSET
---------- -------------------------------- ----------- -------------
503 ADDR 8 0
504 INDX 4 0
505 INST_ID 4 0
506 CON_ID 2 0
507 KSSPAFLG 4 0
508 KSSPAOWN 8 0
...
603 KSUSELTXIDFLG 4 9636
604 KSUSELTXID 8 9648
615 KSUSEFLG3 4 10000
According to this the last column of the row starts at byte 10,000 and ends at byte 10,003, giving us a third possible length for the row and a gap of 212 bytes to explain; and even though there are several x$ tables (x$ksuse, x$ksusio, x$ksusecst, x$ksuru, x$ksusm) starting at the same address and based on the same internal structure, none of them has a column definition explaining the gap.
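Putting the three candidate sizes side by side – this is pure arithmetic on the figures reported by the queries above:

```python
# The three contradictory "row sizes" for x$ksuse, side by side. The
# offsets and sizes are the ones reported by the x$kqfta/x$kqfco queries;
# this just lays out the discrepancies.

kqftarsz     = 8                 # row size according to x$kqfta
last_offset  = 10_000            # offset of KSUSEFLG3 according to x$kqfco
last_size    = 4
from_columns = last_offset + last_size   # length implied by the column list
from_addrs   = 10_216                    # gap between consecutive addr values

print(from_columns)              # 10004
print(from_addrs - from_columns) # 212 bytes with no column definition
```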
It looks as if any further investigation (of x$ksuse, at least) will have to depend on dumping memory to trace files and chasing pointers.
Summary
When you try to find out how big a “row” in an x$ structure is you may find several pieces of contradictory information:
- The rowsize (kqftarsz) reported in x$kqfta.
- The offset + length of the final column in the description in x$kqfco.
- The “gap” between consecutive rows when you query addr in the object itself
Using oradebug to peek a couple of rows may help you uncover the truth. But you may end up chasing pointers to pointers to pointers to get to the ultimate source.