Oracle Scratchpad

Direct SGA Access – pt 3

Reviewing v$ and x$

I ended part 2 of this series with a little C program that that attached to two shared memory segments from an Oracle instance – the “Fixed Size” memory segment (equivalent to x$ksmfsv) and the “Variable Size” memory segment which is basically the SGA.

The program design assumed that these two memory segments would be mapped to the logical addresses they are “always” mapped to, but needed the shared memory segment ids (SHMIDs) supplied on the command line. After a short pause, to allow time for a quick check (using ipcs -m) that the number of processes attached to the two segments had increased, the program detached and terminated.

The next step, then, is to decide what we’re going to look for now that we can access the memory. There are already notes on my blog about what’s in these memory segments and how you can access them, so you might want to browse a couple of those notes before continuing.

With a little background from the above, we can start turning our v$ query “which SQL is being run by which session” into a query of x$ structures if we’re lucky, or a journary along a series of pointers that we find buried in the x$ structures.

We want to do the equivalent of the following:

select
        se.sid, se.username, sq.sql_fulltext
from
        v$session       se,
        v$sql           sq
where
        sq.sql_id = se.sql_id
and     sq.child_number = se.sql_child_number
/

v$session

We can’t see the definition of v$session in its entirety in v$fixed_view_definition but by following other routes we find that v$session is a join between x$ksuse, x$kslwt and x$ksled; by running the script to query x$kqfco and running a few examples we can map the critical columns of v$session to their corresponding columns in x$ksuse.

From three different sessions I’ve done the following:

set serveroutput off
alter session set sort_area_size = nnnnnnn;
select count(*) from user_objects;

Because I want to get several child cursors for a test query I’ve used a different value for the sort_area_size for each session. Once I’ve done this I’ve queried v$session. from a session logged into the CDB root as SYS:

SQL> select saddr, sid, sql_id, sql_address, sql_child_number, prev_sql_id, prev_sql_addr, prev_child_number
  2  from v$session
  3  where username = 'TEST_USER'
  4  /

SADDR                   SID SQL_ID        SQL_ADDRESS      SQL_CHILD_NUMBER PREV_SQL_ID   PREV_SQL_ADDR    PREV_CHILD_NUMBER
---------------- ---------- ------------- ---------------- ---------------- ------------- ---------------- -----------------
000000008E8E12C0        186 8x615vyks733p 0000000067DE5390                0 8x615vyks733p 0000000067DE5390                 0
000000008E904170        200 8x615vyks733p 0000000067DE5390                2 8x615vyks733p 0000000067DE5390                 2
000000008EC72BB0        552 8x615vyks733p 0000000067DE5390                1 8x615vyks733p 0000000067DE5390                 1

Having run this experiment several times to narrow down the columns I was interested in the sql_id and prev_sql_id happen to be the same for all three sessions. Once I’d prepared v$session I could query x$ksuse to cross-check the matching columns, with the following result:

SQL> select addr, indx, ksusesqi, ksusesql, ksusesch, ksusepsi, ksusepsq, ksusepch
  2  from x$ksuse
  3  where ksusesqi = '8x615vyks733p'
  4  /

ADDR                   INDX KSUSESQI      KSUSESQL           KSUSESCH KSUSEPSI      KSUSEPSQ           KSUSEPCH
---------------- ---------- ------------- ---------------- ---------- ------------- ---------------- ----------
000000008E8E12C0        186 8x615vyks733p 0000000067DE5390          0 8x615vyks733p 0000000067DE5390          0
000000008E904170        200 8x615vyks733p 0000000067DE5390          2 8x615vyks733p 0000000067DE5390          2
000000008EC72BB0        552 8x615vyks733p 0000000067DE5390          1 8x615vyks733p 0000000067DE5390          1


There are two important points to notice in these results. First that the addr column of x$ksuse is in the range that puts it into the SGA (not the PGA); secondly that despite having three child cursors the sql_address / ksusesqi is the same in all three cases – it’s the address of the common parent cursor, not the address of the session-specific child cursor.

At this point it’s worth looking at the contents of x$kqfco for x$ksuse, and here’s an extract from the output:

                                                               X$KSUSE

Column                           Ext Type   Int Type  Array Max   Len Size Len Offset   Col Size     Offset      Index    Idx Col
------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
KSSPAFLG                                2         20          0          0          0          4          0          0          0
KSUSESQI                                1         20          0          0          0         13          0          0          0
KSUSEPSI                                1         20          0          0          0         13          0          0          0
KSUSEFLG                                2         11          0          0          0          4       2184          0          0
KSUSESQL                               23          0          0          0          0          8       2424          0          0
KSUSESCH                                2         11          0          0          0          2       2440          0          0
KSUSEPSQ                               23          0          0          0          0          8       2496          0          0
KSUSEPCH                                2         11          0          0          0          2       2512          0          0

I’ve included ksspaflg and ksuseflg because those columns are used to filter rows from x$ksuse in the view definition of v$session.

where bitand(s.ksspaflg,1) != 0 
and   bitand(s.ksuseflg,1) != 0 

You’ll notice that the ksusesqi and ksusepsi column both have an offset of zero – this basically means one of two things: either the column is the one column that really does start at the beginning of the row (or struct if we’re thinking C), or the value can be derived in some way from the row itself. For example the indx column has an offset of zero – but the indx is simply the position of the row in the array and the Nth row doesn’t need to store the value N to tell itself that it’s the Nth row.

In the case of the two SQL ID columns these might be derived by using the ksusesql and ksusepsq pointers to visit the parent cursors and find the SQL_IDs there, and maybe Oracle uses the stored SQL hash values to do some latching before peeking at the parent.

It would be convenient if the ksspaflg is the thing that really is stored as a 4 byte integer at the zeroth offset, but I haven’t investigated that thoroughly to see if that’s the case. [Update: as part of writing the next article in the series I investigated this point and can report that ksspaflg really is a 4 bytes integer starting at offset zero]

If we take the address (addr) of session 200 and add the offset plus length of the ksusesql column we should find the address of the parent cursor; if we go a little further along the row by using the offset plus length of the ksusesch column we should find the value 2 for child number 2.

SQL> oradebug peek 0x000000008E904170 2442 1
[08E904170, 08E904AFC) = 99FC0501 00000004 8E57BCC0 00000000 8EEFCAB0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 8E712EB0 00000000 ...
SQL> ed or19_ora_25236.trc

Processing Oradebug command 'peek 0x000000008E904170 2442 1'
[08E904170, 08E904AFC) = 99FC0501 00000004 8E57BCC0 00000000 8EEFCAB0 ...
Dump of memory from 0x08E904184 to 0x08E904AFC
08E904180          00000000 00000000 00000000      [............]
08E904190 00000000 00000000 00000000 00000000  [................]
...
08E904AD0 00000000 00000000 7FFFFFFC 00000000  [................]
08E904AE0 00000000 00000000 67DE5390 00000000  [.........S.g....]
08E904AF0 8E9825DF A5838C75 00000002           [.%..u.......]

In the last two lines of the output we can see:

  • Starting at address 0x08E04AE the 8-byte address 0000000067DE5390 as required
  • Starting at address 0x08E904AF8 the value 0002 as required

Check:

  • 0x08E904170 + 2424 (dec) = 0x08E904AE8
  • 0x08E904170 + 2440 (dec) = 0x08E904AF8

Side note: the 0002 doesn’t appear to start at the right place if you simply count bytes from left to right, but this is due to the convoluted byte-swapping that takes place in the output. See how the “%” that appears as the 2nd byte in the right hand part of the display corresponds to the 25 that appears as the 3rd byte of the left hand part of the display, and the “u” that appear as the 5th byte on the right is the 75 that appears at the 8th byte on the left! (ascii(‘%’) = 37 = 0x25, ascii(‘u’) = 0x75 = 117)

In summary, the strategy we can adopt for digging into v$session is to find the address of each row (v$session.saddr or x$ksuse.addr) and use x$kqfco to tell us how to walk along the row and interpret the data we find there for many of the columns.

v$sql

Having made some headway in understanding what information we can get out of v$session we can move on to v$sql, which is where (we hope, sort of) Oracle has put the SQL we want to see.

From v$fixed_view_definition we can see that v$sql is based on x$kglcursor_child, but looking at x$kqfdt we can see that x$kglcursor_child is derived from x$kglob. Again we can try to match v$ columns with x$ columns, and a first step (I’ve cut out 30 minutes of messing around) gives us the following:

SQL> select address, child_address, child_number from v$sql where sql_id = '8x615vyks733p';

ADDRESS          CHILD_ADDRESS    CHILD_NUMBER
---------------- ---------------- ------------
0000000067DE5390 0000000076D7ABF0            0
0000000067DE5390 0000000070A84BB0            1
0000000067DE5390 0000000072608698            2

SQL> select addr, kglhdpar, kglhdadr, kglobt09 from x$kglcursor_child where kglobt03 ='8x615vyks733p';

ADDR             KGLHDPAR         KGLHDADR           KGLOBT09
---------------- ---------------- ---------------- ----------
00007FD8223C65F8 0000000067DE5390 0000000076D7ABF0          0
00007FD8222A9988 0000000067DE5390 0000000070A84BB0          1
00007FD8222AB3D8 0000000067DE5390 0000000072608698          2

SQL> select addr, kglhdpar, kglhdadr, kglobt09 from x$kglob where kglobt03 ='8x615vyks733p';

ADDR             KGLHDPAR         KGLHDADR           KGLOBT09
---------------- ---------------- ---------------- ----------
00007FD8223C4BA8 0000000067DE5390 0000000076D7ABF0          0
00007FD8223C65F8 0000000067DE5390 0000000070A84BB0          1
00007FD8222A9988 0000000067DE5390 0000000072608698          2
00007FD8222B5250 0000000067DE5390 0000000067DE5390      65535

Notice how x$kglob has given us an “extra” row where the parent address (kglhdpar) matches the child address (kglhdadr). It’s possible that it may also have given us a superset of the columns that would be in x$kglcursor_child – but that’s not a detail to worry about at present. The interesting point is that x$ksuse/v$session points to the parent address, not the individual child cursor address.

Side note: if you’ve read Oracle Core you may recall that the parent structure holds segmented arrays of pointers to child cursors, so when you query a child cursor Oracle has constructed each row by following a pointer in one of thos array segments and combining the child content with various bits of information from the parent structure. A “parent cursor” in the library cache is a denormalised structure, and the internal code does a lot of work to make it look like a set of nearly normalised tables.

The other important point to note for x$kglob / x$kglcursor_child is that the addr column is in the high (i.e. PGA) range. These structures are recreated on demand in private memory – so they won’t exist for a C program that does a direct attach to the SGA, and we won’t be able to take advantage of the information in x$kqfco to “walk columns” as we could with x$ksuse. To get from v$session to its current SQL we’re going to have to follow the pointer to the parent cursor structure in x$ksuse and find a different way of jumping through memory to find the SQL.

As an initial stage of doing this last bit of investigation it’s quite easy to write some code that selects the addresses of parent cursors from the SGA and then uses a peek command to dump a few hundred bytes to a trace file for each one. After doing this once it seemed fairly obvious that (on my system) the SQL text was a zero terminated string that started 448 bytes from the start of the cursor structure. Of course it’s possible that this was a coincidence based on the low rate of activity of the system – for example if a parent cursor had more than 16 children perhaps the structure would be modified to insert a second segment for the array of child pointers ahead of the SQL text (that does seem a little unlikely, though).

Here’s a simple SQL statement to generate most of a check, though – it’s “code to write code” and uses (and reconnects to) the sys schema, and all it does is dump the start of text for each parent cursor – assuming the text starts 448 bytes from the start of structure. The dump is simply to allow an eyeball check that what follows looks like the ASCII coding for SQL — particularly “select” or “SELECT”:

rem
rem     Script: dump_parents.sql
rem     Author: Jonathan Lewis
rem     Dated:  April 2022
rem
rem     Last tested
rem             19.11.0.0
rem
rem     Notes
rem     Peeks the 1st 32 characters of the text of all parent 
rem     cursors currently in the SGA based on the assumption 
rem     that the text start 448 bytes after the address of the
rem     parent structure
rem
rem     Has to be run by SYS on the CDB root.
rem     Reconnects as sys
rem

set pagesize 0
set trimspool on

spool temp.sql
prompt oradebug setmypid

select 
        distinct
                 'oradebug peek ' || 
                (to_number(kglhdpar,'xxxxxxxxxxxxxxxx') + 448) || ' 32' 
from x$kglcursor_child
/
spool off

connect sys/sys as sysdba

spool temp
@temp
spool off

Summary

Some dynamic performance views are based on fixed structures in the SGA and Oracle gives us enough information to understand and navigate the contents of these structures without needing to call internal Oracle funcitons. There are some views, however, that are dynamically recreated on demand in the private memory of Oracle processes and while these structures are described in x$kqfco an external process cannot take advantage of the descriptions as it will not have access to the internal calls that Oracle uses to recreate the structure on demand.

Next Installment: Some C code (with comments and caveats) to attach to the SGA then print the SQL currently reported in v$session.sql_address. Here’s a little cut-and-paste of a run:

[oracle@linux183 trace]$ /mnt/working/direct_attach/show_sessions 29294599 29327369
Fixed shmid:    29294599
Fixed address:    0x60000000 1610612736 
Variable shmid: 29327369
Variable address: 0x61000000 1627389952 

SID:   21, saddr:       0x8e745498, sql_address:       0x8e745e10, cursor address:       0x7b5c72d8, text address:       0x7b5c7498
BEGIN /* KSXM:FLUSH DML_MON */    dbms_stats_internal.gather_scan_rate_by_mmon;  END;

SID:  367, saddr:       0x8eaa4f68, sql_address:       0x8eaa58e0, cursor address:       0x7b9b4088, text address:       0x7b9b4248
insert into smon_scn_time (thread, time_mp, time_dp, scn, scn_wrp, scn_bas, num_mappings, tim_scn_map) values (0, :1, :2, :3, :4, :5, :6, :7)

There’s not a lot happening on this 19.11 sandbox.

FYI – the critical code is 6 lines long – or 18 if you like to format C to be readable. “Readable” is relative, of course: you need to be happy to interpret text like this:

(char *)(*(unsigned long int *) (sessions[i]+KSUSE_KSUSESQL_OFFSET) + KGLOB_SQLTEXT_OFFSET)

If I’ve got this right (and it does seem to do exactly what I wanted it to do) this adds an integer constant to an array of (long) integers, casts the result to a pointer to a long (i.e. 64 bit) integer, dereferences the pointer to produce a long integer, adds another constant and casts the result to be a pointer to a string.

Leave a Comment »

No comments yet.

RSS feed for comments on this post.

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: