Oracle Scratchpad

October 2, 2017

markhot

Filed under: latches,mutex,Oracle,Troubleshooting — Jonathan Lewis @ 8:42 am BST Oct 2,2017

How can a single piece of SQL text – checked very carefully – end up with multiple SQL_IDs? There are probably quite a lot of people who know the answer to this question but won’t think of it until they’re reminded. Thanks to a question that came up on the forum formerly known as OTN a couple of days ago, I was reminded about it recently and rediscovered an article I had drafted on the topic a few years ago.

The specific problem on the forum was about having a huge number of child cursors for a single parent thanks to a frequently executed update statement that updated all 50 columns of a table using bind variables to do so. The reason why a single statement could produce so many child cursors seemed to be the variation in the lengths of the bind variables supplied – which could well be the consequence of an internal library mechanism rather than an explicit design mechanism written into the client code. One of the comments to my 2007 article suggested event 10503 as a damage limitation mechanism, but there was some problem with it not working as expected at the time. A quick check on MoS now reports bug 10274265: “EVENT 10503 NOT WORKING ON THE SESSION LEVEL” as fixed in 12.1 with lots of backport patches to various versions of 11.2.

Moving on from the back-story, the case that prompted my draft note in 2014 was the simpler one of having lots of sessions constantly executing exactly the same SQL statement and child cursor. Rather than having a latch/mutex type of problem because of a large number of child cursors, I was seeing a problem purely because of the level of concurrent access to the same child cursor. The solution in the version of Oracle the client was using at the time was to tell Oracle to mark the SQL statement as “hot” by setting a hidden parameter; but the mechanism is now officially exposed in a procedure called dbms_shared_pool.markhot(), which I learned about a few months ago when I was at a client who had a similar problem with highly concurrent execution of a small set of statements – with the extra twist that the table referenced in the critical statements was a partitioned table which suffered a fairly regular partition exchange.

When a statement (through its “full_hash_value”) is marked as hot, an extra value visible as the property column in v$db_object_cache is set to a value that seems to be dependent on the process id of the session attempting to execute the statement, and this value is used as an extra component in calculating a new full_hash_value (which leads to a new hash_value and sql_id). With a different full_hash_value the same text generates a new parent cursor which is (probably) going to be associated with a new library cache hash bucket and latch. The property value for the original parent cursor is set to “HOT”, and the extra copies become “HOTCOPY1”, “HOTCOPY2” and so on. Interestingly, once an object is marked as HOT and the first HOTCOPYn has appeared, the original version may disappear from v$sql while still existing in v$db_object_cache.
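A query along the following lines should show the original cursor and any copies once a statement has been marked hot. It is only a sketch: it assumes that SQL cursors are reported in v$db_object_cache with the namespace 'SQL AREA', and that the property column holds the “HOT” and “HOTCOPYn” strings described above.

```sql
-- Sketch: list hot cursors and their copies.
-- Assumption: cursors appear with namespace = 'SQL AREA' in v$db_object_cache.
select
        hash_value,
        full_hash_value,
        property        hot_flag,
        executions
from
        v$db_object_cache
where
        namespace = 'SQL AREA'
and     property like 'HOT%'    -- matches HOT, HOTCOPY1, HOTCOPY2, ...
order by
        name,
        property
;
```

Each HOTCOPYn row should report a different full_hash_value (and therefore a different hash_value) for the same statement text.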

The number of “HOTCOPYn” versions of the statement is limited by the hidden parameter “_kgl_hot_object_copies” which (according to my notes) defaults to either cpu_count or cpu_count/2. On my most recent test on 11.2.0.4 it seemed to be the latter.
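To check the value on a particular system you could query the hidden parameter directly. The following is a sketch using the usual x$ksppi / x$ksppcv join for hidden parameters; it has to be run as SYS (or through equivalent views created for the purpose), and the column aliases are simply my choice.

```sql
-- Sketch: report the current value of _kgl_hot_object_copies.
-- Must be run as SYS because it reads x$ structures directly.
select
        nam.ksppinm     parameter_name,
        val.ksppstvl    parameter_value,
        val.ksppstdf    is_default
from
        x$ksppi         nam,
        x$ksppcv        val
where
        val.indx = nam.indx
and     nam.ksppinm = '_kgl_hot_object_copies'
;
```

Comparing the result with the cpu_count parameter should show whether the default on your version is cpu_count or cpu_count/2.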

Marking a cursor hot

There are three options:

  • set a hidden parameter in the startup file
  • execute an “alter system” command to set the hidden parameter
  • from 11.2.0.3 onwards (possibly earlier in 11.2) call dbms_shared_pool.markhot()

Examples:

Startup file:

_kgl_debug="hash='cc7d5ecdcc9e7c0767456468efe922ea' namespace=0 debug=33554432"

Alter system call with multiple targets:

alter system set "_kgl_debug" =
        "hash='cc7d5ecdcc9e7c0767456468efe922ea' namespace=0 debug=33554432",
        "hash='59a1f6575a5006600792ee802558305b' namespace=0 debug=33554432"
;

markhot() procedure:

begin
        dbms_shared_pool.markhot(
                hash            =>'71fc6ccf9a3265368492ec9fc05b785b',
                namespace       =>0,
                global          =>true
        );
end;
/

The namespace identifies the object as an SQL Cursor (you can mark other types of object as hot if you need to), and those of a mathematical bent will work out that the debug value is power(2,25).
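If you want to confirm the arithmetic, a one-line check against dual is enough (the column alias is just for illustration):

```sql
-- 2 to the power 25 is the debug value used in the _kgl_debug examples
select power(2,25) debug_value from dual;   -- returns 33554432
```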

The value supplied as the hash is the full_hash_value and you can find this in v$db_object_cache either by searching on some string that easily identifies your statement, or by searching v$sql on a string to get the (short) hash value of the statement and using that to search v$db_object_cache on the hash_value column.


select
        hash_value,
        full_hash_value,
        namespace,
        child_latch,
        property        hot_flag,
        executions,
        invalidations
from
        v$db_object_cache
where
        name like '{some part of your critical SQL statement}'
;
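The second route mentioned above (find the short hash value in v$sql, then use it to search v$db_object_cache) might look something like the following sketch. The placeholder text is the same as in the previous query, and you would need to add your own wildcards to it.

```sql
-- Sketch: find the full_hash_value by way of the (short) hash_value in v$sql.
select
        ob.hash_value,
        ob.full_hash_value,
        ob.namespace,
        ob.property     hot_flag
from
        v$db_object_cache       ob
where
        ob.hash_value in (
                select  sq.hash_value
                from    v$sql   sq
                where   sq.sql_text like '{some part of your critical SQL statement}'
        )
;
```

Since the short hash_value is not guaranteed to be unique it is worth checking the name column of any row returned before passing its full_hash_value to markhot().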

I ran into two problems using the markhot() approach. The first was not terribly serious; the second was fatal, except I’m not going to do it again and I wouldn’t have run into it if I hadn’t been impatient working around the first.

First: if you’ve already got lots of sessions executing the statement and holding cursors open in some way before you call markhot(), then it may be some time before all those sessions release the hot parent and child and acquire a “cool” parent and child. Unfortunately you can’t call markhot() until at least one session has opened the relevant cursor – and that’s a problem that isn’t relevant if you’ve got the hidden parameter set.

Secondly: although eventually your hot cursor(s) will drop out of use, if you try to get rid of them early by a cunning call to dbms_shared_pool.purge() you may find that you don’t manage to purge them; if you decide to try again, and again (as I did), you may find that your session goes into an infinite CPU spin and no-one can get at the hot cursor. Be patient: once you’ve marked a cursor as hot your application will (probably) end up spreading itself over the copies.

One last detail – if, for any reason, you decide that a cursor no longer needs to be marked hot there is a procedure dbms_shared_pool.unmarkhot() that takes the same three parameters to clear the property and allow the copies to disappear.
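As a sketch, the call mirrors the markhot() example above; the hash shown is simply the one from that example, and you would substitute the full_hash_value of your own statement.

```sql
begin
        -- Clear the HOT property set by the earlier markhot() call.
        dbms_shared_pool.unmarkhot(
                hash            =>'71fc6ccf9a3265368492ec9fc05b785b',
                namespace       =>0,
                global          =>true
        );
end;
/
```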

Footnote

The OTN problem that prompted me to write this note wasn’t about high concurrency levels, it was about mutex contention while searching for the right child cursor. The markhot() procedure doesn’t really look as if it’s designed to address this issue but, as a side-effect of having multiple parent cursors for the same statement text, there should be fewer sessions searching each child-cursor chain at any one moment and this may be enough to reduce the contention. Statistically, of course, every child chain is likely to end up the same length so the amount of shared pool memory used by the SQL statement will eventually grow by a factor matching the number of hot copies produced – but if the problem is contention it may be better (e.g.) to have 16 times the memory used so that 100 concurrent sessions can be spread across 16 different chains rather than having 100 sessions all trying to search the same chain at the same time.
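To see how the child cursors are spread across the parents for a given statement text you could run something like the following sketch against v$sql. The placeholder text needs your own string and wildcards, and the column aliases are my own choice.

```sql
-- Sketch: one row per parent cursor (sql_id) for the same statement text,
-- showing the length of each child chain and the memory it uses.
select
        sql_id,
        hash_value,
        count(*)                child_cursors,
        sum(sharable_mem)       sharable_mem
from
        v$sql
where
        sql_text like '{some part of your critical SQL statement}'
group by
        sql_id,
        hash_value
order by
        sql_id
;
```

With 16 hot copies and 100 concurrent sessions you would be hoping for roughly six sessions per chain at any moment, at the cost of the sharable_mem total growing as each chain fills out.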

 
