Oracle Scratchpad

February 4, 2020


Filed under: Bugs,Infrastructure,Oracle,undo — Jonathan Lewis @ 3:50 pm GMT Feb 4,2020

The view v$undostat is a view holding summary information about undo activity that can be used by the automatic undo mechanism to deal with optimising the undo retention time (hence undo space allocation). The view holds one row for every ten minute interval in the last 4 days (96 hours) and includes two columns called maxquerylen and maxqueryid – which tell you something about the query that was considered to be the longest running query active in the interval.

In this note I want to explain why the contents of these two columns are sometimes (possibly often) completely irrelevant despite there being a few notes on the internet about how you should investigate them to help you decide on a suitable setting for the undo_retention.

The descriptions in the 19c reference manual for these columns are as follows:

  • maxquerylen – Identifies the length of the longest query (in seconds) executed in the instance during the period. You can use this statistic to estimate the proper setting of the UNDO_RETENTION initialization parameter. The length of a query is measured from the cursor open time to the last fetch/execute time of the cursor. Only the length of those cursors that have been fetched/executed during the period are reflected in the view.
  • maxqueryid – SQL identifier of the longest running SQL statement in the period

It would be hard to misunderstand the meaning of the second column: if the first column tells us that Oracle has spotted a “longest query” then the second column gives us the sql_id so we can check v$sql to find out what it was. But what sort of queries are going to show up as the longest query in the interval?

Here’s an example from a few hours of a nearly idle instance, querying the begin and end times (formatted to show just day of month + time), with the two critical columns,

        begin_time, end_time, maxquerylen, maxqueryid
order by

--------------- ----------- ----------- -------------
04 10:50:18	04 11:00:18	      0
04 11:00:18	04 11:10:18	    356 f3yfg50ga0r8n
04 11:10:18	04 11:20:18	    883 25u1mbkcr9rnu
04 11:20:18	04 11:30:18	   1486 25u1mbkcr9rnu
04 11:30:18	04 11:40:18	   2090 25u1mbkcr9rnu
04 11:40:18	04 11:50:18	      0
04 11:50:18	04 12:00:18	   3299 25u1mbkcr9rnu
04 12:00:18	04 12:10:18	   3903 25u1mbkcr9rnu
04 12:10:18	04 12:20:18	   4507 25u1mbkcr9rnu
04 12:20:18	04 12:30:18	      0
04 12:30:18	04 12:40:18	      0
04 12:40:18	04 12:50:18	      0
04 12:50:18	04 13:00:18	      0
04 13:00:18	04 13:10:18	      0
04 13:10:18	04 13:20:18	      0
04 13:20:18	04 13:30:18	      0
04 13:30:18	04 13:37:27	   9035 25u1mbkcr9rnu

173 rows selected.

Notice, particularly, that the SQL_ID 25u1mbkcr9rnu disappears from the 11:40 interval, then re-appears at 11:50, then disappears again from 12:20 through 13:20 (lunchtime), then reappears again at 13:30. And when it reappears after an absence the query length has increased in a way that’s consistent with the gap. So it looks as if the query wasn’t running during the gap, but turns out to have been running after the gap ended. (Is this a Quantum query – does it only run if you’re looking at it?)

The explanation is in the detail of the definition: “from the cursor open time to the last fetch/execute time”. From an SQL*Plus session I “set pause on” then executed the query “select * from all_objects” and hit return a few times to get a few pages of output. Then, spread over the next couple of hours, I hit return a few more times to get a few more pages of output. Each time I hit return my session does another fetch call, and the code behind v$undostat notices that my query is still active.

I don’t know exactly how Oracle is keeping track of “active” statements because there seem to be some inconsistencies in the reporting (and I’ll comment on those in a moment) but, as a first approximation, until you close a cursor (either explicitly or implicitly) some piece of Oracle’s code registers the fact that the query might do further fetches, which means it might need to apply more undo to make current data read-consistent with the cursor’s start SCN, which means that it should take note of the time the cursor has been open because the undo retention time might need to be that long.


I said there were some inconsistencies in the reporting. I’ve noticed three anomalies – perhaps due to the extreme idleness of the instance I was using for testing.

  1. At about 12:45 I hit return a few times to get the maxquerylen and maxqueryid refreshed – but the code didn’t seem to notice that I had done a few more fetches of a long running query. So it seems to be possible for Oracle to miss the query that should be reported.
  2. At about 11:52 I hit return a few times again, and you can see that the two critical columns were refreshed, but if you do the arithmetic Oracle has attributed 600 seconds to the query – the whole of the 10 minute refresh interval, not just the time up to the moment of the last fetch that I had done in that interval.
  3. I didn’t hit return again until about 12:45 (the moment in point 1 above when the two columns didn’t refresh the way I though they should), but the columns kept updating through 12:00 and 12:10 intervals anyway before disappearing from the 12:20 interval. So it looks like queries can be reported as long running even when they haven’t been executing or fetching.

The is one last anomaly – and this relates to the reason I started looking at this columns. If you check the entry for 11:00 you’ll see that the SQL Id ‘f3yfg50ga0r8n’ has been reported as “running” for 356 seconds. But here’s what I found in v$sql for that query:

select  executions, parse_calls, fetches, end_of_fetch_count, elapsed_time, cpu_time, rows_processed, sql_text
from    v$sql
where   sql_id = 'f3yfg50ga0r8n'

---------- ----------- ---------- ------------------ ------------ ---------- -------------- ------------------------------------------
        79          79         79                  0        20487      10667             79 select obj# from obj$ where dataobj# = :1

The SQL lookes like a sys-recursive statement which, in 79 executions, has accumulated 20 milliseconds of elapsed time (rather than 356 seconds – but that difference could just be related to one or other of the anomalies I reported earlier). The key thing to note is that the value of column end_of_fetch_count is zero: this looks like a statement where Oracle has simply not bothered to fetch all the data and has then not bothered to close the cursor. As a result it’s possible that each time the statement is executed (note that parse_calls = executions, so we’re not looking at a “held” cursor) the code behind v$undostat looks back at the time the cursor was initially opened to measure the query length, rather than looking at the time the statement was re-executed.

This may go a long way to answering the question that came up on Oracle-l earlier on today as follows:

The following query (sql_id is 89w8y2pgn25yd) was recorded in v$undostat.maxqueryid in a 12.2. database during a period of a high undo usage: select ts# from sys.ts$ where ts$.online$ != 3 and bitand(flags,2048) != 2048;

from    v$undostat u
where   maxqueryid='89w8y2pgn25yd'

--------  --------  -----------  --------------  ----------
   39199      4027         1378               5     2531960

What is this query, and how does it manage to report a maximum query length of 1,378 seconds?

Just like the one above it’s a sys-recursive query; and this one appears when you query dba_tablespaces – and even though it executes once for every row it takes just fractions of a second to execute. But if you trace a query like “select tablespace_name from dba_tablespaces” you’ll see that every time the query is called the trace file will show lines for: “Parse, Exec, Fetch, Close” until the last call – which doesn’t report a “Close”.

Just like my original “select * from all_objects” there’s a query dangling with an open cursor, leaving Oracle with the opportunity to go back to the moment it was opened and report it as a “long running query”.


The maxquerylen and maxqueryid in v$undostat don’t tell you about statements that have taken a long time to change a lot of data and generate a lot of undo; they tell you about statements that might need to apply a lot of undo to see read-consistent data and therefore might become victims in a “ORA-01555: snapshot too old” event.

The statements reported may be completely irrelevant because there are various reasons (some, possibly, bug-related) why a cursor may be treated as if it opened a long time in the past when it was actually a re-use of an existing “dangling” cursor. It’s also possible that some bad client code will treat cursor in a way that does no harm to the client program but hides a more interesting query that would otherwise have been reported by these two columns.

Update Feb 2020

I don’t know why I didn’t think of searching MOS before poking around at this issue, but I’ve just done a search on “maxquerylen maxqueryid” and one of the first Knowledgebase articles to show up – dated September 2019 – was Doc ID 2005931.1: Wrongly calculated MAXQUERYLEN Causing ORA-01555 or ORA-30036 Errors.

The document reports several bugs all relating to the same issue, and all closed as duplicates of unpublished bug: 12314489: OCIRELEASESTMT() TO PUT THE CURSORS IN CURBOUND STATE WHEN PLACING THEM IN CACHE This is also closed and replaced by bug 26833932 which is also unpublished and doesn’t even have a description.

Update Dec 2020

A question on the MOS Community Forums today raised a question about SQL_ID f3yfg50ga0r8n appearing in v$active_session_history with a an sql_exec_start (22:25) that was nearly 38 minutes before the sample_time (23:03) with no other samples appearing in ASH between those two times. This doesn’t necessarily mean anything special (it could be a query which is very efficient at the database but is only asked to return a few rows to the client every few seconds), but it’s the same query that I commented on in the section on Inconsistencies.

This prompted me to do another search for the bug numbers I reported in February, and I found that something has been published fairly recently about bug 26833932 – search for the document id 26833932.8, and you’ll find “Bug 26833932 OCIStmtRelease() To Put Cursors In CURBOUND State When Placing Them In Cache” . This is reported as fixed in 18.1 with various patches available for and

Update Jan 2021

Writing up some notes on undo handling I noticed that my 19.3 v$undostat was reporting a maxqueryid of  g0t052az3rx44 that had spontaneously appeared with a maxquerylen of 42,000 seconds, but no previous appearances. The text of the query was another sys-recursive statement:

select name,intcol#,segcol#,type#,length,nvl(precision#,0),decode(type#,2,nvl(scale,-127/*MAXSB1MINAL*/),178,scale,179,scale,180,sca
le,181,scale,182,scale,183,scale,231,scale,0),null$,fixedstorage,nvl(deflength,0),default$,rowid,col#,property, nvl(charsetid,0),nvl
(charsetform,0),spare1,spare2,nvl(spare3,0), nvl(evaledition#,1),nvl(unusablebefore#,0),nvl(unusablebeginning#,0), case when (type#
in (1,8,9,96,112)) then nvl(collid, 16382) else 0 end case, nvl(collintcol#,0), nvl(acdrrescol#, 0), nvl(spare7, 0), nvl(spare9, 0),
 nvl(spare10, 0) from col$ where obj#=:1 order by intcol#

So the problem still hasn’t gone away yet.


  1. Hi Jonathan,
    We are seeing a strange issue after upgrading our DW databases to 19c (single tenant). When on 11g, we never faced ORA-1555 in last 8 years. Right after upgrading to 19c, our DW databases are frequently seeing ORA-1555 errors. The problem I am seeing in v$undostat is that for last 4 days, it is reporting the same SQL ID in the maxqueryid column. So, all the 576 rows for instance one of the database have the same SQL ID as maxqueryid. The database is pretty busy and users are running lot of queries. So, why the v$undostat is not showing any other query? Your article is the only one which comes anywhere near to what I am seeing. You insight into Oracle internals is phenomenal so I turn to you for any explanation of the issue.

    Comment by Arun Gupta — February 18, 2020 @ 7:37 pm GMT Feb 18,2020 | Reply

    • Arun,

      Thanks for the question; it made me realise that I hadn’t searched MOS for any relevant cases before writing the blog noe so I’ve now done that and added an update which might be relevant to your problem. The description may be relevant in your case, and the reason the problem has appeared may be because in your move from 11g to 19c it’s possible that you’ve automatically enabled something like undo autotuning (or some other auto feature) that hadn’t been in place in 11g, so the bug is now making itself felt as a side effect.

      I would contact Oracle support and ask them how the hidden bug is progressing and if there’s any patch available. I’d also look at the Doc ID I started from to see if any of the workarounds it suggests might help.

      Jonathan Lewis

      Comment by Jonathan Lewis — February 19, 2020 @ 4:09 pm GMT Feb 19,2020 | Reply

  2. Thank you so much…!!! I had seen this MOS note, but just skipped over it because it did not have any Oracle version number and seemed very generic. I will open SR and follow up with support.

    Comment by Arun Gupta — February 19, 2020 @ 5:17 pm GMT Feb 19,2020 | Reply

  3. Nenad,
    In addition to the MOS note Jonathan has pointed out (2005931.1), please also check out MOS note Bug 27543971 – Incorrect V$undostat.tuned_undoretention Value In Local Undo Mode (Doc ID 27543971.8).

    Comment by Arun Gupta — February 24, 2020 @ 4:23 pm GMT Feb 24,2020 | Reply

  4. The sql issue may be involved in Bug 31577569  Datapump export at default direct path unload mode fails with ORA-1555 errors.

    Comment by YeonHong.Min — November 3, 2020 @ 2:41 am GMT Nov 3,2020 | Reply

    • YeonHong.Min,

      Thanks for the reference.

      Reading the description ((Doc ID 31577569.8) it doesn’t look as if this could cause the the ORA-01555; but it is possible that in the upgrade to 19.7 (where your bug number is first noted as present) there’s a change in the way Oracle tracks examples like the above which means it STOPs protecting the direct path unload. So an upgrade from 19.6 or lower may suddenly start hitting the direct path problem for “no apparent reason”.

      Jonathan Lewis

      Comment by Jonathan Lewis — November 3, 2020 @ 9:56 am GMT Nov 3,2020 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by