Browsing the Internet recently, I came across the following question in response to a posting by Dan Fink:
Assuming I collect snapshots every 15 min, for example:
10.00 AM – snapid 1
10.15 AM – snapid 2
and I run spreport for
start snap = 1
end snap = 2
If I have a SQL statement that was parsed at 9.55AM and is still executing at 10.00 AM. I have always assumed that it will show up in the report - is this correct ?
The question is just a little ambiguous. But I am going to assume the poster is thinking of a single occurrence of a long running statement that is issued at 9:55, is still running at 10:00, and may not complete before 10:15. This highlights two critical points you have to understand about how statspack handles this statement .
First, statspack has various limits set to minimise the volume of data captured and reported; secondly, your session updates the session and statement statistics at the end of each database call. (10gR2 has changed this, statement-level statistics can now be updated every few seconds inside calls – it seemed to be every 6 seconds when I tested it on 10.2.0.1 – but I have seen a comment that it’s supposed to be every 3 seconds).
I should start by taking advantage of the blogger’s disclaimer and pointing out that I’ve never had to track down a specific statements in this way on a production system and therefore haven’t investigated this very carefully. I believe, however, that the following is correct; and a couple of simple tests I’ve run seem to corroborate the descriptions I’ve given.
Imagine, then, that you have a query that is going to take 20,000 units of work to complete, and the critical level at which statspack captures a query is 10,000 units. (We won’t bother to worry about what a “unit” represents for the purposes of this argument). Depending on the nature of the statement, and the way in which you retrieve data from it, the following is feasiable:
Scenario 1: the nature of the work is such that the statistics don’t get updated until the query ends. The statement completes between 10:00 and 10:15 – it is captured at 10:15 and gets into the report for 10:00 to 10:15.
Scenario 2: again, the stats gets updated only when the query ends, but the query ends after 10:15 – so it won’t get captured at 10:15; it will get captured at 10:30, and all it’s work will be ascribed to the later period.
Scenario 3: the stats get updated as the query runs, and the query ends between 10:00 and 10:15. It gets captured and reported as in Scenario 1. However, in this scenario, the work done before 10:00 will (quite correctly) be factored out of the report, as the report shows the change since the start of period.
Scenario 4 – in slightly more detail:
A lot of work gets done in the first five minutes, work proceeeds all the way through the next 15 minutes, and the query is still running at 10:15, completing at 10:20. Statistics are updated as the query runs.
Say you have 6,000 units of work done by 10:00 – the statement is not trapped because if falls below the limit; assume you have 15,000 units completed by 10:15 – the statement is trapped; then the last 5,000 units occur between 10:15 and 10:20 when the statement ends – so the query is trapped again at 10:30.
Queries only appear in a statspack report (in general) if the number of executions recorded in the period has changed – no matter what volume of work was done. So this query will not appear in the report covering 10:00 to 10:15 – it’s execution count did not change in the interval. It will appear in the report from 10:15 to 10:30, showing (quite correctly) 5,000 units of work.
As I said, I’ve never had to deal with the consequences of an oddity like this. But it’s worth remembering that the work done by a single execution of a single query – operated from a direct SQL*Plus command line call even – could cross the boundaries of a statspack report and, in a few cases, this might mean a critical query is missing from the statspack report where you might expect it to be – or indicating a much lower than expected resource demand.
In the 9i version of statspack the resources checked, with their default collection boundaries, are:
- disk reads (1,000)
- buffer gets (10,000)
- parse calls (1,000)
- executions (100)
- version count (20)
- sharable memory (1MB)
You are allowed to change these using the statspack.modify_statspack_parameter() procedure and in a large, busy, system it would probably be a good idea to do so to reduce the volume of data captured during a snapshot.
Statements are reported according to slightly varying criteria depending on the resource. But in all cases at most 64 statements will be reported, and this could be reduced to as few as 16 (if you have a lot of long SQL statements). The checks are as follows:
- disk reads – reports a statement if any executions completed in the period and the ending disk read count for that statement was non-zero
- buffer gets – reports a statement if any executions of that statement completed in the period
- parse calls – report any statement in the endpoint capture set
- executions – reports a statement if there are executions of that statement recorded in in the period at all
- version count – reports a statement if there are any executions of that statement completed in the period and the ending version count exceeds the limit
- sharable memory – reports a statement if there are any executions of that statement completed in the period, and the ending memory exceeds the limit.
Remember, even if a statement does not get into the report, you may still be able to find it in the stats$sqlsummary and stats$sqltext tables for both the beginning and ending snapshot.