The following question appeared some time ago on the Oracle Forum:
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class CPU time 1,167 86.9 db file sequential read 111,047 223 2 16.6 User I/O log file sync 77,390 132 2 9.9 Commit log file parallel write 78,060 124 2 9.2 System I/O db file scattered read 79,233 10 0 0.7 User I/O
From AWR I am seeing a lot of CPU taking most call time %. Can anyone explain what this means. I have looked on metalink and there are a number of doc regarding top events in general.
So what could we guess from this minimal slice of an AWR report.
Something has to be top of the list. In this case it’s CPU – but if this is a typical AWR install (which means hourly snapshots), it’s reported 1,167 seconds of CPU in one hour: which is about 30% of the CPU available if it’s running on a single CPU system. (But we haven’t been told the length of the snapshot or the number of CPUs and whether this is supposed to represent a fully loaded production system, or a lightly loaded development system).
Second is the single block read time – which is averaging 2 milliseconds per read – so it’s not coming from disc (mostly) it’s coming out of memory (possibly a SAN cache). That might be pushing up CPU usage outside of Oracle, and could be using memory that would be better inside Oracle; but it’s not giving any obvious threat indications.
Log file sync – and its driver “log file parallel writes” – are also averageing 2m/s so we aren’t overloading the discs on log file writes, and (most) users probably don’t even notice the pause on commit. But we can’t tell whether the counts are appropriate for the work being done – so we can’t say if this is good or bad.
The multiblock-reads are blindingly fast – so must be coming from local file-system cache. But perhaps this is supposed to be an OLTP system, and that could be a lot of inefficient data accesses being used to collect a very small amount of data.
All in all, we might say that the hardware is not under stress – but it is possible that the application is so badly written that the work done per transaction is far higher than it should be. These stats say you might (for example) be doing a 128 block tablescan with one subsequent row lookup and update with commit as you work your way through a batch job. The hardware seems to be coping, but that doesn’t mean the application is healthy.
Without a context, a small collection of numbers is not helpful.
But if you want a couple of guesses –
- it’s an OLTP system (77,000 log file syncs) and there are some SQL statements that need tuning to do less work for their result (79,000 db file scattered reads).
- Fixing a few SQL statement will probably eliminate a lot of db file scattered reads and a fair bit of CPU.
- At the same time, it looks like the machine is surviving on the back of a large file system cache – put some more memory into the buffer cache and the CPU will drop further as Oracle does fewer ‘physical reads’ to the file system.
Two simple questions you can always ask about performance figures are these: “Can I see any resource usage that seems unreasonable for the result achieved ?” and “where are the gaps in the information that might make me jump to the wrong conclusion ?”
Footnote: Any advice about reading AWR reports is almost always relevant when reading Statspack reports (and vice versa).