In earlier posts we looked at v$reqdist and v$queue, which report time spent running tasks, and time spent waiting in the COMMON and DISPATCHER queues.
I mentioned in the previous article that if we see too much time spent in the COMMON queue(s) then perhaps we need more shared servers. Moving to the other end of the dialogue, one of the reasons why we might spend too much time waiting in a DISPATCHER queue for the result to go back to the user is that we don’t have enough dispatchers, and we can get a clue about this from the view v$dispatcher:
rem     IDLE and BUSY are reported in centiseconds

column messages         format 999,999,999,999
column bytes            format 999,999,999,999
column idle             format 999,999,999,999
column busy             format 999,999,999,999
column total_time       format 999,999,999,999
column busy_percent     format 999.99

select
        name, /* network, */ messages, bytes,
        idle, busy,
        idle + busy                                     total_time,
        -- nullif() returns null rather than raising divide-by-zero when no time is recorded
        100 * round(busy/nullif(idle+busy,0),4)         busy_percent
from
        v$dispatcher
/
NAME         MESSAGES            BYTES             IDLE             BUSY       TOTAL_TIME BUSY_PERCENT
---- ---------------- ---------------- ---------------- ---------------- ---------------- ------------
D000      341,902,864   -1,875,035,869      498,676,896      897,199,945    1,395,876,841        64.28
D001      351,090,918     -860,132,585    1,672,899,708      216,085,537    1,888,985,245        11.44
D002        3,543,576    1,820,366,239        6,602,929           82,744        6,685,673         1.24
D003        5,994,180   -1,007,460,261        6,539,742          145,927        6,685,669         2.18
Unfortunately, it looks as if the critical columns in this view are recorded as 32-bit signed integers, which means they wrap from positive to negative just after 2,147,483,647, and this means the figures for D000 and D001 are complete garbage. In my last note I pointed out that I had started up two extra dispatchers on a system that had been running for quite a long time, which is why dispatchers D002 and D003 have such small numbers compared to the others: the times are reported in centiseconds, and they’ve only been running about 18 hours (66,857 seconds).
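To make the wrap-around concrete, here is a minimal Python sketch (run outside the database) of how a 32-bit signed counter would display a value that has outgrown it, along with the centisecond-to-hours conversion behind the "18 hours" figure. The function names are my own, and the "true" byte count is purely hypothetical: any value differing from it by a multiple of 2^32 would display identically, which is exactly why the wrapped figures can't be trusted.

```python
INT32_SPAN = 2 ** 32        # number of distinct values a 32-bit counter can hold
INT32_MAX = 2 ** 31 - 1     # 2,147,483,647, the largest signed 32-bit value


def as_signed_32(true_value: int) -> int:
    """Return what a 32-bit signed counter would display for true_value."""
    wrapped = true_value % INT32_SPAN
    return wrapped - INT32_SPAN if wrapped > INT32_MAX else wrapped


def centiseconds_to_hours(cs: int) -> float:
    """Convert a time in hundredths of a second to hours."""
    return cs / 100 / 3600


# One step past the maximum flips the sign.
print(as_signed_32(INT32_MAX + 1))                  # -2147483648

# Hypothetical: a true count of 2,419,931,427 bytes would display as
# the BYTES figure shown for D000 (assuming it had wrapped exactly once).
print(as_signed_32(2_419_931_427))                  # -1875035869

# TOTAL_TIME for D002, taken from the output above.
print(round(centiseconds_to_hours(6_685_673), 2))   # 18.57
```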
Clearly, to get some sensible figures, you really need to play around with snapshots and deltas and worry about all the usual problems of collecting information for the right interval. Even so, these figures do show you that D002 and D003 have been idle for most of the time they’ve been up, but you’ll have to take it from me that the 827 seconds and 1,459 seconds they’ve recorded as busy time (82,744 and 145,927 centiseconds) were a small fraction of a soak test that we were running. It’s not obvious from the absolute figures, but with the background information I have I can say that there was a small benefit from having four dispatchers, but nothing significant.
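As a rough illustration of the snapshot-and-delta approach, the Python sketch below takes two readings of the IDLE and BUSY counters and works out the busy percentage for the interval between them. It assumes the counters really are wrapping 32-bit values and that each one advanced by less than 2^32 between the two readings; under that assumption a modulo subtraction recovers the true growth even if a counter wrapped in between. The snapshot values and function names are hypothetical, invented for the example.

```python
from typing import Optional

INT32_SPAN = 2 ** 32


def counter_delta(start: int, end: int) -> int:
    """Growth of a wrapping 32-bit counter between two readings.

    Only valid if the counter advanced by less than 2**32 in the interval.
    """
    return (end - start) % INT32_SPAN


def busy_percent(snap_start: dict, snap_end: dict) -> Optional[float]:
    """Busy time as a percentage of (busy + idle) over the interval."""
    busy = counter_delta(snap_start["busy"], snap_end["busy"])
    idle = counter_delta(snap_start["idle"], snap_end["idle"])
    total = busy + idle
    if total == 0:
        return None     # no time recorded, same idea as nullif() in the query
    return round(100 * busy / total, 2)


# Hypothetical readings taken one hour (360,000 centiseconds) apart.
# The idle counter wraps from positive to negative during the interval.
before = {"idle": 2_147_400_000, "busy": 897_199_945}
after = {"idle": -2_147_387_296, "busy": 897_379_945}

print(counter_delta(before["idle"], after["idle"]))   # 180000
print(busy_percent(before, after))                    # 50.0
```

The shorter the gap between snapshots, the safer the "less than 2^32" assumption: at one centisecond per centisecond a time counter needs more than a year to cover that range, but a byte counter on a busy dispatcher could in principle get there much faster.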
Note: if we were able to trust the 64.28% figure for dispatcher D000 we could be reasonably confident that we needed at least the second dispatcher simply on the basis of the work being done by D000; but we might also worry about it for another reason – if the dispatcher is very busy, it’s possible that this is just a symptom of the whole machine being busy, in which case it’s possible that the dispatcher isn’t able to get CPU time to do its work.
