<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Analysing Statspack 13</title>
	<atom:link href="http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/feed/" rel="self" type="application/rss+xml" />
	<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/</link>
	<description>Just another Oracle weblog</description>
	<lastBuildDate>Mon, 20 May 2013 01:44:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52785</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Sat, 12 Jan 2013 18:42:01 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52785</guid>
		<description><![CDATA[Timur,

The original question came up in (at least) two places on OTN - this is &lt;strong&gt;&lt;em&gt;&lt;a href=&quot;https://forums.oracle.com/forums/thread.jspa?threadID=2482111&amp;tstart=200&quot; rel=&quot;nofollow&quot;&gt;the one I saw &lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;and responded to.

Your comments about me saying &quot;AWR&quot; when really I should have been saying &quot;9i&quot; are correct, of course. There is some OS information in 10g, though the presentation is different; in fact if you enable it you can get some OS stats appearing in the 8i and 9i Instance Activity.

My inference from the db file read figures is that since they&#039;re so fast they must be coming from the file system cache - and if that&#039;s the case then there&#039;s going to be a big chunk of CPU being recorded against the OS time for file system manipulation. (I&#039;ve assumed, of course, that the huge excess of memory, and the low version of Oracle, are good indicators of a simple set up.)

I think I made the point about confusing cause and effect in the OTN thread, but at this point I think it ought to be fairly clear which is which. If anyone doesn&#039;t get your point though, it&#039;s worth mentioning that if you have a bit of a latch sleep problem and everyone starts chewing up all the CPU (without necessarily doing anything related to latches), then the latch sleeps get worse: it takes longer for a sleeper that&#039;s woken up to get to the top of the run queue because all the processes ahead of it in the queue are using up their time slices.

Good alternative explanation for how we could lose about 8 billion LIOs that took place in this time interval. The OP might benefit from running off reports for the next two or three (or more) hours as well.

Predicate capture (statspack doesn&#039;t do it):  at this point, with such extreme problems, I&#039;m only really interested in the big picture - are today&#039;s (or this interval&#039;s) execution plans the ones we&#039;ve been seeing for weeks, or is today a very special day.  I&#039;ll worry about WHY there&#039;s been a change after I&#039;ve determined that a (massive) change has occurred.

Regarding SGA size:  it may not be in the thread you saw, but the sga_max_size is set to 1.7GB, and the db_cache_size and shared_pool_size are set (and reported) as 300MB - it looks as if someone has allowed for manual adjustment up to the legal 32-bit max for the platform, but has not actually allocated it. So there seems to be about 1GB of memory doing nothing that ought to have been in the db cache to reduce the constant copying from file system to Oracle cache.  But it is possible that an increase might make the wrong thing go faster - with the side effect (see &quot;top of run queue&quot; comment) that the thing we&#039;re worried about might slow down even more.

In fact there&#039;s a follow-up post from the same person comparing two consecutive days. I&#039;m trying to find time to write some comments about it as &quot;Statspack 13a&quot;. It&#039;s very educational.]]></description>
		<content:encoded><![CDATA[<p>Timur,</p>
<p>The original question came up in (at least) two places on OTN &#8211; this is <strong><em><a href="https://forums.oracle.com/forums/thread.jspa?threadID=2482111&amp;tstart=200" rel="nofollow">the one I saw </a></em></strong>and responded to.</p>
<p>Your comments about me saying &#8220;AWR&#8221; when really I should have been saying &#8220;9i&#8221; are correct, of course. There is some OS information in 10g, though the presentation is different; in fact if you enable it you can get some OS stats appearing in the 8i and 9i Instance Activity.</p>
<p>My inference from the db file read figures is that since they&#8217;re so fast they must be coming from the file system cache &#8211; and if that&#8217;s the case then there&#8217;s going to be a big chunk of CPU being recorded against the OS time for file system manipulation. (I&#8217;ve assumed, of course, that the huge excess of memory, and the low version of Oracle, are good indicators of a simple set up.)</p>
<p>I think I made the point about confusing cause and effect in the OTN thread, but at this point I think it ought to be fairly clear which is which. If anyone doesn&#8217;t get your point though, it&#8217;s worth mentioning that if you have a bit of a latch sleep problem and everyone starts chewing up all the CPU (without necessarily doing anything related to latches), then the latch sleeps get worse: it takes longer for a sleeper that&#8217;s woken up to get to the top of the run queue because all the processes ahead of it in the queue are using up their time slices.</p>
<p>Good alternative explanation for how we could lose about 8 billion LIOs that took place in this time interval. The OP might benefit from running off reports for the next two or three (or more) hours as well.</p>
<p>Predicate capture (statspack doesn&#8217;t do it):  at this point, with such extreme problems, I&#8217;m only really interested in the big picture &#8211; are today&#8217;s (or this interval&#8217;s) execution plans the ones we&#8217;ve been seeing for weeks, or is today a very special day.  I&#8217;ll worry about WHY there&#8217;s been a change after I&#8217;ve determined that a (massive) change has occurred.</p>
<p>Regarding SGA size:  it may not be in the thread you saw, but the sga_max_size is set to 1.7GB, and the db_cache_size and shared_pool_size are set (and reported) as 300MB &#8211; it looks as if someone has allowed for manual adjustment up to the legal 32-bit max for the platform, but has not actually allocated it. So there seems to be about 1GB of memory doing nothing that ought to have been in the db cache to reduce the constant copying from file system to Oracle cache.  But it is possible that an increase might make the wrong thing go faster &#8211; with the side effect (see &#8220;top of run queue&#8221; comment) that the thing we&#8217;re worried about might slow down even more.</p>
<p>In fact there&#8217;s a follow-up post from the same person comparing two consecutive days. I&#8217;m trying to find time to write some comments about it as &#8220;Statspack 13a&#8221;. It&#8217;s very educational.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Noons</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52775</link>
		<dc:creator><![CDATA[Noons]]></dc:creator>
		<pubDate>Sat, 12 Jan 2013 13:24:20 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52775</guid>
		<description><![CDATA[Very true.  In this case, it was quite clear: it&#039;s a daily aggregation batch job that starts executing at 11pm and is supposed to be finished by 8am.  
Used to work well within that window.  Until the Xmas load from our centres made the &quot;demo data&quot; in the tables involved grow from 100 rows to 3.8 million. 
And of course: archiving older data is unheard of - apparently, adding CPU to enable the thing to &quot;scale&quot; was the recommended way to address this. 
Conveniently ignoring that the process does not use parallel query and that, of the 4 cores available, only one was flat out.
Given I got a say in this one, I preferred to tune the SQL: now down to 10 minutes, from never finishing after 72 hours. I reckon that was a better improvement than &quot;adding CPU&quot;. 
Although of course the Oracle sales rep hates that!  :)]]></description>
		<content:encoded><![CDATA[<p>Very true.  In this case, it was quite clear: it&#8217;s a daily aggregation batch job that starts executing at 11pm and is supposed to be finished by 8am.<br />
Used to work well within that window.  Until the Xmas load from our centres made the &#8220;demo data&#8221; in the tables involved grow from 100 rows to 3.8 million.<br />
And of course: archiving older data is unheard of &#8211; apparently, adding CPU to enable the thing to &#8220;scale&#8221; was the recommended way to address this.<br />
Conveniently ignoring that the process does not use parallel query and that, of the 4 cores available, only one was flat out.<br />
Given I got a say in this one, I preferred to tune the SQL: now down to 10 minutes, from never finishing after 72 hours. I reckon that was a better improvement than &#8220;adding CPU&#8221;.<br />
Although of course the Oracle sales rep hates that!  :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52773</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Sat, 12 Jan 2013 13:18:07 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52773</guid>
		<description><![CDATA[Ahmed,

The point of that comment was that the instance activity reports something like 14.4 Billion LIOs, so somewhere we&#039;ve &quot;lost&quot; 8 or 9 billion.  This means that solving the 5B problem might be nice (and it might even be easy) but there&#039;s still a huge amount of work hiding somewhere that we&#039;re still going to have to address.

Maybe the big one that we can see all happened in the second hour of the report, and there&#039;s a similarly sized query that took place in the first hour. It&#039;s always worth thinking about the things that we can&#039;t see, and wondering whether there&#039;s a way in which we can find them.]]></description>
		<content:encoded><![CDATA[<p>Ahmed,</p>
<p>The point of that comment was that the instance activity reports something like 14.4 Billion LIOs, so somewhere we&#8217;ve &#8220;lost&#8221; 8 or 9 billion.  This means that solving the 5B problem might be nice (and it might even be easy) but there&#8217;s still a huge amount of work hiding somewhere that we&#8217;re still going to have to address.</p>
<p>Maybe the big one that we can see all happened in the second hour of the report, and there&#8217;s a similarly sized query that took place in the first hour. It&#8217;s always worth thinking about the things that we can&#8217;t see, and wondering whether there&#8217;s a way in which we can find them.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52772</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Sat, 12 Jan 2013 13:15:14 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52772</guid>
		<description><![CDATA[PetrN,

Thanks for the comment - it&#039;s too easy to lose track of which bits appear in which version (though this was a 9i system, of course). The only bit I don&#039;t forget is that Statspack 10 reported the event histograms that didn&#039;t get into the AWR until 11 ;)]]></description>
		<content:encoded><![CDATA[<p>PetrN,</p>
<p>Thanks for the comment &#8211; it&#8217;s too easy to lose track of which bits appear in which version (though this was a 9i system, of course). The only bit I don&#8217;t forget is that Statspack 10 reported the event histograms that didn&#8217;t get into the AWR until 11 ;)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52767</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Sat, 12 Jan 2013 12:33:23 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52767</guid>
		<description><![CDATA[Noons,

In many ways this boils down to the question: &quot;Who knows what the system is supposed to do?&quot;

Intuitively we feel that a query that takes 10,000,000 buffer gets and runs hundreds of times per day in an OLTP system has to be wrong - either it should only be running rarely, or it should be taking thousands of LIOs rather than millions. There are special cases, of course, and there&#039;s a very large grey area where the resource usage looks suspicious but not necessarily outrageous. When in doubt, we need to be able to get a good answer to the question: &quot;does this seem reasonable?&quot;, and I often find that the answer to that question is that nobody knows, and you have to go through a lot of work trying to reverse engineer the requirement and the intent of the programmer.]]></description>
		<content:encoded><![CDATA[<p>Noons,</p>
<p>In many ways this boils down to the question: &#8220;Who knows what the system is supposed to do?&#8221;</p>
<p>Intuitively we feel that a query that takes 10,000,000 buffer gets and runs hundreds of times per day in an OLTP system has to be wrong &#8211; either it should only be running rarely, or it should be taking thousands of LIOs rather than millions. There are special cases, of course, and there&#8217;s a very large grey area where the resource usage looks suspicious but not necessarily outrageous. When in doubt, we need to be able to get a good answer to the question: &#8220;does this seem reasonable?&#8221;, and I often find that the answer to that question is that nobody knows, and you have to go through a lot of work trying to reverse engineer the requirement and the intent of the programmer.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Timur Akhmadeev</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52727</link>
		<dc:creator><![CDATA[Timur Akhmadeev]]></dc:creator>
		<pubDate>Fri, 11 Jan 2013 16:03:21 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52727</guid>
		<description><![CDATA[Hello Jonathan
&lt;blockquote&gt;and the machine had 12 CPUs&lt;/blockquote&gt;
Where did you get this number? The original &lt;a href=&quot;https://forums.oracle.com/forums/thread.jspa?messageID=10776350&amp;tstart=0&quot; rel=&quot;nofollow&quot;&gt;thread&lt;/a&gt; doesn&#039;t show the number of available cores.
&lt;blockquote&gt;Working from the top down, the first thing we notice is that 2M logical I/Os per second is quite busy&lt;/blockquote&gt;
It makes sense to translate this into bytes: that&#039;s around 16GiB/s, which could tell you more if you happen to know the chipset - maybe that&#039;s close to its limit.
&lt;blockquote&gt;Since this isn’t the AWR, of course, we don’t have the OS Stats available, so we can’t see if the rest of the machines CPU is being used up outside the database – but looking at the db file read figures I’ll bet it is&lt;/blockquote&gt;
Could you please clarify what it is in the db file read figures that makes you think something else is eating up CPU outside the DB? BTW Statspack in 11g provides OS stats (10g doesn&#039;t AFAIR), so it&#039;s not only the tool that is an issue.
&lt;blockquote&gt;If you have extreme CPU usage in an Oracle system, then latch contention problems are very likely to show up – and that’s our next highest figure – AWR would probably tells us which latch but, again based on the db file read figures, I’ll guess it’s the cache buffers chains latches.&lt;/blockquote&gt;
1. It&#039;s not AWR that tells you the latch name, but the modern Oracle kernel, which has special names for specific latch free events, so Statspack in 10g or 11g will show the CBC latch in the Top 5 as well.
2. It&#039;s hard to decide what is the cause and what is the consequence: whether latching is causing CPU saturation, or extreme CPU usage is causing latch waits to appear in the Top 5. In this case I think it is the first scenario, i.e. too many block gets cause too much CPU usage due to too many latch acquisitions.
&lt;blockquote&gt;so there may be a couple more big ones that have fallen out of the library cache in the interim, perhap an hourly report on the previous hour would catch them&lt;/blockquote&gt;
That is a very likely situation, especially with a relatively small shared pool. Another possibility is that something is still being executed, and in this case the Oracle kernel in 9i could not help Statspack identify a query that is still running at the end snapshot.
&lt;blockquote&gt;Trouble is, though, that unless you set statspack to run at level 6 you don’t capture the execution plans – unlike the AWR which capture them by default&lt;/blockquote&gt;
... but it doesn&#039;t help in some situations as it doesn&#039;t capture predicates (Bug 7493519: ACCESS_PREDICATES AND FILTER_PREDICATES MISSING IN DBA_SQL_PLAN_HIST, fixed in 12c). I don&#039;t remember if Statspack gathers the predicates correctly under level 6 - if it does, then Statspack is the preferable tool for this task.
&lt;blockquote&gt;As a quick and dirty which might help a bit – put more of the sga_max_size into the buffer cache – it doesn’t happen automatically in 9.2 – and that may reduce the interaction between the O/S and Oracle and save a bit of latching and CPU.&lt;/blockquote&gt;
If it is indeed 32-bit binaries then I&#039;m not sure it&#039;s possible to allocate more than 2.7G to the SGA. If it is 64-bit then the SGA increase should also be combined with switching to direct IO and possibly to huge pages if it&#039;s Linux (the hostname suggests so). But without fixing the SQL it&#039;s possible to make things worse: with the IO removed completely, the app will (try to) use more CPU, resulting in potentially worse latch contention and more problems.

What looks a bit suspicious to me in the report are the Parallel operations: there were 3/3 DDL/DML statements parallelized, 54 DFO trees parallelized, 48 queries parallelized. I&#039;m not sure if these numbers are consistent - but I think that even 9 PX slaves on a 12 core system could utilize 75% of the CPU if they catch the wrong plan. It could be related to a DDL on the index, perhaps, changing its degree, although this is just a guess. Anyway, I think it&#039;s worth checking whether Parallel Execution is acceptable on this system or not.]]></description>
		<content:encoded><![CDATA[<p>Hello Jonathan</p>
<blockquote><p>and the machine had 12 CPUs</p></blockquote>
<p>Where did you get this number? The original <a href="https://forums.oracle.com/forums/thread.jspa?messageID=10776350&amp;tstart=0" rel="nofollow">thread</a> doesn&#8217;t show the number of available cores.</p>
<blockquote><p>Working from the top down, the first thing we notice is that 2M logical I/Os per second is quite busy</p></blockquote>
<p>It makes sense to translate this into bytes: that&#8217;s around 16GiB/s, which could tell you more if you happen to know the chipset &#8211; maybe that&#8217;s close to its limit.</p>
<blockquote><p>Since this isn’t the AWR, of course, we don’t have the OS Stats available, so we can’t see if the rest of the machines CPU is being used up outside the database – but looking at the db file read figures I’ll bet it is</p></blockquote>
<p>Could you please clarify what it is in the db file read figures that makes you think something else is eating up CPU outside the DB? BTW Statspack in 11g provides OS stats (10g doesn&#8217;t AFAIR), so it&#8217;s not only the tool that is an issue.</p>
<blockquote><p>If you have extreme CPU usage in an Oracle system, then latch contention problems are very likely to show up – and that’s our next highest figure – AWR would probably tells us which latch but, again based on the db file read figures, I’ll guess it’s the cache buffers chains latches.</p></blockquote>
<p>1. It&#8217;s not AWR that tells you the latch name, but the modern Oracle kernel, which has special names for specific latch free events, so Statspack in 10g or 11g will show the CBC latch in the Top 5 as well.<br />
2. It&#8217;s hard to decide what is the cause and what is the consequence: whether latching is causing CPU saturation, or extreme CPU usage is causing latch waits to appear in the Top 5. In this case I think it is the first scenario, i.e. too many block gets cause too much CPU usage due to too many latch acquisitions.</p>
<blockquote><p>so there may be a couple more big ones that have fallen out of the library cache in the interim, perhap an hourly report on the previous hour would catch them</p></blockquote>
<p>That is a very likely situation, especially with a relatively small shared pool. Another possibility is that something is still being executed, and in this case the Oracle kernel in 9i could not help Statspack identify a query that is still running at the end snapshot.</p>
<blockquote><p>Trouble is, though, that unless you set statspack to run at level 6 you don’t capture the execution plans – unlike the AWR which capture them by default</p></blockquote>
<p>&#8230; but it doesn&#8217;t help in some situations as it doesn&#8217;t capture predicates (Bug 7493519: ACCESS_PREDICATES AND FILTER_PREDICATES MISSING IN DBA_SQL_PLAN_HIST, fixed in 12c). I don&#8217;t remember if Statspack gathers the predicates correctly under level 6 &#8211; if it does, then Statspack is the preferable tool for this task.</p>
<blockquote><p>As a quick and dirty which might help a bit – put more of the sga_max_size into the buffer cache – it doesn’t happen automatically in 9.2 – and that may reduce the interaction between the O/S and Oracle and save a bit of latching and CPU.</p></blockquote>
<p>If it is indeed 32-bit binaries then I&#8217;m not sure it&#8217;s possible to allocate more than 2.7G to the SGA. If it is 64-bit then the SGA increase should also be combined with switching to direct IO and possibly to huge pages if it&#8217;s Linux (the hostname suggests so). But without fixing the SQL it&#8217;s possible to make things worse: with the IO removed completely, the app will (try to) use more CPU, resulting in potentially worse latch contention and more problems.</p>
<p>What looks a bit suspicious to me in the report are the Parallel operations: there were 3/3 DDL/DML statements parallelized, 54 DFO trees parallelized, 48 queries parallelized. I&#8217;m not sure if these numbers are consistent &#8211; but I think that even 9 PX slaves on a 12 core system could utilize 75% of the CPU if they catch the wrong plan. It could be related to a DDL on the index, perhaps, changing its degree, although this is just a guess. Anyway, I think it&#8217;s worth checking whether Parallel Execution is acceptable on this system or not.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ahmed aangour</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52722</link>
		<dc:creator><![CDATA[ahmed aangour]]></dc:creator>
		<pubDate>Fri, 11 Jan 2013 14:10:34 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52722</guid>
		<description><![CDATA[Hi Jonathan,

Happy new year.

Could you please explain your thinking further when you wrote the following sentence:
&quot;and even then that’s 40% of the total – so there may be a couple more big ones that have fallen out of the library cache in the interim, perhap an hourly report on the previous hour would catch them&quot;

Do you think 5 billion LRs should represent more than 40% of the total?]]></description>
		<content:encoded><![CDATA[<p>Hi Jonathan,</p>
<p>Happy new year.</p>
<p>Could you please explain your thinking further when you wrote the following sentence:<br />
&#8220;and even then that’s 40% of the total – so there may be a couple more big ones that have fallen out of the library cache in the interim, perhap an hourly report on the previous hour would catch them&#8221;</p>
<p>Do you think 5 billion LRs should represent more than 40% of the total?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: PetrN</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52718</link>
		<dc:creator><![CDATA[PetrN]]></dc:creator>
		<pubDate>Fri, 11 Jan 2013 12:25:30 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52718</guid>
		<description><![CDATA[Hello Jonathan
&quot;Since this isn’t the AWR, of course, we don’t have the OS Stats available..&quot;

Not generally: since version 10, Statspack also captures OS stats in the table stats$osstat.

Best Regards,
Petr]]></description>
		<content:encoded><![CDATA[<p>Hello Jonathan<br />
&#8220;Since this isn’t the AWR, of course, we don’t have the OS Stats available..&#8221;</p>
<p>Not generally: since version 10, Statspack also captures OS stats in the table stats$osstat.</p>
<p>Best Regards,<br />
Petr</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Noons</title>
		<link>http://jonathanlewis.wordpress.com/2013/01/07/analysing-statspack-13/#comment-52645</link>
		<dc:creator><![CDATA[Noons]]></dc:creator>
		<pubDate>Wed, 09 Jan 2013 06:34:02 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=10275#comment-52645</guid>
		<description><![CDATA[These things must attract each other!  Just finished a tuning exercise on a third party j2ee app here.

A single SQL was in its 3rd day of uninterrupted execution. Finally got permission from the business to shoot it and fix the blessed thing.

Typical j2ee &quot;framework&quot; disaster: auto-generated, 3.8Mrow-table joined to 4Mrow-GTT, with multiple subqueries on both predicates AND select column list coming from equally monster-joins.  None of them with the slightest chance of ever using an index.  
What I call a &quot;legacy query&quot;: our grandchildren will see its results.

The execution plan in EM was a very long, sorry list of nested over nested over nested loops of million row FTS...
But because the buffer cache was quite large, the whole lot fit in cache.  Hence, a LOT of CPU being used and negligible reads.  
And of course: lots of buffer cache waits and even more CPU waits!

Which prompted the supplier&#039;s &quot;Oracle guru&quot; to *categorically* claim &quot;you are starved of CPU - add more of it&quot;.
Of course, that got nipped in the bud.
After a little judicious re-coding and trimming of totally unnecessary nested subqueries, the whole lot finished in around 10 minutes.  

More CPU.  Yeah! Right...
(save us from &quot;j2ee Oracle gurus&quot;!...)]]></description>
		<content:encoded><![CDATA[<p>These things must attract each other!  Just finished a tuning exercise on a third party j2ee app here.</p>
<p>A single SQL was in its 3rd day of uninterrupted execution. Finally got permission from the business to shoot it and fix the blessed thing.</p>
<p>Typical j2ee &#8220;framework&#8221; disaster: auto-generated, 3.8Mrow-table joined to 4Mrow-GTT, with multiple subqueries on both predicates AND select column list coming from equally monster-joins.  None of them with the slightest chance of ever using an index.<br />
What I call a &#8220;legacy query&#8221;: our grandchildren will see its results.</p>
<p>The execution plan in EM was a very long, sorry list of nested over nested over nested loops of million row FTS&#8230;<br />
But because the buffer cache was quite large, the whole lot fit in cache.  Hence, a LOT of CPU being used and negligible reads.<br />
And of course: lots of buffer cache waits and even more CPU waits!</p>
<p>Which prompted the supplier&#8217;s &#8220;Oracle guru&#8221; to *categorically* claim &#8220;you are starved of CPU &#8211; add more of it&#8221;.<br />
Of course, that got nipped in the bud.<br />
After a little judicious re-coding and trimming of totally unnecessary nested subqueries, the whole lot finished in around 10 minutes.  </p>
<p>More CPU.  Yeah! Right&#8230;<br />
(save us from &#8220;j2ee Oracle gurus&#8221;!&#8230;)</p>
]]></content:encoded>
	</item>
</channel>
</rss>
