<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Quiz Night</title>
	<atom:link href="http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/feed/" rel="self" type="application/rss+xml" />
	<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/</link>
	<description>Just another Oracle weblog</description>
	<lastBuildDate>Fri, 24 May 2013 13:27:07 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44695</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Thu, 26 Jan 2012 23:10:22 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44695</guid>
		<description><![CDATA[Valentin,
I think your first couple of statements summarise the informal argument about why we generally assume a two-column index will have a lower clustering_factor than a three-column index - the inclusion of the rowid in the index entry is very important, so if we have six rows with the same (two-column) key, three rows in each of two blocks (call them B1, B2) the block component of the index entries would be for (B1, B1, B1, B2, B2, B2). Adding an extra column to the index could then very easily change the order of visiting the rows to (B1, B2, B1, B2, B1, B2).

On the other hand, as you point out, there are patterns of data where an odd synchronisation of values and block locations could contradict the general intuition.]]></description>
		<content:encoded><![CDATA[<p>Valentin,<br />
I think your first couple of statements summarise the informal argument about why we generally assume a two-column index will have a lower clustering_factor than a three-column index &#8211; the inclusion of the rowid in the index entry is very important, so if we have six rows with the same (two-column) key, three rows in each of two blocks (call them B1, B2) the block component of the index entries would be for (B1, B1, B1, B2, B2, B2). Adding an extra column to the index could then very easily change the order of visiting the rows to (B1, B2, B1, B2, B1, B2).</p>
<p>On the other hand, as you point out, there are patterns of data where an odd synchronisation of values and block locations could contradict the general intuition.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44694</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Thu, 26 Jan 2012 23:00:20 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44694</guid>
		<description><![CDATA[Srivenu,

Your blog highlights an interesting side-effect of the flaw in the optimizer&#039;s model when it comes to using the clustering_factor. (It also fits quite nicely with a little posting I did recently about enhancing index statistics to include &lt;em&gt;&lt;strong&gt;&lt;a href=&quot;http://jonathanlewis.wordpress.com/2011/12/13/i-wish/&quot; rel=&quot;nofollow&quot;&gt;figures about every prefix&lt;/a&gt;&lt;/strong&gt;&lt;/em&gt; combination.)]]></description>
		<content:encoded><![CDATA[<p>Srivenu,</p>
<p>Your blog highlights an interesting side-effect of the flaw in the optimizer&#8217;s model when it comes to using the clustering_factor. (It also fits quite nicely with a little posting I did recently about enhancing index statistics to include <em><strong><a href="http://jonathanlewis.wordpress.com/2011/12/13/i-wish/" rel="nofollow">figures about every prefix</a></strong></em> combination.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44693</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Thu, 26 Jan 2012 22:56:53 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44693</guid>
		<description><![CDATA[Srivenu,
The mixture of local and global indexes is certainly a case where the side effects on clustering_factor can be very counter-intuitive. 

On a completely different tack, creating a hash-partitioned index on a non-partitioned table could also result in less contention for popular blocks on insertion, which can avoid a bug that causes indexes to become much bigger than they need to be. There&#039;s lots of room for discussion when looking at partitioned indexes.]]></description>
		<content:encoded><![CDATA[<p>Srivenu,<br />
The mixture of local and global indexes is certainly a case where the side effects on clustering_factor can be very counter-intuitive. </p>
<p>On a completely different tack, creating a hash-partitioned index on a non-partitioned table could also result in less contention for popular blocks on insertion, which can avoid a bug that causes indexes to become much bigger than they need to be. There&#8217;s lots of room for discussion when looking at partitioned indexes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44692</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Thu, 26 Jan 2012 22:40:05 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44692</guid>
		<description><![CDATA[Martin,
I wasn&#039;t thinking about bitmap indexes when I posed the question, but my first thought is that if (&#039;x&#039;,&#039;y&#039;) is a key value in the two-column index then it may have one index entry; if you add a third column you may have N index entries like (&#039;x&#039;,&#039;y&#039;,&#039;a&#039;), (&#039;x&#039;,&#039;y&#039;,&#039;b&#039;) - with a corresponding increase in the clustering_factor.  This won&#039;t change the cost of using the index, of course, since the clustering_factor of a bitmap index isn&#039;t used in the cost.]]></description>
		<content:encoded><![CDATA[<p>Martin,<br />
I wasn&#8217;t thinking about bitmap indexes when I posed the question, but my first thought is that if (&#039;x&#039;,&#039;y&#039;) is a key value in the two-column index then it may have one index entry; if you add a third column you may have N index entries like (&#039;x&#039;,&#039;y&#039;,&#039;a&#039;), (&#039;x&#039;,&#039;y&#039;,&#039;b&#039;) &#8211; with a corresponding increase in the clustering_factor.  This won&#8217;t change the cost of using the index, of course, since the clustering_factor of a bitmap index isn&#8217;t used in the cost.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44691</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Thu, 26 Jan 2012 22:34:21 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44691</guid>
		<description><![CDATA[Stefan,
As Valentin has pointed out, compression won&#039;t affect the &lt;em&gt;&lt;strong&gt;clustering_factor&lt;/strong&gt;, &lt;/em&gt;but it will affect the size (number of leaf blocks) of the index - and that does have some impact on the cost of using the index; so if you have different compression on the indexes you can get counter-intuitive results.]]></description>
		<content:encoded><![CDATA[<p>Stefan,<br />
As Valentin has pointed out, compression won&#8217;t affect the <em><strong>clustering_factor</strong>, </em>but it will affect the size (number of leaf blocks) of the index &#8211; and that does have some impact on the cost of using the index; so if you have different compression on the indexes you can get counter-intuitive results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44690</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Thu, 26 Jan 2012 22:32:03 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44690</guid>
		<description><![CDATA[Valentin,

Another good point - it is possible to find edge cases (particularly with a small number of index entries) where the order of data arrival can have a surprising impact. In your case the effect is also largely dependent on the great length of the table rows.]]></description>
		<content:encoded><![CDATA[<p>Valentin,</p>
<p>Another good point &#8211; it is possible to find edge cases (particularly with a small number of index entries) where the order of data arrival can have a surprising impact. In your case the effect is also largely dependent on the great length of the table rows.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: srivenu kadiyala</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44659</link>
		<dc:creator><![CDATA[srivenu kadiyala]]></dc:creator>
		<pubDate>Wed, 25 Jan 2012 12:59:19 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44659</guid>
		<description><![CDATA[I can think of a test case where a 2-column global index on a partitioned table could have a higher clustering factor than a superset 3-column local index. (I don&#039;t have access to an Oracle database right now to test it out).
srivenu]]></description>
		<content:encoded><![CDATA[<p>I can think of a test case where a 2-column global index on a partitioned table could have a higher clustering factor than a superset 3-column local index. (I don&#8217;t have access to an Oracle database right now to test it out).<br />
srivenu</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: srivenu kadiyala</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44658</link>
		<dc:creator><![CDATA[srivenu kadiyala]]></dc:creator>
		<pubDate>Wed, 25 Jan 2012 11:16:21 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44658</guid>
		<description><![CDATA[Jonathan,
I&#039;m working on a test case scenario (looks tricky!)
I posted something recently
http://srivenukadiyala.wordpress.com/2011/12/25/optimizer-might-ignore-a-more-suitable-superset-composite-index/
regards
srivenu]]></description>
		<content:encoded><![CDATA[<p>Jonathan,<br />
I&#8217;m working on a test case scenario (looks tricky!)<br />
I posted something recently<br />
<a href="http://srivenukadiyala.wordpress.com/2011/12/25/optimizer-might-ignore-a-more-suitable-superset-composite-index/" rel="nofollow">http://srivenukadiyala.wordpress.com/2011/12/25/optimizer-might-ignore-a-more-suitable-superset-composite-index/</a><br />
regards<br />
srivenu</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alexandru Ersenie</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44641</link>
		<dc:creator><![CDATA[Alexandru Ersenie]]></dc:creator>
		<pubDate>Tue, 24 Jan 2012 11:24:33 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44641</guid>
		<description><![CDATA[Guess it&#039;s all a question of math. 2 columns index, combination of two values. 3 columns index, combination of 3 values, taken by three. There goes your clustering factor...the last column increases the number of combinations possible. Of course compression will save something, but compression is used (correct me if I&#039;m wrong) for the prefix, not the suffix, meaning:
in a two column index, you can compress the first column
in a three column index (depending on the density of the second column) you can compress the first two columns. 

Still, that will not be the deciding factor.
Fact is: third column value has to be stored in the index. Clustering factor increases due to the number of combinations.
Maybe I&#039;m wrong, but that&#039;s how I see it.
Alex]]></description>
		<content:encoded><![CDATA[<p>Guess it&#8217;s all a question of math. 2 columns index, combination of two values. 3 columns index, combination of 3 values, taken by three. There goes your clustering factor&#8230;the last column increases the number of combinations possible. Of course compression will save something, but compression is used (correct me if I&#8217;m wrong) for the prefix, not the suffix, meaning:<br />
in a two column index, you can compress the first column<br />
in a three column index (depending on the density of the second column) you can compress the first two columns. </p>
<p>Still, that will not be the deciding factor.<br />
Fact is: third column value has to be stored in the index. Clustering factor increases due to the number of combinations.<br />
Maybe I&#8217;m wrong, but that&#8217;s how I see it.<br />
Alex</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael vR</title>
		<link>http://jonathanlewis.wordpress.com/2012/01/19/quiz-night-16/#comment-44563</link>
		<dc:creator><![CDATA[Michael vR]]></dc:creator>
		<pubDate>Fri, 20 Jan 2012 09:11:37 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?p=8145#comment-44563</guid>
		<description><![CDATA[For a bulk loaded table the CF for the 2-column index should be at least as good as the CF of the 3-column index.
If the data is not bulk loaded and the third column increases for each insert (for example insertion date) I would expect a better CF for the 3-column index.]]></description>
		<content:encoded><![CDATA[<p>For a bulk loaded table the CF for the 2-column index should be at least as good as the CF of the 3-column index.<br />
If the data is not bulk loaded and the third column increases for each insert (for example insertion date) I would expect a better CF for the 3-column index.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
