<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Index Sizing</title>
	<atom:link href="http://jonathanlewis.wordpress.com/index-sizing/feed/" rel="self" type="application/rss+xml" />
	<link>http://jonathanlewis.wordpress.com</link>
	<description>Just another Oracle weblog</description>
	<lastBuildDate>Sun, 26 May 2013 02:13:39 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-52999</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Fri, 18 Jan 2013 12:59:20 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-52999</guid>
		<description><![CDATA[Morp,

The extra byte appears only once per index entry (it&#039;s the byte that says the rowid is 6, or 10, bytes long), so we add it to the accumulated length just once.]]></description>
		<content:encoded><![CDATA[<p>Morp,</p>
<p>The extra byte appears only once per index entry (it&#8217;s the byte that says the rowid is 6, or 10, bytes long), so we add it to the accumulated length just once.</p>
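<p>As a rough illustration of why the term is added rather than multiplied, here is a minimal Python sketch of the quoted expression. The function name and the sample figures (1,000 entries, 6-byte rowids) are assumptions made for the example and do not come from the original script; the constant 4 is simply copied from the expression as quoted.</p>

```python
def per_entry_overhead_bytes(num_rows, rowid_length, uniq_ind):
    """Evaluate ind.num_rows * (tab.rowid_length + ind.uniq_ind + 4).

    uniq_ind is 0 or 1: the single byte that records the rowid length.
    It is a per-entry constant, so it is added to the rowid length.
    """
    return num_rows * (rowid_length + uniq_ind + 4)


# Hypothetical figures: 1,000 index entries with 6-byte rowids.
with_length_byte = per_entry_overhead_bytes(1000, 6, 1)
without_length_byte = per_entry_overhead_bytes(1000, 6, 0)

# The difference is exactly one byte per index entry.
print(with_length_byte, without_length_byte, with_length_byte - without_length_byte)
```

<p>Multiplying by the rowid length instead would charge 6 (or 10) bytes per entry for something that occupies a single byte.</p>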
]]></content:encoded>
	</item>
	<item>
		<title>By: Morp</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-52995</link>
		<dc:creator><![CDATA[Morp]]></dc:creator>
		<pubDate>Fri, 18 Jan 2013 12:27:37 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-52995</guid>
		<description><![CDATA[&gt; ind.num_rows * (tab.rowid_length + ind.uniq_ind + 4)

Why is ind.uniq_ind not multiplied by tab.rowid_length?]]></description>
		<content:encoded><![CDATA[<p>&gt; ind.num_rows * (tab.rowid_length + ind.uniq_ind + 4)</p>
<p>Why is ind.uniq_ind not multiplied by tab.rowid_length?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: carlossierrausa</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-45859</link>
		<dc:creator><![CDATA[carlossierrausa]]></dc:creator>
		<pubDate>Sun, 01 Apr 2012 11:57:48 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-45859</guid>
		<description><![CDATA[Jonathan,

It makes perfect sense now. I had overlooked that avg_col_len already considered nulls. Thanks for the clarification!

Carlos]]></description>
		<content:encoded><![CDATA[<p>Jonathan,</p>
<p>It makes perfect sense now. I had overlooked that avg_col_len already considered nulls. Thanks for the clarification!</p>
<p>Carlos</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-45829</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Sat, 31 Mar 2012 08:40:17 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-45829</guid>
		<description><![CDATA[Carlos,

The choice of tab.num_rows was deliberate, and based on the fact that avg_col_len already makes allowance for nulls.

Every index entry has a rowid, lock byte, directory entry etc., and that&#039;s why I have a component that uses ind.num_rows at line 162; but the &quot;data&quot; content of the index is dictated by the data content of the table, irrespective of how the columns are distributed across index entries. It&#039;s probably best to see this through an example (engineered to be an extreme case):
[sourcecode gutter=&quot;false&quot;]
create table t1(v1 char(6), v2 char(6));
insert into t1
select 
        decode(mod(rownum,2),0,&#039;xxxxxx&#039;),
        &#039;xxxxxx&#039;
from
        all_objects
where   rownum &lt;= 1000
;

execute dbms_stats.gather_table_stats(user,&#039;t1&#039;)

select  column_name, avg_col_len
from    user_tab_cols
where   table_name = &#039;T1&#039;
;

COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                             4
V2                             7

2 rows selected.
[/sourcecode]

Although every populated column has a data content length of 6 bytes, the average (content) is 3 for v1 because half the rows are null. The reported length includes one extra byte for the length byte that precedes the column content, which is why we see 4 and 7 rather than 3 and 6 in the column stats.

If I had an index on just v1, then ind.num_rows would be 500 because only half the table rows have a value for v1; so if I multiplied the average column length by the number of index entries I would have applied the &quot;half the rows in the table&quot; factor twice. By using tab.num_rows I scale the average column length back up to &quot;total space taken by column&quot; in both the table segment and index segment.

The choice does introduce an error, of course, in that the average column length is rounded up to an integer, leading to an over-estimate. There&#039;s a further error due to the addition of the length byte, which I could correct if I used (avg_col_len - 1) multiplied by tab.num_rows and then re-introduced the length byte as ind.num_rows * &quot;number of columns in index&quot;.]]></description>
		<content:encoded><![CDATA[<p>Carlos,</p>
<p>The choice of tab.num_rows was deliberate, and based on the fact that avg_col_len already makes allowance for nulls.</p>
<p>Every index entry has a rowid, lock byte, directory entry etc., and that&#8217;s why I have a component that uses ind.num_rows at line 162; but the &#8220;data&#8221; content of the index is dictated by the data content of the table, irrespective of how the columns are distributed across index entries. It&#8217;s probably best to see this through an example (engineered to be an extreme case):</p>
<pre class="brush: plain; gutter: false; title: ; notranslate">
create table t1(v1 char(6), v2 char(6));
insert into t1
select 
        decode(mod(rownum,2),0,'xxxxxx'),
        'xxxxxx'
from
        all_objects
where   rownum &lt;= 1000
;

execute dbms_stats.gather_table_stats(user,'t1')

select  column_name, avg_col_len
from    user_tab_cols
where   table_name = 'T1'
;

COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                             4
V2                             7

2 rows selected.
</pre>
<p>Although every populated column has a data content length of 6 bytes, the average (content) is 3 for v1 because half the rows are null. The reported length includes one extra byte for the length byte that precedes the column content, which is why we see 4 and 7 rather than 3 and 6 in the column stats.</p>
<p>If I had an index on just v1, then ind.num_rows would be 500 because only half the table rows have a value for v1; so if I multiplied the average column length by the number of index entries I would have applied the &#8220;half the rows in the table&#8221; factor twice. By using tab.num_rows I scale the average column length back up to &#8220;total space taken by column&#8221; in both the table segment and index segment.</p>
<p>The choice does introduce an error, of course, in that the average column length is rounded up to an integer, leading to an over-estimate. There&#8217;s a further error due to the addition of the length byte, which I could correct if I used (avg_col_len &#8211; 1) multiplied by tab.num_rows and then re-introduced the length byte as ind.num_rows * &#8220;number of columns in index&#8221;.</p>
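<p>The arithmetic behind the t1 example can be sketched in a few lines of Python. This is only an illustration: the variable names are invented for the sketch, and the "actual" figure assumes that each non-null v1 entry stores 6 bytes of content plus one length byte, as described above.</p>

```python
# Figures from the t1 example: 1,000 table rows, v1 null in every other row.
tab_num_rows = 1000
ind_num_rows = 500       # a single-column index on v1 holds only the non-null rows
avg_col_len_v1 = 4       # as reported in user_tab_cols, already averaged over nulls

# Assumed actual column data in an index on v1:
# one 6-byte value plus one length byte per index entry.
actual_bytes = ind_num_rows * (6 + 1)

# Scaling by table rows recovers the total space taken by the column
# (slightly high, because avg_col_len is rounded up to an integer).
estimate_using_tab_rows = tab_num_rows * avg_col_len_v1

# Scaling by index rows applies the "half the rows are null" factor twice.
estimate_using_ind_rows = ind_num_rows * avg_col_len_v1

print(actual_bytes, estimate_using_tab_rows, estimate_using_ind_rows)
```

<p>Under these assumptions the tab.num_rows version over-estimates slightly, while the ind.num_rows version comes out at little more than half the space the column data needs.</p>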
]]></content:encoded>
	</item>
	<item>
		<title>By: Carlos Sierra</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-45811</link>
		<dc:creator><![CDATA[Carlos Sierra]]></dc:creator>
		<pubDate>Fri, 30 Mar 2012 21:44:44 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-45811</guid>
		<description><![CDATA[Hello Jonathan,
On line 165 above, instead of tab.num_rows, I think we should have ind.num_rows. The rationale is to account for NULLs. Let&#039;s say the index is on a single column and half the table rows have a null in this column; the index would then have half as many rows as the table. Similarly for a composite index when all its columns are null. What do you think?
Thanks -- Carlos]]></description>
		<content:encoded><![CDATA[<p>Hello Jonathan,<br />
On line 165 above, instead of tab.num_rows, I think we should have ind.num_rows. The rationale is to account for NULLs. Let&#8217;s say the index is on a single column and half the table rows have a null in this column; the index would then have half as many rows as the table. Similarly for a composite index when all its columns are null. What do you think?<br />
Thanks &#8212; Carlos</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-44732</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Sat, 28 Jan 2012 20:18:47 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-44732</guid>
		<description><![CDATA[Ajeet,

1. I have an entire category on rebuilding indexes: http://jonathanlewis.wordpress.com/category/oracle/indexing/index-rebuilds/

2. The best answer to that question is in my book Cost Based Oracle - Fundamentals: I don&#039;t have space to repeat it all in a comment, but as a starting point here&#039;s a link to a short article I wrote several years ago: http://www.jlcomp.demon.co.uk/12_using_index_i.html]]></description>
		<content:encoded><![CDATA[<p>Ajeet,</p>
<p>1. I have an entire category on rebuilding indexes: <a href="http://jonathanlewis.wordpress.com/category/oracle/indexing/index-rebuilds/" rel="nofollow">http://jonathanlewis.wordpress.com/category/oracle/indexing/index-rebuilds/</a></p>
<p>2. The best answer to that question is in my book Cost Based Oracle &#8211; Fundamentals: I don&#8217;t have space to repeat it all in a comment, but as a starting point here&#8217;s a link to a short article I wrote several years ago: <a href="http://www.jlcomp.demon.co.uk/12_using_index_i.html" rel="nofollow">http://www.jlcomp.demon.co.uk/12_using_index_i.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ajeet</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-44730</link>
		<dc:creator><![CDATA[Ajeet]]></dc:creator>
		<pubDate>Sat, 28 Jan 2012 17:18:31 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-44730</guid>
		<description><![CDATA[Jonathan,

I want to ask two questions related to B*tree indexes:

1. Which statistics indicate that an index needs to be rebuilt? I have learned that there are rare cases where an index rebuild can actually help; can you please give a few cases where it really helps?
2. Which statistics does the optimizer use to decide on one index over another on a table?

Thanks
Ajeet]]></description>
		<content:encoded><![CDATA[<p>Jonathan,</p>
<p>I want to ask two questions related to B*tree indexes:</p>
<p>1. Which statistics indicate that an index needs to be rebuilt? I have learned that there are rare cases where an index rebuild can actually help; can you please give a few cases where it really helps?<br />
2. Which statistics does the optimizer use to decide on one index over another on a table?</p>
<p>Thanks<br />
Ajeet</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jcon.no: Oracle Blog - When to rebuild an Oracle index?</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-44066</link>
		<dc:creator><![CDATA[jcon.no: Oracle Blog - When to rebuild an Oracle index?]]></dc:creator>
		<pubDate>Thu, 05 Jan 2012 10:27:40 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-44066</guid>
		<description><![CDATA[[...] article also includes some related links. The first link is to a script that checks all the indexes for a specific schema. By the way &#8211; DO READ the notes in the [...]]]></description>
		<content:encoded><![CDATA[<p>[...] article also includes some related links. The first link is to a script that checks all the indexes for a specific schema. By the way &#8211; DO READ the notes in the [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Index Size &#171; Oracle Scratchpad</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-41456</link>
		<dc:creator><![CDATA[Index Size &#171; Oracle Scratchpad]]></dc:creator>
		<pubDate>Fri, 26 Aug 2011 06:31:50 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-41456</guid>
		<description><![CDATA[[...] explain plan for every index in your system &#8211; so in a future note I&#8217;ll be showing you a simple piece of code that you might feel happy to run against every (simple, B-tree) index in your database. [...]]]></description>
		<content:encoded><![CDATA[<p>[...] explain plan for every index in your system &#8211; so in a future note I&#8217;ll be showing you a simple piece of code that you might feel happy to run against every (simple, B-tree) index in your database. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Lewis</title>
		<link>http://jonathanlewis.wordpress.com/index-sizing/#comment-35665</link>
		<dc:creator><![CDATA[Jonathan Lewis]]></dc:creator>
		<pubDate>Tue, 02 Mar 2010 20:48:34 +0000</pubDate>
		<guid isPermaLink="false">http://jonathanlewis.wordpress.com/?page_id=3063#comment-35665</guid>
		<description><![CDATA[Padders,

Thanks for the extra information; it&#039;s always useful to hear about the special cases. I&#039;ll leave it as an exercise for someone else to test, though.

One thought, if anyone does want to follow this - my general strategy for handling &quot;rare&quot; cases is to write a separate script; it&#039;s much safer and simpler than trying to write a &quot;one script does everything ... but no-one dare change it&quot; type of thing.]]></description>
		<content:encoded><![CDATA[<p>Padders,</p>
<p>Thanks for the extra information; it&#8217;s always useful to hear about the special cases. I&#8217;ll leave it as an exercise for someone else to test, though.</p>
<p>One thought, if anyone does want to follow this &#8211; my general strategy for handling &#8220;rare&#8221; cases is to write a separate script; it&#8217;s much safer and simpler than trying to write a &#8220;one script does everything &#8230; but no-one dare change it&#8221; type of thing.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
