Oracle Scratchpad

September 27, 2012

Row sizes

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 5:32 pm BST Sep 27,2012

A recent question on Oracle-L (a performance problem creating indexes that turned into an issue with migrated rows) prompted me to revisit a piece of code I first wrote about 20 years ago for an Oracle 6 system, and first published in November 2000. It addresses the question of analysing row-lengths in detail and, if you look at the example I’ve linked to, you’ll see that this could be very useful when you’re trying to work out suitable settings for pctfree and why you’re suffering from row migration.

The script is strictly limited to “simple” tables, by which I mean heap tables with columns that are basic data types and don’t include such things as Longs, LOBs, nested tables and all the other nasty things that usually break simple utilities. All it does is estimate the length of each row, then aggregating by row length. The estimate doesn’t allow for nulls at the ends of rows or columns longer than 254 bytes – technically the former don’t need length bytes and the latter use 3 bytes for the length when the column is more than 250 bytes. I don’t use the script often, and haven’t got around to including all the column types it won’t work for – that’s left as an exercise for the reader.

In the original example I just listed a script using the vsize() function being applied to a specific table; in this updated code I’ve switched to the sys_op_opnsize() function to measure the internal length of the column, and I’ve written a pl/sql block that generates the SQL from a table name (using the current schema) and then runs the code to produce the result:


define m_table='&1'

declare
	m_string	varchar2(32000) := 'select 5';	-- 3 if you want to ignore the row directory
	m_cursor	sys_refcursor;

	m_length	number;
	m_ct		number;
	m_space		number;
	m_total		number;
	m_ratio		number;

begin
	for r in (
		select
			column_name, data_type
		from	user_tab_columns
		where	table_name = upper('&m_table')
		and	data_type not in (
				'LONG',
				'LONG RAW',
				'CLOB',
				'NCLOB',
				'BLOB'		-- etc. etc.
			)
	) loop					-- add the lengths of each column
		m_string := m_string ||
				' + 1 + nvl(sys_op_opnsize(' ||
				r.column_name ||
				'),1)' ||
				chr(10)
		;

	end loop;

	--
	--	Use the preceding SQL in an inline view
	--

	m_string :=
			'select row_space, count(*) row_ct, row_space * count(*) sum_space, ' ||
			'sum (row_space * count(*)) over (order by row_space) running_tot, ' ||
			'round(100* ratio_to_report(row_space * count(*)) over(),2) ratio_to_rep ' ||
			' from (' ||
			m_string ||
			' row_space from &m_table ) group by row_space order by row_space'
	;

--	dbms_output.put_line(m_string);

	dbms_output.put_line(
		rpad('Length',8) ||
		rpad('Row Count',12)   ||
		rpad('Space',12) ||
		rpad('Running Total',14) ||
		rpad('Space pct',12)
	);

	dbms_output.put_line(
		rpad('-',7,'-') || ' ' ||
		rpad('-',11,'-') || ' ' ||
		rpad('-',11,'-') || ' ' ||
		rpad('-',13,'-') || ' ' ||
		rpad('-',11,'-')
	);

	open m_cursor for m_string;
	loop
		fetch m_cursor
		into
			m_length, m_ct, m_space, m_total, m_ratio
		;

		exit when m_cursor%notfound;

		dbms_output.put_line(
			lpad(to_char(m_length,'FM999,999'),7) || ' ' ||
			lpad(to_char(m_ct,'FM999,999,999'),11)   || ' ' ||
			lpad(to_char(m_space,'FM999,999,999'),11) || ' ' ||
			lpad(to_char(m_total,'FM999,999,999'),13) || ' ' ||
			lpad(to_char(m_ratio,'FM999,999.00'),11)
		);

	end loop;

	close m_cursor;

end;
/

Here’s a sample of the output, run against obj$ in a fairly standard install of 11.2.0.3 (with 58,773 rows in the table).

Length  Row Count   Space       Running Total Space pct
------- ----------- ----------- ------------- -----------
     73           1          73            73         .00
     74          13         962         1,035         .02
     75          11         825         1,860         .01
     76          18       1,368         3,228         .02
     77          32       2,464         5,692         .04
     78          62       4,836        10,528         .08
     79          85       6,715        17,243         .11
     80         242      19,360        36,603         .33
     81         348      28,188        64,791         .48
     82         368      30,176        94,967         .52
     83         385      31,955       126,922         .55
     84         418      35,112       162,034         .60
     85         508      43,180       205,214         .74
     86         584      50,224       255,438         .86
     87         600      52,200       307,638         .89
     88         808      71,104       378,742        1.22
     89         967      86,063       464,805        1.47
     90       1,155     103,950       568,755        1.78
     91       1,186     107,926       676,681        1.85
     92       1,441     132,572       809,253        2.27
     93       1,456     135,408       944,661        2.32
     94       1,659     155,946     1,100,607        2.67
     95       1,888     179,360     1,279,967        3.07
     96       2,206     211,776     1,491,743        3.63
     97       2,508     243,276     1,735,019        4.17
     98       2,612     255,976     1,990,995        4.38
     99       2,785     275,715     2,266,710        4.72
    100       2,934     293,400     2,560,110        5.02
    101       3,153     318,453     2,878,563        5.45
    102       3,705     377,910     3,256,473        6.47
    103       3,336     343,608     3,600,081        5.88
    104       3,827     398,008     3,998,089        6.82
    105      15,683   1,646,715     5,644,804       28.20
    106         579      61,374     5,706,178        1.05
    107         429      45,903     5,752,081         .79
    108         129      13,932     5,766,013         .24
    109         125      13,625     5,779,638         .23
    110          92      10,120     5,789,758         .17
    111         114      12,654     5,802,412         .22
    112          66       7,392     5,809,804         .13
    113          55       6,215     5,816,019         .11
    114          39       4,446     5,820,465         .08
    115          26       2,990     5,823,455         .05
    116          27       3,132     5,826,587         .05
    117          22       2,574     5,829,161         .04
    118          14       1,652     5,830,813         .03
    119           8         952     5,831,765         .02
    120          17       2,040     5,833,805         .03
    121           7         847     5,834,652         .01
    122           5         610     5,835,262         .01
    123          12       1,476     5,836,738         .03
    124           5         620     5,837,358         .01
    125           6         750     5,838,108         .01
    126           4         504     5,838,612         .01
    127           2         254     5,838,866         .00
    128           2         256     5,839,122         .00
    129           1         129     5,839,251         .00
    130           2         260     5,839,511         .00
    131           1         131     5,839,642         .00

PL/SQL procedure successfully completed.

You will appreciate, of course, that if the table definition has a very large number of columns, or if the variation in column lengths is extreme, then the report could run easily run to two or three pages, so you might want a slightly less accurate, more concise report. The required code patch is fairly straightforward, though you have to count bracket carefully. All you have to do is edit the text used in the outer select after you’ve generated the inline view with all the columns in it. Here, for example, is the assignment you need to aggregate the column sizes to the nearest 10 bytes, and the results showing the impact this had on my output:

	m_string :=
			'select 10 * round(row_space/10) row_space, count(*) row_ct, ' ||
			'10 * round(row_space/10) * count(*) sum_space, ' ||
			'sum (10 * round(row_space/10) * count(*)) over (order by 10 * round(row_space/10)) running_tot, ' ||
			'round(100* ratio_to_report(10 * round(row_space/10) * count(*)) over(),2) ratio_to_rep ' ||
			' from (' ||
			m_string ||
			' row_space from &m_table ) ' ||
			'group by 10 * round(row_space/10) order by 10 * round(row_space/10)'
	;

Length  Row Count   Space       Running Total Space pct
------- ----------- ----------- ------------- -----------
     70          14         980           980         .02
     80       1,969     157,520       158,500        2.67
     90      10,364     932,760     1,091,260       15.78
    100      28,954   2,895,400     3,986,660       48.99
    110      17,311   1,904,210     5,890,870       32.22
    120         143      17,160     5,908,030         .29
    130          18       2,340     5,910,370         .04

PL/SQL procedure successfully completed.

Don’t forget that this code is going to scan every column of every row in the table, applying a function to each column, so it’s likely to do a lot of work in a large table. If you want to reduce the workload then it would be perfectly reasonable to edit line 8 of the code above to insert a sample clause after the table name.

Once you’ve got some code like this, there’s another interesting game you can play with your table – but I’m saving that for another article.

3 Comments »

  1. Jonathan, why did you opt for an undocumented function sys_op_opnsize instead of the documented and well known vsize?

    Comment by Mladen Gogala — October 3, 2012 @ 8:24 pm BST Oct 3,2012 | Reply

    • Mladen,

      I made the change a long time ago, and didn’t leave a note in the script about it.

      I think I found a couple of cases where vsize() and sys_op_opnsize() gave different results, and vsize() was the wrong one, possibly with respect to nvarchar2() and other N-types – but I can’t be sure of that.

      Since the code was an occasional diagnostic for DBAs, rather than something that would be used for affecting data in normal production processing I didn’t mind using an undocumented function – especially one that Oracle uses every day for collecting stats on all column types.

      Comment by Jonathan Lewis — October 4, 2012 @ 2:35 pm BST Oct 4,2012 | Reply

  2. [...] an earlier post I showed you how you could generate SQL to analyze the distribution of row sizes in a table. In the [...]

    Pingback by Row sizes 2 | Oracle Scratchpad — April 13, 2013 @ 11:50 pm BST Apr 13,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Theme: Rubric. Get a free blog at WordPress.com

Follow

Get every new post delivered to your Inbox.

Join 4,115 other followers