A recent question on Oracle-L (a performance problem creating indexes that turned into an issue with migrated rows) prompted me to revisit a piece of code I first wrote about 20 years ago for an Oracle 6 system, and first published in November 2000. It addresses the question of analysing row-lengths in detail and, if you look at the example I’ve linked to, you’ll see that this could be very useful when you’re trying to work out suitable settings for pctfree and why you’re suffering from row migration.
The script is strictly limited to “simple” tables, by which I mean heap tables with columns that are basic data types and don’t include such things as Longs, LOBs, nested tables and all the other nasty things that usually break simple utilities. All it does is estimate the length of each row, then aggregating by row length. The estimate doesn’t allow for nulls at the ends of rows or columns longer than 254 bytes – technically the former don’t need length bytes and the latter use 3 bytes for the length when the column is more than 250 bytes. I don’t use the script often, and haven’t got around to including all the column types it won’t work for – that’s left as an exercise for the reader.
In the original example I just listed a script using the vsize() function being applied to a specific table; in this updated code I’ve switched to the sys_op_opnsize() function to measure the internal length of the column, and I’ve written a pl/sql block that generates the SQL from a table name (using the current schema) and then runs the code to produce the result:
rem rem Script: row_size_2.sql rem Author: Jonathan Lewis rem Dated: Mar 2004 rem define m_table='&1' declare m_string varchar2(32000) := 'select 5'; -- 3 if you want to ignore the row directory m_cursor sys_refcursor; m_length number; m_ct number; m_space number; m_total number; m_ratio number; begin for r in ( select column_name, data_type from user_tab_columns where table_name = upper('&m_table') and data_type not in ( 'LONG', 'LONG RAW', 'CLOB', 'NCLOB', 'BLOB' -- etc. etc. ) ) loop -- add the lengths of each column m_string := m_string || ' + 1 + nvl(sys_op_opnsize(' || r.column_name || '),1)' || chr(10) ; end loop; -- -- Use the preceding SQL in an inline view -- m_string := 'select row_space, count(*) row_ct, row_space * count(*) sum_space, ' || 'sum (row_space * count(*)) over (order by row_space) running_tot, ' || 'round(100* ratio_to_report(row_space * count(*)) over(),2) ratio_to_rep ' || ' from (' || m_string || ' row_space from &m_table ) group by row_space order by row_space' ; -- dbms_output.put_line(m_string); dbms_output.put_line( rpad('Length',8) || rpad('Row Count',12) || rpad('Space',12) || rpad('Running Total',14) || rpad('Space pct',12) ); dbms_output.put_line( rpad('-',7,'-') || ' ' || rpad('-',11,'-') || ' ' || rpad('-',11,'-') || ' ' || rpad('-',13,'-') || ' ' || rpad('-',11,'-') ); open m_cursor for m_string; loop fetch m_cursor into m_length, m_ct, m_space, m_total, m_ratio ; exit when m_cursor%notfound; dbms_output.put_line( lpad(to_char(m_length,'FM999,999'),7) || ' ' || lpad(to_char(m_ct,'FM999,999,999'),11) || ' ' || lpad(to_char(m_space,'FM999,999,999'),11) || ' ' || lpad(to_char(m_total,'FM999,999,999'),13) || ' ' || lpad(to_char(m_ratio,'FM999,999.00'),11) ); end loop; close m_cursor; end; /
Here’s a sample of the output, run against obj$ in a fairly standard install of 11.2.0.3 (with 58,773 rows in the table).
Length Row Count Space Running Total Space pct ------- ----------- ----------- ------------- ----------- 73 1 73 73 .00 74 13 962 1,035 .02 75 11 825 1,860 .01 76 18 1,368 3,228 .02 77 32 2,464 5,692 .04 78 62 4,836 10,528 .08 79 85 6,715 17,243 .11 80 242 19,360 36,603 .33 81 348 28,188 64,791 .48 82 368 30,176 94,967 .52 83 385 31,955 126,922 .55 84 418 35,112 162,034 .60 85 508 43,180 205,214 .74 86 584 50,224 255,438 .86 87 600 52,200 307,638 .89 88 808 71,104 378,742 1.22 89 967 86,063 464,805 1.47 90 1,155 103,950 568,755 1.78 91 1,186 107,926 676,681 1.85 92 1,441 132,572 809,253 2.27 93 1,456 135,408 944,661 2.32 94 1,659 155,946 1,100,607 2.67 95 1,888 179,360 1,279,967 3.07 96 2,206 211,776 1,491,743 3.63 97 2,508 243,276 1,735,019 4.17 98 2,612 255,976 1,990,995 4.38 99 2,785 275,715 2,266,710 4.72 100 2,934 293,400 2,560,110 5.02 101 3,153 318,453 2,878,563 5.45 102 3,705 377,910 3,256,473 6.47 103 3,336 343,608 3,600,081 5.88 104 3,827 398,008 3,998,089 6.82 105 15,683 1,646,715 5,644,804 28.20 106 579 61,374 5,706,178 1.05 107 429 45,903 5,752,081 .79 108 129 13,932 5,766,013 .24 109 125 13,625 5,779,638 .23 110 92 10,120 5,789,758 .17 111 114 12,654 5,802,412 .22 112 66 7,392 5,809,804 .13 113 55 6,215 5,816,019 .11 114 39 4,446 5,820,465 .08 115 26 2,990 5,823,455 .05 116 27 3,132 5,826,587 .05 117 22 2,574 5,829,161 .04 118 14 1,652 5,830,813 .03 119 8 952 5,831,765 .02 120 17 2,040 5,833,805 .03 121 7 847 5,834,652 .01 122 5 610 5,835,262 .01 123 12 1,476 5,836,738 .03 124 5 620 5,837,358 .01 125 6 750 5,838,108 .01 126 4 504 5,838,612 .01 127 2 254 5,838,866 .00 128 2 256 5,839,122 .00 129 1 129 5,839,251 .00 130 2 260 5,839,511 .00 131 1 131 5,839,642 .00 PL/SQL procedure successfully completed.
You will appreciate, of course, that if the table definition has a very large number of columns, or if the variation in column lengths is extreme, then the report could run easily run to two or three pages, so you might want a slightly less accurate, more concise report. The required code patch is fairly straightforward, though you have to count bracket carefully. All you have to do is edit the text used in the outer select after you’ve generated the inline view with all the columns in it. Here, for example, is the assignment you need to aggregate the column sizes to the nearest 10 bytes, and the results showing the impact this had on my output:
m_string := 'select 10 * round(row_space/10) row_space, count(*) row_ct, ' || '10 * round(row_space/10) * count(*) sum_space, ' || 'sum (10 * round(row_space/10) * count(*)) over (order by 10 * round(row_space/10)) running_tot, ' || 'round(100* ratio_to_report(10 * round(row_space/10) * count(*)) over(),2) ratio_to_rep ' || ' from (' || m_string || ' row_space from &m_table ) ' || 'group by 10 * round(row_space/10) order by 10 * round(row_space/10)' ; Length Row Count Space Running Total Space pct ------- ----------- ----------- ------------- ----------- 70 14 980 980 .02 80 1,969 157,520 158,500 2.67 90 10,364 932,760 1,091,260 15.78 100 28,954 2,895,400 3,986,660 48.99 110 17,311 1,904,210 5,890,870 32.22 120 143 17,160 5,908,030 .29 130 18 2,340 5,910,370 .04 PL/SQL procedure successfully completed.
Don’t forget that this code is going to scan every column of every row in the table, applying a function to each column, so it’s likely to do a lot of work in a large table. If you want to reduce the workload then it would be perfectly reasonable to edit line 8 of the code above to insert a sample clause after the table name.
Once you’ve got some code like this, there’s another interesting game you can play with your table – but I’m saving that for another article.
Jonathan, why did you opt for an undocumented function sys_op_opnsize instead of the documented and well known vsize?
Comment by Mladen Gogala — October 3, 2012 @ 8:24 pm BST Oct 3,2012 |
Mladen,
I made the change a long time ago, and didn’t leave a note in the script about it.
I think I found a couple of cases where vsize() and sys_op_opnsize() gave different results, and vsize() was the wrong one, possibly with respect to nvarchar2() and other N-types – but I can’t be sure of that.
Since the code was an occasional diagnostic for DBAs, rather than something that would be used for affecting data in normal production processing I didn’t mind using an undocumented function – especially one that Oracle uses every day for collecting stats on all column types.
Comment by Jonathan Lewis — October 4, 2012 @ 2:35 pm BST Oct 4,2012 |
[…] an earlier post I showed you how you could generate SQL to analyze the distribution of row sizes in a table. In the […]
Pingback by Row sizes 2 | Oracle Scratchpad — April 13, 2013 @ 11:50 pm BST Apr 13,2013 |
[…] changes over time) to analyse the content of tables in some detail. The first was a breakdown of the lengths of the rows in the table, the second was a map showing the distribution of nulls in the rows. There used to be […]
Pingback by Row sizes 3 | Oracle Scratchpad — November 26, 2020 @ 12:08 pm GMT Nov 26,2020 |
[…] Row sizes – pt1 (Sep 2012): a little hacker to generate a “histogram” showing the distribution of row sizes in a table. […]
Pingback by Infrastructure Catalogue | Oracle Scratchpad — July 23, 2022 @ 9:19 pm BST Jul 23,2022 |