One thing you (ought to) learn very early on in an Oracle career is that eventually you’ll come across a case you haven’t previously considered. It’s a phenomenon that often signals the downfall of the typical “I found it on the internet” query. Here’s one (heavily paraphrased) example that appeared on the OTN database forum a few days ago:
select table_name,round((blocks*8),2)||’kb’ “size” from user_tables where table_name = ‘MYTABLE’;
select table_name,round((num_rows*avg_row_len/1024),2)||’kb’ “size” from user_tables where table_name = ‘MYTABLE’;
The result from the first query is 704 kb, the result from the second is 25.4 kb … fragmentation, rebuild, CTAS etc. etc.
The two queries are perfectly reasonable approximations (assuming an 8KB block size with pctfree set to zero and up-to-date stats) for the formatted space and actual data size of a simple heap table – and since the two values here don’t come close to matching it’s perfectly reasonable to consider doing something like a rebuild or shrink space to reclaim space and (perhaps) to improve performance.
In this case it doesn’t look as if the space reclaimed is likely to be huge (less than 1MB), on the other hand it’s probably not going to take much time to rebuild such a tiny table; it doesn’t seem likely that the rebuild could make a significant difference to performance (though apparently it did), but the act of rebuilding might cause execution plans to change for the better because new statistics might appear as the rebuild took place. The figures came from a test system, though, so maybe the table on the production system was much larger and the impact would be greater.
Being cautious about wasting time and introducing risk, I made a few comments about the question – and learned that one of the columns was of type SDO_GEOMETRY. This makes a big difference about what to do next because dbms_stats.gather_table_stats() doesn’t process such columns correctly and produces a massive under-estimate for the avg_row_len (which is roughly the sum of avg_col_len for the table). Here’s an example (run on 12.1.0.2, based on some code to create an SDO Geometry object taken from the 10gR2 manuals):
rem rem Script: sdo_sizing.sql rem Author: Jonathan Lewis rem Dated: Jan 2015 rem rem Last tested rem 19.11.0.0 rem 12.1.0.2 rem drop table cola_markets purge; CREATE TABLE cola_markets ( mkt_id NUMBER, name VARCHAR2(32), shape SDO_GEOMETRY); INSERT INTO cola_markets VALUES( 1, 'cola_a', SDO_GEOMETRY( 2003, -- two-dimensional polygon NULL, NULL, SDO_ELEM_INFO_ARRAY(1,1003,3), -- one rectangle (1003 = exterior) SDO_ORDINATE_ARRAY(1,1, 5,7) -- only 2 points needed to -- define rectangle (lower left and upper right) with -- Cartesian-coordinate data ) ); insert into cola_markets select * from cola_markets; / / / / / / / / / execute dbms_stats.gather_table_stats(user,'cola_markets') select avg_row_len, num_rows, blocks, round(avg_row_len * num_rows / 7200,0) expected_blocks from user_tables where table_name = 'COLA_MARKETS' ; analyze table cola_markets compute statistics; select avg_row_len, num_rows, blocks, round(avg_row_len * num_rows / 7200,0) expected_blocks from user_tables where table_name = 'COLA_MARKETS' ;
If you care to count the number of times I execute the “insert as select” it’s 10, so the table ends up with 2^10 = 1024 rows. The figure 7,200 in the expected_blocks calculation converts bytes to approximate blocks on the assumption of 8KB blocks and pctfree = 10. Here are the results from the two different methods for generating object statistics:
PL/SQL procedure successfully completed. AVG_ROW_LEN NUM_ROWS BLOCKS EXPECTED_BLOCKS ----------- ---------- ---------- --------------- 14 1024 124 2 Table analyzed. AVG_ROW_LEN NUM_ROWS BLOCKS EXPECTED_BLOCKS ----------- ---------- ---------- --------------- 109 1024 124 16
Where does the difference in expected_blocks come from? (The blocks figures is 124 because I’ve used 1MB uniform extents – 128 blocks – under ASSM (which means 4 space management blocks at the start of the first extent.)
Here are the column lengths after the call to dbms_stats: as you can see the avg_row_len is the sum of avg_col_len.
select column_name, data_type, avg_col_len from user_tab_cols where table_name = 'COLA_MARKETS' order by column_id ; COLUMN_NAME DATA_TYPE AVG_COL_LEN -------------------- ------------------------ ----------- MKT_ID NUMBER 3 NAME VARCHAR2 7 SYS_NC00010$ SDO_ORDINATE_ARRAY SHAPE SDO_GEOMETRY SYS_NC00008$ NUMBER 0 SYS_NC00004$ NUMBER 4 SYS_NC00005$ NUMBER 0 SYS_NC00006$ NUMBER 0 SYS_NC00007$ NUMBER 0 SYS_NC00009$ SDO_ELEM_INFO_ARRAY
The figures from the analyze command are only slightly different, but fortunately the analyze command uses the row directory pointers to calculate the actual row allocation, so picks up information about the impact of inline varrays, LOBs, etc. that the dbms_stats call might not be able to handle.
COLUMN_NAME DATA_TYPE AVG_COL_LEN -------------------- ------------------------ ----------- MKT_ID NUMBER 2 NAME VARCHAR2 6 SYS_NC00010$ SDO_ORDINATE_ARRAY SHAPE SDO_GEOMETRY SYS_NC00008$ NUMBER 1 SYS_NC00004$ NUMBER 3 SYS_NC00005$ NUMBER 1 SYS_NC00006$ NUMBER 1 SYS_NC00007$ NUMBER 1 SYS_NC00009$ SDO_ELEM_INFO_ARRAY
As a basic reminder – whenever you do anything slightly non-trivial (e.g. something you couldn’t have done in version 6, say) you need to remember that all those simple little scripts you find on the Internet might not actually cover your particular case.
Update (July 2021)
I’ve just re-run the test script on version 19.11.0.0, and the results haven’t changed.
[…] on from a recent “check the space” posting, here’s another case of the code not reporting what you thought it would, prompted by […]
Pingback by LOB Space | Oracle Scratchpad — January 21, 2015 @ 1:27 pm GMT Jan 21,2015 |
[…] Another example of “special” features messing up traditional space calculations […]
Pingback by Fragmentation 1 | Oracle Scratchpad — March 2, 2021 @ 6:35 pm GMT Mar 2,2021 |