I have never been keen on the option to “shrink space” for a table because of the negative impact it can have on performance.
I don’t seem to have written about it in the blog but I think there’s something in one of my books pointing out that the command moves data from the “end” of the table (high extent ids) to the “start” of the table (low extent ids) by scanning the table backwards to find data that can be moved and scanning forwards to find space to put it. This strategy can have the effect of increasing the scattering of the data that you’re interested in querying if most of your queries are about “recent” data, and you have a pattern of slowing deleting aging data. (You may end up doing a range scan through a couple of hundred table blocks for data at the start of the table that was once packed into a few blocks near the end of the table.)
In a discussion with a member of the audience at the recent DOAG conference (we were talking about execution plans for queries that included filter subqueries) I suddenly thought of another reason why (for an unlucky person) the shrink space command could be a disaster – here’s a little fragment of code and output to demonstrate the point.
rem rem Script: shrink_scalar_subq.sql rem Author: Jonathan Lewis rem Dated: Nov 2018 rem Purpose: rem rem Versions tested rem 12.2.0.1 rem select /*+ gather_plan_statistics pre-shrink */ count(*) from ( select /*+ no_merge */ outer.* from emp outer where outer.sal > ( select /*+ no_unnest */ avg(inner.sal) from emp inner where inner.dept_no = outer.dept_no ) ) ; alter table emp enable row movement; alter table emp shrink space compact; select /*+ gather_plan_statistics post-shrink */ count(*) from ( select /*+ no_merge */ outer.* from emp outer where outer.sal > ( select /*+ no_unnest */ avg(inner.sal) from emp inner where inner.dept_no = outer.dept_no ) ) ;
The two queries are the same and the execution plans are the same (the shrink command doesn’t change the object statistics, after all), but the execution time jumped from 0.05 seconds to 9.43 seconds – and the difference in timing wasn’t about delayed block cleanout or other exotic side effects.
COUNT(*) ---------- 9498 Elapsed: 00:00:00.05 COUNT(*) ---------- 9498 Elapsed: 00:00:09.43
The query is engineered to have a problem, of course, and enabling rowsource execution statistics exaggerates the anomaly – but the threat is genuine. You may have seen my posting (now 12 years old) about the effects of scalar subquery caching – this is another example of the wrong item of data appearing in the wrong place making us lose the caching benefit. The emp table I’ve used here is (nearly) the same emp table I used in the 2006 posting, but the difference between this case and the previous case is that I updated a carefully selected row to an unlucky value in 2006, but here in 2018 the side effects of a call to shrink space moved a row from the end of the table (where it was doing no harm) to the start of the table (where it had a disastrous impact).
Here are the two execution plans – before and after the shrink space – showing the rowsource execution stats. Note particularly the number of times the filter subquery ran – jumping from 7 to 3172 – the impact this has on the buffer gets, and the change in time recorded:
---------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.03 | 1880 | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.03 | 1880 | | 2 | VIEW | | 1 | 136 | 9498 |00:00:00.03 | 1880 | |* 3 | FILTER | | 1 | | 9498 |00:00:00.03 | 1880 | | 4 | TABLE ACCESS FULL | EMP | 1 | 19001 | 19001 |00:00:00.01 | 235 | | 5 | SORT AGGREGATE | | 7 | 1 | 7 |00:00:00.02 | 1645 | |* 6 | TABLE ACCESS FULL| EMP | 7 | 2714 | 19001 |00:00:00.02 | 1645 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("OUTER"."SAL">) 6 - filter("INNER"."DEPT_NO"=:B1) ---------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:09.42 | 745K| | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:09.42 | 745K| | 2 | VIEW | | 1 | 136 | 9498 |00:00:11.71 | 745K| |* 3 | FILTER | | 1 | | 9498 |00:00:11.70 | 745K| | 4 | TABLE ACCESS FULL | EMP | 1 | 19001 | 19001 |00:00:00.01 | 235 | | 5 | SORT AGGREGATE | | 3172 | 1 | 3172 |00:00:09.40 | 745K| |* 6 | TABLE ACCESS FULL| EMP | 3172 | 2714 | 10M|00:00:04.33 | 745K| ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("OUTER"."SAL">) 6 - filter("INNER"."DEPT_NO"=:B1)
Footnote:
For completeness, here’s the code to generate the emp table. It’s sitting in a tablespace using system managed extents and automatic segment space management.
create table emp( dept_no not null, sal, emp_no not null, padding, constraint e_pk primary key(emp_no) ) as with generator as ( select null from dual connect by level <= 1e4 -- > comment to avoid wordpress format issue ) select mod(rownum,6), rownum, rownum, rpad('x',60) from generator v1, generator v2 where rownum <= 2e4 -- > comment to avoid wordpress format issue ; insert into emp values(432, 20001, 20001, rpad('x',60)); delete /*+ full(emp) */ from emp where emp_no <= 1000; -- > comment to avoid wordpress format issue commit; begin dbms_stats.gather_table_stats( ownname => user, tabname => 'EMP', method_opt => 'for all columns size 1' ); end; /
Ok. But there is any solution to fix Shrink disaster effect?
“side effects of a call to shrink space moved a row from the end of the table (where it was doing no harm) to the start of the table (where it had a disastrous impact).”
example: Can we drop table (after export data to other tmp_object) and recreate table with proper sorting etc?
Comment by Anonymous — November 27, 2018 @ 10:25 am GMT Nov 27,2018 |
The method that’s best for you depends on the most important aspects of the business use, but:
A serial “alter table move” will recreate the table with the same pattern of distribution, though you might have to think about the best setting for pctfree. (see https://jonathanlewis.wordpress.com/2007/11/23/table-rebuilds/ )
The dbms_redefinition package can do the same, and you can include an order by clause in the main copy if you want to sort the data at the same time.
If you have licenced the partitioning option under 12c then you could convert your table to a partitioned table with a single partition so that you could do an online move of the single partition whenever you felt you really had to rebuild the table. (See: https://jonathanlewis.wordpress.com/2013/06/25/12c/ )
Comment by Jonathan Lewis — November 27, 2018 @ 6:26 pm GMT Nov 27,2018 |
I wanted to try the idea but did not find the time so far that’s why I will just post it here. I suppose this could work both ways and you could construct a similar example showing an improvement after shrink space. What if the devil row happens to reside towards the start of the table (just on the right place to make the pre-shrink select slow) – but wirh some free space space before. Then the shink space could fill that free space with enough rows to negate the disastrous effect and make the scalar subquery cashing benefitial. So one could make the case for “shrink space bringing an enorm performance improvement. :-)
Comment by Todor Botev — January 31, 2019 @ 4:05 pm GMT Jan 31,2019 |
Todor,
I’m sure it would be possible to demonstrate the effect.
From a production point of view, of course, the benefit could encourage a subsequent disaster – if the first time someone did a shrink space they ran into your beneficial case they might go “wow – shrink space is mega fantastic for performance, let’s do it to all the tables in the database!” ;)
Comment by Jonathan Lewis — January 31, 2019 @ 4:34 pm GMT Jan 31,2019 |