Oracle Scratchpad

October 31, 2014

first_rows(10)

Filed under: Bugs,CBO,Execution plans,Oracle — Jonathan Lewis @ 5:31 pm BST Oct 31,2014

No, not the 10th posting about first_rows() this week – whatever it may seem like – just an example that happens to use the “calculate costs for fetching the first 10 rows” optimizer strategy and does it badly. I think it’s a bug, but it’s certainly a defect that is a poster case for the inherent risk of using anything other than all_rows optimisation.  Here’s some code to build a couple of sample tables:


begin
	dbms_stats.set_system_stats('MBRC',16);
	dbms_stats.set_system_stats('MREADTIM',12);
	dbms_stats.set_system_stats('SREADTIM',5);
	dbms_stats.set_system_stats('CPUSPEED',1000);
end;
/

create table t1
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 1e4
)
select
	rownum					id,
	trunc(dbms_random.value(1,1000))	n1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

create index t1_n1 on t1(id, n1);

create table t2
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 1e4
)
select
	rownum					id,
	trunc(dbms_random.value(10001,20001))	x1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

create index t2_i1 on t2(x1);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		estimate_percent => 100,
		method_opt	 => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T2',
		estimate_percent => 100,
		method_opt	 => 'for all columns size 1'
	);

end;
/


create or replace view  v1
as
select 
	id, n1, small_vc, padding
from	t1 
where	n1 between 101 and 300
union all
select 
	id, n1, small_vc, padding
from	t1 
where	n1 between 501 and 700
;

The key feature of this demonstration is the UNION ALL view and what the optimizer does with it when we have first_rows_N optimisation – this is a simplified model of a production problem I was shown a couple of years ago, so nothing special, nothing invented. Here’s a query that behaves badly:


select
	/*+ gather_plan_statistics */
	v1.small_vc,
	v1.n1
from
	v1,
	t2
where
	t2.id = v1.id
and	t2.x1 = 15000
;

I’m going to execute this query in three different ways – as is, using all_rows optimisation; as is, using first_rows_10 optimisation, then using all_rows optimisation but with the necessary hints to make it follow the first_rows_10 execution path. Here are the resulting plans from an instance of 12.1.0.2 (the same thing happens in 11.2.0.4):


first_rows_10 plan
----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |     1 |    35 |   107   (0)| 00:00:01 |
|*  1 |  HASH JOIN                           |       |     1 |    35 |   107   (0)| 00:00:01 |
|   2 |   VIEW                               | V1    |    24 |   600 |     4   (0)| 00:00:01 |
|   3 |    UNION-ALL                         |       |       |       |            |          |
|*  4 |     TABLE ACCESS FULL                | T1    |    12 |   240 |     2   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL                | T1    |    12 |   240 |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |   100 |  1000 |   103   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN                  | T2_I1 |   100 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

all_rows plan
------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |       |    40 |  1400 |   904   (1)| 00:00:01 |
|   1 |  NESTED LOOPS                          |       |    40 |  1400 |   904   (1)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED  | T2    |   100 |  1000 |   103   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                    | T2_I1 |   100 |       |     3   (0)| 00:00:01 |
|   4 |   VIEW                                 | V1    |     1 |    25 |     8   (0)| 00:00:01 |
|   5 |    UNION ALL PUSHED PREDICATE          |       |       |       |            |          |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    20 |     4   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
|   8 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    20 |     4   (0)| 00:00:01 |
|*  9 |      INDEX RANGE SCAN                  | T1_N1 |     1 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

first_rows_10 plan hinted under all_rows optimisation
---------------------------------------------------------------------------------------------- 
| Id  | Operation                    | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | 
---------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT             |       |   200 |  8600 |       |  6124   (3)| 00:00:01 | 
|*  1 |  HASH JOIN                   |       |   200 |  8600 |    17M|  6124   (3)| 00:00:01 |
|   2 |   VIEW                       | V1    |   402K|    12M|       |  5464   (3)| 00:00:01 | 
|   3 |    UNION-ALL                 |       |       |       |       |            |          | 
|*  4 |     TABLE ACCESS FULL        | T1    |   201K|  3933K|       |  2731   (3)| 00:00:01 | 
|*  5 |     TABLE ACCESS FULL        | T1    |   201K|  3933K|       |  2733   (3)| 00:00:01 | 
|   6 |   TABLE ACCESS BY INDEX ROWID| T2    |   100 |  1000 |       |   103   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN          | T2_I1 |   100 |       |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

I’m not sure why the first_rows_10 plan uses “table access by rowid batched”, but I’d guess it’s because the optimiser calculates that sorting the index rowids before visiting the table may have a small benefit on the speed of getting the first 10 rows – eventually I’ll get around to checking the 10053 trace file. The important thing, though, is the big mistake in the strategy, not the little difference in table access.

In the first_rows_10 plan the optimizer has decided building an in-memory hash table from the UNION ALL of the rows fetched from the two copies of the t1 table will be fast and efficient; but it’s made that decision based on the assumption that it will only get 10 rows from each copy of the table – and at run-time it HAS to get all the relevant t1 rows to build the hash table before it can get any t2 rows. We can get some idea of the scale of this error when we look at the hinted plan under all_rows optimisation – it’s a lot of redundant data and a very expensive hash table build.

In contrast the all_rows plan does an efficient indexed access into the t2 table then, for each row, does a join predicate pushdown into the union all view using an indexed access path. If we only wanted to fetch 10 rows we could stop after doing a minimum amount of work. To demonstrate the error more clearly I’ve re-run the experiment for the first two plans from SQL*PLus, setting the arraysize to 11, the pagesize to 5, and stopping after the first 10 rows. Here are the plans showing the rowsource execution stats:


first_rows_10 plan
------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |      1 |        |   107 (100)|     12 |00:00:00.43 |   35150 |       |       |          |
|*  1 |  HASH JOIN                           |       |      1 |      1 |   107   (0)|     12 |00:00:00.43 |   35150 |    24M|  3582K|   23M (0)|
|   2 |   VIEW                               | V1    |      1 |     24 |     4   (0)|    400K|00:00:06.48 |   35118 |       |       |          |
|   3 |    UNION-ALL                         |       |      1 |        |            |    400K|00:00:04.20 |   35118 |       |       |          |
|*  4 |     TABLE ACCESS FULL                | T1    |      1 |     12 |     2   (0)|    200K|00:00:00.71 |   17559 |       |       |          |
|*  5 |     TABLE ACCESS FULL                | T1    |      1 |     12 |     2   (0)|    200K|00:00:00.63 |   17559 |       |       |          |
|   6 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |      1 |    100 |   103   (0)|     28 |00:00:00.01 |      32 |       |       |          |
|*  7 |    INDEX RANGE SCAN                  | T2_I1 |      1 |    100 |     3   (0)|     28 |00:00:00.01 |       4 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

all_rows plan
-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |       |      1 |        |   904 (100)|     12 |00:00:00.01 |     213 |
|   1 |  NESTED LOOPS                          |       |      1 |     43 |   904   (1)|     12 |00:00:00.01 |     213 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED  | T2    |      1 |    100 |   103   (0)|     28 |00:00:00.01 |      32 |
|*  3 |    INDEX RANGE SCAN                    | T2_I1 |      1 |    100 |     3   (0)|     28 |00:00:00.01 |       4 |
|   4 |   VIEW                                 | V1    |     28 |      1 |     8   (0)|     12 |00:00:00.01 |     181 |
|   5 |    UNION ALL PUSHED PREDICATE          |       |     28 |        |            |     12 |00:00:00.01 |     181 |
|   6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     28 |    212K|     4   (0)|      8 |00:00:00.01 |      93 |
|*  7 |      INDEX RANGE SCAN                  | T1_N1 |     28 |      1 |     3   (0)|      8 |00:00:00.01 |      85 |
|   8 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     28 |    213K|     4   (0)|      4 |00:00:00.01 |      88 |
|*  9 |      INDEX RANGE SCAN                  | T1_N1 |     28 |      1 |     3   (0)|      4 |00:00:00.01 |      84 |
-----------------------------------------------------------------------------------------------------------------------

If I had set the optimizer_mode to first_rows_10 because I really only wanted to fetch (about) 10 rows then I’ve managed to pay a huge overhead in buffer visits, memory and CPU for the privilege – the all_rows plan was much more efficient.

Remember – we often see cases where the first_rows(n) plan will do more work to get the whole data set in order to be able to get the first few rows more quickly (the simplest example is when the optimizer uses a particular index to get the first few rows of a result set in order without sorting rather than doing a (faster) full tablescan with sort. This case, though, is different: the optimizer is choosing to build a hash table as if it only has to put 10 rows into that hash table when it actually HAS to build the whole has table before it can take any further steps – we don’t get 10 rows quicker and the rest more slowly; we just get 10 very slow rows.

Footnote:

It’s possible that this is an example of bug 9633142: (FIRST_ROWS OPTIMIZER DOES NOT PUSH PREDICATES INTO UNION INLINE VIEW) but that’s reported as fixed in 12c, with a couple of patches for 11.2.0.2/3. However, setting “_fix_control”=’4887636:off’, does bypass the problem. (The fix control, introduced in 11.1.0.6 has description: “remove restriction from first K row optimization”)

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.