Oracle Scratchpad

March 12, 2012

First_rows hash

Filed under: Bugs,CBO,Execution plans,Oracle — Jonathan Lewis @ 1:49 am GMT Mar 12,2012

Just like my posting on an index hash, this posting is about a problem as well as being about a hash join. The article has its roots in a question posted on the OTN database forum, where a user has shown us the following execution plan:

---------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                         |    14 |   896 |    63   (7)| 00:00:01 |
|   1 |  SORT GROUP BY                |                         |    14 |   896 |    63   (7)| 00:00:01 |
|   2 |   NESTED LOOPS                |                         |       |       |            |          |
|   3 |    NESTED LOOPS               |                         |    14 |   896 |    62   (5)| 00:00:01 |
|*  4 |     HASH JOIN                 |                         |    14 |   280 |    48   (7)| 00:00:01 |
|   5 |      VIEW                     | V_SALES_ALL             |   200 |  1800 |     4   (0)| 00:00:01 |
|   6 |       UNION-ALL               |                         |       |       |            |          |
|   7 |        INDEX FAST FULL SCAN   | PRODUCTS_DATES_IDX      |   100 |   900 |     2   (0)| 00:00:01 |
|   8 |        INDEX FAST FULL SCAN   | PRODUCTS_DATES_IDX_HARD |   100 |   900 |     2   (0)| 00:00:01 |
|*  9 |      VIEW                     | index$_join$_003        |  2238 | 24618 |    44   (7)| 00:00:01 |
|* 10 |       HASH JOIN               |                         |       |       |            |          |
|* 11 |        INDEX RANGE SCAN       | PRODUCTS_GF_INDEX2      |  2238 | 24618 |     6   (0)| 00:00:01 |
|  12 |        INDEX FAST FULL SCAN   | PRODUCTS_GF_PK          |  2238 | 24618 |    45   (3)| 00:00:01 |
|* 13 |     INDEX UNIQUE SCAN         | DATES_PK                |     1 |       |     0   (0)| 00:00:01 |
|  14 |    TABLE ACCESS BY INDEX ROWID| DATES                   |     1 |    44 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------


Note particularly that line 6 is a union all of two index fast full scans that are predicted to produce 100 rows each, and that the resulting 200 rows are passed through a view to become the build table for a hash join. The problem, as the OP points out, is that the estimates are very bad and the indexes return 1.7 million and 78 million rows respectively, leading to a very large hash table and a very slow query.

The problem is that the system is running with the optimizer_mode set to first_rows_100 – and there’s a bug in the code that has made the optimizer “forget” that it can’t stop after 100 rows when it’s supposed to be building a hash table. Here’s a test case to reproduce the problem. I’ve run this on Oracle 11.1.0.7 using my standard setup (8KB blocks, locally managed tablespace with 1MB uniform extents, freelist management, and system statistics disabled):

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 10000)
select
	rownum					id,
	trunc(dbms_random.value(1,1000))	n1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1000000;

create index t1_n1 on t1(id, n1);

create table t2
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 10000)
select
	rownum					id,
	trunc(dbms_random.value(10001,20001))	x1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1000000;

create index t2_i1 on t2(x1);

-- collect stats on both tables, no histograms

create or replace view  v1
as
select
	id, n1, small_vc, padding
from	t1
where	n1 between 101 and 300
union all
select
	id, n1, small_vc, padding
from	t1
where	n1 between 501 and 700
;

alter session set optimizer_mode = first_rows_10;

explain plan for
select
	v1.small_vc,
	v1.n1
from
	v1,
	t2
where
	t2.id = v1.id
and	t2.x1 between 1000 and 5000
;

select * from table(dbms_xplan.display(null,null,'outline'));

I’ve used the same table twice in my view – and the final query is actually going to use tablescans on the two accesses to the t1 table, ignoring the index which I happen to have left in place from my first attempts to model the original scenario. With this data in place, and running with first_rows_10, the execution plan is as follows:

-----------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |    50 |   108 |
|*  1 |  HASH JOIN                    |       |     1 |    50 |   108 |
|   2 |   VIEW                        | V1    |    24 |   960 |     4 |
|   3 |    UNION-ALL                  |       |       |       |       |
|*  4 |     TABLE ACCESS FULL         | T1    |    12 |   240 |     2 |
|*  5 |     TABLE ACCESS FULL         | T1    |    12 |   240 |     2 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T2    |   100 |  1000 |   103 |
|*  7 |    INDEX RANGE SCAN           | T2_I1 |   100 |       |     3 |
-----------------------------------------------------------------------

As you can see, the optimizer thinks it will be very quick and cheap to do two full tablescans of t1 to collect a total of 24 rows to feed into a hash table before doing an index range scan against t2 to collect 100 rows to probe the hash table. But even though it may be quick and cheap for Oracle to collect the first 12 rows from each tablescan, at runtime we actually have to complete both tablescans and build the entire hash table before we can start the index range scan of t2. To see how nasty this plan will be I generated the outline for the query, which gave me all the hints I needed to reproduce the plan while running all_rows – so here’s the plan again, giving you a better indication of how expensive it will be to build the hash table:

------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes |TempSpc| Cost  |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |   200 | 11600 |       |  5854 |
|*  1 |  HASH JOIN                   |       |   200 | 11600 |    23M|  5854 |
|   2 |   VIEW                       | V1    |   402K|    18M|       |  5382 |
|   3 |    UNION-ALL                 |       |       |       |       |       |
|*  4 |     TABLE ACCESS FULL        | T1    |   201K|  3933K|       |  2691 |
|*  5 |     TABLE ACCESS FULL        | T1    |   201K|  3933K|       |  2691 |
|   6 |   TABLE ACCESS BY INDEX ROWID| T2    |   100 |  1000 |       |   103 |
|*  7 |    INDEX RANGE SCAN          | T2_I1 |   100 |       |       |     3 |
------------------------------------------------------------------------------

See how expensive the two full tablescans are, relative to the cost of getting the 100 rows held by t2. The fastest way of running this queryto get the first 10 rows is probably to use t2 as the build table of a hash join (so that we can stop the tablescans of t1 early).

Remember – if you’re running first_rows_N as the optimizer_mode you’re in the minority, so you’re running code that has been subject to less testing than the rest of the optimizer code: you’re likely to run into some very odd bugs. Despite comments you might hear on the internet from time to time along the lines of “first_rows_N is for OLTP systems” (that’s not really correct, by the way), it’s probably best to stick with all_rows unless you can come up with a very sound argument for doing otherwise.

Footnote:
This looks as if it mighe be bug 9633142: FIRST_ROWS OPTIMIZER DOES NOT PUSH PREDICATES INTO UNION INLINE VIEW; and if it is not the same bug it is closely related to it. The bug is reported as fixed in 12.1

Update (August 2015)

The problem still reproduces in 12.1.0.2

 

21 Comments »

  1. Reblogged this on lava kafle kathmandu nepal.

    Comment by lkafle — March 12, 2012 @ 3:34 am GMT Mar 12,2012 | Reply

  2. Jonathan

    you’ve missed “less than” in data generation scripts.
    I was able to reproduce similar plans using 10K & 1M rows in t1 & t2 respectively. However, I don’t see interesting word BATCHED on the 6th plan line. So my plan looks like:

    Plan hash value: 3636653528
    ----------------------------------------------------------------------
    | Id  | Operation                    | Name  | Rows  | Bytes | Cost  |
    ----------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |       |     1 |    49 |   108 |
    |*  1 |  HASH JOIN                   |       |     1 |    49 |   108 |
    |   2 |   VIEW                       | V1    |    26 |  1014 |     4 |
    |   3 |    UNION-ALL                 |       |       |       |       |
    |*  4 |     TABLE ACCESS FULL        | T1    |    13 |   247 |     2 |
    |*  5 |     TABLE ACCESS FULL        | T1    |    13 |   247 |     2 |
    |   6 |   TABLE ACCESS BY INDEX ROWID| T2    |   100 |  1000 |   103 |
    |*  7 |    INDEX RANGE SCAN          | T2_I1 |   100 |       |     3 |
    ----------------------------------------------------------------------
    Outline Data
    -------------
      /*+
          BEGIN_OUTLINE_DATA
          FULL(@"SEL$2" "T1"@"SEL$2")
          FULL(@"SEL$3" "T1"@"SEL$3")
          USE_HASH(@"SEL$1" "T2"@"SEL$1")
          LEADING(@"SEL$1" "V1"@"SEL$1" "T2"@"SEL$1")
          INDEX_RS_ASC(@"SEL$1" "T2"@"SEL$1" ("T2"."X1"))
          NO_ACCESS(@"SEL$1" "V1"@"SEL$1")
          OUTLINE_LEAF(@"SEL$1")
          OUTLINE_LEAF(@"SET$1")
          OUTLINE_LEAF(@"SEL$3")
          OUTLINE_LEAF(@"SEL$2")
          FIRST_ROWS(10)
          OPT_PARAM('_fix_control' '6776808:0')
          OPT_PARAM('_optimizer_use_feedback' 'false')
          OPT_PARAM('_connect_by_use_union_all' 'old_plan_mode')
          OPT_PARAM('_optimizer_connect_by_cost_based' 'false')
          OPT_PARAM('_optimizer_cost_model' 'io')
          OPT_PARAM('_b_tree_bitmap_plans' 'false')
          DB_VERSION('11.2.0.3')
          OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
          IGNORE_OPTIM_EMBEDDED_HINTS
          END_OUTLINE_DATA
      */
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       1 - access("T2"."ID"="V1"."ID")
       4 - filter("N1">=101 AND "N1"<=300)
       5 - filter("N1">=501 AND "N1"<=700)
       7 - access("T2"."X1">=1000 AND "T2"."X1"<=5000)
    Note
    -----
       - cpu costing is off (consider enabling it)
    

    For those who want to copy-paste test case: http://pastebin.com/raw.php?i=XKdYKB6d

    Comment by Timur Akhmadeev — March 12, 2012 @ 7:27 am GMT Mar 12,2012 | Reply

    • Timur,

      Thanks, now fixed.

      I don’t know why my “<=" are disappearing when I cut and paste these things, I haven't changed the way I prepare the texts for months. I've had a couple real messes when I've tried to add a comment or correction to some older notes – possibly WordPress has done something behind the scenes that means I need to watch more carefully what finally arrives. The "batched" is from another experiment I was doing at the same time – I've corrected the plan, and I copied in the wrong output.

      Comment by Jonathan Lewis — March 12, 2012 @ 7:47 am GMT Mar 12,2012 | Reply

      • I’ve seen something similar if I switch between HTML and Visual editing mode couple of times – I think WP performs unnecessary escaping sometimes.

        Comment by Timur Akhmadeev — March 12, 2012 @ 9:32 am GMT Mar 12,2012 | Reply

  3. 10.2.0.5 has a different plan though, with the presence of ‘UNION ALL PUSHED PREDICATE’ appearing for the view V1.

    Comment by Andy — March 12, 2012 @ 8:41 am GMT Mar 12,2012 | Reply

    • Andy,
      I’m going to guess that you forgot to set the optimizer_mode to first_rows_10, and got a plan like this:

      ------------------------------------------------------------------------
      | Id  | Operation                      | Name  | Rows  | Bytes | Cost  |
      ------------------------------------------------------------------------
      |   0 | SELECT STATEMENT               |       |    40 |  1720 |   903 |
      |   1 |  NESTED LOOPS                  |       |    40 |  1720 |   903 |
      |   2 |   TABLE ACCESS BY INDEX ROWID  | T2    |   100 |  1500 |   103 |
      |*  3 |    INDEX RANGE SCAN            | T2_I1 |   100 |       |     3 |
      |   4 |   VIEW                         | V1    |     1 |    28 |     8 |
      |   5 |    UNION ALL PUSHED PREDICATE  |       |       |       |       |
      |   6 |     TABLE ACCESS BY INDEX ROWID| T1    |     1 |    20 |     4 |
      |*  7 |      INDEX RANGE SCAN          | T1_N1 |     1 |       |     3 |
      |   8 |     TABLE ACCESS BY INDEX ROWID| T1    |     1 |    20 |     4 |
      |*  9 |      INDEX RANGE SCAN          | T1_N1 |     1 |       |     3 |
      ------------------------------------------------------------------------
      

      Comment by Jonathan Lewis — March 12, 2012 @ 8:49 am GMT Mar 12,2012 | Reply

      • Hi Jonathan,

        Yes, that’s the plan, but only when I set it to ‘all_rows’. If it is first_rows_10, the plan is still the same, except the rows estimate at line 2 becomes 28.

        I can only get the ‘faulty’ plan (ie, full table scan on T1) if T1 is built with10,000 rows. At 1 million rows for T1, I can’t get this bad plan.

        Am I missing something?

        Comment by Andy — March 12, 2012 @ 9:02 am GMT Mar 12,2012 | Reply

        • I tried Timur’s script on 10.2.0.5 , Solaris 10, and the plan is ‘correct'( now showing T2 as the build table):

                                                                               
          --------------------------------------------------------------------------------------
          | Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT             |       |     1 |    43 |   173   (2)| 00:00:03 |
          |*  1 |  HASH JOIN                   |       |     1 |    43 |   173   (2)| 00:00:03 |
          |   2 |   TABLE ACCESS BY INDEX ROWID| T2    |   100 |  1000 |   103   (0)| 00:00:02 |
          |*  3 |    INDEX RANGE SCAN          | T2_I1 |   100 |       |     3   (0)| 00:00:01 |
          |   4 |   VIEW                       | V1    |  4028 |   129K|    69   (2)| 00:00:01 |
          |   5 |    UNION-ALL                 |       |       |       |            |          |
          |*  6 |     TABLE ACCESS FULL        | T1    |  2014 | 38266 |    35   (3)| 00:00:01 |
          |*  7 |     TABLE ACCESS FULL        | T1    |  2014 | 38266 |    35   (3)| 00:00:01 |
          --------------------------------------------------------------------------------------
          
          

          Comment by Andy — March 12, 2012 @ 9:20 am GMT Mar 12,2012

        • Andy,

          No, you’re not missing anything. I’ve just run the test against 10.2.0.3 and 10.2.0.5 and got the correct (i.e. view with pushed predicate) plan. However 11.2.0.3 is consistent with 11.1.0.7, producing the wrong plan.

          So it looks like it may be a bug introduced in 11g.

          Comment by Jonathan Lewis — March 12, 2012 @ 9:34 am GMT Mar 12,2012

        • Andy, most likely you are running with altered optimizer_index_cost_adj/optimizer_index_caching settings (like me). I’ve added ALTER SESSION to change these settings in the script to default values of 100/0.

          Comment by Timur Akhmadeev — March 12, 2012 @ 9:36 am GMT Mar 12,2012

  4. This seems to be a ubiquitous problem with FIRST_ROWS; it appears to act like an /*+ INDEX (T) */ (no index specified), i.e. it uses an index if at all possible, no matter how bad.

    Here’s a more pathological example, (Kyte’s pagination); note the completely inappropriate use of SH.CUSTOMERS_GENDER_BIX :

    SELECT *
      FROM (SELECT /*+ FIRST_ROWS */
                  ROWNUM AS rnum, cust_last_name
              FROM (  SELECT cust_last_name
                        FROM customers
                       WHERE CUST_LAST_NAME BETWEEN 'A%' AND 'D%'
                    ORDER BY cust_last_name) a
             WHERE ROWNUM = :1; -- set to 10000
    
    Plan
    SELECT STATEMENT  HINT: FIRST_ROWSCost: 2 K  Bytes: 211 K  Cardinality: 6 K  CPU Cost: 53 M  IO Cost: 2 K  							
    	7 VIEW SH. Filter Predicates: "RNUM">=10000  Cost: 2 K  Bytes: 211 K  Cardinality: 6 K  CPU Cost: 53 M  IO Cost: 2 K  						
    		6 COUNT STOPKEY  Filter Predicates: ROWNUM<=10050  					
    			5 VIEW SH. Cost: 2 K  Bytes: 133 K  Cardinality: 6 K  CPU Cost: 53 M  IO Cost: 2 K  				
    				4 SORT ORDER BY STOPKEY  Filter Predicates: ROWNUM<=10050  Cost: 2 K  Bytes: 48 K  Cardinality: 6 K  CPU Cost: 53 M  IO Cost: 2 K  			
    					3 TABLE ACCESS BY INDEX ROWID TABLE SH.CUSTOMERS Filter Predicates: "CUST_LAST_NAME"='A%'  Cost: 2 K  Bytes: 48 K  Cardinality: 6 K  CPU Cost: 35 M  IO Cost: 2 K  		
    						2 BITMAP CONVERSION TO ROWIDS  	
    							1 BITMAP INDEX FULL SCAN INDEX (BITMAP) SH.CUSTOMERS_GENDER_BIX
    

    Comment by Jeff Jacobs, Oracle Ace — March 12, 2012 @ 3:15 pm GMT Mar 12,2012 | Reply

    • There are huge differences between FIRST_ROWS(N) algorithm and old fashioned FIRST_ROWS. FIRST_ROWS tries to bypass physical sort always when possible. It could result into unexpected index use.

      Regards

      Pavol Babel

      Comment by Pavol Babel — March 29, 2012 @ 4:23 pm BST Mar 29,2012 | Reply

  5. My example was from R12.01.

    Comment by Jeff Jacobs, Oracle Ace — March 12, 2012 @ 3:16 pm GMT Mar 12,2012 | Reply

  6. Why do you always disable (you mean delete?) system statistics in your tests? We all want to have repeatable behaviour in our tests, why does this help? In your book about the cost optimizer you disabled “system statistics (CPU costing)”. CPU costing? How is CPU costing related to this?

    Comment by Xenofon Grigoriadis — March 13, 2012 @ 11:59 am GMT Mar 13,2012 | Reply

    • Xenofon,

      “CPU Costing” and “System statistics” are synonyms for the same mechanism. Because of the version-dependency in the rules for using system statistics, I call dbms_stats.delete_system_stats and set _optimizer_cost_model to “io” to ensure that I’ve disabled system statistics.

      As far as repeatable results go – the use of system statistics will change the cost of a tablescan relative to an index access path in ways that are dependent on the actual numbers in the sys.aux_stats$ table, getting rid of them will ensure that people don’t see different plans because they have different stats.

      There are cases though (distributed queries come to mind) where certain features cannot appear unless system statistics are enabled, so sometimes I will explicitly set some values for the stats using dbms_stats.set_system_stats().

      Comment by Jonathan Lewis — March 31, 2012 @ 6:31 am BST Mar 31,2012 | Reply

  7. A little more experimentation shows that the problem I describe is apparenly only for FIRST_ROWS hint, without a number qualifier, i.e. the “_N” (or the older “(N)”) when hinting a query.

    Comment by Jeff Jacobs, Oracle Ace — March 13, 2012 @ 6:16 pm GMT Mar 13,2012 | Reply

  8. Hi Jeff

    SELECT *
    FROM (SELECT /*+ FIRST_ROWS */
    ROWNUM AS rnum, cust_last_name
    FROM ( SELECT cust_last_name
    FROM customers
    WHERE CUST_LAST_NAME BETWEEN ‘A%’ AND ‘D%’
    ORDER BY cust_last_name) a
    WHERE ROWNUM = :1; — set to 10000
    

    Could you please provide the full test case for the explain plan that has got you the problem with first_rows hint.?

    Comment by George Joseph — March 15, 2012 @ 1:47 pm GMT Mar 15,2012 | Reply

    • The test case is the query in my original comment against the standard SH demo schema. Note that this “bug” appears to be only for FIRST_ROWS, without the number of rows being specified, either via FIRST_ROWS_n, or FIRST_ROWS (n). The latter generate appropriate execution plans. Also, “FIRST_ROWS” by itself is apparently deprecated in 11GR2.

      Comment by Jeff Jacobs, Oracle Ace — March 15, 2012 @ 4:34 pm GMT Mar 15,2012 | Reply

      • Jeff,

        Despite appearing in recent code released by Oracle Corp. the old first_rows hint was originally labelled in 9i as “available only for backwards compatibility”. As Pavol indicated, some of its paths (or avoidance of paths) were dictated by simple rules – so it’s fairly easy to find cases where first_rows and first_rows(n) – even with n = 1 – differ dramatically in their outcome.

        Comment by Jonathan Lewis — March 31, 2012 @ 6:24 am BST Mar 31,2012 | Reply

    • Jeff’s example was messed up by the way that WordPress reads the > and < symbols as markers for html directives. I think a number of lines of the various inline views were deleted – leaving the oddity of a statement ending with “rownum = :1”.

      Comment by Jonathan Lewis — March 31, 2012 @ 6:18 am BST Mar 31,2012 | Reply

  9. […] posting by Jonathan Lewis is about a problem as well as being about a hash […]

    Pingback by Log Buffer #263, A Carnival of the Vanities for DBAs | The Pythian Blog — March 16, 2012 @ 6:02 am GMT Mar 16,2012 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: