Oracle Scratchpad

November 8, 2006

Subquery Selectivity

Filed under: CBO,Execution plans,Hints,Tuning — Jonathan Lewis @ 9:58 am BST Nov 8,2006

The optimizer in 10g (possibly release 2 only) has an important fix that may be enough to make some people decide to accelerate their migration programs. It’s to do with filter subqueries.

To get things started, here’s a little query with its execution path:

select
	par.small_vc1,
	chi.small_vc1
from
	parent	par,
	child	chi
where
	par.id1 between 1001 and 2200
and	chi.id1 = par.id1
;             

--------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |  3914 | 70452 |    73 |
|*  1 |  HASH JOIN           |             |  3914 | 70452 |    73 |
|*  2 |   TABLE ACCESS FULL  | PARENT      |  1201 | 10809 |     9 |
|*  3 |   TABLE ACCESS FULL  | CHILD       |  9611 | 86499 |    63 |
--------------------------------------------------------------------             

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("CHI"."ID1"="PAR"."ID1")
   2 - filter("PAR"."ID1">=1001 AND "PAR"."ID1"=1001 AND "CHI"."ID1"

There are 8 rows in the child table for every row in the parent table, and the obvious referential integrity constraint is in place. The optimizer has predicted correctly (with only a tiny error) that the supplied range for the parent ID should return 1,200 parent rows.

Let’s now add a couple of filter subqueries and see what happens in Oracle 9i. In this case, and for demonstration purposes only, I am using the same subquery to filter against the parent table twice.

select
	par.small_vc1,
	chi.small_vc1
from
	parent	par,
	child	chi
where
	par.id1 between 1001 and 2200
and	chi.id1 = par.id1
and	exists (
		select
			/*+ no_unnest qb_name(subq1) */
			null
		from	subtest	sub1
		where
			sub1.id1       = par.id1
		and	sub1.small_vc1 = par.small_vc1
		and	sub1.small_vc2 >= '2'
	)
and	exists (
		select
			/*+ no_unnest qb_name(subq2) */
			null
		from	subtest	sub2
		where
			sub2.id1       = par.id1
		and	sub2.small_vc1 = par.small_vc1
		and	sub2.small_vc2 >= '2'
	)
;           

You’ll notice that I’ve used the /*+ no_unnest */ hint in the subqueries – again this is for demonstration purposes; 9i tends to unnest a lot of subqueries, but there are cases where subqueries do not unnest, and the feature I want to demonstrate appears when unnesting does not happen.

Here’s the execution plan (generated by a call to dbms_xplan.display):

----------------------------------------------------------------------------
| Id  | Operation                    |  Name       | Rows  | Bytes | Cost  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |    10 |   180 |    64 |
|*  1 |  FILTER                      |             |       |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| CHILD       |     3 |    27 |     5 |
|   3 |    NESTED LOOPS              |             |    10 |   180 |    24 |
|*  4 |     TABLE ACCESS FULL        | PARENT      |     3 |    27 |     9 |
|*  5 |     INDEX RANGE SCAN         | CHI_PK      |     4 |       |     1 |
|*  6 |   TABLE ACCESS BY INDEX ROWID| SUBTEST     |     1 |    14 |     2 |
|*  7 |    INDEX UNIQUE SCAN         | SUB_PK      |     1 |       |     1 |
|*  8 |   TABLE ACCESS BY INDEX ROWID| SUBTEST     |     1 |    14 |     2 |
|*  9 |    INDEX UNIQUE SCAN         | SUB_PK      |     1 |       |     1 |
----------------------------------------------------------------------------          

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "SUBTEST" "SUB1"
              WHERE "SUB1"."ID1"=:B1 AND "SUB1"."SMALL_VC1"=:B2 AND
              "SUB1"."SMALL_VC2">='2') AND  EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM
              "SUBTEST" "SUB2" WHERE "SUB2"."ID1"=:B3 AND "SUB2"."SMALL_VC1"=:B4 AND
              "SUB2"."SMALL_VC2">='2'))
   4 - filter("SYS_ALIAS_4"."ID1">=1001 AND "SYS_ALIAS_4"."ID1"=1001 AND "CHI"."ID1"='2')
   7 - access("SUB1"."ID1"=:B1)
   8 - filter("SUB2"."SMALL_VC1"=:B1 AND "SUB2"."SMALL_VC2">='2')
   9 - access("SUB2"."ID1"=:B1)          

Notice particularly (lines 3, 4 and 5) that the optimizer has decided to do a nested loop to join the parent and child tables. Moreover, the predicted cardinality (“Rows” column in the execution plan) for the full scan of the parent table is just three rows !
It’s not surprising that a prediction of three rows for the parent has resulted in a nested loop indexed access into the child table. But why did the optimizer decide that the cardinality would be three ?

The effect of an existence subquery – exists (select …) – is to impose a 5% selectivity on the driving table. We have two such subqueries, so we take 5% of 5%, and since we started with 1,200 rows selected from the parent table, the resulting cardinality is 3. (1,200 * 0.05 * 0.05).

But whatever the optimizer does with the calculations, things get worse when you realise that subqueries are, by default,  postponed to run at the end of the execution path. So the optimizer behaves as if the subqueries have run before joining to the child table – but the produces a path that runs the subqueries after joining to the child table.

Now take a look at the 10g execution plan:

-------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     1 |    18 |  3990 |
|*  1 |  FILTER                       |         |       |       |       |
|*  2 |   HASH JOIN                   |         |  3914 | 70452 |    73 |
|*  3 |    TABLE ACCESS FULL          | PARENT  |  1201 | 10809 |     9 |
|*  4 |    TABLE ACCESS FULL          | CHILD   |  9611 | 86499 |    63 |
|*  5 |   TABLE ACCESS BY INDEX ROWID | SUBTEST |     1 |    14 |     2 |
|*  6 |    INDEX UNIQUE SCAN          | SUB_PK  |     1 |       |     1 |
|*  7 |    TABLE ACCESS BY INDEX ROWID| SUBTEST |     1 |    14 |     2 |
|*  8 |     INDEX UNIQUE SCAN         | SUB_PK  |     1 |       |     1 |
-------------------------------------------------------------------------         

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "SUBTEST" "SUB"
              WHERE "SUB"."ID1"=:B1 AND "SUB"."SMALL_VC1"=:B2 AND
              "SUB"."SMALL_VC2">='2') AND  EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM
              "SUBTEST" "SUB" WHERE "SUB"."ID1"=:B3 AND "SUB"."SMALL_VC1"=:B4 AND
              "SUB"."SMALL_VC2">='2'))
   2 - access("CHI"."ID1"="PAR"."ID1")
   3 - filter("PAR"."ID1">=1001 AND "PAR"."ID1"=1001 AND "CHI"."ID1"='2')
   6 - access("SUB"."ID1"=:B1)
   7 - filter("SUB"."SMALL_VC1"=:B1 AND "SUB"."SMALL_VC2">='2')
   8 - access("SUB"."ID1"=:B1)         

Line 3 shows a predicted cardinality of 1,201 for the full scan of the parent table – which is correct and puts us back to the hash join to the child table. The arithmetical effect of the subqueries is now applied at the point in the execution path where the subqueries will actually run; which means that after an upgrade to 10g you are less likely to find the optimizer using insanely inefficient nested loops and indexed access paths on queries which you know to be high volume but the optimizer expects to be low-volume because of this quirk of subquery calculations.

push_subq
As a side note – this type of problem can be fixed by using the push_subq hint, which tells Oracle to execute the subquery as early as possible. If the subqueries operate before the join then the arithmetic and the run-time activity again become self-consistent, and (in this case) the choice to use a nested loop with index is appropriate.

However – a little word of warning: if you currently use the push_subq hint in 9i you will have to update your code to make it work again in 10g.

The push_subq goes into the main (containing) query in 9i, and all subqueries are then pushed. In 10g you put the hint into just those subqueries that you want pushed. So in 10g, you could push neither, one, or both of the subqueries.

There’s actually an awful lot more I could say about this single example of subqueries but I’ll leave you with one last thought – which I will revisit in another blog entry. If you push both subqueries, you will find that the default display of the execution plan in 10g (at least until 10.2.0.1) is wrong. (In fact, if you compare the 9i and 10g plans carefully, you’ll notice an error in the last two lines of this 10g plan as well [now described in a note written in Jan 2008]).

5 Comments »

  1. Excellent! We use many “exists” subqueries where I work and the cost calculations have never made sense. Now I know why, and we can respond appropriately.

    Comment by Charlie B. — November 8, 2006 @ 7:12 pm BST Nov 8,2006 | Reply

  2. I just beat my head against the wall for a couple days last week trying to tune and EXISTS subquery just like this in 9i. Great explanation, thanks!

    Comment by Brandon Allen — July 3, 2007 @ 6:11 pm BST Jul 3,2007 | Reply

  3. This is a good tip….

    Comment by Vijaya Kadiyala — December 16, 2009 @ 9:20 pm BST Dec 16,2009 | Reply

  4. [...] of posts already; and shown how useful it can be in two examples: understanding a problem with filter subquery selectivity and understanding why some Cartesian merge joins were appearing [...]

    Pingback by dbms_xplan – again « Oracle Scratchpad — February 11, 2010 @ 8:13 am BST Feb 11,2010 | Reply

  5. [...] an item I thought I’d published a few years ago as a follow-up to an article on a 10g bug-fix for subquery selectivity. I was reminded of my oversight when a question came up on OTN that looked [...]

    Pingback by Subquery Selectivity – 2 « Oracle Scratchpad — January 17, 2011 @ 6:39 pm BST Jan 17,2011 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,013 other followers