Oracle Scratchpad

January 13, 2017

use_nl hint

Filed under: Hints,Oracle — Jonathan Lewis @ 8:52 am GMT Jan 13,2017

In response to a recent lamentation from Richard Foote about the degree of ignorance regarding the clustering_factor of indexes I commented on the similar level of understanding of a specific hint syntax, namely use_nl(a b) pointing out that this does not mean “do a nested loop from a to b”. My comment was underscored by a fairly prompt response asking what the hint did mean.

Surprisingly, although I’ve explained it many times over the last couple of decades (here’s one from 10 years ago), I couldn’t find an explanation on my blog though I did find a blog note where I’d made a passing comment about the equivalent misunderstanding of the use_hash(a b) syntax.

The misunderstanding is not entirely surprising since for many years the Oracle manuals seemed to suggest (in their examples) that the hint did have a multi-table meaning and it wasn’t until 10g that the manual gave an explicit statement of the single-table nature of the hint. The hint /*+ use_nl(a b) */ is a short-hand for the pair of hints /*+ use_nl(a)  use_nl(b) */ it doesn’t say anything about whether a and b should be joined, or in what order. If you want to guarantee that a and b will be joined in that order by a nested loop you will have to work a lot harder with your hints – and almost certainly need to make use of the /+ leading() */ hint.

Consider the following query (I’ll put the table creation code at the end of the article if you want to experiment):

select
	/*+ use_nl(a b) */
	a.v1, b.v1, c.v1, d.v1
from
	a, b, c, d
where
	d.n100 = 0
and	a.n100 = d.id
and	b.n100= a.n2
and	c.id = a.id
;

Only one of the tables a and b can be the first table in the final execution plan so one of them will be “the next table in the join order” at some point, so this hint will guarantee that one of the tables will be the inner table of a nested loop join. Here’s the plan I happened to get with my data, indexing, version (11.2.0.4), etc.:

---------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      | 20000 |  1347K| 30125   (1)| 00:00:02 |
|   1 |  HASH JOIN                     |      | 20000 |  1347K| 30125   (1)| 00:00:02 |
|   2 |   TABLE ACCESS FULL            | C    | 10000 |   146K|    26   (4)| 00:00:01 |
|   3 |   HASH JOIN                    |      | 20000 |  1054K| 30098   (1)| 00:00:02 |
|   4 |    TABLE ACCESS FULL           | D    |   100 |  1800 |    26   (4)| 00:00:01 |
|   5 |    NESTED LOOPS                |      | 20000 |   703K| 30072   (1)| 00:00:02 |
|   6 |     NESTED LOOPS               |      | 20000 |   703K| 30072   (1)| 00:00:02 |
|   7 |      TABLE ACCESS FULL         | B    | 10000 |   136K|    26   (4)| 00:00:01 |
|   8 |      INDEX RANGE SCAN          | A_I2 |     2 |       |     1   (0)| 00:00:01 |
|   9 |     TABLE ACCESS BY INDEX ROWID| A    |     2 |    44 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

In this case it’s table a that ends up in a position to be the inner table of a nested loop join.

You may be wondering why there seems to be a hash join into b when we’ve hinted a nested loop join – but the join order that Oracle is using is B -> A -> D -> C with a swap_join_inputs(d) swap_join_inputs(d), so b is never “the next table in the join order”.

If you want an even more confusing (at first sight) plan here’s the plan I got if I changed the one hint to /*+ use_nl(a) */


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      | 20000 |  1347K|   105   (5)| 00:00:01 |
|   1 |  HASH JOIN           |      | 20000 |  1347K|   105   (5)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | B    | 10000 |   136K|    26   (4)| 00:00:01 |
|   3 |   HASH JOIN          |      | 10000 |   537K|    78   (4)| 00:00:01 |
|   4 |    TABLE ACCESS FULL | C    | 10000 |   146K|    26   (4)| 00:00:01 |
|   5 |    HASH JOIN         |      | 10000 |   390K|    52   (4)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| D    |   100 |  1800 |    26   (4)| 00:00:01 |
|   7 |     TABLE ACCESS FULL| A    | 10000 |   214K|    26   (4)| 00:00:01 |
-----------------------------------------------------------------------------

This plan really looks as if Oracle should have done a nested loop into a but didn’t. Again appearanced are deceptive thanks to the effects of swap_join_inputs(): the join order here is A -> D -> C -> B (note that we don’t have a use_nl(b) hint in this example).

If you want a plan where the optimizer produces a nested loop join between a and b you’ll need to put in a leading() hint which places b immediately after a somewhere in the list of tables with just use_nl(b) being sufficient to enforce the join method. Here, for example, is the plan with hints /*+ leading(d a b c) use_nl(b) */ for my data set:


----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        | 20000 |  1347K| 30164   (1)| 00:00:02 |
|   1 |  HASH JOIN                    |        | 20000 |  1347K| 30164   (1)| 00:00:02 |
|   2 |   TABLE ACCESS FULL           | C      | 10000 |   146K|    26   (4)| 00:00:01 |
|   3 |   NESTED LOOPS                |        | 20000 |  1054K| 30137   (1)| 00:00:02 |
|   4 |    NESTED LOOPS               |        |  1000K|  1054K| 30137   (1)| 00:00:02 |
|   5 |     HASH JOIN                 |        | 10000 |   390K|    52   (4)| 00:00:01 |
|   6 |      TABLE ACCESS FULL        | D      |   100 |  1800 |    26   (4)| 00:00:01 |
|   7 |      TABLE ACCESS FULL        | A      | 10000 |   214K|    26   (4)| 00:00:01 |
|   8 |     INDEX RANGE SCAN          | B_I100 |   100 |       |     1   (0)| 00:00:01 |
|   9 |    TABLE ACCESS BY INDEX ROWID| B      |     2 |    28 |   101   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Notice, yet again, Oracle has done hash join to c with a swap_join_inputs().

Creation Script:


create table a
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4
)
select
	rownum				id,
	mod(rownum,5000)		n2,
	mod(rownum,100)			n100,
	lpad(rownum,10,'0')		v1,
	lpad('x',100,'x')		padding
from
        generator       v1
;

create table b nologging as select * from a;
create table c nologging as select * from a;
create table d nologging as select * from a;

alter table a add constraint a_pk primary key(id);
alter table b add constraint b_pk primary key(id);
alter table c add constraint c_pk primary key(id);
alter table d add constraint d_pk primary key(id);

create index a_i2 on a(n2) nologging;
create index b_i2 on b(n2) nologging;
create index c_i2 on c(n2) nologging;
create index d_i2 on d(n2) nologging;

create index a_i100 on a(n100) nologging;
create index b_i100 on b(n100) nologging;
create index c_i100 on c(n100) nologging;
create index d_i100 on d(n100) nologging;
begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'A',
		method_opt	 => 'for all columns size 1'
	);
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'B',
		method_opt	 => 'for all columns size 1'
	);
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'C',
		method_opt	 => 'for all columns size 1'
	);
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'D',
		method_opt	 => 'for all columns size 1'
	);
end;
/

4 Comments »

  1. Hi, Jonathan.

    Thanks for this article. It’s pretty illuminating.
    However, due to my experiences at my previous customer, one of the largest Dutch energy companies, I must admit I am not a big fan of the nested loop join.
    It involved a datawarehouse, totalling over 15 TB, on exadata (X2 initially, but the behaviour didn’t change on X5).
    Our ETL tool was Oracle Warehouse Builder and every night the DWH load involved some 1000 OWB mappings, usually doing something like:

    merge into table a
    using select col1...colx 
    from table 1
    join table 2
    left join table 3
    etc.
    

    We noticed that every now and then relatively simple mappings (select data from 2 or 3 tables and insert this into a truncated target table), would run for hours and then crash, because of the temp space being full (380 GB). We also noticed that, while still running, the session browser indicated it was doing nested loops.
    This crashing was strange behaviour, because the mapping only had to handle some 2 or 3 million records, never enough to generate 300+ GB of data…

    Once or twice it sufficed to refresh the statistics of the tables involved to get oracle to choose a hash join, finishing in 3 or 4 minutes tops.
    Other times we had to rewrite the code to force the optimizer to use a hash join.
    At moments like that I really wished there was a db parameter to disable nested loops altogether, because, in my experience, not talking about queries handling a couple of dozen records in sub-second range, but queries/DML involving 1 million + (up to 100 million) records in response times like between 1 minute and 1 hour, I really never saw any advantage of a nested loop join over a hash join.

    So, why did oracle sometimes seemed to lose its way? Because of the truncate/insert? Because the source ‘table’ was a view? Because of poorly kept statistics (It’s possible that the stats job just probed a target table at the exact instant it had been truncated and so produced a statistic of 0 rows present). Because of a bug? Because of exadata? Other?

    Kind regards,

    Rob Hendrikx

    Comment by Rob Hendrikx — February 22, 2017 @ 1:37 pm GMT Feb 22,2017 | Reply

  2. Rob,

    It’s always “horses for courses” – a general pattern of Hash Joins for DW and Nested Loops for OLTP isn’t surprising, of course, though required volume, data clustering, indexing, complexity of reporting etc. will inevitably result in some cases that obviously ought to diverge from the common pattern; and sometimes you can find that the overheads of parallel hash joins are such that a parallel nested loop can be engineered to be faster.

    From your thumbnail sketch I would suspect problems with statistics – coincidentally I’ve just handed back to the UKOUG editorial board the proof pages for an article of the commonest problems seen with statistics: queries moving out of range of current stats, instability of histograms, and effects of column groups being ignored are possibly three main reasons for massive changes in performance of a query for “no apparent reason”.

    The problem of only 2 or 3 million rows producing 300+GB of temp space when using nested loops is deceptive – inevitably. Where aggregates are involved the optimizer may have decided to do group by placement and aggregate late when (for example) you have a view that aggregates early. A small input with a small output could result in a massive intermediate aggregation task if the optimizer picks the wrong join order and group by placement.

    The first diagnostic check, in all cases, is to compare the row counts predicted by the optimizer with the estimates you’ve made for each step of the query.

    Comment by Jonathan Lewis — February 24, 2017 @ 11:11 am GMT Feb 24,2017 | Reply

  3. I have heard issues from some co-workers that USE_NL is not working correctly with Exadata. Any common experience on your end?

    Comment by Barry Chase — August 18, 2017 @ 8:10 pm GMT Aug 18,2017 | Reply

    • Barry,

      I think we need to hear an explanation of what “not working correctly” means – either it works or it doesn’t.

      Given that the comment is made with reference to Exadata, though, I’ll take a guess that their complaint is one of two things that I’ve mentioned in the past:

      a) the data doesn’t seem to appear in the right order (which is a recent performance enhancement)
      b) the nested loop join is much slower and more CPU intensive than expected

      Problem (b) relates to the code of select a single row by index from a compression unit, and is described in this blog note.

      Problem (a) is the side effect of the fix for problem (b) where Oracle does the index range scan on the second table then sorts by rowid before accessing the table to reduce the number of times it decompresses compression units. The topic gets a mention in my little history of the nested loop join.

      If your co-workers have found a new complaint I’d be interested in hearing a little more about exactly what it is.

      Comment by Jonathan Lewis — August 19, 2017 @ 3:20 pm GMT Aug 19,2017 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Powered by WordPress.com.