In response to a recent lamentation from Richard Foote about the degree of ignorance regarding the clustering_factor of indexes I commented on the similar degree of ignorance regarding the syntax of a specific hint, namely use_nl(a b), pointing out that this does not mean “do a nested loop from a to b”. My comment was underscored by a fairly prompt response asking what the hint did mean.
Surprisingly, although I’ve explained it many times over the last couple of decades (here’s one from 2006), I couldn’t find an explanation on my blog though I did find a blog note where I’d made a passing comment about the equivalent misunderstanding of the use_hash(a b) syntax.
The misunderstanding is not entirely surprising since for many years the Oracle manuals seemed to suggest (in their examples) that the hint did have a multi-table meaning and it wasn’t until 10g that the manual gave an explicit statement of the single-table nature of the hint. The hint /*+ use_nl(a b) */ is a short-hand for the pair of hints /*+ use_nl(a) use_nl(b) */ it doesn’t say anything about whether a and b should be joined, or in what order. If you want to guarantee that a and b will be joined in that order by a nested loop you will have to work a lot harder with your hints – and almost certainly need to make use of the /+ leading() */ hint.
Consider the following query (I’ll put the table creation code at the end of the article if you want to experiment):
select /*+ use_nl(a b) */ a.v1, b.v1, c.v1, d.v1 from a, b, c, d where d.n100 = 0 and a.n100 = d.id and b.n100 = a.n2 and c.id = a.id ;
Only one of the tables a and b can be the first table in the final execution plan so one of them will be “the next table in the join order” at some point, so this hint will guarantee that one of the tables will be the inner table of a nested loop join. Here’s the plan I happened to get with my data, indexing, version (11.2.0.4), etc.:
--------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20000 | 1347K| 30125 (1)| 00:00:02 | | 1 | HASH JOIN | | 20000 | 1347K| 30125 (1)| 00:00:02 | | 2 | TABLE ACCESS FULL | C | 10000 | 146K| 26 (4)| 00:00:01 | | 3 | HASH JOIN | | 20000 | 1054K| 30098 (1)| 00:00:02 | | 4 | TABLE ACCESS FULL | D | 100 | 1800 | 26 (4)| 00:00:01 | | 5 | NESTED LOOPS | | 20000 | 703K| 30072 (1)| 00:00:02 | | 6 | NESTED LOOPS | | 20000 | 703K| 30072 (1)| 00:00:02 | | 7 | TABLE ACCESS FULL | B | 10000 | 136K| 26 (4)| 00:00:01 | | 8 | INDEX RANGE SCAN | A_I2 | 2 | | 1 (0)| 00:00:01 | | 9 | TABLE ACCESS BY INDEX ROWID| A | 2 | 44 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------
In this case it’s table a that ends up in a position to be the inner table of a nested loop join.
You may be wondering why there seems to be a hash join into b when we’ve hinted a nested loop join – but the join order that Oracle is using is B -> A -> D -> C with a swap_join_inputs(c) swap_join_inputs(d), so b is never “the next table in the join order”.
If you want a plan that looks (at first sight) even more contrary, here’s the plan I got if I changed the single hint to be /*+ use_nl(a) */
----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20000 | 1347K| 105 (5)| 00:00:01 | | 1 | HASH JOIN | | 20000 | 1347K| 105 (5)| 00:00:01 | | 2 | TABLE ACCESS FULL | B | 10000 | 136K| 26 (4)| 00:00:01 | | 3 | HASH JOIN | | 10000 | 537K| 78 (4)| 00:00:01 | | 4 | TABLE ACCESS FULL | C | 10000 | 146K| 26 (4)| 00:00:01 | | 5 | HASH JOIN | | 10000 | 390K| 52 (4)| 00:00:01 | | 6 | TABLE ACCESS FULL| D | 100 | 1800 | 26 (4)| 00:00:01 | | 7 | TABLE ACCESS FULL| A | 10000 | 214K| 26 (4)| 00:00:01 | -----------------------------------------------------------------------------
This plan really looks as if Oracle should have done a nested loop into a but didn’t. Again appearances are deceptive thanks to the effects of the swap_join_inputs() option for hash joins: the join order here is A -> D -> C -> B (note that we don’t have a use_nl(b) hint in this example).
If you want a plan where the optimizer produces a nested loop join between a and b you’ll need to put in a leading() hint which places b immediately after a somewhere in the list of tables, and then the addition of just use_nl(b) would be sufficient to enforce the join method. Here, for example, is the plan with hints /*+ leading(d a b c) use_nl(b) */ for my data set:
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20000 | 1347K| 30164 (1)| 00:00:02 | | 1 | HASH JOIN | | 20000 | 1347K| 30164 (1)| 00:00:02 | | 2 | TABLE ACCESS FULL | C | 10000 | 146K| 26 (4)| 00:00:01 | | 3 | NESTED LOOPS | | 20000 | 1054K| 30137 (1)| 00:00:02 | | 4 | NESTED LOOPS | | 1000K| 1054K| 30137 (1)| 00:00:02 | | 5 | HASH JOIN | | 10000 | 390K| 52 (4)| 00:00:01 | | 6 | TABLE ACCESS FULL | D | 100 | 1800 | 26 (4)| 00:00:01 | | 7 | TABLE ACCESS FULL | A | 10000 | 214K| 26 (4)| 00:00:01 | | 8 | INDEX RANGE SCAN | B_I100 | 100 | | 1 (0)| 00:00:01 | | 9 | TABLE ACCESS BY INDEX ROWID| B | 2 | 28 | 101 (0)| 00:00:01 | ----------------------------------------------------------------------------------------
Notice, yet again, the optimizer has applied the swap_join_inputs() option to the hash join to table c.
Creation Script:
rem rem Script: use_nl_hint.sql rem Author: Jonathan Lewis rem Dated: Jan 2017 rem create table a nologging as with generator as ( select rownum id from dual connect by level <= 1e4 -- > comment to avoid wordpress format issue ) select rownum id, mod(rownum,5000) n2, mod(rownum,100) n100, lpad(rownum,10,'0') v1, lpad('x',100,'x') padding from generator v1 ; create table b nologging as select * from a; create table c nologging as select * from a; create table d nologging as select * from a; alter table a add constraint a_pk primary key(id); alter table b add constraint b_pk primary key(id); alter table c add constraint c_pk primary key(id); alter table d add constraint d_pk primary key(id); create index a_i2 on a(n2) nologging; create index b_i2 on b(n2) nologging; create index c_i2 on c(n2) nologging; create index d_i2 on d(n2) nologging; create index a_i100 on a(n100) nologging; create index b_i100 on b(n100) nologging; create index c_i100 on c(n100) nologging; create index d_i100 on d(n100) nologging; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'A', method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'B', method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'C', method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'D', method_opt => 'for all columns size 1' ); end; /
Hi, Jonathan.
Thanks for this article. It’s pretty illuminating.
However, due to my experiences at my previous customer, one of the largest Dutch energy companies, I must admit I am not a big fan of the nested loop join.
It involved a datawarehouse, totalling over 15 TB, on exadata (X2 initially, but the behaviour didn’t change on X5).
Our ETL tool was Oracle Warehouse Builder and every night the DWH load involved some 1000 OWB mappings, usually doing something like:
We noticed that every now and then relatively simple mappings (select data from 2 or 3 tables and insert this into a truncated target table), would run for hours and then crash, because of the temp space being full (380 GB). We also noticed that, while still running, the session browser indicated it was doing nested loops.
This crashing was strange behaviour, because the mapping only had to handle some 2 or 3 million records, never enough to generate 300+ GB of data…
Once or twice it sufficed to refresh the statistics of the tables involved to get oracle to choose a hash join, finishing in 3 or 4 minutes tops.
Other times we had to rewrite the code to force the optimizer to use a hash join.
At moments like that I really wished there was a db parameter to disable nested loops altogether, because, in my experience, not talking about queries handling a couple of dozen records in sub-second range, but queries/DML involving 1 million + (up to 100 million) records in response times like between 1 minute and 1 hour, I really never saw any advantage of a nested loop join over a hash join.
So, why did oracle sometimes seemed to lose its way? Because of the truncate/insert? Because the source ‘table’ was a view? Because of poorly kept statistics (It’s possible that the stats job just probed a target table at the exact instant it had been truncated and so produced a statistic of 0 rows present). Because of a bug? Because of exadata? Other?
Kind regards,
Rob Hendrikx
Comment by Rob Hendrikx — February 22, 2017 @ 1:37 pm GMT Feb 22,2017 |
Rob,
It’s always “horses for courses” – a general pattern of Hash Joins for DW and Nested Loops for OLTP isn’t surprising, of course, though required volume, data clustering, indexing, complexity of reporting etc. will inevitably result in some cases that obviously ought to diverge from the common pattern; and sometimes you can find that the overheads of parallel hash joins are such that a parallel nested loop can be engineered to be faster.
From your thumbnail sketch I would suspect problems with statistics – coincidentally I’ve just handed back to the UKOUG editorial board the proof pages for an article of the commonest problems seen with statistics: queries moving out of range of current stats, instability of histograms, and effects of column groups being ignored are possibly three main reasons for massive changes in performance of a query for “no apparent reason”.
The problem of only 2 or 3 million rows producing 300+GB of temp space when using nested loops is deceptive – inevitably. Where aggregates are involved the optimizer may have decided to do group by placement and aggregate late when (for example) you have a view that aggregates early. A small input with a small output could result in a massive intermediate aggregation task if the optimizer picks the wrong join order and group by placement.
The first diagnostic check, in all cases, is to compare the row counts predicted by the optimizer with the estimates you’ve made for each step of the query.
Comment by Jonathan Lewis — February 24, 2017 @ 11:11 am GMT Feb 24,2017 |
I have heard issues from some co-workers that USE_NL is not working correctly with Exadata. Any common experience on your end?
Comment by Barry Chase — August 18, 2017 @ 8:10 pm BST Aug 18,2017 |
Barry,
I think we need to hear an explanation of what “not working correctly” means – either it works or it doesn’t.
Given that the comment is made with reference to Exadata, though, I’ll take a guess that their complaint is one of two things that I’ve mentioned in the past:
a) the data doesn’t seem to appear in the right order (which is a recent performance enhancement)
b) the nested loop join is much slower and more CPU intensive than expected
Problem (b) relates to the code of select a single row by index from a compression unit, and is described in this blog note.
Problem (a) is the side effect of the fix for problem (b) where Oracle does the index range scan on the second table then sorts by rowid before accessing the table to reduce the number of times it decompresses compression units. The topic gets a mention in my little history of the nested loop join.
If your co-workers have found a new complaint I’d be interested in hearing a little more about exactly what it is.
Comment by Jonathan Lewis — August 19, 2017 @ 3:20 pm BST Aug 19,2017 |
[…] First, the developer who wrote the view definition doesn’t understand the use_nl() hint. […]
Pingback by Geek Stuff | Oracle Scratchpad — March 3, 2021 @ 11:05 pm GMT Mar 3,2021 |
[…] the incorrect use of the use_nl() hint in some sys-recursive SQL, linking forward to an explanation I wrote in 2017 of the use_nl() hint – particularly the interpretation of the form use_nl(a,b), which does […]
Pingback by use_nl redux | Oracle Scratchpad — October 15, 2021 @ 3:01 pm BST Oct 15,2021 |
[…] What does use_nl(a,b) mean (Jan 2017) – a common misconception explained; this hint does not mean “do a nested loop from a to b”. […]
Pingback by “Ignoring Hints” Catalogue | Oracle Scratchpad — February 21, 2022 @ 9:44 am GMT Feb 21,2022 |