Oracle Scratchpad

October 15, 2021

use_nl redux

Filed under: CBO,Execution plans,Hints,Ignoring Hints,Oracle — Jonathan Lewis @ 2:58 pm BST Oct 15,2021

A question has just appeared on a note I wrote in 2012 about the incorrect use of the use_nl() hint in some sys-recursive SQL, linking forward to an explanation I wrote in 2017 of the use_nl() hint – particularly the interpretation of the form use_nl(a,b), which does not mean “use a nested loop from table A to table B”.

The question essentially turns into – “does Oracle pick the join order before it looks at the hints”?

I’m going to look at one of the queries (based on the 2017 table creation code) that was supplied in the question and explain how Oracle gets to the plan it uses in my (21.3) system; here’s the query, followed by the plan:

select
        /*+ use_nl(b) */
        a.v1, b.v1, c.v1, d.v1
from
        a, b, c, d
where
        d.n100 = 0
and     a.n100 = d.id
and     b.n100 = a.n2
and     c.id   = a.id
/


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      | 20000 |  1347K|   105   (5)| 00:00:01 |
|*  1 |  HASH JOIN           |      | 20000 |  1347K|   105   (5)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | C    | 10000 |   146K|    26   (4)| 00:00:01 |
|*  3 |   HASH JOIN          |      | 20000 |  1054K|    78   (4)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL | D    |   100 |  1800 |    26   (4)| 00:00:01 |
|*  5 |    HASH JOIN         |      | 20000 |   703K|    52   (4)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| B    | 10000 |   136K|    26   (4)| 00:00:01 |
|   7 |     TABLE ACCESS FULL| A    | 10000 |   214K|    26   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      SWAP_JOIN_INPUTS(@"SEL$1" "C"@"SEL$1")
      SWAP_JOIN_INPUTS(@"SEL$1" "D"@"SEL$1")
      USE_HASH(@"SEL$1" "C"@"SEL$1")
      USE_HASH(@"SEL$1" "D"@"SEL$1")
      USE_HASH(@"SEL$1" "A"@"SEL$1")
      LEADING(@"SEL$1" "B"@"SEL$1" "A"@"SEL$1" "D"@"SEL$1" "C"@"SEL$1")
      FULL(@"SEL$1" "C"@"SEL$1")
      FULL(@"SEL$1" "D"@"SEL$1")
      FULL(@"SEL$1" "A"@"SEL$1")
      FULL(@"SEL$1" "B"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('21.1.0')
      OPTIMIZER_FEATURES_ENABLE('21.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("C"."ID"="A"."ID")
   3 - access("A"."N100"="D"."ID")
   4 - filter("D"."N100"=0)
   5 - access("B"."N100"="A"."N2")

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   6 -  SEL$1 / "B"@"SEL$1"
         U -  use_nl(b)

Note
-----
   - this is an adaptive plan

Points to note:

  • The Hint Report says the final plan did not use the use_nl(b) hint.
  • Whatever you may think the join order is by looking at the body of the plan, the leading() hint in the Outline Data tells us that the join order was (B A D C) – and that explains why the use_nl(b) hint could not be used, because B was never “the next table in the join order”.
  • The “visible” order of activity displayed in the plan is C D B A, but that’s because we swap_join_inputs(D) to put it above the (B,A) join, then swap_join_inputs(C) to put that above D.
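The effect of the two swaps can be sketched with a toy model. This is only an illustration, not how the optimizer represents plans: the functions build_plan() and display_order() are hypothetical names, and the model assumes every join is a hash join whose first child is the build input.

```python
# Toy model (an assumption for illustration): a hash join is a pair
# (first_child, second_child). By default the result so far is the first
# child; swap_join_inputs makes the newly joined table the first child.

def build_plan(leading, swapped):
    """Return a nested (first_child, second_child) tuple for a chain of hash joins."""
    plan = leading[0]
    for table in leading[1:]:
        if table in swapped:
            plan = (table, plan)   # swapped: new table goes above the result so far
        else:
            plan = (plan, table)   # default: result so far stays on top
    return plan

def display_order(plan):
    """Tables in the order they appear when reading down the plan body."""
    if isinstance(plan, str):
        return [plan]
    first, second = plan
    return display_order(first) + display_order(second)

# Join order from the leading() hint, swaps from the swap_join_inputs() hints.
plan = build_plan(["B", "A", "D", "C"], swapped={"D", "C"})
print(plan)                 # ('C', ('D', ('B', 'A')))
print(display_order(plan))  # ['C', 'D', 'B', 'A']
```

With no swaps the same function returns the tables in the leading() order, which is why the outline rather than the body of the plan is the place to read the join order.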

So did Oracle completely pre-empt any plans that allowed B to be “the next table”, thus avoiding the hint, or did it consider some plans where B wasn’t the first table in the join order, and if so would it have used a nested loop into B if that plan had had a low enough cost?

The only way to answer these questions is to look at the CBO (10053) trace file; and for very simple queries it’s often enough to pick out a few lines as a starting point – in my case using egrep:

egrep -e "^Join order" -e"Best so far" or21_ora_15956.trc

Join order[1]:  D[D]#0  A[A]#1  B[B]#2  C[C]#3
Best so far:  Table#: 0  cost: 25.752439  card: 100.000000  bytes: 1800.000000
Join order[2]:  D[D]#0  A[A]#1  C[C]#3  B[B]#2
Best so far:  Table#: 0  cost: 25.752439  card: 100.000000  bytes: 1800.000000
Join order[3]:  D[D]#0  B[B]#2  A[A]#1  C[C]#3
Best so far:  Table#: 0  cost: 25.752439  card: 100.000000  bytes: 1800.000000
Join order[4]:  D[D]#0  B[B]#2  C[C]#3  A[A]#1
Join order aborted2: cost > best plan cost
Join order[5]:  D[D]#0  C[C]#3  A[A]#1  B[B]#2
Join order aborted2: cost > best plan cost
Join order[6]:  D[D]#0  C[C]#3  B[B]#2  A[A]#1
Join order aborted2: cost > best plan cost

Join order[7]:  A[A]#1  D[D]#0  B[B]#2  C[C]#3
Join order aborted2: cost > best plan cost
Join order[8]:  A[A]#1  D[D]#0  C[C]#3  B[B]#2
Join order aborted2: cost > best plan cost
Join order[9]:  A[A]#1  B[B]#2  D[D]#0  C[C]#3
Join order aborted2: cost > best plan cost
Join order[10]:  A[A]#1  C[C]#3  D[D]#0  B[B]#2
Join order aborted2: cost > best plan cost
Join order[11]:  A[A]#1  C[C]#3  B[B]#2  D[D]#0
Join order aborted2: cost > best plan cost

Join order[12]:  B[B]#2  D[D]#0  A[A]#1  C[C]#3
Join order aborted2: cost > best plan cost
Join order[13]:  B[B]#2  A[A]#1  D[D]#0  C[C]#3
Best so far:  Table#: 2  cost: 25.692039  card: 10000.000000  bytes: 140000.000000
Join order[14]:  B[B]#2  A[A]#1  C[C]#3  D[D]#0
Join order aborted2: cost > best plan cost
Join order[15]:  B[B]#2  C[C]#3  D[D]#0  A[A]#1
Join order aborted2: cost > best plan cost

Join order[16]:  C[C]#3  D[D]#0  A[A]#1  B[B]#2
Join order aborted2: cost > best plan cost
Join order[17]:  C[C]#3  A[A]#1  D[D]#0  B[B]#2
Join order aborted2: cost > best plan cost
Join order[18]:  C[C]#3  A[A]#1  B[B]#2  D[D]#0
Join order aborted2: cost > best plan cost
Join order[19]:  C[C]#3  B[B]#2  D[D]#0  A[A]#1
Join order aborted2: cost > best plan cost

Oracle has considered 19 possible join orders (out of a maximum of 24 (= 4!)). In theory we should see 6 plans starting with each of the 4 tables. In fact we see that the optimizer’s first choice started with table D, producing 6 join orders, then switched to starting with table A, producing only 5 join orders.

The “missing” order is (A, B, C, D) which should have appeared between join orders 9 and 10. If we check the trace file in more detail we’ll see that the optimizer aborted after calculating the join from A to B because the cost had already exceeded the “Best so far” by then, so it didn’t carry on to calculate the cost of going on to D. Clearly, then, there was no point in considering any other order that started with (A, B), hence the absence of (A, B, C, D).
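If you want to check the counting, a few lines of Python will list the join orders that never appeared. This is just set arithmetic on the 19 join orders reported by egrep (typed in by hand as four-letter strings); it says nothing about why the optimizer skipped each one.

```python
from itertools import permutations

# The 19 join orders reported in the trace excerpt, in order of appearance.
considered = """
DABC DACB DBAC DBCA DCAB DCBA
ADBC ADCB ABDC ACDB ACBD
BDAC BADC BACD BCDA
CDAB CADB CABD CBDA
""".split()

# All 4! possible orders of the four tables.
all_orders = ["".join(p) for p in permutations("ABCD")]
skipped = [order for order in all_orders if order not in considered]

print(len(all_orders), len(considered))  # 24 19
print(skipped)  # ['ABCD', 'BCAD', 'BDCA', 'CBAD', 'CDBA']
```

The first entry is the (A, B, C, D) order discussed above; the other four are the orders missing from the groups starting with B and C.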

The join orders where the optimizer didn’t abort (1, 2, 3 and 13) are the ones followed by a “Best so far” line. The “Best so far” line that I have reported (for ease of searching and reporting) is misleading, though: it shows only the cost of the first table in the join order. This is what the 4 non-aborted summaries look like:

egrep -A+3 -e"Best so far" or21_ora_15956.trc

Best so far:  Table#: 0  cost: 25.752439  card: 100.000000  bytes: 1800.000000
              Table#: 1  cost: 51.767478  card: 10000.000000  bytes: 400000.000000
              Table#: 2  cost: 30137.036118  card: 20000.000000  bytes: 1080000.000000
              Table#: 3  cost: 30163.548157  card: 20000.000000  bytes: 1380000.000000
--
Best so far:  Table#: 0  cost: 25.752439  card: 100.000000  bytes: 1800.000000
              Table#: 1  cost: 51.767478  card: 10000.000000  bytes: 400000.000000
              Table#: 3  cost: 78.079517  card: 10000.000000  bytes: 550000.000000
              Table#: 2  cost: 30163.348157  card: 20000.000000  bytes: 1380000.000000
--
Best so far:  Table#: 0  cost: 25.752439  card: 100.000000  bytes: 1800.000000
              Table#: 2  cost: 2483.956340  card: 1000000.000000  bytes: 32000000.000000
              Table#: 1  cost: 2530.068379  card: 20000.000000  bytes: 1080000.000000
              Table#: 3  cost: 2556.580418  card: 20000.000000  bytes: 1380000.000000
--
Best so far:  Table#: 2  cost: 25.692039  card: 10000.000000  bytes: 140000.000000
              Table#: 1  cost: 52.204078  card: 20000.000000  bytes: 720000.000000
              Table#: 0  cost: 78.479517  card: 20000.000000  bytes: 1080000.000000
              Table#: 3  cost: 104.991556  card: 20000.000000  bytes: 1380000.000000

As you can see, when we start with (B A) the estimated cost drops dramatically.
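To make the comparison explicit, here is a small sketch that takes the cumulative costs from the four summaries and reports the last figure of each, which is the cost of the complete join order. The mapping from Table# values to table names (0 = D, 1 = A, 2 = B, 3 = C) comes from the “Join order” lines; the dictionary itself is just hand-copied data.

```python
# Cumulative costs copied from the four "Best so far" summaries,
# keyed by the join order each summary belongs to.
summaries = {
    "D A B C": [25.752439, 51.767478, 30137.036118, 30163.548157],
    "D A C B": [25.752439, 51.767478, 78.079517, 30163.348157],
    "D B A C": [25.752439, 2483.956340, 2530.068379, 2556.580418],
    "B A D C": [25.692039, 52.204078, 78.479517, 104.991556],
}

for order, costs in summaries.items():
    # The first figure is only the cost of the first table;
    # the last figure is the cost of the whole join order.
    print(f"{order}: first table {costs[0]:.2f}, whole join order {costs[-1]:.2f}")

best = min(summaries, key=lambda order: summaries[order][-1])
print(best)  # B A D C
```

The final cost for (B A D C) rounds to 105, which is the cost reported at operation 0 of the execution plan.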

Now that we’ve seen that Oracle looks at many (though not a completely exhaustive set of) plans on the way to the one it picks, the thing we need to do (in theory) is check that for every single calculation where B is “the next table”, Oracle obeys our hint. Each time the optimizer joins “the next” table it usually considers the cost of a Nested Loop, a Sort Merge, and a Hash Join in that order; if the optimizer is obeying the hint it will only consider the nested loop join. Here’s a suitable call to egrep with the first four join orders:

egrep -e "^Join order" -e "^Now joining" -e"^NL Join" -e"^SM Join" -e"^HA Join" or21_ora_15956.trc

Join order[1]:  D[D]#0  A[A]#1  B[B]#2  C[C]#3
Now joining: A[A]#1
NL Join
SM Join
SM Join (with index on outer)
HA Join
Now joining: B[B]#2
NL Join
Now joining: C[C]#3
NL Join
SM Join
HA Join

Join order[2]:  D[D]#0  A[A]#1  C[C]#3  B[B]#2
Now joining: C[C]#3
NL Join
SM Join
HA Join
Now joining: B[B]#2
NL Join

Join order[3]:  D[D]#0  B[B]#2  A[A]#1  C[C]#3
Now joining: B[B]#2
NL Join
Now joining: A[A]#1
NL Join
SM Join
HA Join
Now joining: C[C]#3
NL Join
SM Join
HA Join

Join order[4]:  D[D]#0  B[B]#2  C[C]#3  A[A]#1
Now joining: C[C]#3
NL Join
Join order aborted2: cost > best plan cost


As you can see, the only join considered when “Now joining” B is a nested loop join; for all other tables the three possible joins (and sometimes two variants of the Sort Merge join) are evaluated.
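For a longer trace the same check is easier to do mechanically. The following is a rough sketch that reads lines in the format produced by the egrep call and collects the join methods costed for each table; the sample text is copied from join orders 1 and 3, and the parsing assumes nothing beyond the line prefixes visible there.

```python
from collections import defaultdict

# A few lines copied from the egrep output (join orders 1 and 3).
trace = """\
Join order[1]:  D[D]#0  A[A]#1  B[B]#2  C[C]#3
Now joining: A[A]#1
NL Join
SM Join
SM Join (with index on outer)
HA Join
Now joining: B[B]#2
NL Join
Now joining: C[C]#3
NL Join
SM Join
HA Join
Join order[3]:  D[D]#0  B[B]#2  A[A]#1  C[C]#3
Now joining: B[B]#2
NL Join
Now joining: A[A]#1
NL Join
SM Join
HA Join
"""

methods = defaultdict(set)   # table alias -> join methods costed for it
current = None
for line in trace.splitlines():
    if line.startswith("Now joining:"):
        current = line.split()[2].split("[")[0]   # "A[A]#1" -> "A"
    elif line[:2] in ("NL", "SM", "HA") and current is not None:
        methods[current].add(line[:2])
    elif line.startswith("Join order"):
        current = None                            # new join order, no table yet

for table in sorted(methods):
    print(table, sorted(methods[table]))
# A ['HA', 'NL', 'SM']
# B ['NL']
# C ['HA', 'NL', 'SM']
```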

You may also notice another of the clever strategies the optimizer uses to minimise its workload. On the second join order the optimizer goes straight to “Now joining C” because it has remembered the result of joining A from the previous join order.

This is only a very simple example and analysis, but I hope it’s given you some idea of how the optimizer works, and how clever it tries to be about minimising the work; and how it can obey a hint while still producing an execution plan that appears to have ignored the hint.

October 7, 2021

Hints and Costs

Filed under: 12c,CBO,Conditional SQL,Execution plans,Oracle — Jonathan Lewis @ 12:06 pm BST Oct 7,2021

This note is one I drafted three years ago, based on a question from Oracle-L. It doesn’t directly address that question because at the time I was unable to create a data set that reproduced the problem; but it did highlight a detail that’s worth mentioning, so I’ve finally got around to completing it (and testing on a couple of newer versions of Oracle).

I’ll start with a model that was supposed to demonstrate the problem behind the question:


rem
rem     Script:         122_or_expand.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2018
rem     Purpose:        
rem
rem     Last tested
rem             21.3.0.0
rem             19.11.0.0
rem             12.2.0.1
rem

create table t1
segment creation immediate
pctfree 80 pctused 20
nologging
as
select
        *
from
        all_objects
where
        rownum <= 50000
;

alter table t1 add constraint t1_pk
        primary key(object_id)
        using index pctfree 80
;

variable b1 number
variable b2 number
variable b3 number
variable b4 number
variable b5 number
variable b6 number

exec :b1 := 100
exec :b2 := 120
exec :b3 := 1100
exec :b4 := 1220
exec :b5 := 3100
exec :b6 := 3320

set serveroutput off

select
        object_name
from 
        t1
where
        object_id between :b1 and :b2
or      object_id between :b3 and :b4
or      object_id between :b5 and :b6
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));


The critical feature of the query is the list of disjuncts (ORs) which all specify a range for object_id. The problem was that the query used a plan with an index full scan when there were no statistics on the table (or its indexes), but switched to a plan  that used index range scans when statistics were gathered – and the performance of the plan with the full scan was unacceptable.  (Clearly the “proper” solution is to have some suitable statistics in place – but sometimes such things are out of the control of the people who have to solve the problems.)

The /*+ index() */ and (undocumented) /*+ index_rs_asc() */ hints had no effect on the plan. The reason why the /*+ index() */ hint made no difference is that an index full scan is one of the ways in which the /*+ index() */ hint can be obeyed – the hint doesn’t instruct the optimizer to pick an index range scan. The hint /*+ index_rs_asc() */ specifically tells the optimizer to pick an index Range Scan ASCending if the hint has been specified correctly and the choice is available and legal. So why was the optimizer not doing as it was told? Without seeing the execution plan or CBO trace file from a live example I can’t guarantee that the following hypothesis is correct, but I think it’s in the right ball park.

I think the optimizer was probably using the (new to 12c) cost-based “OR expansion” transformation, which basically transformed the query into a UNION ALL of several index range scans – and that’s why its outline would show /*+ index_rs_asc() */ hints. The hint would only become valid after the transformation had taken place, so if Oracle didn’t consider (or considered and discarded) the transformation when there were no stats in place then the hint would have to be “Unused” (as the new 19c hint-report would say).

When I tried to model the problem the optimizer kept doing nice things with my data, so I wasn’t able to demonstrate the OP’s problem. However in one of my attempts to get a silly plan I did something silly – that can happen by accident if your client code isn’t careful! I’ll tell you what that was in a moment – first, a couple of plans.

As it stands, with the data and bind variables as shown, the optimizer used “b-tree / bitmap conversion” to produce an execution plan that did three separate index range scans, converted rowids to bits, OR-ed the bit-strings, then converted back to rowids before accessing the table:

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |       |       |    84 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   291 | 12804 |    84   (5)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS       |       |       |       |            |          |
|   3 |    BITMAP OR                        |       |       |       |            |          |
|   4 |     BITMAP CONVERSION FROM ROWIDS   |       |       |       |            |          |
|   5 |      SORT ORDER BY                  |       |       |       |            |          |
|*  6 |       INDEX RANGE SCAN              | T1_PK |       |       |     2   (0)| 00:00:01 |
|   7 |     BITMAP CONVERSION FROM ROWIDS   |       |       |       |            |          |
|   8 |      SORT ORDER BY                  |       |       |       |            |          |
|*  9 |       INDEX RANGE SCAN              | T1_PK |       |       |     2   (0)| 00:00:01 |
|  10 |     BITMAP CONVERSION FROM ROWIDS   |       |       |       |            |          |
|  11 |      SORT ORDER BY                  |       |       |       |            |          |
|* 12 |       INDEX RANGE SCAN              | T1_PK |       |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

So the first thing I had to do was disable this feature, which I did by adding the hint /*+ opt_param('_b_tree_bitmap_plans','false') */ to the query. This adjustment left Oracle doing the OR-expansion that I didn’t want to see:


----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |       |       |   297 (100)|          |
|   1 |  VIEW                                  | VW_ORE_BA8ECEFB |   288 | 19008 |   297   (1)| 00:00:01 |
|   2 |   UNION-ALL                            |                 |       |       |            |          |
|*  3 |    FILTER                              |                 |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1              |    18 |   792 |    20   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | T1_PK           |    18 |       |     2   (0)| 00:00:01 |
|*  6 |    FILTER                              |                 |       |       |            |          |
|   7 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1              |    97 |  4268 |   100   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN                  | T1_PK           |    97 |       |     2   (0)| 00:00:01 |
|*  9 |    FILTER                              |                 |       |       |            |          |
|  10 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1              |   173 |  7612 |   177   (1)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN                  | T1_PK           |   173 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

You’ll notice that the three range scans have different row estimates and costs – that’s the effect of bind variable peeking and my careful choice of bind variables to define different sized ranges. Take note, by the way, of the three filter predicates flagged at operations 3, 6, and 9. These are the “conditional plan” filters that say things like: “don’t run the sub-plan if the runtime value of :b5 is greater than :b6”.

Since I didn’t want to see OR-expansion just yet I then added the hint /*+ no_or_expand(@sel$1) */ to the query and that gave me a plan with a tablescan:

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   617 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |   291 | 12804 |   617   (4)| 00:00:01 |
--------------------------------------------------------------------------

This was a shame because I really wanted to see the optimizer produce an index full scan at this point – so I decided to add an “unnamed index” hint to the growing list of hints – specifically: /*+ index(@sel$1 t1@sel$1) */

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |       |       |   405 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   291 | 12804 |   405   (2)| 00:00:01 |
|*  2 |   INDEX FULL SCAN                   | T1_PK |   291 |       |   112   (7)| 00:00:01 |
---------------------------------------------------------------------------------------------

This, of course, is where things started to get a little interesting – the index full scan costs less than the tablescan but didn’t appear until hinted. But after a moment’s thought you can dismiss this one (possibly correctly) as an example of the optimizer being cautious about the cost of access paths that are dictated by bind variables or unpeekable inputs. (But these bind variables were peekable – so maybe there’s more to it than that – I was still trying to get to a point where my model would behave more like the OP’s, so I didn’t follow up on this detail: maybe in a couple of years’ time … ).

One last tweak – and that will bring me to the main point of this note. In my original code I was using three ranges dictated by 3 pairs of bind variables, for example [:b5, :b6]. What would happen if I made :b5 greater than :b6, say I swapped their values?

The original btree/bitmap plan didn’t change, but where I had simply blocked btree/bitmap plans and seen OR-expansion as a result the plan changed to a full tablescan (with the cost you saw above of 617). So I tried again, adding the hint /*+ or_expand(@sel$1) */ to see why; and this is the plan I got:

----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |       |       |   735 (100)|          |
|   1 |  VIEW                                  | VW_ORE_BA8ECEFB |   116 |  7656 |   735   (3)| 00:00:01 |
|   2 |   UNION-ALL                            |                 |       |       |            |          |
|*  3 |    FILTER                              |                 |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1              |    18 |   792 |    20   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | T1_PK           |    18 |       |     2   (0)| 00:00:01 |
|*  6 |    FILTER                              |                 |       |       |            |          |
|   7 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1              |    97 |  4268 |   100   (0)| 00:00:01 |
|*  8 |      INDEX RANGE SCAN                  | T1_PK           |    97 |       |     2   (0)| 00:00:01 |
|*  9 |    FILTER                              |                 |       |       |            |          |
|* 10 |     TABLE ACCESS FULL                  | T1              |     1 |    44 |   615   (4)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

I still get the same three branches in the expansion, but look what’s happened to the sub-plan for the third pair of bind variables. The optimizer still has the FILTER at operation 9 – and that will evaluate to FALSE for the currently peeked values; but the optimizer has decided that it should use a tablescan for this part of the query if it ever gets a pair of bind variables in the right order; and the cost of the tablescan has echoed up the plan to make the total cost of the plan 735, which is (for obvious reasons) higher than the cost of running the whole query as a single tablescan.
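The arithmetic is easy to check from the figures in the plans. This sketch simply adds up the costs reported for the three branches of the UNION ALL; the optimizer’s internal calculation works with unrounded values, so an exact match with the reported totals is not guaranteed in general, though it happens to hold for these plans.

```python
# Costs of the three UNION ALL branches, read from the two OR-expansion plans.
ordered_binds = [20, 100, 177]   # all three branches use index range scans
swapped_binds = [20, 100, 615]   # third branch falls back to a full tablescan
single_tablescan = 617           # cost of the plan hinted with no_or_expand

print(sum(ordered_binds))                     # 297
print(sum(swapped_binds))                     # 735
print(sum(swapped_binds) > single_tablescan)  # True
```

So with the reversed bind values the unhinted optimizer has good reason (by its own arithmetic) to prefer the single tablescan at 617 over the expanded plan at 735.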

The same anomaly appears in 19.11.0.0 and 21.3.0.0. On the plus side, it’s possible that if you have code like this the optimizer will be using the btree/bitmap conversion anyway.

tl;dr

As a generic point it’s worth ensuring that if you’re using bind variables in client code to define ranges then you’ve got to get the values in the right order, otherwise one day the answer to the question “nothing changed, why is the query running so slowly?” will be “someone got in first with the bound values the wrong way round”.
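One way to protect yourself is a small guard in the client code before the values are bound. This is only a sketch: checked_range() is a hypothetical helper, and raising an error (rather than silently swapping the values, which would change the result of the query) is my assumption about the safer behaviour.

```python
def checked_range(low, high):
    """Return (low, high) ready for binding, refusing reversed bounds."""
    if low > high:
        raise ValueError(f"range bounds reversed: {low} > {high}")
    return low, high

# The third range from the test script, in the right order.
b5, b6 = checked_range(3100, 3320)
print(b5, b6)  # 3100 3320

# The reversed pair is rejected before it can reach the optimizer.
try:
    checked_range(3320, 3100)
except ValueError as err:
    print(err)  # range bounds reversed: 3320 > 3100
```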

DB Optimizer

Filed under: CBO,Oracle — Jonathan Lewis @ 8:56 am BST Oct 7,2021

I’ve rediscovered this list of articles by Stefan Koehler on my laptop; it has a bias towards SAP users, and it’s several years old judging by the dates so some details will have changed, but there still seem to be plenty of people using 12c (and SAP) so I thought I’d tidy it up and publish it.

September 20, 2021

Optimizer Tip

Filed under: CBO,extended stats,Indexing,Oracle,Statistics — Jonathan Lewis @ 9:04 am BST Sep 20,2021

This is a note I drafted in March 2016, starting with a comment that it had been about that time the previous year that I had written:

I’ve just responded to the call for items for the “IOUG Quick Tips” booklet for 2015 – so it’s probably about time to post the quick tip that I put into the 2014 issue. It’s probably nothing new to most readers of the blog, but sometimes an old thing presented in a new way offers fresh insights or better comprehension.

I keep finding ancient drafts like this (there are still more than 730 drafts on my blog at present – which means one per day for the next 2 years!) and if they still seem relevant – even if they are a little behind the times – I’ve taken to dusting them down and publishing.

With the passing of time, though, new information becomes available, algorithms change, and (occasionally) I discover I’ve made a significant error in my inferences. In this case  there are a couple of important additions that I’ve added to the end of the note.

Optimizer Tips (IOUG Quick Tips 2015)

There are two very common reasons why the optimizer picks a bad execution plan. The first is that its estimate of the required data volume is bad, the second is that it has a misleading impression of how scattered that data is.

The first issue is often due to problems with the selectivity of complex predicates, the second to unsuitable values for the clustering_factor of potentially useful indexes. Recent [ed: i.e. pre-2015] versions of the Oracle software have given us features that try to address both these issues, and I’m going to comment on them in the following note.

As always, any change can have side effects – introducing a new feature might have no effect on 99% of what we do, a beneficial effect on 99% of the remainder, and a hideous effect on the 1% of 1% that’s left, so I will be commenting on both the pros and cons of both features.

Column Group Stats

The optimizer assumes that the data in two different columns of a single table are independent – for example the registration number on your car (probably) has nothing to do with the account number of your bank account. So when we execute queries like:

     colX = 'abcd'
and  colY = 'wxyz'

the optimizer’s calculations will be something like:

“one row in 5,000 is likely to have colX = ‘abcd’ and one row in 2,000 is likely to have colY = ‘wxyz’, so the combination will probably appear in roughly one row in ten million”.
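The calculation is nothing more than multiplying the two selectivities. Here is the arithmetic spelled out, using exact fractions; the table size of 50 million rows is a made-up figure for illustration only.

```python
from fractions import Fraction

sel_colx = Fraction(1, 5000)   # one row in 5,000 has colX = 'abcd'
sel_coly = Fraction(1, 2000)   # one row in 2,000 has colY = 'wxyz'

# Independence assumption: the combined selectivity is the product.
combined = sel_colx * sel_coly
print(combined)  # 1/10000000

rows = 50_000_000              # hypothetical table size
print(rows * combined)         # 5
```

If the two columns are actually correlated the true row count can be far higher than this estimate, which is the problem that column group statistics address.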

On the other hand we often find tables that do things like storing post codes (zipcodes) in one column and city names in another, and there’s a strong correlation between post codes and city – for example the district code (first part of the post code) “OX1” will be in the city of Oxford (Oxfordshire, UK). So if we query a table of addresses for rows where:

     district_code = 'OX1'
and  city          = 'Oxford'

there’s a degree of redundancy, but the optimizer will multiply the total number of distinct district codes in the UK by the total number of distinct city names in the UK as it tries to work out the number of addresses that match the combined predicate and will come up with a result that is far too small.

In cases like this we can define “column group” statistics about combinations of columns that we query together, using the function dbms_stats.create_extended_stats(). This function will create a type of virtual column for a table and report the system-generated name back to us, and we will be able to see that name in the view user_tab_cols, and the definition in the view user_stat_extensions. If we define a column group in this way we then need to gather stats on it, which we can do in one of two ways, either by using the generated name or by using the expression that created it.


SQL> create table addresses (district_code varchar2(8), city varchar2(40));

Table created.

SQL> execute dbms_output.put_line( -
>        dbms_stats.create_extended_stats( -
>            user,'addresses','(district_code, city)'))

SYS_STU12RZM_07240SN3V2667EQLW

PL/SQL procedure successfully completed.

begin
        dbms_stats.gather_table_stats(
                user, 'addresses',
                method_opt => 'for columns SYS_STU12RZM_07240SN3V2667EQLW size 1'
        );
        dbms_stats.gather_table_stats(
                user, 'addresses',
                method_opt => 'for columns (district_code, city) size 1'
        );
end;
/

I’ve included both options in the anonymous pl/sql block, but you only need one of them. In fact if you use the second one without calling create_extended_stats() first Oracle will create the column group implicitly, but you won’t know what it’s called until you query user_stat_extensions.

I’ve limited the stats collection to basic stats with the “size 1” option. You can collect a histogram on a column group but since the optimizer can only use a column group with equality predicates you should only create a histogram in the special cases where you know that you’re going to get a frequency histogram or “Top-N” histogram.

You can also define extended stats on expressions (e.g. trunc(delivery_date) – trunc(collection_date)) rather than column groups, but since you’re only allowed 20 column groups per table [see update 1] it would be better to use virtual columns for expressions since you can have as many virtual columns as you like on a table provided the total column count stays below the limit of 1,000 columns per table.

Warnings

  • Column group statistics are only used for equality expressions. [see also update 2]
  • Column group statistics will not be used if you’ve created a histogram on any of the underlying columns unless there’s also a histogram on the column group itself.
  • Column group statistics will not be used if you query any of the underlying columns with an “out of range” value. This, perhaps, is the biggest instability threat with column groups. As time passes and new data appears you may find people querying the new data. If you haven’t kept the column stats up to date then plans can change dramatically as the optimizer switches from using column group stats to multiplying the selectivities of underlying columns.
  • The final warning arrives with 12c. If you have all the adaptive optimizer options enabled the optimizer will keep a look out for tables that it thinks could do with column group stats, and automatically create them the next time you gather stats on the table. In principle this shouldn’t be a problem – the optimizer should only do this when it has seen that column group stats should improve performance – but you might want to monitor your system for the arrival of new automatic columns.

Preference: table_cached_blocks

Even when the cardinality estimates are correct we may find that we get an inefficient execution plan because the optimizer doesn’t want to use an index that we think would be a really good choice. A common reason for this failure is that the clustering_factor on the index is unrealistically large.

The clustering_factor of an index is a measure of how randomly you will jump around the table as you do an index range scan through the index – and the algorithm Oracle uses to calculate this number has a serious flaw in it: it can’t tell the difference between a little bit of localised jumping and constant random leaps across the entire width of the table.

To calculate the clustering_factor Oracle basically walks the index in order using the rowid at the end of each index entry to check which table block it has to visit, and every time it has to visit a “new” table block it increments a counter. The trouble with this approach is that, by default, it doesn’t remember its recent history so, for example, it can’t tell the difference in quality between the following two sequences of table block visits:

Block 612, block 87, block 154, block 3,  block 1386, block 834, block 237
Block 98,  block 99, block 98,  block 99, block 98,   block 99,  block 98

In both cases Oracle would say that it had visited seven different blocks and the data was badly scattered. This has always been a problem, but it became much more of a problem when Oracle introduced ASSM (automatic segment space management). The point of ASSM is to ensure that concurrent inserts from different sessions tend to use different table blocks, the aim being to reduce contention due to buffer busy waits. As we’ve just seen, though, the clustering_factor doesn’t differentiate between “a little bit of scatter” and “a totally random disaster area”.

Oracle finally addressed this problem by introducing a “table preference” which allows you to tell it to “remember history” when calculating the clustering_factor. So, for example, a call like this:

execute dbms_stats.set_table_prefs(user,'t1','table_cached_blocks',16)

would tell Oracle that the next time you collect statistics on any indexes on table t1 the code to calculate the clustering_factor should remember the last 16 table blocks it had “visited” and not increment the counter if the “next” block was already in that list.

If you look at the two samples above, this means the counter for the first list of blocks would reach 7 while the counter for the second list would only reach 2. Suddenly the optimizer will be able to tell the difference between data that is “locally” scattered and data that really is randomly scattered. You and the optimizer may finally agree on what constitutes a good index.
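The effect of the history is easy to simulate. The following anonymous PL/SQL block is only a sketch of the idea as described above, not Oracle's actual code: the local function cf() and its parameter names are my own invention, and I've assumed that the "history" is a simple first-in, first-out list of distinct block numbers. It runs the two sample sequences through the counter with a history of 1 (the default behaviour) and with a history of 16.

```sql
set serveroutput on

declare
        type block_list is table of pls_integer;

        -- the two sample sequences of table block visits from the text
        scattered  block_list := block_list(612, 87, 154, 3, 1386, 834, 237);
        localised  block_list := block_list(98, 99, 98, 99, 98, 99, 98);

        -- hypothetical helper: count "new" block visits while remembering
        -- the last p_history distinct blocks (assumed FIFO replacement)
        function cf(p_blocks in block_list, p_history in pls_integer)
        return pls_integer
        is
                recent   block_list  := block_list();   -- oldest entry first
                counter  pls_integer := 0;
                seen     boolean;
        begin
                for i in 1 .. p_blocks.count loop
                        seen := false;
                        for j in 1 .. recent.count loop
                                if recent(j) = p_blocks(i) then
                                        seen := true;
                                        exit;
                                end if;
                        end loop;

                        if not seen then
                                counter := counter + 1;
                                if recent.count = p_history then
                                        -- list is full: forget the oldest block
                                        for j in 2 .. recent.count loop
                                                recent(j - 1) := recent(j);
                                        end loop;
                                else
                                        recent.extend;
                                end if;
                                recent(recent.count) := p_blocks(i);
                        end if;
                end loop;
                return counter;
        end cf;
begin
        dbms_output.put_line('history  1: ' || cf(scattered, 1)  || ' / ' || cf(localised, 1));
        dbms_output.put_line('history 16: ' || cf(scattered, 16) || ' / ' || cf(localised, 16));
end;
/
```

With a history of 1 both lists should report 7; with a history of 16 the scattered list should still report 7 while the localised list drops to 2, which are the figures quoted above.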

It’s hard to say whether there’s a proper “default” value for this preference. If you’re using ASSM (and there can’t be many people left who aren’t) then the obvious choice for the parameter would be 16 since ASSM tends to format 16 consecutive blocks at a time when a segment needs to make more space available for users [but see Update 3]. However, if you know that the real level of insert concurrency on a table is higher than 16 then you might be better off setting the value to match the known level of concurrency.

Are there any special risks to setting this preference to a value like 16? I don’t think so; it’s going to result in plans changing, of course, but indexes which should have a large clustering_factor should still end up with a large clustering_factor after setting the preference and gathering of statistics; the indexes that ought to have a low clustering_factor are the ones most likely to change, and change in the right direction.
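If you want to see which indexes change, and by how much, one option is to take a snapshot of the clustering_factor values before setting the preference and compare after gathering stats. This is a minimal sketch: the table name t1 follows the earlier example, cf_before is a hypothetical scratch table, and I'm assuming a gather with cascade is how you would refresh the index stats.

```sql
-- Snapshot the current clustering_factor for every index on t1
-- (cf_before is a hypothetical scratch table)
create table cf_before as
select  index_name, clustering_factor
from    user_indexes
where   table_name = 'T1'
;

execute dbms_stats.set_table_prefs(user, 't1', 'table_cached_blocks', 16)
execute dbms_stats.gather_table_stats(user, 't1', cascade => true)

-- Compare old and new values against the size of the table
select
        ui.index_name,
        cb.clustering_factor    old_cf,
        ui.clustering_factor    new_cf,
        ut.blocks               table_blocks,
        ut.num_rows             table_rows
from
        user_indexes    ui,
        cf_before       cb,
        user_tables     ut
where
        ui.table_name = 'T1'
and     cb.index_name = ui.index_name
and     ut.table_name = ui.table_name
order by
        ui.index_name
;
```

An index whose new_cf stays close to table_rows really is badly scattered; one that drops towards table_blocks was only locally scattered.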

Footnote: “Danger, Will Robinson”.

I’ve highlighted two features that are incredibly useful as tools to give the optimizer better information about your data and allow it to get better execution plans with less manual intervention. The usual warning applies, though: “if you want to get there, you don’t want to start from here”. When you manipulate the information the optimizer is using it will give you some new plans; better information will normally result in better plans but it is almost inevitable that some of your current queries are running efficiently “by accident” (possibly because of bugs) and the new code paths will result in some plans changing for the worse.

Clearly it is necessary to do some thorough testing but fortunately both features are incremental and any changes can be backed out very quickly and easily. We can change the “table_cached_blocks” one table at a time (or even, with a little manual intervention, one index at a time) and watch the effects; we can add column groups one at a time and watch for side effects. All it takes to back out of a change is a call to gather index stats, or a call to drop extended stats. It’s never nice to live through change – especially change that can have a dramatic impact – but if we find after going to production that we missed a problem with our testing we can reverse the changes very quickly.
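As a sketch of what backing out might look like (the table t1, the index t1_i1 and the column group (n1, n2) are hypothetical names used only for illustration):

```sql
-- Revert table_cached_blocks to its default for table t1,
-- then regather the index stats so the clustering_factor is recalculated
execute dbms_stats.delete_table_prefs(user, 't1', 'table_cached_blocks')
execute dbms_stats.gather_index_stats(user, 't1_i1')

-- Drop a column group that turned out to be a mistake
execute dbms_stats.drop_extended_stats(user, 't1', '(n1, n2)')
```

Any cursors that were optimized using the old statistics still have to be invalidated and re-optimized before the effect shows up in the plans.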

Updates

Update 1 – 20 sets of extended stats. In fact the limit is the larger of 20 and ceiling(column count/10), and the way the arithmetic is applied is a little odd so there are ways to hack around the limit.

Update 2 – Column groups and equality. It’s worth a special mention that the predicate colX is null is not an equality predicate, so column group stats will not apply; but there can be unexpected side effects even for cases where you don’t use this “is null” predicate.

Update 3 – table_cached_blocks = 16. This suggestion doesn’t allow for systems running RAC.

August 25, 2021

qbregistry 2

Filed under: CBO,dbms_xplan,Oracle,Transformations — Jonathan Lewis @ 1:45 pm BST Aug 25,2021

Following a question (very similar to one I had been asking myself) that appeared on twitter in response to my original posting on the new qbregistry format option in the dbms_xplan package, I’ve drafted a note of how I interpreted the execution plan so that I could more clearly see how my visualisation of the transformation maps (or fails to map) to the Query Block Registry.

I can’t guarantee the correctness of the description I’ve given here, but it’s probably fairly accurate.

Original Query (hiding the unnest and no_semijoin hints)

select  
        /* sel$1 */
        * 
from    t1 
where   t1.owner = 'OUTLN' 
and     object_name in (
                select  /* sel$2 */
                        distinct t2.object_name 
                from   t2 
                where  t2.object_type = 'TABLE'
        )
;

Transformation 1: unnest the subquery

This is visible in the Outline Information of the execution plan as the hint UNNEST(@"SEL$2" UNNEST_INNERJ_DISTINCT_VIEW) and produces two new query blocks, the query block that “is” the unnested subquery and the query block that joins t1 to the unnested subquery vw_nso_1.

select  
        /* SEL$5DA710D3 */
        t1.* 
from
        t1,
        (
        select  /* SEL$683B0107 */
                distinct
                t2.object_name 
        from    t2 
        where   t2.object_type = 'TABLE'
        )       vw_nso_1
where
        t1.owner = 'OUTLN' 
and     vw_nso_1.object_name = t1.object_name



Transformation 2: view merge (join then aggregate)

This is visible in the Outline Information of the execution plan in the hint MERGE(@"SEL$683B0107" >"SEL$5DA710D3"). I think this produces three new query blocks: the block that “is” the merged view, a block that selects (projects) from the merged view, and the query block that the main query now becomes.

We will pretend that t1 has only 4 columns, owner, object_name, object_type, object_id.

select
        /* SEL$B186933D */
        vm_nwvw_2.owner,
        vm_nwvw_2.object_name,
        vm_nwvw_2.object_type,
        vm_nwvw_2.object_id
from    (
        select  /* SEL$2F1334C4 */
                -- no distinct, and t2.object_name and t1.rowid eliminated
                t1.owner,
                t1.object_name,
                t1.object_type,
                t1.object_id
        from    (
                select  /* SEL$88A77D12 */
                        distinct
                        t1.rowid,        -- ensures we don't duplicate t1 rows
                        t1.owner,
                        t1.object_name,
                        t2.object_name,  -- seems redundant, but is in the trace file
                        t1.object_type,
                        t1.object_id
                from
                        t1,
                        t2
                where
                        t2.object_type = 'TABLE'
                and     t1.owner = 'OUTLN'
                and     t1.object_name = t2.object_name
                ) 
        ) vm_nwvw_2
;

Transformation 3: aggregate into partial join

I realised only as I was writing this note that I had completely forgotten that the plan reported a semi join even though the subquery had been hinted with a no_semijoin hint, and that the reported semi join was actually a partial join.

However, the query block registry is identical with or without a partial join (controlled by the hint [no]partial_join) so there doesn’t seem to be a transformation stage corresponding to the choice of strategy. Maybe the apparently redundant query block layer allows the variation to appear if required.

It’s Difficult

A problem I have with the query block registry is deciding what it’s telling us – and maybe the trace file and the execution plan are not trying to tell us exactly the same thing. I think it’s quite difficult, anyway, to find a good way of presenting the information that is completely informative but clear and uncluttered.

Something that may help, when you can check the trace file and the final execution plan, is the order in which query blocks are registered. Some of them may be discarded, of course, as the optimizer works through options, some of them may be marked as COPY, but if you ignore those you may be able to see from what’s left the evolution of the plan. Here, for example, is the extract of the lines where each query block is registered, taken from the CBO trace for this query, with numbering:

32:Registered qb: SEL$1 0xc26e28e8 (PARSER)
36:  signature (): qb_name=SEL$1 nbfros=1 flg=0
37:    fro(0): flg=4 objn=76167 hint_alias="T1"@"SEL$1"

39:Registered qb: SEL$2 0xc26e0d58 (PARSER)
43:  signature (): qb_name=SEL$2 nbfros=1 flg=0
44:    fro(0): flg=4 objn=76168 hint_alias="T2"@"SEL$2"

966:Registered qb: SEL$683B0107 0xbcc187c8 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
970:  signature (): qb_name=SEL$683B0107 nbfros=1 flg=0
971:    fro(0): flg=0 objn=76168 hint_alias="T2"@"SEL$2"

973:Registered qb: SEL$5DA710D3 0xbb1fcb10 (SUBQUERY UNNEST SEL$1; SEL$2;)
977:  signature (): qb_name=SEL$5DA710D3 nbfros=2 flg=0
978:    fro(0): flg=0 objn=76167 hint_alias="T1"@"SEL$1"
979:    fro(1): flg=5 objn=0 hint_alias="VW_NSO_1"@"SEL$5DA710D3"

1471:Registered qb: SEL$2F1334C4 0xbcf4b210 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3)
1475:  signature (): qb_name=SEL$2F1334C4 nbfros=1 flg=0
1476:    fro(0): flg=5 objn=0 hint_alias="VM_NWVW_2"@"SEL$2F1334C4"

1478:Registered qb: SEL$88A77D12 0xbcda5540 (PROJECTION VIEW FOR CVM SEL$683B0107)
1482:  signature (): qb_name=SEL$88A77D12 nbfros=2 flg=0
1483:    fro(0): flg=0 objn=76167 hint_alias="T1"@"SEL$1"
1484:    fro(1): flg=1 objn=0 hint_alias="VW_NSO_1"@"SEL$5DA710D3"

1489:Registered qb: SEL$B186933D 0xbcda5540 (VIEW MERGE SEL$88A77D12; SEL$683B0107; SEL$5DA710D3)
1493:  signature (): qb_name=SEL$B186933D nbfros=2 flg=0
1494:    fro(0): flg=0 objn=76167 hint_alias="T1"@"SEL$1"
1495:    fro(1): flg=0 objn=76168 hint_alias="T2"@"SEL$2"

Because it’s a very simple query you can almost see the “thinking” in the clumping of the line numbers.

  • The first two registrations are the original query blocks.
  • After a break the next pair is the t2 subquery being unnested and the query which is the join between t1 and the unnested t2.
  • After another break we see, in rapid succession, the view using the merged join view, the projection view using that merged join view, then the query block selecting from that projection.

The thing I find difficult to keep clear in my mind (when trying to describe what the trace/registry is saying, that is, not when just reading the plan) is the “doubling” effect where transformation steps often seem to produce two query blocks, one for the inline view containing the transformed object and one for the query that is now using the transformed object; and a further source of confusion appears when a query block seems to be able to peek into an inner query block to reference the objects in it. I just keep losing track of the layers!

It’s probably as safe as it’s going to be to read this note (unless someone points out an error). I don’t think there’s any more that I could find to say about the example.

August 24, 2021

qbregistry

Filed under: CBO,dbms_xplan,Oracle,Transformations — Jonathan Lewis @ 11:54 am BST Aug 24,2021

If you look at the “Outline Information” from an execution plan it shows you a list of hints that will (in theory, at least) recreate the execution plan and it’s this information that gets stored as the “injection” part of an SQL Plan Baseline. Unfortunately the hints won’t necessarily allow you to infer what transformations the optimizer has used to get to the final execution plan.

If you’re prepared to generate the CBO trace file you could examine the Query Block Registry that appears near the end of the trace file to get some clues – here’s an example from 19.11.0.0 for a simple query involving a single table plus an IN subquery:

Query Block Registry:
SEL$2 0x6d47cde8 (PARSER)
  SEL$5DA710D3 0x6d480e60 (SUBQUERY UNNEST SEL$1; SEL$2;)
    SEL$2F1334C4 0x6d480e60 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3) [FINAL]
  SEL$683B0107 0x6d47cde8 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
    SEL$B186933D 0x6d48e3a8 (VIEW MERGE SEL$88A77D12; SEL$683B0107; SEL$5DA710D3) [FINAL]
    SEL$88A77D12 0x6d48e3a8 (PROJECTION VIEW FOR CVM SEL$683B0107)
      SEL$B186933D 0x6d48e3a8 (VIEW MERGE SEL$88A77D12; SEL$683B0107; SEL$5DA710D3) [FINAL]
SEL$1 0x6d480e60 (PARSER)
  SEL$5DA710D3 0x6d480e60 (SUBQUERY UNNEST SEL$1; SEL$2;)

I’m not going to say anything about interpreting this extract because I want to highlight a recent feature of the dbms_xplan package (brought to my attention by Franck Pachot some time ago). One of the format options for displaying execution plans will report the query block registry. Here’s the output from display_cursor(format=>’qbregistry’) in 21.3.0.0 for the query that produced the CBO trace extract above:

Query Block Registry:
---------------------
  SEL$1 (PARSER)
    SEL$5DA710D3 (SUBQUERY UNNEST SEL$1 ; SEL$2)
      SEL$2F1334C4 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3) [FINAL]
  SEL$2 (PARSER)
    SEL$683B0107 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
      SEL$88A77D12 (PROJECTION VIEW FOR CVM SEL$683B0107)
        SEL$B186933D (VIEW MERGE SEL$88A77D12 ; SEL$683B0107) [FINAL]

Two things to notice here – first, that the output has reduced the 9 lines to 7 lines (which can only be helpful); secondly, that the redundant memory addresses which appear in the trace file don’t get copied into the report.
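For reference, a call along the following lines is the sort of thing that produces the report from SQL*Plus. It's a sketch that assumes you have just run the statement of interest in the same session, so that display_cursor() with no sql_id picks up the last statement executed (which is why serveroutput has to be off).

```sql
set serveroutput off

-- run the statement of interest here, then:

select  *
from    table(dbms_xplan.display_cursor(format => 'qbregistry'))
;
```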

I’m still not going to say anything about interpreting the output because I want to show you the display_cursor() output for the same query when executed in 19.11.0.0. It looks like this:

Query Block Registry:
---------------------

  <q o="13"><n><![CDATA[SEL$88A77D12]]></n><p><![CDATA[SEL$683B0107]]></p><
        f><h><t><![CDATA[T1]]></t><s><![CDATA[SEL$1]]></s></h><h><t><![CDATA[VW_N
        SO_1]]></t><s><![CDATA[SEL$5DA710D3]]></s></h></f></q>
  <q o="12"><n><![CDATA[SEL$683B0107]]></n><p><![CDATA[SEL$2]]></p><f><h><t
        ><![CDATA[T2]]></t><s><![CDATA[SEL$2]]></s></h></f></q>
  <q o="2"><n><![CDATA[SEL$1]]></n><f><h><t><![CDATA[T1]]></t><s><![CDATA[S
        EL$1]]></s></h></f></q>
  <q o="2"><n><![CDATA[SEL$2]]></n><f><h><t><![CDATA[T2]]></t><s><![CDATA[S
        EL$2]]></s></h></f></q>
  <q o="18" f="y" h="y"><n><![CDATA[SEL$B186933D]]></n><p><![CDATA[SEL$88A7
        7D12]]></p><i><o><t>VW</t><v><![CDATA[SEL$683B0107]]></v></o></i><f><h><t
        ><![CDATA[T1]]></t><s><![CDATA[SEL$1]]></s></h><h><t><![CDATA[T2]]></t><s
        ><![CDATA[SEL$2]]></s></h></f></q>
  <q o="19" h="y"><n><![CDATA[SEL$5DA710D3]]></n><p><![CDATA[SEL$1]]></p><i
        ><o><t>SQ</t><v><![CDATA[SEL$2]]></v></o></i><f><h><t><![CDATA[T1]]></t><
        s><![CDATA[SEL$1]]></s></h><h><t><![CDATA[VW_NSO_1]]></t><s><![CDATA[SEL$
        5DA710D3]]></s></h></f></q>
  <q o="15" f="y"><n><![CDATA[SEL$2F1334C4]]></n><p><![CDATA[SEL$5DA710D3]]
        ></p><f><h><t><![CDATA[VM_NWVW_2]]></t><s><![CDATA[SEL$2F1334C4]]></s></h
        ></f></q>

Yes, it’s naked XML (extracted from the v$sql_plan.other_xml column for operation 1).

I had been living in hope that someone else would write a messy bit of SQL to translate this into something readable – but the last time I searched the Internet for “other_xml qbregistry” I got the magical result of a Googlewhack (i.e. only one hit), which was in Russian, and largely a short description of all the options for the format command.

Since I’ve just installed 21.3 on a VM I decided to bite the bullet but I’ve taken the short-cut to writing the code. I’ve run a trace on a call to dbms_xplan.display_cursor() and extracted the critical query from the resulting trace file. Then I spent 30 minutes making it readable, hacking it to make it almost workable on 19c, then finding out why it can’t work without a little extra effort. Here’s the resulting hack:

rem
rem     Script:         qbregistry_query.sql
rem     Author:         Oracle Corp / Jonathan Lewis
rem     Dated:          Aug 2021
rem
rem     Last tested 
rem             19.11.0.0
rem

define m_sql_id='232sya6twg7sq'
define m_origin = 2

with 
xml as (
        select  other_xml
        from    V$sql_plan 
        where   sql_id = '&m_sql_id' 
        and     id = 1
        and     other_xml is not null
),
allqbs as ( 
        select 
                extractvalue(d.column_value, '/q/n') qbname, 
                extractvalue(d.column_value, '/q/@f') final, 
                extractvalue(d.column_value, '/q/p') prev, 
                extractvalue(d.column_value, '/q/@o') origin 
        from 
                table(xmlsequence(extract(xmltype ((select other_xml from xml)), '/other_xml/qb_registry/q'))) d 
), 
inpqbs as ( 
        select 
                xml.qbname qbname, 
                listagg(xml.depqbs, ',') within group (
                        order by xml.depqbs) depqbs 
        from 
                xmltable('/other_xml/qb_registry/q/i/o' passing xmltype((select other_xml from xml)) 
                        columns depqbs varchar2(256) path 'v', 
                        qbname varchar2(256) path './../../n'
                ) xml 
        where     xml.depqbs in ( select qbname from allqbs) 
        group by xml.qbname
), 
recqb   (src, origin, dest, final, lvl, inpobjs) as ( 
        select 
                qbname src, origin origin, null dest, final final, 1 lvl, null inpobjs 
        from 
                allqbs
        where 
--              origin = &m_origin
                origin in (2,3)
        union all 
        select 
                a.qbname src, a.origin origin, a.prev dest, a.final final, lvl+1, 
                (select depqbs from inpqbs i where i.qbname = a.qbname) inpobjs 
        from
                allqbs a, 
                recqb r 
        where a.prev = r.src
)
search depth first by src asc set ordseq, 
finalans as ( 
        select 
                src,
/* 
                (
                select 
                        name 
                from    v$query_block_origin 
                where 
                        origin_id=origin
                )       origin, 
*/
                origin,
                dest, final, lvl, inpobjs 
        from recqb order by ordseq
) 
select
        /*+ opt_param('parallel_execution_enabled', 'false') */ 
        g.qbreg 
from (
        select 
                rpad(' ', 2*(lvl-1)) || 
                src || ' (' || origin || 
                        case when length(dest)>0 
                                then ' ' || dest 
                                else '' 
                        end || 
                        case when length(inpobjs)>0 
                                then ' ; ' || inpobjs 
                                else ' ' 
                        end ||
                        ')' || 
                        case when final='y' 
                                then ' [final]' 
                                else '' 
                        end 
                qbreg 
        from 
                finalans
        ) g
/

In lines 10 and 11 I’ve defined a couple of substitution variables that appear further on in the script. One is the SQL_ID for the query you’re interested in, the other is a fixed (probably) symbolic constant used by Oracle.

Lines 14-20 are a “with” subquery that I’ve prepended to Oracle’s internal code to create a single row, single column table holding the other_xml value of the query of interest. You’ll notice that I’ve been fairly casual about this bit since I haven’t catered for the fact that a single sql_id may have several child cursors and might even be obsolete.
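If you needed to cater for multiple child cursors, one possible approach is to add a child_number predicate to that subquery. In this sketch m_child_no is a hypothetical extra substitution variable, and the final select is there only to confirm that exactly one row (one XML document) has been picked up.

```sql
define m_sql_id   = '232sya6twg7sq'
define m_child_no = 0

with
xml as (
        select  other_xml
        from    v$sql_plan
        where   sql_id       = '&m_sql_id'
        and     child_number = &m_child_no      -- hypothetical addition
        and     id           = 1
        and     other_xml is not null
)
select
        count(*)                                rows_found,
        max(dbms_lob.getlength(other_xml))      xml_length
from
        xml
;
```

If rows_found is zero the cursor has probably been aged out of the library cache, or the child number is wrong.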

Lines 28 and 36 are where I’ve used my “with” subquery to supply the other_xml value that would have appeared as a bind variable (:B1) in the trace file.

Line 49 (commented out for the reason described in footnote 1) uses the m_origin variable to identify a row in the dynamic performance view v$query_block_origin (highlighted at line 68) and that’s where we have a problem with Oracle 19c: the view doesn’t exist, nor does the x$qbname structure that the view is based on (although it’s easy to find a table of the values in the oracle executable – albeit that several items in the 21c list don’t appear in the 19.11 executable).

In the code above I’ve actually commented out the whole of the inline scalar subquery that translates an origin number into an origin name and reported the actual value of origin. Originally I did this to check whether it was worth spending any more time working on the code – and this is the result I got from the initial test:

SEL$1 (2 )
  SEL$5DA710D3 (19 SEL$1 ; SEL$2)
    SEL$2F1334C4 (15 SEL$5DA710D3 ) [final]
SEL$2 (2 )
  SEL$683B0107 (12 SEL$2 )
    SEL$88A77D12 (13 SEL$683B0107 )
      SEL$B186933D (18 SEL$88A77D12 ; SEL$683B0107) [final]

A quick check by eye shows that it’s got the same pattern and set of query block names that the 21c output produced so it’s clearly a step in the right direction. Now all I need is a way to translate the origin numbers into names.

I could have tried searching x$ksmfsv to see if I could spot a pointer to the relevant structure and fake my way through the whole process of creating a “nearly dynamic” performance view, but I decided the quick and dirty workaround was to dump a CSV file listing the view contents in 21c, then read the file back as an external table to copy the data into a local IOT (index organized table) called my_query_block_origin. With the inline view back in play – and the name suitably changed – the 19c and 21c queries produced the same result (which is slightly surprising as the “SUBQUERY UNNEST” and “VIEW MERGE” options don’t seem to exist in the 19.11 list I found in the oracle executable.)
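A sketch of the local lookup table might look like the following. The column sizes are assumptions taken from the SQL*Plus column formats used in Footnote 1, the constraint name is invented, and I've loaded only the handful of rows needed for this example with simple inserts rather than going through the CSV and external table route; the values themselves come from the 21c listing in Footnote 1.

```sql
create table my_query_block_origin (
        origin_id       number(4,0),
        name            varchar2(60),
        hint_token      varchar2(32),
        constraint mqbo_pk primary key (origin_id)      -- hypothetical name
)
organization index
;

insert into my_query_block_origin values ( 2, 'PARSER',                            null);
insert into my_query_block_origin values ( 3, 'HINT',                              null);
insert into my_query_block_origin values (12, 'SUBQ INTO VIEW FOR COMPLEX UNNEST', null);
insert into my_query_block_origin values (13, 'PROJECTION VIEW FOR CVM',           null);
insert into my_query_block_origin values (15, 'SPLIT/MERGE QUERY BLOCKS',          null);
insert into my_query_block_origin values (18, 'VIEW MERGE',                        'MERGE');
insert into my_query_block_origin values (19, 'SUBQUERY UNNEST',                   'UNNEST');
commit;

-- In the finalans subquery the commented-out scalar subquery would then become:
--      (
--      select  name
--      from    my_query_block_origin
--      where   origin_id = origin
--      )       origin,
```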

Footnote 1

Here’s a query to show the content of that 21c view (which is fairly interesting in its own right):

set linesize 144
set pagesize 100
set trimspool on
set tabout off

column  name format a60
column  hint_token format a32

spool query_block_origin.lst

select
        origin_id,
        name,
        hint_token
from
        v$query_block_origin
/

 ORIGIN_ID NAME                                                         HINT_TOKEN
---------- ------------------------------------------------------------ --------------------------------
         0 NOT NAMED
         1 ALLOCATE
         2 PARSER
         3 HINT
         4 COPY
         5 SAVE
         6 MV REWRITE                                                   REWRITE
         7 PUSHED PREDICATE                                             PUSH_PRED
         8 STAR TRANSFORM SUBQUERY
         9 COMPLEX VIEW MERGE
        10 COMPLEX SUBQUERY UNNEST
        11 OR EXPANSION                                                 USE_CONCAT
        12 SUBQ INTO VIEW FOR COMPLEX UNNEST
        13 PROJECTION VIEW FOR CVM
        14 GROUPING SET TO UNION
        15 SPLIT/MERGE QUERY BLOCKS
        16 COPY PARTITION VIEW
        17 RESTORE
        18 VIEW MERGE                                                   MERGE
        19 SUBQUERY UNNEST                                              UNNEST
        20 STAR TRANSFORM                                               STAR_TRANSFORMATION
        21 INDEX JOIN
        22 STAR TRANSFORM TEMP TABLE
        23 MAP QUERY BLOCK
        24 VIEW ADDED
        25 SET QUERY BLOCK
        26 QUERY BLOCK TABLES CHANGED
        27 QUERY BLOCK SIGNATURE CHANGED
        28 MV UNION QUERY BLOCK
        29 SPLIT QUERY BLOCK FOR GSET-TO-UNION                          EXPAND_GSET_TO_UNION
        30 PREDICATES REMOVED FROM QUERY BLOCK                          PULL_PRED
        31 PREDICATES ADDED TO QUERY BLOCK
        32 OLD PUSHED PREDICATE                                         OLD_PUSH_PRED
        33 ORDER BY REMOVED FROM QUERY BLOCK                            ELIMINATE_OBY
        34 JOIN REMOVED FROM QUERY BLOCK                                ELIMINATE_JOIN
        35 OUTER-JOIN REMOVED FROM QUERY BLOCK                          OUTER_JOIN_TO_INNER
        36 STAR TRANSFORMATION JOINBACK ELIMINATION                     ELIMINATE_JOIN
        37 BITMAP JOIN INDEX JOINBACK ELIMINATION                       ELIMINATE_JOIN
        38 CONNECT BY COST BASED TRANSFORMATION                         CONNECT_BY_COST_BASED
        39 CONNECT BY WITH FILTERING                                    CONNECT_BY_FILTERING
        40 CONNECT BY WITH NO FILTERING                                 NO_CONNECT_BY_FILTERING
        41 CONNECT BY START WITH QUERY BLOCK
        42 CONNECT BY FULL SCAN QUERY BLOCK
        43 PLACE GROUP BY                                               PLACE_GROUP_BY
        44 CONNECT BY NO FILTERING COMBINE                              NO_CONNECT_BY_FILTERING
        45 VIEW ON SELECT DISTINCT
        46 COALESCED SUBQUERY                                           COALESCE_SQ
        47 QUERY HAS COALESCED SUBQUERIES                               COALESCE_SQ
        48 SPLIT QUERY BLOCK FOR DISTINCT AGG OPTIM                     TRANSFORM_DISTINCT_AGG
        49 CONNECT BY ELIMINATE DUPLICATES FROM INPUT                   CONNECT_BY_ELIM_DUPS
        50 CONNECT BY COST BASED TRANSFORMATION FOR WHR ONLY            CONNECT_BY_CB_WHR_ONLY
        51 TABLE EXPANSION                                              EXPAND_TABLE
        52 TABLE EXPANSION BRANCH
        53 JOIN FACTORIZATION SET QUERY BLOCK                           FACTORIZE_JOIN
        54 DISTINCT PLACEMENT                                           PLACE_DISTINCT
        55 JOIN FACTORIZATION BRANCH QUERY BLOCK
        56 TABLE LOOKUP BY NESTED LOOP QUERY BLOCK                      TABLE_LOOKUP_BY_NL
        57 FULL OUTER JOIN TRANSFORMED TO OUTER                         FULL_OUTER_JOIN_TO_OUTER
        58 LEFT OUTER JOIN TRANSFORMED TO ANTI                          OUTER_JOIN_TO_ANTI
        59 VIEW DECORRELATED                                            DECORRELATE
        60 QUERY VIEW DECORRELATED                                      DECORRELATE
        61 NOT EXISTS SQ ADDED
        62 BRANCH WITH OUTER JOIN
        63 BRANCH WITH ANTI JOIN
        64 UNION ALL FOR FULL OUTER JOIN
        65 VECTOR TRANSFORMATION                                        VECTOR_TRANSFORM
        66 VECTOR TRANSFORMATION TEMP TABLE
        67 QUERY ANSI REARCHiTECTURE                                    ANSI_REARCH
        68 VIEW ANSI REARCHiTECTURE                                     ANSI_REARCH
        69 ELIMINATION OF GROUP BY                                      ELIM_GROUPBY
        70 UAL BRANCH OF UNNESTED SUBQUERY
        71 QUERY BLOCK HAS BUSHY JOIN                                   BUSHY_JOIN
        72 SUBQUERY ELIMINATE                                           ELIMINATE_SQ
        73 OR EXPANSION UNION ALL BRANCH
        74 OR EXPANSION UNION ALL VIEW                                  OR_EXPAND
        75 DIST AGG GROUPING SETS UNION ALL TRANSFORMATION              USE_DAGG_UNION_ALL_GSETS
        76 MATERIALIZED WITH CLAUSE
        77 STATISTCS BASED TRANSFORMED QB
        78 PQ TABLE EXPANSION
        79 LEFT OUTER JOIN TRANSFORMED TO BOTH INNER AND ANTI
        80 SHARD TEMP TABLE
        81 BRANCH OF COMPLEX UNNESTED SET QUERY BLOCK
        82 DIST AGG GROUPING SETS OPTIMIZATION                          DAGG_OPTIM_GSETS


Note origin_id 2, which has the name PARSER – that’s the (first) significant value when reporting the query block registry; but take note, also, of origin_id 3, which has the name HINT. This is where the code built into 21c goes wrong. If you use the qb_name hint to name all your query blocks then their origin_id will be 3, and Oracle’s code won’t find them.

When I added the hint /*+ qb_name(main) */ to the query this is what I got from my registry query:

Query Block Registry:
---------------------

  SEL$1 (PARSER)
    SEL$7D4DB4AA (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$1)
      SEL$EFD91A2C (PROJECTION VIEW FOR CVM SEL$7D4DB4AA)
        SEL$7086F02E (VIEW MERGE SEL$EFD91A2C ; SEL$7D4DB4AA) [FINAL]

And when I also added the hint /*+ qb_name(subq) */ to the subquery the result was this:

Query Block Registry:
---------------------

An uncaught error happened in display_cursor : ORA-06502: PL/SQL: numeric or value error

I’ve said for a long time: “always name all your query blocks”. I think 21c (temporarily) demonstrates why you have two options: name ALL of them or name NONE of them. If you name just some of them you might not notice that parts of your plan don’t appear in the registry report, and I’d say it’s better to see an error than to be fooled into thinking you’ve got complete information.
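For reference, the fully named version of the test query would look something like the following sketch. The names main and subq are the ones quoted above; putting qb_name(subq) alongside the existing unnest and no_semijoin hints is my assumption about the layout.

```sql
select  /*+ qb_name(main) */
        *
from    t1
where   owner = 'OUTLN'
and     object_name in (
                select  /*+
                                qb_name(subq)
                                unnest
                                no_semijoin
                        */
                        distinct object_name
                from    t2
                where   object_type = 'TABLE'
        )
;
```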

Footnote 2

There’s another new option for the format parameter in 21c which is qbregistry_graph. I haven’t considered playing about with the trace file to see if I can extract and hack the SQL that generates the appropriate output (but that might change if I pick up a tool to turn the textual description into a graphic). For the registry listing above this is what the “graph” output looks like:

Query Block Registry Graph (dot format):
---------------------
digraph g{
  rankdir = TB
  "SEL$88A77D12"
  "SEL$683B0107"
  "SEL$1"
  "SEL$2"
  "SEL$B186933D" [peripheries=2]
  "SEL$5DA710D3"
  "SEL$2F1334C4" [peripheries=2]
  "SEL$683B0107" -> "SEL$88A77D12" [label="PROJECTION VIEW FOR CVM"]
  "SEL$2" -> "SEL$683B0107" [label="SUBQ INTO VIEW FOR COMPLEX UNNEST"]
  "SEL$88A77D12" -> "SEL$B186933D" [label="VIEW MERGE"]
  "SEL$1" -> "SEL$5DA710D3" [label="SUBQUERY UNNEST"]
  "SEL$5DA710D3" -> "SEL$2F1334C4" [label="SPLIT/MERGE QUERY BLOCKS"]
  "SEL$683B0107" -> "SEL$B186933D" [style=dotted]
  "SEL$2" -> "SEL$5DA710D3" [style=dotted]
  { rank = same }
  {
    rank="sink";
    rankdir = LR;
    item1 [style=invis];
    item2 [shape="plaintext" label="Participating query blocks"];
    item3 [label="&nbsp;" peripheries=2];
    item4 [shape="plaintext" label="Final query blocks"];
    item1 -> item2 [style=dotted];
    { rank=same item3 item4; }
  }
}

Footnote 2.1 (a few days later)

A couple of days after publishing this note I received an email pointing out that the qbregistry_graph output is in the Graphviz DOT language (see http://www.graphviz.org/) and there are even websites where it can easily be rendered into graphic form. This is the result I got by pasting the output into the website:

I haven’t tried to think through a generalised pattern for drawing these pictures, but I think I’d prefer to see a diagram which showed that the “final” query block sel$2f1334c4 was used by query block sel$b186933d. After all, the word “final” in this context means only that the query block was one for which the optimizer produced an independent (sub-)plan.

Footnote 3

For completeness – here’s the original SQL and plan for the statement that produced this qbregistry example:

select
        *
from    t1
where   owner = 'OUTLN'
and     object_name in (
                select  /*+
                                unnest
                                no_semijoin
                        */
                        distinct object_name
                from   t2
                where  object_type = 'TABLE'
        )
;

----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |           |       |       |     5 (100)|          |
|   1 |  VIEW                                  | VM_NWVW_2 |     1 |   483 |     5  (20)| 00:00:01 |
|   2 |   HASH UNIQUE                          |           |     1 |   155 |     5  (20)| 00:00:01 |
|   3 |    NESTED LOOPS SEMI                   |           |     1 |   155 |     4   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1        |     1 |   128 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | T1_I1     |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS BY INDEX ROWID BATCHED| T2        |     1 |    27 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                  | T2_I2     |    48 |       |     0   (0)|          |
----------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$B186933D / "VM_NWVW_2"@"SEL$2F1334C4"
   2 - SEL$B186933D
   4 - SEL$B186933D / "T1"@"SEL$1"
   5 - SEL$B186933D / "T1"@"SEL$1"
   6 - SEL$B186933D / "T2"@"SEL$2"
   7 - SEL$B186933D / "T2"@"SEL$2"

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('21.1.0')
      DB_VERSION('21.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$B186933D")
      MERGE(@"SEL$683B0107" >"SEL$5DA710D3")
      OUTLINE_LEAF(@"SEL$2F1334C4")
      OUTLINE(@"SEL$88A77D12")
      OUTLINE(@"SEL$683B0107")
      OUTLINE(@"SEL$5DA710D3")
      UNNEST(@"SEL$2" UNNEST_INNERJ_DISTINCT_VIEW)
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$1")
      NO_ACCESS(@"SEL$2F1334C4" "VM_NWVW_2"@"SEL$2F1334C4")
      INDEX_RS_ASC(@"SEL$B186933D" "T1"@"SEL$1" ("T1"."OWNER"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B186933D" "T1"@"SEL$1")
      INDEX_RS_ASC(@"SEL$B186933D" "T2"@"SEL$2" ("T2"."OBJECT_TYPE"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$B186933D" "T2"@"SEL$2")
      LEADING(@"SEL$B186933D" "T1"@"SEL$1" "T2"@"SEL$2")
      USE_NL(@"SEL$B186933D" "T2"@"SEL$2")
      USE_HASH_AGGREGATION(@"SEL$B186933D" UNIQUE)
      PARTIAL_JOIN(@"SEL$B186933D" "T2"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("OWNER"='OUTLN')
   6 - filter("OBJECT_NAME"="OBJECT_NAME")
   7 - access("OBJECT_TYPE"='TABLE')

Query Block Registry:
---------------------
  SEL$1 (PARSER)
    SEL$5DA710D3 (SUBQUERY UNNEST SEL$1 ; SEL$2)
      SEL$2F1334C4 (SPLIT/MERGE QUERY BLOCKS SEL$5DA710D3) [FINAL]
  SEL$2 (PARSER)
    SEL$683B0107 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2)
      SEL$88A77D12 (PROJECTION VIEW FOR CVM SEL$683B0107)
        SEL$B186933D (VIEW MERGE SEL$88A77D12 ; SEL$683B0107) [FINAL]

Tables t1 and t2 are copies of the same data set, which is a subset of 100 rows from all_objects. You won’t necessarily see this plan on your systems because (even with the hints) the plan can vary depending on the number of rows with owner = ‘OUTLN’ (which is likely to be zero) or with object_type = ‘TABLE’ (which might be all of them). The script I started with was one I had used in a note I wrote about “distinct” appearing in the select list of subqueries, but the data it produced in the newer versions of Oracle was sufficiently different that I had to be a little more careful in constructing a data set that produced stable plans.

If you cross-check the Query Block Registry with the Outline Information you’ll see that the lines labelled FINAL start with the query block names that are shown as “outline_leaf” entries, and the other 5 query block names appear as “outline” entries.
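To repeat this cross-check on another statement, the report could be requested along the following lines. This is a minimal sketch: it assumes your version of dbms_xplan recognises the qbregistry format keyword, and that the statement has just been run in the same session with serveroutput switched off so that display_cursor() picks up the right cursor.

```sql
set serveroutput off

select
        *
from    t1
where   owner = 'OUTLN'
and     object_name in (
                select  /*+
                                unnest
                                no_semijoin
                        */
                        distinct object_name
                from   t2
                where  object_type = 'TABLE'
        )
;

rem     "outline" and "alias" give the Outline Data and Query Block Name sections,
rem     "qbregistry" (assumed to be available) adds the Query Block Registry

select * from table(dbms_xplan.display_cursor(format => 'outline alias qbregistry'));
```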

Reading down the tree I then find myself struggling to interpret the QBR. I think I know what has happened, but I can’t quite manage to see exactly how the QBR is telling me that it happened.

QBR – tentative interpretation

Part of the difficulty is that the QBR seems to have a section for every initial query block in the query, so there’s (a) likely to be some overlap between sections and (b) some sequencing that means you can’t get the full picture just by reading straight from top to bottom. In this case we have two initial query blocks (the main query, implicitly named sel$1, and the subquery implicitly named sel$2), and I think the interpretation is as follows:

Starting with the sel$2 section we can see that its second line tells us that the subquery was unnested and the resulting aggregate inline view is the sole content of a query block called sel$683B0107.

Jumping backwards to the sel$1 section, its second line tells us that sel$5DA710D3 is a query block consisting of a join between t1 and the inline aggregate view.

Sticking with the sel$1 tree, we then see a query block that tells us that the optimizer has transformed an “aggregate then join” into a “join then aggregate”. sel$2F1334C4 is the query block holding nothing but a select from the view VM_NWVW_2.

Returning to the sel$2 tree, sel$88A77D12 is the resulting query block when the inline aggregate view is merged using complex view merging. This is where I get a bit stuck, because this seems to be repeating a step that we’ve handled in the sel$1 section by a different route.

The final step of the sel$2 tree is sel$B186933D, the query block where we select from the non-mergeable inline view VM_NWVW_2 that seems to have come from one of two different places.

Bottom line on this one: even though it’s an extremely simple query and I believe I understand what the execution plan is telling us about the transformations that took place, the query block registry is still something of a mystery to me.
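To make the “aggregate then join” and “join then aggregate” shapes concrete, here is a sketch of two hand-written rewrites of the query. They are illustrations only, not text generated by the optimizer: the aliases v2, vm and rid are invented for the example, and the select list of the second form has been cut down to three columns to keep it short.

```sql
rem     Form 1: "aggregate then join"
rem     the IN subquery becomes an inline distinct view that is joined to t1

select  t1.*
from    t1,
        (
        select  distinct object_name
        from    t2
        where   object_type = 'TABLE'
        )       v2
where   t1.owner       = 'OUTLN'
and     t1.object_name = v2.object_name
;

rem     Form 2: "join then aggregate"
rem     join first, then eliminate the duplicates introduced by t2;
rem     the rowid of t1 is carried so that genuinely distinct t1 rows survive

select  vm.owner, vm.object_name, vm.object_type
from    (
        select  distinct
                t1.rowid        rid,
                t1.owner, t1.object_name, t1.object_type
        from    t1, t2
        where   t1.owner       = 'OUTLN'
        and     t2.object_type = 'TABLE'
        and     t2.object_name = t1.object_name
        )       vm
;
```

The second form is the shape that matches the HASH UNIQUE sitting above the join inside the view VM_NWVW_2 in the plan.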

August 23, 2021

Distributed Query

Filed under: distributed,Execution plans,Hints,Oracle,subqueries,Transformations,Troubleshooting — Jonathan Lewis @ 5:24 pm BST Aug 23,2021

Here’s an example that appeared on the Oracle Developer Community forum about a year ago that prompted me to do a little investigative work. The question involved a distributed query that was “misbehaving” – the interesting points were the appearance of the /*+ rule */ and /*+ driving_site() */ hints in the original query when combined with a suggestion to address the problem using the /*+ materialize */ hint with factored subqueries (common table expressions – CTEs), or when combined with my suggestion to use the /*+ no_merge */ hint.

If you don’t want to read the whole article there’s a tl;dr summary just before the end.

The original question was posed with a handful of poorly constructed code fragments that were supposed to describe the problem, viz:


select /*+ DRIVING_SITE (s1) */ * from  Table1 s1 WHERE condition in (select att1 from local_table) ; -- query n°1

select /*+ RULE DRIVING_SITE (s2) */  * from  Table2 s2 where  condition in (select att1 from local_table); -- query n°2

select * from
select /*+ DRIVING_SITE (s1) */ * from  Table1 s1 WHERE condition in (select att1 from local_table) ,
select /*+ RULE DRIVING_SITE (s2) */  * from  Table2 s2 where  condition in (select att1 from local_table)
where att_table_1 = att_table_2  -- sic

The crux of the problem was that the two separate statements individually produced an acceptable execution plan but the attempt to use the queries in inline views with a join resulted in a plan that (from the description) sounded like the result of Oracle merging the two inline views and running the two IN subqueries as FILTER (existence) subqueries.

We weren’t shown any execution plans and only had the title of the question (“Distributed sql query through multiple databases”) to give us the clue that there might be three different databases involved.

Obviously there are several questions worth asking when presented with this problem. The first is “can we have a more realistic piece of code”, followed by “which version of Oracle”, and “where are the execution plans”. I can’t help feeling that there’s more to the problem than just the three tables that seem to be suggested by the fragments supplied.

More significant, though, was the surprise that rule and driving_site should work together. There’s a long-standing (but incorrect) assertion that “any other hint invalidates the RULE hint”. I think I’ve published an example somewhere showing that /*+ unnest */ would affect an execution plan where the optimizer still obeyed the /*+ rule */ hint, and there’s an old post on this blog which points out that transformation and optimisation are (or were, at the time) independent of each other, implying that you could combine the rule hint with “transformational” hints and still end up with a rule-based execution plan.

Despite old memories suggesting the contrary my first thought was that the rule and driving_site hints couldn’t be working together – and that made it worth running a little test. Then one of the other specialists on the forums suggested using subquery factoring with the materialize hint – and I thought that probably wouldn’t help because when you insert into a global temporary table the driving site has to become the site that holds the global temporary tables (in fact this isn’t just a feature of GTTs). So there was another thing prompting me to run a test. (And then I suggested using the /*+ no_merge */ hint – but thought I’d check if that idea was going to work before I suggested it.)

So here’s a code sample to create some data, and the first two simple queries with calls for their predicted execution plans:

rem
rem     Script:         distributed_multi.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2020
rem     Purpose:
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem             11.2.0.4
rem

rem     create public database link test@loopback using 'test';
rem     create public database link test2@loopback using 'test2';

rem     create public database link orcl@loopback using 'orcl';
rem     create public database link orcl2@loopback using 'orcl2';

rem     create public database link orclpdb@loopback using 'orclpdb';
rem     create public database link orclpdb2@loopback using 'orclpdb2';

define m_target=test@loopback
define m_target2=test2@loopback

define m_target=orcl@loopback
define m_target2=orcl2@loopback

define m_target=orclpdb@loopback
define m_target2=orclpdb2@loopback

create table t0
as
select  *
from    all_objects
where   mod(object_id,4) = 1
;

create table t1
as
select  *
from    all_objects
where   mod(object_id,11) = 0
;

create table t2
as
select  *
from    all_Objects
where   mod(object_id,13) = 0
;

explain plan for
select  /*+ driving_site(t1) */
        t1.object_name, t1.object_id
from    t1@&m_target    t1
where
        t1.object_id in (
                select  t0.object_id
                from    t0
        )
;

select * from table(dbms_xplan.display);

explain plan for
select
        /*+ rule driving_site(t2) */
        t2.object_name, t2.object_id
from    t2@&m_target2   t2
where
        t2.object_id in (
                select  t0.object_id
                from    t0
        )
;

select * from table(dbms_xplan.display);

Reading from the top down – t0 is in the local database, t1 is in remote database 1, t2 is in remote database 2. I’ve indicated the creation and selection of a pair of public database links at the top of the script – in this case both of them are loopback links to the local database, but I’ve used substitution variables in the SQL to allow me to adjust which databases are the remote ones. Since there are no indexes on any of the tables the optimizer is very limited in its choice of execution plans, which are as follows in 19.3 (the orclpdb/orclpdb2 links).

First, the query against t1@orclpdb – which will run cost-based:


-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|      |  5168 |   287K|    57   (8)| 00:00:01 |        |      |
|*  1 |  HASH JOIN SEMI        |      |  5168 |   287K|    57   (8)| 00:00:01 |        |      |
|   2 |   TABLE ACCESS FULL    | T1   |  5168 |   222K|    16   (7)| 00:00:01 | ORCLP~ |      |
|   3 |   REMOTE               | T0   | 14058 |   178K|    40   (5)| 00:00:01 |      ! | R->S |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("A1"."OBJECT_ID"="A2"."OBJECT_ID")

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - SELECT "OBJECT_ID" FROM "T0" "A2" (accessing '!' )

Note
-----
   - fully remote statement

You’ll note that operation 3 is simply REMOTE, and t0 is the object accessed – which means this query is behaving as if the (local) t0 table is the remote one as far as the execution plan is concerned. The IN-OUT column tells us that this operation is “Remote to Serial” (R->S) and the instance called to is named “!” which is how the local database is identified in the plan from a remote database.

We can also see that the execution plan gives us the “Remote SQL Information” for operation 3 – and that’s the text of the query that gets sent by the driving site to the instance that holds the object of interest. In this case the query is simply selecting the object_id values from all the rows in t0.
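The same details can be read directly from the plan table rather than through dbms_xplan. The following is a minimal sketch, assuming the standard plan_table left populated by the preceding explain plan call; the remote SQL text is held in the LONG column other, and other_tag is where the “R->S” flag comes from (reported as SERIAL_FROM_REMOTE).

```sql
rem     "other" is a LONG column, so allow SQL*Plus to display enough of it

set long 2000

select
        id, operation, object_name, object_node, other_tag, other
from
        plan_table
where
        operation = 'REMOTE'
and     plan_id   = (select max(plan_id) from plan_table)       -- most recent explain plan only
order by
        id
;
```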

Now the plan for the query against t2@orclpdb2 which includes a /*+ rule */ hint:

-----------------------------------------------------------
| Id  | Operation              | Name     | Inst   |IN-OUT|
-----------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|          |        |      |
|   1 |  MERGE JOIN            |          |        |      |
|   2 |   SORT JOIN            |          |        |      |
|   3 |    TABLE ACCESS FULL   | T2       | ORCLP~ |      |
|*  4 |   SORT JOIN            |          |        |      |
|   5 |    VIEW                | VW_NSO_1 | ORCLP~ |      |
|   6 |     SORT UNIQUE        |          |        |      |
|   7 |      REMOTE            | T0       |      ! | R->S |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("A1"."OBJECT_ID"="OBJECT_ID")
       filter("A1"."OBJECT_ID"="OBJECT_ID")

Remote SQL Information (identified by operation id):
----------------------------------------------------
   7 - SELECT /*+ RULE */ "OBJECT_ID" FROM "T0" "A2" (accessing '!' )

Note
-----
   - fully remote statement
   - rule based optimizer used (consider using cbo)

The most striking feature of this plan is that it is an RBO (rule based optimizer) plan not a cost-based plan – and the Note section confirms that observation. We can also see that the Remote SQL Information is echoing the /*+ RULE */ hint back in its query against t0. Since the query is operating rule-based the hash join mechanism is not available (it’s a costed path – it needs to know the size of the data that will be used in the build table), and that’s why the plan is using a sort/merge join.

Following the “incremental build” strategy for writing SQL all we have to do as the next step of producing the final code is put the two queries into separate views and join them:


explain plan for
select  v1.*, v2.*
from    (
        select  /*+ driving_site(t1) */
                t1.object_name, t1.object_id
        from    t1@&m_target    t1
        where
                t1.object_id in (
                        select  t0.object_id
                        from    t0
                )
        )       v1,
        (
        select
                /*+ rule driving_site(t2) */
                t2.object_name, t2.object_id
        from    t2@&m_target2 t2
        where
                t2.object_id in (
                        select  t0.object_id
                        from    t0
                )
        )       v2
where
        v1.object_id = v2.object_id
;

select * from table(dbms_xplan.display);

And here’s the execution plan – which, I have to admit, gave me a bit of a surprise on two counts when I first saw it:


-----------------------------------------------------------
| Id  | Operation              | Name     | Inst   |IN-OUT|
-----------------------------------------------------------
|   0 | SELECT STATEMENT       |          |        |      |
|   1 |  MERGE JOIN            |          |        |      |
|   2 |   MERGE JOIN           |          |        |      |
|   3 |    MERGE JOIN          |          |        |      |
|   4 |     SORT JOIN          |          |        |      |
|   5 |      REMOTE            | T2       | ORCLP~ | R->S |
|*  6 |     SORT JOIN          |          |        |      |
|   7 |      REMOTE            | T1       | ORCLP~ | R->S |
|*  8 |    SORT JOIN           |          |        |      |
|   9 |     VIEW               | VW_NSO_1 |        |      |
|  10 |      SORT UNIQUE       |          |        |      |
|  11 |       TABLE ACCESS FULL| T0       |        |      |
|* 12 |   SORT JOIN            |          |        |      |
|  13 |    VIEW                | VW_NSO_2 |        |      |
|  14 |     SORT UNIQUE        |          |        |      |
|  15 |      TABLE ACCESS FULL | T0       |        |      |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
       filter("T1"."OBJECT_ID"="T2"."OBJECT_ID")
   8 - access("T2"."OBJECT_ID"="OBJECT_ID")
       filter("T2"."OBJECT_ID"="OBJECT_ID")
  12 - access("T1"."OBJECT_ID"="OBJECT_ID")
       filter("T1"."OBJECT_ID"="OBJECT_ID")

Remote SQL Information (identified by operation id):
----------------------------------------------------
   5 - SELECT /*+ RULE */ "OBJECT_NAME","OBJECT_ID" FROM "T2" "T2"
       (accessing 'ORCLPDB2.LOCALDOMAIN@LOOPBACK' )

   7 - SELECT /*+ RULE */ "OBJECT_NAME","OBJECT_ID" FROM "T1" "T1"
       (accessing 'ORCLPDB.LOCALDOMAIN@LOOPBACK' )

Note
-----
   - rule based optimizer used (consider using cbo)

The two surprises were that (a) the entire plan was rule-based, and (b) the driving_site() selection has disappeared from the plan.

Of course as soon as I actually started thinking about what I’d written (instead of trusting the knee-jerk “just stick the two bits together”) the flaw in the strategy became obvious.

  • Either the whole query runs RBO or it runs CBO – you can’t split the planning.
  • In the words of The Highlander “There can be only one” (driving site that is) – only one of the databases involved will decide how to decompose and distribute the query.

It’s an interesting detail that the /*+ rule */ hint seems to have pushed the whole query into the arms of the RBO despite being buried somewhere in the depths of the query rather than being in the top level query block – but we’ve seen that before in some old data dictionary views.

The complete disregard for the driving_site() hints is less interesting – there is, after all, a comment in the manuals somewhere to the effect that when two hints contradict each other they are both ignored. (But I did wonder why the Hint Report that should appear with 19.3 plans didn’t tell me that the hints had been observed but not used.)
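For reference, the Hint Report can also be requested explicitly. This is a sketch that assumes 19c or later, where dbms_xplan accepts the hint_report format keyword; nothing in the text above establishes whether it would then report anything for the driving_site() hints in this particular example.

```sql
rem     run immediately after the "explain plan for ..." call above

select * from table(dbms_xplan.display(format => 'basic +hint_report'));
```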

The other problem (from the perspective of the OP) is that the two inline views have been merged so the join order no longer reflects the two isolated components we used to have. So let’s fiddle around a little bit to see how close we can get to what the OP wants. The first step would be to add the /*+ no_merge */ hint to both inline views, and eliminate one of the /*+ driving_site() */ hints to see what happens, and since we’re modern we’ll also get rid of the /*+ rule */ hint:


explain plan for
select  v1.*, v2.*
from    (
        select  /*+ qb_name(subq1) no_merge driving_site(t1) */
                t1.object_name, t1.object_id
        from    t1@&m_target    t1
        where
                t1.object_id in (
                        select  t0.object_id
                        from    t0
                )
        )       v1,
        (
        select
                /*+ qb_name(subq2) no_merge */
                t2.object_name, t2.object_id
        from    t2@&m_target2 t2
        where
                t2.object_id in (
                        select  t0.object_id
                        from    t0
                )
        )       v2
where
        v1.object_id = v2.object_id
;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|      |  4342 |   669K|    72   (9)| 00:00:01 |        |      |
|*  1 |  HASH JOIN             |      |  4342 |   669K|    72   (9)| 00:00:01 |        |      |
|   2 |   VIEW                 |      |  4342 |   334K|    14   (8)| 00:00:01 |        |      |
|   3 |    REMOTE              |      |       |       |            |          |      ! | R->S |
|   4 |   VIEW                 |      |  5168 |   398K|    57   (8)| 00:00:01 |        |      |
|*  5 |    HASH JOIN SEMI      |      |  5168 |   287K|    57   (8)| 00:00:01 |        |      |
|   6 |     TABLE ACCESS FULL  | T1   |  5168 |   222K|    16   (7)| 00:00:01 | ORCLP~ |      |
|   7 |     REMOTE             | T0   | 14058 |   178K|    40   (5)| 00:00:01 |      ! | R->S |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("A2"."OBJECT_ID"="A1"."OBJECT_ID")
   5 - access("A3"."OBJECT_ID"="A6"."OBJECT_ID")

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - EXPLAIN PLAN INTO "PLAN_TABLE" FOR SELECT /*+ QB_NAME ("SUBQ2") NO_MERGE */
       "A1"."OBJECT_NAME","A1"."OBJECT_ID" FROM  (SELECT DISTINCT "A3"."OBJECT_ID"
       "OBJECT_ID" FROM "T0" "A3") "A2","T2"@ORCLPDB2.LOCALDOMAIN@LOOPBACK "A1" WHERE
       "A1"."OBJECT_ID"="A2"."OBJECT_ID" (accessing '!' )

   7 - SELECT "OBJECT_ID" FROM "T0" "A6" (accessing '!' )

Note
-----
   - fully remote statement

In this plan we can see that the /*+ driving_site() */ hint has been applied – the plan is presented from the point of view of orclpdb (the database holding t1). The order of the two inline views has apparently been reversed as we move from the statement to its plan – but that’s just a minor side effect of the hash join (picking the smaller result set as the build table).

Operations 5 – 7 tell us that t1 is treated as the local table and used for the build table in a hash semi-join, and then t0 is accessed by a call back to our database and its result set is used as the probe table.

From operation 3 (in the body of the plan, and in the Remote SQL Information) we see that orclpdb has handed off the entire t2 query block to a remote operation – which is ‘accessing “!”’. But there’s a problem (in my opinion) in the SQL that it’s handing off – the text is NOT the text of our inline view; it’s already been through a heuristic transformation that has unnested the IN subquery of our original text into a “join distinct view” – if we had used a hint to force this transformation it would have been the /*+ unnest(UNNEST_INNERJ_DISTINCT_VIEW) */ variant.

SELECT /*+ NO_MERGE */
        "A1"."OBJECT_NAME","A1"."OBJECT_ID"
FROM
       (SELECT DISTINCT "A3"."OBJECT_ID" "OBJECT_ID" FROM "T0" "A3") "A2",
       "T2"@ORCLPDB2.LOCALDOMAIN@LOOPBACK "A1"
WHERE
        "A1"."OBJECT_ID"="A2"."OBJECT_ID"

I tried to change this by adding alternative versions of the /*+ unnest() */ hint to the original query, following the query block names indicated by the outline information (not shown), but it looks as if the code path that constructs the Remote SQL operates without considering the main query hints – perhaps the decomposition code is simply following the code path of the old heuristic “I’ll do it if it’s legal” unnest. The drawback to this is that if the original form of the text had been sent to the other site the optimizer that had to handle it could have used cost-based query transformation and might have come up with a better plan.

You may be wondering why I left the /*+ driving_site() */ hint in one of the inline views rather than inserting it in the main query block. The answer is simple – it didn’t seem to work (even in 19.3) when I put /*+ driving_site(t1@subq1) */ in the main query block.

tl;dr

The optimizer has to operate rule-based or cost-based, it can’t do a bit of both in the same query – so if you’ve got a /*+ RULE */ hint that takes effect anywhere in the query the entire query will be optimised under the rule-based optimizer.

There can be only one driving site for a query, and if you manage to get multiple driving_site() hints in a query that contradict each other the optimizer will ignore all of them.

When the optimizer decomposes a distributed query and produces non-trivial components to send to remote sites you may find that some of the queries constructed for the remote sites have been subject to transformations that you cannot influence by hinting.

Footnote

I mentioned factored subqueries and the /*+ materialize */ option in the opening notes. In plans where the attempt to specify the driving site failed (i.e. when the query ran locally) the factored subqueries did materialize. In any plans where the driving site was a remote site the factored subqueries were always inline. This may well be related to the documented (though not always implemented) restriction that temporary tables cannot take part in distributed transactions.
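For readers who want to repeat that experiment, the general shape of the factored subquery version would be something like the following. This is a sketch only, built from the inline views used earlier; it is not the text of the script that produced the results described above, and the driving_site() hint has been left out so that its placement can be varied.

```sql
explain plan for
with v1 as (
        select  /*+ materialize */
                t1.object_name, t1.object_id
        from    t1@&m_target    t1
        where
                t1.object_id in (
                        select  t0.object_id
                        from    t0
                )
),
v2 as (
        select  /*+ materialize */
                t2.object_name, t2.object_id
        from    t2@&m_target2   t2
        where
                t2.object_id in (
                        select  t0.object_id
                        from    t0
                )
)
select  v1.*, v2.*
from    v1, v2
where
        v1.object_id = v2.object_id
;

select * from table(dbms_xplan.display);
```

If the subqueries do materialize, the plan should show TEMP TABLE TRANSFORMATION with a LOAD AS SELECT step for each factored subquery.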

August 9, 2021

Preferences

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 12:58 pm BST Aug 9,2021

I made a few comments in the past about setting “table preferences” for stats collection – most significantly the table_cached_blocks preference that affects the calculation of the clustering_factor of all the indexes on the table, the incremental preference for dictating the strategy used for dealing with partitioned tables, and the method_opt preference for dictating precise requirements for histograms.
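As a reminder of what setting a table preference looks like, here is a minimal sketch using dbms_stats.set_table_prefs(). The table name T1 and the value 16 are arbitrary choices for illustration, not recommendations.

```sql
begin
        dbms_stats.set_table_prefs(
                ownname => user,
                tabname => 'T1',                        -- hypothetical table
                pname   => 'TABLE_CACHED_BLOCKS',
                pvalue  => '16'                         -- arbitrary illustrative value
        );
end;
/

rem     to remove the table-level setting and revert to the global value:

begin
        dbms_stats.delete_table_prefs(
                ownname => user,
                tabname => 'T1',
                pname   => 'TABLE_CACHED_BLOCKS'
        );
end;
/
```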

If you want to check the current preferences set for a table you can query the XXXX_tab_stat_prefs views. For some reason in the dim and distant past (perhaps in a beta release before the views had been created but perhaps because the views show only the preferences that have been set) I wrote a little script to report all the possible table preferences showing both the table value and the current global value.
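A query against the view might look like the following sketch, with T1 standing in as a hypothetical table name. The view reports only the preferences that have been set explicitly, so for a table with no table-level preferences it returns no rows.

```sql
select
        preference_name, preference_value
from
        user_tab_stat_prefs
where
        table_name = 'T1'               -- hypothetical table
order by
        preference_name
;
```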

rem
rem     Script: get_table_prefs.sql
rem     Dated:  ???
rem     Author: Jonathan Lewis
rem
rem     Last tested
rem             19.11.0.0
rem
rem     Notes
rem     Report the table preferences for a given
rem     owner and table.
rem
rem     Needs to find a list of all legal preferences.
rem             Global prefs are in:    optstat_hist_control$ (sname, spare4)
rem             Table prefs are in:     optstat_user_prefs$ (valchar / valnum)
rem
rem     The public view is dba_tab_stat_prefs / user_tab_stat_prefs.
rem     But if a table has no prefs set there are no rows in the view
rem
rem     This script currently has to be run by sys or a user with 
rem     the select privileges on sys.optstat_hist_control$ (and
rem     execute on dbms_stats).
rem

define m_owner = '&enter_schema'
define m_table = '&enter_tablename'


<<anon_block>>
declare
        pref_count      number(2,0) := 0;
begin
        dbms_output.new_line;
        dbms_output.put_line(
                        rpad('Preference',32) || ' ' ||
                        rpad('Table value',32) || ' ' ||
                        '[Global value]'
        );
        dbms_output.put_line(
                        rpad('=',32,'=') || ' ' ||
                        rpad('=',32,'=') || ' ' ||
                        '================================'
        );
        for c1 in (
                select  sname, spare4 
                from    sys.optstat_hist_control$
                where   spare4 is not null
        ) loop
                anon_block.pref_count := anon_block.pref_count + 1;
                
                dbms_output.put_line(
                        rpad(c1.sname,32) || ' ' ||
                        rpad(dbms_stats.get_prefs(c1.sname,'&m_owner','&m_table'),32) || ' ' 
                        || '[' || c1.spare4 || ']'
                );      

        end loop;
        dbms_output.new_line;
        dbms_output.put_line('Preferences reported: ' || anon_block.pref_count);
end;
/

While I’ve hardly ever used the script – and so haven’t considered reviewing the strategy it uses – the benefit of having it around means that when I have run it I’ve occasionally discovered new preferences that I hadn’t previously noticed (and ought to investigate).

Here’s a sample of the output – from a table with no special settings for preferences:

Preference                       Table value                      [Global value]
================================ ================================ ================================
TRACE                            0                                [0]
DEBUG                            0                                [0]
SYS_FLAGS                        1                                [1]
SPD_RETENTION_WEEKS              53                               [53]
CASCADE                          DBMS_STATS.AUTO_CASCADE          [DBMS_STATS.AUTO_CASCADE]
ESTIMATE_PERCENT                 DBMS_STATS.AUTO_SAMPLE_SIZE      [DBMS_STATS.AUTO_SAMPLE_SIZE]
DEGREE                           NULL                             [NULL]
METHOD_OPT                       FOR ALL COLUMNS SIZE AUTO        [FOR ALL COLUMNS SIZE AUTO]
NO_INVALIDATE                    DBMS_STATS.AUTO_INVALIDATE       [DBMS_STATS.AUTO_INVALIDATE]
GRANULARITY                      AUTO                             [AUTO]
PUBLISH                          TRUE                             [TRUE]
STALE_PERCENT                    10                               [10]
APPROXIMATE_NDV                  TRUE                             [TRUE]
APPROXIMATE_NDV_ALGORITHM        REPEAT OR HYPERLOGLOG            [REPEAT OR HYPERLOGLOG]
ANDV_ALGO_INTERNAL_OBSERVE       FALSE                            [FALSE]
INCREMENTAL                      FALSE                            [FALSE]
INCREMENTAL_INTERNAL_CONTROL     TRUE                             [TRUE]
AUTOSTATS_TARGET                 AUTO                             [AUTO]
CONCURRENT                       OFF                              [OFF]
JOB_OVERHEAD_PERC                1                                [1]
JOB_OVERHEAD                     -1                               [-1]
GLOBAL_TEMP_TABLE_STATS          SESSION                          [SESSION]
ENABLE_TOP_FREQ_HISTOGRAMS       3                                [3]
ENABLE_HYBRID_HISTOGRAMS         3                                [3]
TABLE_CACHED_BLOCKS              1                                [1]
INCREMENTAL_LEVEL                PARTITION                        [PARTITION]
INCREMENTAL_STALENESS            ALLOW_MIXED_FORMAT               [ALLOW_MIXED_FORMAT]
OPTIONS                          GATHER                           [GATHER]
GATHER_AUTO                      AFTER_LOAD                       [AFTER_LOAD]
STAT_CATEGORY                    OBJECT_STATS, REALTIME_STATS     [OBJECT_STATS, REALTIME_STATS]
SCAN_RATE                        0                                [0]
GATHER_SCAN_RATE                 HADOOP_ONLY                      [HADOOP_ONLY]
PREFERENCE_OVERRIDES_PARAMETER   FALSE                            [FALSE]
AUTO_STAT_EXTENSIONS             OFF                              [OFF]
WAIT_TIME_TO_UPDATE_STATS        15                               [15]
ROOT_TRIGGER_PDB                 FALSE                            [FALSE]
COORDINATOR_TRIGGER_SHARD        FALSE                            [FALSE]
MAINTAIN_STATISTICS_STATUS       FALSE                            [FALSE]
AUTO_TASK_STATUS                 OFF                              [OFF]
AUTO_TASK_MAX_RUN_TIME           3600                             [3600]
AUTO_TASK_INTERVAL               900                              [900]

Preferences reported: 41

As the notes that I’ve left in-line say: this version of the script has to be run by SYS or a DBA because of the privileges required.

You might notice, by the way, that this is one of those rare cases where I’ve remembered to use a label to name the PL/SQL block, and then used the label to qualify a variable I’ve used inside the block.
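For anyone who hasn't used the feature, here is a minimal sketch of a labelled block. The label main and the variable names are invented for illustration; they are not taken from the script above. The point of the qualification is that a variable which shares its name with a column would otherwise be resolved as the column inside the SQL statement:

```sql
<<main>>
declare
        -- deliberately the same name as a column in user_tables
        table_name      varchar2(128) := 'T1';
        n_rows          number;
begin
        select  count(*)
        into    main.n_rows
        from    user_tables ut
        where   ut.table_name = main.table_name;        -- column = block variable

        dbms_output.put_line('Matching tables: ' || main.n_rows);
end main;
/
```

Without the main. prefix the predicate would compare the column with itself and match every row in user_tables.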

July 9, 2021

19c tweak 2

Filed under: CBO,Oracle,Performance — Jonathan Lewis @ 4:40 pm BST Jul 9,2021

Trying to find out why a plan had changed in the upgrade from 11g to 19c I came across this cunning little tweak that must have appeared in the 19c timeline. I’ll start with a simple query, then the execution plans (autotrace traceonly) from 19.11.0.0 – first with the parameter optimizer_features_enable set to 18.1.0, then with it set to 19.1.0. The table t1 is a copy of the first 10,000 rows of view all_objects:

SQL> alter session set optimizer_features_enable = '18.1.0';
SQL> select count(data_object_id) from t1 where f1(object_id) = 'Y';

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     7 |    38  (37)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     7 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   100 |   700 |    38  (37)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("F1"("OBJECT_ID")='Y')



SQL> alter session set optimizer_features_enable = '19.1.0';
SQL> select count(data_object_id) from t1 where f1(object_id) = 'Y';

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     7 |    26   (8)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     7 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     5 |    35 |    26   (8)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("DATA_OBJECT_ID" IS NOT NULL AND "F1"("OBJECT_ID")='Y')

Optimising with the optimizer features set back to 18.1 the cardinality estimate is 100 (that’s 1% of the rows in the table, the standard guess for “function() = constant”) with a cost of 38, of which 37% is CPU cost.

Running with the optimizer features of 19c enabled the cardinality estimate drops to 5 and the cost drops to 26 with CPU making up 8% of the cost. Where does the difference come from?

As ever you have to look at the Predicate Information. Running as 18c Oracle has decided to call my function for every row in the table; running as 19c Oracle has decided that since I’m counting non-null entries of column data_object_id it need only call the function when data_object_id is not null, so it’s introduced an extra predicate to make that happen, and that extra predicate has reduced the cardinality and cost estimates. (In my sample data set there are 9,456 nulls and 544 distinct values for data_object_id – so the difference in workload is significant. And 1% of 544 is 5, which explains the cardinality estimate.)
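If you want to check the arithmetic against your own copy of t1, a query along the following lines is a reasonable sketch. The two est_rows columns simply apply the 1% guess to the total row count and to the non-null row count; that is an approximation of what the optimizer is doing, not a copy of its code path:

```sql
select
        count(*)                                total_rows,
        count(data_object_id)                   non_null_rows,
        count(*) - count(data_object_id)        null_rows,
        round(0.01 * count(*))                  est_rows_18c,   -- 1% of all rows
        round(0.01 * count(data_object_id))     est_rows_19c    -- 1% of non-null rows
from
        t1
;
```

With the figures quoted above (10,000 rows, 544 of them non-null) the last two columns come out as 100 and 5, matching the Rows column of the two plans.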

This looks like fix control 24761824 “add is not null for high null column in set function” introduced in 19.1.0. The description suggests that the feature will only be used in cases where the column is “often” null, but we have no clue, yet, about what “often” means. [Update 12th July: thanks to comment #1 below from Andi Schloegl we now have a pretty good idea that the break point is at 5%.]
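One way to test the guess about the fix control is to look it up and then switch it off for the session. This is only a sketch: it assumes that 24761824 really is the control responsible, and setting "_fix_control" is something to do on a test system (or on the advice of Oracle support) rather than in production:

```sql
-- Is the fix control present, and is it enabled?
select  bugno, value, optimizer_feature_enable, description
from    v$system_fix_control
where   bugno = 24761824
;

-- Switch it off for the session, then re-check the plan
alter session set "_fix_control" = '24761824:0';

set autotrace traceonly explain
select count(data_object_id) from t1 where f1(object_id) = 'Y';
set autotrace off

-- Switch it back on
alter session set "_fix_control" = '24761824:1';
```

If the guess is right, the plan produced with the control disabled should lose the "DATA_OBJECT_ID" IS NOT NULL predicate and go back to the estimate of 100 rows.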

This means that there may be cases where an execution plan changes on an upgrade to 19c because a tablescan has become cheaper or a cardinality estimate has been reduced.

Just as a confirmation of how the change in plan is echoing reality, here are the execution plans pulled from memory after executing them with the statistics_level set to all to enable collection of the rowsource execution statistics. First the 18c plan, then the 19c plan:

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:11.14 |    1780K|
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:11.14 |    1780K|
|*  2 |   TABLE ACCESS FULL| T1   |      1 |    100 |  10000 |00:00:11.14 |    1780K|
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("F1"("OBJECT_ID")='Y')



-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.46 |   97010 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.46 |   97010 |
|*  2 |   TABLE ACCESS FULL| T1   |      1 |      5 |    544 |00:00:00.46 |   97010 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("DATA_OBJECT_ID" IS NOT NULL AND "F1"("OBJECT_ID")='Y'))

As you can see, the number of buffer gets has dropped from 1,780K in 18c to 97K in 19c (mainly because the function results in a tablescan of a table of 178 blocks and the number of calls has dropped from 10,000 to 544), and the run time has dropped from 11.14 seconds to 0.46 seconds.
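As a rough cross-check of those figures, the following sketch assumes a very simple model: one scan of t1, plus one scan of t2 for every call to the function, with each scan costing about 178 buffer gets. That t1 is the same size as t2 is an assumption based on t2 being a straight copy of t1:

```sql
select
        10000 * 178 + 178       gets_18c,       -- 1,780,178
        544   * 178 + 178       gets_19c        -- 97,010
from
        dual
;
```

The second figure agrees with the 97010 Buffers reported in the 19c plan, and the first rounds to the 1780K reported in the 18c plan.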

Code

If you want to run and refine this test, here’s the code I used to generate the data.

rem
rem     Script:         19c_not_null_tweak.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2021
rem     Purpose:        
rem
rem     Last tested 
rem             19.11.0.0
rem

create table t1 as select * from all_objects where rownum <= 10000;
create table t2 as select * from t1;

create or replace function f1(i_obj in number) return varchar2
is
        n1 number;
begin
        select count(*) into n1 from t2 where object_id = i_obj;

        if n1 = 0 then
                return 'N';
        else
                return 'Y';
        end if;
end;
/

set autotrace traceonly explain

alter session set optimizer_features_enable = '18.1.0';
select count(data_object_id) from t1 where f1(object_id) = 'Y';

alter session set optimizer_features_enable = '19.1.0';
select count(data_object_id) from t1 where f1(object_id) = 'Y';

set autotrace off

set serveroutput off
alter session set statistics_level = all;

alter session set optimizer_features_enable = '18.1.0';
select count(data_object_id) from t1 where f1(object_id) = 'Y';
select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

alter session set optimizer_features_enable = '19.1.0';
select count(data_object_id) from t1 where f1(object_id) = 'Y';
select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

alter session set statistics_level = typical;
set serveroutput on

July 5, 2021

Fussy FBIs

Filed under: CBO,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 11:19 am BST Jul 5,2021

In a recent thread on the Oracle Developer Forum a user was seeing a significant increase in time spent waiting for row locks after the number of executions of a particular “select for update” had increased from a couple of hundred per hour to a thousand per hour.

It turned out that the locking was a deliberate queueing mechanism following the basic pattern:

lock a row in the "locks" table

do some work in "another table" to flag some rows (perhaps to "own" them).

commit;

The intent was to ensure that processes did not collide (and possibly deadlock) while working on “another table”. It turned out that the increased wait time was due to an increase in the time spent between the lock and the commit; and the reason for that increase was simply a change in the execution path of a key statement executed between the two steps. The core of the work was simply the execution of one or both of two statements:

UPDATE TRAN_TAB 
SET 
        PID     = :B3,
        LOCK_ID = :B2,
        STATUS  = 'I' 
WHERE
        PID    IS NULL 
AND     STATUS = 'W' 
AND     ROWNUM <= :B1
;

UPDATE TRAN_TAB 
SET 
        PID     = :B3,
        LOCK_ID = :B2,
        STATUS  = 'I' 
WHERE
        PID    IS NULL 
AND     STATUS = 'T' 
AND     ROWNUM <= :B1
;

Originally the query had been using an index range scan on an index defined as (status, id, lock_id) but it had switched to using a tablescan because the estimated cardinality had changed from 18 rows to 3.5 million rows.

When you notice that the leading column of the index is called status you might guess (correctly) that there are just a few distinct values for the status, and just a few rows each for values ‘T’ and ‘W’ and that something unexpected had happened during statistics collection that had made Oracle “lose” sight of the special cases and treat ‘T’ (or ‘W’) as an “average” case either using “total rows / num_distinct” or “half the least popular” to estimate the cardinality. [At the time of writing it looks as if the problem appears as a side effect of “real-time statistics”.]

One fix, of course, would be to ensure that the statistics for this column never ever went wrong – and there are various ways of doing that, some more complicated and fragile than others (it’s a partitioned table and needs a suitable frequency histogram in place to get good estimates – the combination isn’t nice). Another strategy would simply be to hint the code (or add an sql_plan_baseline or sql_patch) to use the relevant index.
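As an illustration of the "fix the statistics" option, here is a sketch that sets a table preference so that any later gather requests a histogram on status. It is written against the tran_tab model from the footnote, not the user's partitioned table, and it makes no attempt to deal with the complications of partitions or real-time statistics mentioned above:

```sql
begin
        -- ask for a histogram on status every time stats are gathered
        dbms_stats.set_table_prefs(
                ownname => user,
                tabname => 'TRAN_TAB',
                pname   => 'METHOD_OPT',
                pvalue  => 'for all columns size auto for columns size 254 status'
        );
        dbms_stats.gather_table_stats(user, 'TRAN_TAB');
end;
/

-- check what was actually created
select  column_name, num_distinct, histogram, num_buckets
from    user_tab_col_statistics
where   table_name  = 'TRAN_TAB'
and     column_name = 'STATUS'
;
```

The preference only expresses the intent; whether the resulting histogram survives every form of statistics maintenance on a busy partitioned table is exactly the fragility the paragraph above is warning about.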

The nicest strategy (especially given the update to two columns out of the three in the index) might be to take advantage of function-based indexes – creating an index that would be impossible for the optimizer to avoid for these queries, that is as small and efficient as possible, and is highly unlikely to be used in the wrong circumstances. For example, a “two-index” solution:

create index tt_ft on tran_tab(
        case when status = 'T' and pid is null then 0 end
);

create index tt_fw on tran_tab(
        case when status = 'W' and pid is null then 0 end
);

or “single-index” solution:

create index tt_ftw on tran_tab(
        case when status in ('W','T') and pid is null then status end
);

The indexes hold entries only for the very small number of interesting rows, and when the status is updated the entries disappear from the index (rather than being deleted from, and re-inserted to, a very large index). Given the number of partitions in the table (ca. 100) and the very small number of rows involved, and the time-critical nature of the requirement, there’s a good case for making this a global index to avoid the need for doing lots of index probes that will find no data.

The next critical issue is that the code has to be modified to use the index – and the code has to be very precisely written. Here, from a simple model (see footnote), are a couple of examples followed by their (actual) execution plans:

select  lock_id 
from    tran_tab 
where   case when status = 'T' and pid is null then 0 end = 0 
and     rownum <= 5;

select * from table(dbms_xplan.display_cursor);


-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |       |       |     6 (100)|          |
|*  1 |  COUNT STOPKEY                       |          |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| TRAN_TAB |     5 |    30 |     6   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | TT_FT    |    10 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=5)
   3 - access("TRAN_TAB"."SYS_NC00006$"=0)


select  lock_id 
from    tran_tab 
where   case when status in ('W','T') and pid is null then status end = 'W'
;

select * from table(dbms_xplan.display_cursor);


------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |       |       |    11 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TRAN_TAB |    10 |    60 |    11   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | TT_FTW   |    10 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("TRAN_TAB"."SYS_NC00008$"='W')

Be Careful

The title of this piece is “Fussy FBI” – and the reason for writing it is a reminder that it’s nicer to create and index virtual columns rather than creating function-based indexes. And, if you’re on any recent version of Oracle (12c onwards) it’s a good idea to make the virtual columns invisible so that lazy code (select *, or insert without a specified list of columns, or pl/sql “insert row”) doesn’t result in an error due to the virtual column.

Take the two where clauses I’ve used above and change them slightly – in one case swapping the order of the predicates, in the other swapping the order of the values in the IN list – and the execution paths change from index range scans to tablescans.

where   case when status = 'T' and pid is null then 0 end = 0 -- index range scan
where   case when pid is null and status = 'T' then 0 end = 0 -- tablescan


where   case when status in ('W','T') and pid is null then status end = 'W' -- index range scan
where   case when status in ('T','W') and pid is null then status end = 'W' -- tablescan

When you create the function-based index Oracle may rewrite the definition into a “normalised” form – for example when I query user_ind_expressions for my tt_ftw index it turns out that the stored definition is:

CASE  WHEN (("STATUS"='W' OR "STATUS"='T') AND "PID" IS NULL) THEN "STATUS" END

But when you write a query that looks as if it should match the predicate that’s visible in user_ind_expressions the optimizer won’t necessarily notice the match.
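To see the stored definitions for yourself, queries like the following should do the job. They assume the tran_tab model from the footnote; the second query lists the hidden columns (the SYS_NC…$ names that appear in the predicate sections of the plans) that Oracle creates to support the indexes:

```sql
set long 200

-- the normalised expressions stored for each function-based index
select  index_name, column_position, column_expression
from    user_ind_expressions
where   table_name = 'TRAN_TAB'
order by
        index_name, column_position
;

-- the hidden virtual columns supporting those indexes
select  column_name, hidden_column, virtual_column, data_default
from    user_tab_cols
where   table_name    = 'TRAN_TAB'
and     hidden_column = 'YES'
order by
        internal_column_id
;
```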

Summary

When you create a function-based index the expression you use in your queries must be a very good match for the expression that you used when creating the index. This is just one reason why it may be better to create a virtual column using the expression – then no-one has to remember exactly what the expression was in their queries.

Defining the virtual column as invisible is then a sensible strategy to avoid problems due to code that doesn’t specify explicit column names in all the cases where they should appear.
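Here is a sketch of what the virtual column approach might look like for the “single-index” case. The names tw_flag and tt_vtw are invented for the example, and the function-based index is dropped first because Oracle is likely to object to a virtual column whose expression duplicates one that already exists to support an index:

```sql
-- remove the FBI that uses the same expression (tran_tab model from the footnote)
drop index tt_ftw;

alter table tran_tab add (
        tw_flag invisible
        generated always as (
                case when status in ('W','T') and pid is null then status end
        ) virtual
);

create index tt_vtw on tran_tab(tw_flag);

-- queries name the column, so there is no expression to mistype
select  lock_id
from    tran_tab
where   tw_flag = 'W'
;

select * from table(dbms_xplan.display_cursor);
```

Because the column is invisible it doesn’t appear in “select *” and isn’t expected by an insert that has no column list, but it can still be referenced by name.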

Footnote

The following script will create the table and indexes used in this note:

rem
rem     Script:         fussy_fbi.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2021
rem
rem     Last tested 
rem             19.3.0.0
rem
rem     Notes:
rem     You have to be careful with FBI definitions and usage.
rem     the match has to be very good.
rem

create table tran_tab (
        pid             number,
        id              number,
        lock_id         number,
        status          varchar2(1),
        padding         varchar2(100)
);

insert into tran_tab
select
        case when mod(rownum,10) = 0 then to_number(null) else rownum end,
        rownum,
        rownum,
        chr(65 + 8 * mod(rownum,4)),
        rpad('x',100)
from
        all_objects
where
        rownum <= 1e4
;

update tran_tab set status = 'T' where mod(lock_id,1000) = 0;
update tran_tab set status = 'W' where mod(lock_id, 990) = 0;

create index tt_ft on tran_tab(
        case when status = 'T' and pid is null then 0 end
);

create index tt_fw on tran_tab(
        case when status = 'W' and pid is null then 0 end
);

create index tt_ftw on tran_tab(
        case when status in ('W','T') and pid is null then status end
);

commit;

execute dbms_stats.gather_table_stats(user,'tran_tab')

set serveroutput off

prompt  ===========
prompt  Correct use
prompt  ===========

select  lock_id 
from    tran_tab 
where   case when status = 'T' and pid is null then 0 end = 0 
and     rownum <= 5;

select * from table(dbms_xplan.display_cursor);

select  lock_id 
from    tran_tab 
where   case when status in ('W','T') and pid is null then status end = 'W'
;

select * from table(dbms_xplan.display_cursor);

prompt  ==========
prompt  Failed use
prompt  ==========

select  lock_id 
from    tran_tab 
where   case when pid is null and status = 'T' then 0 end = 0 
and     rownum <= 5;

select * from table(dbms_xplan.display_cursor);

select  lock_id 
from    tran_tab 
where   case when status in ('T','W') and pid is null then status end = 'W'
;

select * from table(dbms_xplan.display_cursor);

March 8, 2021

Join Elimination redux

Filed under: Bugs,CBO,Join Elimination,Oracle,Transformations — Jonathan Lewis @ 12:58 pm GMT Mar 8,2021

This note is a followup to a post from a few years back (originally dating further back to 2012) where I described an inconsistency that appeared when join elimination and deferrable constraints collided. The bug resurfaced recently in a new guise in a question on the Oracle Developer forum with a wonderful variation on the symptoms that ultimately gave a good clue to the underlying issue. The post included a complete working example of the anomaly, but I’ll demonstrate it using a variation of my 2012/2017 code. We start with a pair of tables with referential integrity defined between them:

rem
rem     Script:         join_eliminate_bug_3.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Feb 2021
rem
rem     Last tested 
rem             19.11.0.0
rem             19.8.0.0 (LiveSQL)
rem

create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
        deferrable initially immediate
)
;

create table child(
        id_p    number(4)       
                constraint chi_fk_par
                references parent,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id) 
)
;

insert into parent values (1,'Smith');

insert into child values(1,1,'Simon');
insert into child values(1,2,'Sally');

commit;

begin
        dbms_stats.gather_table_stats(user,'child');
        dbms_stats.gather_table_stats(user,'parent');
end;
/

You’ll notice that I’ve created the primary key constraint on parent as “deferrable initially immediate”. So let’s write some code that defers the constraint, inserts some duplicate data, and executes a join between the two tables:

set serveroutput off
set constraint par_pk deferred;

insert into parent (id,name) values (1,'Smith');

select
        /*+ initially immediate  PK */
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);

Since I’ve now got two rows with id = 1 in parent the query ought to return duplicates for every row in child where id_p = 1, but it doesn’t. Here’s the output from the query and the execution plan:

     ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          2 Sally

2 rows selected.


PLAN_TABLE_OUTPUT
--------------------------------------------------
SQL_ID  gy6h8td4tmdpg, child number 0
-------------------------------------
select  /*+ initially immediate  PK */  chi.* from  child chi,  parent
par where  par.id = chi.id_p

Plan hash value: 2406669797

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS FULL| CHILD |     2 |    24 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

The optimizer has applied “join elimination” to take parent out of the transformed query, so no duplicates. Arguably this is the wrong result.

Let’s roll back the insert and change the experiment – let’s change the constraint on the parent primary key so that it’s still deferrable, but initially deferred then repeat the insert and query:

rollback;
alter table child drop constraint chi_fk_par;
alter table parent drop constraint par_pk;

alter table parent add constraint par_pk primary key (id) deferrable initially deferred;
alter table child add constraint chi_fk_par foreign key(id_p) references parent;

insert into parent (id,name) values (1,'Smith');

select
        /*+ initially deferred  PK */
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);

In this case we don’t need to “set constraint par_pk deferred”, it’s implicitly deferred by definition and will only be checked when we commit any transaction. Would you expect this to make any difference to the result? This is what we get:

      ID_P         ID NAME
---------- ---------- ------------
         1          1 Simon
         1          1 Simon
         1          2 Sally
         1          2 Sally

4 rows selected.


PLAN_TABLE_OUTPUT
--------------------------------------------------
SQL_ID  8gvn3mzr8uv0h, child number 0
-------------------------------------
select  /*+ initially deferred  PK */  chi.* from  child chi,  parent
par where  par.id = chi.id_p

Plan hash value: 1687613841

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |       |       |     2 (100)|          |
|   1 |  NESTED LOOPS      |        |     2 |    30 |     2   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| CHILD  |     2 |    24 |     2   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | PAR_PK |     1 |     3 |     0   (0)|          |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PAR"."ID"="CHI"."ID_P")

When the parent primary key is initially deferred then join elimination doesn’t take place – so we get two copies of each child row in the output. (This is still true even if we add the “rely” option to the parent primary key constraint).

Debug Analysis

As I said at the top of the article, this does give us a clue about the source of the bug. A check of the dictionary table cdef$ (constraint definitions) shows the following notes for column cdef$.defer:

  defer         number,                     /* 0x01 constraint is deferrable */
                                              /* 0x02 constraint is deferred */
                                /* 0x04 constraint has been system validated */
                                 /* 0x08 constraint name is system generated */
etc...

With my examples the “initially immediate” constraint reported defer = 5, for the “initially deferred” constraint it reported the value 7. It looks as if the optimizer code to handle join elimination looks only at the static definition of the constraint (bit 0x02) and doesn’t consider the possibility that if bit 0x01 is set it should also check the session state to see if the constraint has been temporarily deferred.
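You don’t need access to cdef$ to see the static definition; user_constraints reports the same information in readable form. The following sketch shows that query, followed by a decode of the two values 5 and 7 against the bits listed in the notes above. Like cdef$.defer, the view shows how the constraint was defined, not whether the current session has issued “set constraint ... deferred”:

```sql
select
        table_name, constraint_name, constraint_type,
        deferrable, deferred, validated, rely
from
        user_constraints
where
        table_name in ('PARENT','CHILD')
order by
        table_name, constraint_name
;

-- decoding the two reported values of cdef$.defer
select
        defer_value,
        bitand(defer_value, 1)  deferrable_bit,         -- 0x01
        bitand(defer_value, 2)  deferred_bit,           -- 0x02
        bitand(defer_value, 4)  sys_validated_bit       -- 0x04
from    (
        select 5 defer_value from dual
        union all
        select 7 from dual
        )
;
```

The value 5 has bits 0x01 and 0x04 set (deferrable, validated, not deferred); the value 7 adds bit 0x02, the only difference between the two cases.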

Conclusion

If you are going to implement deferrable constraints be very careful about tracking exactly how you use them, and be aware that if you execute arbitrary queries in mid-transaction then you may find that the results are not exactly what you expect. In fact, though it’s not demonstrated here, different forms of SQL that should express the same requirement may not give the same results.
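If you have a query that must see the duplicates in mid-transaction, one possible defence is to block the transformation explicitly with the no_eliminate_join hint. This is a sketch of the idea rather than something demonstrated in this note, so check the plan and the row count on your own version:

```sql
set constraint par_pk deferred;

insert into parent (id,name) values (1,'Smith');

select
        /*+ no_eliminate_join(par) */
        chi.*
from
        child   chi,
        parent  par
where
        par.id = chi.id_p
;

select * from table(dbms_xplan.display_cursor);

rollback;
```

If the hint is obeyed the plan should show the join to parent (or its primary key index) and the query should return each child row twice.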

Update (May 2021)

This buggy behaviour is still present in 19.11.0.0

February 16, 2021

Adaptive error

Filed under: CBO,dbms_xplan,Oracle,Statistics — Jonathan Lewis @ 5:41 pm GMT Feb 16,2021

There’s a thread on the Oracle Database Forum at present where someone has supplied a script to create some data that’s guaranteed to reproduce wrong results (provided your system stats and optimizer parameters are at their default values). They’ve even supplied a link to the script on LiveSQL (opens in new window) – which is running 19.8 – to demonstrate the problem.

I’ve tested on 12.2.0.1 and 19.3.0.0 and the problem occurs in both versions – though with my setup the initial plan that returned the wrong results didn’t re-optimize to a plan with the correct results in 12.2.0.1.

I’ve included a script at the end of the note to create the data set but I’ll describe some of the objects as we go along – starting with a query that gives the correct result, apparently because it’s been hinted to do so:

execute dbms_stats.delete_system_stats

set linesize 255
set pagesize  60
set trimspool on

alter session set statistics_level='all';
set serveroutput off

select 
        /*+ use_hash(dwf) */ 
        count(*) count_hash 
from 
        test_dwf_sapfi  dwf
where
         exists (
                select  1 
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        );

select * from table(dbms_xplan.display_cursor(format=>'cost outline allstats last partition hint_report adaptive'));


test_dwf_sapfi is a table with a single numeric column datumucetnipom_code, the table is list partitioned by that column with 61 partitions. Each partition is defined to hold a single value. The number is designed to look like a date in the format YYYYMMDD.

test_sapfi_coicar_at5dat11 is a table with two columns (datuct, datumzprac). The first column is a date column with data covering a range of 60 dates, the second column is a numeric column and the table is list partitioned on that column. All the data in the table is in one partition of that table and the column holds the same value for every row (again it’s a number that looks like a date).

There are 15,197 rows in each table, and the test_dwf_sapfi data has been created as a copy (with a suitable to_number(to_char()) formatting change) from the test_sapfi_coicar_at5dat11 table.

Here’s the execution plan from 19c:

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                            |      1 |        |   328 (100)|       |       |      1 |00:00:00.02 |     155 |     69 |       |       |          |
|   1 |  SORT AGGREGATE              |                            |      1 |      1 |            |       |       |      1 |00:00:00.02 |     155 |     69 |       |       |          |
|*  2 |   HASH JOIN RIGHT SEMI       |                            |      1 |    253 |   328   (1)|       |       |  15197 |00:00:00.02 |     155 |     69 |  2352K|  2352K| 2110K (0)|
|   3 |    PART JOIN FILTER CREATE   | :BF0000                    |      1 |    152 |    13   (0)|       |       |  15197 |00:00:00.01 |      25 |      0 |       |       |          |
|   4 |     PARTITION LIST SINGLE    |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |      0 |       |       |          |
|   5 |      TABLE ACCESS FULL       | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |      0 |       |       |          |
|   6 |    PARTITION LIST JOIN-FILTER|                            |      1 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     130 |     69 |       |       |          |
|   7 |     TABLE ACCESS FULL        | TEST_DWF_SAPFI             |     60 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     130 |     69 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      FULL(@"SEL$5DA710D3" "COI"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "DWF"@"SEL$1" "COI"@"SEL$2")
      USE_HASH(@"SEL$5DA710D3" "COI"@"SEL$2")
      SWAP_JOIN_INPUTS(@"SEL$5DA710D3" "COI"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))


Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (N - Unresolved (1))
---------------------------------------------------------------------------
   7 -  SEL$5DA710D3 / DWF@SEL$1
         U -  use_hash(dwf)



You’ll notice there’s no “adaptive” information in the report, and there’s no “Note” section saying it’s an adaptive plan. You might also note that the plan looks as if it’s doing a hash join into “dwf”, but the “Hint Report” tells us that the hint has not been used and the “Outline Data” tells us that the plan has actually arrived as the result of the combination /*+ use_hash(coi) swap_join_inputs(coi) */. In fact this is the default plan (on my system) that would have appeared in the complete absence of hints.
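
The combination named in the outline can be written out as explicit hints to check that it gives the same plan. This is only a sketch, not part of the test script: it copies the query block name (sel$5da710d3) and the qualified aliases from the Outline Data above, and it assumes that placing the hints in the main query block is sufficient on your version.

```sql
-- Hints copied from the Outline Data of the default plan.
-- Assumption: supplying them in the main query block is enough
-- for the optimizer to resolve the unnested query block name.
select
        /*+
                unnest(@sel$2)
                leading(@sel$5da710d3 dwf@sel$1 coi@sel$2)
                use_hash(@sel$5da710d3 coi@sel$2)
                swap_join_inputs(@sel$5da710d3 coi@sel$2)
        */
        count(*) count_hash
from
        test_dwf_sapfi  dwf
where
        exists (
                select  1
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        )
;
```

If the hints are accepted, the Hint Report should list them without the “U” marker.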

The result of the count(*) should be 15,197 – and you can see that this plan has produced the right answer when you check the A-Rows value for operation 2 (the hash join right semi that generates the rowsource for the sort aggregate).

The adaptive anomaly

So now we try again, but with a hint to generate a nested loop join, and it gives us the wrong result (8) and an oddity in the plan. I’ve reported the body of the plan twice: the first version includes the adaptive information, the second is the tidier plan we get by omitting the ‘adaptive’ format option:

select 
        count(*) count_nl 
from 
        test_dwf_sapfi  dwf
where 
        exists (
                select
                        /*+
                                use_nl (coi)
                        */
                        1
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        )
;

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                    | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT             |                            |      1 |        |   329 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|     1 |  SORT AGGREGATE              |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|- *  2 |   HASH JOIN                  |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |  3667K|  1779K|          |
|     3 |    PART JOIN FILTER CREATE   | :BF0000                    |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|     4 |     NESTED LOOPS             |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|-    5 |      STATISTICS COLLECTOR    |                            |      1 |        |            |       |       |     60 |00:00:00.01 |      25 |       |       |          |
|     6 |       SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|     7 |        PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|     8 |         TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|     9 |      PARTITION LIST ITERATOR |                            |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
|  * 10 |       TABLE ACCESS FULL      | TEST_DWF_SAPFI             |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
|-   11 |    PARTITION LIST JOIN-FILTER|                            |      0 |  15197 |   314   (1)|:BF0000|:BF0000|      0 |00:00:00.01 |       0 |       |       |          |
|-   12 |     TABLE ACCESS FULL        | TEST_DWF_SAPFI             |      0 |  15197 |   314   (1)|:BF0000|:BF0000|      0 |00:00:00.01 |       0 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |      1 |        |   329 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   1 |  SORT AGGREGATE            |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   2 |   PART JOIN FILTER CREATE  | :BF0000                    |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|   3 |    NESTED LOOPS            |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.01 |     154 |       |       |          |
|   4 |     SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   5 |      PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |       TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   7 |     PARTITION LIST ITERATOR|                            |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
|*  8 |      TABLE ACCESS FULL     | TEST_DWF_SAPFI             |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.01 |     129 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      USE_NL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "COI"@"SEL$2")
      LEADING(@"SEL$5DA710D3" "COI"@"SEL$2" "DWF"@"SEL$1")
      SEMI_TO_INNER(@"SEL$5DA710D3" "COI"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))
  10 - filter("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   8 -  SEL$5DA710D3 / COI@SEL$2
         U -  use_nl (coi)

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

Points to note here:

  • The most important item to note is that the nested loop at operation 3 (of the tidy plan) reports A-Rows as 8, which is the wrong result.
  • Then there’s the oddity that operation 2 is a “part join filter create” that shouldn’t be there for a nested loop. It’s a hash join feature that allows the Pstart/Pstop columns to report partition pruning by Bloom filter (“:BFnnnn”), but we’re running a nested loop join which can pass in the partition key, so we see KEY/KEY as the Pstart/Pstop.
  • The third thing we can pick up is that the 8 rows in our nested loop rowsource are echoed in the A-Rows for the 60 executions of the partition table scans of test_dwf_sapfi at operations 7 and 8 in the reduced plan. It’s probably not a complete coincidence that the nested loop join is passing the partition keys in partition key order (sort unique at operation 4) and there are 8 rows in the last populated partition of test_dwf_sapfi.
  • Finally we note from the Hint Report that the hint, as supplied, was not used, and the outline shows us that the path was actually “leading(coi dwf) use_nl(dwf)”.
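
The claim about the last populated partition is easy to check against the test data. A minimal sketch, assuming the tables have been created with the script at the end of this note:

```sql
-- Rows per partition key in test_dwf_sapfi, in partition key order.
-- The highest key (20200414) is loaded by the script with
-- "connect by level < 9", i.e. 8 rows.
select
        datumucetnipom_code,
        count(*)        row_count
from
        test_dwf_sapfi
group by
        datumucetnipom_code
order by
        datumucetnipom_code
;
```

Changing the “connect by level < 9” limit in the script and re-running both this query and the hinted query is one way to see whether the wrong result tracks the size of that final partition.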

The really fascinating thing about this execution plan is that it contains a hint that was not used – but the plan changed from the default plan to a slightly more expensive plan.

If at first you don’t succeed

There’s just one more surprise to reveal – we had an adaptive plan, which tends to mean the optimizer plays towards a nested loop join but hedges its bets to be able to swing to a hash join in mid-plan. This suggests that the real-time stats collector thought there wasn’t much data and a nested loop was good – but what happens when I run exactly the same query again? In my 12c system the answer was nothing changed, but in my 19c system a new plan appeared:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                            |      1 |        |   331 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   1 |  SORT AGGREGATE              |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|*  2 |   HASH JOIN                  |                            |      1 |    120K|   331   (2)|       |       |  15197 |00:00:00.01 |     154 |  2171K|  2171K| 1636K (0)|
|   3 |    PART JOIN FILTER CREATE   | :BF0000                    |      1 |  15197 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |       |       |          |
|   4 |     SORT UNIQUE              |                            |      1 |  15197 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   5 |      PARTITION LIST SINGLE   |                            |      1 |  15197 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |       TABLE ACCESS FULL      | TEST_SAPFI_COICAR_AT5DAT11 |      1 |  15197 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   7 |    PARTITION LIST JOIN-FILTER|                            |      1 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     129 |       |       |          |
|   8 |     TABLE ACCESS FULL        | TEST_DWF_SAPFI             |     60 |  15197 |   314   (1)|:BF0000|:BF0000|  15197 |00:00:00.01 |     129 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$5DA710D3")
      UNNEST(@"SEL$2")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      FULL(@"SEL$5DA710D3" "COI"@"SEL$2")
      FULL(@"SEL$5DA710D3" "DWF"@"SEL$1")
      LEADING(@"SEL$5DA710D3" "COI"@"SEL$2" "DWF"@"SEL$1")
      USE_HASH(@"SEL$5DA710D3" "DWF"@"SEL$1")
      SEMI_TO_INNER(@"SEL$5DA710D3" "COI"@"SEL$2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   6 -  SEL$5DA710D3 / COI@SEL$2
         U -  use_nl (coi)

Note
-----
   - statistics feedback used for this statement

This is the output with the ‘adaptive’ format in place – but the plan isn’t adaptive – the optimizer has used statistics feedback (formerly cardinality feedback) to work out a better plan. The hint is still unused of course, but when we check the plan we can see that:

  • it has got the right answer – the hash join at operation 2 reports 15,197 rows
  • the “partition join” Bloom filter created at operation 3 has been used for the Pstart/Pstop at operations 7 and 8
  • even though the hint has not been used the plan is (again) not the same as the default plan: we’ve got a hash join (with Bloom filter) after a sort unique of the test_sapfi_coicar_at5dat11 data, while the default plan had a hash join right semi, with no sort unique, and an overall lower cost.
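
One way to see the re-optimization happening is to look at the child cursors for the statement after each execution. This is a sketch only: it assumes you have access to the v$ views, and m_sql_id is a hypothetical SQL*Plus substitution variable holding the sql_id of the hinted query.

```sql
-- One row per child cursor; a new child with a different
-- plan_hash_value is expected after the second execution.
select
        child_number,
        plan_hash_value,
        executions,
        is_reoptimizable
from
        v$sql
where
        sql_id = '&m_sql_id'
order by
        child_number
;

-- Reason for the extra child cursor, if statistics feedback was involved.
select
        child_number,
        use_feedback_stats
from
        v$sql_shared_cursor
where
        sql_id = '&m_sql_id'
order by
        child_number
;
```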

What Happened ?

Clearly there is a bug. It’s a slightly sensitive bug, and all I had to do to eliminate it was to gather stats on the underlying tables. (You’ll find in the table creation script at the end of this note that there are basically no object stats on the “big” partitioned table, which is presumably why the adaptive stuff came into play and allowed the bug to surface, and why 19c statistics feedback produced a new plan on the second execution.)
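
To see which object stats actually exist before and after gathering, the data dictionary can be queried directly. A minimal sketch, assuming the two test tables are in the current schema:

```sql
-- Table-level and partition-level stats for the two test tables.
-- Rows with a null last_analyzed have no stats at that level.
select
        table_name,
        partition_name,
        object_type,
        num_rows,
        blocks,
        last_analyzed
from
        user_tab_statistics
where
        table_name in ('TEST_DWF_SAPFI', 'TEST_SAPFI_COICAR_AT5DAT11')
order by
        table_name,
        partition_position
;
```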

It may be rather difficult for an outsider to pin down what’s going wrong and bypass the bug. One of the first ideas that appeared on the forum was that the Bloom filter pruning was breaking something – but when I added the hint /*+ opt_param('_bloom_pruning_enabled','false') */ to the query all I got was basically the same nested loop plan without the Bloom filter creation, and I still ended up with the wrong result.
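
For reference, the modified statement might look like the sketch below. The placement of the opt_param() hint in the main query block is an assumption; the Hint Report further down only shows that it was accepted at the statement level.

```sql
-- Same query as before, with Bloom filter pruning disabled for this
-- statement only. Hidden parameters should normally be changed only
-- on a test system or on advice from Oracle Support.
select
        /*+ opt_param('_bloom_pruning_enabled','false') */
        count(*) count_nl
from
        test_dwf_sapfi  dwf
where
        exists (
                select
                        /*+
                                use_nl (coi)
                        */
                        1
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        )
;
```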

Finally, here’s a plan I got when I hinted the query correctly to force the nested loop join with test_dwf_sapfi as the inner (second) table in the join (in other words I hinted the plan that had been giving me the wrong results):

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                            |      1 |        |   405 (100)|       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   1 |  SORT AGGREGATE           |                            |      1 |      1 |            |       |       |      1 |00:00:00.01 |     154 |       |       |          |
|   2 |   NESTED LOOPS            |                            |      1 |  38491 |   405   (1)|       |       |  15197 |00:00:00.01 |     154 |       |       |          |
|   3 |    SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   4 |     PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   5 |      TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |    PARTITION LIST ITERATOR|                            |     60 |    253 |     5   (0)|   KEY |   KEY |  15197 |00:00:00.01 |     129 |       |       |          |
|*  7 |     TABLE ACCESS FULL     | TEST_DWF_SAPFI             |     60 |    253 |     5   (0)|   KEY |   KEY |  15197 |00:00:00.01 |     129 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - filter("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   1 -  SEL$5DA710D3
           -  leading(@sel$5da710d3 coi@sel$2 dwf@sel$1)

   7 -  SEL$5DA710D3 / DWF@SEL$1
           -  use_nl(@sel$5da710d3 dwf@sel$1)
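
The statement behind this plan isn’t reproduced here, but the Hint Report shows the two hints involved. A sketch of how the query might look, assuming the hints were placed in the main query block (the actual placement, and the column alias, are not stated):

```sql
-- Fully qualified hints taken from the Hint Report:
-- join order coi -> dwf, nested loop into dwf.
select
        /*+
                leading(@sel$5da710d3 coi@sel$2 dwf@sel$1)
                use_nl(@sel$5da710d3 dwf@sel$1)
        */
        count(*) count_nl
from
        test_dwf_sapfi  dwf
where
        exists (
                select  1
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        )
;
```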

Compare this with the plan I got by using the wrong hint, resulting in the adaptive plan, but with Bloom filter pruning disabled:

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                       | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                            |      1 |        |   329 (100)|       |       |      1 |00:00:00.05 |     154 |       |       |          |
|   1 |  SORT AGGREGATE           |                            |      1 |      1 |            |       |       |      1 |00:00:00.05 |     154 |       |       |          |
|   2 |   NESTED LOOPS            |                            |      1 |  38491 |   329   (1)|       |       |      8 |00:00:00.05 |     154 |       |       |          |
|   3 |    SORT UNIQUE            |                            |      1 |    152 |    13   (0)|       |       |     60 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|   4 |     PARTITION LIST SINGLE |                            |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   5 |      TABLE ACCESS FULL    | TEST_SAPFI_COICAR_AT5DAT11 |      1 |    152 |    13   (0)|     2 |     2 |  15197 |00:00:00.01 |      25 |       |       |          |
|   6 |    PARTITION LIST ITERATOR|                            |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.05 |     129 |       |       |          |
|*  7 |     TABLE ACCESS FULL     | TEST_DWF_SAPFI             |     60 |    253 |   314   (1)|   KEY |   KEY |      8 |00:00:00.05 |     129 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   7 - filter("DWF"."DATUMUCETNIPOM_CODE"=TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("COI"."DATUCT"),'yyyymmdd')))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (1))
---------------------------------------------------------------------------
   0 -  STATEMENT
           -  opt_param('_bloom_pruning_enabled','false')

   5 -  SEL$5DA710D3 / COI@SEL$2
         U -  use_nl (coi)

It’s the same plan (with the same plan hash value, though I haven’t shown that) – it has the same predicates, and does the same amount of work. But when the optimizer gets to this plan through the adaptive pathway the run-time engine produces the wrong results (note A-Rows = 8 at operation 2), while if the plan is forced by a correct set of hints the run-time engine produces the right result.

As you might guess, another way to bypass the problem was to disable adaptive plans – but when I did that the only way to get the nested loop path was through correct hinting anyway.
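
The note doesn’t say how adaptive plans were disabled. In 12.2 and later one option is the session-level parameter shown in this sketch (12.1 used optimizer_adaptive_features instead):

```sql
-- Disable adaptive plans for the current session only.
alter session set optimizer_adaptive_plans = false;

-- ... re-run the test query and dbms_xplan.display_cursor here ...

-- Restore the default afterwards.
alter session set optimizer_adaptive_plans = true;
```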

Test it yourself

Here’s a script to create the test data:

rem
rem     Script:         bloom_bug_02.sql
rem     Author:         Michal Telensky / Jonathan Lewis
rem     Dated:          Feb 2021
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem
rem     See also:
rem     https://community.oracle.com/tech/developers/discussion/4480469/reproducible-testcase-for-wrong-results
rem     https://livesql.oracle.com/apex/livesql/s/jzc2uyw6ecf2z2ul35nyrxelv
rem

drop table test_dwf_sapfi;
drop table test_sapfi_coicar_at5dat11;
purge recyclebin;

--
-- Don't do this unless it's a private system
-- Many sites seem to have the defaults anyway
--

execute dbms_stats.delete_system_stats

create table test_sapfi_coicar_at5dat11(
        datuct date,
        datumzprac number(8,0)
 ) 
row store compress advanced 
partition by list (datumzprac) (
        partition p20000101 values (20000101)
)
;

alter table test_sapfi_coicar_at5dat11 add partition p20200414 values (20200414);

insert /*+ append */ into test_sapfi_coicar_at5dat11
select date'2019-11-20' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2019-12-20' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2019-12-29' datuct, 20200414 datumzprac from dual connect by level <   4 union all
select date'2020-01-01' datuct, 20200414 datumzprac from dual connect by level <  55 union all
select date'2020-01-08' datuct, 20200414 datumzprac from dual connect by level <   3 union all
select date'2020-01-13' datuct, 20200414 datumzprac from dual connect by level <   8 union all
select date'2020-01-14' datuct, 20200414 datumzprac from dual connect by level <  117 union all
select date'2020-01-15' datuct, 20200414 datumzprac from dual connect by level <  65 union all
select date'2020-01-30' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2020-01-31' datuct, 20200414 datumzprac from dual connect by level <  12 union all
select date'2020-02-01' datuct, 20200414 datumzprac from dual connect by level <  20 union all
select date'2020-02-05' datuct, 20200414 datumzprac from dual connect by level <   4 union all
select date'2020-02-10' datuct, 20200414 datumzprac from dual connect by level <   5 union all
select date'2020-02-12' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2020-02-17' datuct, 20200414 datumzprac from dual connect by level <   2 union all
select date'2020-02-21' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-02-29' datuct, 20200414 datumzprac from dual connect by level <   37 union all
select date'2020-03-01' datuct, 20200414 datumzprac from dual connect by level < 1851 union all
select date'2020-03-02' datuct, 20200414 datumzprac from dual connect by level <  227 union all
select date'2020-03-03' datuct, 20200414 datumzprac from dual connect by level <   75 union all
select date'2020-03-04' datuct, 20200414 datumzprac from dual connect by level <   19 union all
select date'2020-03-05' datuct, 20200414 datumzprac from dual connect by level <  107 union all
select date'2020-03-06' datuct, 20200414 datumzprac from dual connect by level <  163 union all
select date'2020-03-07' datuct, 20200414 datumzprac from dual connect by level <   72 union all
select date'2020-03-08' datuct, 20200414 datumzprac from dual connect by level <   78 union all
select date'2020-03-09' datuct, 20200414 datumzprac from dual connect by level <  187 union all
select date'2020-03-10' datuct, 20200414 datumzprac from dual connect by level <  124 union all
select date'2020-03-11' datuct, 20200414 datumzprac from dual connect by level <   92 union all
select date'2020-03-12' datuct, 20200414 datumzprac from dual connect by level <  137 union all
select date'2020-03-13' datuct, 20200414 datumzprac from dual connect by level <  397 union all
select date'2020-03-14' datuct, 20200414 datumzprac from dual connect by level <   52 union all
select date'2020-03-15' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-03-16' datuct, 20200414 datumzprac from dual connect by level <  622 union all
select date'2020-03-17' datuct, 20200414 datumzprac from dual connect by level <  215 union all
select date'2020-03-18' datuct, 20200414 datumzprac from dual connect by level <  299 union all
select date'2020-03-19' datuct, 20200414 datumzprac from dual connect by level <  265 union all
select date'2020-03-20' datuct, 20200414 datumzprac from dual connect by level <  627 union all
select date'2020-03-21' datuct, 20200414 datumzprac from dual connect by level <   52 union all
select date'2020-03-22' datuct, 20200414 datumzprac from dual connect by level <   60 union all
select date'2020-03-23' datuct, 20200414 datumzprac from dual connect by level <  168 union all
select date'2020-03-24' datuct, 20200414 datumzprac from dual connect by level <  255 union all
select date'2020-03-25' datuct, 20200414 datumzprac from dual connect by level <  185 union all
select date'2020-03-26' datuct, 20200414 datumzprac from dual connect by level <  240 union all
select date'2020-03-27' datuct, 20200414 datumzprac from dual connect by level <  663 union all
select date'2020-03-28' datuct, 20200414 datumzprac from dual connect by level <   88 union all
select date'2020-03-29' datuct, 20200414 datumzprac from dual connect by level <  771 union all
select date'2020-03-30' datuct, 20200414 datumzprac from dual connect by level <  328 union all
select date'2020-03-31' datuct, 20200414 datumzprac from dual connect by level < 1675 union all
select date'2020-04-01' datuct, 20200414 datumzprac from dual connect by level <  641 union all
select date'2020-04-02' datuct, 20200414 datumzprac from dual connect by level <  251 union all
select date'2020-04-03' datuct, 20200414 datumzprac from dual connect by level <   84 union all
select date'2020-04-06' datuct, 20200414 datumzprac from dual connect by level <  325 union all
select date'2020-04-07' datuct, 20200414 datumzprac from dual connect by level <  366 union all
select date'2020-04-08' datuct, 20200414 datumzprac from dual connect by level <  459 union all
select date'2020-04-09' datuct, 20200414 datumzprac from dual connect by level < 2470 union all
select date'2020-04-10' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-04-11' datuct, 20200414 datumzprac from dual connect by level <   16 union all
select date'2020-04-12' datuct, 20200414 datumzprac from dual connect by level <   24 union all
select date'2020-04-13' datuct, 20200414 datumzprac from dual connect by level <  130 union all
select date'2020-04-14' datuct, 20200414 datumzprac from dual connect by level <    9  -- > change this value and the final (wrong) result changes in synch
/

commit
/

--
-- There are no indexes, so this method_opt collects fewer stats than expected
-- No column stats on the partition(s), only partition row and block stats
-- It does get basic column stats at the table level.
--

declare
        schema_name varchar2(128);
begin
        select sys_context('userenv', 'current_schema') into schema_name from dual;

        dbms_stats.gather_table_stats(
                ownname          => schema_name,
                tabname          => 'test_sapfi_coicar_at5dat11',
                partname         => 'p20200414',
                estimate_percent => dbms_stats.auto_sample_size,
                method_opt       => 'for all indexed columns size auto'
        );
end;
/

create table test_dwf_sapfi (
        datumucetnipom_code number(8,0) not null enable
) 
row store compress advanced 
partition by list (datumucetnipom_code) (
        partition p20000101 values (20000101) 
)
/

begin
        for i in (
                select  distinct to_char(datuct, 'yyyymmdd') datumucetnipom_code 
                from    test_sapfi_coicar_at5dat11 
                order by 
                        1
        ) loop
                execute immediate 
                        'alter table test_dwf_sapfi add partition p' || 
                                i.datumucetnipom_code || 
                                ' values (' || i.datumucetnipom_code || ')'
                ;
        end loop;
end;
/


insert  /*+ append */ into test_dwf_sapfi 
select  to_number(to_char(datuct, 'yyyymmdd')) 
from    test_sapfi_coicar_at5dat11 
where   datumzprac = 20200414
;

commit;

--
--  The problems (seem to) go away if you collect stats
--

-- execute dbms_stats.gather_table_stats(user,'test_dwf_sapfi',granularity=>'global')


set serveroutput off
set linesize 255
set pagesize 60
set trimspool on

alter session set statistics_level='all';

prompt  ===================================
prompt  plan with incorrect use_hash() hint
prompt  ===================================

select 
        /*+ use_hash(dwf) */ 
        count(*) count_hash 
from 
        test_dwf_sapfi  dwf
where
         exists (
                select  1 
                from    test_sapfi_coicar_at5dat11      coi
                where   coi.datumzprac = 20200414
                and     to_char(coi.datuct,'yyyymmdd') = dwf.datumucetnipom_code
        );

select * from table(dbms_xplan.display_cursor(format=>'cost outline allstats last partition hint_report adaptive'));

set serveroutput on
spool off

Update (Sept 2021)

The wrong results still appear in 21.3.0.0

January 26, 2021

Index Hints

Filed under: CBO,dbms_xplan,Execution plans,Hints,Ignoring Hints,Indexing,Oracle — Jonathan Lewis @ 4:28 pm GMT Jan 26,2021

At the end of the previous post on index hints I mentioned that I had been prompted to complete a draft from a few years back because I’d been sent an email by Kaley Crum showing the optimizer ignoring an index_rs_asc() hint in a very simple query. Here, with some cosmetic changes, is the example he sent me.

rem
rem     Script: index_rs_kaley.sql
rem     Dated:  Dec 2020
rem     Author: Kaley Crum
rem
rem     Last tested
rem             19.3.0.0
rem

create table range_scan_me(
        one,
        letter 
)
compress
nologging
as
with rowgen_cte as (
        select  null
        from    dual
        connect by level <=  11315
)
select
        1 one,
        case 
                when rownum <=  64e5     then 'A'
                when rownum  =  64e5 + 1 then 'B'
                when rownum <= 128e5     then 'C' 
        end     letter
from 
        rowgen_cte a
cross join 
        rowgen_cte b 
where 
        rownum <= 128e5
;

create index one_letter_idx on range_scan_me(one, letter) nologging;

The table has 12.8 million rows. Of the two columns the first always holds the value 1, the second has one row holding the value ‘B’, and 6.4M rows each holding ‘A’ and ‘C’. On my laptop it took about 20 seconds to create the table and 26 seconds to create the index; using a total of roughly 376 MB (29,000 blocks for the index, 18,500 blocks for the (compressed) table).

Since this is running on 19.3 Oracle will have created basic statistics on the table and index as it created them. Significantly, though, the statistics created during data loading do not include histograms, so the optimizer will not know that ‘B’ is a special case; all it knows is that there are three possible values for letter.
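
The column statistics can be checked directly. A minimal sketch, assuming the table is in the current schema and the version is 12c or later (where the notes column is available):

```sql
-- Expect histogram = 'NONE' for both columns if the stats
-- came only from the load.
select
        column_name,
        num_distinct,
        num_nulls,
        histogram,
        notes
from
        user_tab_col_statistics
where
        table_name = 'RANGE_SCAN_ME'
order by
        column_name
;

-- Without a histogram the estimate for letter = 'B' is an even
-- share of the rows: 12.8M rows / 3 distinct values, about 4.27M.
select round(128e5 / 3) estimated_rows from dual;
```

That even-share figure is in line with the E-Rows of 4266K reported in the plan that follows.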

Time now to query the data:

set serveroutput off
alter session set statistics_level=all;

select 
        /*+ index_rs_asc(t1 (one, letter)) */ 
        letter, one
from 
        range_scan_me t1
where   one >= 1
and     letter = 'B'
/

select * from table(dbms_xplan.display_cursor(format=>'hint_report allstats last'));

I’ve told the optimizer to use an index range scan, using the “description” method to specify the index I want it to use. The hint is definitely valid, and the index can definitely be used in this way to get the correct result. But here’s the execution plan:

------------------------------------------------------------------------------------------------------
| Id  | Operation        | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                |      1 |        |      1 |00:00:00.01 |       8 |      4 |
|*  1 |  INDEX SKIP SCAN | ONE_LETTER_IDX |      1 |   4266K|      1 |00:00:00.01 |       8 |      4 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ONE">=1 AND "LETTER"='B' AND "ONE" IS NOT NULL)
       filter("LETTER"='B')

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
         U -  index_rs_asc(t1 (one, letter))

The plan gives us two surprises: first it ignores (and reports that it is ignoring) a perfectly valid hint. Secondly it claims to be using an index skip scan even though the common understanding of a skip scan is that it will be used when “the first column of the index doesn’t appear in the where clause”.

We can infer that the plan is truthful because it has taken only 8 buffer visits to get the result – that’s probably a probe down to the (1,’B’) index entry, then another probe to see if the last index leaf block has any entries in it where column one is greater than 1.

But there are a couple of little oddities about this “ignoring the index” line. First, if we hadn’t hinted the query at all it would have done a tablescan, so the “index” bit of the hint is being obeyed even if the “rs” bit isn’t. Then there’s this:

select 
        /*+ index_rs_desc(t1 (one, letter)) */ 
        letter, one
from 
        range_scan_me t1
where   one >= 1
and     letter = 'B'
/

-------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                |      1 |        |      1 |00:00:00.01 |       8 |
|*  1 |  INDEX SKIP SCAN DESCENDING| ONE_LETTER_IDX |      1 |   4266K|      1 |00:00:00.01 |       8 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ONE">=1 AND "LETTER"='B' AND "ONE" IS NOT NULL)
       filter("LETTER"='B')

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
         U -  index_rs_desc(t1 (one, letter))

If we change the index_rs_asc() to index_rs_desc(), the optimizer still ignores the “range scan” bit of the hint, but honours the “descending” bit – we get an index skip scan descending.

Of course this example is a very extreme case – nevertheless it is a valid example of the optimizer behaving in a way that doesn’t seem very user-friendly. If we add ‘outline’ to the format options for the call to dbms_xplan.display_cursor() we’ll find that the index_ss_asc() and index_ss_desc() hints have been substituted for our attempted index_rs_asc() and index_rs_desc().

So, if we really are confident that an index range scan would work a lot better than an index skip scan what could we do? We could try telling it to use an index (possibly even an index range scan ascending), but not to do an index skip scan. Let’s test that and include the Outline Information in the execution plan:

select 
        /*+ index(t1) no_index_ss(t1) */
        letter, one
from 
        range_scan_me t1
where   one >= 1
and     letter = 'B'
;


select * from table(dbms_xplan.display_cursor(format=>'hint_report allstats last outline'));


---------------------------------------------------------------------------------------------
| Id  | Operation        | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                |      1 |        |      1 |00:00:00.78 |   14290 |
|*  1 |  INDEX RANGE SCAN| ONE_LETTER_IDX |      1 |   4266K|      1 |00:00:00.78 |   14290 |
---------------------------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX(@"SEL$1" "T1"@"SEL$1" ("RANGE_SCAN_ME"."ONE" "RANGE_SCAN_ME"."LETTER"))
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ONE">=1 AND "LETTER"='B' AND "ONE" IS NOT NULL)
       filter("LETTER"='B')

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index(t1)
           -  no_index_ss(t1)

It worked – we can see the index range scan, and we can see in the Buffers column of the plan why an index range scan was a bad idea – it’s taken 14,290 buffer visits to get the right result. If you check the index size I mentioned further up the page, and think about how I defined the data, you’ll realise that Oracle has started an index range scan at the leaf block holding (1,’B’) – which is half way along the index – and then walked every leaf block from there to the end of the index in an attempt to find any index entries with column one greater than 1.

The other thing to notice here is that the hint in the Outline Information is given as:

INDEX(@"SEL$1" "T1"@"SEL$1" ("RANGE_SCAN_ME"."ONE" "RANGE_SCAN_ME"."LETTER"))

This was the hint that appeared in the outline whether I used the index() hint or the index_rs_asc() hint in the query. Similarly, when I tried index_desc() or index_rs_desc() as the hint the outline reported index_desc() in both cases.

If I try adding just this hint to the query the plan goes back to a skip scan. It’s another case where the hints in the Outline Information (hence, possibly, an SQL Plan Baseline) don’t reproduce the plan that the outline claims to be describing.
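The text of that last test isn’t shown above. Presumably it looked something like the following sketch, which simply pastes the hint from the Outline Information into the original query; the exact layout is an assumption.

```sql
-- Sketch: the original query with only the outline's INDEX() hint,
-- copied verbatim from the Outline Data section of the plan.
select
        /*+ INDEX(@"SEL$1" "T1"@"SEL$1" ("RANGE_SCAN_ME"."ONE" "RANGE_SCAN_ME"."LETTER")) */
        letter, one
from
        range_scan_me t1
where   one >= 1
and     letter = 'B'
;

select * from table(dbms_xplan.display_cursor(format=>'hint_report allstats last outline'));
```

Comparing the Operation line and the Outline Data of the resulting plan with the earlier output is the simplest way to see whether the outline hint reproduces the range scan.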

Summary

Does Oracle ignore hints?

It looks as if the answer is still no, except it seems to think that a skip scan is just a special case of a range scan (and, from the previous article, a range scan is just a special case of a skip scan). So if you want to ensure that Oracle uses your preferred index strategy you may have to think about including various “no_index” hints to block the indexes you don’t want Oracle to use, and then no_index_ss() and no_index_ffs() to make sure it doesn’t use the wrong method for the index you do want to use. Even then you may find you don’t have quite enough options to block every index option that you’d like to block.

January 25, 2021

Index Hints

Filed under: CBO,dbms_xplan,Hints,Ignoring Hints,Index skip scan,Indexing,Oracle — Jonathan Lewis @ 4:59 pm GMT Jan 25,2021

I’ve lost count of the number of times I’ve reminded people that hinting (correctly) is hard. Even the humble /*+ index() */ hint and its close relatives are open to misunderstanding and accidental misuse, leading to complaints that “Oracle is ignoring my hint”.

Strange though it may seem, I’m still not 100% certain of what some of the basic index hints are supposed to do, and even the “hint report” in the most recent versions of dbms_xplan.display_xxx() hasn’t told me everything I’d like to know. So if you think you know all about hints and indexing this blog note is for you.

I’ll start with a brief, and approximate, timeline for the basic index hints – starting from 8.0

Version   Hint
-------   ----------------------------------------------
8.0       index
8.1       index_asc, index_desc, index_ffs, no_index
9.?       index_ss, index_ss_asc, index_ss_desc
10.1      no_index_ffs, no_index_ss
11.1      index_rs_asc, index_rs_desc

Saving these for later:
          index_combine(8.0), index_join(9.0), use_nl_with_index, use_invisible_indexes,
          parallel_index, local_indexes, index_stats, num_index_keys,
          change_dupkey_error_index, ignore_row_on_dupkey_index,
          domain_index_filter, domain_index_no_sort, domain_index_sort,
          xmlindex_rewrite, xmlindex_rewrite_in_select, xmlindex_sel_idx_tbl

For completeness I’ve included the more exotic index-related hints in the list (without a version), and I’ve even highlighted the rarely seen use_nl_with_index() hint to remind myself to raise a rhetorical question about it at the end of this piece.

In this list you’ll notice that the only hint originally available directed the optimizer to access a table by index, but in 8.1 that changed so that we could

  1. tell the optimizer about indexes it should not use
  2. specify whether the index access should use the index in ascending or descending order
  3. use an index fast full scan.

In 9i Oracle then introduced the index skip scan, with the option to specify whether the skip scan should be in ascending or descending order. The index_ss hint seems to be no more than a synonym for the index_ss_asc hint (or should that be the other way round?); as far as I can tell the index_ss() hint will not produce a descending skip scan.

You’ll note that there’s no hint to block an index skip scan, until the hint no_index_ss() appears in 10g along with the no_index_ffs() hint to block the index fast full scan. Since 10g Oracle has got better at introducing both the “positive” and “negative” versions of a hint whenever it introduces any hints for new optimizer mechanisms.

Finally we get to 11g and if you search MOS you may still be able to find the bug note (4323868.8) that introduced the index_rs_asc() and index_rs_desc() hints for index range scan ascending and descending.

From MOS Doc 4323868.8: “This fix adds new hints to enforce that an index is selected only if a start/stop keys (predicates) are used: INDEX_RS_ASC INDEX_RS_DESC”

This was necessary because by this time the index() hint allowed the optimizer to decide for itself how to use an index and it was quite difficult to force it to use the strategy you really wanted.

It’s still a source of puzzlement to me that an explicit index() hint will sometimes be turned into an index_rs_asc() when you check the Outline Information (from a call to dbms_xplan.display_xxx()) that the optimizer wants to use to reproduce the plan, while there are other times that an explicit index_rs_asc() hint will be turned into a basic index() hint (which might not reproduce the original plan)!

The Warm-up

Here’s a little surprise that could only reveal itself in the 19c hint report – unless you were willing to read your way carefully through a 10053 (CBO) trace file in earlier versions of Oracle. It comes from a little investigation of the index_ffs() hint that I’ve kept repeating over the last 20 years.

rem
rem     Script:         c_indffs.sql
rem     Dated:          March 2001
rem     Author:         Jonathan Lewis
rem

create table t1
nologging
as
select 
        rownum                  id,
        rpad(mod(rownum,50),10) small_vc,
        rpad('x',50)            padding
from
        all_objects
where
        rownum <= 3000
;

alter table t1 modify id not null;

create index t_i1 on t1(id);
create index t_i2 on t1(small_vc,id);

set autotrace traceonly explain

select 
        count(small_vc)
from    t1
where
        id > 2750
;

select 
        /*+ index(t1) */
        count(small_vc)
from    t1
where
        id > 2750
;

select 
        /*+ index_ffs(t1) */
        count(small_vc)
from    t1
where
        id > 2750
;

select 
        /*+ index_ffs(t1) no_index(t1) */
        count(small_vc)
from    t1
where
        id > 2750
;

set autotrace off

I’ve created a table with two indexes, and then enabled autotrace to get the execution plans for 4 queries that vary only in their hinting. Here’s the plan (on 19.3, with my settings for system stats) for the first query:

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    15 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |    15 |            |          |
|*  2 |   INDEX FAST FULL SCAN| T_I2 |   250 |  3750 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">2750)

It’s an index fast full scan on the t_i2 (two-column) index. If I add an index() hint to this query, will that allow Oracle to continue using the index fast full scan, or will it force Oracle into some other path. Here’s the plan for the query hinted with index(t1):

---------------------------------------------------------------------------------------------
| Id  | Operation                            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |     1 |    15 |     5   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |      |     1 |    15 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1   |   250 |  3750 |     5   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T_I1 |   250 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("ID">2750)

The optimizer has chosen an index range scan on the (single-column) t_i1 index. Since this path costs more than the index fast full scan it would appear that the index() hint does not allow the optimizer to consider an index fast full scan. So we might decide that an index_ffs() hint is appropriate to secure the plan we want – and here’s the plan we get with that hint:

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    15 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |    15 |            |          |
|*  2 |   INDEX FAST FULL SCAN| T_I2 |   250 |  3750 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">2750)

As expected we get the index fast full scan we wanted. But we might want to add belts and braces – let’s include a no_index() hint to make sure that the optimizer doesn’t consider any other strategy for using an index. Since we’ve seen that the index() hint isn’t associated with the index fast full scan path it seems reasonable to assume that the no_index() is also not associated with the index fast full scan path. Here’s the plan we get from the final variant of my query with index_ffs(t1) no_index(t1):

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    15 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |      |     1 |    15 |            |          |
|*  2 |   INDEX FAST FULL SCAN| T_I2 |   250 |  3750 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">2750)

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (2))
---------------------------------------------------------------------------
   2 -  SEL$1 / T1@SEL$1
         U -  index_ffs(t1) / hint conflicts with another in sibling query block
         U -  no_index(t1) / hint conflicts with another in sibling query block

The query has produced the execution plan we wanted – but only by accident. The hint report (which, by default, is the version that reports only the erroneous or unused hints) tells us that both hints have been ignored because they each conflict with some other hint in a “sibling” query block. In this case they’re conflicting with each other.

So the plan we get was our original unhinted plan – which made it look as if we’d done exactly the right thing to ensure that we’d made the plan completely reproducible. Such (previously invisible) errors can easily lead to complaints about the optimizer ignoring hints.
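One experiment that follows naturally from this is to name the index in each hint and ask for the hint report explicitly. The following is a sketch of a test to run, not a reported result: it assumes that pointing the two hints at different indexes is enough to stop them conflicting, and the hint report is the place to check whether that assumption holds on your version.

```sql
-- Sketch: target each hint at a specific index, then read the
-- hint report to see whether either hint is still flagged as unused.
explain plan for
select
        /*+ index_ffs(t1 t_i2) no_index(t1 t_i1) */
        count(small_vc)
from    t1
where
        id > 2750
;

select * from table(dbms_xplan.display(format=>'hint_report'));
```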

The Main Event

The previous section was about an annoying little inconsistency in the way in which the “negative” version of a hint may not correspond exactly to the “positive” version. There’s a more worrying issue to address when you try to be more precise in your use of basic index hints.

We’ve seen that an index() hint could mean almost anything other than an index fast full scan, while a no_index() hint (probably) blocks all possible uses of an index, but would you expect an index_rs_asc() hint to produce a skip scan, or an index_ss_asc() hint to produce a range scan? Here’s another old script of mine to create some data and test some hints:

rem
rem     Script:         skip_scan_anomaly.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2009
rem

create table t1
as
with generator as (
        select  --+ materialize
                rownum  id
        from    all_objects 
        where   rownum <= 3000  -- > hint to avoid wordpress format issue
)
select
        mod(rownum,300)                                 addr_id300,
        mod(rownum,200)                                 addr_id200,
        mod(rownum,100)                                 addr_id100,
        mod(rownum,50)                                  addr_id050,
        trunc(sysdate) + trunc(mod(rownum,2501)/3)      effective_date,
        lpad(rownum,10,'0')                             small_vc,
        rpad('x',050)                                   padding
--      rpad('x',100)                                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 250000   -- > hint to avoid wordpress format issue
;

create index t1_i1 on t1(effective_date);
create index t1_i300 on t1(addr_id300, effective_date);
create index t1_i200 on t1(addr_id200, effective_date);
create index t1_i100 on t1(addr_id100, effective_date);
create index t1_i050 on t1(addr_id050, effective_date);

I’ve created a table with rather more indexes than I’ll be using. The significant indexes are t1_i1(effective_date), and t1_i050(addr_id050, effective_date). The former will be available for range scans, the latter for skip scans, when I test queries with predicates only on effective_date.

Choice of execution path can be affected by the system stats, so I need to point out that I’ve set mine with the following code:

begin
        dbms_stats.set_system_stats('MBRC',16);
        dbms_stats.set_system_stats('MREADTIM',10);
        dbms_stats.set_system_stats('SREADTIM',5);
        dbms_stats.set_system_stats('CPUSPEED',500);
exception
        when others then null;
end;
/
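Because the block above swallows any exception, it’s worth checking that the values really were set. A sketch, assuming your account has the privilege to query sys.aux_stats$ directly:

```sql
-- Sketch: list the current system statistics. Reading sys.aux_stats$
-- needs suitable privileges; the rows of interest here are
-- CPUSPEED, MBRC, MREADTIM and SREADTIM.
select
        pname, pval1
from
        sys.aux_stats$
where
        sname = 'SYSSTATS_MAIN'
order by
        pname
;
```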

And I’ll start with a couple of “baseline” queries and execution plans:

explain plan for
select 
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

select * from table(dbms_xplan.display(format=>'hint_report'));

alter index t1_i1 invisible;

explain plan for
select 
        /*+ index(t1) */
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

You’ll notice at line 11 I’ve made the t1_i1 index invisible, and it will stay that way for a couple more tests. Here are the first two execution plans:

Unhinted
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1500 | 28500 |   428   (9)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |  1500 | 28500 |   428   (9)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00',
              'syyyy-mm-dd hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Hinted with index(t1)
-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |  1500 | 28500 |  1558   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1      |  1500 | 28500 |  1558   (1)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | T1_I050 |  1500 |       |    52   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
       filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index(t1)

Unhinted I’ve managed to rig the data and system stats so that the first path is a full tablescan; then, when I add the generic index(t1) hint Oracle recognises and uses the hint in the best possible way, picking the lowest cost index skip scan.
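The script fragments don’t show how the two substitution variables were set. Judging by the dates in the Predicate Information they were presumably defined along the following lines; the exact values, and an English date language for the session, are assumptions. The final query is a sketch of the usual arithmetic for a range predicate with one closed end and no histogram, applied to 250,000 rows spread over 834 consecutive dates.

```sql
-- Sketch: SQL*Plus substitution variables matching the dates that
-- appear in the predicate sections of the plans (assumed values).
define m_start_date = '22-feb-2021'
define m_end_date   = '26-feb-2021'

-- Cardinality check: rows * (requested range / total range + 1/num_distinct)
-- where the requested range is 4 days, the total range is 833 days
-- and there are 834 distinct dates.
select round(250000 * (4/833 + 1/834)) estimated_rows from dual;
```

The result of that calculation is 1500, which agrees with the Rows column in both plans.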

A variation I won’t show here – if I change the hint to index_rs_asc(t1) the optimizer recognizes there is no (currently visible) index that could be used for an index range scan and does a full tablescan, reporting the hint as unused. It won’t try to substitute a skip scan for a range scan.

What happens if I now try the index_ss(t1) hint without specifying an index. Firstly with the t1_i1 index still invisible, then after making t1_i1 visible again:

explain plan for
select 
        /*+ index_ss(t1) */
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

select * from table(dbms_xplan.display(format=>'hint_report'));
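The step that makes the index visible again between the two runs isn’t shown; a minimal sketch of it, with a check against the data dictionary, would be:

```sql
-- Sketch: restore the single-column index before repeating the test,
-- then confirm the visibility of every index on the table.
alter index t1_i1 visible;

select
        index_name, visibility
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;
```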

Here are the two execution plans, first when t1_i1(effective_date) is still invisible:

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |  1500 | 28500 |  1558   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1      |  1500 | 28500 |  1558   (1)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | T1_I050 |  1500 |       |    52   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
       filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index_ss(t1)

As you might expect the optimizer has picked the t1_i050 index for a skip scan. (There are 3 other candidates for the skip scan, but since they have more distinct values for their leading column they all turn out to have a higher cost than t1_i050).

So let’s make the t1_i1 index visible and see what the plan looks like:

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |  1500 | 28500 |   521   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |  1500 | 28500 |   521   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |  1500 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
         U -  index_ss(t1)

The optimizer picks an index range scan using the t1_i1 index, and reports the hint as unused! For years I told myself that an index skip scan was derived as a small collection of range scans, so an index range scan was technically a “degenerate” skip scan i.e. one where the “small collection” consisted of exactly one element. Oracle 19c finally told me I was wrong – the optimizer is ignoring the hint.

The fact that it’s a sloppy hint and you could have been more precise is irrelevant – if the optimizer won’t do a skip scan when you specify a range scan (but watch out for the next “index hints” instalment – see footnote) it shouldn’t do a range scan when you specify a skip scan (but that’s just a personal opinion).

We should check, of course, that a precisely targeted skip scan hint works before complaining too loudly – would index_ss(t1 t1_i050), or index_ss(t1 t1_i300) work when there’s a competing index that could produce a lower cost range scan? The answer is yes.

explain plan for
select 
        /*+ index_ss(t1 t1_i050) */
        small_vc
from    t1
where   effective_date >  to_date('&m_start_date','dd-mon-yyyy')
and     effective_date <= to_date('&m_end_date'  ,'dd-mon-yyyy')
;

select * from table(dbms_xplan.display(format=>'hint_report'));

-----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |  1500 | 28500 |  1558   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1      |  1500 | 28500 |  1558   (1)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | T1_I050 |  1500 |       |    52   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
       filter("EFFECTIVE_DATE"<=TO_DATE(' 2021-02-26 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "EFFECTIVE_DATE">TO_DATE(' 2021-02-22 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1
---------------------------------------------------------------------------
   1 -  SEL$1 / T1@SEL$1
           -  index_ss(t1 t1_i050)

If you specify a suitable index in the index_ss() hint then the optimizer will use it and won’t switch to the index range scan. You can, of course, specify the index by description rather than name, so the hint /*+ index_ss(t1 (addr_id050, effective_date)) */ or even a partial description like /*+ index_ss(t1 (addr_id050)) */ would have been equally valid and obeyed.

How much do you know?

I’ll finish off with a rhetorical question, which I’ll introduce with this description taken from the 19c SQL Tuning Guide section 9.2.1.6:

The related hint USE_NL_WITH_INDEX(table index) hint instructs the optimizer to join the specified table to another row source with a nested loops join using the specified table as the inner table. The index is optional. If no index is specified, then the nested loops join uses an index with at least one join predicate as the index key.

An intuitive response to this hint would be to assume that most people expect nested loops to use index unique scans or range scans into the second table. So what would your initial expectation be about the validity of use_nl_with_index() if the only way the index could be used was with an index skip scan, or a full scan, or a fast full scan? What if there were two join predicates and there’s a path which could do a nested loop if it used two indexes to do an index join (index_join()) or an index bitmap conversion (index_combine())? Come to that, how confident are you that the hint will work if the index specified is a bitmap index?

Summary

It’s important to be as accurate and thorough as possible when using hints. Even when a hint is documented you may find that you can ask “what if” questions about the hint and find that the only way to get answers to your questions is to do several experiments.

If you’re going to put hints into production code, take at least a little time to say to yourself:

“I know what I want and expect this hint to do; are there any similar actions that it might also be allowed to trigger, and how could I check if I need to allow for them or block them?”
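One way to make that check, assuming you are on 19c or later where the hint report is available: run the hinted statement, then pull its plan from memory with the hint report requested explicitly. The format string here is just one reasonable choice, not the only one.

-- run the hinted statement first, then, in the same session:
select  *
from    table(dbms_xplan.display_cursor(format => 'basic +hint_report'))
;

The report lists each hint against its query block and object alias, and flags any that were unused or invalid. The plan body is still what tells you which access path or join method was actually chosen.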

Footnote: This journey of rediscovery was prompted by an email from Kaley Crum who supplied me with an example of Oracle using an index skip scan when it had been hinted to do an index range scan.

January 20, 2021

CBO Example

Filed under: CBO,Execution plans,Oracle,Statistics — Jonathan Lewis @ 10:01 am GMT Jan 20,2021

A little case study based on an example just in on the Oracle-L list server. This was supplied with a complete, working, test case that was small enough to understand and explain very quickly.

The user created a table, and used calls to dbms_stats to fake some statistics into place. Here, with a little cosmetic editing, is the code they supplied.

set serveroutput off
set linesize 180
set pagesize 60
set trimspool on

drop table t1 purge;

create table t1 (id number(20), v varchar2(20 char));
create unique index pk_id on t1(id);
alter table t1 add (constraint pk_id primary key (id) using index pk_id enable validate);
exec dbms_stats.gather_table_stats(user, 't1');
 
declare
        srec               dbms_stats.statrec;
        numvals            dbms_stats.numarray;
        charvals           dbms_stats.chararray;
begin
  
        dbms_stats.set_table_stats(
                ownname => user, tabname => 't1', numrows => 45262481, numblks => 1938304, avgrlen => 206
        );

        numvals := dbms_stats.numarray (1, 45262481);
        srec.epc:=2;
        dbms_stats.prepare_column_values (srec, numvals);
        dbms_stats.set_column_stats (
                ownname => user, tabname => 't1', colname => 'id', 
                distcnt => 45262481, density => 1/45262481,
                nullcnt => 0, srec => srec, avgclen => 6
        );

        charvals := dbms_stats.chararray ('', '');
        srec.epc:=2;
        dbms_stats.prepare_column_values (srec, charvals);
        dbms_stats.set_column_stats(
                ownname => user, tabname => 't1', colname => 'v', 
                distcnt => 0,  density => 0, 
                nullcnt => 45262481, srec => srec, avgclen => 0
        );
        dbms_stats.set_index_stats( 
                ownname => user, indname =>'pk_id', numrows => 45607914, numlblks => 101513,
                numdist => 45607914, avglblk => 1, avgdblk => 1, clstfct => 33678879, indlevel => 2
        );
end;
/
 
variable n1 nvarchar2(32)
variable n2 number

begin
        :n1 := 'D';
        :n2 := 50;
end;
/
 

select 
        /*+ gather_plan_statistics */ 
        * 
from    ( 
        select  a.id col0,a.id col1
        from    t1 a
        where   a.v = :n1 
        and     a.id > 1
        order by 
                a.id 
        ) 
where 
        rownum <= :n2 
;
 
select * from table(dbms_xplan.display_cursor(null,null,'allstats last cost peeked_binds '));

From Oracle’s perspective the table has 45M rows, with a unique sequential key starting at 1 in the id column. The query looks like a pagination query, asking for 50 rows, ordered by id. But the in-line view asks for rows where id > 1 (which, initially, means all of them) and applies a filter on the v column.

Of course we know that v is always null, so in theory the predicate a.v = :n1 is always going to return false (or null, but not true) – so the query will never return any data. However, if you read the code carefully you’ll notice that the bind variable n1 has been declared as an nvarchar2() not a varchar2().

Here’s the execution plan I got on an instance running 19.3 – and it’s very similar to the plan supplied by the OP:

----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |      1 |        |  3747 (100)|      0 |00:00:00.01 |
|*  1 |  COUNT STOPKEY                |       |      1 |        |            |      0 |00:00:00.01 |
|   2 |   VIEW                        |       |      1 |     50 |  3747   (1)|      0 |00:00:00.01 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T1    |      1 |    452K|  3747   (1)|      0 |00:00:00.01 |
|*  4 |     INDEX RANGE SCAN          | PK_ID |      0 |   5000 |    14   (0)|      0 |00:00:00.01 |
----------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------
   2 - :2 (NUMBER): 50

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=:N2)
   3 - filter(SYS_OP_C2C("A"."V")=:N1)
   4 - access("A"."ID">1)

The question we were asked was this: “Why does the optimizer estimate that it will return 5,000 entries from the index range scan at operation 4?”

The answer is the result of combining two observations.

First: In the Predicate Information you can see that Oracle has applied a character-set conversion to the original predicate “a.v = :n1” to produce filter(SYS_OP_C2C("A"."V")=:N1). The selectivity of “function of something = bind value” is one of those cases where Oracle uses one of its guesses, in this case 1%. Note that the E-rows estimate for operation 3 (table access) is 452K, which is 1% of the 45M rows in the table.

In real life if you had optimizer_dynamic_sampling set at level 3, or had added the hint /*+ dynamic_sampling(3) */ to the query, Oracle would sample some rows to avoid the need for guessing at this point.
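As a sketch, the hinted version of the test query would look like this. Only the hint has changed from the original statement, and I’m making no claim here about the estimates you’d see, since the model’s statistics are faked on an empty table.

select
        /*+ gather_plan_statistics dynamic_sampling(3) */
        *
from    (
        select  a.id col0,a.id col1
        from    t1 a
        where   a.v = :n1
        and     a.id > 1
        order by
                a.id
        )
where
        rownum <= :n2
;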

Secondly: the optimizer has peeked the bind variable for the rownum predicate, so it is optimizing for 50 rows (basically doing the arithmetic of first_rows(50) optimisation). The optimizer “knows” that the filter predicate at the table will eliminate all but 1% of the rows acquired, and it “knows” that it has to do enough work to find 50 rows in total – so it can calculate that (statistically speaking) it has to walk through 5,000 (= 50 * 100) index entries to visit enough rows in the table to end up with 50 rows.
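As a rough check of the arithmetic in the two observations, you can run the sums from dual. The 1% is the guess described above, and the column aliases are mine.

select
        round(45262481 * 0.01)  est_table_rows,         -- 1% of numrows: 452625, reported as 452K
        50 / 0.01               est_index_entries       -- rows wanted / selectivity: 5000
from
        dual
;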

Next Steps (left as exercise)

Once you’ve got the answer to the question “Why is this number 5,000?”, you might go back and point out that the estimate for the table access (452K) was roughly 90 times larger than the estimate for the number of rowids selected from the index (5,000) and wonder how that could be possible. (Answer: that’s just one of the little defects in the code for first_rows(n).)

You might also wonder what would have happened in this model if the bind variable n1 had been declared as a varchar2() rather than an nvarchar2() – and that might have taken you on to ask yet another question about what the optimizer was playing at.
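If you want to try that variation, a minimal sketch is to redeclare the bind in SQL*Plus, reset its value, and then re-run the test query and the call to dbms_xplan.display_cursor() exactly as before. The change of type is the only difference from the original script.

variable n1 varchar2(32)

begin
        :n1 := 'D';
end;
/

-- now repeat the original select and the display_cursor() call,
-- then compare the Predicate Information and the E-Rows column with the plan above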

Once you’ve modelled something that is a little puzzle there’s always scope for pushing the model a little further and learning a little bit more before you file the model away for testing on the next version of Oracle.
