Oracle Scratchpad

August 20, 2019

Join View

Filed under: constraints,Infrastructure,Joins,Oracle — Jonathan Lewis @ 12:39 pm BST Aug 20,2019

It’s strange how one thing leads to another when you’re trying to check some silly little detail. This morning I wanted to find a note I’d written about the merge command and “stable sets”, which led me to a draft about updatable join views that I’d started in 2016 in response to a question on OTN (as it was at the time), and that in turn led to a model I’d written in 2008 showing that the manuals were wrong.

Since the manual – even the 19c manual – is still wrong regarding the “Delete Rule” for updatable (modifiable) join views I thought I’d quickly finish off the draft and post the 2008 script. Here’s what the manual says about deleting from join views (my emphasis on “exactly”):

Rows from a join view can be deleted as long as there is exactly one key-preserved table in the join. The key preserved table can be repeated in the FROM clause. If the view is defined with the WITH CHECK OPTION clause and the key preserved table is repeated, then the rows cannot be deleted from the view.

But here’s a simple piece of code to model a delete from a join view that breaks the rule:


rem
rem     Script:         delete_join.sql 
rem     Dated:          Dec 2008
rem     Author:         J P Lewis
rem

create table source
as
select level n1
from dual
connect by level <= 10
/ 
 
create table search
as
select level n1
from dual
connect by level <= 10
/ 

alter table source modify n1 not null;
alter table search modify n1 not null;

create unique index search_idx on search(n1);
-- create unique index source_idx on source(n1)

I’ve set up a “source” and a “search” table with 10 rows each and the option for creating unique indexes on each table for a column that’s declared non-null. Initially, though, I’ve only created the index on search to see what happens when I run a couple of “join view” deletes using “ANSI” syntax.

prompt  ===============================
prompt  Source referenced first in ANSI
prompt  ===============================

delete from (select * from source s join search s1 on s.n1 = s1.n1);
select count(1) source_count from source;
select count(1) search_count from search;
rollback;
 
prompt  ===============================
prompt  Search referenced first in ANSI
prompt  ===============================

delete from (select * from search s join source s1 on s.n1 = s1.n1);
select count(1) source_count from source;
select count(1) search_count from search;
rollback;

With just one of the two unique indexes in place the order of the tables in the inline view makes no difference to the outcome. Thanks to the unique index on search any row in the inline view corresponds to exactly one row in the source table, while a single row in the search table could end up appearing in many rows in the view – so the delete implicitly has to operate as “delete from source”. So both deletes will result in the source_count being zero, and the search_count remaining at 10.

If we now repeat the experiment but create BOTH unique indexes, both source and search will be key-preserved in the join. According to the manual the delete should produce some sort of error. In fact the delete works in both cases – but the order that the tables appear makes a difference. When source is the first table in the in-line view the source_count drops to zero and the search_count stays at 10; when search is the first table in the in-line view the search_count drops to zero and the source_count stays at 10.
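The second pass of the experiment needs only one extra statement on top of the original script: the index that is commented out in the setup. The following is a minimal sketch that reuses the table and index names from the script; the counts shown in the comments are the results reported in the text, not freshly captured output.

```sql
-- Make SOURCE key-preserved as well (SEARCH already has its unique index).
create unique index source_idx on source(n1);

-- SOURCE is the first table in the inline view.
delete from (select * from source s join search s1 on s.n1 = s1.n1);
select count(1) source_count from source;   -- reported in the text: 0
select count(1) search_count from search;   -- reported in the text: 10
rollback;

-- SEARCH is the first table in the inline view.
delete from (select * from search s join source s1 on s.n1 = s1.n1);
select count(1) source_count from source;   -- reported in the text: 10
select count(1) search_count from search;   -- reported in the text: 0
rollback;
```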

I wouldn’t call this totally unreasonable – but it’s something you need to know if you’re going to use the method, and something you need to document very carefully in case someone editing your code at a later date (or deciding that they could add a unique index) doesn’t realise the significance of the table order.

This does lead on to another important test – is it the order that the tables appear in the from clause that matters, or the order they appear in the join order that Oracle uses to optimise the query? (We hope – and expect – that it’s the join order as written, not the join order as optimised, otherwise the effect of the delete could change from day to day as the optimizer chose different execution plans!) To confirm my expectation I switched to traditional Oracle syntax with hints (still with unique indexes on both tables), writing a query with search as the first table in the from clause, but hinting the inline view to vary the optimised join order.


prompt  ============================================
prompt  Source hinted as leading table in join order 
prompt  ============================================

delete from (
        select 
                /*+ leading(s1, s) */
                * 
        from 
                search s,
                source s1 
        where
                s.n1 = s1.n1
        )
;

select count(1) source_count from source; 
select count(1) search_count from search;
rollback;

prompt  ============================================
prompt  Search hinted as leading table in join order 
prompt  ============================================

delete from (
        select 
                /*+ leading(s, s1) */
                * 
        from 
                search s,
                source s1 
        where
                s.n1 = s1.n1
        )
;

select count(1) source_count from source; 
select count(1) search_count from search;
rollback;

In both cases the rows were deleted from search (the first table in from clause). And, to answer the question you should be asking, I did check the execution plans to make sure that the hints had been effective:


============================================
Source hinted as leading table in join order
============================================

------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
------------------------------------------------------------------
|   0 | DELETE STATEMENT    |            |    10 |    60 |     1 |
|   1 |  DELETE             | SEARCH     |       |       |       |
|   2 |   NESTED LOOPS      |            |    10 |    60 |     1 |
|   3 |    INDEX FULL SCAN  | SOURCE_IDX |    10 |    30 |     1 |
|*  4 |    INDEX UNIQUE SCAN| SEARCH_IDX |     1 |     3 |       |
------------------------------------------------------------------

============================================
Search hinted as leading table in join order
============================================

------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
------------------------------------------------------------------
|   0 | DELETE STATEMENT    |            |    10 |    60 |     1 |
|   1 |  DELETE             | SEARCH     |       |       |       |
|   2 |   NESTED LOOPS      |            |    10 |    60 |     1 |
|   3 |    INDEX FULL SCAN  | SEARCH_IDX |    10 |    30 |     1 |
|*  4 |    INDEX UNIQUE SCAN| SOURCE_IDX |     1 |     3 |       |
------------------------------------------------------------------
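One way to produce plans like the two above without actually deleting any rows is explain plan followed by a call to dbms_xplan.display. This is only a sketch of one possible approach: the post doesn’t say how its plans were captured, and the output from dbms_xplan will probably show more columns than the trimmed plans above.

```sql
-- Explain (but do not execute) the delete with SOURCE hinted as the leading table.
explain plan for
delete from (
        select
                /*+ leading(s1, s) */
                *
        from
                search s,
                source s1
        where
                s.n1 = s1.n1
        )
;

-- Report the plan just written to the plan table.
select * from table(dbms_xplan.display);
```

The line to check is the DELETE operation (Id 1): its Name column shows which table will lose rows, independently of the join order beneath it.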

Summary

Using updatable join views to handle deletes can be very efficient but the manual’s statement of the “Delete Rule” is incorrect. It is possible to have several key-preserved tables in the view that you’re using, and if that’s the case you need to play safe and ensure that the table you want to delete from is the first table in the from clause. This means taking steps to eliminate the risk of someone editing some code at a later date without realising the importance of the table order.
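If the dependence on table order feels too fragile, one alternative (not the method tested in this note) is to name the target table explicitly and push the join into an existence subquery. The sketch below assumes the intent is “delete the search rows that have a match in source”; the optimizer may pick a different plan from the join view version, so the efficiency would need to be checked separately.

```sql
-- Assumption: we want to delete from SEARCH, and only rows with a match in SOURCE.
-- The target table is stated explicitly, so adding or dropping a unique index
-- or reordering tables cannot change which table loses rows.
delete from search s
where exists (
        select  null
        from    source s1
        where   s1.n1 = s.n1
        )
;
```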

Update (very shortly after publication)

Iudith Mentzel has pointed out in comment #1 below that the SQL Language Reference Guide and the DBA Administration Guide are not consistent in their descriptions of deleting from a join view, and that the SQL Language Reference Guide correctly states that the delete will be applied to the first mentioned key-preserved table.

 

 

July 10, 2018

Validate FK

Filed under: constraints,Infrastructure,Oracle — Jonathan Lewis @ 10:42 am BST Jul 10,2018

A comment arrived yesterday on an earlier posting about an enhancement to the truncate command in 12c, raising the question of what Oracle might do to validate a foreign key constraint. Despite being certain that I had the answer written down somewhere (maybe on a client site or in a report to a client) I couldn’t find anything I’d published about it, so I ran up a quick demo script to show that all Oracle does is construct a simple SQL statement that will check the (child) data and then execute it with whatever the optimizer thinks is the fastest possible plan.

Here’s the script – with a few variations to show what happens if you start tweaking features to affect the plan.

rem
rem     Script:         validate_fk.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jun 2018
rem
rem     Last tested
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

create table parent 
as 
select  * 
from    all_Objects 
where   rownum <= 10000         -- comment to avoid wordpress format issue
;

alter table parent add constraint par_pk primary key(object_id);

execute dbms_stats.gather_table_stats(null, 'parent', cascade=>true)

create table child 
as 
select par.* 
from    (select rownum from dual connect by level <= 10),       -- comment to avoid wordpress format issue
        parent par
; 

alter table child add constraint chi_fk_par foreign key(object_id) references parent enable novalidate; 
create index chi_fk_par on child(object_id); 
execute dbms_stats.gather_table_stats(null, 'child', cascade=>true)

-- alter table child modify object_id null;
-- alter table child parallel(degree 8);
-- alter session set "_fast_full_scan_enabled" = FALSE;
-- alter session set "_optimizer_outer_to_anti_enabled" = false;

alter system flush buffer_cache;

alter session set events '10046 trace name context forever, level 12';
alter table child modify constraint chi_fk_par validate;
alter session set events '10046 trace name context off';

All I’ve done is create a parent table with a primary key and a child table with 10 rows per parent. I’ve created a foreign key constraint on the child table, enabled it (so that future data will be checked) but not validated it (so there’s no enforced guarantee that the existing data is correct). Then I’ve issued a command to validate the foreign key.
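The state of the constraint before and after the validate step can be confirmed from the data dictionary. A small sketch, using the constraint name from the script above: before validation the constraint should be reported as ENABLED and NOT VALIDATED, and after a successful validation as ENABLED and VALIDATED.

```sql
-- STATUS shows whether new DML is checked; VALIDATED shows whether
-- the existing rows are guaranteed to obey the constraint.
select
        constraint_name, constraint_type, status, validated
from
        user_constraints
where
        table_name      = 'CHILD'
and     constraint_name = 'CHI_FK_PAR'
;
```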

The flush of the buffer cache is to allow me to see the I/O that takes place and will also (usually) let me see if there are any strange issues due to any recursive SQL. As you can see I’ve also got a couple of commented commands that might cause a couple of variations in behaviour.

Here’s the critical content from the output of the trace file summary from tkprof (in versions from 11.2.0.4 to 12.2.0.1):


select /*+ all_rows ordered dynamic_sampling(2) */ A.rowid, :1, :2, :3
from
 "TEST_USER"."CHILD" A , "TEST_USER"."PARENT" B where( "A"."OBJECT_ID" is not
  null) and( "B"."OBJECT_ID" (+)= "A"."OBJECT_ID") and( "B"."OBJECT_ID" is
  null)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.01       0.02        241        373          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.02       0.02        241        373          0           0

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  NESTED LOOPS ANTI (cr=373 pr=241 pw=0 time=21779 us starts=1 cost=70 size=22000 card=1000)
    100000     100000     100000   INDEX FAST FULL SCAN CHI_FK_PAR (cr=224 pr=219 pw=0 time=17753 us starts=1 cost=32 size=1700000 card=100000)(object id 104840)
     10000      10000      10000   INDEX UNIQUE SCAN PAR_PK (cr=149 pr=22 pw=0 time=4494 us starts=10000 cost=0 size=49995 card=9999)(object id 104838)

As you can see, Oracle writes SQL for an outer join with an “is null” predicate on the second table – which the optimizer converts to an anti-join, running a nested loop in this case. It’s an interesting little oddity that the code includes the predicate "A"."OBJECT_ID" is not null given that the column is declared as not null – but this is presumably a developer deciding to re-use code even if it then includes a redundant predicate (which is effectively zero cost – since the optimizer can use transitive closure to eliminate it).
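If you want to know in advance whether a validation would fail, you can ask the same question by hand. The query below is a hand-written equivalent of the check, not the recursive statement that Oracle generates, so the optimizer is free to choose a different plan for it; a count of zero means there are no child rows without a matching parent.

```sql
-- Count child rows whose (non-null) foreign key value has no matching parent.
select
        count(*)        orphan_count
from
        child   c
where
        c.object_id is not null
and     not exists (
                select  null
                from    parent  p
                where   p.object_id = c.object_id
        )
;
```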

Given that Oracle has converted an outer join to an anti join I obviously had to check what would happen if I disabled this conversion by altering the “_optimizer_outer_to_anti_enabled” parameter to false. The optimizer obeyed the session setting with the following plan in the trace:

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  FILTER  (cr=373 pr=241 pw=0 time=226926 us starts=1)
    100000     100000     100000   NESTED LOOPS OUTER (cr=373 pr=241 pw=0 time=177182 us starts=1 cost=70 size=22000 card=1000)
    100000     100000     100000    INDEX FAST FULL SCAN CHI_FK_PAR (cr=224 pr=219 pw=0 time=40811 us starts=1 cost=32 size=1700000 card=100000)(object id 104848)
    100000     100000     100000    INDEX UNIQUE SCAN PAR_PK (cr=149 pr=22 pw=0 time=119363 us starts=100000 cost=0 size=49995 card=9999)(object id 104846)

The significant difference is in the CPU usage, of course, and, to a large degree, the magnitude of the change is dictated by the pattern and distribution of the data. The number of CR gets hasn’t changed as the number of index probes jumps from 10,000 to 100,000 because Oracle will have pinned index blocks (There’s a very old article on my old website if you want to read more about buffer pins).

The original question was about the effect of a local session setting that disabled index fast full scans, and followed up with a question on parallelism. After seeing the effect of changing one optimizer parameter at the session level you probably won’t be surprised by the following two results. The first is what happens when the only change I make is the setting of the “_fast_full_scan_enabled” parameter; the second is what happens when the only change is the declared parallelism of the child table.

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  MERGE JOIN ANTI (cr=240 pr=240 pw=0 time=120163 us starts=1 cost=247 size=22000 card=1000)
    100000     100000     100000   INDEX FULL SCAN CHI_FK_PAR (cr=218 pr=218 pw=0 time=20314 us starts=1 cost=222 size=1700000 card=100000)(object id 104852)
    100000     100000     100000   SORT UNIQUE (cr=22 pr=22 pw=0 time=81402 us starts=100000 cost=25 size=50000 card=10000)
     10000      10000      10000    INDEX FULL SCAN PAR_PK (cr=22 pr=22 pw=0 time=1185 us starts=1 cost=22 size=50000 card=10000)(object id 104850)


Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  PX COORDINATOR  (cr=15 pr=2 pw=0 time=722483 us starts=1)
         0          0          0   PX SEND QC (RANDOM) :TQ10001 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=37 size=22000 card=1000)
         0          0          0    HASH JOIN ANTI BUFFERED (cr=0 pr=0 pw=0 time=0 us starts=0 cost=37 size=22000 card=1000)
         0          0          0     PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us starts=0 cost=32 size=1700000 card=100000)
         0          0          0      TABLE ACCESS FULL CHILD (cr=0 pr=0 pw=0 time=0 us starts=0 cost=32 size=1700000 card=100000)
         0          0          0     PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us starts=0 cost=4 size=50000 card=10000)
         0          0          0      PX SEND BROADCAST :TQ10000 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=4 size=50000 card=10000)
         0          0          0       PX SELECTOR  (cr=0 pr=0 pw=0 time=0 us starts=0)
         0          0          0        INDEX FAST FULL SCAN PAR_PK (cr=0 pr=0 pw=0 time=0 us starts=0 cost=4 size=50000 card=10000)(object id 104854)

In the first case my version of Oracle has switched to a merge anti-join with an index full scan (not FAST full scan). It’s interesting to note that the merge join anti hasn’t been as clever as the nested loop anti in avoiding probes of the second data source as it walks the foreign key index (note how starts=100000 in the SORT UNIQUE line).

In the second case all the work was done by the parallel query slaves – and the PX SELECTOR line tells you that this plan must have come from 12c. As you can see we’re still doing an anti-join but this time we do a parallel tablescan of the child table (as we haven’t enabled the index for parallel execution – if we had altered the index to parallel(degree 8) as well we would have seen a parallel index fast full scan instead of the parallel tablescan.)
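The parallel index variation mentioned above was not part of the original script, but a sketch of how it might be set up follows. It assumes the constraint has already been validated once, so it first puts the constraint back to “enable novalidate” to give the validate step something to do.

```sql
-- Assumption: chi_fk_par is currently validated from an earlier test.
alter table child modify constraint chi_fk_par enable novalidate;

-- Declare the foreign key index (as well as the table) parallel.
alter table child parallel 8;
alter index chi_fk_par parallel 8;

alter session set events '10046 trace name context forever, level 12';
alter table child modify constraint chi_fk_par validate;
alter session set events '10046 trace name context off';

-- Put the declared parallelism back to serial after the test.
alter index chi_fk_par noparallel;
alter table child noparallel;
```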

Bottom line: the SQL executed to validate a foreign key constraint is essentially an (outer) join between the child and parent tables, and Oracle will simply optimize that statement to the best of its ability based on the current session settings and specific column declarations. If you want to test on a clone (or accurate model) of the tables you may find that you can create an sql_patch that works (even though the necessary SQL will be optimised as SYS); so far I’ve only tried this with a couple of variants of the parallel() hint on 12.2.0.1.
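For anyone who wants to experiment with the sql_patch idea, here is a rough sketch using the 12.2 function form of dbms_sqldiag.create_sql_patch. Several things here are assumptions for illustration only: the patch name validate_fk_patch, the SQL*Plus substitution variable m_sql_id, the particular hint text, and the idea of finding the statement in v$sql by matching the leading text shown in the tkprof output. Treat it as a starting point for testing on a clone, not as a tested recipe.

```sql
-- After a test validation, look for the recursive statement in the cursor cache.
select
        sql_id, child_number, parsing_schema_name
from
        v$sql
where
        sql_text like 'select /*+ all_rows ordered dynamic_sampling(2) */ A.rowid%'
;

-- Attach a hint to that sql_id (12.2 signature; the hint text is illustrative).
declare
        m_patch_name    varchar2(128);
begin
        m_patch_name := dbms_sqldiag.create_sql_patch(
                sql_id    => '&m_sql_id',
                hint_text => 'parallel(4)',
                name      => 'validate_fk_patch'
        );
end;
/

-- Remove the patch when the test is finished:
-- execute dbms_sqldiag.drop_sql_patch('validate_fk_patch')
```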

Footnote

If you were wondering what the three bind variables in the query were, this is the relevant extract from the 10046 trace file with bind variable tracing enabled:

 Bind#0
  oacdty=01 mxl=32(09) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=7f3fbd359c38  bln=32  avl=09  flg=05
  value="TEST_USER"
 Bind#1
  oacdty=01 mxl=32(05) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=7f3fbd359c00  bln=32  avl=05  flg=05
  value="CHILD"
 Bind#2
  oacdty=01 mxl=32(10) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=7f3fbd359bc8  bln=32  avl=10  flg=05
  value="CHI_FK_PAR"

The values are the owner, table, and constraint names. (Though you have to modify the code a little to show that the last bind variable is the constraint name and not the name of the index supporting the constraint).

 

 

October 17, 2013

Quiz Night

Filed under: constraints,Infrastructure,Oracle,Partitioning — Jonathan Lewis @ 10:58 pm BST Oct 17,2013

The following output is the result of two immediately consecutive SQL statements (with “set echo on”), and nothing else happening to the database.

(more…)
