Oracle Scratchpad

December 29, 2011

I wish

Filed under: CBO,Execution plans,Oracle,Wishlist — Jonathan Lewis @ 10:02 pm BST Dec 29,2011

Here’s a simple update statement that identifies a few rows in a table then updates a column where a matching value can be derived from another table – it’s an example of an update by correlated subquery:

update	t1
set	small_vc = (
		select
			max(small_vc)
		from
			t2
		where	t2.id = t1.id
	)
where
	mod(id,100) = 0
and	exists (
		select null
		from t2
		where 	t2.id = t1.id
	)
;

There are a number of different strategies of varying efficiency we could use to modify the data in the same way. We could use the merge command to compare the results of the aggregate subquery on t2 with rows in t1, taking advantage of the 10g enhancement of the merge command to apply results only on matched rows. We could do something with pl/sql bulk selects from the aggregate then apply a forall bulk update taking advantage of array exceptions. In fact, depending on the volume and pattern of data in the two tables, either of the three mechanisms might turn out to be the most efficient in some circumstances.

There is a fourth mechanism which we can’t (or shouldn’t) employ – even though there may be cases where it is more efficient than the three I’ve mentioned so far. It’s the mechanism of the updateable join view, where we join t1 to an aggregate query on t2 on the column that was the correlation column, then update across the join:

update
	(
	select
		t1.small_vc	t1_vc,
		v1.max_vc	v1_vc
	from
		t1,
		(
		select
			t2.id,
			max(small_vc)	max_vc
		from
			t2
		group by
			id
	)	v1
	where
		mod(t1.id,100) = 0
	and	v1.id = t1.id
	)
set	t1_vc = v1_vc
;

There is a problem with this statement, though – it fails with Oracle error “ORA-01779: cannot modify a column which maps to a non key-preserved table” unless you add the hint /*+ bypass_ujvc */ (bypass update join view check ?) and even that hint fails to work in my copy of 11.2.0.2.

A critical feature of Oracle’s implementation of updatable join views is that only one table will be updated by the statement, and that table must be “key-preserved” – meaning that any row from the table to be updated can (logically) appear at most once in the view. A simple restriction to ensure that this requirement is met is to insist that the join(s) to the other tables(s) should always be with equality on the primary (or unique) key of the table(s).

In this case, though, because we aggregate on t2.id, and then join on t2.id, it is clearly the case that any row that appears in t1 can only appear once in the join between t1 and the aggregate of t2. In this update, t1 is clearly key-preserved – but, absent the hint, the optimizer will not allow the mechanism to come into play. (SQL Server will handle this particular example correctly, by the way.)

So, yet another item on my wishlist for the optimizer – let’s see a few more cases of updateable join views being recognised.

In passing, the path that the optimizer produces for join view update, when hinted, isn’t ideal because it fails to spot an opportunity for transitive closure that would make it more efficient. Here’s the execution plan (with Predicate Section):

----------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes |TempSpc| Cost  |
----------------------------------------------------------------------
|   0 | UPDATE STATEMENT      |      |       |       |       |   100 |
|   1 |  UPDATE               | T1   |       |       |       |       |
|*  2 |   HASH JOIN           |      |   100 |  3500 |       |   100 |
|*  3 |    TABLE ACCESS FULL  | T1   |   100 |  1500 |       |    27 |
|   4 |    VIEW               |      | 10000 |   195K|       |    72 |
|   5 |     SORT GROUP BY     |      | 10000 |   146K|   488K|    72 |
|   6 |      TABLE ACCESS FULL| T2   | 10000 |   146K|       |    27 |
----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   3 - filter(MOD("T1"."ID",100)=0)

As you can see, Oracle aggregates the whole of t2, then uses the result to probe a hash table built from a subset of t1. But the basic join is on t2.id = t1.id, which means that the optimizer could, in principle, generate the predicate mod(v1.id,100) = 0 and then push it into the inline view before aggregating a subset of t2. Here’s the plan if you add the predicate manually:

-------------------------------------------------------------------------
| Id  | Operation                       | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                |       |   100 |  3500 |    51 |
|   1 |  UPDATE                         | T1    |       |       |       |
|*  2 |   HASH JOIN                     |       |   100 |  3500 |    51 |
|   3 |    VIEW                         |       |   100 |  2000 |    23 |
|   4 |     SORT GROUP BY               |       |   100 |  1500 |    23 |
|   5 |      TABLE ACCESS BY INDEX ROWID| T2    |   100 |  1500 |    23 |
|*  6 |       INDEX FULL SCAN           | T2_I1 |   100 |       |    21 |
|*  7 |    TABLE ACCESS FULL            | T1    |   100 |  1500 |    27 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   6 - filter(MOD("T2"."ID",100)=0)
   7 - filter(MOD("T1"."ID",100)=0)

It’s an interesting little side effect of the change (for my particular data set) that the probe table is now the t1 table (whereas the probe was a sorted aggregate of t2.id in the earlier version). This means that Oracle will be updating t1 as it scans it; in the previous plan Oracle would be jumping around t1 at random to update it in an order dictated by the arrival order of the aggregated rows coming from t2. In principle, this type of change in the order of updates could result in a reduction in the number of random physical I/Os that take place on t1.

Footnote:

If you want to experiment with variations on this problem then you can start with code like the following to create the two tables – bear in mind that there’s no great point in examining the variation in performance that different mechanisms give when playing with such a small (and regular) data set:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 10000)
select
	rownum			id,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 10000;

create table t2 as select * from t1;

-- collect stats

create unique index t1_i1 on t1(id);
create unique index t2_i1 on t2(id);
alter table t1 modify id not null;
alter table t2 modify id not null;

12 Comments »

  1. The BYPASS_UJVC hint was removed in 11.2.

    Comment by Greg Rahn — December 29, 2011 @ 10:20 pm BST Dec 29,2011 | Reply

    • Greg,

      The hint still seems to exist in 11.2.0.2:

      SELECT
        NAME,
        SQL_FEATURE
      FROM
        V$SQL_HINT
      WHERE
        NAME LIKE 'BYPASS%';
       
      NAME                      SQL_FEATURE
      ------------------------- -----------
      BYPASS_UJVC               QKSFM_CBO
      BYPASS_RECURSIVE_CHECK    QKSFM_ALL
       
      2 rows selected.
       
      SELECT
        BANNER
      FROM
        V$VERSION;
       
      BANNER
      ----------------------------------------------------------------------------
      Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
      PL/SQL Release 11.2.0.2.0 - Production
      CORE    11.2.0.2.0      Production
      TNS for 64-bit Windows: Version 11.2.0.2.0 - Production
      NLSRTL Version 11.2.0.2.0 - Production
       
      5 rows selected.
      

      Is the V$SQL_HINT view incorrect, or does the hint still exist but have no effect?

      A search though Metalink (MOS) found the following articles:
      Doc ID 876633.1: Hint BYPASS_UJVC is for internal use only by Oracle Database
      Bug 9764271: Hint BYPASS_UJVC is ignored after upgrade to 11.2.0.1
      Bug 9595084: Hint BYPASS_UJVC is ignored after upgrade to 11.2.0.1

      I did not see a “fixed in version” note attached to the bug numbers, so you might be right about the hint being removed. The search through Metalink suggests that some of Oracle’s application related code might be using the hint (reference: Bug 9336394: SR 3-1208781511).

      Comment by Charles Hooper — December 29, 2011 @ 11:33 pm BST Dec 29,2011 | Reply

      • Technically the hint exists, but it can only be used by internal code and is enforced as such. There is no “fixed in version” because those bugs are closed as “not a bug”. Given BYPASS_UJVC was never a documented hint, it could change or be removed at any time, and was. That is the risk one faces when using non-public hints.

        Comment by Greg Rahn — December 30, 2011 @ 2:27 am BST Dec 30,2011 | Reply

        • Greg,

          … but it can only be used by internal code and is enforced as such.

          Hence my comment: “which we can’t (or shouldn’t) employ”; in view of which I shall also resist the temptation to explain how to bypass the enforcement mechanism.

          Comment by Jonathan Lewis — December 30, 2011 @ 12:10 pm BST Dec 30,2011

  2. [...] Jonathan Lewis adds yet another item on his wishlist for the optimizer at the end of 2011. Someone should compile that list and send it to Oracle. It’s truly valuable and we might have a full blown Oracle database version, say, Oracle 14j, out of that… [...]

    Pingback by Log Buffer #252, A Carnival of the Vanities for DBAs | The Pythian Blog — December 30, 2011 @ 6:01 am BST Dec 30,2011 | Reply

  3. Jonathan, can you please give details on which of the three mechanisms should be used depending on the volume and pattern of data in the two tables. Whether Merge is appropriate if we are going to use only one clause(matched or not matched)? Thanks.

    Comment by dibang — January 1, 2012 @ 10:07 pm BST Jan 1,2012 | Reply

    • dibang,

      The quick answer to that question is No. It’s just a variation on the standard problem of knowing how to tune an SQL statement.

      Comment by Jonathan Lewis — January 3, 2012 @ 7:49 pm BST Jan 3,2012 | Reply

  4. [...] few days ago I published an example of the optimizer failing to handle an updateable join view because it didn’t recognise that a particular type of [...]

    Pingback by I wish « Oracle Scratchpad — January 6, 2012 @ 5:38 pm BST Jan 6,2012 | Reply

  5. I have no DB at hand right now, so I can not test this.
    But I assume the update would work if we use the aggregate subquery in the select list like this:

    update
    (
        select
            t1.small_vc t1_vc,
            (select max(small_vc)   max_vc from t2 where t2.id = t1.id) v1_vc
        from t1
        where mod(t1.id,100) = 0
        )
    set t1_vc = v1_vc
    where v1_vc is not null -- this should perform the exists part, see note below
    ;
    

    Note that I am relying on the assumptin that small_vc is a not null column, that way the max(small_vc) can not yield null, hence v1_vc is only null if the subquery does not return any rows. So I can use that logic to mimic the exists filter efficiently.
    If small_vc could be null, some NVL trick could help, but that would make my short reply really illegible…

    Comment by András Gábor — January 7, 2012 @ 12:55 pm BST Jan 7,2012 | Reply

  6. Congratulations Jonathan, your wish came true :)

    update
        (
        select
            t1.small_vc t1_vc,
            v1.max_vc   v1_vc
        from
            t1,
            (
            select
                t2.id,
                max(small_vc)   max_vc
            from
                t2
            group by
                id
        )   v1
        where
            mod(t1.id,100) = 0
        and v1.id = t1.id
        )
    set t1_vc = v1_vc
     22  ;
    
    100 rows updated.
    
    SQL> select * from table(dbms_xplan.display_cursor);
    
    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------
    SQL_ID	551u0cnpbp9vh, child number 0
    -------------------------------------
    update	   (	 select 	t1.small_vc t1_vc,	   v1.max_vc
    v1_vc	  from	       t1,	   (	     select		t2.id,
    	   max(small_vc)   max_vc	  from		   t2
    group by	     id     )	v1     where	     mod(t1.id,100) = 0
        and v1.id = t1.id	  ) set t1_vc = v1_vc
    
    Plan hash value: 3293262119
    
    -------------------------------------------------------------------------------
    | Id  | Operation	       | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT       |      |       |       |   102 (100)|	      |
    |   1 |  UPDATE 	       | T1   |       |       | 	   |	      |
    |   2 |   MERGE JOIN	       |      |   100 |  5000 |   102	(0)| 00:00:01 |
    |   3 |    SORT JOIN	       |      | 10000 |   341K|    51	(0)| 00:00:01 |
    |   4 |     VIEW	       |      | 10000 |   341K|    51	(0)| 00:00:01 |
    |   5 |      SORT GROUP BY     |      | 10000 |   146K|    51	(0)| 00:00:01 |
    |   6 |       TABLE ACCESS FULL| T2   | 10000 |   146K|    51	(0)| 00:00:01 |
    |*  7 |    SORT JOIN	       |      |   100 |  1500 |    51	(0)| 00:00:01 |
    |*  8 |     TABLE ACCESS FULL  | T1   |   100 |  1500 |    51	(0)| 00:00:01 |
    -------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       7 - access("V1"."ID"="T1"."ID")
           filter("V1"."ID"="T1"."ID")
       8 - filter(MOD("T1"."ID",100)=0)
    
    
    31 rows selected.
    
    SQL> alter session set "_fix_control"='13555551:0';
    
    Session altered.
    
    update
        (
        select
            t1.small_vc t1_vc,
            v1.max_vc   v1_vc
        from
            t1,
            (
            select
                t2.id,
                max(small_vc)   max_vc
            from
                t2
            group by
                id
        )   v1
        where
            mod(t1.id,100) = 0
        and v1.id = t1.id
        )
     21  set t1_vc = v1_vc;
    set t1_vc = v1_vc
        *
    ERROR at line 21:
    ORA-01779: cannot modify a column which maps to a non key-preserved table
    

    Comment by Timur Akhmadeev — July 10, 2013 @ 7:53 pm BST Jul 10,2013 | Reply

    • Timur,

      I’m obviously going to have to rush out all my “CBO 12c fix” blogs as quickly as I can if you’re going to discover them this quickly ;)

      Are you reading the contents of v$system_fix_control and working backwards ? That’s a pretty good way of finding out new features and fixes.

      Comment by Jonathan Lewis — July 10, 2013 @ 8:10 pm BST Jul 10,2013 | Reply

  7. […] Key-preservation ensured by aggregation – fixed […]

    Pingback by 12c Join Views | Oracle Scratchpad — August 9, 2013 @ 6:37 pm BST Aug 9,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,003 other followers