Oracle Scratchpad

December 19, 2008

Updatable Join Views

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 8:25 am BST Dec 19,2008

I was in Slovenia earlier this week presenting one of the “Celebrity Seminars” for Oracle University. The audience was very good, and had some interesting questions and observations. One of them introduced me to the following oddity that he had discovered with updatable join views.

Given suitable constraints, I can write a query that joins several tables, then turn that query into an inline view and update from one table to another, for example (tested on 10.2.0.3 and 11.1.0.6):

update	(
	select
		chi.small_vc_c,
		gpa.small_vc_gp
	from
		test_user.child		chi,
		test_user.parent	par,
		test_user.grandparent	gpa
	where
		chi.small_num_c between 200 and 215
	and	par.id_gp = chi.id_gp
	and	par.id    = chi.id_p
	and	gpa.id    = par.id_gp
	)	gen
set
	gen.small_vc_c = gen.small_vc_gp
;

For this update to the child table to be legal, I have to be able to guarantee that any row from the table that appears in the inline view cannot logically appear there more than once.

This restriction is, in some ways, obvious: if a single row from the child table could appear twice in the join then the two appearances could correspond to two different rows from the grandparent table, which would leave you trying to decide which grandparent value to use in the update.
Consider for example this data:

create table tabX (x1 number, x2 number);
create table tabY (y1 number, y2 number);

insert into tabx values(1,1);
insert into tabx values(1,99);
insert into taby values(1,-1);
insert into taby values(1,-99);

commit;

select
	y2, x2
from
	tabX, tabY
where
	tabx.x1 = taby.y1
;

        Y2         X2
---------- ----------
        -1         99
        -1          1
       -99         99
       -99          1

Both rows in the tabY table appear twice in the join result (as do both rows from tabX). If I try the following update, which value of x2 should end up overwriting the value for y2 in the underlying table – the 99 that happens to appear first in the join, or the 99 that happens to appear last (bearing in mind, of course, that the order of the output is not guaranteed) ?

update 	(
	select
		y2, x2
	from
		tabX, tabY
	where
		tabx.x1 = taby.y1
	)
set
	y2 = x2
;

ERROR at line 10:
ORA-01779: cannot modify a column which maps to a non key-preserved table

The technical term used by Oracle is “key-preservation”. Going back to the original example: if you identify a row in the child table using a unique key then the same key could act as the key to the join view.

For this to work the join from child to parent has to be on a unique key in the parent table, and the join from the parent to grandparent has to be on a unique key of the grandparent.

The updatable join view concept is not commonly known, but there are circumstances where it is the most efficient option for updating from one table to another.

Finally we come to the oddity. You’ll notice that I had included the name of the table owner in my original query – this is because I want to be able to run the update from another schema called u1. So what privileges should test_user grant to u1 ? The obvious requirement would be as follows:

grant select, update (small_vc_c) on child to u1;
grant select on parent to u1;
grant select on grandparent to u1;

This means I can update the relevant column from the child table, and see the columns from the parent and grandparent tables that I need to do the update.
But when I try the update, I get the following error:

                test_user.grandparent   gpa
                          *
ERROR at line 8:
ORA-01031: insufficient privileges

Until I grant further (unexpected) privileges to u1, I can’t execute the update. This is what it takes:


grant update (id, small_vc_gp) on grandparent to u1;
grant update (id_gp, id) on parent to u1;
grant update (id_gp, id_p, id) on child to u1;

I have to grant update column privileges of the relevant unique keys on all three tables and on the column from the grandparent table that I want to copy from. This means that u1 has to have a level of update privilege that I don’t want it to have.

Of course, I can use non-deferrable referential integrity constraints to ensure that the key columns on the parent and grandparent tables can’t be changed – but I can’t stop u1 from changing the small_vc_gp column on the grandparent table, or the id column of the child table. I can’t think of a good reason why these extra privileges are needed.

There’s an important moral to this story. You should not build a system that allows end-users to connect to the database as the data owner.

But it’s still quite common for development work to take place in an environment that is a little lax about this rule, with developers connecting as the data owner to write code. This could leave you with code that seemed to work in the development environment but suddenly needs a change which grants excess privileges to the end-user before it runs in the production environment.

Remember, it’s important to use a development environment that is a proper match for the production environment.

19 Comments »

  1. Ever since 10g allowed one to use the MERGE statement to do a simple update (in other words, not requiring an INSERT portion of the statement), I have shied away from UPDATE JOINS, as they are really no longer required. Instead of relying on constraints to tell Oracle what you intend to update, MERGE allows you to specify it in the statement. That doesn’t get us around the issue in your test statement… but I think it makes coding a little easier.

    SQL> MERGE INTO taby t
     2  USING (SELECT x2,
     3                x1
     4            FROM tabx) s
     5  ON (s.x1=t.y1)
     6  WHEN MATCHED
     7  THEN UPDATE
     8     SET t.y2 = s.x2;
    MERGE INTO taby t
              *
    ERROR at line 1:
    ORA-30926: unable to get a stable set of rows in the source tables

    Elapsed: 00:00:00.01
    SQL>

    Comment by Stewart Bryson — December 19, 2008 @ 3:11 pm BST Dec 19,2008 | Reply

    • Stewart,

      That’s a valid point – and the merge command can be very powerful – especially in 10g. Nevertheless there are three or four mechanisms for doing this type of thing, and each has its sttrengths and weaknesses. (Sometimes, for example, the best insert/update option is to write your own two-pass code using array exception handling).

      A minor drawback to the merge command is that you may not realise that a single incoming row can update multiple existing rows – a possibility that may be intuitively more obvious in the update join view. It is sometimes surprising how, on a subjective level, one technique can seem to be wonderful to one user and awful to another when, on an objective level, there is no significant difference.

      Comment by Jonathan Lewis — December 21, 2008 @ 8:07 pm BST Dec 21,2008 | Reply

  2. @Jonathan

    The extra grants maybe “required” by the old bug (now solved) that permitted a user with only select privileges to update a view.

    I mean they can be an extra check or sort of.

    Comment by lascoltodelvenerdi — December 19, 2008 @ 4:13 pm BST Dec 19,2008 | Reply

    • Lascoltodelvenerdi,

      Good point – it is sometimes quite surprising how a fix for one problem pops up some time later as an anomaly in some other situation.

      Comment by Jonathan Lewis — December 21, 2008 @ 8:09 pm BST Dec 21,2008 | Reply

  3. I don’t think current view update vendor implementation should be taken seriously. After all even database theorists can’t agree how it should be done (except somewhat less ambitious approach with update triggers).

    How about the following scenario:


    drop table tabX;
    drop table tabY;

    create table tabX (p number,
    CONSTRAINT unique_xp UNIQUE (p)
    );
    create table tabY (p number,
    CONSTRAINT unique_yp UNIQUE (p)
    );

    insert into tabX values(1);
    insert into tabY values(1);

    commit;

    insert into (
           select
                   x.p
           from
                   tabX x, tabY y
           where
                   x.p = y.p
    ) values(2);

    select * from tabX;
    select * from tabY;
           select
                   x.p
           from
                   tabX x, tabY y
           where
                   x.p = y.p;

    rollback;

    Now on my 11.1 db:
    1. After the update a row is inserted into tabX and not into tabY?
    2. The join view doesn’t change at all and show only one row??

    Comment by Vadim Tropashko — December 19, 2008 @ 10:53 pm BST Dec 19,2008 | Reply

  4. Experts,

    is it possible to parallelize the actual update statement while using Updatable Join View?

    alter session enable parallel dml;

    update /*+ PARALLEL(gen) */
    (
           select
                   chi.small_vc_c,
                   gpa.small_vc_gp
           from
                   test_user.child                chi,
                   test_user.parent        par,
                   test_user.grandparent        gpa
           where
                   chi.small_num_c between 200 and 215
           and        par.id_gp = chi.id_gp
           and        par.id    = chi.id_p
           and        gpa.id    = par.id_gp
    ) gen
    set
           gen.small_vc_c = gen.small_vc_gp
    ;

    the above does not work :( the actual update is done in single thread.
    2 days ago at work we wanted to update 30 mil. of rows in a 80 mil. table from another table.
    While we could parallelize query(HJ) in a updateable view (which we verified could complete in a couple of minutes without the update statement) longops with update statement showed 2 days to complete.
    The solution was to use parallel CTAS to rebuild the whole table, which completed in under 45 mins, indexes included.

    Thanks.

    Comment by Dan — December 19, 2008 @ 11:17 pm BST Dec 19,2008 | Reply

    • Dan,

      I’ve edited your code to be more readable – which included changing the hint syntax because the “double hyphen” version didn’t show up properly.

      I am puzzled by the results of testing on 10.2.0.3. My plan looks as if the update is able to run parallel, v$pq_tqstat reports data that says the update actually ran parallel, v$transaction shows parallel generation of redo – yet I don’t get the usual Oracle error ORA-12838: cannot read/modify an object after modifying it in parallel when I tried to do a second update without issuing a commit against the first.

      This was the execution plan for one test:

      -----------------------------------------------------------------------------------------------------------
      | Id  | Operation                           | Name        | Rows  | Time     |    TQ  |IN-OUT| PQ Distrib |
      -----------------------------------------------------------------------------------------------------------
      |   0 | UPDATE STATEMENT                    |             |   335 | 00:00:14 |        |      |            |
      |   1 |  PX COORDINATOR                     |             |       |          |        |      |            |
      |   2 |   PX SEND QC (RANDOM)               | :TQ10003    |     1 | 00:00:01 |  Q1,03 | P->S | QC (RAND)  |
      |   3 |    INDEX MAINTENANCE                | CHILD       |       |          |  Q1,03 | PCWP |            |
      |   4 |     PX RECEIVE                      |             |     1 | 00:00:01 |  Q1,03 | PCWP |            |
      |   5 |      PX SEND RANGE                  | :TQ10002    |     1 | 00:00:01 |  Q1,02 | P->P | RANGE      |
      |   6 |       UPDATE                        | CHILD       |       |          |  Q1,02 | PCWP |            |
      |   7 |        PX RECEIVE                   |             |     1 | 00:00:01 |  Q1,02 | PCWP |            |
      |   8 |         PX SEND HASH (BLOCK ADDRESS)| :TQ10001    |     1 | 00:00:01 |  Q1,01 | P->P | HASH (BLOCK|
      |*  9 |          TABLE ACCESS BY INDEX ROWID| CHILD       |     1 | 00:00:01 |  Q1,01 | PCWC |            |
      |  10 |           NESTED LOOPS              |             |   335 | 00:00:14 |  Q1,01 | PCWP |            |
      |* 11 |            HASH JOIN                |             |  2000 | 00:00:01 |  Q1,01 | PCWP |            |
      |  12 |             PX BLOCK ITERATOR       |             |  1000 | 00:00:01 |  Q1,01 | PCWC |            |
      |  13 |              TABLE ACCESS FULL      | GRANDPARENT |  1000 | 00:00:01 |  Q1,01 | PCWP |            |
      |  14 |             BUFFER SORT             |             |       |          |  Q1,01 | PCWC |            |
      |  15 |              PX RECEIVE             |             |  2000 | 00:00:01 |  Q1,01 | PCWP |            |
      |  16 |               PX SEND BROADCAST     | :TQ10000    |  2000 | 00:00:01 |        | S->P | BROADCAST  |
      |  17 |                INDEX FAST FULL SCAN | P_PK        |  2000 | 00:00:01 |        |      |            |
      |* 18 |            INDEX RANGE SCAN         | C_PK        |     1 | 00:00:01 |  Q1,01 | PCWP |            |
      -----------------------------------------------------------------------------------------------------------

      Comment by Jonathan Lewis — December 21, 2008 @ 9:31 pm BST Dec 21,2008 | Reply

  5. @Vadim

    Where is the problem?

    I think it work correctly.

    1. After the update a row is inserted into tabX and not into tabY?

    That’s right! The insert as the update of a view, can work only on ONE table.
    If you want to insert rows into more than one table, you must use the multi-table insert.

    2. The join view doesn’t change at all and show only one row??
    Right. As you inserted only in one table, the join condition “x.p = y.p” is not satisfied.

    @Dan

    Last hope: a fully qualified hint

    update /*+ PARALLEL(gen.chi,4) */

    so forcing the degree also.

    But I’m not sure it will work…

    I would try to put the parallel hint also in the select.

    Comment by lascoltodelvenerdi — December 20, 2008 @ 8:02 pm BST Dec 20,2008 | Reply

    • lascoltodelvenerdi,

      Relational database theory says that, in principle, you should not be able to tell whether your SQL is accessing a a table or a view – so I should be able to insert a row into the view and then be able to see it when I query the view (Chris Date has some very good material in his presentations about this).

      In Vadim’s example, he’s inserted a row into a view, and simply doesn’t exist when he queries that view.

      Comment by Jonathan Lewis — December 21, 2008 @ 9:37 pm BST Dec 21,2008 | Reply

  6. “then be able to see it when I query the view ”
    When you create an explicit view you can add the “WITH CHECK OPTION” clause so that you can’t insert a row that you couldn’t subsequently query. Have to see what that does for vadim’s example

    Comment by Gary — December 21, 2008 @ 10:06 pm BST Dec 21,2008 | Reply

  7. @Jonathan

    I get the point, but looking at how Oracle works with updatable views, I think that the result of Vadim is OK.

    For sure it is not an optimal implementation, but is the only that we got (in Oracle at least).

    Comment by lascoltodelvenerdi — December 23, 2008 @ 9:08 am BST Dec 23,2008 | Reply

  8. Finally got around to checking Vadim’s view from comment 3 in 11g.
    Using an explicit view, rather than in-line view, it works as stated.
    If I add the WITH CHECK OPTION clause I mentioned in comment 10, the view becomes non-updatable.

    Comment by Gary — December 31, 2008 @ 3:19 am BST Dec 31,2008 | Reply

  9. Even though the Update statement runs in parallel we saw lot of “db file sequential eads” wait events even though we are updating columns which have no indexes.

    What could be the reason for that,can you please let us know.

    Thanks

    Comment by Vikas — July 26, 2009 @ 5:59 am BST Jul 26,2009 | Reply

    • I bet it’s single block reads of the table which is updated – because to update a row by ROWID you have to read this block to buffer cache. When PX is in use, there’s a big chance a block wouldn’t be in cache, so you have to read it.

      Comment by Timur Akhmadeev — July 26, 2009 @ 9:02 am BST Jul 26,2009 | Reply

  10. Vikas,

    Parallel execution doesn’t stop indexed access paths appearing in the execution plan. (See this note, for example). And even if you do nothing by direct path reads to get the raw data, you may have to do single block reads on the undo segment to create read-consistent copies of some of the data blocks.

    Comment by Jonathan Lewis — July 26, 2009 @ 6:48 am BST Jul 26,2009 | Reply

  11. Hi Jonathan,

    Thanks for clarifying some of the points. However if you could explain me a liitle bit more on the updateable join views it would definitely help.

    
          update /*+ parallel (x) nologging */ 
          (
            select /*+ parallel(a) parallel(b) use_hash (b a) */
            a.crtv_name,
            a.crtv_pend_flag,
            a.crtv_edit_status,
            a.crtv_exploded_flag,
            a.ss_rank,
            a.cm_rank,
            a.crtv_edit_subm_tmstmp,
            a.crtv_edit_revw_tmstmp,
            a.crtv_del_tmstmp,
            a.crtv_lang_id,
            a.crtv_url_key,
            a.crtv_title,
            a.crtv_desc,
            a.crtv_short_desc,
            a.crtv_display_url,
            a.crtv_status,
            a.crtv_ss_weight,
            a.crtv_cm_weight,
            a.crtv_ss_rotation_flg,
            a.crtv_cm_rotation_flg,
            a.crtv_rating,
            a.historical_crtv_sid,
            a.effective_start_date,
            b.crtv_name b_crtv_name,
            b.pending_flg b_pending_flg,
            b.edit_status b_edit_status,
            b.exploded_flg b_exploded_flg,
            b.ss_rank b_ss_rank,
            b.cm_rank b_cm_rank,
            b.edit_subm_tmstmp b_edit_subm_tmstmp,
            b.edit_revw_tmstmp b_edit_revw_tmstmp,
            b.del_tmstmp b_del_tmstmp,
            b.lang_id b_lang_id,
            b.url_id b_url_id,
            b.title b_title,
            b.descr b_descr,
            b.short_descr b_short_descr,
            b.display_url b_display_url,
            b.status b_status,
            b.ss_weight b_ss_weight,
            b.cm_weight b_cm_weight,
            b.ss_rotation_flg b_ss_rotation_flg,
            b.cm_rotation_flg b_cm_rotation_flg,
            b.crtv_rating b_crtv_rating,
            b.historical_crtv_sid b_historical_crtv_sid,
            decode(b.ir_dml_type, '2', trunc(sysdate), a.effective_start_date) b_effective_start_date
            from ysmcd.dim_com_creative_c a,
            ysmcd.dim_stg_creative b
            where a.crtv_sid = b.crtv_id
            and b.ir_dml_type in ('1','2')
          ) x
          set
          crtv_name = b_crtv_name,
          crtv_pend_flag = b_pending_flg,
          crtv_edit_status = b_edit_status,
          crtv_exploded_flag = b_exploded_flg,
          ss_rank = b_ss_rank,
          cm_rank = b_cm_rank,
          crtv_edit_subm_tmstmp = b_edit_subm_tmstmp,
          crtv_edit_revw_tmstmp = b_edit_revw_tmstmp,
          crtv_del_tmstmp = b_del_tmstmp,
          crtv_lang_id = b_lang_id,
          crtv_url_key = b_url_id,
          crtv_title = b_title,
          crtv_desc = b_descr,
          crtv_short_desc = b_short_descr,
          crtv_display_url = b_display_url,
          crtv_status = b_status,
          crtv_ss_weight = b_ss_weight,
          crtv_cm_weight = b_cm_weight,
          crtv_ss_rotation_flg = b_ss_rotation_flg,
          crtv_cm_rotation_flg = b_cm_rotation_flg,
          crtv_rating = b_crtv_rating,
          historical_crtv_sid = b_historical_crtv_sid,
          effective_start_date = b_effective_start_date;
    
    
    SQL> set linesize 1000
    SQL> select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Plan hash value: 4193413443
    
    ------------------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
    ------------------------------------------------------------------------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT             |                    |  3220K|  1824M| 23783   (1)| 00:07:09 |    |          |        |      |            |
    |   1 |  UPDATE                      | DIM_COM_CREATIVE_C |       |       |            |          |    |          |        |      |            |
    |   2 |   PX COORDINATOR             |                    |       |       |            |          |    |          |        |      |            |
    |   3 |    PX SEND QC (RANDOM)       | :TQ10001           |  3220K|  1824M| 23783   (1)| 00:07:09 |    |          |  Q1,01 | P->S | QC (RAND)  |
    |*  4 |     HASH JOIN                |                    |  3220K|  1824M| 23783   (1)| 00:07:09 |    |          |  Q1,01 | PCWP |            |
    |   5 |      PX RECEIVE              |                    |  3220K|   826M|   699   (1)| 00:00:13 |    |          |  Q1,01 | PCWP |            |
    |   6 |       PX SEND BROADCAST LOCAL| :TQ10000           |  3220K|   826M|   699   (1)| 00:00:13 |    |          |  Q1,00 | P->P | BCST LOCAL |
    |   7 |        PX BLOCK ITERATOR     |                    |  3220K|   826M|   699   (1)| 00:00:13 |  1 |    64    |  Q1,00 | PCWC |            |
    |*  8 |         TABLE ACCESS FULL    | DIM_STG_CREATIVE   |  3220K|   826M|   699   (1)| 00:00:13 |  1 |    64    |  Q1,00 | PCWP |            |
    |   9 |      PX BLOCK ITERATOR       |                    |   309M|    93G| 23067   (1)| 00:06:56 |  1 |    64    |  Q1,01 | PCWC |            |
    |  10 |       TABLE ACCESS FULL      | DIM_COM_CREATIVE_C |   309M|    93G| 23067   (1)| 00:06:56 |  1 |    64    |  Q1,01 | PCWP |            |
    ------------------------------------------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       4 - access("A"."CRTV_SID"="B"."CRTV_ID")
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
       8 - filter("B"."IR_DML_TYPE"='1' OR "B"."IR_DML_TYPE"='2')
    
    

    During the run of this update we have continously monitored that the inner join is doing a HASH JOIN, with DIM_COM_CREATIVE_C doing Rowid Range Scan by “n” parallel slaves. We can see the direct path read events during its run which is obvious since the data is getting hash probed in DIM_STG_CREATIVE which is hashed.

    My questions:

    1) Is this statement doing Parallel Updates, I see P->S operation during the last step.
    2) At which step of the execution plan does it know which blocks need updation and have to be sent to buffer cache using single block I/O access. I am seeing this event even when Parallel Slaves are doing a direct path read. should it not be the last step when the BULK UPDATE is going to happen.
    3) I am seeing the direct path reads along with db file sequential reads happening during this update run process.

    Please let me understand how it works.

    Thanks,

    Comment by Vikas — July 27, 2009 @ 4:03 am BST Jul 27,2009 | Reply

    • 1) No. PDML must be explicitly enabled in the session:

      alter session enable parallel dml;
      

      And there are limitations to PDML.
      2) At the #1 step update takes place. Rows are sent to QC at the step #3. As you are using partial partition-wise join with broadcasting DIM_STG_CREATIVE table, Oracle is able to pipe all steps and run them “simultaneously”. Here is a note by Jonathan about HJ with PX when HASH-HASH distribution is in use – in this case you won’t see update before HJ is completely finished.
      3) Again – table and undo blocks have to be read to perform update

      You should read documentation about PX and plans to get better understanding what’s going on:
      Using Parallel Execution, http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm#i26005.

      Comment by Timur Akhmadeev — July 27, 2009 @ 7:13 am BST Jul 27,2009 | Reply

    • Vikas,
      I think Timur has addressed all your points.

      Comment by Jonathan Lewis — July 27, 2009 @ 8:18 pm BST Jul 27,2009 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.