Oracle Scratchpad

June 24, 2016

Never …

Filed under: Infrastructure,Oracle,Philosophy — Jonathan Lewis @ 1:15 pm BST Jun 24,2016

From time to time a question comes up on OTN that results in someone responding with the mantra: “Never do in PL/SQL that which can be done in plain  SQL”. It’s a theme I’ve mentioned a couple of times before on this blog, most recently with regard to Bryn Llewellyn’s presentation on transforming one table into another and Stew Ashton’s use of Analytic functions to solve a problem that I got stuck with.

Here’s a different question that challenges that mantra. What’s the obvious reason why someone might decide to produce the following code rather than writing a simple “insert into t1 select * from t2;”:


declare

        cursor c1 is
        select * from t2
        ;

        type c1_array is table of c1%rowtype index by binary_integer;
        m_tab c1_array;

begin

        open c1;
        loop
                fetch c1
                bulk collect into m_tab limit 100;

                begin
                        forall i in 1..m_tab.count
                                insert into t1 values m_tab(i);
                exception
                        when others
                                then begin
                                        --  proper exception handling should go here
                                        dbms_output.put_line(m_tab(1).id);
                                        dbms_output.put_line(sqlerrm);
                                end;
                end;

                exit when c1%notfound;

        end loop;
        close c1;
end;
/

There is a very good argument for this approach.

Follow-up (Saturday 25th)

As Andras Gabor pointed out in one of the comments, there are documented scenarios where the execution plan for a simple select statement is not legal for the select part of an “insert into .. select …” statement. Specifically, if you have a distributed query the most efficient execution plan may require the remote site to be the driving site, but the plan for a CTAS or insert/select is required to use the local site as the driving site.
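By way of illustration – the table names and database link in the next few lines are made up, and not part of the test case later in this note – the driving_site() hint can make a standalone distributed query run at the remote site, but the hint is ignored once the query becomes the select part of an insert (or CTAS), which has to drive from the local site:

-- standalone query: the hint is legal and the remote site can drive the query
select  /*+ driving_site(t2) */
        t2.*
from    t2@remote_db    t2,
        t3@remote_db    t3
where   t3.id = t2.id
;

-- insert / select: the hint is ignored and the local site has to drive,
-- pulling both remote tables across the database link
insert into t1
select  /*+ driving_site(t2) */
        t2.*
from    t2@remote_db    t2,
        t3@remote_db    t3
where   t3.id = t2.id
;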

There are workarounds – if you’re allowed to use them – such as creating a view at the remote site and selecting from the view, or you could create a pipelined function locally and select from the pipelined function (but that’s going to be writing PL/SQL anyway, and you’d have to create one or two object types in the database to implement it).
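As a sketch of the pipelined function workaround – the type names, function name, and database link are hypothetical, and I’m assuming the same three columns as the t2 used in the test case below – the critical point is that the query inside the function is a standalone select, so it isn’t subject to the insert/select restriction:

create or replace type t2_row_type as object (
        id      number(8,0),
        n1      number,
        padding varchar2(100)
);
/

create or replace type t2_tab_type as table of t2_row_type;
/

create or replace function pipe_t2 return t2_tab_type pipelined
is
begin
        -- a standalone select, so it can use the plan the simple query would use
        for r in (select id, n1, padding from t2@remote_db) loop
                pipe row (t2_row_type(r.id, r.n1, r.padding));
        end loop;
        return;
end;
/

insert into t1
select * from table(pipe_t2);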

Another example of plan limitations, one that I had not seen before (but have now found documented as “not a bug” in MoS note 20112932), showed up in a comment from Louis: a select statement may run efficiently because the plan uses a Bloom filter, but the filter disappears when the same statement becomes the select part of an insert/select.

These limitations, however, were not the point I had in mind. The “obvious” reason for taking the PL/SQL approach is error handling. What happens if one of the rows in your insert statement raises an Oracle exception ? The entire statement has to roll back. If you adopt the PL/SQL array processing approach then you can trap each error as it occurs and decide what to do about it – and there’s a detail behind that statement that is really important: the PL/SQL can operate at virtually the same speed as the simple SQL statement once you’ve set the array size to a value which allows each insert to populate a couple of blocks.

Let me emphasise the critical point of the last sentence:  array inserts in PL/SQL operate at (virtually) the speed of the standard SQL insert / select.

As it stands I don’t think the exception handler in my code above could detect which row in the batch had caused the error – I’ve just printed the ID from the first row in the batch as a little debug detail that’s only useful to me because of my knowledge of the data. Realistically the PL/SQL block to handle the inserts might look more like the following:


-- In program declaration section

        dml_errors      exception;
        pragma exception_init(dml_errors, -24381);

        m_error_pos     number(6,0)     := 0;

-- ------------------------------

                begin
                        forall i in 1..m_tab.count save exceptions
                                insert into t1 values m_tab(i);
                exception
                        when dml_errors then begin

                                for i in 1..sql%bulk_exceptions.count loop

                                        dbms_output.put_line(
                                                'Array element: ' ||
                                                        sql%bulk_exceptions(i).error_index || ' ' ||
                                                        sqlerrm(-sql%bulk_exceptions(i).error_code)
                                        );

                                        m_error_pos := sql%bulk_exceptions(i).error_index;
                                        dbms_output.put_line(
                                                'Content: ' || m_tab(m_error_pos).id || ' ' || m_tab(m_error_pos).n1
                                        );

                                end loop;
                        end;

                        when others then raise;
                end;


You’ll notice that I’ve added the SAVE EXCEPTIONS clause to the FORALL statement. This allows Oracle to trap any errors that occur in the array processing step and record details of the guilty array elements as it goes along, storing those details in an array called SQL%BULK_EXCEPTIONS. My exception handler then handles the array processing exception by walking through that array.

I’ve also introduced an m_error_pos variable (which I could have declared inside the specific exception handler) to remove a little of the clutter from the line that shows I can identify exactly which row in the source data caused the problem. With a minimum of wasted resources this code now inserts all the valid rows and reports the invalid rows (and, if necessary, could take appropriate action on each invalid row as it appears).

If you’ve got a data loading requirement where almost all the data is expected to be correct but errors occasionally happen, this type of coding strategy is likely to be the most efficient thing you could do to get your data into the database. It may be slightly slower when there are no errors, but that’s a good insurance premium when compared with the crash and complete rollback that occurs if you take the simple approach – and there are bound to be cases where a pre-emptive check of all the data (that would, probably, make the insert safe) would add far more overhead than the little bit of PL/SQL processing shown here.

Results

It’s obviously a little difficult to produce any time-based rates that demonstrate the similarity in performance of the SQL and PL/SQL approaches – the major time component in the little demo I built was the I/O rather than the CPU (which, in itself, rather validates the claim anyway). But if you want to do some testing here’s my data model, with some results in the following section:


rem
rem     Script: plsql_loop_insert.sql
rem     Author: Jonathan Lewis
rem

execute dbms_random.seed(0)

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        cast(rownum as number(8,0))                     id,
        2 * trunc(dbms_random.value(1e10,1e12))         n1,
        cast(lpad('x',100,'x') as varchar2(100))        padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;
create table t2
nologging
noparallel
as
select
        /*+ no_parallel(t1) */
        id + 1e6        id,
        n1 - 1          n1,
        rpad('x',100,'x') padding
from t1 
;

-- update t2 set n1 = n1 + 1 where id = 2e6;
-- update t2 set n1 = n1 + 1 where id = 2e6 - 10;
-- update t2 set n1 = n1 + 1 where id = 2e6 - 20;
-- update t2 set n1 = n1 + 1 where id = 1750200;
-- update t2 set n1 = n1 + 1 where id = 1500003;
-- update t2 set n1 = n1 + 1 where id = 1500001;
commit;

alter system checkpoint;
alter system switch logfile;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

create unique index t1_i1 on t1(n1) nologging;
create unique index t1_pk on t1(id) nologging;
alter table t1 add constraint t1_pk primary key(id);


I’ve generated 1 million rows with an id column and a random integer – picking the range of the random numbers to give me a very good chance (that worked) of getting a unique set of values. I’ve doubled the random values I use for t1 so that I can subtract 1 and still guarantee uniqueness when I generate the t2 values (I’ve also added 1 million to the id value for t2 for the same uniqueness reason).

The optional update to add 1 to a scattering of rows in t2 ensures that those values go back to their original t1 values so that they can cause “duplicate key” errors. The SQL insert was a simple insert into t1 select * from t2 (ensuring that parallel query didn’t come into play), and the PL/SQL detail I used was as follows:

declare

        cursor c1 is
        select /*+ no_parallel(t2) */ * from t2
        ;

        type c1_array is table of c1%rowtype index by binary_integer;
        m_tab c1_array;

        dml_errors      exception;
        pragma exception_init(dml_errors, -24381);

        m_error_pos     number(6,0)     := 0;

begin

        open c1;
        loop
                fetch c1
                bulk collect
                into m_tab limit 100;

                begin
                        forall i in 1..m_tab.count save exceptions
                                insert into t1 values m_tab(i);

                exception
                        when dml_errors then begin

                                for i in 1..sql%bulk_exceptions.count loop

                                        dbms_output.put_line(
                                                'Array element: ' ||
                                                        sql%bulk_exceptions(i).error_index || ' ' ||
                                                        sqlerrm(-sql%bulk_exceptions(i).error_code)
                                        );

                                        m_error_pos := sql%bulk_exceptions(i).error_index;
                                        dbms_output.put_line(
                                                'Content: ' || m_tab(m_error_pos).id || ' ' || m_tab(m_error_pos).n1
                                        );

                                end loop;
                        end;

                        when others then raise;

                end;

                exit when c1%notfound;  -- when fetch < limit

        end loop;
        close c1;
end;
/
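For comparison, the plain SQL version of the test was nothing more sophisticated than the following (the no_parallel() hint is my assumption here, mirroring the hint in the PL/SQL cursor above):

insert into t1
select  /*+ no_parallel(t2) */
        *
from    t2
;

commit;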

The PL/SQL output with one bad row (2e6 – 20) looked like this:

Array element: 80 ORA-00001: unique constraint (.) violated
Content: 1999980 562332925640

Here are some critical session statistics for different tests in 11g:


No bad data, insert select
--------------------------
Name                                                 Value
----                                                 -----
CPU used when call started                             944
CPU used by this session                               944
DB time                                              1,712
redo entries                                     1,160,421
redo size                                      476,759,324
undo change vector size                        135,184,996

No bad data, PL/SQL loop
------------------------
Name                                                 Value
----                                                 -----
CPU used when call started                             990
CPU used by this session                               990
DB time                                              1,660
redo entries                                     1,168,022
redo size                                      478,337,320
undo change vector size                        135,709,056


Duplicate Key (2e6-20), insert select (with huge rollback)
----------------------------------------------------------
Name                                                 Value
----                                                 -----
CPU used when call started                           1,441
CPU used by this session                             1,440
DB time                                              2,427
redo entries                                     2,227,412
redo size                                      638,505,684
undo change vector size                        134,958,012
rollback changes - undo records applied          1,049,559

Duplicate Key (2e6-20), PL/SQL loop - bad row reported
------------------------------------------------------
Name                                                 Value
----                                                 -----
CPU used when call started                             936
CPU used by this session                               936
DB time                                              1,570
redo entries                                     1,168,345
redo size                                      478,359,528
undo change vector size                        135,502,488
rollback changes - undo records applied                 74

Most of the difference between CPU time and DB time in all the tests was file I/O time (in my case largely checkpoint wait time because I had small log files, but in larger systems it’s quite common to see a lot of time spent on db file sequential reads as index blocks are read for update). You can see that there’s some “unexpected” variation in CPU time – I wasn’t expecting the PL/SQL loop that failed after nearly 1M inserts to use less CPU than anything else – but the CPU numbers fluctuated by a few hundredths of a second across tests, and this just happened to be particularly noticeable with the first one I did – so to some extent this was probably affected by background activity relating to space management, job queue processing and all the other virtual machines on the system.

Critically I think it’s fair to say that the differences in CPU timing are not hugely significant across a reasonably sized data set, and most importantly the redo and undo hardly vary at all between the successful SQL and both PL/SQL tests. The bulk processing PL/SQL approach doesn’t add a dramatic overhead – but it clearly does bypass the threat of a massive rollback.

Footnote:

You might want to argue the case for using basic SQL with the log errors clause. The method is simple and it gives you a table of the rows which caused exceptions as the insert executed – and that may be sufficient for your purposes; but there’s a problem until you upgrade to 12c.

Here’s how I had to modify my test case to demonstrate the method:


begin
        dbms_errlog.create_error_log('t1');
end;
/

insert into t1 select * from t2
log errors
reject limit unlimited
;

The procedure call creates a table to hold the bad rows; by default its name will be err$_t1, and it will be a clone of the t1 table with changes to column types (which might be interesting if you’ve enabled 32K columns in 12c – to be tested) and a few extra columns:


SQL> desc err$_t1
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 ORA_ERR_NUMBER$                        NUMBER
 ORA_ERR_MESG$                          VARCHAR2(2000)
 ORA_ERR_ROWID$                         ROWID
 ORA_ERR_OPTYP$                         VARCHAR2(2)
 ORA_ERR_TAG$                           VARCHAR2(2000)
 ID                                     VARCHAR2(4000)
 N1                                     VARCHAR2(4000)
 PADDING                                VARCHAR2(4000)

SQL> execute print_table('select * from err$_t1')
ORA_ERR_NUMBER$               : 1
ORA_ERR_MESG$                 : ORA-00001: unique constraint (TEST_USER.T1_I1) violated

ORA_ERR_ROWID$                :
ORA_ERR_OPTYP$                : I
ORA_ERR_TAG$                  :
ID                            : 1999980
N1                            : 562332925640
PADDING                       : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

So what’s the problem with logging errors ? Here are the sets of session stats corresponding to the ones that I reported above for the SQL and PL/SQL options. The first set comes from running this test on 11.2.0.4, the second from 12.1.0.2:


11g results
===========
Name                                                 Value
----                                                 -----
CPU used when call started                           1,534
CPU used by this session                             1,534
DB time                                              2,816
redo entries                                     3,113,105
redo size                                      902,311,860
undo change vector size                        269,307,108

12c results
===========
Name                                                 Value
----                                                 -----
CPU used when call started                             801
CPU used by this session                               801
DB time                                              3,061  -- very long checkpoint waits !!
redo entries                                     1,143,342
redo size                                      492,615,336
undo change vector size                        135,087,044


The 12c stats are very similar to the stats from the perfect SQL run and the two PL/SQL runs – but if you look at the 11g stats you’ll see that they’re completely different from all the other stats. The number of redo entries (if nothing else) tells you that Oracle has dropped back from array processing to single row processing in order to be able to handle the error logging (roughly 3 million entries: for each of the 1 million rows there’s one entry for the row itself, one for its PK index entry, and one for the unique key index entry).

Until 12c error logging is just row by row processing.

Footnote:

As far as I can tell, I first pointed out this “single row processing” aspect of the log errors option some time around December 2005.

Late Entry:

While looking for a posting about efficient updates I came across another of my postings that compares SQL with PL/SQL for updates – it’s worth a read.

 

28 Comments »

  1. risk of ora-01555 where t2 is so big it takes longer than your undo can hold, although in that case your cursor should have a minus (select t2 minus select t1) in case you’re catching up from being interrupted for whatever reason

    Comment by cars — June 24, 2016 @ 1:46 pm BST Jun 24,2016 | Reply

    • cars,

      Batching in pl/sql rather than a monolithic SQL statement could be justified by the ORA-01555 argument – but in this case (a) I don’t have any commits in the loop, so the ORA-01555 would be just as likely with either strategy and (b) commits in the batches would probably result in a “fetch out of sequence” error and the supporting code would have to cater for resuming at the right SCN (unless the process didn’t have to worry about point-in-time insertion).

      Comment by Jonathan Lewis — June 24, 2016 @ 8:44 pm BST Jun 24,2016 | Reply

  2. With plain SQL, if there’s a failure, the whole insert will be rolled back. With that PL/SQL block, whatever goes through without errors will still be inserted and you can deal with the errors after. I don’t always agree with the mantra “Never do in PL/SQL that which can be done in plain SQL”… PL/SQL can be much faster than plain SQL… I use PL/SQL bulk array fetch and insert vs insert/select when the select involves large joins and uses Bloom Filters (for joins and partition pruning). For some reason, which I cannot explain, a select (the exact same select) inside an insert/select will not use BF (Oracle bug?) so the select is many times longer in that case, making PL/SQL go multiple times faster than plain SQL (it can be 10-20 times faster) since the select part is alone and can use BF. Using 11.2.0.4.

    Comment by Louis — June 24, 2016 @ 2:24 pm BST Jun 24,2016 | Reply

    • “With that PL/SQL block, whatever goes through without errors will still be inserted and you can deal with the errors after.”

      Yes, provided that the exception handling does not do a RAISE or RAISE_APPLICATION_ERROR.

      The code fetches 100 rows at a time. If there is an error somewhere in there, the entire insert of those 100 rows will be rolled back no matter what. That is why Jonathan outputs m_tab(1).id, which is the ID of the first of those 100 rows. That’s where the insert needs to restart once the error is corrected somehow – and the answer to Jonathan’s question, I think…

      Comment by stewashton — June 24, 2016 @ 3:02 pm BST Jun 24,2016 | Reply

      • Stewashton, the comment says that proper error handling should go here… I agree that the 100 rows get rolled back. In order to implement a restart/recovery like you are saying, you would need to exit on the first error after displaying the key (which it does not currently), and the select would need an order by that key to guarantee the row order or else it will surely not work. You could end up with duplicate rows in the target table on restart if you don’t have a primary key and/or you could be missing some rows (keys smaller than the output key) since the select does not necessarily fetch the rows in order of that key without an order by clause.

        Comment by Louis — June 24, 2016 @ 3:49 pm BST Jun 24,2016 | Reply

        • Louis, I agree about the need for an exit on error, and above all an ORDER BY clause. Because of that, I now suspect Jonathan is thinking about something other than restartability.

          Comment by stewashton — June 24, 2016 @ 9:47 pm BST Jun 24,2016

      • Stew,

        I’ve stripped the code to minimise the on-screen impact, so there’s plenty of scope for adjusting details for best effect.
        Funnily enough the one bit of code I didn’t strip out was the debug statement that reported the first row of each batch – it hadn’t crossed my mind to use the value as a restart value.

        I’d have to check earlier versions, but in 12c the FORALL insert wouldn’t rollback the whole batch when it hit a row that produced an error, it’s just that row (and the ones following it in the array) that would fail to be inserted, so there’s a variety of things that could be done to “complete” this code fragment.

        Comment by Jonathan Lewis — June 24, 2016 @ 9:46 pm BST Jun 24,2016 | Reply

        • Jonathan,

          Thanks for your reply. I admit, I assumed without testing that FORALL would work a certain way.

          My assumption was based on Oracle’s promise of “statement level atomicity”: “a SQL statement is an atomic unit of work and either completely succeeds or completely fails.” Now, within the FORALL the INSERT is only executed once (see V$SQL for confirmation) and inserts multiple rows. I assumed “one execution” = “one statement”.

          After testing, that assumption appears to be false. Within a FORALL, there are many INSERT “statements” that only count as one “execution”.
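          (For anyone repeating the check, a query along the following lines against v$sql – the exact text of the insert will vary with the table definition – shows a single execution processing the whole array:)

          select  sql_text, executions, rows_processed
          from    v$sql
          where   sql_text like 'INSERT INTO T1 VALUES%'
          ;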

          Comment by stewashton — June 26, 2016 @ 1:49 pm BST Jun 26,2016

    • Louis,

      I hadn’t come across that “Bloom filter goes missing” effect when changing from “select” to “insert select” – if you’ve got a model I’d be interested to see it. Does it always seem to happen or is it just some particular shape of statement that gets the problem ? There is at least one other well-documented class of activity, though, where “select” and “insert / select” produce different plans – and that’s one good argument for adopting the PL/SQL approach.

      And yes, there is a big difference regarding rollback on error.

      UPDATE: I’ve just searched the MoS bug database using “insert select Bloom Filter” and the first hit was: “Bug 20112932 : BLOOM FILTER IS NOT CHOOSED WHEN EXECUTING INSERT SELECT STATEMENT” dated Nov 2014, closed as “Not a bug” ! (There were three other bugs referenced in the same document.)

      Comment by Jonathan Lewis — June 24, 2016 @ 8:47 pm BST Jun 24,2016 | Reply

        Hi Jonathan, thanks for the update and the bug number. Bloom filters (introduced in 10g) are one of the best new features in the Oracle database: they make a big difference to the speed of queries when used. But for me the optimizer always gives a different path (no BF) in DML statements. I saw it with an INSERT statement and also in the SELECT of a MERGE statement. I cannot get the optimizer to use BF in any DML. Strangely, the optimizer comes up with the same cost… but one is 20 times faster than the other, especially when partitioning is involved (bloom filter pruning). Example of plans below. With the INSERT in front of the select, a full table scan of a huge table is used, and with BF, only the partitions of the keys passed by a view are read. In those cases, I use a PL/SQL block with array fetch and insert and it is 20 times faster than plain SQL.

        SELECT only (BF is used):
        -------------------------------------------------------------------------------------------------------------------------------
        | Id  | Operation                          | Name            | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | Pstart| Pstop |
        -------------------------------------------------------------------------------------------------------------------------------
        |   0 | SELECT STATEMENT                   |                 |   1657K|   502M|       |  2482K  (5)| 08:16:27 |       |       |
        |*  1 |  VIEW                              |                 |   1657K|   502M|       |  2482K  (5)| 08:16:27 |       |       |
        |*  2 |   WINDOW SORT PUSHED RANK          |                 |   1657K|   145M|   235M|  2482K  (5)| 08:16:27 |       |       |
        |*  3 |    HASH JOIN                       |                 |   1657K|   145M|       |  2449K  (5)| 08:09:54 |       |       |
        |   4 |     PART JOIN FILTER CREATE        | :BF0000         |      2 |    32 |       |     9   (0)| 00:00:01 |       |       |
        |   5 |      VIEW                          |                 |      2 |    32 |       |     9   (0)| 00:00:01 |       |       |
        |   6 |       UNION-ALL                    |                 |        |       |       |            |          |       |       |
        |*  7 |        INDEX STORAGE FAST FULL SCAN| PK_STORE        |      1 |     4 |       |     3   (0)| 00:00:01 |       |       |
        |*  8 |        TABLE ACCESS STORAGE FULL   | WH              |      1 |    10 |       |     6   (0)| 00:00:01 |       |       |
        |   9 |     PARTITION RANGE ITERATOR       |                 |   1939M|   137G|       |  2443K  (4)| 08:08:41 |    37 |    59 |
        |  10 |      PARTITION HASH JOIN-FILTER    |                 |   1939M|   137G|       |  2443K  (4)| 08:08:41 |:BF0000|:BF0000|
        |* 11 |       TABLE ACCESS STORAGE FULL    | JCP_WK_ITEM_SOH |   1939M|   137G|       |  2443K  (4)| 08:08:41 |  2305 |  3776 |
        -------------------------------------------------------------------------------------------------------------------------------
        
        With INSERT in front of SELECT (no BF and very slow):
        ------------------------------------------------------------------------------------------------------------------------------------
        | Id  | Operation                          | Name                 | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | Pstart| Pstop |
        ------------------------------------------------------------------------------------------------------------------------------------
        |   0 | INSERT STATEMENT                   |                      |   1657K|   502M|       |  2482K  (5)| 08:16:27 |       |       |
        |   1 |  LOAD TABLE CONVENTIONAL           | JCP_WK_ITEM_SOH_HIST |        |       |       |            |          |       |       |
        |*  2 |   VIEW                             |                      |   1657K|   502M|       |  2482K  (5)| 08:16:27 |       |       |
        |*  3 |    WINDOW SORT PUSHED RANK         |                      |   1657K|   145M|   235M|  2482K  (5)| 08:16:27 |       |       |
        |*  4 |     HASH JOIN                      |                      |   1657K|   145M|       |  2449K  (5)| 08:09:54 |       |       |
        |   5 |      VIEW                          |                      |      2 |    32 |       |     9   (0)| 00:00:01 |       |       |
        |   6 |       UNION-ALL                    |                      |        |       |       |            |          |       |       |
        |*  7 |        INDEX STORAGE FAST FULL SCAN| PK_STORE             |      1 |     4 |       |     3   (0)| 00:00:01 |       |       |
        |*  8 |        TABLE ACCESS STORAGE FULL   | WH                   |      1 |    10 |       |     6   (0)| 00:00:01 |       |       |
        |   9 |      PARTITION RANGE ITERATOR      |                      |   1939M|   137G|       |  2443K  (4)| 08:08:41 |    37 |    59 |
        |  10 |       PARTITION HASH ALL           |                      |   1939M|   137G|       |  2443K  (4)| 08:08:41 |     1 |    64 |
        |* 11 |        TABLE ACCESS STORAGE FULL   | JCP_WK_ITEM_SOH      |   1939M|   137G|       |  2443K  (4)| 08:08:41 |  2305 |  3776 |
        ------------------------------------------------------------------------------------------------------------------------------------
        

        Comment by Louis — June 26, 2016 @ 3:24 pm BST Jun 26,2016 | Reply

        • Louis,

          Thanks for the example.
          I’ll probably run up a demo in the not too distant future. I’ll also try to find a little time to see if there are any details about why (and when) it’s “not a bug” for the Bloom filter to disappear.

          Comment by Jonathan Lewis — June 26, 2016 @ 6:46 pm BST Jun 26,2016

        • Jonathan, from the bug you sent (which Oracle concluded is not a bug), there were 3 other referenced bugs. I believe the one I hit is this one, which is a real bug: Bug 13801198 BLOOM PRUNING/FILTER NOT USED IN DML STATEMENTS. And that bug refers to base bug “Bug 18949550 : MERGE QUERY NOT USING BLOOM FILTER” which will be fixed in 12.2 only. But what about users of 11gR2 like me?

          Comment by Louis — June 26, 2016 @ 7:41 pm BST Jun 26,2016

        • Louis,

          A standard approach would be to raise an SR describing your problem, noting that you have found a reported bug that matches and has a fix, and requesting a backport.

          Sometimes backports are literally not possible, and sometimes Oracle will only create them for terminal releases of earlier versions, but they do seem to do them for “recent” releases fairly frequently.

          Comment by Jonathan Lewis — June 27, 2016 @ 8:31 am BST Jun 27,2016

        • Louis,

          I may have found a partial workaround for you.

          I decided to see if I could build a quick model of the problem and hacked a little script I’d written to demo partition Bloom filters so that it changed a select into an insert/select. On the first hack the Bloom filter disappeared but was replaced by a subquery filter (partition start / stop = KEY(SQ)); so I modified the data to make the optimizer decide not to do subquery filtering, and then added the subquery_pruning() hint. The hint worked. (I had also tried the px_join_filter() hint, but that didn’t work.)

          In my case I didn’t have composite partitioning so my hint was subquery_pruning(@sel$1 pt@sel$1 partition) — where pt was the partitioned large table.
          I don’t know how (or even if) this strategy can be applied to the example you’ve shown; if it can I think your filter is on the subpartition so you’d need to change the key word in the hint; there’s also the problem that your build “table” is an inline non-mergeable view, so the @sel$1 might be wrong as the query block identifier (though it might be the originating query block for your partitioned table).
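          (For illustration, with made-up names – pt the partitioned fact table, dim the small build table, hist the target – the hint would sit in the insert/select something like this:)

          insert into hist
          select  /*+ subquery_pruning(@sel$1 pt@sel$1 partition) */
                  pt.*
          from    dim, pt
          where   pt.part_key = dim.part_key
          ;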

          Comment by Jonathan Lewis — June 27, 2016 @ 1:46 pm BST Jun 27,2016

        • Thanks for the advice to ask for a backport and thanks for the possible workaround using hints, I will give it a try.

          Comment by Louis — June 27, 2016 @ 2:38 pm BST Jun 27,2016

  3. Louis – wouldn’t using “log errors into” solve this problem? This doesn’t solve the 1555 situation, but it does allow you to avoid a single record ruining your entire insert.

    Comment by Tony — June 24, 2016 @ 2:57 pm BST Jun 24,2016 | Reply

    • Tony,

      That gives you a strategy for getting most of the data in and not doing a massive rollback. The drawback to the “log errors” clause, though, is that it turns the array processing mechanism of insert/select into single row processing – i.e. vastly more undo and redo.

      [Update – this anomaly seems to have been fixed by 12c]

      Comment by Jonathan Lewis — June 24, 2016 @ 9:49 pm BST Jun 24,2016 | Reply

  4. One example is when t2 is a view referencing data over a db link: we can use this approach to have the select run on the remote side. With DML the target site is always the driving site. You had earlier posts on distributed DML where we discussed this in the comments.

    Comment by Andras Gabor — June 24, 2016 @ 4:04 pm BST Jun 24,2016 | Reply

    • Andras,

      Definitely a good use case – the earlier comment is here; though the view would have to include a distributed join before the mechanism would be needed – and then you might only do this when the query was more efficient with the remote site as the driving site.

      Comment by Jonathan Lewis — June 24, 2016 @ 9:52 pm BST Jun 24,2016 | Reply

  5. Should the exit condition be tab.count = 0 rather than c1%notfound?

    Comment by Phil Singer — June 24, 2016 @ 7:21 pm BST Jun 24,2016 | Reply

    • Phil,

      When using BULK COLLECT you need to deal with the “last few” rows fetched, and %notfound is set when the number of rows fetched is less than the limit.
      Admittedly my code goes round the fetch loop one time more than it needs to if the last fetch matches the limit (and then does nothing in the inner loop because the count will be zero).
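      (For comparison, the count-based version of the loop would look something like the sketch below; it never processes an empty array, but it always finishes with one empty fetch, whereas the %notfound test at the bottom of the loop avoids that final fetch whenever the last batch is smaller than the limit:)

      loop
              fetch c1 bulk collect into m_tab limit 100;
              exit when m_tab.count = 0;

              forall i in 1..m_tab.count
                      insert into t1 values m_tab(i);

      end loop;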

      Comment by Jonathan Lewis — June 24, 2016 @ 9:57 pm BST Jun 24,2016 | Reply

  6. I don’t remember exactly under which conditions, but I guess if a database link is involved a PL/SQL approach can better the SQL-only approach. So, in this case, if t2 is a remote table, the PL/SQL approach would be more performant than the SQL-only approach (I guess).

    Comment by Narendra — June 24, 2016 @ 9:01 pm BST Jun 24,2016 | Reply

  7. So, how about the APPEND hint and using Direct load, Commit is gone.

    Comment by Tim Stonehouse — June 25, 2016 @ 8:19 am BST Jun 25,2016 | Reply

    • Tim,

      Horses for courses: we should always try to identify the most appropriate mechanism and the /*+ append */ hint has its place, but the only detail it addresses is a (not particularly dramatic) reduction in undo (and its associated redo).

      In passing, there are various reasons why the append hint is invalid – most significant, perhaps, is the presence of a foreign key constraint in the table’s declaration.
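      (For reference, the direct-path version of the load is simply the following – and remember that after a direct-path insert the session has to commit before it can query the table again:)

      insert  /*+ append */
      into    t1
      select  *
      from    t2
      ;

      commit;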

      Comment by Jonathan Lewis — June 25, 2016 @ 9:04 am BST Jun 25,2016 | Reply

  8. […] of the comments on my recent posting about “Why use pl/sql bulk strategies over simple SQL” pointed out that it’s not just distributed queries that can change plans dramatically when […]

    Pingback by DML and Bloom | Oracle Scratchpad — July 8, 2016 @ 1:01 pm BST Jul 8,2016 | Reply

  9. […] select from table(pipeline_function)”, or to write a pl/sql block that opens a cursor to do a select with bulk collect and loops through an array insert. The overhead in both cases is likely to be relatively small (especially when compared with the […]

    Pingback by Union All MV | Oracle Scratchpad — July 12, 2016 @ 10:10 am BST Jul 12,2016 | Reply

