From time to time a question comes up on OTN that results in someone responding with the mantra: “Never do in PL/SQL that which can be done in plain SQL”. It’s a theme I’ve mentioned a couple of times before on this blog, most recently with regard to Bryn Llewellyn’s presentation on transforming one table into another and Stew Ashton’s use of Analytic functions to solve a problem that I got stuck with.
Here’s a different question that challenges that mantra. What’s the obvious reason why someone might decide to produce the following code rather than writing a simple “insert into t1 select * from t2;”:
declare cursor c1 is select * from t2 ; type c1_array is table of c1%rowtype index by binary_integer; m_tab c1_array; begin open c1; loop fetch c1 bulk collect into m_tab limit 100; begin forall i in 1..m_tab.count insert into t1 values m_tab(i); exception when others then begin -- proper exception handling should go here dbms_output.put_line(m_tab(1).id); dbms_output.put_line(sqlerrm); end; end; exit when c1%notfound; end loop; close c1; end; /
There is a very good argument for this approach.
Follow-up (Saturday 25th)
As Andras Gabor pointed out in one of the comments, there are documented scenarios where the execution plan for a simple select statement is not legal for the select part of an “insert into .. select …” statement. Specifically, if you have a distributed query the most efficient execution plan may require the remote site to be the driving site, but the plan for a CTAS or insert/select is required to use the local site as the driving site.
There are workarounds – if you’re allowed to use them – such as creating a view at the remote site and selecting from the view, or you could create a pipelined function locally and select from the pipelined function (but that’s going to be writing PL/SQL anyway, and you’d have to create one or two object types in the database to implement it).s
Another example of plan limitations, that I had not seen before (but have now found documented as “not a bug in MoS note 20112932”), showed up in a comment from Louis: a select statement may run efficiently because the plan uses a Bloom filter, but the filter disappears when the statement is used in insert/select.
These limitations, however, were not the point I had in mind. The “obvious” reason for taking the pl/sql approach is error handling. What happens if one of the rows in your insert statement raises an Oracle exception ? The entire statement has to rollback. If you adopt the PL/SQL array processing approach then you can trap each error as it occurs and decide what to do about it – and there’s an important detail behind that statement that is really important: the PL/SQL can operate at virtually the same speed as the simple SQL statement once you’ve set the arraysize to a value which allows each insert to populate a couple of blocks.
Let me emphasise the critical point of the last sentence: array inserts in PL/SQL operate at (virtually) the speed of the standard SQL insert / select.
As it stands I don’t think the exception handler in my code above could detect which row in the batch had caused the error – I’ve just printed the ID from the first row in the batch as a little debug detail that’s only useful to me because of my knowledge of the data. Realistically the PL/SQL block to handle the inserts might look more like the following:
-- In program declaration section dml_errors exception; pragma exception_init(dml_errors, -24381); m_error_pos number(6,0) := 0; -- ------------------------------ begin forall i in 1..m_tab.count save exceptions insert into t1 values m_tab(i); exception when dml_errors then begin for i in 1..sql%bulk_exceptions.count loop dbms_output.put_line( 'Array element: ' || sql%bulk_exceptions(i).error_index || ' ' || sqlerrm(-sql%bulk_exceptions(i).error_code) ); m_error_pos := sql%bulk_exceptions(i).error_index; dbms_output.put_line( 'Content: ' || m_tab(m_error_pos).id || ' ' || m_tab(m_error_pos).n1 ); end loop; end; when others then raise; end;
You’ll notice that I’ve added the SAVE EXCEPTIONS clause to the FORALL statement. This allows Oracle to trap any errors that occur in the array processing step and record details of the guilty array element as it goes along, storing those details in an array calls SQL%BULK_EXCEPTIONS. My exception handler then handles the array processing exception by walking through that array.
I’ve also introduced an m_error_pos variable (which I could have declared inside the specific exception handler) to remove a little of the clutter from the line that shows I can identify exactly which row in the source data caused the problem. With a minimum of wasted resources this code now inserts all the valid rows and reports the invalid rows (and, if necessary, could take appropriate action on each invalid row as it appears).
If you’ve got a data loading requirement where almost all the data is expected to be correct but errors occasionally happen, this type of coding strategy is likely to be the most efficient thing you could do to get your data into the database. It may be slightly slower when there are no errors, but that’s a good insurance premium when compared with the crash and complete rollback that occurs if you take the simple approach – and there are bound to be cases where a pre-emptive check of all the data (that would, probably, make the insert safe) would add far more overhead than the little bit of PL/SQL processing shown here.
It’s obviously a little difficult to produce any time-based rates that demonstrate the similarity in performance of the SQL and PL/SQL approaches – the major time component in a little demo I built was about the I/O rather than the the CPU (which, in itself, rather validates the claim anyway). But if you want to do some testing here’s my data model with some results in the following section:
rem rem Script: plsql_loop_insert.sql rem Author: Jonathan Lewis rem execute dbms_random.seed(0) create table t1 nologging as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 ) select cast(rownum as number(8,0)) id, 2 * trunc(dbms_random.value(1e10,1e12)) n1, cast(lpad('x',100,'x') as varchar2(100)) padding from generator v1, generator v2 where rownum <= 1e6 ;
create table t2 nologging noparallel as select /*+ no_parallel(t1) */ id + 1e6 id, n1 - 1 n1, rpad('x',100,'x') padding from t1 ; -- update t2 set n1 = n1 + 1 where id = 2e6; -- update t2 set n1 = n1 + 1 where id = 2e6 - 10; -- update t2 set n1 = n1 + 1 where id = 2e6 - 20; -- update t2 set n1 = n1 + 1 where id = 1750200; -- update t2 set n1 = n1 + 1 where id = 1500003; -- update t2 set n1 = n1 + 1 where id = 1500001; commit; alter system checkpoint; alter system switch logfile; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'T2', method_opt => 'for all columns size 1' ); end; / create unique index t1_i1 on t1(n1) nologging; create unique index t1_pk on t1(id) nologging; alter table t1 add constraint t1_pk primary key(id);
I’ve generated 1 million rows with an id column and a random integer – picking the range of the random numbers to give me a very good chance (that worked) of getting unique set of values. I’ve doubled the random values I use for t1 so that I can substract 1 and still guarantee uniqueness when I generate the t2 values (I’ve also added 1 million to the id value for t2 for the same uniqueness reasons).
The optional update to add 1 to a scattering of rows in t2 ensures that those values go back to their original t1 values so that they can cause “duplicate key” errors. The SQL insert was a simple insert into t1 select * from t2 (ensuring that parallel query didn’t come into play), and the PL/SQL detail I used was as follows:
declare cursor c1 is select /*+ no_parallel(t2) */ * from t2 ; type c1_array is table of c1%rowtype index by binary_integer; m_tab c1_array; dml_errors exception; pragma exception_init(dml_errors, -24381); m_error_pos number(6,0) := 0; begin open c1; loop fetch c1 bulk collect into m_tab limit 100; begin forall i in 1..m_tab.count save exceptions insert into t1 values m_tab(i); exception when dml_errors then begin for i in 1..sql%bulk_exceptions.count loop dbms_output.put_line( 'Array element: ' || sql%bulk_exceptions(i).error_index || ' ' || sqlerrm(-sql%bulk_exceptions(i).error_code) ); m_error_pos := sql%bulk_exceptions(i).error_index; dbms_output.put_line( 'Content: ' || m_tab(m_error_pos).id || ' ' || m_tab(m_error_pos).n1 ); end loop; end; when others then raise; end; exit when c1%notfound; -- when fetch < limit end loop; close c1; end; /
The PL/SQL output with one bad row (2e6 – 20) looked like this:
Array element: 80 ORA-00001: unique constraint (.) violated Content: 1999980 562332925640
Here are some critical session statistics for different tests in 11g:
No bad data, insert select -------------------------- Name Value ---- ----- CPU used when call started 944 CPU used by this session 944 DB time 1,712 redo entries 1,160,421 redo size 476,759,324 undo change vector size 135,184,996 No bad data, PL/SQL loop ------------------------ Name Value ---- ----- CPU used when call started 990 CPU used by this session 990 DB time 1,660 redo entries 1,168,022 redo size 478,337,320 undo change vector size 135,709,056 Duplicate Key (2e6-20), insert select (with huge rollback) ---------------------------------------------------------- Name Value ---- ----- CPU used when call started 1,441 CPU used by this session 1,440 DB time 2,427 redo entries 2,227,412 redo size 638,505,684 undo change vector size 134,958,012 rollback changes - undo records applied 1,049,559 Duplicate Key (2e6-20), PL/SQL loop - bad row reported ------------------------------------------------------ Name Value ---- ----- CPU used when call started 936 CPU used by this session 936 DB time 1,570 redo entries 1,168,345 redo size 478,359,528 undo change vector size 135,502,488 rollback changes - undo records applied 74
Most of the difference between CPU time and DB time in all the tests was file I/O time (in my case largerly checkpoint wait time, I had small log files, but in larger systems it’s quite common to see a lot of time spent on db file sequential reads as index blocks are read for update). You can see that there’s some “unexpected” variation in CPU time – I wasn’t expecting the PL/SQL loop that failed after nearly 1M inserts to use less CPU than anything else – but the CPU numbers fluctuated a few hundredths of a second across tests, this just happened to be particularly noticeable with the first one I did – so to some extent this was probably affected by background activity relating to space management, job queue processing and all the other virtual machines on the system.
Critically I think it’s fair to say that the differences in CPU timing are not hugely significant across a reasonably sized data set, and most importantly the redo and undo hardly vary at all between the successful SQL and both PL/SQL tests. The bulk processing PL/SQL approach doesn’t add a dramatic overhead – but it clearly does bypass the threat of a massive rollback.
You might want to argue the case for using basic SQL with the log errors clause. The code method is simple and it gives you a table of rows which have caused exceptions as the insert executed – and that may be sufficient for your purposes; but there’s a problem until you upgrade to 12c.
Here’s how I had to modify my test case to demonistrate the method:
begin dbms_errlog.create_error_log('t1'); end; / insert into t1 select * from t2 log errors reject limit unlimited ;
The procedure call creates a table to hold the bad rows, by default it’s name will be err$_t1, and it will be a clone of the t1 table with changes to column types (which might be interseting if you’ve enable 32K columns in 12c — to be tested) and a few extra columns:
SQL> desc err$_t1 Name Null? Type ----------------------------- -------- -------------------- ORA_ERR_NUMBER$ NUMBER ORA_ERR_MESG$ VARCHAR2(2000) ORA_ERR_ROWID$ ROWID ORA_ERR_OPTYP$ VARCHAR2(2) ORA_ERR_TAG$ VARCHAR2(2000) ID VARCHAR2(4000) N1 VARCHAR2(4000) PADDING VARCHAR2(4000) SQL> execute print_table('select * from err$_t1') ORA_ERR_NUMBER$ : 1 ORA_ERR_MESG$ : ORA-00001: unique constraint (TEST_USER.T1_I1) violated ORA_ERR_ROWID$ : ORA_ERR_OPTYP$ : I ORA_ERR_TAG$ : ID : 1999980 N1 : 562332925640 PADDING : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
SO what’s the problem with logging errors ? Here are the sets of session stats corresponding to the ones that I reported above for the SQL and PL/SQL options. The first set comes from running this test on 220.127.116.11, the second from 18.104.22.168:
11g results =========== Name Value ---- ----- CPU used when call started 1,534 CPU used by this session 1,534 DB time 2,816 redo entries 3,113,105 redo size 902,311,860 undo change vector size 269,307,108 12c results =========== Name Value ---- ----- CPU used when call started 801 CPU used by this session 801 DB time 3,061 -- very long checkpoint waits !! redo entries 1,143,342 redo size 492,615,336 undo change vector size 135,087,044
Ihe 12c stats are very sinilar to the stats from the perfect SQL run and the two PL/SQL runs – but if you look at the 11g stats you’ll see that they’re completely different from all the other stats. The number of redo entries (if nothing else) tells you that Oracle has dropped back from array processing to single row processing in order to be able to handle the error logging (1 million rows, one entry for each row, it’s PK index entry, and the unique key index entry.)
Until 12c error logging is just row by row processing.
As far as I can tell, I first pointed out this “single row processing” aspect of the log errors option some time around December 2005.
While looking for a posting about efficient updates I came across another of my posting that compares SQL with PL/SQL for updates – it’s worth a read.