Oracle Scratchpad

July 3, 2019

DB links

Filed under: distributed,Oracle,Performance — Jonathan Lewis @ 1:16 pm BST Jul 3,2019

A recent thread on the ODC SQL & PL/SQL forum raised the question of how data moves across a database link – is it row by row or array based (or other)? It’s a good question as it’s fairly common knowledge that distributed queries can be very much slower than the same query when executed on an equivalent set of local tables, so it’s worth having a little poke around to see if there’s anything in the way the data moves across the link that could be tweaked to improve performance.

It’s fairly easy to get some ideas by running a couple of queries and checking session activity stats and wait events – so here’s a little test conducted between a database running 12.2.0.1 and a database running 11.2.0.4. For this type of testing it’s probably sensible to use two databases rather than faking things with a loopback database link in case the loopback introduces some special features (a.k.a. dirty tricks).


rem
rem     Script:         db_link_usage.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2019
rem
rem     Last tested 
rem             12.2.0.1 -> 11.2.0.4
rem

prompt  ==================================
prompt  SQL*Plus 20,000 rows, arraysize 20
prompt  ==================================

set feedback only
set arraysize 20

execute snap_my_stats.start_snap
execute snap_events.start_snap
select * from all_objects@d11g where rownum <= 20000;

set feedback on
execute snap_events.end_snap
execute snap_my_stats.end_snap

prompt  ====================================
prompt  SQL*Plus 20,000 rows, arraysize 2000
prompt  ====================================

set feedback only
set arraysize 2000

execute snap_my_stats.start_snap
execute snap_events.start_snap
select * from all_objects@d11g where rownum <= 20000;

set feedback on
execute snap_events.end_snap
execute snap_my_stats.end_snap

All I’ve done is select 20,000 rows from view all_objects in a remote database with two different settings for the array fetch size. The “feedback only” option is a 12c option that means the session doesn’t output the data; it shows only the final “N rows selected” report. The two “snap” packages are some code I wrote a couple of decades ago to report changes in my session’s activity stats and wait events.
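
If you don’t happen to have anything like the snap packages to hand, a rough and ready substitute (just a sketch – you have to run it before and after the test and subtract the two sets of figures yourself) is to query v$session_event for your own session:

select  event, total_waits, total_timeouts, time_waited, max_wait
from    v$session_event
where   sid = sys_context('userenv','sid')
order by
        event
;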

It turned out that there was very little of interest in the session activity stats, although the impact of SQL*Net compression is always worth a quick reminder (here and here). The wait events (specifically the wait counts), on the other hand, are most revealing:

================================== 
SQL*Plus 20,000 rows, arraysize 20
================================== 
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                       1,004           0          15.69        .016     414,828
SQL*Net message to dblink                         1,004           0           0.04        .000           0
SQL*Net message from dblink                       1,004           0          61.02        .061          11
SQL*Net message to client                         1,004           0           0.11        .000           0

====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                          14           0           4.88        .349     414,828
SQL*Net message to dblink                            13           0           0.00        .000           0
SQL*Net message from dblink                          13           0           7.72        .594          11
SQL*Net message to client                            14           0           0.00        .000           0

SQL*Net more data from dblink                        96           0           6.26        .065           6
SQL*Net more data to client                          96           0           0.16        .002           4

Doing a little arithmetic (and allowing for a few “metadata” messages that introduce small variations in the numbers), we can see that when we fetched the 20,000 rows with an arraysize of 20 this turned into 1,000 (fetch) calls from SQL*Plus to the server, with a matching 1,000 calls from the local server to the remote server. When the arraysize goes up to 2,000 rows, though, the SDU (session data unit) setting for SQL*Net is too small to hold the whole of a single fetch and we see a single fetch from SQL*Plus turning into one “SQL*Net message to client” accompanied by 7 or 8 “SQL*Net more data to client” waits, with exactly the same pattern of conversation between the local server and the remote server. You could imagine the conversation as something like:

  • Client to local server: “give me 2,000 rows”
  • local server to remote server: “give me 2,000 rows”
  • remote server to local server: “Here, I can manage 120 rows as the first installment”
  • local server to client: “Here’s a first installment”
  • Client to local server: “Got it, where’s the rest?”     *** but no message recorded in the session activity stats
  • Local server to remote server: “Where’s the rest?”
  • Remote server to local server: “You want more – here’s another 120 rows”
  • Local server to client: “Here’s a second installment”
  • Client to local server: “Got it, where’s the rest?”     *** but no message recorded in the session activity stats
  • etc.

In this example the volume of data sent back to the client in each message was limited by the SDU size negotiated between the local server and the remote server as the link opens. In my case I happened to have the default (apparently 8KB) as the SDU_SIZE in the 12c sqlnet.ora and a non-default 32KB setting in the 11g sqlnet.ora, so the negotiated SDU between the servers was 8KB (the lower of the two).
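
(For reference, the default I’m talking about is controlled by the DEFAULT_SDU_SIZE parameter in sqlnet.ora – the value below is just an example to show the syntax, not a recommendation:)

# sqlnet.ora -- example only
DEFAULT_SDU_SIZE = 32767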

Here’s what the figures looked like after I had restarted with the SDU set to 16KB in the 12c tnsnames.ora, and 16KB in the 11g listener.ora:


==================================
SQL*Plus 20,000 rows, arraysize 20
==================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                       1,004           0          24.23        .024         588
SQL*Net message to dblink                         1,009           0           0.06        .000           0
SQL*Net message from dblink                       1,010           0          77.76        .077           4
SQL*Net message to client                         1,004           0           0.15        .000           0

====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                          14           0           1.61        .115         588
SQL*Net message to dblink                            13           0           0.00        .000           0
SQL*Net message from dblink                          13           0           4.21        .324           4
SQL*Net message to client                            14           0           0.00        .000           0

SQL*Net more data from dblink                        45           0          13.53        .301           6
SQL*Net more data to client                          45           0           0.13        .003           0

The first set of figures (arraysize 20) doesn’t change. If an 8KB SDU is large enough to hold an array of 20 rows then it’s not going to make a difference when the SDU is increased. In the second set of figures we see that for each “SQL*Net message from dblink” we now have roughly 3 “SQL*Net more data from dblink” (with matching counts for “SQL*Net more data to client”). With an SDU of double the size it now takes a total of roughly 4 packets to transmit the array fetch rather than the 8 or 9 we saw with the smaller SDU size.

As a final test with SQL*Plus, what happens if we set the SDU size at 32K for the 12c database (and that means for the client calling in and the server calling out) and 8KB for the 11g database? The client negotiates a 32KB SDU with the 12c server, but the 12c server negotiates an 8KB SDU with the 11g server. Here are the stats for just the test with the large fetch size in this case:


====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================
Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                          14           0           5.30        .379     214,570
SQL*Net message to dblink                            13           0           0.00        .000           0
SQL*Net message from dblink                          13           0           4.09        .314          13
SQL*Net message to client                            14           0           0.00        .000           0

SQL*Net more data from dblink                        96           0          14.46        .151           6
SQL*Net more data to client                          20           0           0.08        .004           0

We get the same 10 (plus a few) “message to/from client/dblink”, but now the “more data” waits are dramatically different. When the client calls for the “next fetch” the local server has to call the remote server 4 or 5 times to get enough 8KB data packets to fill a single 32KB packet to return to the client. You can confirm this (and some of my descriptions of the earlier behaviour) by enabling extended tracing for wait states. Here’s an extract of 5 consecutive lines from a suitable trace file (you have to ignore the “#bytes” parameter on the “SQL*Net more data from dblink” waits; they’ve always been garbage):


WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 2793 driver id=1413697536 #bytes=7 p3=0 obj#=-1 tim=703316387674
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 34 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=703316388447
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 8 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=703316389134
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 16 driver id=1413697536 #bytes=12 p3=0 obj#=-1 tim=703316389818
WAIT #140028098478296: nam='SQL*Net more data to client' ela= 73 driver id=1413697536 #bytes=32671 p3=0 obj#=-1 tim=703316389960
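
(If you want to reproduce this: the extract came from the standard wait tracing, which you can enable for your own session with event 10046 at level 8 – remember to switch it off when you’re done.)

alter session set events '10046 trace name context forever, level 8';
-- run the test, then:
alter session set events '10046 trace name context off';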

Summary Note

Oracle will use the negotiated SDU to do array fetches across a database link. Since a larger arraysize can (for large volumes of data) reduce the total workload on the remote database and on the network, you may want the local server to have a specifically crafted tnsnames.ora entry and the remote server to expose a specific service with a matching SDU size to help improve the efficiency of transferring data between the two databases.
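
As a sketch of the sort of thing I mean (the alias, host and service names below are just placeholders), the tnsnames.ora entry referenced by the database link could carry an explicit SDU, with a matching SDU_SIZE configured for the service at the remote end:

d11g_bulk =
        (description =
                (sdu = 32767)
                (address = (protocol = tcp)(host = remote_host)(port = 1521))
                (connect_data = (service_name = d11g_service))
        )

with the database link then created to use that alias (the credentials here are placeholders too):

create database link d11g connect to test_user identified by test_user using 'd11g_bulk';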

Lagniappe

If you want to look at other cases of how array fetching and SDU sizes interact, here are a couple of examples of using PL/SQL to execute SQL that fetches data across database links. The first is a simple, embedded “cursor for loop” that (apparently) does “row by row” processing – although an enhancement appeared many versions ago to make this type of code use array fetching of 100 rows under the covers. The second demonstrates the effects of an explicit cursor with “fetch, bulk collect, limit”:


execute snap_events.start_snap

declare
        m_total number := 0;
        m_ct    number := 0;
begin
        for c1 in (select * from all_objects@d11g where rownum < 20000) loop
                m_total := m_total + c1.object_id;
                m_ct    := m_ct + 1;
        end loop;
end;
/

execute snap_events.end_snap

execute snap_events.start_snap

declare

        cursor c1 is  select * from all_objects@d11g where rownum <= 20000;

        type c1_array is table of c1%rowtype index by binary_integer;
        m_tab c1_array;

        m_ct number := 0;

begin
        open c1;
        loop
                fetch   c1
                bulk collect
                into    m_tab
                limit   1000
                ;

                exit when c1%notfound;
                m_ct := m_ct + 1;

        end loop;
        close c1;
end;
/

execute snap_events.end_snap

Checking the session events for these two tests we see the following when the SDU_SIZE has been set at 16KB at both ends of the database link (which means the negotiated SDU will be 16KB):


Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                           4           0           0.66        .165         588 
SQL*Net message to dblink                           203           0           0.05        .000           0
SQL*Net message from dblink                         203           0          38.51        .190           5
SQL*Net message to client                             4           0           0.00        .000           0

Event                                             Waits   Time_outs           Csec    Avg Csec    Max Csec
-----                                             -----   ---------           ----    --------    --------
SQL*Net message from client                           4           0           0.08        .021         588
SQL*Net message to dblink                            24           0           0.01        .000           0
SQL*Net message from dblink                          24           0           9.19        .383           5
SQL*Net more data from dblink                        40           0           8.47        .212           6
SQL*Net message to client                             4           0           0.00        .000           0

The critical information we can see in the first example is that getting 20,000 rows from the remote database “row by row” takes 200 (plus noise) round-trips – showing that Oracle really is converting our single-row processing loop into array fetches of 100 rows.

The second example shows us that (allowing for a little noise, again) it takes 20 messages to the remote database to fetch 20,000 rows – so 1,000 rows per message – but the SDU size is too small and we have to send each array in three consecutive packets, the “message from dblink” wait and two “more data from dblink” waits.

Footnote

I did point out that the session activity stats for these tests weren’t interesting. There is, however, one detail worth mentioning since you could otherwise be fooled into thinking the number of packets travelling between the databases was smaller than it really was. There are two statistics about network roundtrips:


Name                                         Value
----                                         -----
SQL*Net roundtrips to/from client               18         
SQL*Net roundtrips to/from dblink               13

A “roundtrip” is a “message to / message from” pair. It doesn’t account for traffic due to “more data” being requested or sent.
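
For reference, a quick way to check those two statistics for your own session (a simple sketch joining v$mystat to v$statname):

select  sn.name, ms.value
from    v$mystat        ms,
        v$statname      sn
where   sn.statistic# = ms.statistic#
and     sn.name like 'SQL*Net roundtrips%'
order by
        sn.name
;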

 

19 Comments »

  1. Hi,
    Thank you for the information shared with us.
    However, I would like to know: in case a single plain insert statement has a high round trip value, then how should one tackle it?

    Thanks,
    Krishna

    Comment by Krishna — July 3, 2019 @ 6:31 pm BST Jul 3,2019 | Reply

    • Krishna,

      You need to give a little more information, there are too many ways to interpret “single insert statement”.
      I’ll assume you mean “insert as select” where the select statement is either complete from a remote join or from a distributed join.
      If it’s from a purely remote statement the only optimisation is to make the SDU as large as possible so the array that travels across the network is as large as possible and the traffic overheads are reduced. There are a couple of other little network tweaks that can help, but they tend to be good by default in recent versions of Oracle.

      If it’s a distributed query the most significant problem is that the query has to be optimised and managed by the database where the insert is taking place, so the path for the SELECT may be efficient on its own while the path for the INSERT AS SELECT changes dramatically. In this case the best solution may be to write a pipelined function wrapping the select so that you can do “insert as select from pipelined function.” There’s an example here: https://jonathanlewis.wordpress.com/2010/10/07/distributed-pipelines/
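
      As a very rough sketch of the shape such a wrapper can take (this is not the code from the linked article – the table local_copy, the column list and the fetch limit are just placeholders, and you need a version that accepts PL/SQL record types as pipelined element types, i.e. 12c onwards; on earlier versions you would declare SQL object types instead):

      create or replace package pipe_pkg as
              type obj_rec is record (object_id number, object_name varchar2(128));
              type obj_tab is table of obj_rec;
              function remote_rows return obj_tab pipelined;
      end pipe_pkg;
      /

      create or replace package body pipe_pkg as
              function remote_rows return obj_tab pipelined is
                      cursor c1 is
                              select object_id, object_name
                              from   all_objects@d11g;
                      type c1_array is table of obj_rec index by binary_integer;
                      m_tab   c1_array;
              begin
                      open c1;
                      loop
                              -- array fetch across the link, then pipe the rows to the caller
                              fetch c1 bulk collect into m_tab limit 1000;
                              for i in 1 .. m_tab.count loop
                                      pipe row (m_tab(i));
                              end loop;
                              exit when c1%notfound;
                      end loop;
                      close c1;
                      return;
              end remote_rows;
      end pipe_pkg;
      /

      -- the insert is now driven by a purely local row source
      insert into local_copy (object_id, object_name)
      select object_id, object_name
      from   table(pipe_pkg.remote_rows);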

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — July 3, 2019 @ 8:19 pm BST Jul 3,2019 | Reply

  2. I’ve been looking for ways to improve the fetches done by:
    (a) a user query (similar to yours) being executed from an application (e.g. a screen presented by a java application)
    (b) an MV refresh scheduled through either dbms_job or dbms_scheduler — your PLSQL example is custom code; can we rewrite materialized view refreshes to use bulk collect?

    Comment by Hemant K Chitale — July 4, 2019 @ 3:08 am BST Jul 4,2019 | Reply

    • Hemant,

      I’ve just done a quick test (with a call to dbms_mview.refresh(‘mvname’,’F’)) using a copy of all_objects on 11g with a materialized view log, and a fast refresh on demand materialized view on 12c. After updating a few hundred rows of the table, and deleting a few dozen, I traced the refresh. It did an array fetch of the data to be applied and the object_ids to be deleted, using the negotiated SDU size.

      Sample from tkprof’ed trace:

      
      SELECT /*+ remote_mapped("D11G") */ CURRENT$."OWNER",CURRENT$."OBJECT_NAME",
        CURRENT$."SUBOBJECT_NAME",CURRENT$."OBJECT_ID",CURRENT$."DATA_OBJECT_ID",
        CURRENT$."OBJECT_TYPE",CURRENT$."CREATED",CURRENT$."LAST_DDL_TIME",
        CURRENT$."TIMESTAMP",CURRENT$."STATUS",CURRENT$."TEMPORARY",
        CURRENT$."GENERATED",CURRENT$."SECONDARY",CURRENT$."NAMESPACE",
        CURRENT$."EDITION_NAME"
      FROM
       (SELECT "T1"."OWNER" "OWNER","T1"."OBJECT_NAME" "OBJECT_NAME",
        "T1"."SUBOBJECT_NAME" "SUBOBJECT_NAME","T1"."OBJECT_ID" "OBJECT_ID",
        "T1"."DATA_OBJECT_ID" "DATA_OBJECT_ID","T1"."OBJECT_TYPE" "OBJECT_TYPE",
        "T1"."CREATED" "CREATED","T1"."LAST_DDL_TIME" "LAST_DDL_TIME",
        "T1"."TIMESTAMP" "TIMESTAMP","T1"."STATUS" "STATUS","T1"."TEMPORARY"
        "TEMPORARY","T1"."GENERATED" "GENERATED","T1"."SECONDARY" "SECONDARY",
        "T1"."NAMESPACE" "NAMESPACE","T1"."EDITION_NAME" "EDITION_NAME" FROM
        "T1"@"D11G" "T1") CURRENT$, (SELECT DISTINCT MLOG$."OBJECT_ID" FROM
        "TEST_USER"."MLOG$_T1"@"D11G" MLOG$ WHERE "SNAPTIME$$" > :1 AND
        ("DMLTYPE$$" != 'D')) LOG$ WHERE CURRENT$."OBJECT_ID" = LOG$."OBJECT_ID"
      
      
      call     count       cpu    elapsed       disk      query    current        rows
      ------- ------  -------- ---------- ---------- ---------- ----------  ----------
      Parse        1      0.00       0.00          0          0          0           0
      Execute      1      0.00       0.01          0          0          0           0
      Fetch        1      0.01       0.01          0          0          0         691
      ------- ------  -------- ---------- ---------- ---------- ----------  ----------
      total        3      0.02       0.03          0          0          0         691
      
      
      Misses in library cache during parse: 1
      Misses in library cache during execute: 1
      Optimizer mode: ALL_ROWS
      Parsing user id: 138     (recursive depth: 1)
      
      Elapsed times include waiting on following events:
        Event waited on                             Times   Max. Wait  Total Waited
        ----------------------------------------   Waited  ----------  ------------
        SQL*Net message to dblink                       3        0.00          0.00
        SQL*Net message from dblink                     3        0.01          0.02
        SQL*Net more data from dblink                   4        0.00          0.00
      ********************************************************************************
      
      

      As you can see, the query fetched 691 rows (all that needed updating) in one fetch, with one “SQL*Net message from dblink” and 4 “SQL*Net more data from dblink”. The fetched data was then applied one row at a time – so it’s not the bulk collect we need to worry about, it’s the FORALL insert / update / delete.

      I would be a little surprised if things were different for refreshes called through dbms_scheduler or dbms_job, so the only tweak to improve the performance of the refresh looks like making sure the two databases communicate through the largest SDU_SIZE that you feel happy with.

      Comment by Jonathan Lewis — July 4, 2019 @ 3:40 pm BST Jul 4,2019 | Reply

      • Jonathan,

        Thanks.

        I ran my test (10g) and found that, as you have proven to me, the SELECT isn’t the painful portion :

        call     count       cpu    elapsed       disk      query    current        rows
        ------- ------  -------- ---------- ---------- ---------- ----------  ----------
        Parse        1      0.00       0.00          0          0          0           0
        Execute      1      0.00       0.15          0          0          0           0
        Fetch        1      0.06       0.48          0          0          0        4329
        ------- ------  -------- ---------- ---------- ---------- ----------  ----------
        total        3      0.06       0.64          0          0          0        4329
        

        It is the subsequent UPDATE in my test :

        call     count       cpu    elapsed       disk      query    current        rows
        ------- ------  -------- ---------- ---------- ---------- ----------  ----------
        Parse        1      0.00       0.00          0          0          0           0
        Execute   4329      0.06       0.50          0       8658       8860        4329
        Fetch        0      0.00       0.00          0          0          0           0
        ------- ------  -------- ---------- ---------- ---------- ----------  ----------
        total     4330      0.06       0.51          0       8658       8860        4329
        

        So, I have been wrong about where the performance of an MV Refresh is slow.

        Comment by Hemant K Chitale — July 5, 2019 @ 9:00 am BST Jul 5,2019 | Reply

        • The waits on the SELECT were :

          Elapsed times include waiting on following events:
            Event waited on                             Times   Max. Wait  Total Waited
            ----------------------------------------   Waited  ----------  ------------
            SQL*Net message to dblink                       3        0.00          0.00
            SQL*Net message from dblink                     3        0.18          0.38
            SQL*Net more data from dblink                 109        0.00          0.00
          

          Comment by Hemant K Chitale — July 5, 2019 @ 9:01 am BST Jul 5,2019

        • Hemant,

          Thanks for taking the time to let us know that the same thing was happening on your system in the dbms_job/dbms_scheduler environment.

          Regards
          Jonathan Lewis

          Comment by Jonathan Lewis — July 5, 2019 @ 12:28 pm BST Jul 5,2019

  3. I have been working on a system where the number of sqlnet round trips will make a massive difference to performance depending on the distance between two databases.
    In a nutshell here is the scenario:

    There is a legacy system consisting of two databases in the same data centre where the latency between the databases is around 1-2 msecs.

    A job on database A runs a PL/SQL loop where it selects data from database B across a database link and then inserts data into database A and deletes it from database B one row at a time:

    PROCEDURE archive_data
       IS
          CURSOR c1
          IS
                select 
                p.id
                from mpi.prices@proddb p
                where p.createddate < trunc(SYSDATE);
       BEGIN
          FOR r1 IN c1
          LOOP
             INSERT INTO prices
                SELECT *
                  FROM mpi.prices@proddb
                 WHERE ID = r1.ID;
             DELETE FROM mpi.prices@proddb
                   WHERE ID = r1.ID;
          END LOOP;
    END archive_data;
    

    The cursor typically selects about 500k rows and then inserts locally and deletes from the remote db one row at a time doing a round trip for each operation.
    This is fine while the databases are next to each other, and the job is very fast.

    There is a requirement to relocate database A to a different country and this will increase the latency between the databases to about 60 msecs.
    For this reason the job was rewritten to batch the inserts and deletes and thus avoid so many sqlnet round trips:

    PROCEDURE archive_data
       IS
      TYPE t_bulk_collect_prices IS TABLE OF TMP_PRICES_IDS%ROWTYPE;
        l_tab   t_bulk_collect_prices;
        CURSOR C1 IS
                select 
                p.id
                from mpi.prices@proddb p
                where p.createddate < trunc(SYSDATE);
    BEGIN
        OPEN C1;
        LOOP
            FETCH C1 BULK COLLECT INTO l_tab LIMIT 50000;
            EXIT WHEN l_tab.COUNT = 0;
              FORALL i IN l_tab.FIRST .. l_tab.LAST
                INSERT INTO TMP_prices_IDS
                     VALUES l_tab (i);
                INSERT INTO mpi.prices
                SELECT * FROM mpi.prices@proddb p
                     WHERE ID in (select id from TMP_PRICES_IDS);
                DELETE FROM mpi.prices@proddb p
                   WHERE ID in (select id from TMP_PRICES_IDS);
        END LOOP;
        CLOSE C1;   
    END archive_data;
    

    Now this is where it gets interesting/confusing.

    There are multiple jobs similar to the above, but for different tables; in each case there is an old row-by-row version and a new bulk collect version.

    Some of the new bulk collect jobs work as expected and the sqlnet round trips are minimal, but others continue to work in the same way as the original versions, doing round trips for every row being processed.
    We have not been able to establish why some jobs work as expected and others don't. We need to find a way to force Oracle to avoid doing a round trip for every insert and every delete, but as yet we have not been able to do this.

    Suggestions very welcome….

    Comment by Peter Clark — November 8, 2019 @ 11:28 am GMT Nov 8,2019 | Reply

    • Peter,

      Interesting – especially the apparent randomness.

      One thought comes to mind that has nothing to do with db links. If the sum of the declared lengths of the columns in a table exceeds the block size then array processing drops back to single row processing. I’ll have to search my blog for a link that points to an article about this, and will update this reply. In the meantime it might be worth picking a couple of “broken” examples and seeing if this could be the issue.

      Off the top of my head I’m not sure how this would be affected if character columns were defined with multi-byte character sets, but I imagine it’s the byte length that triggers the effect.
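
      A quick (and only approximate) check is to sum the declared byte lengths recorded in the dictionary and compare the result with the block size – just a sketch, run in the schema that owns the tables:

      select  table_name, sum(data_length) declared_bytes
      from    user_tab_columns
      group by
              table_name
      order by
              declared_bytes desc
      ;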

      UPDATE: Here’s a link to the article demonstrating the problem that appears when the declared (not used) row-length is larger than the blocksize: https://jonathanlewis.wordpress.com/2014/12/23/just-in-case/

      Comment by Jonathan Lewis — November 8, 2019 @ 6:07 pm GMT Nov 8,2019 | Reply

  4. Jonathan,

    Thanks for the update. I’ve checked just one of the jobs that ‘works’ and one of the ‘broken’ ones, and interestingly the working job is on a table for which the sum of the declared column widths (as opposed to the actual row lengths) is much bigger than the database blocksize.
    This particular table has 57 columns with a mix of data types, but amongst other things, it has 4 columns which are declared as VARCHAR2(4000 CHAR), and the database block size is 8192.
    The broken job has only 11 columns, 10 of which are NUMBER data type and the other is a TIMESTAMP(6) WITH TIME ZONE.

    Interesting, because it appears to be exactly the opposite of the premise that Oracle will use single-row processing where the declared column width exceeds the block size.

    The database by the way, is 11.2.0.4 Enterprise Edition and the character set is UTF8.

    More digging and testing required but this is a frustrating problem… My suspicions are that it is related to either a) something to do with processing over the database link, or b) something to do with the table statistics.

    Regards,

    Peter

    Comment by Peter Clark — November 9, 2019 @ 9:07 am GMT Nov 9,2019 | Reply

    • Peter,

      The “failure” to switch from array processing to single row processing is interesting – and something I’ll have to look into.

      Your description suggests another possibility. Some odd things happen with date and time columns, and I wonder if “timestamp with timezone” is a particular threat – maybe the remote database link does something unexpected like constantly checking for any differences between its timezone and the local timezone.

      In a similar vein, do any of the statements use systimestamp or sysdate, or do any of the remote tables have triggers that try to set columns to sysdate or systimestamp?

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — November 9, 2019 @ 10:03 am GMT Nov 9,2019 | Reply

  5. One thing I missed, although it’s probably not relevant: the local database (i.e. the one where the jobs run), is actually 19.0.0.0.0, and it is the database on the other side of the link which is 11.2.0.4. Both have a block size of 8192 and both are UTF8.

    Comment by Peter Clark — November 9, 2019 @ 9:24 am GMT Nov 9,2019 | Reply

    • Peter,

      Prompted by your comment I’ve just re-run my “just in case” test on 19.3.0.0 to see if the array/single row switch has been addressed for purely local processing. No change, though.

      Comment by Jonathan Lewis — November 9, 2019 @ 10:04 am GMT Nov 9,2019 | Reply

  6. […] a little discussion in the comments to a posting on db_links that raised these […]

    Pingback by Just in case | Oracle Scratchpad — November 9, 2019 @ 10:15 am GMT Nov 9,2019 | Reply

  7. Jonathan,

    ALL of the tables on the remote database have at least one column which is TIMESTAMP(6) WITH TIME ZONE and it defaults to SYSTIMESTAMP, but there are no triggers which fire when rows are being deleted from the remote database. The local database has no triggers or constraints on any of the tables being archived.

    Here are the column definitions from one remote table where the array processing seems to be working (i.e. few round trips):

    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL TIMESTAMP(6) WITH TIME ZONE DEFAULT SYSTIMESTAMP
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL VARCHAR2(4000 CHAR)
    xxxxxxx VARCHAR2(4000 CHAR)
    xxxxxxx VARCHAR2(20 CHAR)
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NOT NULL NUMBER(1)
    xxxxxxx TIMESTAMP(6) WITH TIME ZONE
    xxxxxxx NOT NULL TIMESTAMP(6) WITH TIME ZONE
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx VARCHAR2(60 CHAR)
    xxxxxxx TIMESTAMP(6) WITH TIME ZONE
    xxxxxxx VARCHAR2(5 CHAR)
    xxxxxxx NOT NULL NUMBER(1)
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx VARCHAR2(40 CHAR)
    xxxxxxx VARCHAR2(40 CHAR)
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NUMBER
    xxxxxxx NOT NULL NUMBER(1)
    xxxxxxx NUMBER(18)
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NOT NULL NUMBER
    xxxxxxx VARCHAR2(4000 CHAR)
    xxxxxxx NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NOT NULL NUMBER
    xxxxxxx NUMBER(18)
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NUMBER(18)
    xxxxxxx NOT NULL NUMBER(1)
    xxxxxxx NUMBER(18)
    xxxxxxx VARCHAR2(2)
    xxxxxxx CHAR(7)
    xxxxxxx VARCHAR2(4000 CHAR)
    xxxxxxx NUMBER(1)

    Here are the column definitions from another remote table where the array processing is not working (i.e. at least two round trips for every row processed):

    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL TIMESTAMP(6) WITH TIME ZONE DEFAULT SYSTIMESTAMP
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL NUMBER(18)
    xxxxxxx NOT NULL NUMBER
    xxxxxxx VARCHAR2(40 CHAR)
    xxxxxxx NUMBER
    xxxxxxx NUMBER
    xxxxxxx NOT NULL NUMBER(1)

    In all cases the local table definition is the same as the remote one, but without the NOT NULL constraints.
    Also, in all cases the processing logic is the same, the only difference being the table name and the WHERE clause on the cursor.

    Regards,

    Peter

    Comment by Peter Clark — November 9, 2019 @ 10:41 am GMT Nov 9,2019 | Reply

    • Peter,
      I may have been heading off in completely the wrong direction on this one.
      I’ve just checked the “select .. bulk collect ..” / “forall insert” strategy and it behaves in exactly the same way as the simple “insert select” SQL statement.

      The problem in your case may simply be a change in execution plan for the IN subqueries. There may be some cases where Oracle decides to use a hash join approach (which would give you minimum round trips) and some where it uses a nested loop approach (which would give you the appearance of “single row processing”).

      UPDATE Although I made a mistake with the FORALL DELETE, I would still look at pulling the whole row across the network and then doing two forall inserts: one to copy the data and one to populate a temp table of ids. Your code doesn’t show how you clean the TMP table so that you don’t keep trying to delete rows you’ve already deleted from the remote site.

      The other thing to look at is the logic of what you’re doing – I don’t think you need the GTT that you’re selecting ID into; something like the following should work:
      Wrong – it gets compilation error PLS-00739: FORALL INSERT/UPDATE/DELETE not supported on remote tables

      
      PROCEDURE archive_data
         IS
        TYPE t_bulk_collect_prices IS TABLE OF mpi.prices%ROWTYPE;
          l_tab   t_bulk_collect_prices;
          CURSOR C1 IS
                  select 
                        *
                  from mpi.prices@proddb p
                  where p.createddate < trunc(SYSDATE);
      BEGIN
          OPEN C1;
          LOOP
              FETCH C1 BULK COLLECT INTO l_tab LIMIT 50000;
              EXIT WHEN l_tab.COUNT = 0;
                FORALL i IN l_tab.FIRST .. l_tab.LAST
                  INSERT INTO mpi.prices values l_tab(i);
                FORALL i IN l_tab.FIRST .. l_tab.LAST
                  DELETE FROM mpi.prices@proddb p WHERE ID = l_tab(i).id;
          END LOOP;
          CLOSE C1;   
      END archive_data;
      /
      
      

      I would reduce the size of the LIMIT, you’re probably not going to get much improvement once it gets past 255 and it’s possible that you’ll pay a penalty somewhere else by making it too big.

      It’s possible that this approach for the delete might do something unexpected like sending the whole of the l_tab() array back across the network, If it does then you could introduce an array of IDs and copy the l_tab id components into it and use the copy for the delete.

      Comment by Jonathan Lewis — November 9, 2019 @ 7:32 pm GMT Nov 9,2019 | Reply

  8. Jonathan,

    Thanks for the update. In answer to your point about the TMP table – it’s a global temporary table defined as “ON COMMIT DELETE ROWS” and I omitted to show the COMMITs which are in the procedure, so I’m afraid I gave you an incomplete picture. In the meantime my colleague has created a modified version of the procedure which we think may solve the problem. We plan to test this in the next day or so, so I’ll share the findings with you.

    Regards,

    Peter

    Comment by Peter Clark — November 12, 2019 @ 9:54 am GMT Nov 12,2019 | Reply

  9. Jonathan,

    The rewrite has worked. We now see minimal round trips in all cases, plus there is the benefit of the procedure being super fast compared to the previous version. Here is the new code which as you will see now has an inner block added. As before, the TMP table is a global temporary table with “ON COMMIT DELETE ROWS” and I have left the commit in for completeness:

    PROCEDURE archive_data
    IS
        TYPE t_bulk_collect_prices IS TABLE OF TMP_PRICES_IDS%ROWTYPE;
        l_tab     t_bulk_collect_prices;
        CURSOR c1 IS
            select 
                p.id
                from mpi.prices@proddb p
                where p.createddate < trunc(SYSDATE);
    BEGIN
        OPEN c1;
        LOOP
            FETCH c1 BULK COLLECT INTO l_tab LIMIT 50000;
            EXIT WHEN l_tab.COUNT = 0;
            FORALL i IN l_tab.FIRST .. l_tab.LAST
                INSERT INTO TMP_PRICES_IDS
                     VALUES l_tab (i);
            DECLARE
                TYPE t_bulk_table_data IS TABLE OF mpi.prices%ROWTYPE;
                t_data   t_bulk_table_data;
                CURSOR c_table IS
                    SELECT /*+ DRIVING_SITE(p) */
                           *
                     FROM mpi.prices@proddb p
                     WHERE p.ID IN (SELECT tmp.id
                                      FROM TMP_PRICES_IDS tmp);
                BEGIN
                    OPEN c_table;
                    FETCH c_table BULK COLLECT INTO t_data;
                    FORALL i IN t_data.FIRST .. t_data.LAST
                        INSERT INTO mpi.prices
                             VALUES t_data (i);
                    DELETE FROM mpi.prices@proddb p
                          WHERE p.ID IN (SELECT tmp.id
                                           FROM TMP_PRICES_IDS tmp);
                    CLOSE c_table;
                END;
            COMMIT;
        END LOOP;
        CLOSE c1;
    END;
    /
    

    Comment by Peter Clark — November 13, 2019 @ 1:53 pm GMT Nov 13,2019 | Reply

    • Peter,

      Thanks for the update.

      I have to say that it’s not immediately obvious why that code should be much faster than the original; it doesn’t really seem to be materially different.

      Regards
      Jonathan Lewis

      Comment by Jonathan Lewis — November 17, 2019 @ 10:05 am GMT Nov 17,2019 | Reply

