A recent thread on the ODC SQL & PL/SQL forum raised the question of how data moves across a database link – is it row by row or array based (or other)? It’s a good question as it’s fairly common knowledge that distributed queries can be very much slower than the same query when executed on an equivalent set of local tables, so it’s worth having a little poke around to see if there’s anything in the way the data moves across the link that could be tweaked to improve performance.
It’s fairly easy to get some ideas by running a couple of queries and checking session activity stats and wait events – so here’s a little test conducted between a database running 12.2.0.1 and a database running 11.2.0.4. For this type of testing it’s probably sensible to use two databases rather than faking things with a loopback database link in case the loopback introduces some special features (a.k.a. dirty tricks).
rem
rem     Script:         db_link_usage.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2019
rem
rem     Last tested
rem             12.2.0.1 -> 11.2.0.4
rem

prompt  ==================================
prompt  SQL*Plus 20,000 rows, arraysize 20
prompt  ==================================

set feedback only
set arraysize 20

execute snap_my_stats.start_snap
execute snap_events.start_snap

select * from all_objects@d11g where rownum <= 20000;

set feedback on
execute snap_events.end_snap
execute snap_my_stats.end_snap

prompt  ====================================
prompt  SQL*Plus 20,000 rows, arraysize 2000
prompt  ====================================

set feedback only
set arraysize 2000

execute snap_my_stats.start_snap
execute snap_events.start_snap

select * from all_objects@d11g where rownum <= 20000;

set feedback on
execute snap_events.end_snap
execute snap_my_stats.end_snap
All I’ve done is select 20,000 rows from view all_objects in a remote database with two different settings for the array fetch size. The “feedback only” option is a 12c option that means the session doesn’t output the data; it shows only the final “N rows selected” report. The two “snap” packages are some code I wrote a couple of decades ago to report changes in my session’s activity stats and wait events.
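The two “snap” packages aren’t reproduced here. If you don’t have anything similar, the following is only a minimal sketch of the idea (it is not the author’s code, and it assumes you have select privileges on the relevant dynamic performance views): capture the session figures into a couple of holding tables before the test, then report the differences afterwards.

rem     Minimal sketch only - not the snap_my_stats / snap_events packages.
rem     Assumes select privileges on v$mystat, v$statname and v$session_event.

create table my_stats_snap as
select  sn.name, ms.value
from    v$mystat ms, v$statname sn
where   sn.statistic# = ms.statistic#
;

create table my_events_snap as
select  event, total_waits, time_waited_micro
from    v$session_event
where   sid = sys_context('userenv','sid')
;

rem     ... run the test in the same session, then report the deltas ...

select  sn.name, ms.value - st.value    delta
from    v$mystat ms, v$statname sn, my_stats_snap st
where   sn.statistic# = ms.statistic#
and     st.name       = sn.name
and     ms.value - st.value != 0
order by sn.name
;

rem     Note: events that first appear after the snapshot are missed by this join.
select  ev.event,
        ev.total_waits - st.total_waits                                 waits,
        round((ev.time_waited_micro - st.time_waited_micro)/1e4, 2)    csec
from    v$session_event ev, my_events_snap st
where   ev.sid   = sys_context('userenv','sid')
and     st.event = ev.event
and     ev.total_waits - st.total_waits != 0
order by ev.event
;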
It turned out that there was very little of interest in the session activity stats, although the impact of SQL*Net compression is always worth a quick reminder (here and here). The wait events (specifically the wait counts), on the other hand, are most revealing:
==================================
SQL*Plus 20,000 rows, arraysize 20
==================================

Event                                    Waits   Time_outs      Csec   Avg Csec   Max Csec
-----                                    -----   ---------      ----   --------   --------
SQL*Net message from client              1,004           0     15.69       .016    414,828
SQL*Net message to dblink                1,004           0      0.04       .000          0
SQL*Net message from dblink              1,004           0     61.02       .061         11
SQL*Net message to client                1,004           0      0.11       .000          0

====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================

Event                                    Waits   Time_outs      Csec   Avg Csec   Max Csec
-----                                    -----   ---------      ----   --------   --------
SQL*Net message from client                 14           0      4.88       .349    414,828
SQL*Net message to dblink                   13           0      0.00       .000          0
SQL*Net message from dblink                 13           0      7.72       .594         11
SQL*Net message to client                   14           0      0.00       .000          0
SQL*Net more data from dblink               96           0      6.26       .065          6
SQL*Net more data to client                 96           0      0.16       .002          4
Doing a little arithmetic (and allowing for a few “metadata” messages that introduce small variations in the numbers), we can see that when we fetched the 20,000 rows with an arraysize of 20 this turned into 1,000 (fetch) calls from SQL*Plus to the server, with a matching 1,000 calls from the local server to the remote server. When the arraysize goes up to 2,000 rows, though, the SDU (session data unit) setting for SQL*Net is too small to hold the whole of a single fetch, so a single fetch call from SQL*Plus turns into one “SQL*Net message to client” accompanied by 7 or 8 “SQL*Net more data to client” waits, with exactly the same pattern of conversation between the local server and the remote server. You could imagine the conversation as something like:
- Client to local server: “give me 2,000 rows”
- local server to remote server: “give me 2,000 rows”
- remote server to local server: “Here, I can manage 120 rows as the first installment”
- local server to client: “Here’s a first installment”
- Client to local server: “Got it, where’s the rest?” *** but no message recorded in the session activity stats
- Local server to remote server: “Where’s the rest?”
- Remote server to local server: “You want more – here’s another 120 rows”
- Local server to client: “Here’s a second installment”
- Client to local server: “Got it, where’s the rest?” *** but no message recorded in the session activity stats
- etc.
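Putting some very rough numbers on that conversation (a back-of-the-envelope estimate from the waits reported above, not a precise measurement): at arraysize 2,000 the 20,000 rows need about 10 fetch calls, and the 96 “more data” waits work out at roughly 8 or 9 extra buffers per fetch on top of the initial “message”. With an 8KB SDU that puts each 2,000-row array somewhere in the region of 70KB to 80KB on the wire – only a few tens of bytes per row, partly thanks to the SQL*Net “compression” of repeated column values mentioned earlier.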
In this example the volume of data sent back to the client in each message was limited by the SDU size negotiated between the local server and the remote server as the link opens. In my case I happened to have the default (apparently 8KB) SDU_SIZE in the 12c sqlnet.ora and a non-default 32KB setting in the 11g sqlnet.ora, so the negotiated SDU between servers was 8KB (the lower of the two).
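For reference, the sqlnet.ora parameter in question is DEFAULT_SDU_SIZE; a minimal entry looks like the following (the value here is purely illustrative, not a recommendation):

# sqlnet.ora - illustrative value only
# Sets the default SDU for connections made from, or accepted by, this ORACLE_HOME
DEFAULT_SDU_SIZE = 32767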
Here’s what the figures looked like after I had restarted with the SDU set to 16KB in the 12c tnsnames.ora, and 16KB in the 11g listener.ora:
==================================
SQL*Plus 20,000 rows, arraysize 20
==================================

Event                                    Waits   Time_outs      Csec   Avg Csec   Max Csec
-----                                    -----   ---------      ----   --------   --------
SQL*Net message from client              1,004           0     24.23       .024        588
SQL*Net message to dblink                1,009           0      0.06       .000          0
SQL*Net message from dblink              1,010           0     77.76       .077          4
SQL*Net message to client                1,004           0      0.15       .000          0

====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================

Event                                    Waits   Time_outs      Csec   Avg Csec   Max Csec
-----                                    -----   ---------      ----   --------   --------
SQL*Net message from client                 14           0      1.61       .115        588
SQL*Net message to dblink                   13           0      0.00       .000          0
SQL*Net message from dblink                 13           0      4.21       .324          4
SQL*Net message to client                   14           0      0.00       .000          0
SQL*Net more data from dblink               45           0     13.53       .301          6
SQL*Net more data to client                 45           0      0.13       .003          0
The first set of figures (arraysize 20) don’t change. If an 8KB SDU is large enough to hold an array of 20 rows then it’s not going to make a difference when the SDU is increased. In the second set of figures we see that for each “SQL*Net message from dblink” we now have roughly 3 “SQL*Net more data from dblink” (with matching counts for “SQL*Net more data to client”). With an SDU of double the size it now takes a total of roughly 4 packets to transmit the array fetch rather than the 8 or 9 we saw with the smaller SDU size.
As a final test with SQL*Plus, what happens if we set the SDU size at 32K for the 12c database (and that means for the client calling in and the server calling out) and 8KB for the 11g database? The client negotiates a 32KB SDU with the 12c server, but the 12c server negotiates an 8KB SDU with the 11g server. Here are the stats for just the test with the large fetch size in this case:
====================================
SQL*Plus 20,000 rows, arraysize 2000
====================================

Event                                    Waits   Time_outs      Csec   Avg Csec   Max Csec
-----                                    -----   ---------      ----   --------   --------
SQL*Net message from client                 14           0      5.30       .379    214,570
SQL*Net message to dblink                   13           0      0.00       .000          0
SQL*Net message from dblink                 13           0      4.09       .314         13
SQL*Net message to client                   14           0      0.00       .000          0
SQL*Net more data from dblink               96           0     14.46       .151          6
SQL*Net more data to client                 20           0      0.08       .004          0
We get the same 10 (plus a few) “message to/from client/dblink” waits, but now the “more data” waits are dramatically different. When the client calls for the “next fetch” the local server has to call the remote server 4 or 5 times to get enough 8KB data packets to fill a single 32KB packet to return to the client. You can confirm this (and some of my descriptions of the earlier behaviour) by enabling extended tracing for wait states. Here’s an extract of 5 consecutive lines from a suitable trace file (you have to ignore the “#bytes” parameter on the “SQL*Net more data from dblink” waits – they’ve always been garbage):
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 2793 driver id=1413697536 #bytes=7 p3=0 obj#=-1 tim=703316387674
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 34 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=703316388447
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 8 driver id=1413697536 #bytes=2 p3=0 obj#=-1 tim=703316389134
WAIT #140028098478296: nam='SQL*Net more data from dblink' ela= 16 driver id=1413697536 #bytes=12 p3=0 obj#=-1 tim=703316389818
WAIT #140028098478296: nam='SQL*Net more data to client' ela= 73 driver id=1413697536 #bytes=32671 p3=0 obj#=-1 tim=703316389960
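If you want to generate a similar trace for your own session, the following is one way of doing it (a sketch only – dbms_monitor has been the documented interface for a long time, but check the details for your version):

rem     Enable extended SQL trace (including wait events) for the current session,
rem     run the test, then disable tracing and look in the diag trace directory.

execute dbms_monitor.session_trace_enable(waits => true, binds => false)

select * from all_objects@d11g where rownum <= 20000;

execute dbms_monitor.session_trace_disable

rem     The old-style alternative:
rem     alter session set events '10046 trace name context forever, level 8';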
Summary Note
Oracle will use the negotiated SDU to do array fetches across a database link. Since a larger arraysize can (for large volumes of data) reduce the total workload on the remote database and on the network, you may want the local server to have a specifically crafted tnsnames.ora entry and the remote server to expose a specific service with a matching SDU size to help improve the efficiency of transferring data between the two databases.
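As a sketch of the sort of configuration I mean (all names, hosts, paths and sizes below are purely illustrative, and a statically registered service is only one way of giving the remote end a specific SDU), the local server’s tnsnames.ora entry used by the database link and the remote server’s listener.ora might look something like this:

# tnsnames.ora on the local server - illustrative only
D11G =
  (DESCRIPTION =
    (SDU = 32767)
    (ADDRESS = (PROTOCOL = TCP)(HOST = remote-host)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = d11g))
  )

# listener.ora on the remote server - illustrative only
# (alternatively, set DEFAULT_SDU_SIZE in the remote sqlnet.ora)
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SDU = 32767)
      (SID_NAME = d11g)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0.4/db_1)
    )
  )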
Lagniappe
If you want to look at other cases of how array fetching and SDU sizes interact, here are a couple of examples of using PL/SQL to execute SQL that fetches data across database links. The first is a simple, embedded “cursor for loop” that (apparently) does “row by row” processing – although an enhancement appeared many versions ago to make this type of code use array fetching of 100 rows under the covers. The second demonstrates the effects of an explicit cursor with “fetch, bulk collect, limit”:
execute snap_events.start_snap

declare
        m_total number := 0;
        m_ct    number := 0;
begin
        for c1 in (select * from all_objects@d11g where rownum < 20000) loop
                m_total := m_total + c1.object_id;
                m_ct    := m_ct + 1;
        end loop;
end;
/

execute snap_events.end_snap

execute snap_events.start_snap

declare
        cursor c1 is select * from all_objects@d11g where rownum <= 20000;

        type c1_array is table of c1%rowtype index by binary_integer;
        m_tab   c1_array;

        m_ct    number := 0;
begin
        open c1;
        loop
                fetch   c1
                bulk collect
                into    m_tab
                limit   1000
                ;

                exit when c1%notfound;
                m_ct := m_ct + 1;
        end loop;
        close c1;
end;
/

execute snap_events.end_snap
Checking the session events for these two tests we see the following when the SDU_SIZE has been set at 16KB at both ends of the database link (which means the negotiated SDU will be 16KB):
Event                                    Waits   Time_outs      Csec   Avg Csec   Max Csec
-----                                    -----   ---------      ----   --------   --------
SQL*Net message from client                  4           0      0.66       .165        588
SQL*Net message to dblink                  203           0      0.05       .000          0
SQL*Net message from dblink                203           0     38.51       .190          5
SQL*Net message to client                    4           0      0.00       .000          0

Event                                    Waits   Time_outs      Csec   Avg Csec   Max Csec
-----                                    -----   ---------      ----   --------   --------
SQL*Net message from client                  4           0      0.08       .021        588
SQL*Net message to dblink                   24           0      0.01       .000          0
SQL*Net message from dblink                 24           0      9.19       .383          5
SQL*Net more data from dblink               40           0      8.47       .212          6
SQL*Net message to client                    4           0      0.00       .000          0
The critical information we can see in the first example is that getting 20,000 rows from the remote database “row by row” takes 200 (plus noise) round-trips – showing that Oracle really is converting our single-row processing loop into array fetches of 100 rows.
The second example shows us that (allowing for a little noise, again) it takes 20 messages to the remote database to fetch 20,000 rows – so 1,000 rows per message – but the SDU size is too small and we have to send each array in three consecutive packets, the “message from dblink” wait and two “more data from dblink” waits.
Footnote
I did point out that the session activity stats for these tests weren’t interesting. There is, however, one detail worth mentioning since you could otherwise be fooled into thinking the number of packets travelling between the databases was smaller than it really was. There are two statistics about network roundtrips:
Name                                                    Value
----                                                    -----
SQL*Net roundtrips to/from client                          18
SQL*Net roundtrips to/from dblink                          13
A “roundtrip” is a “message to / message from” pair. It doesn’t account for traffic due to “more data” being requested or sent.
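If you want to check those figures for your own session, a simple query against v$mystat will report them (assuming, again, that you have access to the v$ views):

select  sn.name, ms.value
from    v$mystat ms, v$statname sn
where   sn.statistic# = ms.statistic#
and     sn.name like 'SQL*Net roundtrips%'
order by sn.name
;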