Oracle Scratchpad

May 7, 2010

SQL*Net compression

Filed under: Infrastructure,Performance,Troubleshooting — Jonathan Lewis @ 7:21 pm BST May 7,2010

Here’s a little demonstration I’ve been meaning to write about for the last few years – it’s very simple: create a table, then query it a couple of times.

execute dbms_random.seed(0)

create table t1
as
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 3000
)
select
	lpad(dbms_random.string('U',2),40,'X')	v1
from
	generator	v1,
	generator	v2
where
	rownum <= 10000
;

break on report
compute sum of number_of_groups on report

select
	group_size, count(*)	number_of_groups
from
	(
	select
		v1, count(*) group_size
	from
		t1
	group by
		v1
	)
group by
	group_size
order by
	group_size
;

The data is a single column, basically a randomly generated two-character string, padded up to 40 characters on the left with lots of ‘X’s. The first first query gives you an indication of the data distribution, which is as follows:

GROUP_SIZE NUMBER_OF_GROUPS
---------- ----------------
         5                2
         6                5
         7                6
         8               15
         9               27
        10               30
        11               51
        12               55
        13               63
        14               59
        15               78
        16               65
        17               63
        18               54
        19               21
        20               35
        21               19
        22               14
        23                8
        24                5
        26                1
           ----------------
sum                     676

The way I’ve written this query you can see that there are two pairs of characters that appear five times each, one pair of characters that appears 26 times, and 78 pairs of characters that appear 15 times each; so there’s a fair bit of repetition, and a total of 676 different pairs of characters (which isn’t too surprising since I’ve stuck to the upper case alphabet, sampled 10,000 times, and 26 * 26 = 676).

So let’s try to query all the data from SQL*Plus:

set arraysize 1000
set autotrace traceonly statistics

select	v1
from	t1
;

select	v1
from	t1
order by
	v1
;

set autotrace off

Since I’ve set autotrace to display only the trace output, and only the statistics for the trace, the output is very limited. Here are the results for the first query:

10000 rows selected.

Elapsed: 00:00:00.07

Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
        143  consistent gets
          0  physical reads
          0  redo size
     425479  bytes sent via SQL*Net to client
        494  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

We have 10,000 rows processed, 11 SQL*Net round-trips, and 420KB of data travelling across SQL*Net to the client. At 10,000 rows with about 40 characters per rows that last figure is obviously in the right ballpark for the volume of data moved. But what are the stats for the second query (the one where I’ve added an “order by” clause) going to look like:

10000 rows selected.

Elapsed: 00:00:00.04

Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
        133  consistent gets
          0  physical reads
          0  redo size
      79287  bytes sent via SQL*Net to client
        494  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      10000  rows processed

It’s interesting to note that the query completed faster (but it’s not really sensible to make a big thing of comparing 0.07 seconds with 0.03 seconds); but what’s really interesting is that the volume of data moved across the network is only about 75KB – nearly one-sixth of what you might expect.

There are two questions here: how come we received 10,000 rows of about 40 bytes each in only 79KB of data transmission and, for the very observant, how come the number of SQL*Net round-trips hasn’t dropped. (I’ll leave the second question for another posting – there’s too much to say in just one note.)

SQL*Net has been able to do compression (or rather “de-duplication”) for years, at least since 8.0. If a single session data unit (SDU) contains data with a suitable pattern of duplicates SQL*Net can remove duplicates before transmission (and put them back on receipt, luckily).

The critical pattern is (or seems to be) described by the question: “does this row have the same value in column X as the last row, if so than I don’t have to send the whole of column X again.” So the de-duplication is from row to row, not a comparison across columns, or across the entire bulk of the session data unit. This means that if we sort our rows appropriately we can maximise the volume of data where “the columns in this row” are often copies of “the corresponding columns in the last row”.

In my special case, of course, my “order by” clause ensure that I would always be sending at least five consecutive rows that were identical and up to 26 rows that were identical – thereby achieving lots of eliminations of the 40-character column value.

Footnote:

I can say with some confidence that this de-duplication worked from at least Oracle 8 (or rather, SQL*Net 8 ) because of a discussion I had about “object-relational” technology just after 8.0 was launched. Someone tried to persuade me that it was kinder on the network to send “parent object with embedded multiple children” rather than sending “parent joined to child”. Their argument was that the object approach reduced the network traffic because the join repeated the required columns of the parent with each child while the object method didn’t. But as I was able to point out, the SQL may have repeated the columns but SQL*Net eliminated the repetitions. (In fact in some cases the object approach used to increase the network traffic because of the extra messages from the front-end asking for the definition of the object type – I don’t know if this is still the case, though.)

 

24 Comments »

  1. Hmmm, very interesting: “Someone tried to persuade me that it was kinder on the network to send “parent object with embedded multiple children” rather than sending “parent joined to child”. Their argument was that the object approach reduced the network traffic because the join repeated the required columns of the parent with each child while the object method didn’t.”

    I recently heard a similar argument made for the retrieval of parent and child data as two separate queries, so that the single row of parent data was not transmitted for every one of the fifty child rows, so it’s not just an O-R vs SQL argument. Of course one usually finds that it’s sufficient to challenge such opinions by asking the amount by which performance degraded when one query iss run instead of two (or three, or four). No such tests are every carried out.

    It also occurs to me that where multiple parent rows are being retrieved the chance of benefiting from this compression of parent data in the absence of an explicit ORDER BY clause would also depend on the join method between the tables.

    It would be interesting to think of beneficial optimiser enhancements to assist in situations where network performance is a limiting factor — you know, like when we’re all living on orbiting space stations and running queries against databases in Slough. What kind of segment statistics would we need in order to reliably predict the achievable compression rate in SQL*Net? Would those statistics also help in achieving better segment data compression, with the optimizer automatically adding an ORDER BY to an insert into a compressed table?

    Comment by David Aldridge — May 7, 2010 @ 9:15 pm BST May 7,2010 | Reply

    • David,

      “No such tests are every carried out.”

      And the sad thing is that it’s often very quick and easy to create a cheap test that shows whether or not a more expensive thorough test is worth doing. There have been occasions when I’ve spent 15 minutes building a model to demonstrate the validity (or otherwise) of a hypothesis after a committee has spent (literally) days arguing about it.

      “It also occurs to me that where multiple parent rows are being retrieved the chance of benefiting from this compression of parent data in the absence of an explicit ORDER BY clause would also depend on the join method between the tables.”

      Indeed – especially when you realise that the ordering of the result set is dictated by the first table in a nested loop, but by the second table in a hash join.

      This can also make a HUGE difference to queries involving filter subqueries: I’ll have to post an example one day.

      Comment by Jonathan Lewis — May 9, 2010 @ 11:12 am BST May 9,2010 | Reply

  2. Correct me if I’m wrong (or make me wait for your next post :) ), but the number of round trips depends on the number of fetches, which depends on the number of rows and the ARRAY SIZE parameter. I’m guessing you had an array size of 1000, which is a bit large. Of course, you needed a large array size to make your point about deduplication, n’est-ce pas?

    The main point of your article is extremely interesting. I too have wondered about using the “object approach”, or else artificially setting the repeated data to null. How wonderful to know that because of SQL*Net the repeated data is not a problem!

    Comment by Stew Ashton — May 8, 2010 @ 10:18 am BST May 8,2010 | Reply

    • Stew,

      That’s correct – round-trips depends on fetches (and other database calls), so the arraysize makes a big difference. But you also have to look at “SQL*Net more data …” stats, and the number of “network (i.e. O/S) round-trips” to get the full picture – and that’s what the follow-up post will cover.

      Comment by Jonathan Lewis — May 9, 2010 @ 11:03 am BST May 9,2010 | Reply

  3. Very, very interesting.

    Do you happen to know whether this “de-duplication” happens the other way around (from the client to the server) – for bulk-inserts for example ?

    Comment by Alberto Dell'Era — May 8, 2010 @ 11:07 am BST May 8,2010 | Reply

    • Alberto,

      Yes, I believe it does. I was trying to find some supporting figures in my portfolio but can’t locate them. However I think I had a client fairly recently who was using Informatica Powermart to do bulk inserts of HUGE amounts of data.

      If they used the direct path option from Informatica the volume of data across the network was enormous because the client seemed to be creating and sending complete Oracle blocks; when doing an array insert (with a declared client buffer of about 10Mb) the volume of network traffic was much smaller – and I think I deduced that this was because of SQL*Net compression.

      Net result: simple insert with index maintenance was much faster than direct path insert for this client with their data pattern

      Comment by Jonathan Lewis — May 9, 2010 @ 11:07 am BST May 9,2010 | Reply

      • Rather a damning indictment either of their network design or of ETL (vs ELT) in general. I’ve worked with organisations willing to invest in a 96cpu license for Informatica Powermart but without the foresight to provide more than a single one gigabit network interface from the Powermart server host. And heaven forbid that the Powermart Server should be hosted on the same box as the Oracle database — that would break their holy command that the app shall be on a separate tier from the database. Sometimes, all one can do is weep.

        Comment by David Aldridge — May 9, 2010 @ 12:20 pm BST May 9,2010 | Reply

  4. Interesting article which clearly explains why the third screen capture in my “Network Monitoring Experimentations 5″ blog article (showing the first packet in a fetch) shows column data that repeats for every row, while the fourth screen capture (showing one of the later packets in an array fetch of 5,000 rows) does not show the column data that repeats for every row. I previously could not identify a suitable reason for the missing repeated column data when examining the raw packet contents.

    Comment by Charles Hooper — May 10, 2010 @ 1:30 am BST May 10,2010 | Reply

  5. Indeed very interesting.
    Does this also apply when using the JDBC thin driver?

    Comment by Thomas — May 10, 2010 @ 7:35 am BST May 10,2010 | Reply

    • In regards to JDBC thin – Using the new tracing added to the 11.2 JDBC driver (enabled using oracle.net.ns.level), against an 11.2.0.1 DB, I can see this form of deduplication occuring.

      Unfortunately, I’m looking because we’re intermittently getting a single column replaced by null (ASCII 0) characters. The null characters don’t appear in the actual stream from the server so I’m suspecting it might be something going wrong with this de-dup process.

      Does anyone know of a way to disable it to test the theory?

      Comment by Tim Hopkins — June 4, 2010 @ 3:34 pm BST Jun 4,2010 | Reply

      • Tim,

        I don’t know of any option to disable the feature, and haven’t been able to find anything on Metalink (MOS). Have you tried raising an SR ?

        Comment by Jonathan Lewis — June 8, 2010 @ 10:10 pm BST Jun 8,2010 | Reply

  6. Jonathan,

    many thanks as always; I’m going to investigate the (JDBC) bulk-insert case in a near future, I’ll post the results if I manage to find something interesting.

    Comment by Alberto Dell'Era — May 10, 2010 @ 8:28 am BST May 10,2010 | Reply

  7. Hi Jonathan,

    the sql*net is able to do the ‘compression’ of data – is it because the data is coming from memory (not logical i/o) ?

    Comment by Bix — May 11, 2010 @ 10:02 pm BST May 11,2010 | Reply

  8. Jonathan,
    when I use your example in my database (10.2.0.4) I get – it’s no surprise – the same results. But when I replace “lpad(dbms_random.string(‘U’,2),40,’X’) v1″ by a simple “mod(rownum, 10) v1″ I see that the sorted query transfers a little bit more data than the query without sorting (without order by: 44144 bytes sent via SQL*Net to client; with “order by v1″: 51151 bytes sent via SQL*Net to client). Is this an effect of the small size of the column or does the de-duplication not work with numeric data (or work different)? I have no idea how to test this …

    Regards
    Martin Preiss

    Comment by Martin Preiss — June 2, 2010 @ 6:48 am BST Jun 2,2010 | Reply

    • Martin,
      I think this may be a result of a fixed overhead for compression (or de-duplication, which might be a slightly better term). I think a row probably has to say “this column is the same as last time”, so I would look for a fixed size token to represent “same as last row”.

      Test: create a table of 1,000,000 rows with a constant (e.g. 1000000001) and a variable (e.g. 1000000001 + rownum) column, set a large arraysize (5,000 is the max) then select each column in turn with autotrace traceonly statistics. See if the difference in volume is significant. It is, but there is a minimum volume of data per row which means you don’t see much of a saving with a single column of small numbers.

      Comment by Jonathan Lewis — June 2, 2010 @ 7:39 am BST Jun 2,2010 | Reply

  9. Jonathan,
    thank you for the plausible explanation and the test (and by the way for your blog and your books)

    Comment by Martin Preiss — June 2, 2010 @ 8:18 am BST Jun 2,2010 | Reply

  10. interesting article, but isn’t more overhead put on the server(cpu/memory) as Oracle has to sort data first? I feel this comes at the price of stretching system resource, not sure if it’s worthwhile doing so in reality.

    Comment by jason — February 22, 2011 @ 3:51 am BST Feb 22,2011 | Reply

    • Jason,

      More importantly, probably, is that if you sort the data to reduce the network volume you need to be sure that the front-end doesn’t care about the order.

      However, your point is taken, and comes under the general principle of using whichever resource is spare to reduce the load on the resource that is the bottleneck. If the network is the problem and CPU is available … (which was the case in my Informatica example above).

      Comment by Jonathan Lewis — February 22, 2011 @ 11:39 am BST Feb 22,2011 | Reply

  11. [...] were transferred over the network in the unhinted version of the query?  Could it be a case where SQL*Net compression has helped to strip the repeating columns from adjacent rows in the result set prior to [...]

    Pingback by Give Me a Hint – How were These Autotrace Execution Statistics Achieved? « Charles Hooper's Oracle Notes — June 27, 2011 @ 1:38 pm BST Jun 27,2011 | Reply

  12. You asked, ” how come the number of SQL*Net round-trips hasn’t dropped. (I’ll leave the second question for another posting – there’s too much to say in just one note.)”. Was this answered in another post?

    I’ve a scenario where bytes sent via SQL*NET is distinctly higher on a one particular database (11G) and another (10G), but roundtrip to/from remains the same. How can a higher number of bytes be transferred in the same amounf of roundtrip?

    Comment by Edwin — May 16, 2013 @ 9:25 am BST May 16,2013 | Reply

  13. […] wrote a note a few years ago about SQL*Net compression (this will open in a new window so that you can read the posts concurrently), showing how the […]

    Pingback by SQL*Net Compression – 2 | Oracle Scratchpad — June 9, 2013 @ 2:53 pm BST Jun 9,2013 | Reply

  14. […] article de Jonathan Lewis publié sur son blog. L’article original en anglais se trouve ici. Il montre un fonctionnalité peu connue: la compression des données transférées par SQL*Net […]

    Pingback by Compression SQL*Net, par Jonathan Lewis | Oracle – Concepts et Exemples — November 19, 2013 @ 4:03 pm BST Nov 19,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,877 other followers