Oracle Scratchpad

April 10, 2015

Counting

Filed under: Execution plans, Indexing, Oracle, Performance — Jonathan Lewis @ 5:27 pm BST Apr 10, 2015

There’s a live example on OTN at the moment of an interesting class of problem that can require some imaginative thinking. It revolves around a design that uses a row in one table to hold the low and high values for a range of values in another table. The problem is then simply to count the number of rows in the second table that fall into the range given by the first table. There’s an obvious query you can write (a join with inequality) but if you have to join each row in the first table to several million rows in the second table, then aggregate to count them, that’s an expensive strategy. Here’s the query (with numbers of rows involved) that showed up on OTN; it’s an insert statement, and the problem is that it takes 7 hours to insert 37,600 rows:


    INSERT INTO SA_REPORT_DATA
    (REPORT_ID, CUTOFF_DATE, COL_1, COL_2, COL_3)
    (
    SELECT 'ISRP-734', to_date('&DateTo', 'YYYY-MM-DD'),
           SNE.ID AS HLR
    ,      SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER AS NUMBER_RANGE
    ,      COUNT(M.MSISDN) AS AVAILABLE_MSISDNS
    FROM
           SA_NUMBER_RANGES SNR          -- 10,000 rows
    ,      SA_SERVICE_SYSTEMS SSS        --  1,643 rows
    ,      SA_NETWORK_ELEMENTS SNE       --    200 rows
    ,      SA_MSISDNS M                  --    72M rows
    WHERE
           SSS.SEQ = SNR.SRVSYS_SEQ
    AND    SSS.SYSTYP_ID = 'OMC HLR'
    AND    SNE.SEQ = SSS.NE_SEQ
    AND    SNR.ID_TYPE = 'M'
    AND    M.MSISDN  >= SNR.FROM_NUMBER
    AND    M.MSISDN  <= SNR.TO_NUMBER
    AND    M.STATE  = 'AVL'
    GROUP BY
           SNE.ID,SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER
    )  

The feature here is that we are counting ranges of MSISDNs: we take 10,000 number ranges (SNR) and join them with an inequality to a 72M row table. It’s perfectly conceivable that at some point the intermediate data set expands (not necessarily all at once) to literally tens of billions of rows that are then aggregated down to the 37,600 that are finally inserted.

The execution plan shows the optimizer joining the first three tables before doing a merge join between that result set and the relevant subset of the MSISDNs table – which means the MSISDNs have to be sorted and buffered (with a probable spill to disc) before they can be used. It would be interesting to see the rowsource execution stats for the query – partly to see how large the generated set became but also to see if the ranges involved were so large that most of the time went in constantly re-reading the sorted MSISDNs from the temporary tablespace.
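For anyone in a position to re-run the statement, a standard way (my addition, not something supplied in the OTN thread) to capture those stats is to enable rowsource statistics for the session and then pull the plan, with actuals, from memory:

alter session set statistics_level = all;

-- ... re-run the insert statement here ...

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

The A-Rows column would show how large the generated set became, and the Buffers and Reads columns would show how much re-reading of the sorted MSISDNs went on.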

As far as optimisation is concerned there are a couple of trivial things around the edges we can examine: we have 10,000 number ranges but insert 37,600 results, and the last stages of the plan generated those results, so we’ve scanned and aggregated the sorted MSISDNs 37,600 times. Clearly we could look for a better join order that eliminated any redundant number ranges early, did the minimal number of joins to MSISDN, aggregated, and only then scaled up to the 37,600 rows. With the best join order we might reduce the run time by a factor of around 3.76 (i.e. 37,600 / 10,000) or more. (But that’s still a couple of hours run time.)
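As a sketch of that idea (my reconstruction, untested against the OP’s data, and assuming the join to SA_MSISDNS really is the dominant cost) we could do the expensive range/MSISDN aggregation just once per number range in an inline view, then join the small aggregated result to the other two tables to fan out to the final 37,600 rows:

select
        sne.id                                  hlr,
        v.from_number||' - '||v.to_number       number_range,
        v.available_msisdns
from    (
        select
                snr.srvsys_seq, snr.from_number, snr.to_number,
                count(m.msisdn)         available_msisdns
        from
                sa_number_ranges        snr,
                sa_msisdns              m
        where
                snr.id_type = 'M'
        and     m.state     = 'AVL'
        and     m.msisdn between snr.from_number and snr.to_number
        group by
                snr.srvsys_seq, snr.from_number, snr.to_number
        )                       v,
        sa_service_systems      sss,
        sa_network_elements     sne
where
        sss.seq       = v.srvsys_seq
and     sss.systyp_id = 'OMC HLR'
and     sne.seq       = sss.ne_seq
;

To eliminate unwanted number ranges before the expensive join you would also want to push the SSS/SNE restrictions into the inline view, perhaps as an EXISTS subquery.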

What we really have to do to make a significant difference is change the infrastructure in some way – preferably invisibly to the rest of the application. There are a number of specific details relating to workload, read-consistency, timing, concurrency, etc. that will need to be considered but, broadly speaking, we need to take advantage of a table that effectively holds the “interesting” MSISDNs in sorted order. When considering the options it’s worth remembering that currently the result is “wrong” because by the time the 7 hour run is complete some (or even many) of the MSISDNs are probably no longer available – so how accurate does the report have to be?

I’ve kept the approach simple here, and it would need a few modifications for a production system. The important bit of the report is the bit that produces the count, so I’m only going to worry about a two-table join – number ranges and MSISDNs; here’s some model data:


execute dbms_random.seed(0)

create table msisdns
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      msisdn
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table number_ranges
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      from_number,
        trunc(dbms_random.value(1e9,1e10))      to_number
from
        generator       v1
where
        rownum  <= 1000
;

-- Both assignments read the pre-update column values, so this genuinely
-- swaps FROM_NUMBER and TO_NUMBER where a range was generated back to front.
update number_ranges set
        from_number = to_number,
        to_number = from_number
where
        to_number < from_number
;

commit;

I’ve created a table of numbers with values between 1e9 and 1e10 to represent 1 million MSISDNs, and a list of 1,000 number ranges – making sure that the FROM number is not greater than the TO number. Now I need a “summary” table of the MSISDNs, which I’m going to create as an index-organized table:


create table tmp_msisdns (
        msisdn,
        counter,
        constraint tmp_pk primary key (msisdn, counter)
)
organization index
as
select
        msisdn,
        row_number() over(order by msisdn)      counter
from
        msisdns
;

This is only a demonstration so I haven’t bothered with production-like code to check that the MSISDNs I had generated were unique (they were); and I’ve casually included the row_number() as part of the primary key as a performance fiddle, even though it’s something that could, technically, allow some other program to introduce bad data if I made the table available for public use rather than keeping it task-specific.
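For reference, a minimal version of that uniqueness check (my addition) is easy enough:

select  msisdn, count(*)
from    msisdns
group by
        msisdn
having  count(*) > 1
;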

Finally we get down to the report. To find out how many MSISDN values there are between the FROM and TO number in a range I just have to find the lowest and highest MSISDNs in that range, find the difference between their counter values, and add 1 (for example, if the lowest qualifying MSISDN has counter 15 and the highest has counter 42 then the range holds 1 + 42 – 15 = 28 values). And there’s a very fast way to find the lowest or highest value when you have the appropriate index – the index range scan (min/max) – but you have to access the table twice, once for the low, once for the high. Here’s the necessary SQL, with execution plan from 12.1.0.2:


select
        nr.from_number, nr.to_number,
--      fr1.msisdn, fr1.counter,
--      to1.msisdn, to1.counter,
        1 + to1.counter - fr1.counter range_count
from
        number_ranges   nr,
        tmp_msisdns     fr1,
        tmp_msisdns     to1
where
        fr1.msisdn = (
                select min(msisdn) from tmp_msisdns where tmp_msisdns.msisdn >= nr.from_number
        )
and     to1.msisdn = (
                select max(msisdn) from tmp_msisdns where tmp_msisdns.msisdn <= nr.to_number
        )
;

-------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |               |       |       |  4008 (100)|          |
|   1 |  NESTED LOOPS                   |               |  1000 | 38000 |  4008   (1)| 00:00:01 |
|   2 |   NESTED LOOPS                  |               |  1000 | 26000 |  2005   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL            | NUMBER_RANGES |  1000 | 14000 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN             | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   5 |     SORT AGGREGATE              |               |     1 |     7 |            |          |
|   6 |      FIRST ROW                  |               |     1 |     7 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN (MIN/MAX)| TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
|*  8 |   INDEX RANGE SCAN              | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   9 |    SORT AGGREGATE               |               |     1 |     7 |            |          |
|  10 |     FIRST ROW                   |               |     1 |     7 |     3   (0)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN (MIN/MAX) | TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("FR1"."MSISDN"=)
   7 - access("TMP_MSISDNS"."MSISDN">=:B1)
   8 - access("TO1"."MSISDN"=)
  11 - access("TMP_MSISDNS"."MSISDN"<=:B1)

Execution time – with 1 million MSISDNs and 1,000 ranges: 0.11 seconds.

For comparative purposes, and to check that the code is producing the right answers, here’s the basic inequality join method:


select
        nr.from_number, nr.to_number, count(*) range_count
from
        number_ranges   nr,
        msisdns         ms
where
        ms.msisdn >= nr.from_number
and     ms.msisdn <= nr.to_number
group by
        nr.from_number, nr.to_number
order by
        nr.from_number
;

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |       |       |       |   472K(100)|          |
|   1 |  HASH GROUP BY        |               |   707K|    14M|  6847M|   472K (17)| 00:00:19 |
|   2 |   MERGE JOIN          |               |   255M|  5107M|       | 13492  (77)| 00:00:01 |
|   3 |    SORT JOIN          |               |  1000 | 14000 |       |     3  (34)| 00:00:01 |
|   4 |     TABLE ACCESS FULL | NUMBER_RANGES |  1000 | 14000 |       |     2   (0)| 00:00:01 |
|*  5 |    FILTER             |               |       |       |       |            |          |
|*  6 |     SORT JOIN         |               |  1000K|  6835K|    30M|  3451   (7)| 00:00:01 |
|   7 |      TABLE ACCESS FULL| MSISDNS       |  1000K|  6835K|       |   245  (14)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("MS"."MSISDN"<="NR"."TO_NUMBER")
   6 - access("MS"."MSISDN">="NR"."FROM_NUMBER")
       filter("MS"."MSISDN">="NR"."FROM_NUMBER")

The two queries produced the same results (apart from ordering), but the second query took 2 minutes 19.4 seconds to complete.
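As a simple way of automating that cross-check (my addition, not part of the original test), a pair of MINUS operations over the two result sets should both come back empty:

with iot_counts as (
        select
                nr.from_number, nr.to_number,
                1 + to1.counter - fr1.counter   range_count
        from
                number_ranges   nr,
                tmp_msisdns     fr1,
                tmp_msisdns     to1
        where
                fr1.msisdn = (select min(msisdn) from tmp_msisdns where tmp_msisdns.msisdn >= nr.from_number)
        and     to1.msisdn = (select max(msisdn) from tmp_msisdns where tmp_msisdns.msisdn <= nr.to_number)
),
join_counts as (
        select
                nr.from_number, nr.to_number, count(*) range_count
        from
                number_ranges   nr,
                msisdns         ms
        where
                ms.msisdn between nr.from_number and nr.to_number
        group by
                nr.from_number, nr.to_number
)
(select * from iot_counts minus select * from join_counts)
union all
(select * from join_counts minus select * from iot_counts)
;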

Update:

In a moment of idle curiosity I recreated the data with 40 Million rows in the MSISDNs table to get some idea of how fast the entire report process could go when re-engineered (remember the OP has 72M rows, but selects the subset flagged as ‘AVL’). It took 1 minute 46 seconds to create the IOT – after which the report for 1,000 number ranges still took less than 0.2 seconds.

Footnote:

My random generation of data doesn’t attempt to avoid duplicate MSISDNs, but they are unlikely to appear in the 1M row test; the 40M row test, however, will almost certainly produce a small percentage of duplicates. As a consequence the final result may exceed 1,000 rows by one or two, but since this is just a demonstration of the principle and a quick performance check that doesn’t worry me.
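If the duplicates did matter, one option (my variation, not something from the original text) would be to build the IOT from the distinct values so that each MSISDN gets exactly one counter:

create table tmp_msisdns (
        msisdn,
        counter,
        constraint tmp_pk primary key (msisdn, counter)
)
organization index
as
select
        msisdn,
        row_number() over(order by msisdn)      counter
from    (
        select distinct msisdn from msisdns
        )
;

Note that this changes the meaning of the result slightly – the report would then count distinct MSISDNs in each range rather than rows.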

Update 12th Dec 2015

Stew Ashton was in the audience at the UKOUG Tech 15 conference when I described the problem in a presentation on “avoiding work you don’t need to do” and immediately suggested using match_recognize() as an alternative strategy for solving the problem. He has since published his solution for two different scenarios on his blog.

When I tested his “overlapping ranges” solution against my temporary table solution it produced two advantages – first, it ran twice as quickly to produce the result; secondly, it didn’t need an auxiliary table. Having brought the run-time down from hours to a minute or two, the second advantage is probably rather more desirable than the first – especially since you then don’t have to mess around producing a point-in-time read-consistent result. Of course, you may still base your choice of method on the ease of comprehension of the code – and in the short term the match_recognize() mechanism is not well-known.
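Stew’s blog is the place to go for the real code, but as a rough illustration of the shape of the idea for the simple (non-overlapping) case here’s a sketch of my own (not his code): union the range rows with the MSISDN rows, sort the lot once, and let match_recognize() count the MSISDNs that follow each range start:

select  from_number, to_number, range_count
from    (
        select
                msisdn                  num,
                cast(null as number)    from_number,
                cast(null as number)    to_number
        from    msisdns
        union all
        select  from_number, from_number, to_number
        from    number_ranges
        )
match_recognize (
        order by num, from_number nulls last
        measures
                r.from_number   as from_number,
                r.to_number     as to_number,
                count(m.num)    as range_count
        pattern ( r m* )
        define
                r as r.from_number is not null,
                m as m.from_number is null and m.num <= r.to_number
)
order by from_number
;

The join disappears in favour of a single sort of the union, which is presumably where the factor of two comes from; the overlapping-ranges case needs the subtler treatment described in Stew’s post.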

8 Comments

  1. Storing the ranges in a single row seems to invoke the old space-time tradeoff, where saving space ends up costing you in processing time. I would be interested to see the impact of instead storing the relationship in the classic way, with a many-to-many relationship.

    The old table:
    SA_NUMBER_RANGES (from_number, to_number)

    replaced by two tables:
    SA_NUMBER_CATEGORIES (category_id, from_number, to_number)
    SA_MSISDNS_TO_NUM_CATEGORIES (msisdn, category_id)

    Populating the many-to-many would require the 7-hour insert once, but then keeping it up to date shouldn’t be too bad, assuming that msisdn values and number ranges aren’t updated often. Then again, those assumptions might be the reason for the “optimization” in the first place.

    Comment by Francisco — April 10, 2015 @ 6:06 pm BST Apr 10,2015 | Reply

    • The scale of the problem depends to a huge extent on the size and overlap that’s allowed in the ranges; I’ve assumed that the 7 hour run time is due to the ranges being very large and many of the ranges having significant overlaps, leading to the need to process an intermediate result set of billions or tens of billions of rows. If, on the other hand, the ranges are small and non-overlapping, say 7,200 rows for each of 10,000 ranges to get 72M rows, then there’s probably a completely different reason why the insert takes 7 hours.

      If we assume, for the moment, that my guess about distribution is correct then, unless I’ve misinterpreted your comments, your approach is going to have to populate a table of billions of rows and index it twice. The original query (which looked only for MSISDNs flagged as ‘AVL’) then has to join SA_MSISDNS to SA_MSISDNS_TO_NUM_CATEGORIES to count the relevant MSISDNs by category_id.

      Comment by Jonathan Lewis — April 11, 2015 @ 7:52 am BST Apr 11,2015 | Reply

  2. […] 10 seconds” trick for a client using a technique that is conceptually very simple but, like my example from last week, falls outside the pattern of generic SQL. The problem (with some camouflage) is as follows: we […]

    Pingback by Cartesian join | Oracle Scratchpad — April 15, 2015 @ 6:41 pm BST Apr 15,2015 | Reply

  3. […] the UKOUG Tech15 conference, Jonathan Lewis presented a question asked on OTN and the answer published on his blog. The problem is how to count the number of rows in one table that fall into the ranges stored […]

    Pingback by Summarize Data by Range | An Oracle Programmer — December 12, 2015 @ 3:30 pm BST Dec 12,2015 | Reply

  4. Hello Jonathan,

    The pingback in comment 3 links to the alternative I mentioned after your Tech15 session. I found a simple solution when the ranges don’t overlap and a less simple one when they do. Response time is comparable to your solution.

    Your session was outstanding as usual, but especially enjoyable and thought-provoking to me because it involved SQL, its limits in solving problems and how to overcome those limits.

    Kindest regards, Stew

    Comment by stewashton — December 12, 2015 @ 3:43 pm BST Dec 12,2015 | Reply

    • Stew,

      Thanks for the comment – I enjoyed the session and the comments from the audience. And I was very impressed with how quickly you produced the two solutions.

      I clearly ought to take some time to get to grips with match_recognize(): it’s pretty amazing that you’ve effectively made it turn a join into a UNION ALL to cut out a massive amount of work – so I’m beginning to wonder how often it could be used in other “brontosaurus” queries in the same sort of way.

      Comment by Jonathan Lewis — December 12, 2015 @ 6:12 pm BST Dec 12,2015 | Reply

  5. “Of course, you may still base your choice of method on the ease of comprehension of the code – and in the short term the match_recognize() mechanism is not well-known.”

    True on both counts, at least “in the short term” as you say.

    Comment by stewashton — December 12, 2015 @ 5:55 pm BST Dec 12,2015 | Reply

    • Stew,

      With the examples you’ve produced of match_recognize() maybe it will become well-known much more rapidly than all the analytic stuff (which still only appears very rarely in production systems).

      Comment by Jonathan Lewis — December 12, 2015 @ 6:13 pm BST Dec 12,2015 | Reply

