Oracle Scratchpad

March 12, 2021

Distributed Sequences

Filed under: distributed,Execution plans,Oracle,Performance,Problem Solving — Jonathan Lewis @ 9:09 am GMT Mar 12,2021

A request for help came up some time ago on ODC reporting a query that was hanging when it included a sequence.nextval. In fact the intial “query” was an “insert as select” with a select that was a join of two remote tables.

Making the fairly automatic assumption that many people say “hanging” when they really mean “hasn’t finished yet” the first thought I had about the structure of the statement was that it was just a variation of the standard problem of distributed DML. I haven’t written anything previously about how using sequences can introduce the problem so here’s a note to demonstrate the issue and suggest a workaround:

We start with a database link and a few tables:


rem
rem     Script:         distributed_sequence.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2019
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             18.3.0.0
rem             12.2.0.1
rem

rem     create public database link orcl@loopback using 'orcl';

define m_target=orcl@loopback

create sequence s1 cache 10000;
select s1.nextval from dual;

create table t1
segment creation immediate
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        rownum                          n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1
;

create table t2
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        rownum                          n2,
        lpad(rownum,10,'0')             v2,
        lpad('x',100,'x')               padding
from
        generator       v1
;

create table t3(
        id1     number(6,0),
        id2     number(6,0),
        n0      number(6,0),
        n1      number(6,0),
        n2      number(6,0),
        v1      varchar2(10),
        v2      varchar2(10)
)
segment creation immediate
;

create or replace view v1 as
select
        t1.id id1,
        t2.id id2,
        t1.n1,
        t2.n2,
        t1.v1,
        t2.v2
from
        t1, t2
where
        t2.id = t1.id
;

The command to create a public database link (one example of the many optional commands in my original source) has to be run by a suitably privileged schema as a one-off event. The definition of the substitution variable m_target that I’ll be using as my database link (again with many possibilities in my original script) means I only have to edit my choice of database link once in my script as I change Oracle instances. You’ll notice I’ve done my usual trick of using a loopback database link to emulate a distributed system.

I’ve then created two populated tables (t1, t2) and a third empty table (t3) that will be the target of an insert. I’ve also created a view (v1) that joins the first two tables and a sequence (s1) that I’ve primed with a single call to nextval. When I get to the tests I’ll be using t1, t2 and v1 as if they had been created in the remote database (referenced through my loopback database link) while t3 and s1 will be local objects.

So let’s run a couple of statements and see what their execution plans look like:


set serveroutput off

prompt  =======================
prompt  Insert without sequence
prompt  =======================

insert into t3 (id1, id2, n0, n1, n2, v1, v2)
select
        t1.id,
        t2.id,
        0,
        t1.n1,
        t2.n2,
        t1.v1,
        t2.v2
from
        t1@&m_target    t1,
        t2@&m_target    t2
where
        t2.id = t1.id
;

select * from table(dbms_xplan.display_cursor(format=>'-plan_hash'));

prompt  =======================
prompt  Insert WITH sequence
prompt  =======================

insert into t3 (id1, id2, n0, n1, n2, v1, v2)
select
        t1.id,
        t2.id,
        s1.nextval,
        t1.n1,
        t2.n2,
        t1.v1,
        t2.v2
from
        t1@&m_target    t1,
        t2@&m_target    t2
where
        t2.id = t1.id
;

select * from table(dbms_xplan.display_cursor(format=>'-plan_hash'));
commit;

The code joins t1 and t2, selects a couple of columns and inserts then into t3 without or with a call to s1.nextval (the local sequence). Here’s the output (with minor cosmetic changes) from executing this code under 19.3.0.0 (the effects are the same in 18.3.0.0 and 12.2.0.1):


=======================
Insert without sequence
=======================

10000 rows created.


SQL_ID  373nz810u3frv, child number 0
-------------------------------------
insert into t3 (id1, id2, n0, n1, n2, v1, v2) select  t1.id,  t2.id,
0,  t1.n1,  t2.n2,  t1.v1,  t2.v2 from  t1@orclpdb@loopback t1,
t2@orclpdb@loopback t2 where  t2.id = t1.id

-----------------------------------------------------------------
| Id  | Operation                | Name | Cost  | Inst   |IN-OUT|
-----------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |        |      |
|   1 |  LOAD TABLE CONVENTIONAL | T3   |       |        |      |
|   2 |   REMOTE                 |      |       | ORCLP~ | R->S |
-----------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------
   2 - SELECT "A2"."ID","A1"."ID",0,"A2"."N1","A1"."N2","A2"."V1","A1"."
       V2" FROM "T1" "A2","T2" "A1" WHERE "A1"."ID"="A2"."ID" (accessing
       'ORCLPDB.LOCALDOMAIN@LOOPBACK' )

Note
-----
   - cpu costing is off (consider enabling it)


=======================
Insert WITH sequence
=======================

10000 rows created.


SQL_ID  8jg23arujnh01, child number 0
-------------------------------------
insert into t3 (id1, id2, n0, n1, n2, v1, v2) select  t1.id,  t2.id,
s1.nextval,  t1.n1,  t2.n2,  t1.v1,  t2.v2 from  t1@orclpdb@loopback
t1,  t2@orclpdb@loopback t2 where  t2.id = t1.id

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |       |       |    54 (100)|          |        |      |
|   1 |  LOAD TABLE CONVENTIONAL | T3   |       |       |            |          |        |      |
|   2 |   SEQUENCE               | S1   |       |       |            |          |        |      |
|*  3 |    HASH JOIN             |      | 10000 |   937K|    54   (8)| 00:00:01 |        |      |
|   4 |     REMOTE               | T1   | 10000 |   468K|    26   (4)| 00:00:01 | ORCLP~ | R->S |
|   5 |     REMOTE               | T2   | 10000 |   468K|    26   (4)| 00:00:01 | ORCLP~ | R->S |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T2"."ID"="T1"."ID")

Remote SQL Information (identified by operation id):
----------------------------------------------------
   4 - SELECT /*+ OPAQUE_TRANSFORM */ "ID","N1","V1" FROM "T1" "A2" (accessing
       'ORCLPDB.LOCALDOMAIN@LOOPBACK' )

   5 - SELECT /*+ OPAQUE_TRANSFORM */ "ID","N2","V2" FROM "T2" "A1" (accessing
       'ORCLPDB.LOCALDOMAIN@LOOPBACK' )

The key thing to notice is that when we want to insert the local sequence as a column in the select list Oracle breaks the hash join into two separate accesses to the remote database and pulls all the data we might need from the two tables before trying to join them locally. When the query is “fully remote” the local database can let the remote database deal with the join, when the query is distributed – which is a side effect of introducing the sequence – the local site becomes the driving site and has to work out the least worst way of handling the join, which might be much slower than the remote join.  (It’s an odd little quirk that when the select is fully remote the optimizer thinks that it’s not using CPU costing. Possibly that’s because all the arithmetic happens somewhere else and the local cost of the query never gets above zero.)

This is one of those cases where we might work around the problem by creating a remote view to handle the join – hence the creation of the view v1; here’s a suitable statement and the resulting execution plan:

prompt  ==============================
prompt  Insert using view and sequence
prompt  ==============================

set serveroutput off

insert into t3 (id1, id2, n0, n1, n2, v1, v2)
select
        v1.id1,
        v1.id2,
        s1.nextval,
        v1.n1,
        v1.n2,
        v1.v1,
        v1.v2
from
        v1@&m_target v1
;

select * from table(dbms_xplan.display_cursor(format=>'-plan_hash'));
commit;



==============================
Insert using view and sequence
==============================

10000 rows created.


SQL_ID  4tz0rrqt87nb8, child number 0
-------------------------------------
insert into t3 (id1, id2, n0, n1, n2, v1, v2) select  v1.id1,  v1.id2,
s1.nextval,  v1.n1,  v1.n2,  v1.v1,  v1.v2 from  v1@orclpdb@loopback v1

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |       |       |    27 (100)|          |        |      |
|   1 |  LOAD TABLE CONVENTIONAL | T3   |       |       |            |          |        |      |
|   2 |   SEQUENCE               | S1   |       |       |            |          |        |      |
|   3 |    REMOTE                | V1   | 10000 |   937K|    27   (8)| 00:00:01 | ORCLP~ | R->S |
-------------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------
   3 - SELECT /*+ OPAQUE_TRANSFORM */ "ID1","ID2","N1","N2","V1","V2" FROM "V1" "V1"
       (accessing 'ORCLPDB.LOCALDOMAIN@LOOPBACK' )

As you can see, the local optimizer doesn’t know enough about the remote view to be able to split it into components and make a mess of the execution plan, it simply sees a query against a “single table” and leaves the remote database to worry about optimising it. But, of course, we don’t always have the luxury of being able to create objects on someone else’s database, so what’s the alternative?

Try rewriting the query to use an inline view with the /*+ no_merge() */ hint:


prompt  =======================================
prompt  Insert from a no-merge inline view with
prompt  a sequence.nextval in the outer query
prompt  ========================================

set serveroutput off

insert into t3 (id1, id2, n0, n1, n2, v1, v2)
select
        id1, id2, s1.nextval, n1, n2, v1, v2
from    (
        select  /*+ no_merge */
                t1.id   id1,    
                t2.id   id2,
                0       n0,
                t1.n1,
                t2.n2,
                t1.v1,
                t2.v2
        from
                t1@&m_target    t1,
                t2@&m_target    t2
        where
                t2.id = t1.id
        )       v1
;

select * from table(dbms_xplan.display_cursor(format=>'-plan_hash'));
commit;


=======================================
Insert from a no-merge inline view with
a sequence.nextval in the outer query
========================================

10000 rows created.


SQL_ID  20z81g550tbsk, child number 0
-------------------------------------
insert into t3 (id1, id2, n0, n1, n2, v1, v2) select  id1, id2,
s1.nextval, n1, n2, v1, v2 from (  select /*+ no_merge */   t1.id id1,
 t2.id id2,   0 n0,   t1.n1,   t2.n2,   t1.v1,   t2.v2  from
t1@orclpdb@loopback t1,   t2@orclpdb@loopback t2  where   t2.id = t1.id
 ) v1

-------------------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |       |       |    54 (100)|          |        |      |
|   1 |  LOAD TABLE CONVENTIONAL | T3   |       |       |            |          |        |      |
|   2 |   SEQUENCE               | S1   |       |       |            |          |        |      |
|   3 |    VIEW                  |      | 10000 |   937K|    54   (8)| 00:00:01 |        |      |
|   4 |     REMOTE               |      |       |       |            |          | ORCLP~ | R->S |
-------------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------
   4 - SELECT /*+ NO_MERGE */ "A2"."ID","A1"."ID",0,"A2"."N1","A1"."N2","A2"."V1","A1"."V
       2" FROM "T1" "A2","T2" "A1" WHERE "A1"."ID"="A2"."ID" (accessing
       'ORCLPDB.LOCALDOMAIN@LOOPBACK' )


The hint has done exactly what we needed: it has passed the text of the inline view to the remote database for optimisation so the join happens remotely, and the sequence number is then included after the result set comes back from the remote database. The SQL is a little messier, of course, mostly thanks to the doubled appearance of the columns in the select list.

Summary

Using a local sequence in DML that accesses a remote database makes the optimizer treats the underlying query as a distributed query, and this may mean it can’t find an efficient execution path unless you do some re-engineering of the code. If you can manage to make an insert with a constant efficient then using that version of the code as an in-line no_merge view with one extra layer that brings the sequence into play may be all you need to do to make the DML operate efficiently.

1 Comment »

  1. […] start with the database link, tables and sequence I created in the previous article (link will open in a separate tab/window for easy viewing), but this time I’m going to do […]

    Pingback by Distributed Sequences | Oracle Scratchpad — June 1, 2021 @ 5:55 pm BST Jun 1,2021 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.