IOT Load

November 27, 2012

IOT Load

Filed under: Infrastructure,IOT,Oracle — Jonathan Lewis @ 5:15 pm GMT Nov 27,2012

When I introduced Connor McDonald’s blog a few days ago, it was because we had exchanged a couple of email messages (through the Oak Table Network) about how to minimise the resource consumption when copying a load of data from one IOT to another of the same structure. His problem was the way in which the obvious way of copying the data resulted in a massive sort even though, in principle, it should not have been necessary to sort anything since the data could have been extracted in order by walking the existing IOT.

As a suggestion I referenced a comment I had made in the Addenda to Practical Oracle 8i about 12 years ago when I had first solved the problem of loading an IOT with minimal logging and no sorting. At the time I had been loading data from a sorted file into an empty table that was then going to be exchanged into a partitioned IOT – but it crossed my mind that loading from a flat file and loading from a UNIX pipe were pretty much the same thing, so perhaps Connor could workaround his problem by making one session spool to a pipe while another session was reading it. In the end, he simply created a massive temporary tablespace, but I thought I’d modify a test script I wrote a few years ago to demonstrate my idea – and here it is:

rem
rem     Script:         iot_load.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2003
rem     Updated:        Nov 2012
rem

rem
rem     create an empty IOT
rem     Note the columns used for the PK
rem

create table iot_load(
        id      number,
        name    varchar2(32),
        padding varchar2(100),
        constraint iot_pk primary key(name, id)
)
organization index
;

rem
rem     Create a load file
rem     But since we are running UNIX use the mknod command to
rem     create a pipe before spooling to that pipe.
rem

set pagesize 0
set feedback off
set newpage 0
set trimspool on
set linesize 180
set timing off
set termout off

column object_name format a38

host mknod /tmp/iot_load.dat p

spool /tmp/iot_load.dat

select 
        r, object_name, padding
from
(
        select
                rownum            r,
                object_name,
                rpad('x',100,'x') padding
        from
                all_Objects
        where
                rownum <= 10000 -- > comment to avoid wordpress format issue
)
order by
        object_name, r
;

spool off

This script creates a target IOT, and uses a simple select statement (you get a clue about how old this code is by the fact that I’ve referenced all_objects, and not a “connect by” with dual) to generate a set of date that is sorted in an order that is a suitable match for the definition of the IOT. In Connor’s case he could have used a simple ‘select from source IOT order by primary key’. I’ve set up a few SQL*Plus environment details to make the output completely flat and undecorated. A key point, though, is that I’m spooling to a file called /tmp/iot_load.dat and that file has been pre-created as a UNIX pipe. When you run this script the select statement will hang almost immediately because the pipe will become full, at which point you need another process to start emptying it. So from another UNIX session run the following (changing the userid and password as necessary):

sqlldr userid=test_user/test control=iot_load data=/tmp/iot_load.dat

Here’s the content of the control file iot_load.ctl

OPTIONS (direct = true)
UNRECOVERABLE
LOAD DATA
TRUNCATE
INTO table iot_load
sorted indexes (iot_pk)
(
        id              position(01:10) char,
        name            position(12:49) char,
        padding         position(51:150) char
)

We use direct path load (which means virtually no undo) in unrecoverable mode (which means virtually no redo). But the unrecoverable bit only works because we start by truncating the IOT, and promising that the incoming data is sorted in order of the IOT’s primary key index. And since the data is appearing in order, Oracle doesn’t have to sort it before inserting it. Net result: no undo, no redo (apart from metadata) and no sorting – i.e. minimum overhead for loading the IOT – we simply walk the source in order and copy it into the target.

As this session reads from the pipe, the first session can resume writing into the pipe, so the data flows from source to target through a small piece of shared memory (technically a bit of the file-system buffer, I suppose) until the source session closes the pipe with its spool off.

Update (Feb 2021)

Revisiting Connor’s blog I saw that Mathias Rogel had added a suggestion a couple of years on (so, perhaps, a newer version of Oracle) that there was a NOSORT option in the syntax for “organization index”, hence:

create table T_TGT ( n , r , constraint T_TGT_PK primary key( n ))
organization index NOSORT
nologging
as select * from T_SOURCE order by 1
/

This doesn’t quite work. In 19c I have a test which tries to create a 50,000 row IOT by copying in order a 50,000 rows IOT. The select avoids any sorting by doing a full index scan, but part of the internal code for creating an IOT is statement to create an index as the “IOT TOP” segment. Typically it looks something like the following – which I extracted after running tkprof against a tracefile for a statement that created IOT3 from IOT1:

CREATE UNIQUE INDEX "TEST_USER"."SYS_IOT_TOP_105150" on "TEST_USER"."IOT3"("OBJECT_ID") 
INDEX ONLY TOPLEVEL
NOLOGGING NOPARALLEL as
select   ...


Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  INDEX BUILD UNIQUE SYS_IOT_TOP_105150 (cr=50049 pr=0 pw=384 time=292319 us starts=1)(object id 0)
     50000      50000      50000   SORT CREATE INDEX (cr=50001 pr=0 pw=0 time=64659 us starts=1 cost=390 size=2400000 card=50000)
     50000      50000      50000    SORT CREATE INDEX (cr=50001 pr=0 pw=0 time=45852 us starts=1 cost=390 size=2400000 card=50000)
     50000      50000      50000     INDEX FULL SCAN IOT1_PK (cr=50001 pr=0 pw=0 time=36708 us starts=1 cost=390 size=2400000 card=50000)(object id

As you can see Oracle has done an index full scan to get the data – and that would produce it in the right order to fill the index. Then it has sorted it twice! And with a 10032 (Sort) trace included it’s easy to see that it has sorted data that’s already sorted in exactly the same way.

Adding the (undocumented) NOSORT option to the create table statement, the plan reported by tkprof changes to:


CREATE UNIQUE INDEX "TEST_USER"."SYS_IOT_TOP_105148" on "TEST_USER"."IOT2"("OBJECT_ID") 
INDEX ONLY TOPLEVEL 
NOSORT NOLOGGING NOPARALLEL as
select ...

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  INDEX BUILD UNIQUE SYS_IOT_TOP_105148 (cr=50049 pr=0 pw=384 time=292877 us starts=1)(object id 0)
     50000      50000      50000   SORT CREATE INDEX (cr=50001 pr=0 pw=0 time=48567 us starts=1 cost=390 size=2400000 card=50000)
     50000      50000      50000    INDEX FULL SCAN IOT1_PK (cr=50001 pr=0 pw=0 time=34790 us starts=1 cost=390 size=2400000 card=50000)(object id 105147)

With the addition of the NOSORT option we see it echoed in the Create Index command, and one of the two sorts disappears from the plan (and from the 10032 trace and from the session statistics).

The reason we do two “sort create index” operations is because of an “order by” clause I added to my “create table” statement – but if I take that out Oracle insists on doing an index fast full scan – which results in no sorting if the NOSORT is in place, but in the general case that means the statement could fail because the fast full scan won’t visit the index leaf blocks in key order. (It does, however, reduce eliminate one of the two sorts in the case where I haven’t used the NOSORT option).

Bottom line – maybe the pipe is the only way to do an absolute minimum of work at present.

Comments (12)

12 Comments »

Hello Jonathan,

After a quick search of the documentation, I don’t see any equivalent to the “sorted indexes” clause when loading from an external table. Is there such a thing, or is this one advantage of SQL*Loader over external tables?

Comment by Stew Ashton — November 27, 2012 @ 8:14 pm GMT Nov 27,2012 | Reply
- Stew,
  
  This option exists simply to tell Oracle that it can physically build an index in order as the data arrives, but the raw data from external tables never ends up in the database, so never needs to be indexed, so the feature doesn’t apply.
  
  There is an option with sqlldr, though to turn a control file into an external table definition (sqlldr … external_table = generate_only) so I ran through a little test with my control file to see what would happen:
```
[jonathan@linux01 working]$  sqlldr userid=test_user/test control=iot_load data=/tmp/iot_load.dat external_table=generate_only
```
  Gradually hacking my way through the control file, I got the following errors:
```
SQL*Loader-144: Conflicting load methods: direct=true/external_table=generate_only specified.
SQL*Loader-268: UNRECOVERABLE keyword may be used only in direct path.
SQL*Loader-256: SORTED INDEXES option allowed only for direct path
```
  Comment by Jonathan Lewis — December 13, 2012 @ 6:54 pm GMT Dec 13,2012 | Reply
Reblogged this on lava kafle kathmandu nepal.

Comment by lkafle — November 28, 2012 @ 3:22 am GMT Nov 28,2012 | Reply
Nice post and a good example of re-using old code. Do you happen to know if there is an equivalent windows utility to the unix mknod program for the pipe function?

Incidentally, and off topic, I note your ‘Curent Moon’ status is behind. If you pop your head out of the window you’ll see it’s a full moon today!

Comment by Tony Sleight — November 28, 2012 @ 7:13 am GMT Nov 28,2012 | Reply
- How long is the full moon? A second? A minute? Wasn’t when you posted just before and now aren’t we just past the Full Beaver Moon?
  
  MS seems to have examples of named pipes, but it seems obsolete: https://forums.oracle.com/forums/thread.jspa?threadID=2243304
  
  Comment by jgarry — November 29, 2012 @ 5:41 pm GMT Nov 29,2012 | Reply
- Tony,
  
  I did a Google search for “mknod for windows” and the first hit was “CoreUtils for Windows” http://gnuwin32.sourceforge.net/packages/coreutils.hten't which claimed to have all you need.
  I haven’t tried it though.
  
  (I haven’t been watching the moon display – I may delete it soon.)
  
  Comment by Jonathan Lewis — December 13, 2012 @ 6:57 pm GMT Dec 13,2012 | Reply
As an off-topic, may i know whats the purpose of rpad(‘x’,100,’x’) padding in your example?

Thanks,

Comment by pandeesh — November 28, 2012 @ 7:28 am GMT Nov 28,2012 | Reply
- Pandeesh,
  
  It’s a general purpose bit of padding that I use to create rows of a reasonable size. It’s just an attempt to avoid extreme boundary conditions of very short rows in some of my tests.
  
  Comment by Jonathan Lewis — December 13, 2012 @ 6:58 pm GMT Dec 13,2012 | Reply
When Oracle is able to write the ‘select’ content into pipe(spool file), why Oracle is not able to write the 10046 tracing content to pipe(10046 trace file) ?

-Yasser

Comment by Yasser Khan — December 5, 2012 @ 10:13 am GMT Dec 5,2012 | Reply
- Yasser,
  
  Since there’s no mechanical reason why it shouldn’t be possible, I think the answer has to be that the code has been fixed to avoid that possibility. (I tried to create a pipe with the name of a trace file that had not yet been opened, but when I enabled tracing (dbms_monitor call) the session didn’t try to write to the pipe. I suppose I could try using truss (or tusc, or some such) to see exactly what the session did – but I’d rather just suggest that Oracle doesn’t allow you to trace to a file because the session would have to stop and wait for another process to read from the pipe all the time, and this might cause problems.
  
  I think Tanel Poder once demonstrated a cunning trick where he managed to get the trace file written to the screen as he ran some SQL – but I don’t remember how he did it, or even what platform he was running on at the time. (You could try searching his blog.)
  
  Comment by Jonathan Lewis — December 13, 2012 @ 9:48 pm GMT Dec 13,2012 | Reply
  - Thanks for your time in researching into this issue. At last i was able to simulate it in RHEL but not in SUSE and SOLARIS.
    
    By the way its Tanel Poder’s trick which i was trying to simulate it.
    
    Comment by Yasser Khan — December 17, 2012 @ 10:59 am GMT Dec 17,2012 | Reply
[…] Thoughts on loading an empty IOT as quickly as possible (Nov 2012) […]

Pingback by IOTs | Oracle Scratchpad — February 11, 2021 @ 5:46 pm GMT Feb 11,2021 | Reply