Oracle Scratchpad

November 27, 2012

IOT Load

Filed under: Infrastructure,IOT,Oracle — Jonathan Lewis @ 5:15 pm GMT Nov 27,2012

When I introduced Connor McDonald’s blog a few days ago, it was because we had exchanged a couple of email messages (through the Oak Table Network) about how to minimise the resource consumption when copying a load of data from one IOT to another of the same structure. His problem was the way in which the obvious way of copying the data resulted in a massive sort even though, in principle, it should not have been necessary to sort anything since the data could have been extracted in order by walking the existing IOT.

As a suggestion I referenced a comment I had made in the Addenda to Practical Oracle 8i about 12 years ago when I had first solved the problem of loading an IOT with minimal logging and no sorting. At the time I had been loading data from a sorted file into an empty table that was then going to be exchanged into a partitioned IOT – but it crossed my mind that loading from a flat file and loading from a UNIX pipe were pretty much the same thing, so perhaps Connor could workaround his problem by making one session spool to a pipe while another session was reading it. In the end, he simply created a massive temporary tablespace, but I thought I’d modify a test script I wrote a few years ago to demonstrate my idea – and here it is:

rem     Script:         iot_load.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2003
rem     Updated:        Nov 2012

rem     create an empty IOT
rem     Note the columns used for the PK

create table iot_load(
        id      number,
        name    varchar2(32),
        padding varchar2(100),
        constraint iot_pk primary key(name, id)
organization index

rem     Create a load file
rem     But since we are running UNIX use the mknod command to
rem     create a pipe before spooling to that pipe.

set pagesize 0
set feedback off
set newpage 0
set trimspool on
set linesize 180
set timing off
set termout off

column object_name format a38

host mknod /tmp/iot_load.dat p

spool /tmp/iot_load.dat

        r, object_name, padding
                rownum            r,
                rpad('x',100,'x') padding
                rownum <= 10000 -- > comment to avoid wordpress format issue
order by
        object_name, r

spool off

This script creates a target IOT, and uses a simple select statement (you get a clue about how old this code is by the fact that I’ve referenced all_objects, and not a “connect by” with dual) to generate a set of date that is sorted in an order that is a suitable match for the definition of the IOT. In Connor’s case he could have used a simple ‘select from source IOT order by primary key’. I’ve set up a few SQL*Plus environment details to make the output completely flat and undecorated. A key point, though, is that I’m spooling to a file called /tmp/iot_load.dat and that file has been pre-created as a UNIX pipe. When you run this script the select statement will hang almost immediately because the pipe will become full, at which point you need another process to start emptying it. So from another UNIX session run the following (changing the userid and password as necessary):

sqlldr userid=test_user/test control=iot_load data=/tmp/iot_load.dat

Here’s the content of the control file iot_load.ctl

OPTIONS (direct = true)
INTO table iot_load
sorted indexes (iot_pk)
        id              position(01:10) char,
        name            position(12:49) char,
        padding         position(51:150) char

We use direct path load (which means virtually no undo) in unrecoverable mode (which means virtually no redo). But the unrecoverable bit only works because we start by truncating the IOT, and promising that the incoming data is sorted in order of the IOT’s primary key index. And since the data is appearing in order, Oracle doesn’t have to sort it before inserting it. Net result: no undo, no redo (apart from metadata) and no sorting – i.e. minimum overhead for loading the IOT – we simply walk the source in order and copy it into the target.

As this session reads from the pipe, the first session can resume writing into the pipe, so the data flows from source to target through a small piece of shared memory (technically a bit of the file-system buffer, I suppose) until the source session closes the pipe with its spool off.

Update (Feb 2021)

Revisiting Connor’s blog I saw that Mathias Rogel had added a suggestion a couple of years on (so, perhaps, a newer version of Oracle) that there was a NOSORT option in the syntax for “organization index”, hence:

create table T_TGT ( n , r , constraint T_TGT_PK primary key( n ))
organization index NOSORT
as select * from T_SOURCE order by 1

This doesn’t quite work. In 19c I have a test which tries to create a 50,000 row IOT by copying in order a 50,000 rows IOT. The select avoids any sorting by doing a full index scan, but part of the internal code for creating an IOT is statement to create an index as the “IOT TOP” segment. Typically it looks something like the following – which I extracted after running tkprof against a tracefile for a statement that created IOT3 from IOT1:

select   ...

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  INDEX BUILD UNIQUE SYS_IOT_TOP_105150 (cr=50049 pr=0 pw=384 time=292319 us starts=1)(object id 0)
     50000      50000      50000   SORT CREATE INDEX (cr=50001 pr=0 pw=0 time=64659 us starts=1 cost=390 size=2400000 card=50000)
     50000      50000      50000    SORT CREATE INDEX (cr=50001 pr=0 pw=0 time=45852 us starts=1 cost=390 size=2400000 card=50000)
     50000      50000      50000     INDEX FULL SCAN IOT1_PK (cr=50001 pr=0 pw=0 time=36708 us starts=1 cost=390 size=2400000 card=50000)(object id 

As you can see Oracle has done an index full scan to get the data – and that would produce it in the right order to fill the index. Then it has sorted it twice! And with a 10032 (Sort) trace included it’s easy to see that it has sorted data that’s already sorted in exactly the same way.

Adding the (undocumented) NOSORT option to the create table statement, the plan reported by tkprof changes to:

select ...

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  INDEX BUILD UNIQUE SYS_IOT_TOP_105148 (cr=50049 pr=0 pw=384 time=292877 us starts=1)(object id 0)
     50000      50000      50000   SORT CREATE INDEX (cr=50001 pr=0 pw=0 time=48567 us starts=1 cost=390 size=2400000 card=50000)
     50000      50000      50000    INDEX FULL SCAN IOT1_PK (cr=50001 pr=0 pw=0 time=34790 us starts=1 cost=390 size=2400000 card=50000)(object id 105147)

With the addition of the NOSORT option we see it echoed in the Create Index command, and one of the two sorts disappears from the plan (and from the 10032 trace and from the session statistics).

The reason we do two “sort create index” operations is because of an “order by” clause I added to my “create table” statement – but if I take that out Oracle insists on doing an index fast full scan – which results in no sorting if the NOSORT is in place, but in the general case that means the statement could fail because the fast full scan won’t visit the index leaf blocks in key order. (It does, however, reduce eliminate one of the two sorts in the case where I haven’t used the NOSORT option).

Bottom line – maybe the pipe is the only way to do an absolute minimum of work at present.


  1. Hello Jonathan,

    After a quick search of the documentation, I don’t see any equivalent to the “sorted indexes” clause when loading from an external table. Is there such a thing, or is this one advantage of SQL*Loader over external tables?

    Comment by Stew Ashton — November 27, 2012 @ 8:14 pm GMT Nov 27,2012 | Reply

    • Stew,

      This option exists simply to tell Oracle that it can physically build an index in order as the data arrives, but the raw data from external tables never ends up in the database, so never needs to be indexed, so the feature doesn’t apply.

      There is an option with sqlldr, though to turn a control file into an external table definition (sqlldr … external_table = generate_only) so I ran through a little test with my control file to see what would happen:

      [jonathan@linux01 working]$  sqlldr userid=test_user/test control=iot_load data=/tmp/iot_load.dat external_table=generate_only

      Gradually hacking my way through the control file, I got the following errors:

      SQL*Loader-144: Conflicting load methods: direct=true/external_table=generate_only specified.
      SQL*Loader-268: UNRECOVERABLE keyword may be used only in direct path.
      SQL*Loader-256: SORTED INDEXES option allowed only for direct path

      Comment by Jonathan Lewis — December 13, 2012 @ 6:54 pm GMT Dec 13,2012 | Reply

  2. Reblogged this on lava kafle kathmandu nepal.

    Comment by lkafle — November 28, 2012 @ 3:22 am GMT Nov 28,2012 | Reply

  3. Nice post and a good example of re-using old code. Do you happen to know if there is an equivalent windows utility to the unix mknod program for the pipe function?

    Incidentally, and off topic, I note your ‘Curent Moon’ status is behind. If you pop your head out of the window you’ll see it’s a full moon today!

    Comment by Tony Sleight — November 28, 2012 @ 7:13 am GMT Nov 28,2012 | Reply

  4. As an off-topic, may i know whats the purpose of rpad(‘x’,100,’x’) padding in your example?


    Comment by pandeesh — November 28, 2012 @ 7:28 am GMT Nov 28,2012 | Reply

    • Pandeesh,

      It’s a general purpose bit of padding that I use to create rows of a reasonable size. It’s just an attempt to avoid extreme boundary conditions of very short rows in some of my tests.

      Comment by Jonathan Lewis — December 13, 2012 @ 6:58 pm GMT Dec 13,2012 | Reply

  5. When Oracle is able to write the ‘select’ content into pipe(spool file), why Oracle is not able to write the 10046 tracing content to pipe(10046 trace file) ?


    Comment by Yasser Khan — December 5, 2012 @ 10:13 am GMT Dec 5,2012 | Reply

    • Yasser,

      Since there’s no mechanical reason why it shouldn’t be possible, I think the answer has to be that the code has been fixed to avoid that possibility. (I tried to create a pipe with the name of a trace file that had not yet been opened, but when I enabled tracing (dbms_monitor call) the session didn’t try to write to the pipe. I suppose I could try using truss (or tusc, or some such) to see exactly what the session did – but I’d rather just suggest that Oracle doesn’t allow you to trace to a file because the session would have to stop and wait for another process to read from the pipe all the time, and this might cause problems.

      I think Tanel Poder once demonstrated a cunning trick where he managed to get the trace file written to the screen as he ran some SQL – but I don’t remember how he did it, or even what platform he was running on at the time. (You could try searching his blog.)

      Comment by Jonathan Lewis — December 13, 2012 @ 9:48 pm GMT Dec 13,2012 | Reply

      • Thanks for your time in researching into this issue. At last i was able to simulate it in RHEL but not in SUSE and SOLARIS.

        By the way its Tanel Poder’s trick which i was trying to simulate it.

        Comment by Yasser Khan — December 17, 2012 @ 10:59 am GMT Dec 17,2012 | Reply

  6. […] Thoughts on loading an empty IOT as quickly as possible (Nov 2012) […]

    Pingback by IOTs | Oracle Scratchpad — February 11, 2021 @ 5:46 pm GMT Feb 11,2021 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by