Index Explosion

July 28, 2009

Filed under: Index Explosion,Indexing,Infrastructure,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 7:28 pm BST Jul 28,2009

In Index Quiz 1 and Index Quiz 2 I demonstrated a couple of details of how the ITL (interested transaction list) behaves in indexes. In this note I’m going to explain how these details can result in a nasty waste of space in indexes.

The two points I made in the previous posts were:

1. At high levels of concurrency you can “lose” a lot of space in an index leaf block to the ITL.
2. When a leaf block splits, the ITL of the newer block is a copy of the ITL from the older block.

One consequence of point 2 is that you need only have one brief burst of activity that makes one or two ITLs grow to an uncharacteristic size, after which every leaf block that splits off in that portion of the index from then onwards will have a large ITL even if it doesn’t need it.

Combine this with something that I think is probably a bug – but which may actually have been deliberately designed in as a feature (with an unexpected side effect) for concurrency reasons – and surprises appear.

In Index Quiz 1 I showed you the statistics for an index that had lost about 50% of its available space because its ITLs had grown to the maximum (for an 8KB block) of 169 entries, and noted that this can be a side effect of very high concurrency. In fact, when I ran the test case, I was running just eight concurrent processes on a machine with two CPUs.

Here’s some code if you want to try the test on your own systems:


rem
rem     Script:         order_create.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2009
rem

create sequence s1   cache 250000;

create table orders (
        date_placed	date	not null,
        order_id	number(8),
        time_stamp	timestamp,
        id_owner	number(2),
        padding		varchar2(168)
)
tablespace test_8k_assm
initrans 8
;

create index ord_placed on orders(date_placed)
tablespace test_8k
initrans 9
;

The tablespace test_8k_assm is using 8KB blocks, 1MB uniform extent sizes, and automatic segment space management (ASSM). The tablespace test_8k is similar but uses freelist (manual) segment space management. I’ve used initrans 8 on the table and initrans 9 on the index because I tested eight concurrent processes (the setting on the table was probably redundant given that I was using ASSM and used fewer than 16 concurrent sessions in the test).
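If you don't already have tablespaces like these, the following is a minimal sketch of how they might be created. The datafile names and sizes are hypothetical placeholders, not taken from the original test, and the blocksize clause assumes that 8KB is either the database default block size or that a db_8k_cache_size has been configured.

```sql
-- Sketch only: file names and sizes are hypothetical placeholders
create tablespace test_8k_assm
datafile '/u01/oradata/test/test_8k_assm_01.dbf' size 512m
blocksize 8k
extent management local uniform size 1m
segment space management auto
;

create tablespace test_8k
datafile '/u01/oradata/test/test_8k_01.dbf' size 512m
blocksize 8k
extent management local uniform size 1m
segment space management manual
;
```

The only difference between the two is the segment space management clause, which matches the description above: ASSM for the table, freelist management for the index.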

After creating the sequence, table, and index, you can run the following PL/SQL from as many different sessions as you like (adjusting the limit on the loop if necessary):


rem
rem     Script:         order_entry.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2009
rem

declare
        m_ord	number(8)	:= 0;
begin
        while m_ord <= 100000 loop

                insert into orders (
                        date_placed, order_id, padding
                )
                values (
                        sysdate - 1000 + s1.nextval/1000,
                        s1.nextval,
                        rpad('x',168)
                )
                returning order_id into m_ord;

--              dbms_lock.sleep(0.01);

                commit write immediate wait;  -- 10g feature
        end loop;
end;
/

The code is designed to model an order entry system with 1,000 orders per day over the last 1,000 days. But the rate of data entry is, of course, accelerated to extremes.
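As a quick sanity check on the date_placed expression, here is a small piece of arithmetic (the column aliases are mine). Each sequence value moves the date forward by 1/1000 of a day, and the loop limit decides how many days the data spans.

```sql
select
        24 * 60 * 60 / 1000     seconds_per_order,      -- 86.4
        100000 / 1000           days_spanned            -- 100
from
        dual
;
```

With the loop limit as shown, the index therefore covers the first 100 days of the 1,000 day window, starting 1,000 days ago; a limit of 1,000,000 would be needed to reach the present day.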

If you run just one copy of the PL/SQL and validate the index afterwards you should get results like this (running 10.2.0.3):

HEIGHT                        : 2
BLOCKS                        : 256
LF_ROWS                       : 100001
LF_BLKS                       : 238
LF_ROWS_LEN                   : 1900019
LF_BLK_LEN                    : 8000
BR_ROWS                       : 237
BR_BLKS                       : 1
BR_ROWS_LEN                   : 3312
BR_BLK_LEN                    : 8032
BTREE_SPACE                   : 1912032
USED_SPACE                    : 1903331
PCT_USED                      : 100

The index space is perfectly (100%) used – it’s the natural consequence of the sequential nature of the data inserts; and although we specified initrans 9 (which Oracle “ignored”) we can see that the lf_blk_len is 8,000, which means every leaf block still has the minimum two entries in its ITL.
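The figures above are columns from the index_stats view. As a minimal sketch (the column list is my own selection), they can be generated like this. Bear in mind that validating the structure locks the table against DML while it runs, and that index_stats holds a single row that is visible only to the session that did the validation.

```sql
analyze index ord_placed validate structure;

select
        height, blocks, lf_rows, lf_blks, lf_rows_len, lf_blk_len,
        br_rows, br_blks, br_rows_len, br_blk_len,
        btree_space, used_space, pct_used
from
        index_stats
;
```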

If you run two copies of the script (inserting 50,000 rows each), you may see results more like this:

HEIGHT                        : 2
BLOCKS                        : 384
LF_ROWS                       : 100002
LF_BLKS                       : 255
LF_ROWS_LEN                   : 1900038
LF_BLK_LEN                    : 7976
BR_ROWS                       : 254
BR_BLKS                       : 1
BR_ROWS_LEN                   : 3544
BR_BLK_LEN                    : 8032
BTREE_SPACE                   : 2041912
USED_SPACE                    : 1903582
PCT_USED                      : 94

There are just a few percent of “lost” space, and the lf_blk_len is showing that some blocks have acquired a third entry in the ITL – hence the 24-byte drop from 8,000 to 7,976.
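The arithmetic behind that drop can be sketched as follows, assuming (as the figures in this note suggest) that 8,000 bytes corresponds to the minimum two ITL entries and that each extra entry costs 24 bytes. The column aliases are mine.

```sql
-- 8,000 bytes with the minimum 2 ITL entries; 24 bytes per extra entry
select
        8000 - 24 * (3   - 2)   lf_blk_len_3_slots,     -- 7976
        8000 - 24 * (169 - 2)   lf_blk_len_169_slots    -- 3992
from
        dual
;
```

The second column applies the same rule to the 169-entry maximum mentioned earlier for an 8KB block, and shows how the ITL can end up taking roughly half the block.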

If you run something like my “index efficiency” code to check how well each block in the index is used, you may find something like this:

ROWS_PER_BLOCK     BLOCKS TOTAL_ROWS RUNNING_TOTAL
-------------- ---------- ---------- -------------
           206         12       2472            12
           208          1        208            13
           209          1        209            14
           211          2        422            16
           212          1        212            17
           214         15       3210            32
           260          1        260            33
           410          1        410            34
           419        221      92599           255
               ---------- ----------
sum                   255     100002

Most blocks are full (ca. 420 entries), after a “90/10 leaf node split”, but a few have done a 50/50 split (ca. 210 entries). The 50/50 splits are what you get on a multi-user system with multiple CPUs. Occasionally a session will get its sequence number then get pre-empted by the operating system, allowing another session to get and insert a higher value: with a little bad luck this will happen just as a leaf block fills.
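For readers who don't have a script of this type to hand, the following is a sketch of one way to produce a report in that shape. It is not the script used for the output above; it assumes the undocumented function sys_op_lbid() is available and behaves as it does in the SQL generated by dbms_stats, and the SQL*Plus variable m_ind_id is my own name.

```sql
column object_id new_value m_ind_id

select  object_id
from    user_objects
where   object_name = 'ORD_PLACED'
and     object_type = 'INDEX'
;

-- Count index entries per leaf block, then count blocks per entry-count
select
        rows_per_block,
        count(*)                                        blocks,
        rows_per_block * count(*)                       total_rows,
        sum(count(*)) over (order by rows_per_block)    running_total
from    (
        select
                /*+ index_ffs(ord ord_placed) */
                sys_op_lbid(&m_ind_id, 'L', ord.rowid)  block_id,
                count(*)                                rows_per_block
        from
                orders ord
        where
                date_placed is not null
        group by
                sys_op_lbid(&m_ind_id, 'L', ord.rowid)
        )
group by
        rows_per_block
order by
        rows_per_block
;
```

The inner query labels each index entry with the leaf block that holds it; the outer query turns that into a histogram of leaf block usage.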

Important Note: The laptop that I used to generate these results has 2 CPUs – if you try running this test on a machine with a single CPU then the concurrent test may give dramatically different results.

Now repeat with more copies of the PL/SQL (scaling down the number of rows per session as you increase the number of sessions). As the number of concurrent sessions grows the space requirement will climb: the lf_blk_len will eventually drop to 3,992, and the number of leaf blocks (lf_blks) will probably be two or three times as large as you got in serial execution. (In the worst case it could be four times as large, but this would only happen if every block had lost half its space to ITLs and did a 50/50 split.)
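A rough way to see where the “four times” comes from is sketched below. It assumes 19 bytes per index entry (lf_rows_len divided by lf_rows in the index_stats output) and ignores finer details of block overhead, so the results are approximations. The column aliases are mine.

```sql
select
        trunc(8000 / 19)                                entries_full_block,     -- 421
        ceil(100000 / trunc(8000 / 19))                 lf_blks_serial,         -- 238
        trunc(3992 / 19)                                entries_big_itl,        -- 210
        ceil(100000 / (trunc(3992 / 19) / 2))           lf_blks_worst_case      -- 953
from
        dual
;
```

Halving the usable space halves the entries per block, and a 50/50 split leaves each block only half full again, which is how 238 leaf blocks could in principle become about 950.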

If you have N CPUs (N > 1) then I would expect to see the problem starting to appear somewhere between N+1 and 2N concurrent sessions – but if you want to short-cut the testing just go for 4N sessions and see what happens. If the CPU is already heavily loaded before you start then the problem will appear with fewer concurrent sessions (I got some dramatic results by running with just a couple of sessions whilst doing a full machine virus scan at maximum speed).
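If you want to turn that rule of thumb into numbers for your own machine, a minimal sketch is shown below. It assumes you have the privilege to query v$parameter and that the cpu_count parameter reflects the CPUs actually available; the column aliases are mine.

```sql
select
        to_number(value)        cpu_count,
        to_number(value) + 1    sessions_low,
        2 * to_number(value)    sessions_high,
        4 * to_number(value)    sessions_shortcut
from
        v$parameter
where
        name = 'cpu_count'
;
```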

The test isn’t deterministic – the results will depend on things like the version of Oracle, size of the redo log buffer, the size of the redo log files, the speed of the I/O subsystem, the number of CPUs, the operating system, the process ids of the sessions you happen to connect to (that’s a side effect of ASSM), and any workload that happens to be going on at the time. But the bottom line is this – if you’ve got hot spots in indexes that are subject to a lot of concurrent DML then you can find yourself wasting space unnecessarily in that area of the index.

In the next installment of the series I’ll make some comments about what I think is happening, and discuss how to address the issue. But before I finish, here’s how odd you can make index_stats look if you engineer a bizarre accident – it must be a really efficient index, I’ve used 174% of the available space!

HEIGHT                        : 2
BLOCKS                        : 384
NAME                          : ORD_PLACED
LF_ROWS                       : 110008
LF_BLKS                       : 301
LF_ROWS_LEN                   : 2090152
LF_BLK_LEN                    : 3992
BR_ROWS                       : 300
BR_BLKS                       : 1
BR_ROWS_LEN                   : 4191
BR_BLK_LEN                    : 8032
DISTINCT_KEYS                 : 110008
MOST_REPEATED_KEY             : 1
BTREE_SPACE                   : 1209624
USED_SPACE                    : 2094343
PCT_USED                      : 174
ROWS_PER_KEY                  : 1
BLKS_GETS_PER_ACCESS          : 3
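One plausible reading of how the 174% arises, offered as a sketch rather than a statement of Oracle's internal calculation: the reported values are consistent with used_space being lf_rows_len + br_rows_len, and btree_space being lf_blks * lf_blk_len + br_blks * br_blk_len, with a single lf_blk_len applied to every leaf block. The use of ceil() is my assumption, chosen because it reproduces the pct_used values shown in this note.

```sql
select
        2090152 + 4191                                  used_space,     -- 2094343
        301 * 3992 + 1 * 8032                           btree_space,    -- 1209624
        ceil(100 * (2090152 + 4191) /
                (301 * 3992 + 1 * 8032))                pct_used        -- 174
from
        dual
;
```

If the lf_blk_len of 3,992 comes from a block with a very large ITL while most leaf blocks have far more usable space, btree_space is understated and the percentage can exceed 100.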

Update Sept 2018

I’ve just re-run the test code on 18.3.0.0 on a (virtual) machine with 4 CPUs, running 4 concurrent insert statements.

By the time I had inserted 6,000 orders the ITL count (itc value) on the last few index leaf blocks had jumped to 39, so there still seems to be some room for improvement in this area. I’ve tried increasing the degree of concurrency and the duration of the test – but so far I’ve not gone above 39 ITL entries (a couple of other people have run this test on 12c over the years and reported 41 ITL entries) – so perhaps there’s a fairly hard limit that can be reached quite easily (using about 1KB from the block). Somewhere I’m sure I had a note (or found a note on the Internet) about how the fix probably worked, but I can’t find it at present. There is a parameter “_index_split_chk_cancel” (introduced in 12.2, default value 5) which looks as if it might be relevant, but it doesn’t seem to have an effect on the problem and might be something to do with how hard Oracle tries to re-use empty index blocks before allocating a newly formatted one.
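If you want to check the itc value on your own system, one possible route (a sketch, not necessarily how the figures above were obtained) is to do a treedump of the index and then dump one of the leaf blocks it lists. The SQL*Plus variables m_ind_id, m_dba, m_file_no and m_block_no are my own names; the last three are values you supply after reading the treedump.

```sql
column object_id new_value m_ind_id

select  object_id
from    user_objects
where   object_name = 'ORD_PLACED'
and     object_type = 'INDEX'
;

-- Writes one line per branch and leaf block to the session trace file
alter session set events 'immediate trace name treedump level &m_ind_id';

-- Convert a (decimal) leaf block address from the treedump to file and block
select
        dbms_utility.data_block_address_file(&m_dba)    file_no,
        dbms_utility.data_block_address_block(&m_dba)   block_no
from
        dual
;

-- Dump the chosen leaf block; the header in the trace file reports the itc value
alter system dump datafile &m_file_no block &m_block_no;
```

At 24 bytes per entry, 39 ITL entries account for 936 bytes, which is the “about 1KB” mentioned above.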

Update Oct 2021

The effect is still present up to 21.3.0.0 – but I did extend the testing by one extra detail. I varied the setting of initrans on the index and found that the limit on the number of ITL slots allocated seemed to be related to the setting of initrans – the maximum number of ITL slots kept coming out at something close to 30 + initrans.

[My catalogue page for the “Index Explosion” series ]

13 Comments »

If you want a consistent example of the aforementioned worst case scenario: “In the worst case it could be four times as large, but this would only happen if every block had lost half its space to ITLs and did a 50/50 split”; all you need to do is to use RAC and run two sessions on each node (with at least one CPU per node) and be sure to specify the ‘order’ qualifier in the create sequence statement (otherwise some of the sessions will short-circuit their execution).

Comment by Jeroen — July 29, 2009 @ 7:43 am BST Jul 29,2009 | Reply
- Jeroen,
  
  That’s an interesting thought – have you actually tried it, or seen it ?
  
  I have to say that if you have set the ORDER option on a sequence it’s also NOCACHE – which means you would see lots of waits for the “SQ” enqueue, and lots of waits for “gc current block 2-way” or “gc current block busy” – not to mention “log file sync” as the block from seq$ bounced back and forth.
  
  Having said that; a sequence based index with a large sequence cache size on RAC will always be doing 50/50 splits on all nodes except one – so going highly concurrent in that case could give you 25% effective utilisation over most of the index.
  
  Update: The comment about ORDER implying NOCACHE is wrong. The impact of order is that there is a single cached sequence value shared by all instances, and it gets passed from instance to instance through a global SV enqueue (resulting in waits for DFS Lock Handle). The sequence is still CACHE (unless explicitly set to NOCACHE). The higher incidence of “gc current block busy” etc. is, at most, a side effect of the rate of sequence use if the CACHE size has not been set large enough (and it’s common to see the CACHE size left to its default of 20, even for very high use sequences). The SQ enqueue is taken each time the seq$ entry is updated and the block from the seq$ table is, of course, subject to cache fusion etc.
  
  Comment by Jonathan Lewis — August 6, 2009 @ 3:50 pm BST Aug 6,2009 | Reply
  - My observations are based on an experiment with Oracle 11.1.0.7 running on a Linux operating system. I don’t recall seeing a lot of SQ-type waits in the AWR reports I generated, but do recall seeing many waits for the “DFS lock handle” event. I did some google-ing on this and located an article on the Pythian web site that explains in detail the synchronisation of sequences between nodes in an RAC system. The URL is http://www.pythian.com/news/383/sequences-in-oracle-10g-rac.
    
    Comment by Jeroen — August 12, 2009 @ 12:57 am BST Aug 12,2009 | Reply
PCT_USED : 174
But before I finish, here’s how odd you can make index_stats look if you engineer a bizarre accident – it must be a really efficient index, I’ve used 174% of the available space !

Is that a bug to show 174%?

Comment by Daniel — August 3, 2009 @ 12:43 pm BST Aug 3,2009 | Reply
- Daniel,
  “Is that a bug to show 174%”.
  
  Since it’s not possible to use 174% of the space available in an index, I think the only way to describe it is as a bug.
  If you want to be charitable, you could call it an approximation that loses accuracy as the variation in ITL usage increases.
  
  Comment by Jonathan Lewis — August 6, 2009 @ 4:21 pm BST Aug 6,2009 | Reply
A little update on this topic. I’ve just been sent an email pointing me to Metalink Bug 8767925.

If you do a search on the bug number you will be able to find the abstract for it that says: “ADD MORE LOGIC FOR RESERVING ITL SPACE FOR INDEX LEAF BLOCKS.”

The interesting thing about the note is that the “Modified date” (when I first saw it) was 05-AUG-2009 – just one week after I posted the test case above.

I can’t help wondering whether this is a coincidence or whether someone at Oracle Support (or one of their customers) has been reading my blog and just discovered the cause of a problem that’s been bugging them.

Comment by Jonathan Lewis — August 13, 2009 @ 11:40 am BST Aug 13,2009 | Reply
[…] comparison purposes, here’s a section of the index produced during one run of my “index explosion” […]

Pingback by treedump « Oracle Scratchpad — August 17, 2009 @ 5:32 pm BST Aug 17,2009 | Reply
[…] Index Explosion https://jonathanlewis.wordpress.com/2009/07/28/index-explosion/ […]

Pingback by reading this week | Sidney's blog — June 18, 2010 @ 11:42 am BST Jun 18,2010 | Reply

A little update:

This problem is now visible as a bug on Metalink (MOS). Bug number 9865890, raised by a client of mine because they didn’t like having indexes with 2.3 million blocks when they should have been closer to 1.2 million blocks.

The test case in the bug is from the code above, but run on an AIX box with 16 CPUs.

Comment by Jonathan Lewis — August 23, 2010 @ 5:03 pm BST Aug 23,2010 | Reply

Latest update:

This bug has been linked to base bug 8767925 (see comment #4 above) which is now reported as “fixed in version 12.1”; so the client that raised bug 9865890 has asked for a backport to 11.1.0.7.

Comment by Jonathan Lewis — November 22, 2010 @ 12:13 pm GMT Nov 22,2010 | Reply

Jonathan, as per the Oracle support note “Bug 8767925 – ITL wasting a lot of space in indexes with high concurrency (Doc ID 8767925.8)”, the fix for 8767925 is first included in 11.2.0.4 (server patch set) and in 12.1.0.1 (base release). I have tested on Oracle RDBMS version 11.2.0.4.1 (Linux x86_64, 64 CPUs) with 5 concurrent users. The results show that it has not been fixed.

HEIGHT                        : 2
BLOCKS                        : 640
NAME                          : ORD_PLACED
PARTITION_NAME                :
LF_ROWS                       : 100005
LF_BLKS                       : 574
LF_ROWS_LEN                   : 1900095
LF_BLK_LEN                    : 3988
BR_ROWS                       : 573
BR_BLKS                       : 1
BR_ROWS_LEN                   : 7974
BR_BLK_LEN                    : 8028
DEL_LF_ROWS                   : 0
DEL_LF_ROWS_LEN               : 0
DISTINCT_KEYS                 : 100005
MOST_REPEATED_KEY             : 1
BTREE_SPACE                   : 2297140
USED_SPACE                    : 1908069
PCT_USED                      : 84
ROWS_PER_KEY                  : 1
BLKS_GETS_PER_ACCESS          : 3
PRE_ROWS                      : 0
PRE_ROWS_LEN                  : 0
OPT_CMPR_COUNT                : 0
OPT_CMPR_PCTSAVE              : 0
-----------------



ROWS_PER_BLOCK     BLOCKS     ROW_CT CUMULATIVE_BLOCKS
-------------- ---------- ---------- -----------------
           101          5        505                 5
           102          4        408                 9
           103          3        309                12
           104          3        312                15
           107          1        107                16
           109         18       1962                34
           112         31       3472                65
           113         31       3503                96
           114         21       2394               117
           115         19       2185               136
           116          6        696               142
           117          5        585               147
           118          6        708               153
           119         11       1309               164
           120          4        480               168
           121        115      13915               283
           163          1        163               284
           207          1        207               285
           208          1        208               286
           209         49      10241               335
           211          1        211               336
           214          1        214               337
           231          1        231               338
           232        231      53592               569
           417          2        834               571
           418          3       1254               574
               ---------- ----------
sum                   574     100005


Block header dump:  0x01000143
 Object id on Block? Y
 seg/obj: 0x16a01  csc: 0x00.666c43f9  itc: 151  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x1000140 ver: 0x01 opc: 0
     inc: 0  exflg: 0

Number of ITL entries in this leaf block is 151.

Comment by dbabibleantony — June 27, 2014 @ 7:08 pm BST Jun 27,2014 | Reply

Thanks for the details. Prompted by your report I’ve just run the test on 12c (12.1.0.1) with 2 CPUs and 4 processes and got the following results from index_stats:
```
HEIGHT                        : 2
BLOCKS                        : 640
NAME                          : ORD_PLACED
LF_ROWS                       : 100009
LF_BLKS                       : 517
LF_ROWS_LEN                   : 1900171
LF_BLK_LEN                    : 7060
BR_ROWS                       : 516
BR_BLKS                       : 1
BR_ROWS_LEN                   : 6894
BR_BLK_LEN                    : 8028
DISTINCT_KEYS                 : 100009
MOST_REPEATED_KEY             : 1
BTREE_SPACE                   : 3658048
USED_SPACE                    : 1907065
PCT_USED                      : 53
ROWS_PER_KEY                  : 1
BLKS_GETS_PER_ACCESS          : 3
```
The 7060 for lf_blk_len is enough to show that there’s still a problem – the value shouldn’t be less than approximately 7,900 unless we have blocks with far more than 4 or 5 ITL slots allocated.

And here’s a fragment of a blockdump to show the inappropriately large interested transaction count.
```
 seg/obj: 0x1845d  csc: 0x00.69af4f  itc: 41  flg: -  typ: 2 - INDEX
     fsl: 0  fnx: 0x0 ver: 0x01
```
Unfortunately you can’t give feedback on bug notes otherwise I’d let Oracle support know the bug is still present.

On the other hand, I’ve just noticed that bug 8767925 is showing an update at 4:00 am today – so maybe you’ve passed on your finding, or someone at Oracle Support picked up your comment here.

Comment by Jonathan Lewis — June 28, 2014 @ 5:45 pm BST Jun 28,2014

For those who can read Russian – here’s an example of what an index leaf block looks like when the going gets really tough: http://sql.ru/forum/actualthread.aspx?tid=754211&hl=itl

Comment by Jonathan Lewis — December 6, 2010 @ 10:03 pm GMT Dec 6,2010 | Reply
