In Index Quiz 1 and Index Quiz 2 I demonstrated a couple of details of how the ITL (interested transaction list) behaves in indexes. In this note I’m going to explain how these details can result in a nasty waste of space in indexes.
The two points I made in the previous posts were:
- at high levels of concurrency you can “lose” a lot of space in an index leaf block to the ITL
- when a leaf block splits the ITL of the newer block is a copy of the ITL from the older block
One consequence of point 2 is that a single brief burst of activity that makes one or two ITLs grow to an uncharacteristic size is enough to guarantee that, from then on, every leaf block that splits off in that portion of the index will have a large ITL whether it needs one or not.
Combine this with something that I think is probably a bug – but which may actually have been deliberately designed in as a feature (with an unexpected side effect) for concurrency reasons – and surprises appear.
In Index Quiz 1 I showed you an index with the following statistics, pointing out that it had lost about 50% of its available space because its ITLs had grown to the maximum (for an 8KB block) of 169 entries, and that this can be a side effect of very high concurrency. In fact, when I ran the test case, I was running just eight concurrent processes on a machine with two CPUs.
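As a quick sanity check on the arithmetic: each ITL entry takes 24 bytes, so growing from the minimum of two entries to the maximum of 169 costs an extra 167 × 24 = 4,008 bytes per block. That’s why (as you’ll see below) the lf_blk_len can drop from 8,000 to 3,992 – roughly half of an 8KB block surrendered to the ITL.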
Here’s some code if you want to try the test on your own systems:
rem
rem     Script: order_create.sql
rem

create sequence s1 cache 250000;

create table orders (
        date_placed     date            not null,
        order_id        number(8),
        time_stamp      timestamp,
        id_owner        number(2),
        padding         varchar2(168)
)
tablespace test_8k_assm
initrans 8
;

create index ord_placed on orders(date_placed)
tablespace test_8k
initrans 9
;
The tablespace test_8k_assm uses 8KB blocks, 1MB uniform extents, and automatic segment space management (ASSM). The tablespace test_8k is similar but uses freelist (manual) segment space management. I’ve used initrans 8 on the table and initrans 9 on the index because I tested eight concurrent processes (the setting on the table was probably redundant given that I was using ASSM and ran fewer than 16 concurrent sessions in the test).
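In case you want to reproduce the environment, here’s roughly how the two tablespaces might be declared – a sketch only, with hypothetical datafile names and sizes, and you can drop the blocksize clause if 8KB is already your default block size:

create tablespace test_8k_assm
        datafile '/u01/oradata/test_8k_assm.dbf' size 200m
        blocksize 8k
        extent management local uniform size 1m
        segment space management auto
;

create tablespace test_8k
        datafile '/u01/oradata/test_8k.dbf' size 200m
        blocksize 8k
        extent management local uniform size 1m
        segment space management manual
;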
After creating the sequence, table, and index, you can run the following PL/SQL from as many different sessions as you like (adjusting the limit on the loop if necessary):
rem
rem     Script: order_entry.sql
rem

declare
        m_ord   number(8) := 0;
begin
        while m_ord <= 100000 loop
                insert into orders (
                        date_placed, order_id, padding
                )
                values (
                        sysdate - 1000 + s1.nextval/1000,
                        s1.nextval,
                        rpad('x',168)
                )
                returning order_id into m_ord
                ;

                commit write immediate wait;            -- 10g feature
        end loop;
end;
/
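If you’d rather not start eight sessions by hand, one possible driver (a sketch of my own, not part of the original test – the job names are invented, and you need enough job queue processes for the jobs to actually run simultaneously) is to submit the same block as a set of scheduler jobs:

rem
rem     Hypothetical driver: run 8 concurrent copies of the
rem     insert loop as scheduler jobs
rem

begin
        for i in 1..8 loop
                dbms_scheduler.create_job(
                        job_name        => 'order_entry_' || i,
                        job_type        => 'PLSQL_BLOCK',
                        job_action      => q'[
                                declare
                                        m_ord   number(8) := 0;
                                begin
                                        while m_ord <= 100000 loop
                                                insert into orders (date_placed, order_id, padding)
                                                values (sysdate - 1000 + s1.nextval/1000, s1.nextval, rpad('x',168))
                                                returning order_id into m_ord;
                                                commit write immediate wait;
                                        end loop;
                                end;
                        ]',
                        enabled         => true
                );
        end loop;
end;
/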
The model that the insert loop generates is an order entry system taking 1,000 orders per day over the last 1,000 days – though the rate of data entry is, of course, accelerated to extremes.
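For reference, the index_stats figures quoted below come from validating the index and then querying the index_stats view in the same session – something like the following sketch (remember that index_stats holds just one row, visible only to the session that did the validation, and that the validation locks the table while it runs):

analyze index ord_placed validate structure;

select
        height, blocks,
        lf_rows, lf_blks, lf_rows_len, lf_blk_len,
        br_rows, br_blks, br_rows_len, br_blk_len,
        btree_space, used_space, pct_used
from
        index_stats
;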
If you run just one copy of the PL/SQL and validate the index afterwards, you should get results like this:
HEIGHT                        : 2
BLOCKS                        : 256
LF_ROWS                       : 100001
LF_BLKS                       : 238
LF_ROWS_LEN                   : 1900019
LF_BLK_LEN                    : 8000
BR_ROWS                       : 237
BR_BLKS                       : 1
BR_ROWS_LEN                   : 3312
BR_BLK_LEN                    : 8032
BTREE_SPACE                   : 1912032
USED_SPACE                    : 1903331
PCT_USED                      : 100
The index space is perfectly (100%) used – the natural consequence of the sequential nature of the inserts; and although we specified initrans 9 (which Oracle has “ignored”), the lf_blk_len of 8,000 tells us that every leaf block still has only the minimum two entries in its ITL.
If you run two copies of the script, you may see results more like this:
HEIGHT                        : 2
BLOCKS                        : 384
LF_ROWS                       : 100002
LF_BLKS                       : 255
LF_ROWS_LEN                   : 1900038
LF_BLK_LEN                    : 7976
BR_ROWS                       : 254
BR_BLKS                       : 1
BR_ROWS_LEN                   : 3544
BR_BLK_LEN                    : 8032
BTREE_SPACE                   : 2041912
USED_SPACE                    : 1903582
PCT_USED                      : 94
There’s just a few percent of “lost” space, and the lf_blk_len shows that some blocks have acquired a third entry in their ITL – hence the 24-byte drop from 8,000 to 7,976.
To check how well each block in the index is used, you can run something like my “index efficiency” code – there’s a sketch of the idea below, followed by the results from the two-session test.
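If you don’t have a copy of that script to hand, here’s a minimal sketch of one way to produce this kind of report: the undocumented sys_op_lbid() function returns an identifier of the index leaf block in which a given table row’s index entry belongs, so grouping by it gives a count of index entries per leaf block. The literal 99999 is a placeholder for the object_id of the ord_placed index, which you’d look up in user_objects:

select
        rows_per_block,
        count(*)                                        blocks,
        rows_per_block * count(*)                       total_rows,
        sum(count(*)) over (order by rows_per_block)    running_total
from    (
        select
                /*+ index(t ord_placed) */
                sys_op_lbid(99999, 'L', t.rowid)        leaf_block,
                count(*)                                rows_per_block
        from
                orders  t
        where
                date_placed is not null
        group by
                sys_op_lbid(99999, 'L', t.rowid)
        )
group by
        rows_per_block
order by
        rows_per_block
;

Against the data from the two-session run the report came out like this: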
ROWS_PER_BLOCK     BLOCKS TOTAL_ROWS RUNNING_TOTAL
-------------- ---------- ---------- -------------
           206         12       2472            12
           208          1        208            13
           209          1        209            14
           211          2        422            16
           212          1        212            17
           214         15       3210            32
           260          1        260            33
           410          1        410            34
           419        221      92599           255
               ---------- ----------
sum                   255     100002
Most blocks are full (ca. 420 entries) – the result of “90/10” leaf node splits – but a few have done a 50/50 split and hold only about 210 entries. The 50/50 splits are what you get on a multi-user system with multiple CPUs: occasionally a session will get its sequence number and then be pre-empted by the operating system, allowing another session to get and insert a higher value; with a little bad luck this will happen just as a leaf block fills.
Important Note: The laptop that I used to generate these results has 2 CPUs – if you try running this test on a machine with a single CPU then the concurrent test may give dramatically different results.
Now repeat with more copies of the PL/SQL. As the number of concurrent sessions grows the space requirement will climb: the lf_blk_len will eventually drop to 3,992 and the number of leaf blocks (lf_blks) will probably be two or three times what you got from serial execution (in the worst case it could be four times as large, but that would require every block to have lost half its space to ITLs and then done a 50/50 split).
If you have N CPUs (N > 1) then I would expect to see the problem starting to appear somewhere between N+1 and 2N concurrent sessions – but if you want to short-cut the testing just go for 4N sessions and see what happens. If the CPU is already heavily loaded before you start then the problem will appear with fewer concurrent sessions (I got some dramatic results by running with just a couple of sessions whilst doing a full machine virus scan at maximum speed).
The test isn’t deterministic – the results will depend on things like the version of Oracle, the size of the redo log buffer, the size of the redo log files, the speed of the I/O subsystem, the number of CPUs, the operating system, the process ids of the sessions you happen to connect to (that’s a side effect of ASSM), and any other workload that happens to be going on at the time. But the bottom line is this – if you’ve got hot spots in indexes that are subject to a lot of concurrent DML then you can find yourself wasting space unnecessarily in that area of the index.
In the next installment of the series I’ll make some comments about what I think is happening, and discuss how to address the issue. But before I finish, here’s how odd you can make index_stats look if you engineer a bizarre accident – it must be a really efficient index, I’ve used 174% of the available space!
HEIGHT                        : 2
BLOCKS                        : 384
NAME                          : ORD_PLACED
LF_ROWS                       : 110008
LF_BLKS                       : 301
LF_ROWS_LEN                   : 2090152
LF_BLK_LEN                    : 3992
BR_ROWS                       : 300
BR_BLKS                       : 1
BR_ROWS_LEN                   : 4191
BR_BLK_LEN                    : 8032
DISTINCT_KEYS                 : 110008
MOST_REPEATED_KEY             : 1
BTREE_SPACE                   : 1209624
USED_SPACE                    : 2094343
PCT_USED                      : 174
ROWS_PER_KEY                  : 1
BLKS_GETS_PER_ACCESS          : 3
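For the curious, the arithmetic behind the 174 is straightforward: btree_space = lf_blks × lf_blk_len + br_blks × br_blk_len = 301 × 3,992 + 1 × 8,032 = 1,209,624, while used_space = lf_rows_len + br_rows_len = 2,090,152 + 4,191 = 2,094,343, and 2,094,343 / 1,209,624 is about 173.1% – apparently rounded up to 174 (rounding up is consistent with the two earlier reports). Of course you can’t really fit 2,090,152 bytes of leaf rows into 301 leaf blocks offering only 3,992 bytes each, so at least one of the reported figures has to be misleading – which is part of the puzzle for the next installment.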