Yesterday I asked the question: how many row entries can you create in an 8KB block with pctfree 0. It brought out some interesting observations – including one I hadn’t thought of relating to a small difference between ASSM and freelist management.
I suspect, however, that most people didn’t quite realise the significance of the wording: “row entries”– I wasn’t asking about “rows”. So here’s a little script you can run after you create the table, followed by the result of dumping the first block in the table.
begin for i in 1..2048 loop insert into t1 values(null); delete from t1; end loop; end; / Block header dump: 0x0140020a Object id on Block? Y seg/obj: 0x1749e csc: 0x00.3df6da7 itc: 2 flg: O typ: 1 - DATA fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0006.021.0000c0bd 0x00801d04.1966.5d ---- 2017 fsc 0x0007.00000000 0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 data_block_dump,data header at 0x7f1a25c =============== tsiz: 0x1fa0 hsiz: 0xfd4 pbl: 0x07f1a25c bdba: 0x0140020a 76543210 flag=-------- ntab=1 nrow=2017 frre=-1 fsbo=0xfd4 fseo=0xfdb avsp=0x3 tosp=0xfcc 0xe:pti nrow=2017 offs=0 0x12:pri offs=0x1f9e 0x14:pri offs=0x1f9c ... 0xfd0:pri offs=0xfe0 0xfd2:pri offs=0xfdb block_row_dump: tab 0, row 0, @0x1f9e tl: 2 fb: --HDFL-- lb: 0x1 tab 0, row 1, @0x1f9c tl: 2 fb: --HDFL-- lb: 0x1 tab 0, row 2, @0x1f9a tl: 2 fb: --HDFL-- lb: 0x1 ... tab 0, row 2015, @0xfe0 tl: 2 fb: --HDFL-- lb: 0x1 tab 0, row 2016, @0xfdb tl: 2 fb: --HDFL-- lb: 0x1 end_of_block_dump
I’ve trimmed out most of the block dump – but as you can see there are no rows in the block – but there are 2,017 row entries in the row directory.
Each time I insert a row, I need a row entry in the directory; but when I delete a row I can’t re-use that entry until my delete is commited – so I have to keep adding new entries for every row in my loop.
As I delete a row it is first marked as deleted (the “flag” byte gets the D bit set) and eventually, when the block needs to be tidied and re-packed to make available space usable, the deleted rows are trimmed back to a stub of just two bytes (the flag byte and the lock byte).
In passing, there is a new statistic in 10g which I believe records this packing process so that you can see how often it happens, this is the: “heap block compress”. In a presentation at Collaborate 09, Carl Dudley pointed out that 11g does its “real-time” advanced compression only at this moment – so perhaps that’s why the statistic has recently appeared.
You may say, of course, that this is a pointless demonstration that bears no resemblance to real life. But you would be forgetting Murphy’s law, which (in one of its polite forms) says: “anything that can go wrong will go wrong”.
The title of “lunchtime quiz” came about because I was giving a seminar in Zagreb, Croatia, and one of the attendees mentioned that the data warehouse tool they used did exactly this type of thing – and crashed as a consequence . So, in the lunch break, I ran a quick test and posted the note – this was a genuine production problem. Metalink notes: 465226.1, 603480.1, and bug 4260477 are relevant.
The problem in the metalink notes is this: you can get 2,017 row entries in a 8KB block and you can get 4,065 entries in a 16KB block – so you have space to get roughly 8,160 entries in a 32KB block … but there’s a hard limit of 4,095 to the field (12 bits) in the ITL where the number of rows locked by a single transaction is recorded. (Testing it with a 32KB block on 22.214.171.124 the limit seemed to be 4,094 rather than the 4,095 given in the bug notes.)
I don’t have an older copy of Oracle that will create 32KB blocks, so I can’t test exactly what happens when you pass the limit, but there are two possibilities I can think of: (a) your session crashes, or (b) your session corrupts the block, and some other processes crashes later on trying to read it.
Oracle’s initial response to the problem was to raise Oracle error ORA-08007 to stop you from overshooting the limit. (This, of course, crashed the user’s data warehouse build process); the latest fix is to limit the number of rowid entries (potentially “wasting” space from the block) to protect the ITL limit.
The workaround on older systems is to move critical tables into a tablespace using a smaller block size – 32KB is the problem. But if you’ve created your database on a 32KB block size then the block size for the temporary tablespace has to be 32KB – and the data warehouse builder was doing this with global temporary tables. Workaround (1): don’t use a temporary tablespace !, workaround (2): rebuild the database on a 16KB block size !
There is another side effect of this large row directory – but one that surely won’t be seen (Murphy willing).
Nobody, surely, is going to write code that does this type of insert/delete cycle on a table with bitmap indexes – the consequences could be dire in terms of undo, and redo (even in 10g where the index explosion problem has been addressed).
But what if they load the table with code like this, then try to build some bitmap indexes. This is what happened on my 8KB block example when I didn’t delete the last row in the block:
create bitmap index t1_b1 on t1(n1) * ERROR at line 1: ORA-28604: table too fragmented to build bitmap index (20971786,2016,744)
When building bitmap indexes, the codes assumes that the number of rows in a block cannot exceed roughly blocksize/11 rows. (The 11 comes from assuming all rows are migrated, then each row requires: 2 bytes row directory, one byte each for column count, flags, and lock byte, and 6 bytes for the forwarding rowid – see “Practical Oracle 8i”). The error message is telling me that I have 2017 rows in a block when the Hakan factor (maximum expected) is 744.
Fortunately there is a workaround for this (though I haven’t tested it for side effects) – you can set the maximum to a value higher than the default using the standard mechanism for optimising bitmap index storage:
SQL> alter table t1 minimize records_per_block; Table altered. SQL> create bitmap index t1_b1 on t1(n1); Index created.
There are a couple more interesting points I could make about blocks, rows, deletions, and the row directory (compression is an interesting topic) – but it’s a lovely morning in Zagreb, so I’m off to see a few of the sights before I fly home.