If you’re starting to work with Exadata you need to work out what how much impact you can have by doing the right (or wrong, or unlucky) sorts of things with the technology. Fortunately there are only a few special features that really need investigation: Hybrid Columnar Compression (HCC), Offloading (and related Smart Scan stuff) and Storage Indexes. These are the features that will have the biggest impact on the space taken up by your data, the volume of data that will move through your system as you load and query it, and the amount of CPU it takes to do something with your data.
There are other features that are important, of course, such as the features for affecting parallel execution, and the options for resource management, but the three I’ve listed above are the core of what’s (currently) unique to Exadata. In this note I’m just going to make a few comments about how Oracle implements HCC, and what side-effects this may have.
The Compression Unit
Under HCC Oracle uses a structure called a Compression Unit. If you look at how Oracle stores a compression unit it is, in a very real sense, stored as a single row. The number of columns in that one row is the same as the number of (real) columns declared for the table, but the content of each individual column of the row is created by concatenting the equivalent column from hundreds, if not thousands, of the original rows from the table then using a compression algorithm such as bzip2 on the result.
Imagine we have 1,000 rows in a table of four columns – and we issue a command like: alter table t1 move compress for archive high; that might pack the contents of the entire table into a single compression unit looking something like the following picture:
The compression unit has a small header section – holding, among other things, the offsets into the compression unit of four “compressed columns”, then there’s a bitmap (one bit for each row in the original set of 1,000) that is used to identify rows that have been deleted (or migrated to a different location because of an update) and finally there are the four “compressed columns”. Each “column” in the picture above consists of the list of values for that column extracted from the 1,000 rows in order, and then compressed using an algorithm dependent on the level of compression (query low, query high, archive low, or archive high).
Once Oracle has constructed the compression unit, the “mega-row” is stored as a single chained row with a single column, using Oracle’s standard approach to chained rows. This allows Oracle to introduce a split anywhere in the row (including in mid-column) and continue the row in another block. Assuming my compression unit is large enough to fill four blocks, the multi-level picture would look like this:
Each block (the blue outline) has a block header (the grey area) and a row directory (the dark red strip). In this case every row directory has a single entry in it that points to the single row-piece in the block. Each row-piece has a row-header (the orange strip) describing the row but, importantly, since the row has chained across all four blocks the row header for the first three row-pieces includes a “next rowid” pointer to the next row-piece.
Here’s a data dump showing the row header information produced by a single compression unit spanning five blocks in a table. I cut these from a standard block range dump. The compression unit is made up of the first row piece in 5 consecutive blocks. Each row piece points to the rowid of the next row piece (nrid).
tab 0, row 0, @0x30 tl: 8016 fb: --H-F--N lb: 0x0 cc: 1 nrid: 0x01802ee4.0 col 0:  tab 0, row 0, @0x1f tl: 8033 fb: ------PN lb: 0x0 cc: 1 nrid: 0x01802ee5.0 col 0:  tab 0, row 0, @0x1f tl: 8033 fb: ------PN lb: 0x0 cc: 1 nrid: 0x01802ee6.0 col 0:  tab 0, row 0, @0x1f tl: 8033 fb: ------PN lb: 0x0 cc: 1 nrid: 0x01802ee7.0 col 0:  tab 0, row 0, @0x1d56 tl: 554 fb: -----LP- lb: 0x0 cc: 1 col 0: 
If you’ve looked at the dump details for chained or migrated rows before, you’ll be familiar with the values in the “flag byte” (fb) – the F in the first rowpiece is for “First”, the L in the last rowpiece is for “Last”. The N’s and P’s show that there is a Next row and/or Previous rowpiece, and the H in the first row piece tells us that this row is a row header section. (You could have an H without an F to show that a row had been migrated). As you can see, each row piece consists of a single column (cc: 1 / col 0:) – but remember, this one column has nothing to do with the columns of the original columns of the original table, the entire compression unit is a single length column. In this example you can calculate the total length of the compression unit by summing the column lengths from the five row-pieces: 8004 + 8021 + 8021 +8021 + 548 = 32,615 bytes.
You might have realised that the last block in the set of four has only used up (including the block overhead) about 750 bytes from the 8KB available. This won’t (necessarily) be wasted space; I have some examples of Oracle putting the first piece of a second compression unit into a block when the first compression unit ends with a short row-piece.
Once you realise that a Compression Unit is stored as a chained row of one column you might then start asking questions about how Oracle gets usable data from a compression unit, and how efficiently it can do it, and how big a compression unit can be.
The answer to the easy question – how big can a compression unit be – seems to be dicated by the lesser of:
- 32, 759 rows from the original data (even when compressed for query low)
- 250KB (approximately – though this might have been a side effect of my system configuration)
I’ll be looking at the harder, and more interesting, questions in the next HCC posting. In the meantime, here’s a link to all my previous pieces on Exadata topics (including a catalogue or articles by other writers).