255 columns

May 19, 2017

Filed under: Infrastructure,Oracle — Jonathan Lewis @ 5:49 pm BST May 19,2017

This is one of my “black hole” articles – I drafted it six months ago, but forgot to publish it.

A recent post on OTN highlighted some of the interesting oddities that appear when you create tables with more than 255 columns. In fact this was a more subtle case than usual because it reminded us that it’s possible to have a partitioned table which appears to have fewer than the critical 255 columns while actually having more than 255 columns, thanks to the anomaly of how Oracle handles dropping columns in a partitioned table. (For a useful insight see this note from Dominic Brooks – and for a nice thought about preparing simple tables for an exchange with such a partitioned table look at the 12.2 feature of “create table for exchange” in Maria Colgan’s recent article.)

The thread took me down the path of trying to recreate some notes I wrote a long time ago and can no longer find. The OTN problem wasn’t the basic one I had assumed anyway, but I thought I’d publish a bit of the work I had done so that you can see another of the funny effects that appear when your table definition has too many columns (and you use them).

The OP told us about a table with more than 350 columns, so here’s a little script I wrote to generate a table with 365 columns and some data. (It turned out that the OP had more than 390 columns in the table, but 30+ had been “dropped”.)


rem
rem	Script:		wide_table_2.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Nov 2016
rem	Purpose:
rem
rem	Last tested
rem		11.2.0.4
rem

create sequence s1;

declare
	m_statement_1	varchar2(32767) :=
		'create table t1(col0001 varchar2(10),';
	m_statement_2	varchar2(32767) :=
		'insert into t1 values(lpad(s1.nextval,10),';
begin
	for i in 2 .. 365 loop
		m_statement_1 := m_statement_1 ||
			'col' || to_char(i,'FM0000') || ' varchar2(100),'
		;
		m_statement_2 := m_statement_2 ||
			case when i in (2,3,4)
			-- case when i in (122,123,124)
			-- case when i in (262,263,264)
				then 'dbms_random.string(''U'',ceil(dbms_random.value(0,100))),'
			     when i = 365
				then 'lpad(s1.currval,7))'
				else '''COL' || to_char(i,'FM0000') || ''','
			end
		;
	end loop;

	m_statement_1 := substr(m_statement_1, 1, length(m_statement_1) - 1);
	m_statement_1 := m_statement_1 || ') pctfree 25';

	execute immediate m_statement_1;

	for i in 1..10000 loop
		execute immediate m_statement_2;
	end loop;

end;
/

I’ve taken a fairly simple approach to building a string that creates a table – and it’s easy to adjust the number of columns – and a string to insert some values into that table. The insert statement will insert a row number into the first and last columns of the table and generate a random length string for a few of the columns. I’ve picked three possible sets of three columns for the random length string; one set is definitely going to be in the first row piece, one set is definitely going to be in the last row piece, and (since the row will split 110/255) one will be somewhere inside whichever is the larger row piece.

If I wanted to do something more sophisticated I’d probably have to switch to a PL/SQL array for the two statement strings – 32,767 characters doesn’t give me much freedom to play if I wanted to test a table with 1,000 columns.
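As a rough check on how much headroom the 32,767-character limit leaves, here is a minimal Python sketch that rebuilds the same two strings outside the database and reports their lengths. The function name build_statements is hypothetical, and treating "the last column" as a parameter (rather than the hard-coded 365) is an assumption made so the column count can be varied.

```python
PLSQL_VARCHAR2_LIMIT = 32767  # maximum length of a PL/SQL varchar2 variable


def build_statements(n_cols):
    """Rebuild the create and insert strings that the PL/SQL block assembles.

    Hypothetical helper: mirrors the loop in the script, but with the
    last column number passed in instead of being fixed at 365.
    """
    create = "create table t1(col0001 varchar2(10),"
    insert = "insert into t1 values(lpad(s1.nextval,10),"
    for i in range(2, n_cols + 1):
        create += f"col{i:04d} varchar2(100),"
        if i in (2, 3, 4):
            # random-length string columns
            insert += "dbms_random.string('U',ceil(dbms_random.value(0,100))),"
        elif i == n_cols:
            # last column repeats the row number
            insert += "lpad(s1.currval,7))"
        else:
            insert += f"'COL{i:04d}',"
    # drop the trailing comma and close the column list
    create = create[:-1] + ") pctfree 25"
    return create, insert


for n in (365, 1000):
    create, insert = build_statements(n)
    longest = max(len(create), len(insert))
    print(f"{n} columns: create={len(create)} chars, insert={len(insert)} chars, "
          f"headroom={PLSQL_VARCHAR2_LIMIT - longest} chars")
```

The printed headroom shows how little room is left for longer column names or more elaborate value expressions as the column count grows.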

Having created and populated my table, I performed the following three tests on it:


analyze table t1 compute statistics;

prompt	====
prompt	CTAS
prompt	====

create table t1a pctfree 25 as select * from t1;
analyze table t1a compute statistics;

select	table_name, num_rows, avg_row_len, blocks, chain_cnt
from	user_tables
where	table_name like 'T1%'
;

prompt	======
prompt	Insert
prompt	======

truncate table t1a;
insert into t1a select * from t1;
analyze table t1a compute statistics;

select	table_name, num_rows, avg_row_len, blocks, chain_cnt
from	user_tables
where	table_name like 'T1%'
;

prompt	=============
prompt	Insert append
prompt	=============

truncate table t1a;
insert /*+ append */ into t1a select * from t1;
analyze table t1a compute statistics;

select	table_name, num_rows, avg_row_len, blocks, chain_cnt
from	user_tables
where	table_name like 'T1%'
;

The first test creates a new table (t1a, at pctfree 25, matching the original) copying the original table with a simple “create as select”.

The second test truncates this table and does a basic “insert as select” to repopulate it.

The third test truncates the table again and does an “insert as select” with the /*+ append */ hint to repopulate it.

In all three cases (and with three variations of where the longer random strings went) I used the analyze command to gather stats on the tables so that I could get a count of the number of chained rows; and I dumped a couple of blocks from the tables to see what the inserted rows looked like.

Here’s a summary of the results from 11.2.0.4 when the random-length columns are near the start of the row (the position didn’t really affect the outcome, and the results for 12.1.0.2 and 12.2.0.1 were very similar):

====
CTAS
====

TABLE_NAME             NUM_ROWS AVG_ROW_LEN     BLOCKS  CHAIN_CNT
-------------------- ---------- ----------- ---------- ----------
T1                        10000        3062       6676       3313
T1A                       10000        3062       9504        237

======
Insert
======

TABLE_NAME             NUM_ROWS AVG_ROW_LEN     BLOCKS  CHAIN_CNT
-------------------- ---------- ----------- ---------- ----------
T1                        10000        3062       6676       3313
T1A                       10000        3062       6676       3287

=============
Insert append
=============

TABLE_NAME             NUM_ROWS AVG_ROW_LEN     BLOCKS  CHAIN_CNT
-------------------- ---------- ----------- ---------- ----------
T1                        10000        3062       6676       3313
T1A                       10000        3062       9504        237

As you can see we get two significantly different results: the CTAS and the “insert append” produce tables reporting 9,504 blocks and 237 chained rows, while the original table (single row inserts) and the regular “insert as select” produce tables with 6,676 blocks and roughly 3,300 chained rows (3,313 and 3,287 respectively). It seems that the CTAS (which would also cover “alter table move”) and direct path insert have minimised the number of chained rows at a cost of a dramatically increased number of blocks. (The scale of the difference happens to be particularly extreme in this case – I didn’t do this deliberately; it was simply a consequence of the way I happened to generate the data and the length of the rows.)
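To put the scale of the difference into proportion, here is a small Python sketch that turns the reported figures into rows per block and percentage of rows chained. The numbers are copied from the result tables above; the function name summarise is hypothetical.

```python
NUM_ROWS = 10_000

# (blocks, chain_cnt) as reported by user_tables in the three tests
results = {
    "T1  (single-row inserts)": (6676, 3313),
    "T1A (CTAS)": (9504, 237),
    "T1A (insert)": (6676, 3287),
    "T1A (insert append)": (9504, 237),
}


def summarise(blocks, chain_cnt, num_rows=NUM_ROWS):
    """Return rows per block and the percentage of rows reported as chained."""
    return {
        "rows_per_block": num_rows / blocks,
        "chained_pct": 100 * chain_cnt / num_rows,
    }


for label, (blocks, chain_cnt) in results.items():
    s = summarise(blocks, chain_cnt)
    print(f"{label:26} {s['rows_per_block']:.2f} rows/block, "
          f"{s['chained_pct']:.1f}% chained")

# Extra space used by the direct path methods relative to conventional inserts
extra_blocks_pct = 100 * (9504 - 6676) / 6676
print(f"direct path uses {extra_blocks_pct:.0f}% more blocks")
```

In other words the conventional inserts pack about one and a half rows per block with roughly a third of the rows chained, while the direct path methods store barely more than one row per block.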

We know, of course, that every row in this table will consist of two row pieces, one of 110 columns and one of 255 columns; so every row is in some respects chained due to the potential for intra-block chaining of those two pieces, but the analyze command reports only inter-block chaining i.e. only those rows that start in one block and end in another block – intra-block chaining doesn’t count as “proper” chaining (at least in this version of Oracle).

There are two questions to address in these results: the first is “What’s happening?”, the second, which we ask when we get the answer to the first, is “How come the direct path method still gives us some chained rows?”

I believe the answer to the first question is that the direct path method tries to keep the two (unavoidable) row-pieces of each row in the same block rather than chaining them across blocks. Even if it means leaving a huge amount of empty space in a block, Oracle starts a new row in a new block if there isn’t enough space for both of the anticipated row-pieces to fit in the current block. I think this may be a feature to help Exadata and its use of direct path reads for smart scans, where even a relatively small number of chained rows (which might be outside the current Exadata storage unit – and even in the disk space managed by another cell server) could have a catastrophic impact on performance, because the system would have to do a single block read to pick up each extra piece.
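The hypothesis can be illustrated with a toy packing model in Python. This is only a sketch of the idea, not Oracle’s algorithm: the function simulate is hypothetical, the piece sizes are taken loosely from the figures in this note (a 2,075-byte piece as in the block dump extract, and the remainder of the 3,062-byte average row length), and the usable capacity of 6,000 bytes per block is a guess chosen purely for illustration.

```python
def simulate(rows, capacity, keep_rows_whole):
    """Pack rows into blocks and return (blocks_used, chained_rows).

    rows            list of (first_piece_bytes, second_piece_bytes)
    capacity        assumed usable bytes per block (an illustrative guess)
    keep_rows_whole True  -> start a new block unless both pieces fit
                             (the behaviour suggested for direct path)
                    False -> fill the current block piece by piece
                             (the behaviour suggested for ordinary inserts)
    Assumes every complete row is smaller than the block capacity.
    """
    blocks, used, chained = 1, 0, 0
    for first, second in rows:
        if keep_rows_whole and used + first + second > capacity:
            blocks, used = blocks + 1, 0          # leave the free space unused
        piece_blocks = []
        for piece in (first, second):
            if used + piece > capacity:
                blocks, used = blocks + 1, 0      # piece goes to the next block
            used += piece
            piece_blocks.append(blocks)
        if piece_blocks[0] != piece_blocks[1]:
            chained += 1                          # pieces ended up in two blocks
    return blocks, chained


rows = [(2075, 3062 - 2075)] * 10_000
print("piece by piece :", simulate(rows, 6000, keep_rows_whole=False))
print("rows kept whole:", simulate(rows, 6000, keep_rows_whole=True))
```

With these assumed sizes the piece-by-piece strategy uses far fewer blocks but chains a large fraction of the rows, while keeping rows whole uses one block per row and chains nothing. How closely that matches a real table depends entirely on the assumed capacity and row sizes, and the model makes no attempt to reproduce the small number of chained rows that the direct path load still produces.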

So why do some rows still see chaining under this strategy? I think it’s because there’s a small error in the arithmetic somewhere (possibly visible only in ASSM tablespaces, perhaps related to row-piece headers) where Oracle thinks there’s enough space for both row pieces but there isn’t quite, so it tries to insert both pieces and has to chain the second one. (I’ll corroborate this hypothesis with some analysis of block dumps in a moment.)

So you have a choice – lots of wasted space and a little row-chaining, or maximum packing of data and (potentially) lots of row-chaining. But there’s more: I’ve said we get one row piece of 110 columns and one of 255 columns for each row, but the point at which the split occurs and the order in which the pieces are inserted depends on the method used.

Single row inserts (initial table creation, typical OLTP processing): The split occurred at column 111 – so the leading 110 columns are in one row piece and the trailing 255 columns are in the other – and the row piece with the trailing columns is inserted first.
Array inserts (normal): Exactly the same as the single row inserts.
Direct path inserts / CTAS: The split occurred at column 256, with the leading 255 column row-piece inserted first and the trailing 110 column row-piece inserted second.
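The two split rules can be written down as a few lines of Python. This is a minimal sketch covering only the two-piece case (256 to 510 columns); the function name row_piece_for is hypothetical, and "piece 1" here means the piece holding the leading columns, regardless of which piece is physically inserted first.

```python
MAX_PIECE_COLS = 255  # maximum number of columns in one row piece


def row_piece_for(col, total_cols=365, direct_path=False):
    """Return 1 if column `col` is in the leading row piece, 2 if in the trailing one.

    Conventional inserts: the trailing piece gets 255 columns, so the split
    comes after column (total_cols - 255).
    Direct path / CTAS:   the leading piece gets 255 columns, so the split
    comes after column 255.
    Only valid for tables that split into exactly two pieces.
    """
    if not MAX_PIECE_COLS < total_cols <= 2 * MAX_PIECE_COLS:
        raise ValueError("sketch only handles tables of 256 to 510 columns")
    split_after = MAX_PIECE_COLS if direct_path else total_cols - MAX_PIECE_COLS
    return 1 if col <= split_after else 2


# The three alternative positions used for the random-length strings
for first_col in (2, 122, 262):
    print(f"col{first_col:04d}: conventional piece "
          f"{row_piece_for(first_col)}, direct path piece "
          f"{row_piece_for(first_col, direct_path=True)}")
```

Columns 2 to 4 are in the leading piece and columns 262 to 264 in the trailing piece whichever method is used, but columns 122 to 124 change sides: they sit in the trailing (255 column) piece after conventional inserts and in the leading (255 column) piece after a direct path load.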

I’m not sure that this particular detail matters very much in normal circumstances when you consider the dramatic difference in size that can appear in the comparison between direct path and normal inserts, but maybe there’s someone who will notice a performance (or even space) side effect because of this inconsistency. We will see in a later post, though, that this difference can have an enormous impact if you subsequently add columns to the table and populate them.

I said I’d come back to the row-chaining anomaly. One of the little details that I didn’t include in my code listing was the call to “analyze table … list chained rows” that I did (after executing $ORACLE_HOME/rdbms/admin/utlchain.sql) to list the head rowids of the chained rows into the chained_rows table. After doing this I ran a simple PL/SQL loop to dump all the relevant blocks to the trace file:

begin
	for r in (
		select
			dbms_rowid.rowid_relative_fno(head_rowid) file#,
			dbms_rowid.rowid_block_number(head_rowid) block#
			from	chained_rows
		) loop
			execute immediate 'alter system dump datafile ' || r.file# || ' block ' || r.block#;
	end loop;
end;
/

Here’s a little extract from the resulting trace file showing you what the start of a row piece looks like when dumped:

tab 0, row 0, @0x1765
tl: 2075 fb: --H-F--- lb: 0x0  cc: 255
nrid:  0x01401bc4.1
col  0: [10]  20 20 20 20 20 20 20 33 36 32
col  1: [19]  41 52 43 4a 4a 42 4e 55 46 4b 48 4c 45 47 4c 58 4c 4e 56
col  2: [ 8]  59 4b 51 46 4a 50 53 55
col  3: [17]  53 58 59 4e 4a 49 54 4a 41 5a 5a 51 44 44 4b 58 4d
col  4: [ 7]  43 4f 4c 30 30 30 35
col  5: [ 7]  43 4f 4c 30 30 30 36

A convenient thing to check is the cc: (“column count”) entry (end of the second line). You can see that this row piece has 255 columns, and if you look at the first six columns dumped you can see that it’s the row numbered 362 (33 36 32), then there are three columns of different length strings, then two columns with the values ‘COL0005’ and ‘COL0006’ respectively. It’s the “cc:” entry that’s useful though, so I’m going to do a bit of simple unix hackery:

grep " cc: " test_ora_24398.trc | sed "s/^.*cc: //"  | sort | uniq -c | sort -n
      1 1
      5 2
    112 108
    125 109
    237 110
    474 255

In my 237 blocks with chained rows I had 474 row pieces of 255 columns and 237 row pieces of 110 columns; then I had 125 row pieces (cc: 109) that had lost, and therefore chained, one column and 112 row pieces (cc: 108) that had lost, and therefore chained, two columns. I also had a few small “tail-end” pieces (six in total, of one or two columns each) from earlier blocks scattered in these blocks. These figures suggest that there’s a small error (actually no more than about 20 bytes) in the calculation Oracle does to decide if it can fit a whole row into the current block or whether it has to go on to the next empty block.
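For anyone without grep and sed to hand, the same tally can be sketched in Python, along with a quick cross-check of the reported counts. The function name cc_histogram is hypothetical, and the 110-column sample line is made up for illustration; only the 255-column line is copied from the dump extract.

```python
import re
from collections import Counter

CC_PATTERN = re.compile(r"\bcc:\s*(\d+)")


def cc_histogram(lines):
    """Count row pieces by their cc: (column count) value."""
    counts = Counter()
    for line in lines:
        match = CC_PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "tab 0, row 0, @0x1765",
    "tl: 2075 fb: --H-F--- lb: 0x0  cc: 255",
    "nrid:  0x01401bc4.1",
    "tl: 2075 fb: --H-F--- lb: 0x0  cc: 255",
    "tl: 987 lb: 0x0  cc: 110",   # made-up line, for illustration only
]
for cc, n in sorted(cc_histogram(sample).items(), key=lambda kv: kv[1]):
    print(f"{n:7d} {cc}")

# Cross-check of the figures reported by the grep pipeline
reported = {1: 1, 2: 5, 108: 112, 109: 125, 110: 237, 255: 474}
# every chained row has a short trailing piece: 112 + 125 = 237
assert reported[108] + reported[109] == reported[110]
# two 255-column pieces for every complete 110-column piece
assert reported[255] == 2 * reported[110]
```

To run it against a real trace file, pass an open file handle (for example open("test_ora_24398.trc")) instead of the sample list.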

Conclusions

When copying a table defined with more than 255 columns there’s the potential for a huge variation in the space usage and chain count depending on whether you do a CTAS (or insert /*+ append */) or a simple insert. You have to decide which option is the biggest threat to your available resources.

There is a little anomaly with the way in which rows are split that is also dependent on the method used for copying – this may also have some effect, though perhaps small enough to be ignored when compared with the space/chaining difference as far as ordinary OLTP processing is concerned. But there are some important side effects we will consider in a later post.

Even though CTAS/direct path insert can eliminate a lot of row chaining it is still possible to find some row chaining in the resulting data. This may be the result of a calculation error (or possibly a deliberate space saving compromise).

Note that any comments about using CTAS to copy a table also apply to “alter table move” and to using expdp/impdp.
