Oracle Scratchpad

August 11, 2019


Filed under: Oracle,Performance,Troubleshooting — Jonathan Lewis @ 9:28 pm BST Aug 11,2019

Here’s a question to provoke a little thought if you’ve got nothing more entertaining to do on a Sunday evening.  What threats do you think of when you see a statement like the following in (say) an AWR report or in a query against v$sql?

update tableX set
        col001 = :1, col002 = :2, col003 = :3, ...
        -- etc. 
        -- the names are supposed to indicate that the statement updates 302 columns
        -- etc.
        col301 = :301, col302 = :302
where   pk_col = :303

I’ll be writing up some suggestions tomorrow (Monday, UK BST), possibly linking to a few other articles for background reading.


The first three comments have already hit the high points, but I’m going to jot down a few notes anyway.

The first two things that really (should) spring to mind are:

  • There’s a table in the application with (at least) 303 columns – anything over 255 is a potential disaster area
  • An update statement that updates 302 columns is probably machine generated by a non-scalable application

A secondary detail that might be useful is recognising the style of the text – lower case for column names, simple “:nnn” for bind variables.  As it stands I don’t recognise the possible source of this convention, but I know it’s not SQL embedded in PL/SQL (which would be all capitals for the SQL with “:Bnnn” as bind variable names); it’s not part of a distributed query coming in from a remote database (which would be in capitals with names in double quotes, and table aliases like “A1”, “A2”, …); and it’s not “raw” Hibernate code, which produces meaningless table and column aliases based on numbers, with a leading “t” for table names and “c” for column names.

So let’s think about possible problems and symptoms relating to the two obvious issues:

Wide tables

Once you have more than 255 (real) columns in a table – even if that count includes columns that have been marked unused – Oracle will have to split rows into “rowpieces” that do not exceed 255 columns and chain those pieces together. Oracle will try to be as efficient as possible – with various damage-limitation coding strategies that have appeared across versions – attempting to store these rowpieces together and keeping the number to a minimum, but there are a number of anomalies that can appear that have a significant impact on performance.
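As a first pass when checking an unfamiliar system, a simple dictionary query can flag the tables that have crossed the 255 boundary – a sketch only, using user_tab_columns (on a live system you’d probably use dba_tab_columns with an owner filter; remember, too, that columns marked unused disappear from this view but are reported through user_unused_col_tabs):

select  table_name, count(*) column_ct
from    user_tab_columns
group by
        table_name
having  count(*) > 255
order by
        count(*) desc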

Simply having to visit two rowpieces to pick up a column in the 2nd rowpiece (even if it is in the same block) adds to the cost of processing; but when you have to visit a second block to acquire a 2nd (or 3rd, or 4th) rowpiece the costs can be significant. As a quirky little extra, Oracle’s initial generation of rowpieces creates them from the end backwards – so a row with 256 columns starts with a rowpiece of one column followed by a rowpiece of 255 columns: you may therefore find that you have to fetch multiple rowpieces for virtually every row you access.

It’s worth noting that row splitting is based only on columns that have been used in the row. If your data is restricted to the first 255 columns of a row then the entire row can be stored as a single rowpiece (thanks to the basic rule that “trailing nulls take no space”); but as soon as you start to update such a row by populating columns past the 255 boundary Oracle will start splitting from the end – and it may create a new trailing rowpiece each time you populate a column past the current “highest” column.  In an extreme case I’ve managed to show an example of a single row consisting of 746 rowpieces, each in a different block (though that’s a bug/feature that’s been addressed in recent versions of Oracle).
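Here’s a minimal sketch of that mechanism – all the names are invented for the example: a 300-column table where a row initially uses only its first column (a single rowpiece, thanks to trailing nulls) until an update populates a column past the 255 boundary:

create table t_wide (
        col001  number,
        col002  number,
        -- etc.
        -- (col003 to col299 declared the same way)
        col300  number
);

insert into t_wide (col001) values (1);         -- trailing nulls: one rowpiece
commit;

update t_wide set col300 = 1;                   -- past column 255: Oracle splits the row
commit;

Whether a subsequent query that touches col300 then increments “table fetch continued row” depends on your version and on whether the rowpieces ended up in the same block.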

If rows have been split across multiple blocks then one of the nastier performance problems appears with direct path read tablescans. As Oracle follows the pointer to a secondary rowpiece it will do a physical read of the target block and then immediately forget that block. So, for example, if you have inserted 20 (short) rows into a block, then updated all of them in a way that makes them split, and all their 2nd rowpieces have gone to the same block further down the table, you can find that Oracle does a single direct path read that picks up the “head” pieces followed by 20 “db file sequential read” calls to the same block to pick up the follow-on pieces. (The same effect appears with simple migrated rows.) Conversely, if you did the same tablescan using “db file scattered read” requests then Oracle might record a single, highly deceptive “table fetch continued row” because it managed to pin the target block and revisit it 20 times.

Often a very wide row (large number of columns) means the table is trying to hold data for multiple types of object. So a table of 750 columns may use the first 20 columns for generic data, columns 21 to 180 for data for object type A, 181 to 395 for data for object type B, and so on.  This can lead to rows with a couple of hundred used columns and several hundred null columns in the middle of each row – taking one byte per null column and burning up lots of CPU as Oracle walks a row to find a relevant column. A particularly nasty impact can appear from this type of arrangement when you upgrade an application:  imagine you have millions of rows of the “type A” described above which use only the first 180 columns.  For some reason the application adds one new “generic” column that (eventually) has to be populated for each row – as the column is populated for a type A row the row grows by roughly 570 bytes (one byte for each of the null columns from 181 onwards) and splits into at least 3 pieces. The effect could be catastrophic for anyone who had been happy with their queries reporting type A data.

One of the difficulties of handling rows that are chained due to very high column counts is that the statistics can be quite confusing (and subject to change across versions). The most important clue comes from “table fetch continued row”; but this doesn’t tell you whether your “continued” rows are migrated or chained (or both), which table they come from, or whether you’ve been fetching the same small number of rows repeatedly or many different rows. Unfortunately the segment statistics (v$segstat / v$segment_statistics) don’t capture the number of continued fetches by segment – it would be nice if they did, since it ought to be a rare event and the instrumentation cost would therefore be low for most users. The best you can do, probably, is to look at the v$sql_monitor report for queries that report tablescans against large tables but also report large numbers of single block reads within the tablescan (with the proviso that those might be reads of undo blocks) – and for repeatable cases enable SQL trace with wait state tracing against suspect queries to see if they show the characteristic mix of direct path reads and repeated db file sequential reads.
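For a repeatable case the session-level check might look like this – read the statistic before and after running the suspect query, and use the classic 10046 event at level 8 for the wait-state trace:

select  n.name, m.value
from    v$statname n, v$mystat m
where   m.statistic# = n.statistic#
and     n.name = 'table fetch continued row'
;

alter session set events '10046 trace name context forever, level 8';

-- run the suspect tablescan here, then:

alter session set events '10046 trace name context off';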

Update every column

The key performance threat in statements that update every column – including the ones that didn’t change – is that Oracle doesn’t compare before and after values when doing the update. Oracle’s code path assumes you know what you’re doing so it saves every “old” value to an undo record (which gets copied to the redo) and writes every “new” value to a redo change vector.  (Fortunately Oracle does check index definitions to see which index entries really have suffered changes, so it doesn’t visit index leaf blocks unnecessarily.) It’s possible that a front-end tool which adopts this approach has a configuration option that switches from “one SQL statement for all updates” to “construct a minimal statement based on screen changes”.

The simple trade-off between these two options is the undo/redo overhead vs. parsing and optimisation overheads as the tool creates custom statements on demand. In the case of the table with more than 255 columns, of course, there’s the added benefit that an update of only the changed columns might limit the update to columns that are in the first rowpiece, eliminating the need (some of the time) to chase pointers to follow-up pieces.
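If you want to put numbers to the trade-off, a before-and-after snapshot of a couple of session statistics around each style of update makes the comparison easy – a sketch only, based on the statement we started with: run one style of update, re-run the query and subtract, then repeat around the other style.

select  n.name, m.value
from    v$statname n, v$mystat m
where   m.statistic# = n.statistic#
and     n.name in ('redo size', 'undo change vector size')
;

-- e.g. the minimal version of the original statement:
update tableX set col017 = :1 where pk_col = :2;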

Limiting the update can help with undo and redo, of course, but if the tool always pulls the entire row to the client anyway you still have network costs to consider. With the full row pulled and then updated you may find it takes several SQL*Net roundtrips to transfer the whole row between client and server.  In a quick test on a simple 1,000 column table, with an update that set every column in a single row to null (using a bind variable), I found that a default setup couldn’t even get 1,000 NULLs (let alone “real values”) across the network without incurring a wait on “SQL*Net more data from client”:

variable b1 number
exec :b1 := null;

update t1 set
        col0001 = :b1,
        col0002 = :b1,
        -- etc.
        col1000 = :b1

Although “SQL*Net message to/from client” is typically (though not always correctly) seen as an ignorable wait, you need to be a little more suspicious of losing time to “SQL*Net more data to/from client”. The latter two waits mean you’re sending larger volumes of information across the network and maybe you can eliminate some of the data or make the transfer more efficient – perhaps a simple increase in the SDU (session data unit) in the tnsnames.ora, listener.ora, or sqlnet.ora (for default_sdu_size) might be helpful.
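For reference, the client-side setting is per connect descriptor – an illustrative tnsnames.ora fragment (alias, host and service names are made up), with the server side needing a matching value in the listener.ora description or in sqlnet.ora:

orcl_bigsdu =
  (description =
    (sdu = 65535)
    (address = (protocol = tcp)(host = dbhost)(port = 1521))
    (connect_data = (service_name = orcl))
  )

# system-wide alternative in sqlnet.ora:
# default_sdu_size = 65535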


One of the features of trouble-shooting from cold is that you don’t know very much about the system you’re looking at – so it’s nice to be able to take a quick overview of a system, looking for any potentially interesting anomalies, and note what they are and what thoughts they suggest before you start asking questions and digging in. This article is just a couple of brief notes along the lines of: “that doesn’t look nice – what questions does it prompt?”


  1. unnecessary resource consumption by updating every column (even when values have not changed) comes to mind

    Comment by jkstill — August 11, 2019 @ 10:13 pm BST Aug 11,2019 | Reply

  2. Row chaining because we have more than 255 cols.

    The fact that there are bind variables tells me that this is probably not a one-time query… Which means we’re making (possibly bulk) changes in a row-by-row fashion.

    Probably what’s going on is something ORM-ish where we get a row (or rows), change the info on the client side, and put it back. Which is terrible because then you end up selecting all columns out (even if you don’t need them) and writing all columns back (even if they’re unchanged) and processing everything in a row-by-row fashion.

    Since we’re essentially replacing the entire row, we’re also probably writing to undo more than we should. All this (writing to data file +writing to undo) will cascade into generating more redo as well.

    Comment by Kaley Crum — August 11, 2019 @ 11:11 pm BST Aug 11,2019 | Reply

  3. Some further thoughts on the above.

    1. I agree with Kaley that this looks very much as if it is generated code (from an ORM or other such mechanisms). These sorts of applications often write a small subset of data to the rows at creation time and then update the same row at least once, and sometimes several times with more data as it becomes available. That likely means row migration as well as row-chaining.

    2. It is pretty unlikely that a well-defined entity really has 302 attributes; it’s much more likely that this is an attempt to store different types of data in the same table (or report data) – possibly to avoid joins, which are well-known performance threats in many parts :(. Whilst breaking normalized design can be useful – it’s far more often a cause of ongoing performance issues in its own right.

    3. If the column names are not obfuscated (though I think they probably are) then you almost certainly have an EAV data model which will certainly not scale.

    4. From experience, if your 300+ column table is an IOT then you will be in a world of pain :(

    Comment by nlitchfield — August 12, 2019 @ 10:03 am BST Aug 12,2019 | Reply

  4. Being a PK based update, I am thinking there might be “lost updates” in case of concurrent updates

    Comment by Narendra — August 12, 2019 @ 4:19 pm BST Aug 12,2019 | Reply

  5. The column naming convention reminds me of the BMC Remedy schema.

    Comment by Fran — August 13, 2019 @ 12:56 pm BST Aug 13,2019 | Reply


Comments and related questions are welcome.
