When I introduced Connor McDonald’s blog a few days ago, it was because we had exchanged a couple of email messages (through the Oak Table Network) about how to minimise the resource consumption when copying a load of data from one IOT to another of the same structure. His problem was the way in which the obvious way of copying the data resulted in a massive sort even though, in principle, it should not have been necessary to sort anything since the data could have been extracted in order by walking the existing IOT.
As a suggestion I referenced a comment I had made in the Addenda to Practical Oracle 8i about 12 years ago when I had first solved the problem of loading an IOT with minimal logging and no sorting. At the time I had been loading data from a sorted file into an empty table that was then going to be exchnaged into a partitioned IOT – but it crossed my mind that loading from a flat file and loading from a UNIX pipe were pretty much the same thing, so perhaps Connor could workaround his problem by making one session spool to a pipe while another session was reading it. In the end, he simply created a massive temporary tablespace, but I thought I’d modify a test script I wrote a few years ago to demonstrate my idea – and here it is:
There’s never enough time to read everything that’s worth reading, so even though Guy Harrison’s blog is one of the ones worth reading I find that it’s often months since I last read it. Visiting it late last night, I found an interesting batch of articles spread over the last year about the performance of SSD – the conclusions may not be what you expect, but make sure you read all the articles or you might end up with a completely misleading impression:
Don’t forget to read the comments as well. For other notes Guy has written about SSD, here’s a URL for his SSD tag.
In an earlier post I showed you how you could generate SQL to analyze the distribution of row sizes in a table. In the introduction to the code I made a comment about how it failed to “allow for nulls at the end of rows”; a feature of Oracle storage that isn’t commonly known is that a set of consecutive null columns at the end of a row take up no storage space, while any null columns followed by a non-null column take up one byte (holding the value 0xFF) per column so that Oracle can “count its way” through the null columns to the non-null column. Consider this example:
A recent question on Oracle-L (a performance problem creating indexes that turned into an issue with migrated rows) prompted me to revisit a piece of code I first wrote about 20 years ago for an Oracle 6 system, and first published in November 2000. It addresses the question of analysing row-lengths in detail and, if you look at the example I’ve linked to, you’ll see that this could be very useful when you’re trying to work out suitable settings for pctfree and why you’re suffering from row migration.
The script is strictly limited to “simple” tables, by which I mean heap tables with columns that are basic data types and don’t include such things as Longs, LOBs, nested tables and all the other nasty things that usually break simple utilities. All it does is estimate the length of each row, then aggregating by row length. The estimate doesn’t allow for nulls at the ends of rows or columns longer than 254 bytes – technically the former don’t need length bytes and the latter use 3 bytes for the length when the column is more than 250 bytes. I don’t use the script often, and haven’t got around to including all the column types it won’t work for – that’s left as an exercise for the reader.
Following a question on the Oracle Core Addenda pages, here’s a little script to tell you about the sizes of the public and private redo threads currently active in the instance. It’s a minor variation of a script I published in Appendix D (Dumping and Debugging), page 237 to show the addresses of current activity in the various log buffers:
One of the partitioning features that Oracle introduced relatively recently was the “interval partition”, for example:
create table transactions (
account_id number(8) not null,
transaction_date date not null,
transaction_type varchar2(2) not null,
transaction_id varchar2(10) not null,
amount number(10,2) not null,
partition by range (transaction_date)
partition p200801 values less than (to_date('01-FEB-2008','DD-MON-YYYY'))
When I insert data into this table for a partition that doesn’t yet exist, Oracle will work out which partition it should be and create it automatically before doing the insert. The benefit of this trick, of course, is that the DBAs and developers don’t have to write any code to add partitions in anticipation of time passing and new data appearing.
Here’s a simple piece of SQL that could, in theory, compare the current size of a table with the size it might be after a call to “alter table move” – and it’s followed by the results for a table that’s current in the database that I’m looking at:
blocks, num_rows, avg_row_len, pct_free,
ceil(num_rows * avg_row_len / (8000 * ((100 - pct_free)/100))) blocks_needed
table_name = 'T1'
BLOCKS NUM_ROWS AVG_ROW_LEN PCT_FREE BLOCKS_NEEDED
---------- ---------- ----------- ---------- -------------
25 1000 22 10 4
A question came up on the OTN database forum a few days ago asking why a very simple query against a hash partitioned table was doing so much physical I/O to get a very small amount of data from a single partition:
I have table hash-partitioned on an indexed column “id” which is non-unique and part of my primary key. Inside each partition, rows with the same id are located close to each other which is done by dbms_redefinition reorg using orderby_cols. The intention is to reduce the amount of physical reads since there are no queries that don’t filter on the id column.
What I’m seeing is a lot of physical reads though. The first partition has roughly 80K rows, an average row length of 347, block size of 8K and compression … resulting in 821 blocks. And when (after flushing buffered cache and shared pool) submit a query that filters on “id” only and results is 106 selected rows, I see roughly 1400 physical reads.
The poster has clearly applied some careful thought to minimising the work required – so what’s gone wrong ? To help us analyse the problem he posted the results from running a query through autotrace:
A few years ago (2007) I wrote about a problem that could appear when you mixed parallel execution with system managed extent allocation. A couple of years later I added a note that Christian Antognini had observed a patch in 126.96.36.199 that addressed the specific issue I had raised. Today, thanks to an email exchange with Christo Kutrovsky of Pythian, I can report that there is a variation of this issue still available even in 188.8.131.52.
The basic problem is that you can end up with a very large number of very small extents, leading to poor performance in parallel queries and a significant waste of space in a data segment. Here’s a simple, though not particularly realistic, way to demonstrate the problem.
There was a question on the OTN database forum a little while ago about the different between the relative and absolute files numbers for a data file as reported in view dba_data_files – where they appear as file_id and relative_fno – or v$datafile – where they appear as file# and rfile#. In response, someone resurrected a very old posting of mine from the Usenet newsgroup comp.databases.oracle.server, which says the following:
A recent posting on the OTN database forum (which I didn’t answer correctly on my first attempt) raised the problem of truncating partitions when there are referential integrity constraints in place. Let’s start with a demonstration of the problem:
Here’s another one of those little changes that sidles in when you’re not looking.
When locally managed tablespaces first appeared, there were a number of posts on Usenet (comp.databases.oracle.server) and the metalink lists (the OTN database forum didn’t exist at the time) about “missing” space in data files. The commonest sort of comment was along the lines of
“I’ve created a data file of 100 MB, how come I can only create 99 extents of 1 MB each?”
“I’ve created a data file of 10 GB, how come I can only create 9 extents of 1 GB each?”
For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.
It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while reading the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment on page 429 to the effect that:
In NOARCHIVELOG mode all bulk operations (such as INSERT, APPEND, index REBUILD and ALTER TABLE MOVE) are automatically nologging.
I was in a discussion recently about how to estimate the size of a bitmap index before you build it, and why it’s much harder to do this for bitmap indexes than it is for B-tree indexes. Here’s what I wrote in “Practical Oracle 8i”:
If you drop a unique or primary key constraint the index that supports it may be dropped at the same time – but this doesn’t always happen. Someone asked me recently if it was possible to tell whether or not an index would be dropped without having to find out the hard way by dropping the constraint. The answer is yes – after all, Oracle has to make a decision somehow, so if we can find out how it makes the decision we can predict the decision.
So here’s my best theory so far – along with the observations that led to it. First, run a trace while dropping a primary key constraint and see if this gives you any clues; on an instance running 10gR2 I noticed the following statement appearing in the trace file immediately after the delete from cdef$ (constraint definitions).