Question – How can you have a single file in a single tablespace showing multiple free extents when there are no objects using any space in that file ? For example, from an 184.108.40.206 database:
February 25, 2013
February 11, 2013
I was at a client site recently where one of the end-users seemed to have discovered a cunning strategy for optimising a critical SQL statement. His problem was that his query screen times out after 2 minutes, so any query he runs has to complete in less than two minutes or he doesn’t see the results. Unfortunately he had a particular query which took nearly 32 minutes from cold to complete – partly because it’s a seven-table join using ANSI OUTER joins, against tables ranging through the 10s of millions of rows and gigabytes of data – the (necessary) tablescan of the table that had to be first in the join order took 70 seconds alone.
But our intrepid user seems to have made an important discovery and engineered a solution to his performance problem. I think he’s noticed that when you run a query twice in a row the second execution is often faster than the first. I can’t think of any other reason why the same person would run the same query roughly every four minutes between 8:00 and 9:00 am every morning (and then do the same again around 5:00 in the afternoon).
Looking at the SQL Monitoring screen around 10:00 the first day I was on-site I noticed this query with a very pretty graphic effect of gradually shrinking blue bars as 32 minutes of I/O turned into 2 minutes of CPU over the course of 8 consecutive executions which reported run times something like: 32 minutes, 25 minutes, 18 minutes, 12 minutes, 6 minutes, 4 minutes, 2.1 minutes, 2 minutes.
It’s lucky (for that user) that the db_cache_size is 60GB. On the other hand this machine is one of those Solaris boxes that likes to pretend it’s got 128 CPUs when really it’s only 16 cores with 8 lightweight threads per core – you don’t want anyone running a query that burns 2 solid CPU minutes on one of those boxes, because it’s really tying up 1/16th of your CPU capacity while showing up in the load figures as just 1/128th.
Footnote: the query can be optimised (properly) – it accessed roughly 100M rows of data to return roughly 300 rows (with no aggregation), so we just need to do a little bit of work on precise access paths.
January 24, 2013
Red Gate have asked me to write a few articles for their Oracle site, so I’ve sent them a short series on “traditional” compression in Oracle – which means I won’t be mentioning Exadata hybrid columnar compression (HCC a.k.a. EHCC). There will be five articles, published at the rate of one per week starting Tuesday (15th Jan). I’ll be supplying links for them as they are published.
December 28, 2012
Here’s a little puzzle if you have nothing better to do between now and the new year. The following extract came from 220.127.116.11, but could have come from 10.2.0.5 or 18.104.22.168 (and many others). I’ve got a simple heap table where the last thing I (or anyone) did was “alter table t1 move” before dumping the first (data) block of the table. Looking at the resulting trace file, I see the following:
fsbo=0x56e fseo=0xf4d avsp=0x5f tosp=0x5f
If you need to have the acronyms decoded they are (according to my best guess):
- fsbo – free space, begin offset
- fseo – free space, end offset
- avsp – available space
- tosp – total space
Doing the arithmetic, the free space starts at offset 0x56e and ends at 0xf4d, which means the free space gap is 2,527 bytes; but the total space available for use is only 0x5f bytes, i.e. 95 bytes. So what has happened to the other 2,432 ?
Remember – I dumped the block immediately after issuing “alter table t1 move”, so there are no issues of delayed block cleanout, uncommitted transactions etc. to worry about.
Footnote: the reason why you have “available space” and “total space” is to keep track of the space made available by deleted rows. The “avsp” (usually) reports the size of the gap between the row directory and the row heap; the “tosp” includes the space in the holes left in the row heap after rows have been deleted (or updated in a way that moves them up to the top of the heap, leaving a gap behind them, or updated in situ in a way that reduces the row length leaving a little hole).
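If you want to reproduce the dump, the setup needn’t be any more elaborate than the following sketch (the table, column names and row count are my invention, and you need an account privileged enough to dump blocks):

create table t1 (id number, padding varchar2(100));

insert into t1
select	rownum, rpad('x',100)
from	dual
connect by
	level <= 30
;
commit;

alter table t1 move;

rem	pick up the file and block number of the first data block

column file_no  new_value m_file
column block_no new_value m_block

select
	dbms_rowid.rowid_relative_fno(rowid)	file_no,
	dbms_rowid.rowid_block_number(rowid)	block_no
from	t1
where	rownum = 1;

alter system dump datafile &m_file block &m_block;

The dump appears in the session’s trace file.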
November 27, 2012
When I introduced Connor McDonald’s blog a few days ago, it was because we had exchanged a couple of email messages (through the Oak Table Network) about how to minimise the resource consumption when copying a load of data from one IOT to another of the same structure. His problem was that the obvious way of copying the data resulted in a massive sort even though, in principle, nothing should have needed sorting since the data could have been extracted in order by walking the existing IOT.
As a suggestion I referenced a comment I had made in the Addenda to Practical Oracle 8i about 12 years ago when I had first solved the problem of loading an IOT with minimal logging and no sorting. At the time I had been loading data from a sorted file into an empty table that was then going to be exchanged into a partitioned IOT – but it crossed my mind that loading from a flat file and loading from a UNIX pipe were pretty much the same thing, so perhaps Connor could work around his problem by making one session spool to a pipe while another session was reading it. In the end he simply created a massive temporary tablespace, but I thought I’d modify a test script I wrote a few years ago to demonstrate my idea – and here, in outline, is what it looks like:
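(A sketch with invented names – t1_iot for the source IOT, t2_iot.ctl for a SQL*Loader control file describing the target, /tmp/iot_pipe for the pipe – rather than a script you can run unchanged.)

# Session 1 walks the source IOT in key order, spooling to a named pipe;
# session 2 runs a direct path SQL*Loader session that reads the pipe -
# so the data is never sorted and never hits a flat file on disc.

mknod /tmp/iot_pipe p

sqlplus -s test_user/test <<'EOF' &
set pagesize 0 linesize 60 feedback off
spool /tmp/iot_pipe
select id || ',' || small_vc from t1_iot order by id;
spool off
exit
EOF

sqlldr userid=test_user/test control=t2_iot.ctl data=/tmp/iot_pipe direct=true

rm /tmp/iot_pipe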
October 5, 2012
There’s never enough time to read everything that’s worth reading, so even though Guy Harrison’s blog is one of the ones worth reading I find that it’s often months since I last read it. Visiting it late last night, I found an interesting batch of articles spread over the last year about the performance of SSD – the conclusions may not be what you expect, but make sure you read all the articles or you might end up with a completely misleading impression:
- Using flash disk for redo on Exadata
- Using SSD for redo on Exadata – pt. 2
- Exadata smart flash logging
- Exadata smart flash logging – outliers
Don’t forget to read the comments as well. For other notes Guy has written about SSD, here’s a URL for his SSD tag.
October 1, 2012
In an earlier post I showed you how you could generate SQL to analyze the distribution of row sizes in a table. In the introduction to the code I made a comment about how it failed to “allow for nulls at the end of rows”; a feature of Oracle storage that isn’t commonly known is that a set of consecutive null columns at the end of a row takes up no storage space, while any null column followed (eventually) by a non-null column takes up one byte (holding the value 0xFF) so that Oracle can “count its way” through the null columns to the non-null column. Consider this example:
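(The names here are my own invention – a minimal sketch of the point:)

create table t_nulls (c1 number, c2 number, c3 number, c4 number);

insert into t_nulls values (1, null, null, null);	-- trailing nulls: nothing stored for c2 - c4
insert into t_nulls values (2, null, null, 4);		-- c2 and c3 each stored as a single 0xFF byte
commit;

alter table t_nulls move;

Dump the data block and you should find the first row reported with a column count (cc) of just 1, while the second row shows cc = 4 with the two middle columns reported as *NULL*.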
September 27, 2012
A recent question on Oracle-L (a performance problem creating indexes that turned into an issue with migrated rows) prompted me to revisit a piece of code I first wrote about 20 years ago for an Oracle 6 system, and first published in November 2000. It addresses the question of analysing row-lengths in detail and, if you look at the example I’ve linked to, you’ll see that this could be very useful when you’re trying to work out suitable settings for pctfree and why you’re suffering from row migration.
The script is strictly limited to “simple” tables, by which I mean heap tables with columns that are basic data types and don’t include such things as LONGs, LOBs, nested tables and all the other nasty things that usually break simple utilities. All it does is estimate the length of each row, then aggregate by row length. The estimate doesn’t allow for nulls at the ends of rows, or for columns longer than 250 bytes – technically the former don’t need length bytes and the latter use 3 bytes (rather than one) for the column length. I don’t use the script often, and haven’t got around to extending it to cover all the column types it won’t work for – that’s left as an exercise for the reader.
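To give the flavour of the thing, the SQL it generates is roughly the following shape – sketched here by hand for an imaginary two-column table t1(col1, col2):

select
	row_length, count(*)	row_count
from	(
	select
		3					-- row overhead: flag byte, lock byte, column count
	+	nvl(vsize(col1),0) + 1			-- column data plus one length byte
	+	nvl(vsize(col2),0) + 1
			row_length
	from
		t1
	)
group by
	row_length
order by
	row_length
;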
September 17, 2012
Following a question on the Oracle Core Addenda pages, here’s a little script to tell you about the sizes of the public and private redo threads currently active in the instance. It’s a minor variation of a script I published in Appendix D (Dumping and Debugging), page 237, to show the addresses of current activity in the various log buffers:
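(I haven’t reproduced the whole script here, but as a rough sketch it revolves around x$kcrfstrand, so it has to run from an account that can see the x$ structures, typically SYS; the key column is strand_size_kcrfa, though the columns you can use to separate public from private strands vary with version:)

select	indx, strand_size_kcrfa
from	x$kcrfstrand
order by
	indx
;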
September 8, 2012
One of the partitioning features that Oracle introduced relatively recently was the “interval partition”, for example:
create table transactions (
	account_id		number(8)	not null,
	transaction_date	date		not null,
	transaction_type	varchar2(2)	not null,
	transaction_id		varchar2(10)	not null,
	amount			number(10,2)	not null,
	padding			varchar2(100)
)
partition by range (transaction_date)
interval (numtoyminterval(1,'MONTH'))
(
	partition p200801 values less than (to_date('01-FEB-2008','DD-MON-YYYY'))
)
;
When I insert data into this table for a partition that doesn’t yet exist, Oracle will work out which partition it should be and create it automatically before doing the insert. The benefit of this trick, of course, is that the DBAs and developers don’t have to write any code to add partitions in anticipation of time passing and new data appearing.
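For example (a quick sketch – the values are arbitrary): insert a row for a month that has no partition yet, and a new partition with a system-generated name (something like SYS_P1234) appears:

insert into transactions values (
	1, to_date('15-JUN-2008','DD-MON-YYYY'), 'TX', 'T000000001', 99.99, rpad('x',100)
);
commit;

select	partition_name, high_value
from	user_tab_partitions
where	table_name = 'TRANSACTIONS'
order by
	partition_position
;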
August 27, 2012
Here’s a simple piece of SQL that could, in theory, compare the current size of a table with the size it might be after a call to “alter table move” – and it’s followed by the results for a table that’s currently in the database I’m looking at:
select
	blocks, num_rows, avg_row_len, pct_free,
	ceil(num_rows * avg_row_len / (8000 * ((100 - pct_free)/100)))	blocks_needed
from
	user_tables
where
	table_name = 'T1'
;

    BLOCKS   NUM_ROWS AVG_ROW_LEN   PCT_FREE BLOCKS_NEEDED
---------- ---------- ----------- ---------- -------------
        25       1000          22         10             4
July 13, 2012
A question came up on the OTN database forum a few days ago asking why a very simple query against a hash partitioned table was doing so much physical I/O to get a very small amount of data from a single partition:
I have a table hash-partitioned on an indexed column “id”, which is non-unique and part of my primary key. Inside each partition, rows with the same id are located close to each other, which was done by a dbms_redefinition reorg using orderby_cols. The intention is to reduce the amount of physical reads since there are no queries that don’t filter on the id column.

What I’m seeing is a lot of physical reads, though. The first partition has roughly 80K rows, an average row length of 347, a block size of 8K and compression … resulting in 821 blocks. And when (after flushing the buffer cache and shared pool) I submit a query that filters on “id” only, returning 106 rows, I see roughly 1,400 physical reads.
The poster has clearly applied some careful thought to minimising the work required – so what’s gone wrong ? To help us analyse the problem he posted the results from running a query through autotrace:
June 14, 2012
A few years ago (2007) I wrote about a problem that could appear when you mixed parallel execution with system managed extent allocation. A couple of years later I added a note that Christian Antognini had observed a patch in 22.214.171.124 that addressed the specific issue I had raised. Today, thanks to an email exchange with Christo Kutrovsky of Pythian, I can report that there is a variation of this issue still available even in 126.96.36.199.
The basic problem is that you can end up with a very large number of very small extents, leading to poor performance in parallel queries and a significant waste of space in a data segment. Here’s a simple, though not particularly realistic, way to demonstrate the problem.
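(A sketch under stated assumptions – a locally managed tablespace with system-managed (autoallocate) extents, and invented names: create a source data set, copy it with parallel DDL, then count the extents that appear.)

create table t_source as
select	rownum id, rpad('x',100) padding
from	dual
connect by
	level <= 1e6
;

create table t_target
parallel 8
as
select	/*+ parallel(s 8) */ *
from	t_source s
;

select	count(*) extent_count, min(bytes) smallest, max(bytes) largest
from	user_extents
where	segment_name = 'T_TARGET'
;

With eight PX servers each allocating their own (initially 64KB) extents you can end up with far more, and far smaller, extents than a serial create would produce.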
June 7, 2012
There was a question on the OTN database forum a little while ago about the difference between the relative and absolute file numbers for a data file as reported in the view dba_data_files – where they appear as file_id and relative_fno – or v$datafile – where they appear as file# and rfile#. In response, someone resurrected a very old posting of mine from the Usenet newsgroup comp.databases.oracle.server, which says the following:
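For reference, the columns in question can be listed with a couple of trivial queries – a quick sketch (you need select privileges on dba_data_files and v$datafile):

select	file_id, relative_fno, file_name
from	dba_data_files
order by
	file_id
;

select	file#, rfile#, name
from	v$datafile
order by
	file#
;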
June 5, 2012
A recent posting on the OTN database forum (which I didn’t answer correctly on my first attempt) raised the problem of truncating partitions when there are referential integrity constraints in place. Let’s start with a demonstration of the problem:
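(A sketch with invented names and simplified details, rather than the original demonstration:)

create table parent (
	id		number(8)	not null,
	small_vc	varchar2(10),
	constraint par_pk primary key (id)
)
partition by range (id) (
	partition p1 values less than (1000),
	partition p2 values less than (2000)
);

create table child (
	id_par		number(8)	not null references parent,
	small_vc	varchar2(10)
);

alter table parent truncate partition p1;

The truncate typically fails with “ORA-02266: unique/primary keys in table referenced by enabled foreign keys” – even when the child table holds no data at all.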