A question came up on the OTN database forum a few days ago asking why a very simple query against a hash partitioned table was doing so much physical I/O to get a very small amount of data from a single partition:
I have a table hash-partitioned on an indexed column “id”, which is non-unique and part of my primary key. Inside each partition, rows with the same id are located close to each other, which was done by a dbms_redefinition reorg using orderby_cols. The intention is to reduce the amount of physical reads, since there are no queries that don’t filter on the id column.
What I’m seeing, though, is a lot of physical reads. The first partition has roughly 80K rows, an average row length of 347, a block size of 8K and compression … resulting in 821 blocks. And when (after flushing the buffer cache and shared pool) I submit a query that filters on “id” only and returns 106 rows, I see roughly 1,400 physical reads.
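The shape of the setup is easy to sketch (all names, sizes and partition counts here are hypothetical – this is the pattern the poster describes, not his actual DDL):

```sql
rem
rem     Hypothetical reconstruction: hash-partitioned table,
rem     non-unique indexed column "id" that is part of the PK,
rem     rows for a given id packed together by a reorg.
rem

create table t1 (
        id      number(10)      not null,
        seq     number(4)       not null,
        padding varchar2(300),
        constraint t1_pk primary key (id, seq)
)
partition by hash (id) partitions 16;

create index t1_i1 on t1(id) local;

--      After loading, a dbms_redefinition reorg with
--      orderby_cols => 'id' rebuilds each partition with rows
--      ordered by id, so a query like the following ought to
--      visit only a handful of table blocks:

select  *
from    t1
where   id = 42;
```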
The poster has clearly applied some careful thought to minimising the work required – so what’s gone wrong ? To help us analyse the problem he posted the results from running a query through autotrace:
A few years ago (2007) I wrote about a problem that could appear when you mixed parallel execution with system managed extent allocation. A couple of years later I added a note that Christian Antognini had observed a patch in 18.104.22.168 that addressed the specific issue I had raised. Today, thanks to an email exchange with Christo Kutrovsky of Pythian, I can report that there is a variation of this issue still available even in 22.214.171.124.
The basic problem is that you can end up with a very large number of very small extents, leading to poor performance in parallel queries and a significant waste of space in a data segment. Here’s a simple, though not particularly realistic, way to demonstrate the problem.
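A sketch of the kind of demonstration I mean (the object names are made up, and the exact extent counts you see will vary with version and settings): create a table in parallel in a tablespace using system-managed (autoallocate) extent allocation, then look at the extent sizes that appear.

```sql
create tablespace demo_ts
        datafile size 1g
        extent management local autoallocate
        segment space management auto
;

--      A parallel CTAS - each PX server allocates its own extents
create table t1
        parallel 8
        tablespace demo_ts
as
select  rownum id, rpad('x',100) padding
from    dual
connect by level <= 1e6
;

--      Count extents by size: a very large number of small (64KB)
--      extents is the symptom to look for
select  bytes/1024 kb, count(*)
from    user_extents
where   segment_name = 'T1'
group by bytes/1024
order by kb
;
```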
There was a question on the OTN database forum a little while ago about the difference between the relative and absolute file numbers for a data file as reported in the view dba_data_files – where they appear as file_id and relative_fno – or v$datafile – where they appear as file# and rfile#. In response, someone resurrected a very old posting of mine from the Usenet newsgroup comp.databases.oracle.server, which says the following:
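For reference, a quick way to see the two numbering schemes side by side is simply the obvious pair of queries:

```sql
select
        f.file_id,              -- absolute file number
        f.relative_fno,         -- file number relative to the tablespace
        f.tablespace_name,
        f.file_name
from    dba_data_files f
order by
        f.file_id
;

select  file#, rfile#, name
from    v$datafile
order by file#
;
```

For small databases the two numbers are usually the same for every file, which is why the distinction so rarely gets noticed.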
A recent posting on the OTN database forum (which I didn’t answer correctly on my first attempt) raised the problem of truncating partitions when there are referential integrity constraints in place. Let’s start with a demonstration of the problem:
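The demonstration can be sketched as follows (table and constraint names are hypothetical): a partitioned parent, a child with a foreign key, and an attempt to truncate one partition of the parent – which fails with an error such as ORA-02266 while the foreign key is enabled, even when no child rows reference that partition.

```sql
create table parent (
        id      number(10)      not null,
        dt      date            not null,
        constraint par_pk primary key (id)
)
partition by range (dt) (
        partition p2015 values less than (date '2016-01-01'),
        partition p2016 values less than (date '2017-01-01')
);

create table child (
        id_p    number(10)      not null,
        padding varchar2(100),
        constraint chi_fk_par foreign key (id_p) references parent(id)
);

--      Even with no child rows at all, this raises an error
--      while the foreign key constraint is enabled:
alter table parent truncate partition p2015;
```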
Here’s another one of those little changes that sidles in when you’re not looking.
When locally managed tablespaces first appeared, there were a number of posts on Usenet (comp.databases.oracle.server) and the metalink lists (the OTN database forum didn’t exist at the time) about “missing” space in data files. The commonest sort of comment was along the lines of
“I’ve created a data file of 100 MB, how come I can only create 99 extents of 1 MB each?”
“I’ve created a data file of 10 GB, how come I can only create 9 extents of 1 GB each?”
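One way to see where the “missing” space has gone is to compare each file’s size with the free space reported for it – the gap is the file header plus the space-management bitmap blocks that locally managed tablespaces reserve at the start of the file. A sketch:

```sql
select
        f.file_id,
        f.bytes                         file_bytes,
        nvl(sum(s.bytes),0)             free_bytes,
        f.bytes - nvl(sum(s.bytes),0)   overhead_bytes
from
        dba_data_files  f,
        dba_free_space  s
where
        s.file_id (+) = f.file_id
group by
        f.file_id, f.bytes
order by
        f.file_id
;
```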
For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.
It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while reading the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment on page 429 to the effect that:
In NOARCHIVELOG mode all bulk operations (such as INSERT, APPEND, index REBUILD and ALTER TABLE MOVE) are automatically nologging.
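Claims like this are easy to check for yourself. A simple sketch (the statistic name is as it appears in v$statname; the table names are hypothetical): capture the session’s redo before and after a direct-path insert and see whether the volume of redo generated matches “nologging” behaviour.

```sql
column value new_value redo_before

select  s.value
from    v$mystat s, v$statname n
where   n.statistic# = s.statistic#
and     n.name = 'redo size'
;

insert  /*+ append */ into t1
select  * from t1_source
;

commit;

--      A large figure here means the operation was NOT nologging
select  s.value - &redo_before  redo_generated
from    v$mystat s, v$statname n
where   n.statistic# = s.statistic#
and     n.name = 'redo size'
;
```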
I was in a discussion recently about how to estimate the size of a bitmap index before you build it, and why it’s much harder to do this for bitmap indexes than it is for B-tree indexes. Here’s what I wrote in “Practical Oracle 8i”:
If you drop a unique or primary key constraint the index that supports it may be dropped at the same time – but this doesn’t always happen. Someone asked me recently if it was possible to tell whether or not an index would be dropped without having to find out the hard way by dropping the constraint. The answer is yes – after all, Oracle has to make a decision somehow, so if we can find out how it makes the decision we can predict the decision.
So here’s my best theory so far – along with the observations that led to it. First, run a trace while dropping a primary key constraint and see if this gives you any clues; on an instance running 10gR2 I noticed the following statement appearing in the trace file immediately after the delete from cdef$ (constraint definitions).
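If you want to repeat the experiment, the mechanics are straightforward (the table and constraint names are hypothetical):

```sql
alter session set events '10046 trace name context forever, level 4';

alter table t1 drop constraint t1_pk;

alter session set events '10046 trace name context off';

--      Then read the trace file (user_dump_dest in older versions,
--      v$diag_info in newer ones), looking for the recursive SQL
--      that appears immediately after the delete from cdef$
```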
I know I haven’t been very good about posting on the blog or replying to questions lately (and a big thank you to anyone who has answered some of the recent questions correctly), but tonight is a Friday night, and I have a few moments to spare, so here’s a question prompted by a recent comment on OTN.
I have a table declared as follows (and the ellipsis means repeat the column definitions according to the simplest and most obvious pattern to give you 1,000 columns all of type number(1)):
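If you want to build such a table without typing 1,000 column definitions, a little SQL to generate the DDL does the job (a sketch – the table and column names are my own):

```sql
set pagesize 0
set feedback off

spool temp.sql

select  'create table t1 (' from dual;

select
        'col' || to_char(rownum,'fm0000') || ' number(1)' ||
        case when rownum < 1000 then ',' end
from    dual
connect by level <= 1000
;

select  ')' from dual;

spool off

@temp
```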
You’ve probably heard about adaptive cursor sharing, and possibly you’ve wondered why you haven’t seen it happening very often on your production systems (assuming you’re running a modern version of Oracle). Here’s a little bug/fix that may explain the non-appearance.
MOS Doc ID 9532657.8 Adaptive cursor sharing ignores SELECTs which are not fully fetched.
This bug is confirmed in 126.96.36.199, and fixed in 188.8.131.52. The problem is that the ACS code doesn’t process the statistical information from the cursor unless the cursor reaches “end of fetch” – i.e. if you don’t select all the data in your query, Oracle doesn’t consider the statistics of that execution when deciding whether or not to re-optimise a statement.
It’s quite possible, of course, for an OLTP system, and particularly a web-based system, to execute a number of queries that allow the user to fetch data one “page” at a time and stop before fetching all the data – so this bug (or limitation, perhaps) means that some critical statements in your application may never be re-optimized. If this is the case, and you know that you have some such statements that ought to generate multiple plans, then you could add the hint /*+ bind_aware */ to the SQL.
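The hint simply goes inline in the statement – for example (a purely illustrative query):

```sql
select  /*+ bind_aware */
        t1.*
from    t1
where   t1.status = :b1
;
```

This marks the cursor as bind-aware from the first execution, rather than waiting for the adaptive cursor sharing statistics to accumulate.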
Upgrade woes: as ever, when a bug is fixed, it’s possible that a few people will suffer from unfortunate side-effects. In the case of this bug, Oracle may start to re-optimize and generate multiple child cursors for SQL statements that (from your perspective) didn’t need the extra work. If you’re very unlucky this may have an undesirable impact on execution performance and library cache activity.
Thanks to Leonid Roodnitsky for sending me a note about this bug after attending one of my tutorial days last month.
Here’s a quirky little detail that may make you think carefully about how you define and load large tables.
I have a large table which I load with data and then apply the following:
alter table t_15400 modify (id not null, small_vc not null);
Would you really expect to find Oracle doing two tablescans on the table to enable these constraints ? This is what I found in a trace file (with a lot of db file scattered read waits and other stuff in between) when I ran the test recently on 184.108.40.206:
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "SYS"."T_15400" A where( "ID" is null)
select /*+ all_rows ordered */ A.rowid, :1, :2, :3 from "SYS"."T_15400" A where( "SMALL_VC" is null)
It’s just a little difficult to come up with a good reason for this approach, rather than a single statement that validates both constraints at once.
Somewhere, I think, I’ve published a note pointing out that when you add a primary key constraint Oracle first checks that the key column (or column set) is not null – which means that adding a primary key may also result in a tablescan for every column in the index before the index is created. In that case, though, you can’t see the SQL that checks each column; you have to infer the checks from the number of tablescans and the number of rows fetched by tablescan. The trace file is rather more helpful if all you’re doing is adding the not null constraints.
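The test itself is easy to repeat – a sketch (my original table was rather larger, and the row source is made up):

```sql
create table t_15400
as
select
        rownum                  id,
        lpad(rownum,10)         small_vc,
        rpad('x',100)           padding
from    dual
connect by level <= 1e5
;

alter session set sql_trace true;

alter table t_15400 modify (id not null, small_vc not null);

alter session set sql_trace false;

--      The trace file shows one "is null" tablescan per constraint
```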
A recent post on the OTN database forum raises a problem with v$sql_shared_memory:
query to V$SQL_SHARED_MEMORY don’t return rows
please explain why ?
A follow-up posting then describes how the OP picked the view definition from v$fixed_view_definitions and used the text of that query instead of the view itself – and still got no rows returned:
SELECT * FROM
(select /*+use_nl(h,c)*/ c.inst_id, kglnaobj, kglfnobj, kglnahsh, kglobt03, kglobhd6, rtrim(substr(ksmchcom, 1, instr(ksmchcom, ':', 1, 1) - 1)), ltrim(substr(ksmchcom, -(length(ksmchcom) - (instr(ksmchcom, ':', 1, 1))), (length(ksmchcom) - (instr(ksmchcom, ':', 1, 1)) + 1))), ksmchcom, ksmchptr, ksmchsiz, ksmchcls, ksmchtyp, ksmchpar from x$kglcursor c, x$ksmhp h where ksmchds = kglobhd6 and kglhdadr != kglhdpar)
WHERE ROWNUM < 3;
The answer is quite simple – but in two parts.
When I wrote a note last week about the fixes to the subquery factoring optimizer code in 220.127.116.11, I finished with a comment about having more to say on the test case if I materialized the subquery. Today’s the day to talk about it. As a reminder, here’s the query, but with the /*+ materialize */ hint in place:
In my previous post, I made the comment:
In general, if you have a three-column index that starts with the same columns in the same order as the two-column index then the three-column index will be bigger and have a higher clustering_factor.
So what scenarios can you come up with that fall outside the general case ?
Alternatively, what argument could you put forward that justifies the general claim ?
I’ll try to respond to comments on this post a little more quickly than the last one, but I still have quite a lot of other comments to catch up on.
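If you want to experiment, the comparison itself is a simple one (the table, column and index names here are hypothetical):

```sql
create index t1_i2 on t1(col1, col2);
create index t1_i3 on t1(col1, col2, col3);

begin
        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T1',
                cascade => true
        );
end;
/

select  index_name, leaf_blocks, clustering_factor
from    user_indexes
where   table_name = 'T1'
order by index_name
;
```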
In a recent question on OTN someone asked why Oracle had put some columns into the overflow segment of an IOT when they had specified that they should be in the main index section (the “IOT_TOP”) by using the including clause.
The answer is simple and devious; there’s a little trap hidden in the including clause. It tells Oracle which columns to include, but it gets applied only after Oracle has re-arranged the column ordering (internally) to put the primary key columns first. The OP had put the last column of the primary key AFTER the dozen columns in the table that he wanted in the index section, but Oracle moved that column to the fifth position in the internal table definition, so didn’t include the desired 10 extra columns.
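A sketch of the trap (column names hypothetical): the last primary key column is declared after the columns you want kept in the index section, but Oracle moves it up before applying the including clause.

```sql
create table iot1 (
        id1     number          not null,
        v1      varchar2(10),
        v2      varchar2(10),
        v3      varchar2(10),
        id2     number          not null,       -- last PK column, declared after v1..v3
        v4      varchar2(10),
        constraint iot1_pk primary key (id1, id2)
)
organization index
including id2           -- intended to capture v1..v3 as well
overflow
;

--      Oracle rearranges the columns internally to
--      (id1, id2, v1, v2, v3, v4) before applying the including
--      clause, so only (id1, id2) stay in the index section and
--      v1..v3 end up in the overflow segment.
```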