Oracle Scratchpad

March 13, 2014

Shrink Space

Filed under: Bugs,deadlocks,fragmentation,Index Rebuilds,Indexing,Locks,Oracle — Jonathan Lewis @ 7:08 am BST Mar 13,2014

Here’s a lovely effect looking at v$lock (on 11.2.0.4)

select
        sid, type, id1, id2, lmode, request, ctime, block
from
        V$lock
where   sid in (
                select  sid
                from    V$session
                where   username = 'TEST_USER'
        )
order by
        sid, type desc
;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------- -- ---------- ---------- ---------- ---------- ---------- ----------
       145 TX     589833       7518          6          0        489          0
           TM      92275          0          2          6        489          1
           SK          7   25165955          6          0        489          0
           AE        100          0          4          0        582          0

       148 TX     524308       7383          6          0        490          0
           TM      92275          0          2          6        490          1
           SK          7   25173379          6          0        490          0
           AE        100          0          4          0        998          0

You’ll notice I’ve got two sessions holding a TM lock on the same table (object 92275) in mode 2 (sub-share) and waiting for an exclusive lock on the same table. Both sessions are recording the fact that they are blocking something else. You’ll have trust me when I say there are no other user sessions on the system at this point, and none of the background sessions is doing anything with that table.

The clue to what’s happening is the SK lock – it’s the “segment shrink” lock. I had two sessions start an “alter index I_n shrink space” (two different indexes on the same table) at the same time. The problem is that “shrink space” without the “compact” tries to drop the highwater mark on the index’ allocated space after completing the shrink phase – and there’s a defect in the internal code that tries to get the required exclusive lock on the underlying table: it doesn’t seem to allow for all the possible ways you can fail to get the lock. If you look at v$session_wait_history for either of these sessions, you’ll see something like the following:


  SID  SEQ#     EVENT# EVENT                            P1         P2         P3  WAIT_TIME WAIT_TIME_MICRO TIME_SINCE_LAST_WAIT_MICRO
----- ----- ---------- ------------------------ ---------- ---------- ---------- ---------- --------------- --------------------------
  145     1        798 Wait for shrink lock              0          0          0         10           99954                      70137
          2        235 enq: TM - contention     1414332422      92275          0        600         6002022                        101
          3        798 Wait for shrink lock              0          0          0         10          100723                      69335
          4        235 enq: TM - contention     1414332422      92275          0        600         6001589                         81
          5        798 Wait for shrink lock              0          0          0         10          100500                      69542
          6        235 enq: TM - contention     1414332422      92275          0        600         6002352                         86
          7        798 Wait for shrink lock              0          0          0         10          100618                      69145
          8        235 enq: TM - contention     1414332422      92275          0        600         6001545                        144
          9        798 Wait for shrink lock              0          0          0         10          100996                      69445
         10        235 enq: TM - contention     1414332422      92275          0        600         6002744                        310

The attempt to acquire the TM enqueue (lock) times out every three seconds – and I think the session then releases and re-acquires the SK lock before trying to re-acquire the TM lock – and it’s never going to get it.

The alert log, by the way, looked like this:


Wed Mar 12 12:53:27 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
Wed Mar 12 12:53:30 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
Wed Mar 12 12:53:40 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
Wed Mar 12 12:53:43 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.

I’ve said it before, and I keep repeating it when people say “Oracle resolves deadlocks automatically”: Oracle does NOT resolve deadlocks automatically – one of the sessions will rollback its last DML statement to clear the deadlock, but the other session will (almost invariably) still be waiting. It’s up to the application to do something sensible to resolve the deadlock after it receives the ORA-00060 error.

Don’t ask about the trace files – but they had both reached 400MB by the time I finished this note.

Strategy Note

If you are going to shrink objects, it’s probably best to do it in two steps: “shrink space compact, followed by “shrink space”.
If you’re going to try to use multiple sessions to shrink several indexes as quickly as possible, make sure there’s no way that two sessions can try to “shrink space” on the indexes on the same table at the same time.

Footnote:

This article was prompted by the recent appearance of bug 18388128 on MoS.

June 28, 2013

Index rebuild 12c

Filed under: 12c,Index Rebuilds,Indexing,Oracle — Jonathan Lewis @ 8:35 am BST Jun 28,2013

Just one of those little snippets about 12c that might help someone.

Further to an earlier post, online rebuild works in 12c even when the key is “too long”. The internal code has changed completely, and there is no sign of the problematic journal table that caused the problem in earlier versions.

September 4, 2012

Online Rebuild

Filed under: Index Rebuilds,Indexing,Oracle — Jonathan Lewis @ 5:46 pm BST Sep 4,2012

I’ve commented in the past about the strange stories you can find on the internet about how Oracle works and how sometimes, no matter how daft those stories seem, there might be something behind them. Here’s one such remark I came across a little while ago – published in two or three places this year:

“An index that enforces referential integrity cannot be rebuilt online.”

There are a couple of problems with this statement – first, of course, indexes don’t enforce referential integrity, though they may help to enforce uniqueness, and the so-called “foreign key” index may avoid a locking issue related to referential integrity: that’s splitting hairs a little bit, though, and we can probably guess what the author means by “indexes enforcing referential integrity”.  (An example demonstrating the problem would have been useful, though – it would have saved me from writing this note, and it might save other people from jumping to the wrong conclusion and taking unsuitable action as a consequence.)

So here’s a simple test (run under 11.2.0.3):
(more…)

May 28, 2012

Ch-ch-ch-ch-changes

Filed under: Index Rebuilds,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 5:31 pm BST May 28,2012

For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.

It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while reading the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment on page 429 to the effect that:

In NOARCHIVELOG mode all bulk operations (such as INSERT, APPEND, index REBUILD and ALTER TABLE MOVE) are automatically nologging.

(more…)

August 9, 2011

blevel=1

Filed under: CBO,Index Rebuilds,Indexing,Oracle,Statistics — Jonathan Lewis @ 8:34 pm BST Aug 9,2011

Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.

Another problem with high insert/delete rates appears with very small indexes.

If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance.

The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.

Footnote: the reason why a change from 1 to 2 is dramatic is because Oracle ignores the blevel in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. Then the cost of a nested loop access is “cost of single access multiplied by number of times you do it” – so the sudden appearance of a 2 in the formula gives an increment in cost of  “2 * number of times you visit the table” if your small table is the second table in a nested loop join – and suddenly a nested loop becomes much more expensive without a real change in the data size.

Footnote 2: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a blevel that is usually 1 it probably won’t take more than a fraction of a second to rebuild the index and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the blevel is likely to be 2.

Footnote 3: For an example of the type of code that will adjust the blevel of an index see this URL. (Note, the example talks about copying stats from one place to another – but the principle is the same.)

March 30, 2011

ASSM wreck

Filed under: ASSM,Bugs,Index Rebuilds,Indexing,Oracle,Troubleshooting — Jonathan Lewis @ 5:25 pm BST Mar 30,2011

Yesterday I introduced a little framework I use to avoid the traps inherent in writing PL/SQL loops when modelling a session that does lots of simple calls to the database. I decided to publish the framework because I had recently come across an example where a series of SQL statements gives a very different result from a single PL/SQL block.

 

(more…)

March 10, 2011

Index Rebuilds

Filed under: Index Rebuilds — Jonathan Lewis @ 6:15 pm BST Mar 10,2011

I wrote a short note last week that linked to a thread on the Russian Oracle forum about indexing, and if you’ve followed the thread you will have seen a demonstration that seemed to be proving the point that there were cases where an index rebuild would be beneficial.

Of course it’s not difficult to come up with cases where index rebuilds should make a difference – but it’s harder to come up with demonstrations that look realistic, so I thought I’d review the example to explain why it doesn’t really work as a good example of why you might need to think about rebuilding some production index.
(more…)

March 3, 2011

Index Rebuilds

Filed under: Index Rebuilds,Indexing,Infrastructure,Performance — Jonathan Lewis @ 6:43 pm BST Mar 3,2011

A couple of days ago I found several referrals coming in from a question about indexing on the Russian Oracle Forum. Reading the thread I found a pointer to a comment I’d written for the Oracle-L list server a couple of years ago about Advanced Queueing and why you might find that it was necessary to rebuild the IOTs (index organized tables) that support AQ.

The queue tables are, of course, a perfect example of what I call the “FIFO” index so it’s not a surprise that they might need special consideration. Rather than rewrite the whole note I’ll just link to it from here. (One of the notes in the rest of the Oracle-L thread also points to MOS document 271855.1 which describes the whys and hows of rebuilding AQ tables.)

December 27, 2010

Index Rebuilds

Filed under: Index Rebuilds,Indexing — Jonathan Lewis @ 9:36 am BST Dec 27,2010

There are many suggestions floating around the internet about identifying which Oracle indexes to rebuild. One of these involves running the validate (or analyze index validate) command on an index and checking the resulting figures in view index_stats to determine whether or not del_lf_rows exceeds some percentage (usually 20%) of lf_rows and rebuilding the index if so. Since this suggestion came up, yet again, in a recent OTN thread about Oracle index rebuilding I thought I’d write up a quick note highlighting one of the obvious flaws in the argument.

I’m not going to bother pointing out the threat inherent in the table-locking when you use the “validate index” command (or the “analyze index validate” equivalent) but I think it’s worth making some comments about the misunderstanding built into this policy. So let’s start by building some data, creating an index on it, then deleting 90% of the data:
(more…)

August 24, 2010

Index rebuilds

Filed under: Index Rebuilds,Indexing,Infrastructure,Performance — Jonathan Lewis @ 6:56 pm BST Aug 24,2010

A couple of years ago I wrote about a poster on the OTN db forum who was puzzled by the fact that when he started rebuilding tables they got bigger.  (Bad luck, sometimes that’s what happens !)

A few days ago a related question appeared: I rebuilt some indexes and my query got slower. (Bad luck, sometimes that’s what happens – again!)

If you rebuild an index it’s physically different and its statistics are different. Plans can change and go slower because the index stats look sufficiently different; plans can stay the same and go slower because the index is physically different. 

I’ve added a couple of comments to the thread – there may still be some further mileage in it.

[Further reading on rebuilding indexes]

July 22, 2010

Fragmentation 4

Filed under: fragmentation,Index Rebuilds,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 7:00 pm BST Jul 22,2010

This note is part four of a four-part series, and covers Index fragmentation. The whole series is as follows

  1. Introduction – with links to parts 2 – 4
  2. Disk and Tablespace Fragmentation
  3. Table Fragmentation
  4. Index Fragmentation – this bit

4. Index “fragmentation”.

The multiple extent and ASSM “fragmentation” that I described in the previous article about table fragmentation applies equally well to indexes, of course, and matters in just the same way – i.e. hardly ever.
(more…)

June 27, 2010

Coalesce

Filed under: Index Rebuilds,Indexing,Infrastructure,Oracle,Performance,redo — Jonathan Lewis @ 6:36 pm BST Jun 27,2010

The following question came up in an email conversation a little while ago:

Are you aware of any problems a large oltp site might have with running index coalesce during production hours, as opposed to doing index rebuilds in a maintenance window?

(more…)

May 30, 2010

Index Rebuilds

Filed under: Index Rebuilds,Indexing,Oracle — Jonathan Lewis @ 7:57 pm BST May 30,2010

A question came up on OTN a little while ago about an oddity during an online index rebuild. The original question describes how a relatively small index (500MB) takes several minutes to rebuild, with a few minutes where users who are trying to update the table seem to be stuck on enqueue wait events, waiting for locks on the table to be released. But Oracle tells us that “online” rebuilds of indexes don’t cause long-lasting locks to appear – is this a bug, or a new myth ?

My response to the question is speculative but, I hope, a reasonable suggestion that fits the original description. The amount of time that the locks are held at either end of the rebuild are dependent on the level (and timing) of activity on the table.

In Oracle 10g, I have managed to cause long waits for other users – even when the process doing the index rebuild took fractions of a second to get its final lock – by generating a very large volume of change as the initial stage of the index rebuild was taking place.

Note: this was not the same problem as the process doing the rebuild having to wait for the final lock because another user had uncommitted transactions on the table.

April 19, 2010

Index Rebuild ?

Filed under: Index Rebuilds,Indexing,Partitioning,Troubleshooting — Jonathan Lewis @ 7:13 pm BST Apr 19,2010

While searching on Metalink for clues about an oddity relating to an index rebuild, I came across Bug 6767655  – reported in 10.2.0.3 and fixed in 11.2.  The problem is described as follows:

When having two sessions running concurrently, one doing inserts to a partitioned table, and the other doing partition maintenance operations, there is a chance of the DML session core dumping in kauxs_do_journal(),  i.e.:

ORA-7445: exception encountered: core dump [kauxs_do_journal()+21] [SIGSEGV]  [Address not mapped to object] [0x2] [] []

DIAGNOSTIC ANALYSIS:
The problem seems to be a timing issue – or at least there is a timing component involved. When doing the partition split, an ORA-54 is normally raised, however by running the split in a loop, it will eventually succeed (and thus causing problems for the session running the DML).

(more…)

June 5, 2009

Online Rebuild

Filed under: Index Rebuilds,Indexing,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 7:44 pm BST Jun 5,2009

Here’s a little oddity that may be waiting to catch you out – but only if you like to create indexes with very long keys.

create table t1(
	v1	varchar2(4000),
	v2	varchar2(2387),
	v3	varchar2(100)
);

create index t1_i1 on t1(v1, v2);

alter index t1_i1 rebuild;
alter index t1_i1 rebuild online;

My key value is at the limit for an 8KB block size in Oracle 9i and later – which is roughly 80% of (block size – 190 bytes). In earlier versions of Oracle the limit was roughly half that (i.e. 40% rather than 80%). 

(more…)

Next Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,985 other followers