Oracle Scratchpad

March 26, 2014

Diagnostics

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 2:13 pm GMT Mar 26,2014

Here’s a little test you might want to try. Examine the following script, and decide what sort of symptoms you would see in the AWR report.


create global temporary table gtt1(n1 number);

execute dbms_workload_repository.create_snapshot;

insert into gtt1 values(1);
truncate table gtt1;

-- repeat insert/truncate for a total of 100 cycles

execute dbms_workload_repository.create_snapshot;

-- generate an AWR report across the interval.

I don’t need anyone to tell me their results – but if your predictions and the actual results match then you can give yourself a pat on the head.
You might also like to enable SQL trace for all the inserts/truncate to see if that shows you anything interesting.

This is one of the simpler scripts of the 3,500 I have on my laptop that help me interpret the symptoms I see in client systems.

March 3, 2014

Flashback Fail ?

Filed under: Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 4:19 pm GMT Mar 3,2014

Sitting in an airport, waiting for a plane, I decided to read a note (pdf) about Flashback data archive written by Beat Ramseier from Trivadis.  I’d got about three quarters of the way through it when I paused for thought and figured out that on the typical database implementation something nasty is going to happen after approximately 3 years and 9 months.  Can you guess why ?

It’s all about smon_scn_time – which normally records one row every five minutes (created by smon) with a continuous cycle of 24 hours – typically giving you about 1,440 rows in the table. The table is in a cluster, and the cluster key is the instance (thread) number. Clearly this was originally a clever idea from someone who realised that a cluster key of thread number would be beneficial if you had a RAC system with multiple instances – each instance gets its own blocks and the data for any one instance is as well clustered as possible.

The trouble is, when you enable flashback data archive smon no longer sticks to a 24 hour cycle, it just keeps adding rows. Now on my 8KB block tablespace I see 6 rows per block in the table/cluster – which means I get through 48 blocks per days,  17,520 blocks per year, and in 3 years and 9 months I’ll get to roughly 65,700 blocks – and that’s the problem.  An index entry in a cluster index points to a chain of cluster blocks, and the last two bytes of the “rowid” in the index entry identify which block within the chain the cluster key scan should start at – and two bytes means you can only have 65,536 blocks for a single cluster key.

I don’t know what’s going to happen when smon tries to insert a row into the 65,535th (-ish) block for the current thread – but it ought to raise an Oracle error, and then you’ll probably have to take emergency action to make sure that the flashback mechanisms carry on running.

Although oraus.msg indicates that it’s an error message about hash clusters it’s possible that the first sight will be: Oracle error: “ORA-02475 maximum cluster chain block count of %s has been exceeded”. If you’re using a 16KB block size then you’ve got roughly 7.5 years, and 32KB block sizes give you about 15 years (not that that’s a good argument for selecting larger block sizes, of course.)

Footnote:

Searching MoS for related topics (smon_scn_time flashback) I found doc ID: 1100993.1 from which we could possibly infer that the 1,440 rows was a fixed limit in 10g but that the number of rows allowed in smon_scn_time could increase in 11g if you enable automatic undo management. I also found a couple of bugs relating to index or cluster corruption – fixed by 11.2.0.4, though.

 

 

February 10, 2014

Row Migration

Filed under: Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 6:55 pm GMT Feb 10,2014

At one of the presentations I attended at RMOUG this year the presenter claimed that if a row kept increasing in size and had to migrate from block to block as a consequence then each migration of that row would leave a pointer in the previous block so that an indexed access to the row would start at the original table block and have to follow an ever growing chain of pointers to reach the data.

This is not correct, and it’s worth making a little fuss about the error since it’s the sort of thing that can easily become an urban legend that results in people rebuilding tables “for performance” when they don’t need to.

Oracle behaves quite intelligently with migrated rows. First, the migrated row has a pointer back to the original location and if the row has to migrate a second time the first place that Oracle checks for space is the original block, so the row might “de-migrate” itself; however, even if it can’t migrate back to the original block, it will still revisit the original block to change the pointer in that block to refer to the block it has moved on to – so the row is never more than one step away from its original location. As a quick demonstration, here’s some code to generate and manipulate some data:

(more…)

RAC Plans

Filed under: Execution plans,Hints,Oracle,RAC,Troubleshooting — Jonathan Lewis @ 1:12 pm GMT Feb 10,2014

Recently appeared on Mos – “Bug 18219084 : DIFFERENT EXECUTION PLAN ACROSS RAC INSTANCES”

Now, I’m not going to claim that the following applies to this particular case – but it’s perfectly reasonable to expect to see different plans for the same query on RAC, and it’s perfectly possible for the two different plans to have amazingly different performance characteristics; and in this particular case I can see an obvious reason why the two nodes could have different plans.

Here’s the query reported in the bug:

(more…)

February 5, 2014

Minus

Filed under: Execution plans,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 5:42 pm GMT Feb 5,2014

Here’s a little script to demonstrate an observation about a missed opportunity for avoiding work that appeared in my email this morning (that’s morning Denver time):

(more…)

February 2, 2014

Clustaghhh!

Filed under: clusters,Infrastructure,Oracle,Troubleshooting — Jonathan Lewis @ 10:27 pm GMT Feb 2,2014

It doesn’t matter which bit of Oracle technology you want to use, eventually someone, somewhere, runs into the special case where something nasty happens. Here’s an edge case for people using (index) clusters – Oracle Bug  17866999 ora-1499 for cluster following rman convert

It comes from a conversation on Oracle-L where Jack van Zanen reported a problem of inconsistent results after migrating data between platforms using rman to converts some tablespaces containing index clusters. This is the starting post where he shows a query that is clearly getting the wrong answer (select where channel_number = 503 obviously shouldn’t return data with channel_number 501).

(more…)

January 26, 2014

Pagination

Filed under: Bugs,Indexing,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 12:08 pm GMT Jan 26,2014

I was involved in a thread on Oracle-L recently started with the question: “How many LIOs is too many LIOs”. Rather than rewrite the whole story, I’ve supplied a list of links to the contributions I made, in order – the final “answer” is actually the answer to a different question – but travels an interesting path to get there.#

I’ve got a script to emulate the requirement so that people can see for themselves the bug that I mention in post 15; I’ll try to add a couple of notes to it and publish it some time, but for the moment I’ll just remind myself that it’s called (slightly counter-intuitively: no_sort_problem.sql)

January 6, 2014

LOB changes

Filed under: Infrastructure,LOBs,Oracle,Troubleshooting — Jonathan Lewis @ 7:10 pm GMT Jan 6,2014

It’s always useful to collect baseline information – especially when it helps you notice that the baseline has moved in a way that might explain the next performance problem you see. Here’s an example demonstrating the benefit.

I have a table with a LOB column that is stored out of line. Many years ago I decided that I wanted to compare how the redo generation varied as I change the LOB from cached to nocache (with nologging). So here was one of my simplest test scripts (stripped to a minimum):

(more…)

December 31, 2013

Troubleshooting

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 5:30 pm GMT Dec 31,2013

If it’s a long night, and you’ve got nothing better to do waiting for the new year, then here’s some reading to keep you entertained. A collation of Tanel Poder’s trouble-shooting blog postings, dating from June 2007 to September 2013:

  1. When the wait interface is not enough
  2. No magic is needed, systematic approach will do
  3. More adventures in the process stack
  4. Diagnosing a long parsing issue
  5. Sampling v$ stuff with waitprof really fast using SQL
  6. Understanding execution plans with OS explain
  7. Sampling latch holder statistics using latchprof
  8. Even more details latch troubleshooting using latchprofx
  9. Process stack profiling from SQL Plus using ostackprof
  10. Index unique scan doing multiblock reads
  11. Complex wait chain signature analysis with ash_wait_chains sql

December 23, 2013

Random slowdown

Filed under: Oracle,Troubleshooting — Jonathan Lewis @ 6:48 pm GMT Dec 23,2013

One of the old, old questions came up on OTN a few days ago:  “What reasons are there for a piece of (batch) SQL to run very slowly occasionally when it usually runs very quickly ?” Then a similar one appeared a couple of days later. There are plenty of reasons, so I thought I’d start listing a few and see how many other (realistic) reasons other people wanted to add.

(more…)

December 11, 2013

Null Quiz

Filed under: Oracle,Performance,Troubleshooting,Tuning — Jonathan Lewis @ 6:42 pm GMT Dec 11,2013

Here’s an example I saw a few months ago of the confusion caused by NULL. As the owner of the problem put it: the first query, run from SQL*Plus for testing purposes, takes no time to complete; but when “put into a pl/sql cursor” (as shown in the second query) it takes ages to complete.

(more…)

August 26, 2013

Index Sorting

Filed under: latches,Oracle,Troubleshooting — Jonathan Lewis @ 7:48 am GMT Aug 26,2013

From time to time I’ve mentioned the fact that the optimizer will sort indexes alphabetically by name as the last tie-breaker when two plans have the same cost. Thanks to an email that arrived a couple of days ago I’ve been alerted to event 10089 (which has been around since at least 8.1.7.4) with the description: “CBO Disable index sorting”.

(more…)

July 12, 2013

Wrong Index 2

Filed under: CBO,Oracle,trace files,Troubleshooting — Jonathan Lewis @ 5:17 pm GMT Jul 12,2013

A couple of days ago I wrote an article about Oracle picking the “wrong index” after an index rebuild, and I mentioned that the sample data I had generated looked a little odd because it came from a script I had been using to investigate a completely different problem. This note describes that other problem, which appeared on the Oracle-L mailing list last month.

Stripped to a bare minimum, here’s the issue: we have a simple query against a single table with two indexes IDX2_AUFTRAG(arsAuftragsNr, dategAuftragsNr) and IDX7_AUFTRAG(arsAuftragsNr), and a predicate “arsAuftragsNr = {constant}”. Since the second column in the two-column index is irrelevant (we can’t use it to avoid visiting the table, and it’s not part of a group by or order by clause), and since adding a column to an index is likely to increase the clustering_factor and leaf block count of the index, we would probably expect to see Oracle pick the single column index as the path to the table – but it doesn’t, it picks the two-column index.

(more…)

July 2, 2013

12c I/O debug

Filed under: 12c,Oracle,Troubleshooting — Jonathan Lewis @ 6:05 pm GMT Jul 2,2013

By my count there are 109 new v$ and gv$ dynamic performance views in 12c (so far) – and Glen Fawcett has posted a short note on a group that may be of particular benefit to anyone who finds they really have to delve into esoteric I/O problems from time to time. For the less exotic, there’s v$io_outliers and v$lgwrio_outliers which give details about any very slow I/Os – for the more exotic there’s v$kernel_io_outliers – which is the really fascinating one.

Here’s a short session capture focused on v$io_outliers:

(more…)

July 1, 2013

12c Determinism

Filed under: 12c,Oracle,Troubleshooting — Jonathan Lewis @ 4:55 pm GMT Jul 1,2013

Following a comment from Marcin Przepiorowski on my last post, it crossed my mind to check whether “with” functions can be deterministic – the answer seems to be “not yet”. Here’s a simple script that you can run from end to end to check current and future releases – it compares inline (with) and standalone functions when the “deterministic” keyword has been used.

(more…)

« Previous PageNext Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,523 other followers