Oracle Scratchpad

December 7, 2018

Plans and Trees

Filed under: Uncategorized — Jonathan Lewis @ 5:58 pm GMT Dec 7,2018

Prompted by a question on the ODC database forum – and also because I failed to get to the “Bonus slides” on my presentation on basic execution plans at both the DOAG and UKOUG conferences, here’s a small of slides demonstrating how to convert a text execution plan into a tree that you can read using the mechanism described in Oracle’s white paper by the phrase: “start from the bottom left and work across and then up”.

The file is a Microsoft Powerpoint file (early version).

 

Misdirection

Filed under: Uncategorized — Jonathan Lewis @ 11:48 am GMT Dec 7,2018

A recent post on the ODC database forum prompted me to write a short note about a trap that catches everyone from time to time. The trap is following the obvious; and it’s a trap because it’s only previous experience that lets you decide what’s obvious and the similarity between what you’re looking and your previous experience may be purely coincidental.

The question on OTN (paraphrased) was as follows:

When I run the first query below Oracle doesn’t use the index on column AF and is slow, but when I run the second query the Oracle uses the index and it’s fast. So when the input starts with ‘\\’ the indexes are not used. What’s going on ?


SELECT * FROM T WHERE AF = '\\domain\test\1123.pdf';
SELECT * FROM T WHERE AF = 'a\\domain\test\1123.pdf';

Looking at the two queries my first thought was that it’s obvious what’s (probably) happening, and my second thought was the more interesting question: “why does this person think that the ‘\\’ is significant ?”

The cause of the difference in behaviour is probably related to the way that Oracle stores statistics (specifically histograms) about character columns, and the way in which the cardinality calculations can go wrong.  If two character match over the first few characters the numeric representation of those strings that Oracle uses in a histogram is identical, and if they are long enough even the “actual value” stored would be identical. It looks as if this person is storing URLs, and it’s quite likely that there are a lot of long URLs that start with the same (long) string of characters – it’s a very old problem – and it’s an example of a column where you probably want to be absolutely sure that you don’t gather a histogram.

But why did the OP decide that the ‘\\’ was the significant bit ? I don’t know, of course, but  how about this:

  • No contrary tests: Perhaps every single time the query misbehaved the value started with ‘\\’ and it never went wrong for any other starting values. And maybe the OP tested several different domain names – it would be much easier to see the ‘\\’ as the common denominator rather than “repetitive leading character string” if you tested with values that spanned different domains.

combined with

  • An easily available “justification”: In many programming languages (including SQL) ‘\’ is an escape character – if you don’t really know much about how the optimizer works you might believe that that could be enough to confuse the optimizer.

It can be very difficult when you spot an obvious pattern to pause long enough to consider whether you’ve identified the whole pattern, or whether you’re looking at a special case that’s going to take you in the wrong direction.

 

October 15, 2018

Faking Histograms

Filed under: Uncategorized — Jonathan Lewis @ 1:37 pm GMT Oct 15,2018

This is a short index of articles I’ve written on how to create the different types of histogram that the optimizer uses:

  • Faking a frequency histogram    How to create frequency histograms (using a numeric column for the example)
  • Histogram Tip  An example of creating a simple character-based frequency histogram (published in the IOUG Tips booklet 2014).
  • Faking a height-balanced histogram  How to create a height-balanced histogram (using a numeric column for the example).
  • Hybrid Fake: How to create a hybrid histogram (using a character column for the example).
  • Extended Histogram:  faking values into a histogram for a column group – only special because we need to derive the value stored.
  • Top frequency:  I haven’t yet worked out how to fake a Top Frequency histogram. Since it’s little more than a frequency histogram where the optimizer knows there’s a further small percentage (less than one bucketful) of other data, this doesn’t worry me; if necessary I’ll just create a “good enough” frequency histogram and set a suitable density for the remainder.

And a couple of miscellaneous things about histograms

  • Big number problem – older versions of Oracle (pre 12c) can go wrong with data values more than 15 digits long
  • Long strings problem – until 12c Oracle stored at most 32 bytes of a string in the endpoint_actual_value column.
  • Hybrid/Top-N problem – a bug, fixed in 12.2 with a patch for 12.1.
  • Upgrade threat – a step you need to take to upgrade from 11.2.0.3 if you have histograms on char() columns
  • Upgrade threat 2 (Oracle-L) – if you’ve got a big history of histograms then the upgrade from 11.2.0.3 (or earlier) could take a long time

 

Powered by WordPress.com.