It has taken much longer than I anticipated to get around to writing part 3 of this mini-series on what Oracle has done about histograms in 12c.
In part 1 I gave a thumbnail sketch of the three types of histogram available in 12c
In part 2 I described in some detail the improvements in performance and accuracy for the frequency and top-frequency histograms
In part 3 of this mini-series I’ll be describing how the implementation of the “hybrid” histogram that Oracle produces if the “approximate NDV” mechanism has been enabled and you’ve left the estimate_percent to auto_sample_size. There is little difference between the work needed to create a hybrid histogram and the work needed to generate the old “height-balanced” histogram, but the degree of information captured by the hybrid is much greater than that of the height-balanced.
I’ve spent so many years trying to explain that a “hint” to the Oracle optimizer is an order – if you know how to do it properly – that I finally decided to list the manual references that have made this point over the last 15 or so years. Here’s the list, which ends with a surprising change of flavour. (Emphasis in the body of the text is mine).
Here’s a little note that came about after I tweeted an idle thought on Twitter yesterday
- 12c allows you to have multiple indexes on the same columns on a table, although only one of them is allowed to be visible at any one time – you can do the same with any recent versions of Oracle “almost”, and without the invisibility requirements. (Thanks to Jason Bucata for suggesting the critical detail on this one.)
- 12c allows you to have “partial” indexing on partitioned tables – you can do the same with earlier versions of Oracle “almost” but only if the indexes are local indexes or globally partitioned.
- 12c doesn’t officially allow you to create an index that is a bitmap in the past and a btree in the present (yet) – although you can almost do this in any recent versions of Oracle.
There are a couple of posts on the blog describing problems with updateable join views or, to be more precise, join views which were key-preserved but which the optimizer did not recognize as key-preserved. Both scenarios are addressed in 12c:
Just glancing through the 12c manuals (Server Reference 12.1 June 2013 – E17615-16) to check a particular database limit, I came across the following: “Services – maximum per instance – 115″. That’s a bit of a problem, given that you can have 254 pluggable (tenant) databases in a single container database, and each plugged database gets its own service – but I’m guessing that that bit of the manual is wrong, after all it didn’t say anything about pluggable databases at all. It’s hard to keep documentation up to date as things change.
Here’s a random thought, though, loosely linked to database limits. If you’re looking ahead to a time when you have lots of tenants in a container database, you might want to start by migrating your existing databases from smallfile tablespaces to bigfile tablespaces (which may make it a good idea to run with change tracking enabled) so that the final container database doesn’t have a totally unmanageable number of database files.
Update 13th Aug 2013
Read the comments for a limit on the total number of services a container database can run.
I’ve written a few notes about anomalies in subquery factoring (with subquery) in the past, principally making a fuss about the fact that moving an inline view into a “with subquery” can cause a plan to change even when the internal code moves the subquery back in line. With the arrival of 12c one of my first sets of tests was to rerun all the examples to see how many of them had been addressed. I hadn’t written about as many examples as I had thought, and some of them had been fixed before 12c, but here are few references to a couple of outstanding items that I thought worth a mention:
[Further reading on "subquery factoring"]
I’ve posted this note as a quick way of passing on an example prompted by a twitter conversation with Timur and Maria about Bloom filters:
The Bloom filter (capital B because it’s named after a person) is not supposed to appear in Oracle plans unless the query is executing in parallel but here’s an example which seems to use a serial Bloom filter. Running in 18.104.22.168 and 22.214.171.124 (the results shown are the latter – the numbers are slightly different between versions):
In part 2 of this mini-series I’ll be describing the new mechanism for the simple frequency histogram and the logic of the Top-N frequency histogram. In part 3 I’ll be looking at the new hybrid histogram.
You need to know about the approximate NDV before you start examining the 12c implementation of the frequency and top-frequency histograms – but there’s a thumbnail sketch at the end of the posting if you need a quick reminder.
There are a few enhancements in 12c that might make a big difference to performance for a small investment in effort. One of the important enhancements comes from changes in histograms – which improve speed of collection with accuracy of results. The changes are so significant that I chose the topic as my presentation at OpenWorld last year.
A comment from Greg Rahn in response to my posting yesterday prompted me to do a quick follow-up (test time ca. 3 minutes, write-up time, ca. 50 minutes – thanks for the temptation, Greg ;). Greg asked if the “Top N” would push down for a parallel query, so all I had to do was re-run my script with a parallel hint in place. (Such is the joy of constructing test cases – when you get a new idea you may not need to do much work to test it.)
There have been a couple of nice posts about the “Top N” (or First N / Next N) syntax that has appeared in 12c, here and here, for example. I particularly like the first set of examples because they include some execution plans that give you a good idea of what’s going on under the covers. “Under the covers” is important, because if you don’t actually have a large data set to test on you might not realise what impact a “Top N” query might have on a production data set.
Starting from a comment on an old statspack/AWR page, with a near-simultaneous thread appearing on OTN, (do read both) here’s a quick summary of getting statspack onto 12c with containers. (For non-container databases it’s a standard install).
Weighing in at a massive 54 characters – the longest parameter name in 12c is:
Followed very closely by (you guessed it)
By my count there are 109 new v$ and gv$ dynamic performance views in 12c (so far) – and Glen Fawcett has posted a short note on a group that may be of particular benefit to anyone who finds they really have to delve into esoteric I/O problems from time to time. For the less exotic, there’s v$io_outliers and v$lgwrio_outliers which give details about any very slow I/Os – for the more exotic there’s v$kernel_io_outliers – which is the really fascinating one.
Here’s a short session capture focused on v$io_outliers:
Following a comment from Marcin Przepiorowski on my last post, it crossed my mind to check whether “with” functions can be deterministic – the answer seems to be “not yet”. Here’s a simple script that you can run from end to end to check current and future releases – it compares inline (with) and standalone functions when the “deterministic” keyword has been used.