In part 2 of this mini-series I’ll be describing the new mechanism for the simple frequency histogram and the logic of the Top-N frequency histogram. In part 3 I’ll be looking at the new hybrid histogram.
You need to know about the approximate NDV before you start examining the 12c implementation of the frequency and top-frequency histograms – but there’s a thumbnail sketch at the end of the posting if you need a quick reminder.
There are a few enhancements in 12c that might make a big difference to performance for a small investment in effort. One of the most important comes from changes to histograms, which improve the speed of collection without sacrificing accuracy of results. The changes are so significant that I chose the topic as my presentation at OpenWorld last year.
A comment from Greg Rahn in response to my posting yesterday prompted me to do a quick follow-up (test time ca. 3 minutes, write-up time ca. 50 minutes – thanks for the temptation, Greg ;). Greg asked if the “Top N” would push down for a parallel query, so all I had to do was re-run my script with a parallel hint in place. (Such is the joy of constructing test cases – when you get a new idea you may not need to do much work to test it.)
There have been a couple of nice posts about the “Top N” (or First N / Next N) syntax that has appeared in 12c, here and here, for example. I particularly like the first set of examples because they include some execution plans that give you a good idea of what’s going on under the covers. “Under the covers” is important, because if you don’t actually have a large data set to test on you might not realise what impact a “Top N” query might have on a production data set.
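As a quick reminder of the syntax (against a hypothetical table t1 – the table and column names here are mine, not from the posts I’ve linked to), the new 12c row-limiting clause looks like this:

```sql
-- "Top N": the first 10 rows by id (t1 and id are hypothetical names)
select  *
from    t1
order by
        id
fetch first 10 rows only;

-- "Next N" paging: skip the first 20 rows, return the following 10
select  *
from    t1
order by
        id
offset 20 rows fetch next 10 rows only;
```

Under the covers Oracle transforms this into a query using the row_number() analytic function – which is why, on a large data set, you may see a substantial sort (or window sort) appear in the execution plan.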
Starting from a comment on an old statspack/AWR page, with a near-simultaneous thread appearing on OTN, (do read both) here’s a quick summary of getting statspack onto 12c with containers. (For non-container databases it’s a standard install).
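As a sketch of the key step (the PDB name here is an assumption, and the details are in the linked thread): the install script is still the standard one in $ORACLE_HOME/rdbms/admin, but you have to make sure your session is pointing at the right container before you run it:

```sql
-- sketch only: install statspack inside a pluggable database
-- (pdb1 is a hypothetical container name)
alter session set container = pdb1;
@?/rdbms/admin/spcreate
```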
Weighing in at a massive 54 characters – the longest parameter name in 12c is:
Followed very closely by (you guessed it)
By my count there are 109 new v$ and gv$ dynamic performance views in 12c (so far) – and Glen Fawcett has posted a short note on a group that may be of particular benefit to anyone who finds they really have to delve into esoteric I/O problems from time to time. For the less exotic there are v$io_outlier and v$lgwrio_outlier, which give details about any very slow I/Os; for the more exotic there’s v$kernel_io_outlier – which is the really fascinating one.
Here’s a short session capture focused on v$io_outliers:
SQL> select function_name, wait_event, count(*) from v$io_outlier group by function_name, wait_event;
FUNCTION_NAME      WAIT_EVENT                                                       COUNT(*)
------------------ ---------------------------------------------------------------- ----------
Direct Writes                                                                                 1
Buffer Cache Reads db file parallel read                                                      1
Others             control file parallel write                                               32
Buffer Cache Reads                                                                           10

6 rows selected.
SQL> select io_size, io_latency from v$io_outlier
2 where wait_event = 'control file parallel write'
3 order by io_latency;
32 rows selected.
SQL> select * from v$event_histogram where event = 'control file parallel write';
WAIT_TIME_MILLI WAIT_COUNT LAST_UPDATE_TIME                        CON_ID
--------------- ---------- ----------------------------------- ----------
              1       6903 02-JUL-13 08.34.51.556059 AM +01:00          0
              2      67990 02-JUL-13 08.36.06.643280 AM +01:00          0
              4       5839 02-JUL-13 08.36.00.635787 AM +01:00          0
              8        909 02-JUL-13 08.32.17.999903 AM +01:00          0
             16        404 02-JUL-13 08.26.17.019774 AM +01:00          0
             32        375 02-JUL-13 08.20.16.049842 AM +01:00          0
             64        316 02-JUL-13 08.25.35.997658 AM +01:00          0
            128        154 02-JUL-13 07.50.17.522235 AM +01:00          0
            256         43 02-JUL-13 07.50.17.268394 AM +01:00          0
            512         37 02-JUL-13 07.50.21.405903 AM +01:00          0
           1024         31 01-JUL-13 01.46.43.419938 PM +01:00          0
           2048          3 02-JUL-13 07.50.14.716905 AM +01:00          0
           4096          2 01-JUL-13 06.47.38.055541 AM +01:00          0
           8192          5 02-JUL-13 06.12.18.501807 AM +01:00          0
          16384          2 01-JUL-13 01.09.21.839019 PM +01:00          0
          32768          1 30-JUN-13 02.39.07.008164 AM +01:00          0
          65536          0                                              0
         131072          0                                              0
         262144          0                                              0
         524288          0                                              0
        1048576          0                                              0
        2097152          0                                              0
        4194304          1 01-JUL-13 22.214.171.1240514 AM +01:00       0

23 rows selected.
Unfortunately, although the figures are close, a quick check shows that the v$event_histogram ranges ending at 4096, 8192, 16384 and 32768 don’t quite match the figures given in v$io_outlier – there’s a little slippage across the boundaries.
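If you want to do the comparison yourself, a sketch of the cross-check might look like this – bucketing the outlier latencies into the same power-of-two ranges that v$event_histogram uses:

```sql
-- sketch: group v$io_outlier latencies into power-of-two buckets to
-- compare against the wait_time_milli ranges of v$event_histogram
select
        power(2, ceil(log(2, io_latency)))      wait_time_milli,
        count(*)                                wait_count
from    v$io_outlier
where   wait_event = 'control file parallel write'
and     io_latency > 0          -- defensive: avoid log(2, 0)
group by
        power(2, ceil(log(2, io_latency)))
order by
        wait_time_milli;
```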
Footnote: looking at the gv$ definitions – v$lgwrio_outlier and v$io_outlier differ only in whether they include or exclude “io_component_id” number 2 (the lgwr component).
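You can check the definitions for yourself through v$fixed_view_definition:

```sql
-- compare the two view definitions to spot the io_component_id filter
select  view_name, view_definition
from    v$fixed_view_definition
where   view_name in ('GV$IO_OUTLIER', 'GV$LGWRIO_OUTLIER');
```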
Following a comment from Marcin Przepiorowski on my last post, it crossed my mind to check whether “with” functions can be deterministic – the answer seems to be “not yet”. Here’s a simple script that you can run from end to end to check current and future releases – it compares inline (with) and standalone functions when the “deterministic” keyword has been used.
create table t1 as select 1 n from dual connect by rownum <= 10;

create or replace function standalone_function(i_in number) return number
deterministic is
begin dbms_output.put_line('Standalone function called'); return i_in; end;
/

set serveroutput on
prompt arraysize 20
set arraysize 20

with function inline_function(i_in number) return number
deterministic is
begin dbms_output.put_line('Inline function called'); return i_in; end;
select standalone_function(n), inline_function(n) from t1
/
The code is very simple – create a table with 10 rows of one column, every row holding the same value; create a standalone function, declared as deterministic, that simply outputs a message and returns its input value. If determinism is actually in play then, within a single database call, Oracle will call the function just once per input value rather than once per reference. The query then defines an inline (with) function, also declared as deterministic, and selects from the table using both the standalone and the inline function. Here’s the output:
10 rows selected.
As you can see, the inline function has been called 10 times while the standalone function has been called twice. (The two calls to the standalone function are a feature of the way that SQL*Plus fetches the first row from a query and then starts using the arraysize for subsequent fetches – each fetch call is a new database call, which requires a new call to the deterministic function. Try running the code with the arraysize set to 2 – or 1, which SQL*Plus will silently change to 2.)
The intent of inline (with) functions is to make PL/SQL function calls within SQL more efficient – but before you change your code to take advantage of the feature, make sure that you are not going to lose the benefit of determinism.
Here’s a quirky little thing I discovered about 5 minutes after installing 12c Beta 1 (cut-n-pasted from SQL*Plus):
create or replace view v$my_stats as
select sn.name, ms.value
from v$statname sn, v$mystat ms
where sn.statistic# = ms.statistic#;
create or replace view v$my_stats
ERROR at line 1:
ORA-00999: invalid view name
You can’t create views with names that start with V$ or GV$ in the sys schema – presumably to eliminate the risk of someone’s clever view definition accidentally overwriting and disabling one of the distributed dynamic performance views.
Now that 12c is out, here’s an idea that might save you some time even if you have no intention of migrating to, or even testing, the product for a couple of years. Download the “List of bugs fixed in 12c”: you may find that it’s the best starting point when you’re trying to solve a problem in your current version of Oracle.
A slightly more sophisticated version of the same thing – download and install the product, then take a dump of v$system_fix_control. That may also give you some insight into anomalies (not necessarily declared as bugs) in the way Oracle – and the optimizer in particular – behaves. I updated the referenced note to add a couple of figures for 12.1 – but one figure that’s not there is the number of database parameters: now at 368 in the v$ and 3,333 in the x$ (in my Beta 3 release).
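A sketch of the sort of dump I mean – spool the contents to a file you can search when you hit an optimizer anomaly (the spool file name is, of course, arbitrary):

```sql
-- sketch: capture the fix-control list for future reference
spool fix_control_12c.lst

select  bugno, value, optimizer_feature_enable, description
from    v$system_fix_control
order by
        bugno;

spool off
```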
Just one of those little snippets about 12c that might help someone.
Further to an earlier post, online rebuild works in 12c even when the key is “too long”. The internal code has changed completely, and there is no sign of the problematic journal table that caused the problem in earlier versions.
The news is out that 12c is now available for download (Code, Docs and Tutorials). There are plenty of nice little bits in it, and several important additions or enhancements to the optimizer, but there’s one feature that might prove to be very popular:
SQL> alter table p1 move partition solo ONLINE;
For anyone looking for information on 12c, there are several posts about OpenWorld at the dbi Services blog (see links at right). In particular there’s a summary post about the “pluggable database” describing how you could plug a database into a “container” database, clone it inside the container database, then unplug it and put it somewhere else.
When I heard about the feature, it crossed my mind that there were two “obvious” targets for this technology that Oracle had in mind – in no particular order:
- Consolidation onto Exadata – I’ve seen an Oracle presentation about a customer who moved 18 databases from 14 servers onto 2 Exadata quarter racks; that’s a lot of processes that have to be running per rack simply to keep the many instances idling. If you plugged all the databases into a single instance you should get no application interference between databases and a minimal set of background processes.
- Applications as a Service (or should that be Software as a Service – SaaS): if a 3rd party wants to run your Peoplesoft system for you, they would probably prefer to run one database with multiple Peoplesoft databases plugged into it.
Currently “real” consolidation means lots of work to change multiple databases into multiple schemas in a single database, and worrying about public synonyms; running multiple copies of the same application in the same database is the most extreme example of the problem that pluggable databases bypass. Just think how nice it would be, as a service provider, to keep a single “empty” Peoplesoft pluggable database in your container database which you clone whenever you sign up a new customer. And, as a customer, if you want to change your service provider, perhaps you could insist that your supplier unplugs your Peoplesoft database so that you can plug it in at your new service provider.
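In 12c syntax the “clone on sign-up” step might look something like this – all the names and file paths here are hypothetical, purely for illustration:

```sql
-- sketch: clone a template PDB for a new customer, then open it
-- (psft_template, cust0042 and the paths are assumptions; in 12.1
-- the source PDB has to be open read-only while it is cloned)
create pluggable database cust0042 from psft_template
        file_name_convert = ('/oradata/psft_template/', '/oradata/cust0042/');

alter pluggable database cust0042 open;
```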
Another little detail that Hermann Baer mentioned in his presentation yesterday was the ability to create multiple indexes with the same column definition – something which currently gets you Oracle error “ORA-01408: such column list already indexed.”
No details, and there’s always the “safe harbour” slide, of course – the one which says something about the presentation being only an indication of current thinking, with nothing guaranteed to appear.
Having said that, this looks like an interesting option for those (possibly rare) occasions when you want to change a unique index into a non-unique index (for example, to change a unique constraint to deferrable). Rather than having to drop the index and create a new one – leaving the table unindexed while the new index builds – you appear to have the option to “create new index online” then “drop old index”. Moving a primary key constraint from one index to the other might not be so easy, of course, especially if there are foreign keys in place – but this certainly looks like a helpful step.
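A sketch of how the steps might go (t1, id, t1_uk and t1_i_new are all hypothetical names), remembering that a duplicate index on the same column list has to be created invisible:

```sql
-- build the replacement non-unique index without blocking DML
create index t1_i_new on t1(id) online invisible;

-- drop the old constraint and its unique index, expose the new index;
-- note the brief window in which uniqueness is not enforced
alter table t1 drop constraint t1_uk drop index;
alter index t1_i_new visible;

-- re-create the constraint as deferrable on the non-unique index
alter table t1 add constraint t1_uk unique (id)
        deferrable using index t1_i_new;
```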
Details to follow when 12c becomes available.
Update Sept 2013
Although you can create multiple indexes with the same column definition, only one of them can be visible at any time – so this should remove the temptation that Richard describes in his comment below. It won’t stop people creating “duplicates”, though, and leaving some of them invisible for a while just in case they need to change their minds. Always keep firm control of your indexing.
Most useful presentation of OOW so far, from Hermann Baer of Oracle on improvements in partitioning features in 12c – and there are lots of useful ones, including:
Online move of a partition – so easy to compress a partition when it has reached its final “read-only” state
multiple partition maintenance in one operation – e.g. merge 3 monthly partitions into one quarterly partition, or split one partition into 24 (think about “how do I partition a non-partitioned table”, and 12c has just made it easier and quicker – exchange it into an empty partitioned table, then do one massive split).
partial indexing – define which partitions of a partitioned table should be included in the indexes you create on the table – and the optimizer also knows that different partitions need different plans (an enhancement of “table expansion”).
interval-reference partitioning – you can create an interval-partitioned table and use ref-partitioning to create child (and further descendant) tables; their partitions will automatically be generated, truncated, and dropped as the parent’s are extended, truncated or dropped (this needs enabled foreign key constraints).
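A few of those features rendered as (hypothetical) syntax – the table, partition and index names are mine, not from the presentation:

```sql
-- online move of a read-only partition, compressing as you go
alter table sales move partition p_201201
        compress for query high online;

-- merge three monthly partitions into one quarterly partition
-- in a single operation
alter table sales merge partitions p_201201, p_201202, p_201203
        into partition p_2012_q1;

-- partial indexing: switch indexing off for a partition, then
-- create a local index that covers only the "indexing on" partitions
alter table sales modify partition p_2012_q1 indexing off;
create index sal_cust_i on sales(customer_id) local indexing partial;
```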
Lots more details – and lots of stress-testing to be done – but I’m off to hear “the optimizer lady” talk about hints.
Oracle has a “safe harbour” slide at the start of all presentations about future developments pointing out that the information presented is an indication of direction, but not guaranteed to make it into production.