Several years ago I wrote the following in “Cost Based Oracle – Fundamentals” (p.47):
The maxthr and slavethr figures relate to throughput for parallel execution slaves. I believe that the figures somehow control the maximum degree of parallelism that any given query may operate at by recording the maximum rate at which slaves have historically been able to operate—but I have not been able to verify this.
Browsing the internet recently, I discovered that that no-one else seems to have published anything to verify my comment, so I decided it was about time I did so myself. I’m going to work up to it in two blog notes , so if you do happen to know of any document that describes the impact of maxthr and slavethr on the optimizer’s costing algorithms please give me a reference in the comments – that way I might not have to write the second note.
In my last post I made a comment about how the optimizer will use the new format of the index hint to identify an index that is an exact match if it can, and any index that starts with the same columns (in the right order) if it can’t find an exact match. It’s fairly easy to demonstrate the behaviour in 11g by examining the 10053 (CBO) trace file generated by a simple, single table, query – in fact, this is probably a case that Doug Burns might want to cite as an example of how, sometimes, the 10053 is easy to interpret (in little patches):
A recent posting on OTN came up with a potentially interesting problem – it started roughly like this:
I have two queries like this:
select * from emp where dept_id=10 and emp_id=15;
select * from emp where dept_id=10 and emp_id=16;
When I run them separately I get the execution plan I want, but when I run a union of the two the plans change.
This, of course, is extremely unlikely – even if we assume that the two queries are more complex than the text shown. On the other hand you might, after a little thought, come up with the idea that perhaps the optimizer had done something really clever like join factorization (moving a join that’s common to the two parts of the UNION from inside to outside the UNION), or maybe there’s some really new trick the optimizer had played because a UNION ultimately requires a SORT UNIQUE, and the optimizer had chosen a different path that returned the data from each part of the UNION in sorted order to decrease the cost of that final sort.
In fact it turned out to be a lot simpler than that. The query looked more like this:
Actually it’s probably not the NOT IN that’s nasty, it’s the thing you get if you don’t use NOT IN that’s more likely to be nasty. Just another odd little quirk of the optimizer, which I’ll demonstrate with a simple example (running under 220.127.116.11 in this case):
I thought I’d try to spend some of today catching up on old comments – first the easier ones, then the outstanding questions on Oracle Core.
The very first one I looked at was about pushing predicates, and the specific comment prompted me to jot down this little note about the 10053 trace file (the CBO trace).
An interesting little problem appeared on the Oracle-L mailing list earlier on this week – a query ran fairly quickly when statistics hadn’t been collected on the tables, but then ran rather slowly after stats collection even though the plan hadn’t changed, and the tkprof results were there to prove the point. Here are the two outputs (edited slightly for width – the original showed three sets of row stats, the 1st, avg and max, but since the query had only been run once the three columns showed the same results in each case):
Here’s a live example demonstrating a point I’ve often made – you have to be very detailed in your hinting or Oracle will find a way to obey your hints and do the wrong thing. A recent posting on the OTN database forum gave use the following query and execution plan:
I’ve written about dynamic sampling in the past, but here’s a little wrinkle that’s easy to miss. How do you get the optimizer to work out the correct cardinality for a query like (the table creation statement follows the query):
Cost Based Oracle – Fundamentals (November 2005)
But the most interesting function for our purposes is sys_op_countchg(). Judging from its name, this function is probably counting changes, and the first input parameter is the block ID portion (object_id, relative file number, and block number) of the table’s rowid, so the function is clearly matching our notional description of how the clustering_factor is calculated. But what is that 1 we see as the second parameter?
When I first understood how the clustering_factor was defined, I soon realized that its biggest flaw was that Oracle wasn’t remembering recent history as it walked the index; it only remembered the previous table block so that it could check whether the latest row was in the same table block as last time or in a new table block. So when I saw this function, my first guess (or hope) was that the second parameter was a method of telling Oracle to remember a list of previous block visits as it walked the index.
And finally, Oracle Corp. had implemented an official interface to the second parameter of sys_op_countchg() – provided you install the right patch – through a new table (or schema, or database) preference type available to the dbms_stats.set_table_prefs() procedure.
Here’s a summary of a recent posting on OTN:
I have two indexes (REFNO, REFTYPESEQNO) and (REFNO,TMSTAMP,REFTYPESEQNO). When I run the following query the optimizer uses the second index rather than the first index – which is an exact match for the predicates, unless I hint it otherwise:
Just a quick note to say that I found a blog over the weekend with a number of interesting posts, so I thought I’d pass it on: http://www.bobbydurrettdba.com/
There’s a really cute example (complete with test case) of an optimizer bug (possibly only in 11.1) in the December archive: http://www.bobbydurrettdba.com/2012/12/04/index-causes-poor-performance-in-query-that-doesnt-use-it/
Here’s a question that is NOT a trick question, it’s demonstrating an example of optimizer behaviour that might come as a surprise.
I have an index (addr_id0050, effective_date), the first column is numeric, the second is a date. Here’s a query with an execution plan that uses that index:
A recent question on OTN asked how you could model a case where Oracle had the choice between a “perfect” index for a range scan and an index that could be used for an index skip scan and choose the latter path even though it was clearly (to the human eye) the less sensible choice. There have been a number of wierd and wonderful anomalies with the index skip scan and bad choice over the years, and this particular case is just one of many oddities I have seen in the past – so I didn’t think it would be hard to model one (in fact, I thought I already had at least two examples somewhere in my library – but I couldn’t find them).
Take a data set with two columns, call them id1 and id2, and create indexes on (id1), and (id2, id1). Generate the id1 column as a wide range of cyclic values, generate the id2 set with a small number of repetitive values so that a large number of physically adjacent rows hold the same value. The clustering_factor on the (id1) index will be very large, the clustering_factor on the (id2, id1) index will be relatively small because it will be controlled largely by the repetitive id2 value. Here’s the data set:
The Enkitec Extreme Exadata Expo (E4) event is over, but I still have plenty to say about the technology. The event was a great success, with plenty of interesting speakers and presentations. I was particularly keen to hear Frits Hoogland’s comments on Exadata and OLTP, Richard Foote on Indexes, and Maria Colgan’s comments on how Oracle is making changes to the optimizer to understand Exadata a little better.
All three presentations were interesting – but Maria’s was possiby the most important (and entertaining). In particular she told us about two patches for 18.104.22.168, one current and one that is yet to be released (unfortunately I forgot to take note of the patch numbers – ed: but they’ve been supplied by readers’ comment below).
When I arrived in Edinburgh for the UKOUG Scotland conference a little while ago Thomas Presslie, one of the organisers and chairman of the committee, asked me if I’d sign up on the “unconference” timetable to give a ten-minute talk on something. So I decided to use Hybrid Columnar Compression to make a general point about choosing and testing features. For those of you who missed this excellent conference, here’s a brief note of what I said.