Recently appeared on Mos – “Bug 18219084 : DIFFERENT EXECUTION PLAN ACROSS RAC INSTANCES”
Now, I’m not going to claim that the following applies to this particular case – but it’s perfectly reasonable to expect to see different plans for the same query on RAC, and it’s perfectly possible for the two different plans to have amazingly different performance characteristics; and in this particular case I can see an obvious reason why the two nodes could have different plans.
Here’s the query reported in the bug:
I’m afraid this is one of my bad puns again – an example of the optimizer making a real hash of the index hash join. I’m going to create a table with several indexes (some of them rather similar to each other) and execute a query that should do an index join between the obvious two indexes. To show how obvious the join should be I’m going to start with a couple of queries that show the cost of simple index fast full scans.
Here’s the data generating code:
Here’s a little oddity I came across in 126.96.36.199 a few days ago – don’t worry too much about what the query is trying to do, or why it has been written the way I’ve done it, the only point I want to make is that I’ve got the same plan from two different strategies (according to the baseline/outline/hints), but the plans have a difference in cost.
I’ve spent so many years trying to explain that a “hint” to the Oracle optimizer is an order – if you know how to do it properly – that I finally decided to list the manual references that have made this point over the last 15 or so years. Here’s the list, which ends with a surprising change of flavour. (Emphasis in the body of the text is mine).
I’ve written notes about the different join mechanisms in the past – but such things are always worth revisiting, so here’s an accumulated bundle of comments about hash joins.
A hash join takes two inputs that (in most of the Oracle literature) are referred to as the “build table” and the “probe table”. These rowsources may be extracts from real tables or indexes, or might be result sets from previous joins. Oracle uses the “build table” to build a hash table in memory, consuming and using the rowsource in a single call; it then consumes the “probe table” one row at a time, probing the in-memory hash table to find a match.
Access to the hash table is made efficient by use of a hashing function applied to the join columns – rows with the same value on the join column end up hashing to the same place in the hash table. It is possible for different input values to produce the same hash value (a hash collision) so Oracle still has to check the actual values once it has identified “probable” joins in the hash table. Because the comparison is based on a hashing mechanism, hash joins can only be used for join predicates that are equality predicates.
In my last post I made a comment about how the optimizer will use the new format of the index hint to identify an index that is an exact match if it can, and any index that starts with the same columns (in the right order) if it can’t find an exact match. It’s fairly easy to demonstrate the behaviour in 11g by examining the 10053 (CBO) trace file generated by a simple, single table, query – in fact, this is probably a case that Doug Burns might want to cite as an example of how, sometimes, the 10053 is easy to interpret (in little patches):
I’ll probably have to file this one under “Optimizer ignoring hints” – except that it should also go under “bugs”, and that’s one of the get-out clauses I use in my “hints are not hints” argument.
Sometimes an invisible index isn’t completely invisible.
A recent posting on OTN came up with a potentially interesting problem – it started roughly like this:
I have two queries like this:
select * from emp where dept_id=10 and emp_id=15;
select * from emp where dept_id=10 and emp_id=16;
When I run them separately I get the execution plan I want, but when I run a union of the two the plans change.
This, of course, is extremely unlikely – even if we assume that the two queries are more complex than the text shown. On the other hand you might, after a little thought, come up with the idea that perhaps the optimizer had done something really clever like join factorization (moving a join that’s common to the two parts of the UNION from inside to outside the UNION), or maybe there’s some really new trick the optimizer had played because a UNION ultimately requires a SORT UNIQUE, and the optimizer had chosen a different path that returned the data from each part of the UNION in sorted order to decrease the cost of that final sort.
In fact it turned out to be a lot simpler than that. The query looked more like this:
Here’s a live example demonstrating a point I’ve often made – you have to be very detailed in your hinting or Oracle will find a way to obey your hints and do the wrong thing. A recent posting on the OTN database forum gave use the following query and execution plan:
I’ve written about dynamic sampling in the past, but here’s a little wrinkle that’s easy to miss. How do you get the optimizer to work out the correct cardinality for a query like (the table creation statement follows the query):
That’s SQL Tuning Sets, optimizer_features_enable, and SQL Plan Management.
There’s a recent post on OTN describing an issue when using SQL Tuning Sets to enforce plan stability when upgrading from 10.2.0.3 to 188.8.131.52 – it doesn’t always work. Here’s a very simple model to demonstrate the type of thing that can happen (the tables are cloned from a completely different demo, so don’t ask why I picked the data they hold):
I have a small collection of postings where I’ve described anomalies or limitations in subquery factoring (the “with subquery”, or Common Table Expression (CTE) to give it the official ANSI name). Here’s another example of Oracle’s code not behaving consistently. You may recognise the basic query from yesterday’s example of logical tuning – so I won’t reprint the code to generate the data sets. This examples in this note were created on 184.108.40.206 – we start with a simple query and its execution plan:
Here’s a model of a problem I solved quite recently at a client site. The client’s query was much more complex and the volume of data much larger, but this tiny, two table, example is sufficient to demonstrate the key principle. (Originally I thought I’d have to use three tables to model the problem, which is why you may find my choice of table names a little odd). I ran this example on 220.127.116.11 – which was the client version:
When I wrote a note last week about the fixes to the subquery factoring optimizer code in 18.104.22.168, I finished with a comment about having more to say on the test case if I materialized the subquery. Today’s the day to talk about it. As a reminder, here’s the query, but with the /*+ materialize */ hint in place:
Browsing a little history recently I came across a note I’d written about the new-style index hint. In that note I claimed that:
… the index has to start with the columns (product_group, id) in that order – with preference given to an exact match, otherwise using the lowest cost index that starts the right way.
On reading this statement I suddenly realised that I hadn’t actually proved (to myself, even) that if I had the indexes (product_group, id) and (product_group, id, other_col) then a two-column hint forced Oracle to use the two column index in all (legal) circumstances.
So, tonight’s quiz – are there any edge cases, and what easy ways can you think of to prove (or disprove) the claim for the general case.
Footnote: you don’t have to demonstrate the method, just a brief outline of the idea will be sufficient.