Here’s another of my “draft” notes that needs some exapansion and, most importantly, proof.
I have a fact table with a status id column that shows a massive skew. But I also have a dimension table that holds the “status code” so (in theory, at least) I have to do a join from the statuses table to the facts table to find rows of a given status. Unfortunately the join hides the skew:
Here’s an offering in my “drafts for someone else to finish” suite of postings; it’s one I’ve been meaning to retest and publish for quite some time. It’s a possible answer to a question that comes up occasionally on the OTN database forum: “How do I resynchronize two tables that are supposed to be copies of each other?”
It’s finally time to take a close look at the parallel versions of the execution plan I produced a little while ago for a four-table hash join. In this note I’ll examine the broadcast parallel distribution. First, here’s a list of the hints I’m going to use to get the effect I want:
leading(t4 t1 t2 t3)
full(t4) parallel(t4, 2)
use_hash(t1) swap_join_inputs(t1) pq_distribute(t1 none broadcast)
full(t1) parallel(t1, 2)
use_hash(t2) swap_join_inputs(t2) pq_distribute(t2 none broadcast)
full(t2) parallel(t2, 2)
use_hash(t3) swap_join_inputs(t3) pq_distribute(t3 none broadcast)
full(t3) parallel(t3, 2)
Sometimes you get some questions on OTN lead to very geeky investigations. Here’s one that came up a while ago that started with a reasonable observation about recursive subquery factoring, then devolved into a real geek-attack question about buffer headers (x$bh) and buffer pins (x$kccbf).
I contributed a couple of ideas and some basic SQL to the discussion but never got around to doing anything concrete. If anyone has time and is sufficiently curious to play around I’d be interested to see what you did and what conclusions you came to.
When you upgrade you often find that some little detail (of the optimizer) that didn’t receive a lot of attention in the “New Features” manuals introduces a few dramatic changes in execution plans. Here’s one example of a detail that is likely to catch a few unlucky people. We start with a very simple table which is just and id column with some padding, and then show the effect of a change in the handling of “constant subqueries”. Here’s my data set:
I have, in the past, used the dbms_rowid package to create rowids from block addresses (typically faking the first and last rowids that could appear in an extent); but I’ve just been sent a piece of information by Valentin Nikotin that’s going to make me go back and check whether what I’ve done with the package will always give me the correct results. Here’s a little demonstration code that highlights the issue:
A recent question on the OTN database forum included an execution plan that prompted one reader to ask: “but where has the existence subquery gone?” Here’s the original question showing the query, and here’s the later response showing the plan that prompted the question.
There were three possible reasons why that question may have been posed:
Oracle 12c has increased the maximum length of character-based columns to 32K bytes – don’t get too excited, they’re stored out of lines (so similar in cost to LOBs) and need some modification to the parameter file and data dictionary (starting the database in upgrade mode) before you can use them.
Richard Foote has a pair of articles on indexing such columns:
Be cautious about enabling this option and test carefully – there are going to be a number of side effects, and some of them may require a significant investment in time to resolve. The first one that came to my mind was that if you’ve created a function-based index on a pl/sql function that returns a varchar2() type and haven’t explicitly created the index on a substr() of the return value then the data type of the function’s return value will change from the current default of varchar2(4000) to varchar2(32767) – which means the index will become invalid and can’t be rebuilt or recreated.
Obviously you can redefine the index to include an explicit substr() call – but then you have to find all the code that was supposed to use the index and modify it accordingly.
Here’s one of those little details which I would have said just couldn’t be true – except it’s in the manuals, and the manuals happen to be right.
Here’s a question that appeared recently on OTN, and it’s one I’ve wondered about a few times – but never spent any time investigating. Are there any overheads to enabling row movement on a table ? If not, why is it not enabled by default and eliminated as an option ?
Obviously there are costs when a row moves – it will be updated, deleted and re-inserted with all relevant index entries adjusted accordingly – but is there an inherent overhead even if you do nothing to move a single row ?
Equally obviously you’ve made it possible for some to “accidentally” shrink the table, cause short term locking problems, and longer term performance probems; similarly it becomes possible to update rows in partitioned tables in a way that causes them to move; but “someone might do it wrong” doesn’t really work as an argument for “de-featurising” something that need not have been a feature in the first place.
What have I missed ?
Answers in the comments gratefully received – and possibly discussed.
I wrote a note about the 12c “In-Memory” option some time ago on the OTN Database forum and thought I’d posted a link to it from the blog. If I have I can’t find it now so, to avoid losing it, here’s a copy of the comments I made:
There an interesting example of a deadlock on the OTN database forum:
DEADLOCK DETECTED ( ORA-00060 )
Resource Name process session holds waits process session holds waits
PS-00000001-00000011 92 423 S 33 128 S X
BF-2ed08c01-00000000 33 128 S 92 423 S X
Here’s the output I got from a 10.2.0.5 system after generating a stored outline on a query – then dropping the index that was referenced by the stored outline and creating an alternative index. Spot the problem:
Here’s an odd little optimizer glitch – probably irrelevant to most people, but an indication of the apparent randomness that appears as you combine features. I’ve created an example which is so tiny that the only explanation I can come up with the for optimizer not “behaving properly” is that I’ve found an undocumented restriction relating to a particular feature.
Here’s a lovely little example that just came up on the OTN database forum of how things break when features collide. It’s a bug (I haven’t looked for the number) that seems to be fixed in 126.96.36.199. All it takes is a deferrable foreign key and an outer join. I’ve changed the table and column names from the original, and limited the deferability to just the foreign key: