Here’s a question that appeared recently on OTN, and it’s one I’ve wondered about a few times – but never spent any time investigating. Are there any overheads to enabling row movement on a table ? If not, why is it not enabled by default and eliminated as an option ?
Obviously there are costs when a row moves – it will be updated, deleted and re-inserted with all relevant index entries adjusted accordingly – but is there an inherent overhead even if you do nothing to move a single row ?
Equally obviously you’ve made it possible for some to “accidentally” shrink the table, cause short term locking problems, and longer term performance probems; similarly it becomes possible to update rows in partitioned tables in a way that causes them to move; but “someone might do it wrong” doesn’t really work as an argument for “de-featurising” something that need not have been a feature in the first place.
What have I missed ?
Answers in the comments gratefully received – and possibly discussed.
I wrote a note about the 12c “In-Memory” option some time ago on the OTN Database forum and thought I’d posted a link to it from the blog. If I have I can’t find it now so, to avoid losing it, here’s a copy of the comments I made:
I posted this question on twitter earlier on today (It was a thought that crossed my mind during a (terrible) presentation on partitioning that I had to sit through a few weeks ago – it’s always possible to be prompted to think of some interesting questions no matter how bad the presentation is, though):
Quiz: if you create a virtual column as trunc(date_col,’W’) and partition on it – will a query on date_col result in partition elimination?
A recent question on the Oracle-L list server described a problem with data coming in from SQL Server and an oddity with referential integrity failing on Oracle because (for example) a child row was in lower case while the parent was in upper.
This raised a few comments on how you might handle referential integrity while allowed case to differ. No doubt it’s been done before – by Tom Kyte if no-one else – but the first thought that crossed my mind was to use virtual columns:
I have a fairly strong preference for choosing simple solutions over complex solutions, and using Oracle-supplied packages over writing custom code – provided the difference in cost (whether that’s in human effort, run-time resources or licence fees) is acceptable. Sometimes, though, the gap between simplicity and cost is so extreme that a hand-crafted solution is clearly the better choice. Here’s an idea prompted by a recent visit to a site that makes use of materialized views and also happens to be licensed for the partitioning option.
Sorted Hash Clusters have been around for several years, but I’ve not yet seen them being used, or even investigated in detail. This is a bit of a shame, really, because they seem to be engineered to address a couple of interesting performance patterns.
Here’s a funny little problem I came across some time ago when setting up some materialized views. I have two tables, orders and order_lines, and I’ve set up materialized view logs for them that allow a join materialized view (called orders_join) to be fast refreshable. Watch what happens if I refresh this view just before gathering stats on the order_lines table.
Just glancing through the 12c manuals (Server Reference 12.1 June 2013 – E17615-16) to check a particular database limit, I came across the following: “Services – maximum per instance – 115″. That’s a bit of a problem, given that you can have 254 pluggable (tenant) databases in a single container database, and each plugged database gets its own service – but I’m guessing that that bit of the manual is wrong, after all it didn’t say anything about pluggable databases at all. It’s hard to keep documentation up to date as things change.
Here’s a random thought, though, loosely linked to database limits. If you’re looking ahead to a time when you have lots of tenants in a container database, you might want to start by migrating your existing databases from smallfile tablespaces to bigfile tablespaces (which may make it a good idea to run with change tracking enabled) so that the final container database doesn’t have a totally unmanageable number of database files.
Update 13th Aug 2013
Read the comments for a limit on the total number of services a container database can run.
A couple of thoughts.
The intent of ASSM is to minimise contention when multiple small transactions are busy inserting data concurrently into the same table. As a consequence, you may be able to find a number of odd behaviour patterns if you do experiments with a single session running one transaction at a time; or when executing a single large transaction, or when experimenting with small tables.
Weighing in at a massive 54 characters – the longest parameter name in 12c is:
Followed very closely by (you guessed it)
The news is out that 12c is now available for download (Code, Docs and Tutorials). There are plenty of nice little bits in it, and several important additions or enhancements to the optimizer, but there’s one feature that might prove to be very popular:
SQL> alter table p1 move partition solo ONLINE;
Here’s a little quiz: If I take the average row length of the rows in a table, multiply by the number of rows, and convert the result to the equivalent number of blocks, how can the total volume of data in the table be greater than the total number of blocks below the table high water mark ? I’ve got three tables in a schema, and they’re all in the same (8KB block, 1M uniform extent, locally managed) tablespace, but here’s a query, with results, showing their space utilisation – notice that I gather schema stats immediately before running my query:
I wrote a note a few years ago about SQL*Net compression (this will open in a new window so that you can read the posts concurrently), showing how the order of the data returned by a query could affect the amount of network traffic. An example in the note demonstrated, using autotrace statistics that the number of bytes transferred could change dramatically if you sorted your return data set. At the time I asked, and postponed answering, the question: “but how come the number of SQL*Net round trips has not changed ?”
Here’s a quick and dirty script to create a procedure (in the SYS schema – so be careful) to check the Hakan Factor for an object. If you’re not familiar with the Hakan Factor, it’s the value that gets set when you use the command “alter table minimize records_per_block;”.
Materialized views open up all sorts of possibilities for making reporting more efficient – but at the same time they can introduce some “interesting” side effects when you start seeing refreshes taking place. (Possibly one of the most dramatic surprises appeared in the upgrade that switched many refreshes into “atomic” mode, changing a “truncate / append” cycle into a massively expensive “delete / insert” cycle).
If you want to have some ideas of the type of work that is involved in the materialized view “fast refresh”, you could look at some recent articles by Alberto Dell’Era on (very specifically) outer join materialized views (which a link back to a much older article on inner join materialized view refresh):