This is another article that I wrote a long time ago (at least 15 years according to the file’s timestamp but internally stamped as 2002, and it references 8i and 9i and “enhancements in 10g”), but the method described is still relevant and worthy of consideration so I thought I’d resurrect it with the caveat that the idea is sound, but the details need review.
Yet another fix for 3rd-party performance
There are many ways to address the performance problems associated with third-party code that cannot be modified directly. Some vendors will allow you to modify indexes, some may even allow you to make more radical infrastructure changes – such as converting heap tables to index-organized tables. In some cases you may be able to make use of stored outlines to force specific execution plans on to particular SQL statements.
This article outlines yet another tactic that you can use to change the performance characteristics of someone else’s code, without changing the code itself: Query Rewrite.
Introduction to Query Rewrite
If you say “materialized view” to most DBAs and developers, they will tend to think of data warehouses, big crunchy queries, and summary tables. But, with a little luck, you can use materialized views to great effect even in OLTP systems.
Materialized views have been around for many years, of course, originally in the guise of the snapshots that were introduced to allow data to be replicated between different databases. But the newer, special, feature of materialized views is the ability to create them in the local database and get the cost based optimizer to rewrite queries to use them whenever it seems to be a good idea, so the query:
select	dept, sale_dt, sum(val)
from	sales_table
where	sale_dt > sysdate - 14
group by
	dept, sale_dt
;
running against a huge table might invisibly be turned into:
select	dept, sale_dt, v_sum
from	sales_sum
where	sale_dt > sysdate - 14
;
running against a much smaller summary table. But this depends on the fact that we have created a materialized view called sales_sum typically using a command like the following:
create materialized view sales_sum
refresh on demand
enable query rewrite
as
select	dept, sale_dt, sum(val) v_sum
from	sales_table
group by
	dept, sale_dt
;
There are many ways (with different schedules and side effects) in which the summary table (or materialized view) can be kept up to date, but I’m not going to go into that in this article.
To most people, then, materialized view = summary table, which pretty much limits their functionality to data warehouses and massive number crunching queries.
But take a wider viewpoint. The purpose of a materialized view is to allow data to be found more efficiently – just like an index. So could you use the technology in a way that is closer to indexing than it is to summary tables? The answer is yes.
Materialized clones
The commonest symptom I see in Oracle systems [ed: still, in 2021] that have performance problems is that they want to acquire too much data from too many different locations on disc: the issues of precision and scatter.
We may be able to address the problem of examining too much data (precision) by changing index definitions – but if we still have to visit many different disc locations to acquire just the correct data (scatter), we may still have a performance problem.
Consider, for example, the problem of producing a monthly statement of account on a credit card. If the debits table is an ordinary heap table, then the statement of account for a typical customer is likely to require Oracle to locate a few dozen different rows – which will probably be scattered across a few dozen table blocks – which will require a few dozen physical disc reads.
The solution, for the credit card company that builds its own application, is to create the debits table as an index organized table (IOT) with a primary key starting with the account number and payment date – so that all the payments for a given customer are collated into a few table (index) blocks as they are created.
But if you don’t own the application, and the vendor tells you that you may not change ‘their’ heap table into ‘your’ IOT, what can you do? One option – which will be appropriate for some cases, and needs a proper cost/benefit analysis – is simply to clone the debits table as an IOT and keep the clone up to date in real time with a trigger; then call the clone a materialized view with query rewrite enabled.
Figure 1 builds a simple example to demonstrate the process. The demonstration is designed to run under Oracle 9i, hence the use of subquery factoring (“with subquery”) which will not work in Oracle 8i. (The complete example can be downloaded from the Wayback Machine archive of my website: https://web.archive.org/web/20181217220623/http://www.jlcomp.demon.co.uk).
The example starts by building two tables with a referential integrity constraint to demonstrate the principle in a non-trivial case. The child table corresponds to the debits table described in the text above.
The script creates the child table with 100,000 rows using 1,000 account numbers. The account numbers are generated using the random number generator, so there will be approximately 100 rows for each account, and they will be scattered fairly evenly throughout the table – which will be about 1,700 blocks long. Given this build, a query for all the data for a single account will typically have to visit 100 blocks in the table – and in a full-size system, those block visits will probably turn into physical disc reads.
rem
rem	Script:		mv_idea_02.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Sept 2002
rem
rem	Last tested
rem		10.2.0.1
rem

create table child
as
with generator as (
	select	--+ materialize
		rownum id
	from	all_objects
	where	rownum <= 5000 -- > comment to avoid WordPress formatting issue
)
select	/*+ ordered use_nl(v2) */
	trunc(dbms_random.value(1,1001))	account,
	trunc(sysdate-10) + rownum/1000		tx_time,
	round(dbms_random.value(5,50),2)	debit,
	rpad('x',100)				padding
from
	generator	v1,
	generator	v2
where
	rownum <= 100000 -- > comment to avoid WordPress formatting issue
;

alter table child add constraint c_pk primary key(account, tx_time);

create table parent
as
select
	account,
	min(tx_time)	first_tx_time,
	rpad('x',300)	padding
from	child
group by
	account
order by
	min(tx_time)
;

alter table parent add constraint p_pk primary key(account);

alter table child add constraint c_fk_p foreign key(account) references parent(account);

Figure 1: Building the basic sample.
To add a little depth to the model, and demonstrate a little more of the power of query rewrite, I have added a parent table (in the previous discussion, it might be the accounts table, where the child table holds the debits for each account). Critically, I have declared and enabled a proper set of integrity constraints. If you try using materialized views, you either need to set up proper referential integrity, or you need to investigate the use of dimensions if you want the Cost Based Optimizer to recognise the opportunity for ‘subtle’ query rewrites.
Once I have my base table in place, the code in Figure 2 creates a clone of the data (in fact, using a subset of the columns) and a materialized view based on that clone.
create table mv_child(
	account,
	tx_time,
	debit,
	constraint mv_pk primary key (account, tx_time)
)
organization index
as
select	account, tx_time, debit
from	child
;

create materialized view mv_child
on prebuilt table
never refresh
enable query rewrite
as
select	account, tx_time, debit
from	child
;

-- now collect statistics with dbms_stats

Figure 2: Building the materialized view.
You will notice that the table mv_child has been declared as an index organized table. Since I’ve eliminated the padding column from the data as well, the data for each account is now packed into about 4 leaf blocks.
The materialized view (also called mv_child) is declared as prebuilt, with never refresh. As far as Oracle is concerned, the thing exists, and will ‘never’ be maintained, so there is no need to keep any sort of log for it. But it can be used for query rewrite because we have enabled it for query rewrite.
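As a sanity check before relying on rewrite, you can ask Oracle what it believes this materialized view is capable of. The sketch below uses the standard dbms_mview.explain_mview procedure; it assumes you have first created the default MV_CAPABILITIES_TABLE by running the Oracle-supplied script utlxmv.sql (the table and column names shown are the defaults from that script).

```sql
rem	Create the capabilities table first (Oracle-supplied script):
rem	@?/rdbms/admin/utlxmv.sql

begin
	dbms_mview.explain_mview('MV_CHILD');
end;
/

-- report the rewrite-related capabilities Oracle has recorded
select	capability_name, possible, msgtxt
from	mv_capabilities_table
where	mvname = 'MV_CHILD'
and	capability_name like 'REWRITE%'
;
```

If REWRITE_FULL_TEXT_MATCH or REWRITE_GENERAL shows as possible, you know the view is at least a candidate before you start checking individual execution plans.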
Of course, if Oracle does use this table for query rewrite, the results will start to go wrong as soon as the data in the base table (child) starts to change. So we have to tell the cost based optimizer to be very trusting about this materialized view. We start by telling Oracle to enable query rewrite, then set the rewrite integrity:
alter session set query_rewrite_enabled = true;
alter session set query_rewrite_integrity = stale_tolerated;
Since we are going to be updating the base table, the stale_tolerated level is necessary. If we were not expecting to update the child table, then an integrity level of trusted would be sufficient.
Once we have set this up, we can start to run queries and check execution plans (making sure we check the real run-time plans, not just the output from explain plan which can produce incorrect results when materialized views come into play).
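One convenient way to see the real run-time plan (rather than the explain plan prediction) is dbms_xplan.display_cursor, which reports the plan actually used for the last statement executed by the session. Note the hedge: display_cursor arrived in 10g, so on 9i you would have to query v$sql_plan directly instead.

```sql
variable b1 number
exec :b1 := 10

select	p.account, c.tx_time, c.debit
from	parent p, child c
where	p.first_tx_time = trunc(sysdate) - :b1
and	c.account = p.account
order by
	p.account, c.tx_time
;

-- report the plan of the statement this session just executed
select * from table(dbms_xplan.display_cursor(null, null, 'basic'));
```

If the rewrite has taken place you should see MV_PK (the index on mv_child) in the plan rather than an access to the child table.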
select
	p.account, c.tx_time, c.debit
from
	parent	p,
	child	c
where
	p.first_tx_time = trunc(sysdate) - :b1
and	c.account = p.account
order by
	p.account, c.tx_time
;

SELECT STATEMENT
  SORT (order by)
    NESTED LOOPS
      TABLE ACCESS PARENT (full)
        Filter: P.FIRST_TX_TIME = TRUNC(SYSDATE@!) - TO_NUMBER(TO_CHAR(:B1))
      INDEX MV_PK (range scan)
        Access: MV_CHILD.ACCOUNT = P.ACCOUNT

Figure 3: Example query with execution plan.
In the example in figure 3, I have joined the parent and child tables. And, as you can see, the access to the child table (you might expect an index range scan on ‘C’ followed by table access to table ‘CHILD’) has been replaced by just the index range scan on index ‘MV_PK’ – the index on the mv_child table.
Next steps
Of course, this is just a simple example of how to use materialized views – to take advantage of the technique in an OLTP system, you have to ensure that the table mv_child stays in synch with the base child table. This is where simple triggers can come into play. (See fig 4.)
Problems
The sample trigger code will keep the child and mv_child tables in synch on an update. You will have to create insert and delete triggers as well. Of course, a better (more efficient) style of code for this part of the task would be to create a package containing procedures to execute the actual SQL, then use the triggers to call the packaged procedures. The style shown just keeps the code short and simple.
create or replace trigger c_aru
after update of account, tx_time, debit
on child
for each row
begin
	if (    :new.account = :old.account
	    and :new.tx_time = :old.tx_time
	) then
		update	mv_child
		set	debit   = :new.debit
		where	account = :new.account
		and	tx_time = :new.tx_time
		;
	else
		delete	from mv_child
		where	account = :old.account
		and	tx_time = :old.tx_time
		;
		insert into mv_child (
			account, tx_time, debit
		) values (
			:new.account, :new.tx_time, :new.debit
		)
		;
	end if;
end;
/

Figure 4: Example of simple trigger on child table.
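For completeness, here is a sketch of the matching insert and delete triggers. The trigger names (c_ari, c_ard) are my own, simply following the naming pattern of the update trigger above; as with the update trigger, production code would push the SQL into packaged procedures.

```sql
create or replace trigger c_ari
after insert on child
for each row
begin
	insert into mv_child (account, tx_time, debit)
	values (:new.account, :new.tx_time, :new.debit);
end;
/

create or replace trigger c_ard
after delete on child
for each row
begin
	delete	from mv_child
	where	account = :old.account
	and	tx_time = :old.tx_time;
end;
/
```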
But there are problems with this approach that you may discover only after you have started to test the triggers.
Critically, if your session modifies the child table (in a way that makes the triggers fire) and then queries it before commit, the query will not be rewritten to use the materialized view. However, if your application always does a commit after any modification and before querying the table, your queries will be redirected against the materialized view.
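You can see this effect for yourself with a sequence like the following (the behaviour is as described above on the versions I tested; check the run-time plan at each step rather than taking the comments on trust):

```sql
update	child
set	debit = debit + 0
where	account = 1
and	rownum = 1
;

-- Before the commit this query runs against the CHILD table
-- (no rewrite for the session that made the uncommitted change):
select sum(debit) from child where account = 1;

commit;

-- After the commit the same query can be rewritten to use MV_CHILD:
select sum(debit) from child where account = 1;
```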
I have to say I was a little surprised when I found that this technique worked at all – according to various notes in the manuals query rewrite is supposed to be blocked by the presence of bind variables. But, as you can see in fig. 3, my tests with bind variables did actually work.
One final thought – if you are already familiar with materialized views and query rewrite you may be wondering why I didn’t simply define my view with the option to “refresh fast on commit” instead of writing my own triggers. In a system with low throughput, this might work – but for a high-throughput OLTP system the overheads are enormous, roughly 5,000% the last time I checked [ed: but that was a long time ago, and things have improved].
Conclusion
Many of the performance problems that I see boil down to excessive I/O – with solutions that basically find ways of reducing the amount of I/O needed to satisfy critical queries.
Although materialized views are usually associated with reducing I/O in Data Warehouse and Decision Support Systems, you may find that you can take advantage of them in a slightly fanciful way in OLTP systems as well, typically by creating copies of data restricted to subsets of columns (as above) and/or rows.
Footnote Dec 2021
Remember that this was written some time when 10g was very new – the idea is useful if you haven’t considered it before, but I won’t guarantee that the details are still correct, or that there isn’t a better way of implementing the strategy I’ve described above.