Oracle Scratchpad

December 22, 2021

MV Hacking

Filed under: Infrastructure,Materialized view,Oracle — Jonathan Lewis @ 11:54 am GMT Dec 22,2021

This is another article that I wrote a long time ago (at least 15 years according to the file’s timestamp but internally stamped as 2002, and it references 8i and 9i and “enhancements in 10g”), but the method described is still relevant and worthy of consideration so I thought I’d resurrect it with the caveat that the idea is sound, but the details need review.

Yet another fix for 3rd-party performance

 There are many ways to address the performance problems associated with third-party code that cannot be modified directly. Some vendors will allow you to modify indexes, some may even allow you to make more radical infrastructure changes – such as converting heap tables to index-organized tables. In some cases you may be able to make use of stored outlines to force specific execution plans on to particular SQL statements.

This article outlines yet another tactic that you can use to change the performance characteristics of someone else’s code, without changing the code itself: Query Rewrite.

Introduction to Query Rewrite.

If you say “materialized view” to most DBAs and developers, they will tend to think of data warehouses, big crunchy queries, and summary tables.  But, with a little luck, you can use materialized views to great effect even in OLTP systems.

Materialized views have been around for many years, of course, originally in the guise of the snapshots that were introduced to allow data to be replicated between different databases. But the newer, special, feature of materialized views is the ability to create them in the local database and get the cost based optimizer to rewrite queries to use them whenever it seems to be a good idea, so the query:

select
        dept, sale_dt sum(val) 
from
        sales_table
where
        sale_dt > sysdate - 14
group by 
        dept, sale_dt
;

running against a huge table might invisibly be turned into:

select  
        dept, sale_dt, v_sum 
from
        sales_sum
where
        sale_dt > sysdate – 14
;

running against a much smaller summary table. But this depends on the fact that we have created a materialized view called sales_sum typically using a command like the following:

create materialized view sales_sum
refresh on demand
enable query rewrite
as
select
        dept, sale_dt, 
        sum(val) v_sum
from    sales_table
group by 
        dept, sale_dt
;

There are many ways (with different schedules and side effects) in which the summary table (or materialized view) can be kept up to date, but I’m not going to go into that in this article.

To most people, then, materialized view = summary table, which pretty much limits their functionality to data warehouses and massive number crunching queries.

But take a wider viewpoint. The purpose of a materialized view is to allow data to be found more efficiently – just like an index. So could you use the technology in a way that is closer to indexing than it is to summary tables? The answer is yes.

Materialized clones

The commonest symptom I see in Oracle systems [ed: still, in 2021] that have performance problems is that they want to acquire too much data from too many different locations on disc –the issues of precision and scatter.

We may be able to address the problem of examining too much data (precision) by changing index definitions – but if we still have to visit many different disc locations to acquire just the correct data (scatter), we may still have a performance problem.

Consider, for example, the problem of producing a monthly statement of account on a credit card. If the debits table is an ordinary heap table, then the statement of account for a typical customer is likely to require Oracle to locate a few dozen different rows – which will probably be scattered across a few dozen table blocks – which will require a few dozen physical disc reads.

The solution, for the credit card company that builds its own application, is to create the debits table as an index organized table (IOT) with a primary key starting with the account number and payment date – so that all the payments for a given customer are collated into a few table (index) blocks as they are created.

But if you don’t own the application, and the vendor tells you that you may not change ‘their’ heap table into ‘your’ IOT, what can you do? One option – which will be appropriate for some cases, and needs a proper cost/benefit analysis – is simply to clone the debits table as an IOT and keep the clone up to date in real time with a trigger; then call the clone a materialized view with query rewrite enabled.

Figure 1 builds a simple example to demonstrate the process. The demonstration is designed to run under Oracle 9i, hence the use of subquery factoring (“with subquery”) which will not work in Oracle 8i. (The complete example can be downloaded from the Wayback Machine archive of my website: https://web.archive.org/web/20181217220623/http://www.jlcomp.demon.co.uk).

The example starts by building two tables with a referential integrity constraint to demonstrate the principal in a non-trivial case. The child table corresponds to the debits table described in the text above.

The script creates the child table with 100,000 rows using 1,000 account numbers. The account numbers are generated using the random number generator, so there will be approximately 100 rows for each account, and they will be scattered fairly evenly throughout the table – which will be about 1,700 blocks long. Given this build, a query for all the data for a single account will typically have to visit 100 blocks in the table – and in a full-size system, those block visits will probably turn into physical disc reads.

 
rem
rem     Script:         mv_idea_02.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sept 2002
rem
rem     Last tested
rem             10.2.0.1
rem

create table child
as
with generator as (
        select  --+ materialize
                rownum  id
        from    all_objects 
        where   rownum <= 5000 -- > comment to avoid WordPress formatting issue
)
select
        /*+ ordered use_nl(v2) */
        trunc(dbms_random.value(1,1001)) account,
        trunc(sysdate-10) + rownum/1000  tx_time,
        round(dbms_random.value(5,50),2) debit,
        rpad('x',100)                      padding
from
        generator       v1,
        generator       v2
where
        rownum <= 100000 -- > comment to avoid WordPress formatting issue
;

alter table child 
        add constraint c_pk primary key(account,tx_time);

create table parent as
select
        account,
        min(tx_time)    first_tx_time,
        rpad('x',300)   padding
from
        child
group by
        account
order by
        min(tx_time)
;

alter table parent 
        add constraint p_pk primary key(account);

alter table child  
        add constraint c_fk_p foreign key(account) 
        references parent(account);


Figure 1: Building the basic sample. 

To add a little depth to the model, and demonstrate a little more of the power of query rewrite, I have added a parent table (in the previous discussion, it might be the accounts table, where the child table holds the debits for each account).  Critically, I have declared and enabled a proper set of integrity constraints. If you try using materialized views, you either need to set up proper referential integrity, or you need to investigate the use of dimensions if you want the Cost Based Optimizer to recognise the opportunity for ‘subtle’ query rewrites.

Once I have my base table in place, the code in Figure 2 creates a clone of the data (in fact, using a subset of the columns) and a materialized view based on that clone.

create table mv_child(
        account,
        tx_time,
        debit,
        constraint mv_pk primary key (account, tx_time)
)
organization index
as
select  account, tx_time, debit
from    child
;

create materialized view mv_child
on prebuilt table 
never refresh
enable query rewrite
as
select  account, tx_time, debit
from    child
;

--      now collect statistcs with dbms_stats


Figure 2: Building the materialized view.

You will notice that the table mv_child has been declared as an index organized table.  Since I’ve eliminated the padding column from the data as well, the data for each account is now packed into about 4 leaf blocks.

The materialized view (also called mv_child) is declared as prebuilt, with never refresh. As far as Oracle is concerned, the thing exists, and will ‘never’ be maintained, so there is no need to keep any sort of log for it. But it can be used for query rewrite because we have enabled it for query rewrite.

Of course, if Oracle does use this table for query rewrite, the results will start to go wrong as soon as the data in the base table (child) starts to change. So we have to tell the cost based optimizer to be very trusting about this materialized view. We start by telling Oracle to enable query rewrite, then set the rewrite integrity:

alter session set query_rewrite_enabled   = true;
alter session set query_rewrite_integrity = stale_tolerated;

Since we are going to be updating the base table, the stale_tolerated level is necessary. If we were not expecting to update the child table, then an integrity level of trusted would be sufficient.

Once we have set this up, we can start to run queries and check execution plans (making sure we check the real run-time plans, not just the output from explain plan which can produce incorrect results when materialized views come into play).

select
        p.account,  c.tx_time, c.debit
from
        parent  p,
        child   c
where
        p.first_tx_time = trunc(sysdate) - :b1
and     c.account = p.account
order by
        p.account, c.tx_time
;

SELECT STATEMENT 
  SORT (order by)
    NESTED LOOPS
      TABLE ACCESS PARENT (full)  Filter: P.FIRST_TX_TIME= TRUNC(SYSDATE@!)-TO_NUMBER(TO_CHAR(:B1))
      INDEX MV_PK (range scan)    Access: MV_CHILD.ACCOUNT=P.ACCOUNT


Figure 3: Example query  with execution plan:

In the example in figure 3, I have joined the parent and child tables. And, as you can see, the access to the child table (you might expect an index range scan on ‘C’ followed by table access to table ‘CHILD’ has been replaced by just the index range scan on index ‘MV_PK’ – the index on the mv_child table.

Next Steps:

Of course, this is just a simple example of how to use materialized views – to take advantage of the technique in an OLTP system, you have to ensure that the table mv_child stays in synch with the base child table. This is where simple triggers can come into play. (See fig  4.)

Problems

The sample trigger code will keep the child and mv_child table in synch on an update. You will have to create insert and delete triggers as well. Of course, a better (more efficient) style of code for this part of the task would be to create a package containing procedures to operate the actual SQL, then use the triggers to call the packaged procedures. The style shown just keeps the code short and simple.

create or replace trigger c_aru
after update of account, tx_time, debit 
on child
for each row
begin
        if (
                :new.account = :old.account
            and :new.tx_time = :old.tx_time
        ) then
                update mv_child
                        set debit = :new.debit
                where   account = :new.account
                and     tx_time = :new.tx_time
                ;
        else
                delete from mv_child
                where   account = :old.account
                and     tx_time = :old.tx_time
                ;

                insert into mv_child (
                        account, tx_time, debit
                )
                values (
                        :new.account, 
                        :new.tx_time, 
                        :new.debit
                )
                ;
        end if;
end;
/


Figure 4: Example of simple trigger on child table.

But there are problems with this approach that you may discover only after you have started to test the triggers.

Critically, if your session modifies the child table (in a way that makes the triggers fire) and then queries it before commit, the query will not be rewritten to use the materialized view. However, if your application always does a commit after any modification and before querying the table, your queries will be redirected against the materialized view.

I have to say I was a little surprised when I found that this technique worked at all – according to various notes in the manuals query rewrite is supposed to be blocked by the presence of bind variables. But, as you can see in fig. 3, my tests with bind variables did actually work.

One final thought – if you are already familiar with materialized views and query rewrite you may be wondering why I didn’t simply define my view with the option to “refresh fast on commit” instead of writing my own triggers. In a system with low throughput, this might work – but for a high-throughput OLTP system the overheads are enormous, roughly 5,000% the last time I checked [ed: but that was a long time ago, and things have improved]).

Conclusion

Many of the performance problems that I see boil down to excessive I/O – with solutions that basically find ways of reducing the amount of I/O needed to satisfy critical queries.

Although materialized views are usually associated with reducing I/O in Data Warehouse and Decision Support Systems, you may find that you can take advantage of them in a slightly fanciful way in OLTP systems as well, typically by creating copies of data restricted to subsets of columns (as above) or/and rows.

Footnote Dec 2021

Remember that this was written some time when 10g was  very new – the idea is useful if you haven’t considered it before, but I won’t guarantee that the details are still correct, or that there isn’t a better way of implementing the strategy I’ve described above.

1 Comment »

  1. […] A reprint of a very old article on using materialized views to address performance issues with 3rd p… […]

    Pingback by Performance catalogue | Oracle Scratchpad — January 28, 2022 @ 4:52 pm GMT Jan 28,2022 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

%d bloggers like this: