Oracle Scratchpad

November 21, 2009

ora_hash function

Filed under: Uncategorized — Jonathan Lewis @ 10:59 am UTC Nov 21,2009

A few weeks ago I wrote a note demonstrating the way in which Oracle’s strategy for hash partitioning is engineered to give an even data distribution when the number of partitions is a power of two. In one of the follow-up comments, Christo Kutrovsky pre-empted my planned follow-up by mentioning the hashing function ora_hash() that appeared in 10g.

This function takes up to three parameters: the input value (which can be of any base type), the maximum hash bucket value (the minimum value is implicitly zero) and a seed value (default zero) for the hash function. So, for example, ora_hash(‘fred’,1023) will return an integer value between zero and 1023.

The ora_hash() function is similar to the packaged function dbms_utility.get_hash_value({expression},{min},{max}) that appeared in 8i, although the older function only handles character expressions (using an implicit conversion for dates and numbers), requires you to specify the minimum and maximum bucket values, and doesn’t allow you to introduce a seed value.

In fact, for character inputs, the ora_hash() and get_hash_value() functions can be mapped to each other very easily by the relationship:

dbms_utility.get_hash_value(string,1,N) = 1 + ora_hash(string,N,0)

So, for example, we can run a little query like:


select
	1 + ora_hash('fred',1023,0)			ora_hash_plus_one,
	dbms_utility.get_hash_value('fred',1,1024)	get_hash_value
from
	dual
/

ORA_HASH_PLUS_ONE GET_HASH_VALUE
----------------- --------------
              904            904

One important feature of the ora_hash() function is that it seems to be the function used internally – with a zero seed – to determine which partition a row belongs to in a hash partitioned table. Of course, as I pointed out in the previous article, you have to tailor the use of the function to the number of partitions you have declared – rounding up to the nearest power of two (and subtracting one) for the “max bucket” input, then adding one to the result, then dropping the highest bit of the result if the number of partitions is not a power of two.

Here’s a little demonstration, based on a table with six partitions (i.e. poor choice of partition count), running under 11.1.0.6:

create table t1 (
	id,
	v1
)
partition by hash(v1)
partitions 6
as
select
	rownum,
	object_name
from
	all_objects
where
	rownum <= 32768
;

execute dbms_stats.gather_table_stats(user,'t1')

select	partition_position, partition_name, num_rows
from	user_tab_partitions
where	table_name = 'T1'
order by
	partition_position
;

column partition_name new_value m_pt

select	partition_name
from	user_tab_partitions
where	table_name = 'T1'
and	partition_position = 3
;

select
	ora_hash(v1, 7)				ora_hash,
	dbms_utility.get_hash_value(v1,1,8)	get_hash_value,
	dbms_mview.pmarker(rowid)		pmarker,
	count(*)
from
	t1 partition( &m_pt )
group by
	ora_hash(v1, 7),
	dbms_utility.get_hash_value(v1,1,8),
	dbms_mview.pmarker(rowid)
;

You’ll see that I’ve collected stats on the table then reported each partition’s position, name, and number of rows . This is the result I got from that query:


PARTITION_POSITION PARTITION_NAME         NUM_ROWS
------------------ -------------------- ----------
                 1 SYS_P608                   3989
                 2 SYS_P609                   4055
                 3 SYS_P610                   8356
                 4 SYS_P611                   8320
                 5 SYS_P612                   4083
                 6 SYS_P613                   3965

As you might expect from the previous article, two of my partitions (numbers 3 and 4) are twice the size of the rest. If I were to issue a ‘split partition’ call then partition 3 would split (roughly 50/50) into partitions 3 and 7, and on a second ‘split partition’ call partition 4 would split (again 50/50) into partitions 4 and 8.

To demonstrate the effect of the ora_hash() function and how it relates to hash partitioning I’ve then run a query that references partition 3 by name (using an earlier query to convert a partition position into a partition name).

You’ll notice that my call to ora_hash() uses the value seven for the maximum hash bucket – this is because (a) my ideal partition count is eight (the smallest power of two that is not less than six), and (b) although I want values between 1 and 8 the function will be generating values between zero and 7. Given the relationship between ora_hash() and dbms_utility.get_hash_value() – and given that I’m applying the functions to a character string – I’ve also called the get_hash_value() function to show that its behaviour really is consistent with ora_hash() across the entire data set. Here are the results from the final query:


  ORA_HASH GET_HASH_VALUE    PMARKER   COUNT(*)
---------- -------------- ---------- ----------
         2              3      86154       4194
         6              7      86154       4162

You can see that both functions have split my set of values into two distinct sets in exactly the same way – values that would go into a new partition three on a “split partition”, and values that would go into partition seven. And the split is roughly 50/50 (to within 1%).

As a little bonus, I’ve also included a call to dbms_mview.pmarker() in the query. This was a function that Oracle introduced in 9i (I think) to allow more refresh options for materialized views on partitioned objects when you used partition maintenance operations on them. The function returns an identifier for the partition of a partitioned table that the rowed belongs to. (In fact all it is effectively doing is returning the data_object_id from dba_objects for the containing partition).

Once you’re aware of how ora_hash() works you may be able to find ways to take advantage of it. Occasionally I’ve been asked if it’s possible to work out which partition some data is going to go into – now you know that the answer is yes.

To be investigated: Does the ora_hash() function play any part in the distribution of data across hash clusters ? (Probably not – the manuals make some comments about rounding up the supplied number of hashkeys to the next larger  prime number, which suggests that a different hashing algorithm is used.

To be confirmed: Does the ora_hash() function plan any part in parallel query distribution by hash ? One quick test suggests not – but the method of use in this case may be more subtle than the “obvious” guess that I used in my test.

November 20, 2009

OOW09 Video

Filed under: Uncategorized — Jonathan Lewis @ 7:12 pm UTC Nov 20,2009

I’ve just discovered I’m on YouTube in an impromptu video interview I gave while I was at Oracle World this year. It’s only three minutes, and doesn’t have a lot of technical content, but here’s the URL.

Footnote: Rumour has it that next year the event will be rebranded as My Oracle World – implemented entirely in Flash.

 

November 12, 2009

No change

Filed under: Troubleshooting — Jonathan Lewis @ 7:50 pm UTC Nov 12,2009

Every now and again I see a note on the OTN database forum, or Metalink, or the newsgroup asking why the time taken to run a query can vary so much between executions … even when the plan didn’t change (and the bind variable used as inputs didn’t change).

Just off the top of my head, here are a few possibilities:

  • Sometimes the required data is perfectly cached, sometimes it isn’t
  • Sometimes other processes are using the CPU or disks very aggressively
  • The volume of data can change significantly over short periods of time
  • A long-running transaction may require you to do more work generating read-consistent data sometimes.
  • A small change in data can cause a huge  variation in the number of times a subquery runs
  • A plan involving a hash join or sort may execute optimally on one occasion and multi-pass on another
  • A parallel plan may get all the PX slaves it wants on one occasion and none of them on another

Any further suggestions are welcome – remember, though, we are assuming that (a) the plan really didn’t change, (b) the input bind values really didn’t change, (c) your environment didn’t change and (d) there isn’t a mad scientist (or novice DBA) in the background doing unusual things to a production system.

November 10, 2009

Index Freelists

Filed under: Indexing, Infrastructure, Performance, Troubleshooting — Jonathan Lewis @ 7:09 pm UTC Nov 10,2009

A few months ago Saibabu Devabhaktuni let me know of an interesting issue he had had with indexes misbehaving, and pointed me to a blog note he had written to describe it. In the note he supplies a test case where Oracle walks through hundreds of blocks on an index freelist to find a single block that could be used as the target for a leaf node split.

The most interesting thing to note is that Sai has demonstrated a case where issuing a commit between the bulk delete and the subsequent insert results in a performance problem. On the other hand if you read my description in this blog posting, you’ll see that I have described a scenario where the commit between the delete and the insert is a very good idea – especially if you can fit a coalesce in between to clear up the mess.

If you’re going to repeat the test shown in Sai’s blog, it’s best if you create a new tablespace with a single datafile. When I first ran it, I decided that one of my existing tablespaces was a “close enough” match to his example – and found that I couldn’t reproduce his results. His demonstration, being based on a non-unique index,  is very sensitive to the order in which extents are allocated to the table – s0 you need a clean tablespace with just one datafile, or a tablespace which guarantees the table will fit into a single extent.

The bug number that Sai raised (8446989) is still waiting for a base bug fix; and the base bug it refers to (8570223) is hidden. There is some further information available, though, because his bug number has become the base bug for bug 7712825 which gives you a few clues about what is going on.

In fact, when I ran Sai’s test case with some diagnostic code interspersed, I noticed that the insert he did to “refill” the empty blocks didn’t take them off the freelist when (to my mind) they should have come off the freelist – which is why the single row insert that caused a block split had to walk through and unlink a lot of blocks in its search for a free block. For various reasons, and I think one of them may be a fairly expensive structural issue, I don’t think this is going to be easy to fix.

Free space management is not a simple part of the database – especially given Oracle’s dedication to making the commit as fast as possible by minimising the “excess” work that has to take place at the same time; so we may continute to see strange  behaviour appearing in unusual conditions for both freelist management and bitmap management of free space for quite some time to come.

November 6, 2009

Did you know …

Filed under: Indexing, Infrastructure — Jonathan Lewis @ 7:02 pm UTC Nov 6,2009

… a few things about the use of index space to suprise your friends and amaze your colleagues:

  • If you use “bigfile” tablespaces for your tables this can result in some indexes becoming more space-efficient than they would be otherwise.
  • Creating tables in tablespaces built from multiple datafiles may cause some of their indexes to be less space-efficient than they would otherwise be.
  • Unless you store tables in different tablespaces from indexes, rebuilding indexes can allow the indexes that you haven’t rebuilt to become less space-efficient.
  • Dropping (or truncating) a table may allow unrelated indexes to become less space-efficient
  • Using the partition exchange mechanism for data loading may make some unrelated indexes become less space-efficient.
  • Some classes of index will be less space-efficient in a RAC system than they would be in a single-instance system.
  • Using the “shrink” option on a table may cause some of its indexes to get bigger.

On the plus side: although there are many activities that contribute to indexes  being bigger than they would be if they were newly created, there aren’t many cases where the benefit of recreating them (or restructuring your database) is worth the effort. Note that many “space anomalies” in indexes are transient anyway and are self-correcting over time – given enough time and sufficient use.

Footnote: I am not going to supply an explanation for any of the above phenomena because I don’t think they’re sufficiently important to warrant any action – in general. This note has been published only for the purpose of answering trivia questions and similar entertainments.

 

November 4, 2009

Anonymous

Filed under: Uncategorized — Jonathan Lewis @ 9:00 pm UTC Nov 4,2009

From time to time I browse the comments that Akismet (the spam detector used by WordPress) has intercepted just to check that a useful comment hasn’t been unjustly labelled as spam; and a couple of days ago I was surprised to find that four comments (from the same person) had been marked as spam even though there was no obvious reason why they had been so marked.

They were clearly rubbish, of course – one expressed suprise that I hadn’t published my resume [sic] (it could do with an acute accent over the final e, but apart from that it’s American for C.V. ), another expressed the opinion that my blog showed I had no experience with ACTUAL [sic] working databases … and so on.

But how had Akismet worked out that these comments were garbage ?

The author had supplied a gmail account name – and I think that comments from the gmail domain have triggered spam exclusion in the past but only when the comment included a hyperlink. Then I thought of tracing the IP address that WordPress has captured – and it turned out to lead back to an “anonymizing proxy server”.

Clever trick that – if it’s coming from someone who’s trying to hide, a comment probably deserves to be binned.

Back to the USA – etc.

Filed under: Uncategorized — Jonathan Lewis @ 7:30 pm UTC Nov 4,2009

California, Bulgaria (yes, I know it’s not in the USA), Virginia and Texas

It’s nice after a long trip to get a few email messages from people who have heard you speak and enjoyed the experience – so thanks to the people who let me know what they thought of the seminars and presentations I’ve done in the last few weeks.

For an idea of how things went – here’s a review from Bob Watkins of one of the full day seminars.

UKOUG Conference Series

The agenda for the “Technical and E-Business Suite”  conference (30th Nov – 2nd Dec) is now available.

I’m doing a 45 minute presentation on  “Writing Optimal SQL” at 14:50 on Monday 30th, and another 45 minutes on “Introducing Partitions” at 13:15 on Tuesday 1st.

There’s also going to be a “round table” – which means everyone gets their chance to talk – on Cost Based Optimisation at 12:35 on Wednesday 3rd that I’ll be chairing.

Future options

And while we’re looking ahead – Oracle Education has started talking about trips to Switzerland in Feb, Germany in March, and Greece in Apr. I’m also talking to Oracle in the APAC region about  a trip to China (Beijing and Shanghai) in March.  Watch this space.

November 3, 2009

Foreign Keys

Filed under: Indexing, Infrastructure — Jonathan Lewis @ 11:02 pm UTC Nov 3,2009

People often create far more “foreign key” indexes than are needed – and any redundant index is a waste of several types of resources. I’ve just made some comments on an OTN thread about this topic, so rather than repeat the comments here I’ll just give you the URL.

Update 5th Nov 2009: and here’s another common issue about foreign keys that I also addressed by a link to OTN some time ago.

 

October 31, 2009

Top Ten

Filed under: Uncategorized — Jonathan Lewis @ 10:41 pm UTC Oct 31,2009

Sorry, this isn’t a posting about efficient ways of getting the first 10 rows from a result set – it’s a little note about Oracle Open World. I don’t brag very often, but sometimes it’s hard to resist.

I’ve just received an email about Openworld with the following content:

Go to Oracle OpenWorld On Demand
Get in on sessions you missed, or hear keynotes and executive solution sessions again.
FREE for all full conference, Oracle Develop, and Primavera program attendees.
Only US$245 for all others.

    • Find all keynotes, general sessions, executive solution sessions, and breakout sessions—more than 1,900 presentations—in an online streaming rich media portal
    • Get keynotes with synchronized video, slides, scrolling transcripts, and downloadable MP3s and MP4s
    • Comment and rate presentations online and post your own content
  • For 1,900 presentations, $245 doesn’t seem to be an extreme price – although it might be nicer to be able to get a set of DVDs rather than downloading all those pdfs, mp3s and adobe flash files. But I was registered anyway, so I just clicked on the link, signed in, and got to the opening portal page to find a list of the “Top Ten Presentations”, in this order:

    • Keynote: Extreme Innovation - Scott McNealy, Sun Microsystems, Inc.
    • Keynote: Larry Ellison, Oracle
    • Keynote: Innovation Across the Stack -  Thomas Kurian, Oracle
    • Keynote: Oracle Develop – What Are We Still Doing Wrong? – Thomas Kyte
    • General Session: Oracle Fusion Middleware 11g – Foundation for Innovation -  Hasan Rizvi
    • General Session: Lower Costs with Oracle Database 11g Release 2 - Robbins Jeffrey, Masayuki Karahashi, Andy Mendelsohn
    • Keynote: The Art of the Possible - Charles Phillips and Safra Catz, Oracle
    • How to Hint -  Jonathan Lewis
    • Accelerating Java for Better Enterprise Performance - Charlie Hunt, John Pampuch
    • General Session: Complete, Open, and Integrated Applications for Your Business - Andy Platt, Paco Aubrejuan, Janet Foutty, Anthony Lye,  Steve Miranda, Syed Nadeem, Jewell Richard
    Looking at the list of keynotes and well-known Oracle names above mine, I think I’m currently the top “civilian” (at the moment) – so I’m going to feel just a little smug for the rest of the weekend.

    October 30, 2009

    logging

    Filed under: Performance — Jonathan Lewis @ 1:02 pm UTC Oct 30,2009

    I’ve just jotted down a few notes about “log file sync” waits, “log file parallel write” waits, and the nologging option in response to a question on OTN about redo activity when creating a large index. The thread is probably worth reading.

    October 27, 2009

    XStreams

    Filed under: Infrastructure — Jonathan Lewis @ 7:24 pm UTC Oct 27,2009

    If you’re into using Streams, you may want to take a look at the latest enhancement: XStreams.

    There’s a little overview, plus a couple of Java examples at the following links:

    October 26, 2009

    Philosophy – 8

    Filed under: Uncategorized — Jonathan Lewis @ 8:33 pm UTC Oct 26,2009

    Btree indexes vs. Bitmap indexes – the critical difference

    • A single B-tree index allows you to access a small amount of data very precisely.
    • It is the combination of a subset of the available bitmap indexes that offers the same degree of precision.

    You should not be comparing the effectiveness of a bitmap index with the effectiveness of a b-tree index.

    (Inevitably it’s a little more subtle than this – you may create some low-precision b-tree indexes to avoid foreign key locking issues,the optimizer can combine b-tree indexes, and so on - but if you start from this basis you will have a rational view about how to use bitmap indexes).

    Footnote: remember, also, that bitmap indexes introduce massive concurrency issues and other maintenance overheads.

    October 23, 2009

    Quiz Night

    Filed under: Uncategorized — Jonathan Lewis @ 6:55 pm UTC Oct 23,2009

    Okay, so it’s not night-time, or even early evening, in some parts of the world – but somewhere it’s Friday night, and Friday night is quiz night.

    Now, most people are aware that the clustering_factor of an index measures the ordering (or, if you want to be slightly more accurate, the clustering) of the rows in a table relative to the order of entries in the index. Consequently  if you check the clustering_factor before and after rebuilding an index it cannot change.

    So how do you explan this output – which comes from a clean cut-n-paste, with no selective editing, from an SQL*Plus session running Oracle 10.2.03 with echo on - with no other sessions active on the system:

    
     SQL> execute dbms_stats.gather_index_stats(user,'i1', estimate_percent=>100)
    
    PL/SQL procedure successfully completed.
    
    SQL>
    SQL> select
      2          clustering_factor
      3  from       user_indexes
      4  where      index_name = 'I1'
      5  ;
    
    CLUSTERING_FACTOR
    -----------------
                  396
    
    1 row selected.
    
    SQL>
    SQL> alter index i1 rebuild pctfree 10;
    
    Index altered.
    
    SQL>
    SQL> execute dbms_stats.gather_index_stats(user,'i1', estimate_percent=>100)
    
    PL/SQL procedure successfully completed.
    
    SQL>
    SQL> select
      2          clustering_factor
      3  from       user_indexes
      4  where      index_name = 'I1'
      5  ;
    
    CLUSTERING_FACTOR
    -----------------
                  198
    
    1 row selected.
    

    You have two minutes, starting from NOW.

    October 21, 2009

    Bitmap Updates

    Filed under: Indexing, Infrastructure, Performance — Jonathan Lewis @ 5:33 pm UTC Oct 21,2009

    It is fairly well-known that bitmap indexes are very dense structures that can behave badly if their underlying tables are subject to even fairly low levels of insert, update or delete activity. Problems include contention, space management and performance, and these problens have spawned a couple of well-known guidelines relating to bitmap indexes:

    • Avoid concurrent modification of data by multiple processes – otherwise you end up with processes dead-locking
    • Drop/disable bitmap indexes before data loads and rebuild them afterwards.

    Of course, with a little care and experimentation, you may find that you don’t need to apply the second guideline in all cases – especially for bulk inserts.

    But what if you absolutely do have to cope with a continuous (serialised) stream of small modifications to the data and can’t drop the indexes while you’re doing it. Are there any damage-limitation steps you can take. The answer, of course, is yes – there’s always a “least worst” strategy, no matter how badly you attempt to misue an Oracle feature. Here’s a demonstration of one interesting aspect of bitmap index behaviour that helps to reduce the problems of “lots of small updates”.

    All I’m going to do is create a table with a bitmap index in a tablespace using 8KB blocks and freelist management (rather than ASSMautomatic segment space management – but that’s not a specific requirement of the demonstration) then run a pl/sql loop that updates 5,000 randomly selected rows one at a time.  When the update loop is complete I’m going to check the session stats for any interesting figures.

    execute dbms_random.seed(0)
    
    create table t1
    as
    with generator as (
    	select	--+ materialize
    		rownum 	id
    	from	all_objects
    	where	rownum <= 3000
    )
    select
    	rownum				id,
    	trunc(dbms_random.value(1,100))	bit1,
    	trunc(dbms_random.value(1,100))	bit2,
    	lpad(rownum,10,'0')		small_vc
    from
    	generator	v1,
    	generator	v2
    where
    	rownum <= 400000
    ;
    
    create index t1_i1 on t1(id);
    create bitmap index t1_b2 on t1(bit2);
    
    execute snap_my_stats.start_snap
    
    declare
    	m_key	number(6);
    begin
    	for i in 1..5000 loop
    		m_key := trunc(dbms_random.value(1,250000));
    
    		update t1 set
    			bit2 = trunc(dbms_random.value(1,100))
    		where id = m_key;
    
    	end loop;
    end;
    /
    
    execute snap_my_stats.end_snap
    

    The snap_my_stats procedure calls come from one of my simple snapshot packages – querying v$mystat joined to v$statname in this case. So what do you think will be the interesting numbers generated by this little piece of code when running on 10g ?

    Take a look at the figures for the session’s undo and redo:

    
    redo entries                      31,786
    redo size                     41,933,324
    undo change vector size       38,038,180
    

    The number of redo entries is a little high – in a perfect case we might expect about 15,000 redo entries because (in principle) we update one row, delete one index entry and insert another 5,000 times; clearly we’ve had quite a lot of extra block cleanout activity happening at the same time. In fact, the number of redo entries isn’t particularly interesting, but I’ve included it so that you can see that the average size of a redo entry is, at first approximation, nearly 1.3KB – which is quite big.

    Since half the redo entries are fairly small entries relating to block cleanout we could ignore their contribution to the count, and since 5,000 of the entries are fairly small entries describing changes to the table blocks we could ignore their contribution as well – the most significant entries are the 10,000 entries for modifying index blocks and (after allowing for small redo sizes for the other entries) our second approximation is that the average size of the redo entries for the index changes is about 4KB (which you could confirm by doing some log file dumps). Think about that for a moment – every time we update a row in the table, the two index changes between them seem to generate the equivalent of a data block of redo.

    Given that undo also has to be described by redo ( see Nutshell – 1 ) you can see that most of the redo is actually from the undo – 38MB out of 42MB. So what information is being created where, and why is there so much in total, and what can we do to reduce it.

    Each of my update statements updated one row of data, usually changing the indexed column from one value to another. To effect this change Oracle has  to change a bit in one bitmap index entry (the one for the current value of the column) from a one to a zero and change a bit in another bitmap index entry (the one for the new value of the column) from a zero to a one. But you don’t update index entries – you “delete” (mark as deleted) the old index entry and insert a new index entry … at least, that’s what you do with B-tree indexes in Oracle, and it’s what you used to do with bitmap indexes until 10g of Oracle).

    In 10g when you change a bit in a bitmap index entry you copy the old version of the index entry into an undo record and modify the index entry “in situ” to its new value. (This may not be possible in every case, there are a few variations on the theme.) The redo change vectors you generate are:

    • a very large vector describing what you’ve done to the undo block,
    • a small change vector describing how you’ve modified the index entry. 

    Now remember that you do this for two index entries. So how do we manage to generate so much undo and redo ? Because a single bitmap index entry can be as large as half a data block – which is why the total redo generated for a single row change can be roughly the same size as a whole data block.

    So the underlying problem is a consequence of the size of the block or rather, to be precise, the size of the freespace available in the block based on the index definition. So how can you reduce the redo generation if you really need to cope with this type of drip-feed into a table with bitmap indexes ? There are two options really – create the index in a tablespace with the smallest block size possible, or define the index with a large value for pctfree (e.g. 75%). Both strategies have their costs but they can make a big difference to the undo and redo generated.

    Here, for example, are the figures from re-running my test first using a 2KB blocksize for the index, and then using an 8KB blocksize with the index pctfree set to 75. (You can expect the figures to be similar but not identical because the settings don’t result in exactly the same limiting size on an index entry .) In both cases the typical index entry has been reduced by a factor of approximately four, so the undo change vector size is roughly a quarter of the size – and that fact is reflected in the reduction in the redo size:

    Blocksize 2KB – pctfree 10 (default)

    redo entries                       29,718
    redo size                      14,784,852
    undo change vector size        10,293,344
    

    Blocksize 8KB – pctfree 75

    redo entries                       29,944
    redo size                      15,913,584
    undo change vector size        11,413,304
    

    If diskspace and buffer memory are not a problem you may prefer to “waste” space in the 8KB blocks so that Oracle can make best use of the buffer cache through its LRU as the pattern of activity changes throughout the  day. If space (and memory particularly) are at a premium you may choose to go for the smaller block size and deal with the hassle of manual cache management rather than memory to “empty” space.

    If you think this is bad, by the way, take a look at the equivalent figures for the test when run against 9i using an 8KB block size and the default pctfree 10 (there is no “undo change vector size” statistic on 9i).

    redo entries                      136,209
    redo size                     289,892,104
    

    Apart from the excess redo, the index also exploded from 99 blocks to about 7,300 blocks and the test run took 81 seconds instead of just nine seconds.

    The problem is that in 9i (and earlier), updates to bitmap index entries are treated the same way as updates to B-tree entries – which means a long “history” of very similar bitmap index entries can build up in the index, and these entries result in leaf block splits and large entries propagating upwards into branch blocks; and both the old and new entries are copied (one into the undo vector, the other into the forward vector) into the redo log.

    Bonus Performance Threat:

    If you enable flashback database, then Oracle will read old undo blocks from the undo tablespace and copy them into the flashback log before “new”ing them and using them (watch out for statistic “physical reads for flashback new”). So when you’re hammering the redo logs because of your bitmap updates, you’re also hammering your undo tablespace and flashback log – so it really is worth thinking carefully about the best way to do damage limitation.

    October 20, 2009

    Understanding

    Filed under: Uncategorized — Jonathan Lewis @ 7:27 pm UTC Oct 20,2009

    Karen Morton has a few wise words to say about understanding vs. memorization. Definitely worth reading and understanding – and maybe even memorizing.

    Next Page »

    Blog at WordPress.com.