Oracle Scratchpad

May 31, 2009

Ancient History 2

Filed under: Infrastructure,Performance,Tuning — Jonathan Lewis @ 7:09 pm BST May 31,2009

Some time ago I posted an extract from a short presentation I had given at the UKOUG annual conference about 12 years previously.

When I found that set of slides, I also found the paper copies of another set of slides I had used at another UKOUG SIG event in 1995 where I had been explaining the mechanisms used by the cost based optimizer to decide whether or not to use an index. I thought it might be quite revealing to reproduce those slides to show how much (or how little) things have changed since then.

The original presentation had been done on an overhead projector – the type that uses transparent plastic sheets with ink printed onto them, and a hot lamp – and I didn’t have the original (Word 2.0) file that I had used to create the slides, so I’ve scanned the paper copies and have uploaded them as small jpg files.

Slide 1: sets the scene: If you have a table with 1,000,000 rows and you want to query 5,000 of them should you use a tablescan or an index.

Slide 2: gives an estimate of the number of read requests needed to scan the table

Slide 3: descibes a scenario where all 5,000 rows will be in a compact section of the table because the nature of the query matches the pattern of the data arrival, and points out that the number of disc read requests will be small (compared to the tablescan) if you use the index


Slide 4: describes another viable scenario relevant to the same data set where the nature of the query is completely opposed to the pattern of the data arrive, and points out that the number of disc read requests will be large (compared to the tablescan) if you use the index.


The conclusion of the presentation – five slides in total (at least, that was all I’d kept, and the fifth was the conclusion from the above) – was that you couldn’t identify the better execution by considering just the volume of data … the data distribution mattered as well.

I suspect that this set of slides was based on Oracle 7.3 since 4KB isn’t all that common as a database block size nowadays, but things haven’t really changed much. We’ve got CPU costing (also known as System Statistics to add timing information based on the size of read requests, and to allow some time for CPU activity, but apart from that the optimizer is still interested in the same two things: how much data, and how randomly is it scattered.


  1. If it was Oracle 7, did the data dictionary hold CLUSTERING_FACTOR then? That would probably have been the next step forward that Oracle introduced to make that decision. I’m pretty sure it was there in Oracle 8, but not 7.3. I know I was often disabling indexes from being used back with Oracle 7 as the index was not “good enough”.
    I’ve got an Oracle internal paper from 1993 that talks about VLDBs. Back then anything over 30GB under unix was counted as Very Large. So the scale has change. But most of the main problems it talks about solving (backups, maintenance) have not changed at all.

    Comment by Martin Widlake — June 2, 2009 @ 2:08 pm BST Jun 2,2009 | Reply

    • Martin,

      Fortunately Hemant has confirmed my belief about the clustering_factor – I was pretty sure that it was there, but couldn’t be absolutely certain without a manual or copy of the database to hand.

      I’m surprised you had to disable indexes – it was usually the other way round in my experience. An index that the rule-based optimizer would use “because it was there” would be ignored by the cost-based optimizer because a tablescan was cheaper.

      To me the big issue of using Cost-based in v7 was that the tablescan cost was (blocks / db_file_multiblock_read_count) and this was a problem for two reasons:

      (a) DBAs had been conditioned to setting the parameter to the largest possible value to get faster reads after the RBO had worked out a plan the required a tablescan and
      (b) The cost really was (blocks / db_file_multiblock_read_count), and didn’t use an “adjusted” value to cater for a more realistic view of what would happen at run-time. (From 8.0 onwards, for example, Oracle would use 40 in the arithmetic if you set the parameter to 128 — and in 10g2 if you don’t set the parameter at all Oracle will use 8 for the arithmetic and the largest reasonable value for run-time).

      Comment by Jonathan Lewis — June 3, 2009 @ 6:33 pm BST Jun 3,2009 | Reply

  2. The 7.3 Reference Manual does show CLUSTERING_FACTOR in DBA_INDEXES. The description is “A measurement of the amount of (dis)order of the table this index is for” (sic !)

    Comment by Hemant K Chitale — June 3, 2009 @ 6:32 am BST Jun 3,2009 | Reply

    • Hemant,

      Thanks for that. I was fairly sure that the clustering_factor was there in v7 – but don’t have the manuals (or the software) around any more.

      Comment by Jonathan Lewis — June 3, 2009 @ 6:26 pm BST Jun 3,2009 | Reply

  3. Thanks Hemant, I could not find my old set of V7 manuals (I suspect in a box since the last house move.)
    And jonathan, just to compound my stupidity (my only excuse for which was being tired when I posted), I was using the RBO on 7.3! So clustering factor would not have been a factor for me. I tried CBO and it gave me so much grief I stopped pushing for it and went back to RBO! So, when I was turning off indexes (adding zero and “” to columns – boy I spent so much time doing that) it was to control RBO’s choice of using indexes when it was not the best choice.

    Comment by Martin Widlake — June 3, 2009 @ 9:34 pm BST Jun 3,2009 | Reply

RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Powered by