Oracle Scratchpad

December 16, 2011

All Things Oracle

Filed under: CBO,Oracle,Tuning — Jonathan Lewis @ 5:53 pm GMT Dec 16,2011

Last year I wrote a few articles for Simpletalk, a web service created by Red Gate for users of SQL Server. This year, Red Gate is setting up a similar service called “All things Oracle” (I’ve added a link in my blogroll) for Oracle users, and I’ve volunteered to write articles for them occasionally.

Some of the stuff they publish will be complete articles on their website, some will be short introductions with links to the authors’ own websites. My first article for them has just been posted – it’s an article that captures a couple of key points from the optimizer presentation I did at the UKOUG conference a couple of weeks ago.

9 Comments »

  1. Jonathan, can you elaborate a bit on the “double loop” mentioned in the article? I was unable to find anything about that.

    Comment by Mladen Gogala — December 16, 2011 @ 6:52 pm GMT Dec 16,2011 | Reply

    • Mladen,

      Although the query is a simple join between two tables it shows two nested loop operators – lines 1 and 2.

      -----------------------------------------------------
      | Id  | Operation                     | Name        |
      -----------------------------------------------------
      |   0 | SELECT STATEMENT              |             |
      |   1 |  NESTED LOOPS                 |             |
      |   2 |   NESTED LOOPS                |             |
      |   3 |    TABLE ACCESS BY INDEX ROWID| ORDERS      |
      |*  4 |     INDEX RANGE SCAN          | ORD_DAT_PRD |
      |*  5 |    INDEX UNIQUE SCAN          | PRD_PK      |
      |*  6 |   TABLE ACCESS BY INDEX ROWID | PRODUCTS    |
      -----------------------------------------------------
      
      Predicate Information (identified by operation id):
      ---------------------------------------------------
         4 - access("ORD"."DATE_PLACED">SYSDATE@!-1)
         5 - access("ORD"."ID_PRODUCT"="PRD"."ID")
         6 - filter("PRD"."PRODUCT_GROUP"='CLASSICAL CD')
      

      In principle:
      For each row from the ORDERS table we join to the PRD_PK index.
      For each entry from the PRD_PK index we join to the PRODUCTS table.
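      In case a concrete model helps, the Python below is a hypothetical sketch of that double loop. The rows, the dict-based “index”, and the rowids are all invented for illustration; Oracle’s real structures are nothing like this, but the control flow mirrors the plan:

```python
# Hypothetical model of the double nested loop in the plan above.
# orders   - rows surviving the ORD_DAT_PRD range scan (plan line 4)
# prd_pk   - the PRD_PK unique index, modelled as: product id -> rowid
# products - the PRODUCTS table, modelled as: rowid -> row

orders = [
    {"id_product": 1, "date_placed": "2011-12-16"},
    {"id_product": 2, "date_placed": "2011-12-16"},
]
prd_pk = {1: "rowid_1", 2: "rowid_2"}
products = {
    "rowid_1": {"id": 1, "product_group": "CLASSICAL CD"},
    "rowid_2": {"id": 2, "product_group": "POP CD"},
}

result = []
for ord_row in orders:                         # inner NESTED LOOPS (line 2):
    rowid = prd_pk.get(ord_row["id_product"])  #   probe the PRD_PK index
    if rowid is None:
        continue
    prd_row = products[rowid]                  # outer NESTED LOOPS (line 1):
    if prd_row["product_group"] == "CLASSICAL CD":  # filter (line 6)
        result.append((ord_row, prd_row))      #   visit the table by rowid
```

      The point of splitting the join into two operators is that the index probe and the table visit are costed (and can be optimised) separately.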

      Comment by Jonathan Lewis — December 16, 2011 @ 8:42 pm GMT Dec 16,2011 | Reply

    • It’s described in the docs.

      Comment by Timur Akhmadeev — December 19, 2011 @ 7:31 am GMT Dec 19,2011 | Reply

    • I should have mentioned that Randolf Geist discusses it, and examines variations and costs, in a series of blog articles that I added to my “bits and pieces” list a little while ago:

      Logical I/O Part 1 – Randolf Geist
      Logical I/O Part 2 – Randolf Geist
      Logical I/O Part 3 – Randolf Geist

      Comment by Jonathan Lewis — December 19, 2011 @ 12:50 pm GMT Dec 19,2011 | Reply

  2. This is very useful for me as a continuous learner :) Thanks a lot!!

    Comment by Selvakumar Nagulan — December 16, 2011 @ 7:14 pm GMT Dec 16,2011 | Reply

  3. Jonathan,

    do you have the option of reversing the two columns of the index ORD_DAT_PRD, i.e. id_product then date_placed? That way the CBO could use a nested loop with products as the driving table, taking advantage of the known product id from products and the date_placed predicate to range-scan just the order data we’re after. I think it’s mainly an index design problem; I’ve seen this scenario many times. As a general guideline, put equality-predicate columns at the leading edge of the index and non-equality (range) predicate columns at the end.
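    To illustrate the guideline, here is a toy Python model (sorted tuples plus bisect standing in for B-tree index entries; the data is invented, and this is not how Oracle implements anything):

```python
import bisect

# Hypothetical index entries as sorted key tuples: 3 products, 10 dates.
# Equality column leading: all entries for one product are contiguous.
idx_prd_dat = sorted((p, d) for p in (1, 2, 3) for d in range(10))

# Orders for product 2 with date > 6: one contiguous range scan.
lo = bisect.bisect_right(idx_prd_dat, (2, 6))
hi = bisect.bisect_left(idx_prd_dat, (3,))
hits = idx_prd_dat[lo:hi]           # exactly the three matching entries

# Range column leading: the matching entries are scattered through the
# date > 6 portion of the index, mixed in with the other products.
idx_dat_prd = sorted((d, p) for p in (1, 2, 3) for d in range(10))
start = bisect.bisect_right(idx_dat_prd, (6, float("inf")))
scanned = idx_dat_prd[start:]       # must scan and filter every entry
scattered = [e for e in scanned if e[1] == 2]
```

    With the equality column first the scan touches only the rows it returns; with the range column first it reads three times as many entries to find the same three.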

    I understand that you are showing how to instruct the CBO to work as we want, and that there is not always a chance to correct the index. Actually the two solutions are interesting, showing how to avoid unnecessary accesses to the orders table. The “scalar subquery caching” reminds me of a chapter in “Cost-Based Oracle Fundamentals” :-)

    Comment by Sid — December 18, 2011 @ 6:02 am GMT Dec 18,2011 | Reply

    • Sid,

      In the presentation I gave at the UKOUG conference, I described seven different ways to optimise this query using minimal physical changes to the database. Reversing the columns on the ord_dat_prd index (or, rather, extending the foreign key index on orders(id_product)) was not a good option because the initial effect of the join order you suggest would have been about 1,000 probes into different locations of a very large index, and the increase in the size and maintenance costs of the index was undesirable.

      Comment by Jonathan Lewis — December 19, 2011 @ 12:43 pm GMT Dec 19,2011 | Reply

      • Jonathan,

        I thought that there was just one product in each product group. I have just seen the DDL you shared to build the test case: there are 1,000 products per product group, so reversing the columns of the index ord_dat_prd doesn’t make sense.

        It sounds great that there are five other possible ways to tune the SQL; would you mind publishing them all? :-)
        I built the test case in my 10.2.0.4 database, and the plan chosen for the original SQL is different: a hash join is chosen for the first two SQL statements, and the full table scan on products contributes most of the buffer gets. I guess enhancement 1 would be the best choice.

        Enhancement 2, the “scalar subquery caching”, is not suitable in this case because there are lots of different values of id_product; the buffer gets go up to 1M.

        Here is the performance for the original SQL and enhancements 1 and 2:
        … [Removed by JPL] …

        Comment by Sid — December 20, 2011 @ 1:50 pm GMT Dec 20,2011 | Reply

  4. Sid,

    The seven strategies are part of a series of presentations that take between one and three hours to run. I’ll be showing them at both RMOUG and ODTUG next year.

    I’ve deleted the results you got from running the code. It wasn’t a test case, it was “a model of the two tables and their indexes” – but if you look at the data, it’s generating 25 days of orders at three orders per second (and it’s generating them in reverse order!) with a tiny amount of peripheral data. The model allows us to investigate and think about possible execution paths – but if we see an execution path that might be helpful we still need to think about the pros and cons.

    You’ve seen a lot of extra buffer gets and concluded that the second strategy would not be good – but you haven’t allowed for the possibility of differential caching and differences in physical reads. In the article I pointed out that the second strategy probably wouldn’t be a good choice because the number of different products would probably minimise the potential benefits of scalar subquery caching – but I didn’t allow for the possibility that 90% of the orders for classical CDs were for the top 10 popular CDs for the day.
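    For readers who haven’t met the mechanism: scalar subquery caching means Oracle can remember the result of a scalar subquery for input values it has already seen, so repeated inputs don’t re-execute the subquery. The Python below is only a loose analogy (an unbounded memoisation cache, invented data; Oracle’s cache is a small, bounded hash table):

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)   # hypothetical stand-in for Oracle's subquery cache
def product_group(id_product):
    """Hypothetical scalar subquery: look up a product's group."""
    global calls
    calls += 1             # count actual "subquery executions"
    return {1: "CLASSICAL CD", 2: "POP CD"}.get(id_product)

# Six order rows but only two distinct products: the "subquery" runs twice.
order_products = [1, 1, 2, 1, 2, 1]
groups = [product_group(p) for p in order_products]
```

    With many distinct id_product values the cache rarely hits, which is why I expected little benefit; if 90% of the orders hit the same few products, the hit rate – and the saving – would be dramatic.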

    Comment by Jonathan Lewis — December 23, 2011 @ 9:46 am GMT Dec 23,2011 | Reply

