Oracle Scratchpad

November 17, 2006

How long …

Filed under: Performance, Tuning — Jonathan Lewis @ 4:28 pm BST Nov 17, 2006

… should it take Oracle to run my query ?

It’s a perfectly reasonable question, and one that you should be able to answer before you start to worry about tuning the query – but I wonder how many people actually formulate the question, and then do a scratch calculation to get an answer. And if you don’t do it, how do you know what your tuning target is ?

10 Comments

  1. “but I wonder how many people actually formulate the question, and then do a scratch calculation to get an answer.”

    I don’t, because I don’t know how to, but I am willing to learn. Any pointers / reference documents / examples?

    “And if you don’t do it, how do you know what your tuning target is ?”

    By a reasonable expectation, based on the complexity of the query, volume of data that will be analyzed, etc. I work to tune the query for the lowest LIOs, but, as I already stated, do not really know how to calculate a target.

    Comment by Kyle — November 17, 2006 @ 5:09 pm BST Nov 17,2006 | Reply

  2. A suggestion I once heard from Mr. Millsap, applicable to non-aggregating queries, was:

    10 LIOs per returned row per table in the FROM clause
    e.g. for 2 rows returned from a 3-table join, the LIO count should be no greater than 60

    I’ve found this to be generally applicable.

    Otherwise it depends on end-user satisfaction and impact on the system.

    Ken

    Comment by Ken — November 17, 2006 @ 8:07 pm
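
A minimal sketch of checking Ken’s guideline against a live statement, assuming a hypothetical three-table join; in SQL*Plus, autotrace reports the “consistent gets” (LIOs) for the statement just run:

    set autotrace traceonly statistics

    select o.order_id, c.cust_name, p.product_name
    from   orders o, customers c, products p        -- hypothetical tables
    where  o.cust_id    = c.cust_id
    and    o.product_id = p.product_id
    and    o.order_id   = 42;

    set autotrace off

    -- guideline: 2 rows * 3 tables * 10 = 60 consistent gets at most;
    -- a figure far beyond that suggests the access path deserves a closer look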

  3. I normally assume that a single-block PIO takes about 10 msec and a LIO one tenth of that (1 msec), and then I mentally execute the query in the steady system state (assuming, e.g., that the root block of an index is always cached, so it gets counted as a LIO, etc.).

    I also assume that a multiblock I/O takes about the same time as two or three single-block accesses.

    The best access path/join order I can imagine becomes my tuning target.

    I resort to measuring, rather than assuming, the CPU used for hash joins and sorting operations, since I’ve yet to find a reasonable heuristic for them; the same, a fortiori, for latch acquisitions (whose impact, sadly, can normally only be measured in production, under the real load).

    But my tuning process is, mostly, by intuition, guided by some guidelines such as the ones presented above, and by some simplified experiments on (possibly synthetic) data for the most critical statements.

    Comment by Alberto Dell’Era — November 17, 2006 @ 10:37 pm
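
A back-of-envelope example of Alberto’s arithmetic, with hypothetical numbers: fetching 500 rows through a three-level index whose blocks are all cached, at one physical single-block table read per row, gives 500 * (3 * 1 msec + 1 * 10 msec) = 6.5 seconds as the starting target.

    -- the same sum as a runnable query
    select 500 * (3 * 0.001 + 1 * 0.010) as est_seconds from dual;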

  4. Most of the time, I just do a quick explain plan of the query in question on a test/dev/production machine, and that gives a good estimate to start with. Of course the query will go through an actual run with the times measured, but the explain plan should be a decent start… no?

    Comment by Sudhi — November 19, 2006 @ 1:46 pm

  5. Sudhi,

    >>I just do a quick explain plan of the query in question on a test/dev/production machine and that gives a good estimate to start with.

    There’s no guarantee that the plan from explain plan will be the same as the one used when you run the query on test/dev/production. It also depends on which method you use to get the execution plan for the query.

    Comment by Jaffar — November 19, 2006 @ 2:48 pm
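
One way to reduce that uncertainty, where the version allows (10g onwards), is to pull the plan the optimizer actually used from the library cache rather than relying only on explain plan (which, for example, does not peek at bind variables). A sketch; the query is hypothetical and the sql_id is a placeholder to be looked up in v$sql:

    -- the predicted plan
    explain plan for
    select count(*) from orders where order_date > sysdate - 7;

    select * from table(dbms_xplan.display);

    -- the plan that actually ran, taken from the shared pool
    select * from table(dbms_xplan.display_cursor('&sql_id', null));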

  6. Ken, I think that you’re quoting the guideline Cary gives when reviewing the summary of work done by SQL after the event. The figure 10 is probably allowing for a “reasonably precise” application. In similar circumstances I tend to take 4 LIOs as a guideline on “perfect precision” – based on a very simple-minded approximation that a single row goes through “root, branch, leaf, table”.
    Kyle, it’s interesting to note that your first paragraph says you don’t know how to set a target, and then your second paragraph gives a perfect description of how to set a target. The “reasonable expectation”, based on the complexity of the query and the data to be analysed, is IT. Your comments, combined with Alberto’s, plus a little formalism, are all it takes to figure out rough, but adequate, targets.
    I’ll be writing something up about it eventually – but a pretty good starting point is:

    Work out a “business-aware” execution path, and walk it, counting the accumulation of data and the block visits for filtering subqueries. The data volumes (row counts) depend on knowing the business, and the block visits (derived from the row counts) then need an understanding of the likely data scatter.

    Basically I take the same approach as Alberto, with the assumption that index blocks will be buffered and table blocks will be physical I/Os, and that the time will be mostly about physical I/O.
    It’s not a great approximation, but it does two things – it highlights the cases where the query is likely to thrash the I/O subsystem (and will probably start using merge or hash joins), and it gives me an idea of whether I need to review the query, the business requirement, or the database structure.
    Sudhi, as Jaffar points out, you need to be a bit careful about trusting the execution plans from explain plan. (See Volume 2 of my book when I manage to finish it). But if you’ve allowed for all the traps, the execution plan can be helpful on two counts: you can use it to decide if Oracle’s chosen plan makes sense from a business perspective, and you can follow Alberto’s strategy by mentally executing the chosen plan to see if the work predicted by the optimizer matches your mental image of the data.

    Comment by Jonathan Lewis — November 19, 2006 @ 6:08 pm
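
A runnable sketch of that “mental execution” cross-check (10g onwards; the table and predicate are hypothetical): execute the statement with rowsource statistics enabled, then compare the optimizer’s estimated row counts with the rows each plan step actually produced.

    -- run with serveroutput off so the last cursor in the session is this query
    select /*+ gather_plan_statistics */ count(*)
    from   orders
    where  order_date > sysdate - 7;

    -- E-Rows vs A-Rows (and Buffers) per plan step: large discrepancies show
    -- where the optimizer's picture of the data differs from yours
    select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));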

  7. Jaffar & Jonathan, yes, I’m aware of the traps of explain plan and the various things that come with it. I just use it as a starting *estimate*, combined with the application requirements of the query. Of course the entire thing is NOT trusted at face value, but it gives a very good starting point for thinking about the plan & elapsed time. This has also helped the developers change their approach from “tune this query” to “does this make sense?” or “can I come up with a better query?” :)

    Comment by Sudhi — November 20, 2006 @ 4:36 am

  8. In the case of a partitioned table there’s a great risk if partition statistics are missing for some of the partitions; I’ve observed the plan change very drastically, and the execution time with it. The use of DOP (degree of parallelism) is also a key factor.

    Comment by Dilipkumar Patel — November 20, 2006 @ 4:58 pm
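
A quick dictionary check for the situation Dilipkumar describes, listing partitions with no statistics at all (the table name is hypothetical):

    select partition_name, num_rows, last_analyzed
    from   user_tab_partitions
    where  table_name = 'SALES'
    and    last_analyzed is null;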

  9. Funny,
    I came across the very same thought working on a problem just the other day, re: how much time and how much space is required for a large DW job…. Doing some FTS on some pretty large tables, it required some 60+ GB of temp space and a good few hours of processing time. I have a bit of a (different) appreciation now for these (large) jobs… ;-)

    Comment by cosmin — November 21, 2006 @ 6:29 pm

  10. Playing with connect by, once I ran this (on a test system)

    select * from (select level l from dual connect by level > 1) order by l desc

    forever!

    Comment by Rob — November 27, 2006 @ 4:23 pm
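
For reference: LEVEL starts at 1 and the CONNECT BY condition is tested for every child row generated, so “level > 1” holds from level 2 onwards and the row source never terminates, while the ORDER BY waits for rows that never stop arriving. The conventional bounded row generator flips the comparison:

    select level l from dual connect by level <= 10 order by l desc;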

