Oracle Scratchpad

November 11, 2008

first_rows_n

Filed under: CBO,Execution plans,Hints,Performance,Tuning — Jonathan Lewis @ 1:05 pm BST Nov 11,2008

When it comes to setting the optimizer_mode parameter you often hear people say that first_rows_N (for one of the legal values of N) should be used for OLTP systems and all_rows should be used for decision support and data warehouse systems.

There is an element of truth in the statement – but it’s really a hangover from the early days of CBO, and a remembrance of the old first_rows optimizer mode (** See footnote).

Prior to the introduction of parameters like optimizer_index_cost_adj (Oracle 8), and system statistics (Oracle 9 – but optional until 10g), the cost based optimizer often seemed too keen to use execution plans that involved tablescans and hash joins – which could be most unsuitable in an OLTP system.

This problem appeared because the optimizer did not attempt to distinguish between the potential speed of a “db file sequential read” (single block read) compared to a “db file scattered read” (multiblock read).  Setting the optimizer_mode to first_rows introduced some simple rules that (to a large extent) stopped the optimizer from using tablescans unless there were no alternatives.
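
(As an aside: gathering system statistics is what eventually gave the optimizer separate timings for the two types of read. The following is only a sketch of the “workload” gathering steps – the pause for a workload and the closing query are purely illustrative:)

    begin
            -- start capturing workload system statistics
            dbms_stats.gather_system_stats(gathering_mode => 'START');
    end;
    /

    -- ... let a representative workload run for a while, then ...

    begin
            -- stop capturing and save the results for the optimizer to use
            dbms_stats.gather_system_stats(gathering_mode => 'STOP');
    end;
    /

    -- sreadtim (single block) and mreadtim (multiblock) should now be visible
    select  pname, pval1
    from    sys.aux_stats$
    where   sname = 'SYSSTATS_MAIN';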

In 9i, however, first_rows_N optimisation is geared towards using arithmetic to make decisions about avoiding tablescans. You can see the results of this in some of the odd numbers that appear in execution plans as the optimizer works out whether, for example, it is better to use an index to pick up 10 rows from a table, or whether running (say) 2.3% of a tablescan (which would probably get the first 10 rows) is the cheaper option.
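
If you want to see those numbers for yourself, a simple experiment (the table t1 and its status column here are purely hypothetical) is to generate the plan for the same statement under both modes and compare the Cost column:

    alter session set optimizer_mode = all_rows;

    explain plan for
    select * from t1 where status = 'OPEN';

    select * from table(dbms_xplan.display);

    alter session set optimizer_mode = first_rows_10;

    explain plan for
    select * from t1 where status = 'OPEN';

    select * from table(dbms_xplan.display);

Under first_rows_10 a tablescan – if one still appears – is typically reported with a much smaller cost, because the optimizer is costing only the fraction of the scan it expects to run before it has found 10 rows.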

Think about what this means – first_rows_N is trying to work out the cheapest way of collecting the first N rows of the result set, whereas all_rows is trying to work out the cheapest way of collecting all the rows of the result set.

But when you are running an OLTP system a huge fraction of typical end-user queries will be returning just a handful of rows anyway – and if you have a query that’s only supposed to return 5 or 6 rows (say), then first_rows_10 actually means “more rows than I really expect” i.e. all_rows. If the optimizer works correctly then both options should produce the same execution plan.

Of course some end-user queries are reports (even in an OLTP system), and for reports you don’t want to generate a plan that’s optimised to return the first few lines so, again, you’re really interested in all_rows optimisation.

Your conclusion should therefore be: for OLTP systems, first_rows_N ought to be pretty irrelevant. You should be using all_rows – unless you can demonstrate that this introduces a large number of execution plans that have to be fixed individually because of some unexpected side effects (Again, though, see footnote). 

So why would we ever need first_rows_N ?

Think Google, or Amazon, or any of the “forum” software suppliers.  These are effectively ad hoc query systems where users can supply query predicates that could return a lot of data – but the code limits the amount of data sent back to the user to a “page at a time”. Think “web-based reporting system” and you’ve identified the basic rationale for first_rows_N.

Even for the web-based reporting system, though, you may still be able to run under all_rows from 10g onwards. In 10g, when the optimizer sees predicates like “rownum <= 35” it will behave as if you had included the hint /*+ first_rows(35) */ in the query.
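
In other words, from 10g onwards a typical “page at a time” query can often be left unhinted under all_rows. A sketch (the orders table and its columns are hypothetical):

    select  *
    from    (
            select  order_id, order_date, status
            from    orders
            order by order_date desc
            )
    where   rownum <= 35;

    -- From 10g the optimizer costs this as if /*+ first_rows(35) */ had been
    -- supplied, favouring an access path that can return the first 35 rows
    -- of the ordered result quickly.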

As a closing thought: even if you think that first_rows_N may be the correct logical and strategic choice for your system, remember that it hasn’t been “field-tested” to the extent that all_rows has.

There are bugs (or side effects) in the optimizer that make their first appearance only in the strange circumstances of “real-world” applications – and it’s a good bet that such oddities are found (and probably fixed) faster in the all_rows environment than in the first_rows_N environment simply because far more systems use all_rows.

Even when first_rows_N seems to be the right choice, you could decide to stick with all_rows and adjust critical statements with a /*+ first_rows(N) */ hint – where N can be any positive integer.
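
One detail to remember if you take this approach: the N in the hint has to be a literal integer, not a bind variable. A hinted statement might look something like this (the table and columns are hypothetical):

    select  /*+ first_rows(10) */
            customer_id, customer_name
    from    customers
    where   region = :b_region       -- bind variables in predicates are fine
    order by customer_name;          -- but the 10 in the hint must be a literal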

Update (Sept 2009):

In one of the comments below, you will see that Timur Akhmadeev has supplied a link to an item on the OTN database forum highlighting the types of error that can appear.

A couple of months after reporting the problem in 10.2 the original author re-ran his test case in 11.2 and discovered that the problem had disappeared. The analysis by Charles Hooper suggests that the “rownum <= 1” predicate made the optimizer work out that it needed only 1/78th of the final data set – but it then erroneously pushed that factor of 1/78 into the cost calculations for an aggregation step that had to run to completion before any data could be reported.

Footnote:

If you still have any systems running 9i or later and using first_rows, you should be planning to change the optimizer_mode when you next go through a full test cycle. As the manuals point out (9.2 Performance Tuning Guide and Reference p1-12): “first_rows is available for backward compatibility and plan stability”.
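
Checking and changing the parameter itself is trivial – it’s the testing around the change that takes the time. A sketch (assuming an spfile):

    -- what is the current setting ?
    select value from v$parameter where name = 'optimizer_mode';

    -- change it for the instance (and the spfile) ...
    alter system set optimizer_mode = all_rows scope = both;

    -- ... or just override it in a test session first
    alter session set optimizer_mode = all_rows;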

Backward compatibility is important, and until you enable system statistics, clean up various other optimizer related parameters, sort out your statistics collection strategy, and clean out a load of redundant hints and SQL hacks, you may find that changing the optimizer_mode introduces as many problems as it solves – and that’s why this is a change that needs a lot of testing.

19 Comments »

  1. Very interesting post Jonathan – really made me stop and think. The fact that ALL_ROWS is the default (since at least 10.2) lends credence to your argument. I think the comment on the amount of “testing” that ALL_ROWS gets vs. any other option is also important. The middle of the pack is a much safer place to be than on the fringe.

    Comment by Kerry Osborne — November 12, 2008 @ 5:32 pm BST Nov 12,2008 | Reply

  2. Jonathan, allow me to disagree. Most of the web-based reporting systems I run across don’t use a ‘where rownum <= n’ predicate at all. They perform a select, display the first n rows, and start ‘walking’ (i.e. continue fetching) the result set when the user hits the next page button.

    Using a ‘rownum <= n’ predicate would force the application to re-execute the query, with rownum <= 2*n, when the next-page button is hit. It would fetch and throw away the first n rows to get to the next page in the result set. The next ‘next-page’ hit would cause a rownum <= 3*n predicate, throw away 2*n rows, and so on.

    I would say FIRST_ROWS_N is especially useful for web-based reporting systems. Not using a ‘rownum <= n’ predicate avoids many unneeded re-executions of queries, combined with increasing amounts of throw-away fetching. I could be wrong, of course, e.g. because I’m unaware of an efficient skip-fetch feature?

    Comment by Carel-Jan — November 14, 2008 @ 1:45 pm BST Nov 14,2008 | Reply

  3. Carel-Jan,

    you might want to take a look at the pagination technique described by Tom Kyte: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:127412348064

    Comment by Martin Decker — November 16, 2008 @ 5:58 pm BST Nov 16,2008 | Reply

  4. Carel-Jan,
    You’re allowed to disagree. There are at least three main versions for the web-based report:

    a) Fetch all the data into the client in a single call, but let the client display it a page at a time.

    b) Fetch all the data into the mid-tier in a single call, and let the client fetch the data from the mid-tier in pages

    c) Allow the client to fetch the data from the database in pages.

    The option I’ve seen most frequently is (c), but that doesn’t stop you seeing (a) or (b) more frequently.

    There is no generic “skip fetch” option – but Martin Decker has given you a link to Tom Kyte and I’ve published a couple of notes on mechanisms that can limit the damage.

    The strategy assumes that the user won’t want to page through all the data – so the trade-off is between: several small queries (with increasing throwaway) and stopping early, compared to acquiring and moving all the data to the client.

    Of course if you’re expecting the client to fetch the entire data set in a single call then it makes more sense to fetch the entire report in one go. But in that case I’d expect to run all_rows – because you want the database to get all the data with the least resource usage.

    Comment by Jonathan Lewis — November 17, 2008 @ 10:14 pm BST Nov 17,2008 | Reply

  5. There is also a good description of possible pagination techniques on this page: http://www.inf.unideb.hu/~gabora/pagination/

    Comment by Herald — November 20, 2008 @ 8:48 pm BST Nov 20,2008 | Reply

  6. Jonathan, you mentioned that

    “In 10g when the optimizer sees predicates like ‘rownum <= 35’ it will behave as if you had included the hint /*+ first_rows(35) */ in the query.”

    Wouldn’t the number 35 in the predicate usually be a bind variable? In that case, how would Oracle know to use the hint /*+ first_rows(35) */?

    Comment by peter — January 29, 2009 @ 5:58 pm BST Jan 29,2009 | Reply

    • Peter,

      Bind variable – highly likely, I should think – but with bind variable peeking this means the assumed hint would always be for the notional “page 1” – biasing Oracle to the most efficient strategy for a relatively small number of rows.

      (There is a hidden parameter that seems to be relevant: _optimizer_rownum_bind_default which defaults to 10, but a quick test shows that when the optimizer can peek it uses the peeked value).

      Comment by Jonathan Lewis — January 30, 2009 @ 6:26 pm BST Jan 30,2009 | Reply

  7. [...] Blog from Jonathan Lewis about First Rows vs. All Rows for the Oracle Optimizer [...]

    Pingback by Three Unrelated But Interesting Blogs « Seven Seconds — July 7, 2009 @ 3:55 am BST Jul 7,2009 | Reply

  8. Jonathan,

    One application that makes extensive use of this is Siebel. You made some comments here (http://www.freelists.org/post/oracle-l/ORDER-BY-and-first-rows-10-madness,4) on how the optimizer behaves in this mode.

    Indeed, the problem that the original poster in that thread had was quite common when running Siebel on Oracle 9i, but I haven’t seen it recently (corresponding to 10g installs). That is, except of course in one isolated instance that came to my attention yesterday – hence why I thought to post a note here… The basic problem is that the optimizer chooses an index that covers the sort rather than a selective filter.

    Certainly, this could be influenced by statistics. But what we seem to have is a case of the same query and generally the same data pattern (it’s a package…) driving to a different (better) plan in 10g than in 9i. I’m wondering if there was an underlying improvement in the algorithm for this feature from 9i to 10g. Are you aware of any such change?

    Comment by Joe Coffey — August 4, 2009 @ 3:12 pm BST Aug 4,2009 | Reply

    • Joe,

      I don’t have any definite information about this but there is a new parameter in 10g called “_first_k_rows_dynamic_proration” which has the description: “enable the use of dynamic proration of join cardinalities”. That sounds as if it might be relevant.

      I know that Randolf Geist has been doing some work on first_rows_k in the last couple of weeks – I don’t think he’s published anything about it, but it might be worth keeping an eye on his blog.

      Comment by Jonathan Lewis — August 6, 2009 @ 5:15 pm BST Aug 6,2009 | Reply

      • Thanks for the note and I will keep an eye on Randolf’s blog as well. Since this problem is
        1. Usually limited to a few queries per installation and easily remedied by a stored outline
        and
        2. Seems to not re-occur often (only once in my sphere) since Siebel customers started using 10g.

        I suspect that this will remain one of those where resolution < root cause solution.

        If I do have opportunity to expose more on it, I'll post back here, but it does look like something related in the optimizer changed for the better between 9i and 10g.

        Comment by Joe Coffey — August 10, 2009 @ 1:43 pm BST Aug 10,2009 | Reply

    • Hi Joe,
      here is a recent OTN thread which looks very similar to your problem.

      Comment by Timur Akhmadeev — August 7, 2009 @ 7:34 am BST Aug 7,2009 | Reply

  9. [...] Execution plans, Hints — Jonathan Lewis @ 7:21 pm UTC Aug 7,2009 A few months ago, I wrote a note about setting the optimizer_mode to one of the first_rows_N values (first_rows_1, first_rows_10, first_rows_100, or [...]

    Pingback by Rownum effects « Oracle Scratchpad — September 21, 2009 @ 11:11 am BST Sep 21,2009 | Reply

  10. There are indeed some strange effects when the cost gets adjusted to first_k_rows.
    The optimizer went for highly unselective indexes first.
    Look for the token “First K Rows: ” in the 10053 trace.
    It seems to be related to histograms on these unselective columns, though in exactly what way is not entirely clear.
    The issue seemed to go away after I got rid of the histograms, but not immediately – it could take some hours even though I flushed the shared pool.
    Note that you can set an ALL_ROWS hint in Siebel at the Business Component level.

    Comment by Lothar — November 26, 2009 @ 11:17 am BST Nov 26,2009 | Reply

  11. [...] effects and first_rows_n, and an example of the problem in this item on the OTN database forum. Leave a [...]

    Pingback by first_rows_N again « Oracle Scratchpad — January 10, 2010 @ 7:03 pm BST Jan 10,2010 | Reply

  12. [...] the internet from time to time along the lines of “first_rows_N is for OLTP systems” (that’s not really correct, by the way), it’s probably best to stick with all_rows unless you can come up with a very [...]

    Pingback by First_rows hash « Oracle Scratchpad — March 12, 2012 @ 1:51 am BST Mar 12,2012 | Reply

  13. Is there any default value of n in first_rows(N) – say 100 or 1000?
    When I am executing my query with first_rows(1000) and first_rows(1) and even with first_rows(200), it does not make any difference.

    Comment by Peeush — August 14, 2013 @ 1:05 pm BST Aug 14,2013 | Reply

