Oracle Scratchpad

November 25, 2015

Tech 15

Filed under: Uncategorized — Jonathan Lewis @ 1:00 pm GMT Nov 25,2015

Updated 27th Nov 2015:

Thanks for all the questions so far – there are plenty of useful topics coming out. At this rate I may have to pass some of these on to the round-table that Tony Hasler.

I’ve made a couple of comments in response, but generally I plan to avoid making any comments until after the panel.

Monday 7th Dec: 11:20 – 12:05

I’ve arranged a panel session on the Cost Based Optimizer for UKOUG Tech 2015, with Maria Colgan, Nigel Bayliss, and Chris Antognini joining me to answer (and maybe argue about) questions from the audience.

To keep things moving along, we aim to have a few questions available before the day and collect a few questions at the door as well as accepting questions from the floor. Martin Widlake and Neil Chandler will be acting as MCs, wielding the microphones, and collecting questions at the door.

So far we have a handful of questions – if you have a question you’d like to raise, add it to the comments below. Even if you can’t attend we may get to your question, and I may be able to write up a summary of questions and answers after the event.

Questions so far:

  • Why do queries that have been working fine for months suddenly change to a very slow plan ?
  • Are smaller, simpler queries less likely to have a bad plan or change from a good plan to a bad one ?
  • What is the down-side of using SQL Profiles to lock in plans ?
  • What’s the best way to collect stats on a partitioned table where you add a daily / weekly / monthly partition ?
  • Is there a good way to make sure you’ve collected stats for interval partitions when you don’t know they’ve been added ?
  • Why do the manuals talk about “prefixed” local indexes (on partitioned tables) as if they have some special effect ?
  • Why does a query with literals run faster than one with bind variables? Stats are up to date on all tables concerned.
  • For a single table an SQL Plan directive will create extended statistics, but how does the Optimizer resolve SPD for the group by or join queries ?
  • I’ve heard that for the group by and join queries, SPD always forces dynamic sampling, is this true ?
  • Will SPD with dynamic sampling take precedence over e.g. columns with histograms?
  • What is the order of precedence the optimizer uses when resolving table cardinality?
  • Will 12.2 allow us to combine column groups and expression as a single extended statistic?


  1. Jonathan

    Well to start off…big fan of yours…an avid reader of your blogs and books!!!
    1) collecting stats on partioned indexes or normal indexes on regular intervals improve performance or its just a waste of rescources as compared to performance gain.index here refers b tree indexes

    2) a strategy to be incorporated to fix the memory like sga and pga …tuning of buffer cache etc while having a new server build ..

    Comment by Aastha — November 25, 2015 @ 1:16 pm GMT Nov 25,2015 | Reply

  2. Hi Jonathan,

    Many thanks for all your hard work!
    My question is related to your detailed investigation of the star query transformation and the next logical step we could do.

    Having the Optimizer calculating the selectivity/cardinality of the Star Transformation, based on the dimensions’ columns selectivity, we are facing an issue with fact tables skewed data.
    I.e. we are in a trouble by the default Optimizer behavior in a cases when the fact table data distribution, doesn’t correspond to the dimension attribute(s) ones.

    For example:
    The customer dimension has equal 50/50 distribution between Men and Women, but the particular fact table stores sales records, 99% of which belongs to women customers.
    Based on the current algorithm, the following query, most probably won’t use star transformation (as it will decide that 50% selectivity is not good enough to initiate index kind of access);

    select sum(amount)
    from fact_sales f, dim_customer d
    where = f.cust_dk
    and = ‘MALE’

    The solution: I have made a connection to another investigation of yours, namely, the faked statistics. I just decided to collect the actual fact table distribution of the males and females and update the dimension column statistics with it. Like, create a fake statistics on the sex dimension column and put the 99/1 distribution there.

    The question: Do you think this is a valid workaround to make the Optimizer “aware” of the skewed data, thus making the Star Transformation useful in a cases when it will provide the actual benefits ? If so, do you see some drawbacks of this method ?

    P.S: It would be nice to see you again at the BGOUG event (yes, a few days since you left and we already miss you :))

    Comment by Edward Hayrabedian — November 25, 2015 @ 1:55 pm GMT Nov 25,2015 | Reply

    • Edward,

      I wrote about this problem a couple of years ago – with a possible workaround:

      (That doesn’t mean it won’t go on the list of questions, though)

      Thanks for the BGOUG comment; I had a great time there.

      (EDIT: I’ve just noticed that you commented on that posting a couple of years ago.)

      Comment by Jonathan Lewis — November 25, 2015 @ 2:48 pm GMT Nov 25,2015 | Reply

      • I “discovered” my comment either, when re-reading your article (sorry, problems with my memory I guess). Maybe, I had to continue asking then and completely resolve the issue (for myself). So, I will continue now in order to avoid asking you again in the near feature :)

        My concerns about the “materialized” in-list workaround are:
        1) Usually, we don’t have any control of the generic code produced by the BI tools (what a shame for their designers). As a result we won’t be able to re-write the SQL text to get the right “IN” format.
        2) The /*+ precompute_subquery */ hint together with monster dimensions (>10M records) could make the final SQL text huge and the parsing time will depend on the number of affected fact records, which I don’t think will scale well.

        Just to demonstrate the issue, lets use the same example, but use the opposite scenario, asking for female sales (to make the IN-LIST huge) :

        The “initial” star query select:
        select sum(amount)
        from fact_sales f, dim_customer d
        where = f.cust_dk
        and = ‘FEMALE’

        The “IN” version:
        select sum(amount)
        from fact_sales f
        where f.cust_dk in (select /*+ PRECOMPUTE_SUBQUERY */ from dim_customer d where = ‘FEMALE’)

        The problem:
        The recordset to be materialized in the form of “OR … ” conditions will be huge, thus neither the parsing time nor the final sql text to be parsed will scale well.

        Am I on the right track at all :)) ?!

        Comment by Edward Hayrabedian — November 25, 2015 @ 4:29 pm GMT Nov 25,2015 | Reply

  3. “Is the default setup of Oracle and the Optimizer, in terms of adaptive/dynamic/feedback features, at odds with the requirements of stability and predicability expected by most application owners?”

    and very much related:

    “What is the least you should do / turn on / turn off in order to prioritise plan stability? And is it likely to cause fewer issues than the default behaviour?”

    Comment by Dom Brooks — November 25, 2015 @ 1:59 pm GMT Nov 25,2015 | Reply

  4. Although, I dont live in UK, yet here’s my question…hopefully ll see a post from you on it :)

    How is new cpu costing mechanism in 12c beneficial? I see huge cost difference in same 11g and 12c plan.


    Comment by Yogi — November 26, 2015 @ 2:19 am GMT Nov 26,2015 | Reply

  5. Hi Jonathan, here’s my question and I know I would need to better elaborate with a test case but a general advice would be greatly appreciated:

    What would be the best practice/approach for minimizing/reducing row chaining in tables with >=255 cols on Exadata? In the past we’ve fixed it reorganizing tables and moving null columns after column 255, a very time/resource consuming task. But recently we were challenged with the question: “can we add more columns without generating row chaining?” Thank you very much for your time.

    Comment by Irving Perez — November 26, 2015 @ 3:33 am GMT Nov 26,2015 | Reply

    • Forgot to mention that this is for avoiding execution plan changes that the optimizer does whenever it detects row chaining presence. Thanks.

      Comment by Irving Perez — November 26, 2015 @ 3:41 am GMT Nov 26,2015 | Reply

      • Irving,

        I’m curious about this. Which version of Oracle are you using, and how is the optimizer getting any stats about chained rows. In my experience the optimizer doesn’t allow an extra cost for row chaining unless you’ve managed to set the chain_cnt statistic, and you can only set that by calling the Analyze command rather than the dbms_stats package.

        Even when you use the analyze command it looks as if chaining a row inside a single block doesn’t result in the row being counted in chain_cnt, so there may be some row-migration involved as well in your case.

        Comment by Jonathan Lewis — November 27, 2015 @ 4:34 pm GMT Nov 27,2015 | Reply

  6. I won’t be able to attend, but I do have a question that has been bugging me:

    Scenario: A standard Oracle EBS program runs once per organization every day. The EBS install has about 100 odd orgs. The performance of the 100 odd requests depend of whether the first program to run has is for an org that returns a large data set or a small one. I’m pretty sure it’s because of bind variable peeking, and my question is what can we do to assist the program that got latched onto the bad plan and is now running long. Certain EBS programs can’t be cancelled and are also incompatible with themselves (no two instance of the program can run simultaneously), so this puts a lot of pressure on the month-end and year-end closures.

    PS: I wise you did a conference in India

    Comment by Jithin Sarath — November 26, 2015 @ 2:38 pm GMT Nov 26,2015 | Reply

  7. Hello Jonathan
    It would be nice if you and your peers on the panel could explain to us, mere mortals, what, in the cost-based optimiser, is the cost. I’ve read differing definitions, and have heard different opinions on its use. One such opinion is that the cost can only be used to compare different plans for the same sql_id, and cannot be trusted to compare several sql_ids. According to that tenet, for example, one cannot say that sql_id b with a cost of 200 is twice as costly as sql_id a with a cost of 100. What is your opinion? See you in Birmingham.

    Comment by Dear DBA Frank (@fdernoncourt) — November 26, 2015 @ 4:28 pm GMT Nov 26,2015 | Reply

  8. Hello Jonathan,

    My question is about hints behaviour.
    Sometimes ago I’ve hit an issue with incorrect hint blocked/destroyed effect of the following hints.

    Let’s use your example from
    and execute it with the following hints:
    1) initial hint
    use_hash(t2) full(t2) swap_join_inputs(t2)
    leading(t1 t2 t3 t4 )
    use_hash(t1) full(t1) swap_join_inputs(t1)
    use_hash(t3) full(t3) swap_join_inputs(t3)

    2) “bad” “hint” before
    use_hash(t2) full(t2) swap_join_inputs(t2)
    leading(t1 t2 t3 t4 )
    use_hash(t1) full(t1) swap_join_inputs(t1)
    use_hash(t3) full(t3) swap_join_inputs(t3)
    3) “bad” “hint” after
    use_hash(t2) full(t2) swap_join_inputs(t2)
    leading(t1 t2 t3 t4 )
    use_hash(t1) full(t1) swap_join_inputs(t1)
    use_hash(t3) full(t3) swap_join_inputs(t3)

    4) neutral “hint” before
    use_hash(t2) full(t2) swap_join_inputs(t2)
    leading(t1 t2 t3 t4 )
    use_hash(t1) full(t1) swap_join_inputs(t1)
    use_hash(t3) full(t3) swap_join_inputs(t3)

    It seems that in the case 2 “wrong hint” causes all following hints to fail.
    Real life scenario looked more realistic(just a typo in one of the hints) but after some investigations I came up with the test case.

    Can you please comment on such hints behaviour , is it documented somehow, etc.

    Thank you very much in advance.

    Comment by dmitryremizov — November 26, 2015 @ 9:33 pm GMT Nov 26,2015 | Reply

  9. Hi Jonathan,

    My question is related to ADS (Adaptive Dynamic Sampling). You and Christian pointed out that this will be automatically used in 12c.

    My question is how can we monitor the workings of ADS?
    How can we identify query behaviour changes attributed to ADS (good or bad)?
    Is there any way we can define a strategy around this so that after upgrade to 12c we are able to minimise the impact it can cause to existing queries?

    Sanjeev Labh

    Comment by Sanjeev Labh — November 27, 2015 @ 9:24 am GMT Nov 27,2015 | Reply

  10. One more question realted to this.

    What are the various ways or methods we could use to observe and understand this new feature and it’s behaviour.


    Comment by Sanjeev Labh — November 27, 2015 @ 11:47 am GMT Nov 27,2015 | Reply

  11. […] Why did I mention Jonathan Lewis? Well, on Monday at 11:20 he is doing another panel session taking questions, with Nigel Bayliss, Christian Antognini and Maria Colgan (again – we work them hard). This session is focused on the Cost Based Optimizer. We already have enough initial questions but if you are curious about the optimizer and performance, maybe ask your own question from the floor, it’s a must-see session. Jonathan talks about this session in this blog post. […]

    Pingback by A Different Type of Keynote & Jonathan Lewis Panel Session at UKOUG Tech15 | Martin Widlake's Yet Another Oracle Blog — November 27, 2015 @ 1:24 pm GMT Nov 27,2015 | Reply

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Powered by