Oracle Scratchpad

December 15, 2006

ORA-01722: upgrade error

Filed under: CBO,Hints,Oracle,Statistics,System Stats,Troubleshooting — Jonathan Lewis @ 10:33 pm GMT Dec 15,2006

I received an email recently from someone who had just upgraded the Oracle 11i Business Suite (11.5.9) from 9.2.0.6 to 10.2.0.2.  After the upgrade, the following SQL statement (shown here with its original format – not according to my standards) started failing with Oracle error: ORA-01722: invalid number.

select fr.APPLICATION_ID, fr.responsibility_id
from fnd_request_group_units frgu
,fnd_responsibility fr
,fnd_user_resp_groups furg
,fnd_concurrent_programs fcp
where frgu.APPLICATION_ID = fcp.APPLICATION_ID
and   frgu.REQUEST_UNIT_ID = fcp.CONCURRENT_PROGRAM_ID
and   fcp.CONCURRENT_PROGRAM_NAME = 'OT044560'
and   furg.USER_ID = 3821
and   fr.APPLICATION_ID = frgu.APPLICATION_ID
and   fr.REQUEST_GROUP_ID = frgu.REQUEST_GROUP_ID
and   furg.RESPONSIBILITY_APPLICATION_ID = fr.APPLICATION_ID
and   furg.RESPONSIBILITY_ID = fr.RESPONSIBILITY_ID
and   exists (select null
from fnd_profile_options fpo
,fnd_profile_option_values fpov
where 1=1
and   fpo.PROFILE_OPTION_ID = fpov.PROFILE_OPTION_ID
and   fpov.LEVEL_ID = 10003 -- responsibility level
and   fpov.LEVEL_VALUE = fr.responsibility_id
and   fpov.LEVEL_VALUE_APPLICATION_ID = fr.application_id
and   fpov.PROFILE_OPTION_VALUE = 2
and   fpo.PROFILE_OPTION_NAME = 'ORG_ID')
and rownum = 1
;

The DBA found that if he changed the predicate fpov.PROFILE_OPTION_VALUE = 2 to read fpov.PROFILE_OPTION_VALUE = ‘2’ (note the quote marks) the query stopped crashing. So what was going wrong ?

The answer is simple – 10g enables CPU costing (also known as system statistics) automatically. In 9i you have to enable the feature explicitly.

One of the benefits of system statistics is that it allows the optimizer to alter the predicate order to minimise CPU usage. Unfortunately this means that some SQL that used to work by accident in earlier versions of Oracle will stop working when you upgrade to 10g.

In this example the column  fpov.PROFILE_OPTION_VALUE  happens to be a character column which holds some values that look like numbers; but the predicate fpov.LEVEL_ID = 10003 eliminates all rows where the value doesn’t look like a number, so when Oracle does the implicit to_number() conversion on the remaining rows to compare them with the value 2, nothing breaks.

Because the cost based optimizer used to examine filter predicates in the order they appear in the where clause everything used to work nicely – until you enabled CPU costing and the optimizer decided to re-arrange the predicate order – in this case reversing the order of those two predicates. Suddenly the to_number() conversion takes place on a character value that hasn’t yet been filtered out … bad luck. In this case, you would see the to_number() conversion made visible, and the order in which the predicates were actually applied if you generated and reported the full execution plan using dbms_xplan.display().

A point that I alway stress in my tutorial on explain plan, and which I also made in a recent blog entry: ” make sure you check the predicates section of your execution plan”. It’s amazing the number of problems it explains.

There are two possible workarounds to this issue – neither desirable.

The first is to add the /*+ ordered_predicates */ hint to the query. Contrary to a note that used to appear in the manuals – and in some of the publications that simply copied the manuals – the hint goes in the normal place, not in the where clause. Unfortunately this hint has been deprecated in 10g, so you shouldn’t use it.

The second option is to disable CPU costing, which you can do by setting the hidden parameter _optimizer_cost_model to the value io. Of course, since it’s a hidden parameter, you shouldn’t do this either.

A compromise (until you fix the data model (!)) might be to use the new /*+ opt_param */ hint to disable CPU costing for the duration of query. The syntax is:

        /*+ opt_param(parameter, value) */
        /*+ opt_param('_optimizer_cost_model','io') */

Even then, you should exercise caution, as this hint is not yet documented. [Update: The hint was finally documented in the 11g manuals, with a limited list of “legal” parameters that could be set]

[Update August 2009: another option – documented in Metalink Bug number 8554306, dated May 2009,  is to set event 10158 at level 1. The example in the note uses the alter session command, but you might want to try setting the value in your parameter file. The oraus.msg file lists this this event  with the comment: 10158, 00000, “CBO preserve predicate order in post-filters”. I would only use this as a temporary workaround while correcting the guilty code.]

 

19 Comments »

  1. There are two possible workarounds to this issue – neither desirable.

    … or, if you’re going to touch the code anyway, just take out the implicit conversion?

    I’ve lost count of the number of times that relying on implicit conversions has caused tricky problems.

    Comment by Doug Burns — December 17, 2006 @ 1:06 am GMT Dec 17,2006 | Reply

  2. Another common assumption valid in 9i – GROUP BY returns rows in the same order. This was a side effect that nobody should have relied on but surely some people did. Now that 10g introduced hash group by operation in addition to sort group by, this doesn’t stand valid anymore.

    Comment by Alex Gorbachev — December 17, 2006 @ 2:51 am GMT Dec 17,2006 | Reply

  3. Doug, true – and that’s what the original poster did with his col = ‘2’ change. In a more general case, it would be a good idea to put in the explicit coercion to_number(col), although this doesn’t solve the problem it merely makes the threat more obvious.

    Your comment did prompt me to realise there was a third solution if you can change the code. Create a deterministic pl/sql function that takes a character input and returns a null if to_number() can’t convert it to a number, otherwise returning the number. This would change the selectivity – in this case the optimizer would switch to using the fixed 1% for a pl/sql function instead of the selectivity of the underlying column – and it might change the CPU usage for the query, it could even change the logic of the query in some cases, I suppose.

    Alex, I was planning to write a short note about that one too – but you’ve saved me the trouble. One important detail, though: the assumption was never valid – I can show you an example that will give you ordering that isn’t even deterministic – even on version 7 – if you rely on the group by

    Comment by Jonathan Lewis — December 17, 2006 @ 11:27 am GMT Dec 17,2006 | Reply

  4. Alex, Jonathan, there is a quite long thread on the subject on AskTom (“Does a Group By guarantee that output data is sorted?”).

    Comment by Michel Cadot — December 17, 2006 @ 8:00 pm GMT Dec 17,2006 | Reply

  5. I’m just curious as to how you discovered optimizer parameter hint. eg:

    /*+ opt_param(parameter, value) */
    /*+ opt_param(‘_optimizer_cost_model’,’io’) */

    I’m presuming that you discovered this whilst tracing some other Oracle component, and then fiddled with it to confirm it’s function.

    Best Regards,

    Comment by Mathew Butler — December 18, 2006 @ 12:04 pm GMT Dec 18,2006 | Reply

  6. Matthew, have a look at Metalink note 377333.1

    Regards

    Comment by Michel Cadot — December 18, 2006 @ 3:30 pm GMT Dec 18,2006 | Reply

  7. Thanks Michel.

    Skimming metalink this hint will be documented in 11g under doc bug 5160959.

    I’m still interested to find out how Jonathan uncovered this one.

    Regards,

    Comment by Mathew Butler — December 18, 2006 @ 6:44 pm GMT Dec 18,2006 | Reply

  8. Mathew, I can’t really remember, it was a long time ago; but if you generate a 10053 trace in 10.2 you get ‘outlines’ in the tail-end of the trace, looking like this:

      /*+
        BEGIN_OUTLINE_DATA
          IGNORE_OPTIM_EMBEDDED_HINTS
          OPTIMIZER_FEATURES_ENABLE(‘10.2.0.1’)
          ALL_ROWS
          OUTLINE_LEAF(@”SEL$1″)
          INDEX_FFS(@”SEL$1″ “GRANDPARENT”@”SEL$1” (“GRANDPARENT”.”ID”))
        END_OUTLINE_DATA
      */

    I think I did a 10053 one day (possibly during a call to an Oracle package) which happened to dump the opt_param() hint in this bit of the trace.

    Comment by Jonathan Lewis — December 18, 2006 @ 7:28 pm GMT Dec 18,2006 | Reply

  9. Michel, thanks for the reference; it’s nice to know that it’s now official.

    Comment by Jonathan Lewis — December 18, 2006 @ 7:35 pm GMT Dec 18,2006 | Reply

  10. can you give me the example to produce test case that show group by is not doing ordering in the resultset it returned in Oracle 8.1.7

    Anyway, Thanks.

    Comment by Ronald — January 18, 2007 @ 11:23 am GMT Jan 18,2007 | Reply

  11. Ronald,

    It’ s easy to build such an exemple (see below) but the real question is: Why do you need it?
    A group by (and any other clauses but “order by”) did never guaranteed any order.

    create table t (grp integer, val number)
    partition by range (grp)
    (partition values less than (1),
    partition values less than (2),
    partition values less than (3),
    partition values less than (4),
    partition values less than (5),
    partition values less than (maxvalue)
    )
    parallel 3
    /
    insert into t
    select mod(rownum,5), object_id
    from dba_objects
    /
    commit;
    begin
    dbms_stats.gather_table_stats
    (user,’T’,method_opt=>’FOR COLUMNS SIZE 5 GRP’);
    end;
    /
    SQL> select grp, count(*) from t group by grp;

    GRP COUNT(*)
    ———- ———-
    0 1666
    1 1667
    3 1667
    4 1667
    2 1667

    SQL> select * from v$version;

    BANNER
    —————————————————————-
    Oracle8i Enterprise Edition Release 8.1.7.4.0 – Production

    Regards

    Comment by Michel Cadot — January 18, 2007 @ 12:40 pm GMT Jan 18,2007 | Reply

  12. Michel, just curious, and thanks

    Comment by Ronald — January 20, 2007 @ 1:43 pm GMT Jan 20,2007 | Reply

  13. The explanation is really useful.

    Comment by Abhishek PK — March 9, 2007 @ 1:12 pm GMT Mar 9,2007 | Reply

  14. Jonathan,

    Your articles are lucid and to the point. I got your book cost based fundamentals from USA through one of my friends as it is not available in India.

    Regards
    Satish

    Comment by Satish P — September 3, 2007 @ 4:20 am BST Sep 3,2007 | Reply

  15. Satish,

    Jonathan’s book is available in India.

    Rgds,
    Gokul

    Comment by Gokul — September 3, 2007 @ 5:29 pm BST Sep 3,2007 | Reply

  16. […] wrong”. There are three main reasons for this. The first is that 10g automatically enables CPU costing, and if you weren’t using it in 9i this is likely to make a difference – often an improvement […]

    Pingback by 10g Upgrade « Oracle Scratchpad — January 16, 2008 @ 8:14 am GMT Jan 16,2008 | Reply

  17. […] vom 16.12.2008: In einem Artikel von J. Lewis wird erwähnt, daß speziell ab Oracle 10g dieses Szenario verstärkt auftaucht, da das “CPU […]

    Pingback by Immer wieder ORA-01722 (invalid number) « Oraculix — December 16, 2008 @ 5:29 pm GMT Dec 16,2008 | Reply

  18. I’ve just added a note to the end of this piece about another way of dealing (temporarily) with Oracle error 01722.

    Comment by Jonathan Lewis — August 15, 2009 @ 10:01 am BST Aug 15,2009 | Reply

  19. […] “CPU costing” of the optimizer can lead to a change in predicate order. In his blog, Jonathan Lewis discusses this behaviour and recommends to change your data model if this happens (see approach 3 […]

    Pingback by ORA-01722 (invalid number) over and over again | Oraculix (en) — July 1, 2013 @ 1:52 pm BST Jul 1,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.