Oracle Scratchpad

May 13, 2013

Parse Time

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 6:59 pm BST May 13,2013

Update Jan 2022 about memory usage

Dominic Brooks published a note recently about some very nasty SQL – originally thinking that it was displaying a run-time problem due to the extreme number of copies of the lnnvl() function the optimizer had produced. In fact it turned out to be a parse-time problem rather than a run-time problem but when I first read Dominic’s note I was sufficiently surprised that I decided to try modelling the query.

Unfortunately the query had more than 1,000 predicates (OR’ed together) and some of them included in-lists.  Clearly, writing this up by hand wasn’t going to be a good idea so I wrote a script to generate both the data, and the query, as follows – first a table to query:

rem
rem     Script:         killer_parse.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2013
rem

create table t1
as
with generator as (
        select  --+ materialize
                rownum id 
        from dual 
        connect by 
                level <= 1e4
)
select
        rownum          id1,
        rownum          id2,
        rownum          id,
        lpad(rownum,10) v1,
        rpad('x',100)   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5   -- > comment to avoid wordpress format issue
;

create index t1_i1 on t1(id1, id2);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

Then a piece of code to write a nasty query:

set pagesize 0
set feedback off
set termout off

spool temp1.sql

prompt select * from t1 where 1 = 2

select
        'or (id1 = ' || rownum || ' and id2 = ' || (rownum + 1) || ')'
from
        t1
where
        rownum <= 750 -- > comment to avoid wordpress format issue
union all
select
        'or ( id1 =  ' || (rownum + 1000) || ' and id2 in (' || rownum || ',' || (rownum+1) || '))'
from
        t1
where
        rownum <= 250 -- > comment to avoid WordPress format issue
;

prompt /

spool off

Here’s an example of the text generated by the code – with the limit values in the union all set to 5 and 3 respectively (and notice how I’ve rigged the query so that it doesn’t return any data, whatever the optimizer thinks):

select * from t1 where 1 = 2
or (id1 = 1 and id2 = 2)
or (id1 = 2 and id2 = 3)
or (id1 = 3 and id2 = 4)
or (id1 = 4 and id2 = 5)
or (id1 = 5 and id2 = 6)
or ( id1 =  1001 and id2 in (1,2))
or ( id1 =  1002 and id2 in (2,3))
or ( id1 =  1003 and id2 in (3,4))
/

So here’s the plan from the above query:


---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     8 |  1008 |    16   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   8 |   INLIST ITERATOR             |       |       |       |            |          |
|   9 |    TABLE ACCESS BY INDEX ROWID| T1    |     5 |   630 |     7   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | T1_I1 |     5 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID1"=1003)
       filter("ID2"=3 OR "ID2"=4)
   5 - access("ID1"=1002)
       filter((LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND
              ("ID2"=2 OR "ID2"=3))
   7 - access("ID1"=1001)
       filter((LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND ("ID2"=1 OR
              "ID2"=2))
  10 - access(("ID1"=1 AND "ID2"=2 OR "ID1"=2 AND "ID2"=3 OR "ID1"=3 AND
              "ID2"=4 OR "ID1"=4 AND "ID2"=5 OR "ID1"=5 AND "ID2"=6))
       filter((LNNVL("ID1"=1001) OR LNNVL("ID2"=1) AND LNNVL("ID2"=2)) AND
              (LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)))

As you can see, the first five predicates end up in line 10 of the plan with 10 repetitions (5 * 2) of the lnnvl() function. The last three predicates show up in lines 3, 5, and 7 – and the on each line we see two more lnnvl() calls than on the previous – just imagine, then, how many lnnvl() calls the optimizer will have added to the query plan by the time we have 750 occurrences in the inlist iterator (line 8) and 250 occurrences of the slightly more complex predicate. Here are the relevant CPU stats (from v$sesstat joined to v$statname) from running the generated script on 11.1.0.7, on Windows 32-bit, 2.8GHz CPU:

Name                                           Value
----                                           -----
recursive cpu usage                            1,848
CPU used when call started                     1,854
CPU used by this session                       1,854
DB time                                        1,870
parse time cpu                                 1,847
parse time elapsed                             1,862

Clearly the parse time is extreme – though not as dramatic as in Dominic’s example. Having set up the first draft of the sample code it’s easy enough to change the number of occurrences of each type of predicate, and it’s pretty easy to make longer in-lists in the more complex of the two types of predicate. It’s not really all that difficult to get an execution plan that mimics Dominic’s in length and time to parse.

It’s not just the parse times that are interesting when you start doing this, by the way – it’s worth playing around to see what happens. It’s probably best to run the query to pull the plans from memory if you want to see the plans, though – if you try using “explain plan” then your session starts using memory in the SGA for parts of the work: in one of my examples I had to abort the instance after a few minutes.

 

 

Update (Jan 2022)

I’ve revisited this blog note (and, more importantly the script to run the tests) because the “long parse time” question came up again on the Oracle-L mailing list. Since the time of the original post the optimizer has introduced cost-based OR-expansion, so I wanted to check if this made any significant difference to the tests.  Apart from changing the initial concatenation operation the upgrade made no significant difference (19.11.0.0, test with 150/250 split):

----------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                 |   550 | 62150 |   622   (1)| 00:00:01 |
|   1 |  VIEW                                  | VW_ORE_BA8ECEFB |   550 | 62150 |   622   (1)| 00:00:01 |
|   2 |   UNION-ALL                            |                 |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED | T1              |     2 |   254 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                   | T1_I1           |     1 |       |     2   (0)| 00:00:01 |

Since I was following the discussion on Oracle-L while repeating the tests I also included some checks on memory usage (being careful to start new sessions and flush the shared pool for each individual test, of course) to back up my earlier comment about the difference in memory usage between executing and using explain plan.

When testing with the 150 / 250 split in the generated SQL and picking out figures from v$sesstat and v$sgastat the following major differences between parsing through execution and parsing through explain plan appeared:


v$sesstat                          Execution test     Explain Plan test
session pga memory max                309,854,208            47,579,136


v$sgastat                          Execution test     Explain Plan test
shared pool        SQLA                40,554,888           295,652,112
shared pool        free memory        470,539,784           217,137,088

So, roughly speaking and ignoring minor variations, when using explain plan the demand for session PGA memory dropped by about 260MB, and the demand for shared pool memory to use for the SQLA content increased by about 255MB, a figure echoed in the drop in shared pool free memory of 253MB.

The moral here: trying to run the query may be a lot kinder to everyone else than trying to use explain plan to get an initial idea of the execution plan; so there’s another reasons (for long complex parses) beyond the usual “explain plan may deceive you” for running the query to get the execution plan.

 

8 Comments »

  1. Jonathan,

    Thanks for putting me on the right track and for taking the time to model the problem.
    It is a little reminder in not jumping to conclusions – no need when it’s so easy to find out where the time is actually spent …

    Dominic.

    Comment by Dom Brooks — May 13, 2013 @ 9:13 pm BST May 13,2013 | Reply

    • We had a re-occurrence of this problem today. Query was in hard parse for hours. ASH said it was IN_HARD_PARSE = Y. Also clue from ASH and V$SESSION because sql_exec_id and sql_exec_start… and you only get to execute once you’ve parsed I suppose.
      Added problem was that session would not respond to kill, probably wouldn’t until parse was done? Had to be killed from OS.

      Comment by Dom Brooks — August 7, 2017 @ 10:43 pm BST Aug 7,2017 | Reply

  2. Sorry, an unrelated question to the topic:

    You generated the test data by using a cross join

    from
        generator   v1,
        generator   v2
    where
        rownum <= 1e5
    

    What is the reason/benefit for generating data this way (you did the same approach in other blog entries, too) and not just

    select ...
            from dual
    connect by
            level <= 1e5
    

    Comment by Tim — May 14, 2013 @ 5:49 am BST May 14,2013 | Reply

    • Tim,

      This is a side effect of using a standard template.
      Some of my test cases include tens of millions of rows. If you use the connect by / rownum trick for very large numbers the amount of (PGA) memory and CPU consumed can be very large – doing the cross product is quicker, but, more importantly uses far less memory.

      There’s also a little bit of history involved: before the trick with dual worked I used to select (with no_merge hint) rownum from all_objects, and needed to do a cross product for large volumes.

      Comment by Jonathan Lewis — May 14, 2013 @ 4:20 pm BST May 14,2013 | Reply

  3. Hi Jonathan,
    Thanks for this very interesting post. While i’m quite confortable with troubleshooting excessive execution time (execution plans as well as many other usefull oracle or third party tools), I don’t know exactly what to do when i’m facing excessive parsing time. With execution plans (and without oversimplifying) with basically look for the most consuming steps. But what can we do with excessive parsing time ? many times i’ve been asked by support to disable optimizer features, but that was much more in some sort of tuning compulsive disorder : “Let’s see if this feature if causing the excessive parsing time”. While i can understand this might makes sense, I’m quite a bit disapointed to see that ora-10053 does not provide any timing statistics. This would be very helpfull to know where parsing time is spent. Hopefully this is something planned for next oracle release.

    By the way thanks a lot (again) for this nice example about how to build a test case. Here it’s not only about generating data, but also generating sql statements matching a similar issue. Very instructive, as usual ….

    Comment by obtechora — May 14, 2013 @ 9:52 am BST May 14,2013 | Reply

    • obtechora,

      There’s often not a lot you can do about long parse times – essentially you need to cut down the options the optimizer has in a way that’s “safe”.

      If you can rewrite the SQL that may help – a common trick is to break a complex statement into a couple of pieces with in-line views (perhaps placed as “with subqueries”) and /*+ no_merge */ hints; this may allow you to give the optimizer a good idea of how to start handling the query without trying to tell it an exact execution plan.

      Another option is to simplify the statistics – sometimes the problem is having too many histograms that require Oracle to do lots of extra work reviewing different strategies because different paths (joins particularly) may mean that Oracle has to include or exclude histograms.

      You may find that using “alter session” (or the opt_param() hint) to disable some features may help.

      Comment by Jonathan Lewis — June 9, 2013 @ 12:43 pm BST Jun 9,2013 | Reply

  4. […] (logical OR) is known to cause various performance problems, from extreme parse times (e.g. here) to sub-optimal plans. A common solution to such problems is getting rid of OR’s by “OR […]

    Pingback by Disjunction | Oracle Diagnostician — January 22, 2015 @ 9:11 pm GMT Jan 22,2015 | Reply

  5. […] an example from my killer_parse.sql query after setting “_fix_control”=’16923858:4′ (1e4 microseconds = 1/100th […]

    Pingback by Parse Time | Oracle Scratchpad — January 13, 2022 @ 10:49 am GMT Jan 13,2022 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.