Oracle Scratchpad

May 13, 2013

Parse Time

Filed under: Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 6:59 pm GMT May 13,2013

Dominic Brooks published a note recently about some very nasty SQL – originally thinking that it was displaying a run-time problem due to the extreme number of copies of the lnnvl() function the optimizer had produced. In fact it turned out to be a parse-time problem rather than a run-time problem, but when I first read Dominic’s note I was sufficiently surprised that I decided to try modelling the query.

Unfortunately the query had more than 1,000 predicates, (OR’ed together) and some of them included in-lists.  Clearly, writing this up by hand wasn’t going to be a good idea, so I wrote a script to generate both the data, and the query, as follows – first a table to query:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum		id1,
	rownum		id2,
	rownum		id,
	lpad(rownum,10)	v1,
	rpad('x',100)	padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5
;

create index t1_i1 on t1(id1, id2);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

Then a piece of code to write a nasty query:

set pagesize 0
set feedback off
set termout off

spool temp1.sql

prompt select * from t1 where 1 = 2

select
	'or (id1 = ' || rownum || ' and id2 = ' || (rownum + 1) || ')'
from
	t1
where
	rownum <= 750
union all
select
	'or ( id1 =  ' || (rownum + 1000) || ' and id2 in (' || rownum || ',' || (rownum+1) || '))'
from
	t1
where
	rownum <= 250
;

prompt /

spool off

Here’s an example of the text generated by the code – with the parameters set to 5 and 3 respectively (and notice how I’ve rigged the query so that it doesn’t return any data, whatever the optimizer thinks):

select * from t1 where 1 = 2
or (id1 = 1 and id2 = 2)
or (id1 = 2 and id2 = 3)
or (id1 = 3 and id2 = 4)
or (id1 = 4 and id2 = 5)
or (id1 = 5 and id2 = 6)
or ( id1 =  1001 and id2 in (1,2))
or ( id1 =  1002 and id2 in (2,3))
or ( id1 =  1003 and id2 in (3,4))
/

So here’s the plan from the above query:


---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     8 |  1008 |    16   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T1    |     1 |   126 |     3   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN           | T1_I1 |     1 |       |     2   (0)| 00:00:01 |
|   8 |   INLIST ITERATOR             |       |       |       |            |          |
|   9 |    TABLE ACCESS BY INDEX ROWID| T1    |     5 |   630 |     7   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN          | T1_I1 |     5 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID1"=1003)
       filter("ID2"=3 OR "ID2"=4)
   5 - access("ID1"=1002)
       filter((LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND
              ("ID2"=2 OR "ID2"=3))
   7 - access("ID1"=1001)
       filter((LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)) AND ("ID2"=1 OR
              "ID2"=2))
  10 - access(("ID1"=1 AND "ID2"=2 OR "ID1"=2 AND "ID2"=3 OR "ID1"=3 AND
              "ID2"=4 OR "ID1"=4 AND "ID2"=5 OR "ID1"=5 AND "ID2"=6))
       filter((LNNVL("ID1"=1001) OR LNNVL("ID2"=1) AND LNNVL("ID2"=2)) AND
              (LNNVL("ID1"=1002) OR LNNVL("ID2"=2) AND LNNVL("ID2"=3)) AND
              (LNNVL("ID1"=1003) OR LNNVL("ID2"=3) AND LNNVL("ID2"=4)))

As you can see, the first five predicates end up in line 10 of the plan with 10 repetitions (5 * 2) of the lnnvl() function. The last three predicates show up in lines 3, 5, and 7 – and the on each line we see two more lnnvl() calls than on the previous – just imagine, then, how many lnnvl() calls the optimizer will have added to the query plan by the time we have 750 occurrences in the inlist iterator (line 8) and 250 occurrences of the slightly complex predicate. Here are the relevant CPU stats (from v$session_stats) from running the generated script on 11.1.0.7, on Windows 32-bit, 2.8GHz CPU:

Name                                           Value
----                                           -----
recursive cpu usage                            1,848
CPU used when call started                     1,854
CPU used by this session                       1,854
DB time                                        1,870
parse time cpu                                 1,847
parse time elapsed                             1,862

Clearly the parse time is extreme – though not as dramatic as in Dominic’s example; but having set up the first draft of the sample code it’s easy enough to change the number of occurrences of each type of predicate, and it’s pretty easy to make longer in-lists in the more complex of the two types of predicate. It’s not too difficult to get an execution plan that mimics Dominic’s in length and time to parse.

It’s not just the parse times that are interesting when you start doing this, by the way – it’s worth playing around to see what happens. It’s probably best to run the query to pull the plans from memory if you want to see the plans, though – if you try using “explain plan” then you start using memory in the SGA for some of the work: in one of my examples I had to abort the instance after a few minutes.

5 Comments »

  1. Jonathan,

    Thanks for putting me on the right track and for taking the time to model the problem.
    It is a little reminder in not jumping to conclusions – no need when it’s so easy to find out where the time is actually spent …

    Dominic.

    Comment by Dom Brooks — May 13, 2013 @ 9:13 pm GMT May 13,2013 | Reply

  2. Sorry, an unrelated question to the topic:

    You generated the test data by using a cross join

    from
        generator   v1,
        generator   v2
    where
        rownum <= 1e5
    

    What is the reason/benefit for generating data this way (you did the same approach in other blog entries, too) and not just

    select ...
            from dual
    connect by
            level <= 1e5
    

    Comment by Tim — May 14, 2013 @ 5:49 am GMT May 14,2013 | Reply

    • Tim,

      This is a side effect of using a standard template.
      Some of my test cases include tens of millions of rows. If you use the connect by / rownum trick for very large numbers the amount of (PGA) memory and CPU consumed can be very large – doing the cross product is quicker, but, more importantly uses far less memory.

      There’s also a little bit of history involved: before the trick with dual worked I used to select (with no_merge hint) rownum from all_objects, and needed to do a cross product for large volumes.

      Comment by Jonathan Lewis — May 14, 2013 @ 4:20 pm GMT May 14,2013 | Reply

  3. Hi Jonathan,
    Thanks for this very interesting post. While i’m quite confortable with troubleshooting excessive execution time (execution plans as well as many other usefull oracle or third party tools), I don’t know exactly what to do when i’m facing excessive parsing time. With execution plans (and without oversimplifying) with basically look for the most consuming steps. But what can we do with excessive parsing time ? many times i’ve been asked by support to disable optimizer features, but that was much more in some sort of tuning compulsive disorder : “Let’s see if this feature if causing the excessive parsing time”. While i can understand this might makes sense, I’m quite a bit disapointed to see that ora-10053 does not provide any timing statistics. This would be very helpfull to know where parsing time is spent. Hopefully this is something planned for next oracle release.

    By the way thanks a lot (again) for this nice example about how to build a test case. Here it’s not only about generating data, but also generating sql statements matching a similar issue. Very instructive, as usual ….

    Comment by obtechora — May 14, 2013 @ 9:52 am GMT May 14,2013 | Reply

    • obtechora,

      There’s often not a lot you can do about long parse times – essentially you need to cut down the options that optimizer has in a way that’s “safe”.

      If you can rewrite the SQL that may help – a common trick is to break a complex statement into a couple of pieces with in-line views (perhaps placed as “with subqueries”) and /*+ no_merge() */ hints; this may allow you to give the optimizer a good idea of how to start handling the hint without trying to tell it an exact execution plan.

      Another option is to simplify the statistics – sometimes the problem is having too many histograms that require Oracle to do lots of extra work reviewing different strategies because different paths (joins particularly) may mean that Oracle has to include or exclude histograms.

      You may find that using “alter session” (or the opt_param() hint) to disable some features may help.

      Comment by Jonathan Lewis — June 9, 2013 @ 12:43 pm GMT Jun 9,2013 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 4,556 other followers