Oracle Scratchpad

November 19, 2010

Quiz Night

Filed under: Bugs,Execution plans,Oracle,Performance,Troubleshooting — Jonathan Lewis @ 6:00 pm BST Nov 19,2010

Apart from the fact that the “Rows” figure for the FILTER operation at line 6 is blank, what’s the obvious error in this extract from an execution plan:

-------------------------------------------------------------------------
| Id  | Operation                            | Name             | Rows  |
-------------------------------------------------------------------------
        ...
|   5 |      NESTED LOOPS                    |                  |  3864 |
|   6 |       FILTER                         |                  |       |
|   7 |        HASH JOIN OUTER               |                  |  3864 |
|   8 |         HASH JOIN OUTER              |                  |   282K|
|   9 |          TABLE ACCESS BY INDEX ROWID | PRODUCT          |   282K|
|  10 |           INDEX RANGE SCAN           | PRD_SUPP_I1      |   282K|
|  11 |          VIEW                        |                  |  2293K|
|  12 |           HASH GROUP BY              |                  |  2293K|
|  13 |            PARTITION LIST SINGLE     |                  |  5790K|
|  14 |             TABLE ACCESS FULL        | PRODUCT_PRICING  |  5790K|
|  15 |         VIEW                         |                  |  2307K|
|  16 |          HASH GROUP BY               |                  |  2307K|
|  17 |           PARTITION LIST SINGLE      |                  |  5703K|
|  18 |            TABLE ACCESS FULL         | PRODUCT_PRICING  |  5703K|
        ...
-------------------------------------------------------------------------

Update 21/Nov/2010:
Once again I am reminded of two things – it’s important to be precise in your use of language if you want people to understand the question; and you can see a lot if you look carefully.

If you start to think about the activity that the plan represents, and the SQL that might have produced it, you might get some ideas about re-writing the query to be more efficient – but the point I was trying to make is that there is clearly an error in the content that the optimizer is displaying. The error suggests either that the optimizer has done the wrong arithmetic, or that the output is not a correct copy of the results produced by the optimizer.

The answer I was expecting comes from line 7. Stripping the error back to the bare minimum, we see this:

--------------------------------------------------------------------------
| Id  | Operation                             | Name             | Rows  |
--------------------------------------------------------------------------
        ...
|   7 |        HASH JOIN OUTER                |                  |  3864 |
|   8 |         rowsource 1 (HASH JOIN OUTER) |                  |   282K|
|  15 |         rowsource 2 (VIEW)            |                  |  2307K|
        ...
--------------------------------------------------------------------------

As Milo points out in comment 3, in an outer join the result set cannot have fewer rows than the “preserved” rowsource (which, in this case, is the result set from line 8). I mentioned the fact that the “Rows” figure for the FILTER operation at line 6 was blank – it’s just possible that the optimizer has overwritten the figure in line 7 with the figure that should have been in line 6; there are cases where a FILTER operation and the operation you would normally think of as its first child are combined, so it’s possible that a little storage glitch has appeared in some cases where the combination rule doesn’t apply.
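
As a minimal illustration of the rule (using a hypothetical pair of tables, nothing to do with the original system): if parent holds three rows, the outer join below must return at least three rows, however few child rows match, because unmatched parent rows survive with NULLs in the child columns.

select
        p.id, c.id
from
        parent  p,
        child   c
where
        c.parent_id (+) = p.id
;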

Someone did mention the FILTER operation and pointed out that it wasn’t filtering any data. The commonest forms of FILTER operation essentially check that some predicate is true for each row in their first child rowsource – and it is possible for someone to write code that has a filter that doesn’t eliminate any rows. In fact, though, this plan is probably saying: “line 7 will produce 282K rows, and the filter at line 6 will reduce that to 3,864.” (There’s also a comment about a “group by” not reducing the size of the rowsource – the comment was caused by a parallax error, but it is possible, of course, for Oracle to decide that a “group by” is going to produce an output with just as many rows as the input.)
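
For anyone who hasn’t seen a standalone FILTER of this type, here’s a sketch (hypothetical tables t1 and t2) of the kind of query that tends to produce one – a correlated subquery that the optimizer doesn’t unnest, so each row from the first child rowsource is tested against the subquery:

select
        t1.id
from
        t1
where   exists (
                select  /*+ no_unnest */
                        null
                from    t2
                where   t2.id = t1.id
        )
;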

Sean Molloy’s opening comment asks how you can get two different estimates from the same tablescan – and follows up with one answer: since we are looking at PARTITION LIST SINGLE, the two tablescans could be of different partitions. But it’s only a puzzle if there are no predicates on the tablescans and, as Pavol points out in comment 6, there are no “star flags” in the Id column to suggest the presence of any predicates – in fact there are no stars anywhere – yet there have to be some predicates in the plan, since you can’t do a hash join, index range scan, or filter without a predicate. As Timur points out, you don’t get the predicate section in the report from dbms_xplan.display_awr(), so you don’t get the stars.
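
For comparison – and assuming you have a suitable sql_id to hand (the sql_id below is just a placeholder) – the two calls look like this; display_cursor() reports the “Predicate Information” section (and therefore the stars), while display_awr() doesn’t, because the predicate columns aren’t populated in the AWR tables:

select * from table(dbms_xplan.display_cursor('0gm8yz1234abc', null, 'typical'));

select * from table(dbms_xplan.display_awr('0gm8yz1234abc'));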

Speaking of missing information, Dave Costa in comment 4 suggests that the user made an error in choosing which bits of the plan to copy. I did the choosing – how could it possibly be wrong! Given the number of times I’ve said “you must include the predicate section”, why isn’t it there? (Answer: it’s display_awr.) In fact lines 5 and 6 were redundant as far as the “obvious problem” was concerned – but I thought that the blank and the reappearance of the same cardinality might be a helpful visual clue.

The SQL:

Several people have commented on the rationale for code that does two outer join aggregations on the same table. It does look a little unusual, but it’s not possible to come to any conclusion about whether it’s a good thing or a bad thing without knowing the data and the intent of the SQL. For example, the intent could be something like:


select
        product_name, min(offer_price), max(offer_price) ...
from

(You can assume that in the actual code the min() and max() would be hidden inside a couple of inline views.)
In this case code which visits the product_pricing table twice might be a very good idea – because there is a special “index (min/max) range scan” optimisation that works very well if you (a) have the right indexes in place, (b) use a nested loop, and (c) ask for min or max, but not both. Perhaps our problem is that the optimizer is doing a hash join when it should be doing a nested loop.
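
To make the shape concrete, here’s a sketch of what such a query might look like (the table and column names are my invention, not the original code) – two aggregate inline views outer-joined back to the product table, which is consistent with the plan above. Given an index on (product_id, offer_price) and a nested loop join with join-predicate pushdown, each view could, in principle, benefit from the min/max optimisation:

select
        p.product_name, v1.min_price, v2.max_price
from
        product p,
        (
        select  product_id, min(offer_price) min_price
        from    product_pricing
        group by product_id
        )       v1,
        (
        select  product_id, max(offer_price) max_price
        from    product_pricing
        group by product_id
        )       v2
where
        v1.product_id (+) = p.product_id
and     v2.product_id (+) = p.product_id
;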

Another possibility is that we have something like:


select
        product_name, min(gbp_offer_price), min(usd_offer_price) ...
from

(Again you can assume the min() and max() would be hidden inside inline views, and the different column names would be derived names rather than coming from two separate columns in a table.)
Notice that the plan shows list partitioning – maybe we have one currency per partition, and we’ve written the query to maximise the benefits of partition elimination (as well as leaving the data nicely normalised, thus maximising the efficiency of maintenance).
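
Sketching the idea with some purely hypothetical DDL: with one currency per list partition, an aggregation that names a single currency can be satisfied by scanning just one partition – which is what the PARTITION LIST SINGLE lines in the plan suggest:

create table product_pricing (
        product_id      number,
        currency_code   varchar2(3),
        offer_price     number
)
partition by list (currency_code) (
        partition p_gbp values ('GBP'),
        partition p_usd values ('USD')
);

-- Each of the two aggregations names one currency, so each one
-- can be satisfied by a single-partition scan.
select  product_id, min(offer_price) gbp_offer_price
from    product_pricing
where   currency_code = 'GBP'
group by
        product_id
;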

Bottom line on the SQL – in this case I wasn’t asking people to guess what was wrong with the code; but it’s very interesting to see how many different lines of thought can come out from a starting point of a few lines extracted from a plan.

22 Comments

  1. How about two full table scans on product_pricing, one yielding 5790K rows, the other 5703K rows? (They should be the same number of rows.)

    Comment by Sean Molloy — November 19, 2010 @ 6:08 pm BST Nov 19,2010

  2. unless they are two separate partitions. Sometimes, I wish I could delete my comments!!

    Comment by Sean Molloy — November 19, 2010 @ 6:11 pm BST Nov 19,2010

  3. An outer join (line 7) that results in fewer rows than the two things being joined seems suspicious.

    Comment by Milo — November 19, 2010 @ 6:47 pm BST Nov 19,2010

    • +1 for the outer join anomaly and the missing stars. Also, there’s a filter on line 6 that isn’t expected to filter – cardinality estimates are the same at line 7 and line 5.
      Not to mention the group bys that don’t reduce cardinality either – are they grouping by a unique key?

      Comment by Flado — November 20, 2010 @ 12:09 am BST Nov 20,2010

      • group by does reduce cardinality.

        line 14 FTS 5790K rows -> line 12 HASH GROUP BY 2293K rows

        Comment by Pavol Babel — November 20, 2010 @ 7:13 pm BST Nov 20,2010

        • Ooops… damned small mobile screen!
          You are right, of course.

          Comment by Flado — November 20, 2010 @ 7:38 pm BST Nov 20,2010

  4. Taking your question to mean, what error did the user make in choosing which portion of the plan to copy: Rows 5 and 6 are essentially meaningless, since their child tree is incomplete in this extract.

    Comment by Dave Costa — November 19, 2010 @ 7:01 pm BST Nov 19,2010

  5. Why do two “partition list single” accesses to the same table? Why not iterate?

    Comment by Daniel Stolf — November 19, 2010 @ 7:25 pm BST Nov 19,2010

  6. Jonathan, why don’t we see the star flag (*) in the “Id” column (lines 7, 8, 10)? DBMS_XPLAN would show * in rows where an INDEX RANGE SCAN or HASH JOIN is used.

    Comment by Pavol Babel — November 19, 2010 @ 10:10 pm BST Nov 19,2010

    • +1 for that, although DISPLAY_AWR wouldn’t show stars (due to a bug IIRC) even in 11.2.0.1

      Comment by Timur Akhmadeev — November 19, 2010 @ 11:17 pm BST Nov 19,2010

      • You’re right Timur. I do not use DBMS_XPLAN.DISPLAY_AWR, because we do not have licenses for the DIAGNOSTICS PACK in most projects.
        The access_predicates and filter_predicates columns are always NULL in dba_hist_sql_plan in 10gR2:

        select count(*) from dba_hist_sql_plan where filter_predicates IS NOT NULL
        SQL> /

        COUNT(*)
        ----------
        0

        select count(*) from dba_hist_sql_plan
        SQL> /

        COUNT(*)
        ----------
        984564

        11gR2 stores the data, but it is still not displayed, due to the bug? :)

        Stars are missing at lines 7, 8 and 10 for sure, but could also be missing at 14, 18, etc. (another possible explanation of why we get different row counts for the two full scans). Still no explanation for Milo’s note; it really seems to be an obvious error by the CBO.

        Comment by Pavol Babel — November 20, 2010 @ 2:46 pm BST Nov 20,2010

        • I believe that the predicates were originally omitted due to a performance bug querying v$sql_plan in historical versions.
          If you look at the current AWR statement that should populate the requisite columns in wrh$_sql_plan, you’ll find that they are deliberately set to NULL.

          I did open an SR some time ago to enquire about plans to include this information at some point and was told that it wasn’t going to happen – it seems like a big hole of potentially valuable missing information to me.

          Comment by Dominic Brooks — November 21, 2010 @ 9:03 pm BST Nov 21,2010

  7. PRODUCT_PRICING appears to be Outer-Joined twice.

    Comment by Hemant K Chitale — November 21, 2010 @ 7:27 am BST Nov 21,2010

  8. Thanks for the suggestions and comments from everyone – I’ve added a few comments to the post.

    Comment by Jonathan Lewis — November 21, 2010 @ 11:11 am BST Nov 21,2010

    • Very interesting QUIZ Jonathan, thank you for that. I’ve made some tests on a 10gR2 database and FILTER always behaves as described. Is the behaviour the same in 11gR2, too? I don’t have any 11g database available to play with.

      Comment by Pavol Babel — November 21, 2010 @ 12:56 pm BST Nov 21,2010

  9. May I ask a question about HASH GROUP BY here, since the explain plan shows this choice – why is HASH GROUP BY not running faster than SORT GROUP BY in either 11g or 10g? And why does HASH GROUP BY still require a heck of a lot of temp space even with a pretty big pga_aggregate_target (let’s say 16GB) setting?

    Thanks.

    Comment by goldenorbit — November 25, 2010 @ 6:29 am BST Nov 25,2010

  10. goldenorbit,

    I’ve just created two data sets and run a ‘select distinct’ against them. The first data set was designed to aggregate down from a large volume to a small volume – the hash aggregation was 30% faster than the sort aggregation. The second data set aggregated a large data set without reducing the size very much – the sort aggregation was about 30% faster than the hash aggregation.

    The memory requirement for the hash aggregation was much larger than for the sort aggregation, and the temp space used for the hash aggregation was larger than for the sort aggregation when I made it spill to disc.

    I haven’t looked at the hash aggregation closely, but I think it is trying to reduce CPU usage by using more memory than the sort. However, when there is little aggregation taking place, the CPU cost of the larger memory allocation exceeds the CPU saving of not sorting. Moreover, on the dump to disc the sort aggregation simply has to store streams of sorted data, whereas the hash aggregation probably has to store the data plus a significant amount of hashing structure (possibly hash keys) – which means that hash aggregations that spill to disc would suffer an I/O disadvantage.

    I think someone made a comment on one of my other postings that the variable (_smm_max_size) setting the maximum memory for a single workarea operation was a 32-bit integer: which means the maximum memory you could use for a (serial) hash aggregation would be 4GB irrespective of the available PGA. (It might even be 2GB if it’s a signed 32-bit integer.)
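
    If you want to check the figures for yourself, a query like the following against v$sql_workarea (assuming you have access to the dynamic performance views) reports the memory and temp space that each aggregation workarea actually used:

        select
                sql_id,
                operation_type,
                round(last_memory_used/1048576)  mem_mb,
                round(last_tempseg_size/1048576) temp_mb
        from
                v$sql_workarea
        where
                operation_type like '%GROUP BY%'
        order by
                last_memory_used desc
        ;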

    Comment by Jonathan Lewis — November 27, 2010 @ 6:42 pm BST Nov 27,2010

    • Jonathan,
      Re your testing with two datasets where Hash Aggregation was faster in one and Sort Aggregation in the other – can you publish how you created the two datasets?

      A few months ago, I had a similar issue. A job that was processing data by Monthly Partitions had been tested against year 2010 partitions and performed well in Production as it began processing 2010 partitions. However, as it went “back in time” to the year 2009, it suddenly started taking an excruciatingly long time from November 2009. The Group By was overflowing to disk – with tempspace usage more than 3x the actual data volume. I found that data had been updated back in December 2009 and the average row length had changed. So, I suspected that the Hash Group By was performing poorly against the older data because of the “nature of the data”. However, I never did get a chance to prove my suspicion – viz. by disabling hash group by aggregation and testing each month individually.
      (A different process was implemented so that code was scrapped).

      If you can publish the way you generated the two datasets, I can run more tests.

      Hemant K Chitale

      Comment by Hemant K Chitale — December 1, 2010 @ 4:35 pm BST Dec 1,2010

      • Hemant,

        It was only a trivial bit of code I whipped up in about fifteen minutes to make the point that it was possible for either aggregation to be the faster, depending on things like the level of aggregation – it wasn’t anything designed to investigate the phenomenon in any depth:

        
        create table t1 nologging
        as
        with generator as (
        	select	--+ materialize
        		rownum 	id
        	from	all_objects 
        	where	rownum <= 3000
        )
        select
        	lpad(mod(rownum,1000),6)	small_vc_K,
        	lpad(rownum,6)			small_vc_M
        	from
        	generator	v1,
        	generator	v2
        where
        	rownum <= 1000000
        ;
        
        
        begin
        	dbms_stats.gather_table_stats(
        		ownname		 => user,
        		tabname		 =>'T1',
        		estimate_percent => null,
        		block_sample 	 => true,
        		method_opt 	 => 'for all columns size 1'
        	);
        end;
        /
        
        
        set serveroutput off
        set timing on
        
        spool hash_agg
        
        prompt	===========================
        prompt	1000 distinct values (hash)
        prompt	===========================
        
        select
        	/*+ gather_plan_statistics 1000 */
        	count(*)
        from
        	(
        	select	/*+ no_merge */
        		distinct small_vc_K
        	from
        		t1
        	)
        ;
        
        select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
        
        prompt	===========================
        prompt	1000 distinct values (sort)
        prompt	===========================
        
        select
        	/*+ gather_plan_statistics 1000 */
        	count(*)
        from
        	(
        	select	/*+ no_merge no_use_hash_aggregation */
        		distinct small_vc_K
        	from
        		t1
        	)
        ;
        
        select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
        
        
        prompt	==============================
        prompt	1000000 distinct values (hash)
        prompt	==============================
        
        select
        	/*+ gather_plan_statistics 1000000 */
        	count(*)
        from
        	(
        	select	/*+ no_merge */
        		distinct small_vc_M
        	from
        		t1
        	)
        ;
        
        select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
        
        prompt	==============================
        prompt	1000000 distinct values (sort)
        prompt	==============================
        
        select
        	/*+ gather_plan_statistics 1000000 */
        	count(*)
        from
        	(
        	select	/*+ no_merge no_use_hash_aggregation */
        		distinct small_vc_M
        	from
        		t1
        	)
        ;
        
        select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
        
        

        Comment by Jonathan Lewis — December 2, 2010 @ 8:29 am BST Dec 2,2010

  11. Different tools for different purposes… Conceptually, hash aggregation is very sensitive to the number of groups and relatively insensitive to the total volume. Sort aggregation needs to sort the entire set and doesn’t really care about the number of distinct values (groups) contained therein.
    If one tool were better in all cases, we wouldn’t have the other.
    IMHO.
    Flado

    Comment by Flado — November 27, 2010 @ 7:11 pm BST Nov 27,2010

  12. Thank you, Jonathan. A simple demonstration of the difference. I’ll scale this to larger datasets with different distinct counts and datatypes as well.

    Comment by Hemant K Chitale — December 6, 2010 @ 3:13 am BST Dec 6,2010

