In an earlier article on Cost Based Query Transformation, I showed an example of “complex view merging” where the optimizer turned an in-line aggregate view into a simple join with late aggregation.
In 11g, the optimizer can do the opposite transformation using a mechanism known as “Group by Placement”.
Here’s a simple SQL statement, with the associated execution plan from 10g:
rem
rem	Script:	group_by_11g.sql
rem	Author:	Jonathan Lewis
rem	Dated:	Feb 2008
rem

select
	t1.v1, sum(t2.n2)
from
	t1, t2
where
	t2.n1 between 10 and 90
and	t1.n1 = t2.n1
group by
	t1.v1
;

--------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |  1236 |   232K|       |   176 |
|   1 |  HASH GROUP BY      |      |  1236 |   232K|  1784K|   176 |
|*  2 |   HASH JOIN         |      |  7639 |  1439K|       |    30 |
|*  3 |    TABLE ACCESS FULL| T2   |  1236 |  9888 |       |    14 |
|*  4 |    TABLE ACCESS FULL| T1   |  1236 |   223K|       |    14 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="T2"."N1")
   3 - filter("T2"."N1">=10 AND "T2"."N1"<=90)
   4 - filter("T1"."N1">=10 AND "T1"."N1"<=90)
As you might expect, we have joined the two tables then done a large scale aggregation (using the 10g “hash group by” mechanism). But here’s what happens in 11g (11.1.0.6) for this sample data set and query:
--------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes |TempSpc| Cost  |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |  3000 |   591K|       |   142 |
|   1 |  HASH GROUP BY        |          |  3000 |   591K|  1272K|   142 |
|*  2 |   HASH JOIN           |          |  3000 |   591K|       |    44 |
|   3 |    VIEW               | VW_GBC_5 |   200 |  3400 |       |    29 |
|   4 |     HASH GROUP BY     |          |   200 |  2400 |       |    29 |
|*  5 |      TABLE ACCESS FULL| T2       |  1236 | 14832 |       |    14 |
|   6 |    TABLE ACCESS FULL  | T1       |  3000 |   541K|       |    14 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."N1"="ITEM_1")
   5 - filter("T2"."N1"<=90 AND "T2"."N1">=10)
In this case we have aggregated data from t2 before joining to t1 because the optimizer has decided the volume selected from t2 is so large that aggregating before joining would be cheaper than joining before aggregating.
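Logically the transformation corresponds to a manual rewrite that pushes the aggregation of t2 into an inline view before the join. This is only a sketch of the equivalent SQL (the internal view name VW_GBC_5 and its column ITEM_1 are Oracle-generated; the text here is just illustrative):

```sql
select
	t1.v1, sum(vw.sum_n2)
from
	(
	select
		t2.n1, sum(t2.n2) sum_n2
	from
		t2
	where
		t2.n1 between 10 and 90
	group by
		t2.n1
	) vw,
	t1
where
	t1.n1 = vw.n1
group by
	t1.v1
;
```

Note that the query still needs the final “group by t1.v1” – several t1 rows may share the same n1 value – but the early aggregation shrinks the row source that goes into the hash join from 1,236 rows to 200.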
This is a fantastic trick – I’ve often seen examples where it would obviously be nice to rewrite a query to do this sort of thing, but the performance benefit is so small that it’s not worth the effort involved in rewriting and testing the code. Now it can happen automatically.
Of course, as always, the optimizer arithmetic may be inaccurate – so Oracle has supplied hints that allow you to force the transformation when it doesn’t happen and to block the transformation when it does. For example:
prompt	===================================
prompt	Hinting to force group by placement
prompt	===================================

select
	/*+ qb_name(main) place_group_by(@main(t1@main)) */
	t1.v1, sum(t2.n2)
from
	t1, t2
where
	t2.n1 between 10 and 50
and	t1.n1 = t2.n1
group by
	t1.v1
;

prompt	===================================
prompt	Hinting to block group by placement
prompt	===================================

select
	/*+ qb_name(main) no_place_group_by(@main(t2@main)) */
	t1.v1, sum(t2.n2)
from
	t1, t2
where
	t2.n1 between 10 and 90
and	t1.n1 = t2.n1
group by
	t1.v1
;

prompt	=======================================
prompt	Hinting to double up group by placement
prompt	=======================================

select
	/*+ qb_name(main) place_group_by(@main (t1@main) (t2@main)) */
	t1.v1, sum(t2.n2)
from
	t1, t2
where
	t2.n1 between 10 and 50
and	t1.n1 = t2.n1
group by
	t1.v1
;
You’ll notice that I’ve used the qb_name() hint in these examples – this makes it easier to control the details in more complex cases.
The place_group_by and no_place_group_by hints are a little more subtle than my examples show – and I look forward to seeing them documented – so I would be a little cautious about using them. You’ll see in my example execution plan that there is a reference to an internal view called vw_gbc_5. The name relates in some way to the order of the optimizer’s processing, and I found that the proper use of the place_group_by hint included the number as a second parameter when I checked the outline information from the execution plan.
Another little detail that I haven’t yet investigated is that whenever t2 was subject to group by placement the internal view had a name of the form vw_gbc_N while the internal view relating to group by placement of t1 had a name of the form vw_gbf_N.
This is great. There’s a stench of a business intelligence enhancement all over it, and I’ve seen quite a few cases where this would be a wonderful enhancement in the joining of a fact to multiple dimension tables. For example, in joining a Sales fact table to a Geography dimension table, how much nicer to aggregate the sales metrics and defer the join to the lengthy text items (“Birmingham”, “San Francisco”, “Upper Volta”, “New Zealand”, etc.) than to include said lengthy text in the group by.
A common situation, actually the more common situation, in a BI environment is to have a query in which the projected column is at a higher level in the hierarchy than the join column, however. Something like:
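For illustration – the schema here is hypothetical, a sales fact keyed by city but reported at region level:

```sql
select
	g.region, sum(s.sales_value)
from
	sales s, geography g
where
	g.city_id = s.city_id
group by
	g.region
;
```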
In manually optimising queries like that, it is sometimes beneficial to aggregate to the level of the join column, perform the join, then aggregate a second time to the level of the report columns. In the above example, we’d optimize to something like:
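A sketch of such a rewrite, using the same hypothetical sales/geography schema, with a no_merge hint to stop the optimizer merging the inline view back into the main query:

```sql
select
	g.region, sum(agg.sales_value)
from
	(
	select	/*+ no_merge */
		city_id, sum(sales_value) sales_value
	from
		sales
	group by
		city_id
	) agg,
	geography g
where
	g.city_id = agg.city_id
group by
	g.region
;
```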
(with the required hints to prevent merging)
I wonder if the 11g optimizer can handle that? Maybe after Christmas I’ll have time to test. In theory, given a join from a fact table to multiple dimension tables, there would be quite an interesting choice of whether to perform an intermediate aggregation, and of which dimensions to defer aggregation on until after the join. Certain aggregation types (well, count(distinct …) anyway) would prevent an intermediate aggregation step, I suppose.
Comment by David Aldridge — December 21, 2008 @ 11:15 pm GMT Dec 21,2008 |
This query transformation (group-by placement) is available in 11g.
Comment by Greg Rahn — December 2, 2010 @ 12:10 am GMT Dec 2,2010 |
I’m confused – easily done I know…
You say “…optimizer has decided the volume selected from t2 is so large that aggregating before joining would be cheaper than joining before aggregating.” – but the volume selected is 1236 rows, which is tiny…am I missing something here?
Comment by Jeff Moss — December 22, 2008 @ 9:58 am GMT Dec 22,2008 |
Jeff,
No, you’re not missing anything – the estimate is only 1,236 rows at that point, but it’s still large enough (in this case, with my parameter settings) to make the optimizer choose this transformation.
For “n1 between 10 and 50” I got a smaller amount of driving data, which gave me the “hash then group” we saw from 10g – my intention with this comment was simply to indicate that the quantity of data had an effect, not that it had to be a very large quantity.
Comment by Jonathan Lewis — December 22, 2008 @ 11:29 am GMT Dec 22,2008 |
Jeff is not missing something, but I was missing that what I suggested was what the feature already does.
I need to go lie down.
Comment by David Aldridge — December 22, 2008 @ 1:56 pm GMT Dec 22,2008 |
Hi Jonathan ,
we have a similar query running in a minute and FS1_REV_TRK_TMP has 10 million rows. Could you please help me to improve the query.
Thanks in Advance
Raju
Comment by Raju Gorantla — April 23, 2010 @ 10:51 pm BST Apr 23,2010 |
Raju,
I’m going to guess that you found this blog post by searching Google for VW_GBC_5, because your question really has very little to do with the topic of the posting.
What have you done so far in your attempts to improve the query ?
As a generic starting point, you need to examine the query and understand how much work it has to do – which means being aware of how much data is there, how much you need, and how you can get at it. I suggest you start by reading this article on designing efficient SQL.
Looking at your query – you have no filters on either table, and only two tables to start from.
If you start from the D_DY table, are you expecting to collect a small amount of data with a precise access path from FS1_REV_TRK_TMP? If not, then it may be best to start from FS1_REV_TRK_TMP. But maybe you only get a few rows from FS1_REV_TRK_TMP for every row in D_DY, so perhaps you need a suitable index on FS1_REV_TRK_TMP – hinted, perhaps, if the optimizer doesn’t think much of it.
If you start from FS1_REV_TRK_TMP you have to do a large tablescan (and how much of your time is spent just on the tablescan?) – will a hash join discard most of the data, or keep most of the data? If it discards most of the data then it may be better to join before aggregating; if it keeps a large fraction of the data then the path shown may be the best strategy.
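The two strategies can be sketched roughly as follows – the column names here are invented for illustration, since the original query wasn’t reproduced in the comment:

```sql
-- Join first, aggregate later:
-- better if the join to D_DY discards most of the fact rows
select
	d.dy_key, sum(f.revenue)
from
	FS1_REV_TRK_TMP f, D_DY d
where
	d.dy_id = f.dy_id
group by
	d.dy_key
;

-- Aggregate the fact table first, then join:
-- better if the join keeps most of the rows
select
	d.dy_key, sum(agg.revenue)
from
	(
	select	dy_id, sum(revenue) revenue
	from	FS1_REV_TRK_TMP
	group by dy_id
	) agg,
	D_DY d
where
	d.dy_id = agg.dy_id
group by
	d.dy_key
;
```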
Comment by Jonathan Lewis — May 2, 2010 @ 10:59 am BST May 2,2010 |
Hi Jonathan,
HASH GROUP BY is advertised as one of the most important new features for data warehouses, especially in 11g (and in 10g as well). However, I’ve seen this feature turned off by DBAs in many production systems, because
(1) it uses excessive temp space
(2) it is not faster than SORT GROUP BY
I believe that you must have encountered this issue in earlier versions of 10g. In your extensive practice and implementation, does the patch for 10g actually fix the bug and make HASH GROUP BY run fast? I understand that the SORT algorithm was improved in 11g, but why is HASH GROUP BY still running no faster in 11g? Is there any special parameter or patch that I could have missed in 11g?
Much appreciated!
Comment by goldenorbit — November 25, 2010 @ 6:24 am GMT Nov 25,2010 |
See my reply to your other comment: https://jonathanlewis.wordpress.com/2010/11/19/quiz-night-9/#comment-37904
Although hash aggregation is supposed to have benefits over sort aggregation, I hadn’t actually seen anything suggesting it was quite as important as you suggest. I’d be interested to see a URL (from Oracle) that made this claim.
I wasn’t aware of any significant improvements in the sort algorithm in 11g – but there was a change in 10.2 which shows up in some cases with a 10032 trace reported “V2 sort” rather than “V1 sort”. The V2 sort uses less memory and less CPU than the V1 sort, so perhaps it’s been incorporated in more places in 11g.
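For reference, the “V1 sort” / “V2 sort” labels come from the 10032 (sort statistics) trace. A sketch of enabling it for a session, using the t1 table from the original post:

```sql
alter session set events '10032 trace name context forever, level 10';

select v1 from t1 order by v1;

alter session set events '10032 trace name context off';
```

The resulting trace file reports, among other sort statistics, which sort algorithm was used.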
See: https://jonathanlewis.wordpress.com/2009/10/01/_smm_isort_cap/ for a couple of details
Comment by Jonathan Lewis — November 27, 2010 @ 6:50 pm GMT Nov 27,2010 |
This is an old white paper http://www.oracle.com/technetwork/database/features/bi-datawarehousing/twp-dss-performance-10gr2-0705-128049.pdf – on page 9, the section on “HASH-BASED AGGREGATION” illustrates hash group by in 10gR2.
In a later white paper http://www.mediafire.com/?28ny3564f15plud for 11gR1, on page 22, it talks about how hash-based NPV stats collection is faster in 11g than 10g.
Comment by goldenorbit — November 29, 2010 @ 7:36 am GMT Nov 29,2010 |
goldenorbit,
Your original comment was:
“HASH GROUP BY is advertised as one of the most important new features for data warehouse especially in 11g (and in 10g as well)”
What the first whitepaper from Oracle says is:
“the major performance improvement brought by hash-based aggregation in Oracle Database 10g Release 2 has great impact.”
The statement is about 10g, not 11g, and it’s much less dramatic than you claim.
The second document is stored on a non-Oracle website that I don’t plan to download from; but from the comments you make I think you are talking about the hashing used in the “approximate NDV” mechanism used by Oracle to estimate the number of distinct values in a column without having to aggregate the column. This mechanism doesn’t use aggregation at all, and (for large data volumes) is much faster than any aggregation method – technically I think it would be described as O(n) rather than O(n*log(n)).
If you visit https://jonathanlewis.wordpress.com/2009/03/20/hotsos-2009/ there are several comments discussing the method, and most importantly a link to Amit Poddar’s blog where he has published a paper and powerpoint presentation on the mechanism.
Comment by Jonathan Lewis — November 29, 2010 @ 11:26 pm GMT Nov 29,2010 |
^_^ I did exaggerate it a little bit to catch your attention. However, the question remains: if hash-based aggregation has introduced a MAJOR performance improvement since 10gR2, why can’t we see the real gain, and furthermore why do we have to turn this feature off? Any hash-based or sort-based aggregation will use a GROUP BY operation, right?
http://www.oracle.com/technetwork/testcontent/twp-bidw-dss-perf-11gr1-129802.pdf is the official link to the second white paper, which I couldn’t find earlier under its original name.
Take a line from Amit Poddar’s PPT:
The NDV algorithm is not a use case of GROUP BY, but it also gives us some idea: compared with the Sort Group By operation, Hash Group By should have a similar advantage – “without sorting the entire data set”.
Comment by goldenorbit — November 30, 2010 @ 9:22 pm GMT Nov 30,2010 |
goldenorbit,
I have demonstrated why sort aggregation and hash aggregation could perform differently in the other comment I linked to. I have no idea why you have felt the need to turn off the hash aggregation feature, or what analysis of results you have done to come to the conclusion that you never want to see it happen.
Thank you for the link to Oracle’s website for the 11g paper. Since we don’t know exactly how Oracle does hash aggregation and we don’t know exactly how Oracle does the hash NDV, it’s a little optimistic to insist that the improvement we get from the hash NDV over the sort NDV should be a valid indicator of the improvement we might get from hash aggregation over sort aggregation.
Since you’ve seen DBAs turn this feature off in many production systems, and I haven’t, I’ve added a poll to this comment asking DBAs for an indication of their experiences of hash aggregation.
Comment by Jonathan Lewis — November 30, 2010 @ 11:21 pm GMT Nov 30,2010 |
There is no correlation that can be drawn from one-pass distinct sampling (dbms_stats.auto_sample_size in 11g) and a potential performance delta between SORT GROUP BY and HASH GROUP BY.
The only reason that comes to mind to disable hash group-by aggregation is a wrong results bug.
Comment by Greg Rahn — December 1, 2010 @ 11:43 pm GMT Dec 1,2010 |
Thanks for the reply, Greg. The reason I posted the question here was to leverage Jonathan’s popularity to get more feedback.
Since HASH GROUP BY was introduced, it has at least 2 bugs reported: wrong result 4604970 and excessive temp space usage 7716219 (don’t remember the links).
As you can see from the old twp-dss-performance-10gr2-0705-128049.pdf, Oracle did claim that hash aggregation should bring a major performance improvement in 10gR2. But even though the above bugs should have been patched by now, my DBAs still insist on disabling hash group by, and I still don’t have benchmark proof to convince them that hash group by is way faster than sort group by, so that the feature can be enabled.
See my quick test result here https://docs.google.com/document/pub?id=133amtsexvyaB06m1VCu8add8CPQ139Iqv7jGlFknjE8
There is another recent post http://orainternals.wordpress.com/2010/09/30/group-by-hash-aggregation/
Comment by goldenorbit — December 2, 2010 @ 2:09 am GMT Dec 2,2010 |
goldenorbit,
Why don’t you try reading things carefully and thinking about them, rather than looking at one line that seems to confirm your prejudices?
The white paper gives you one example of a query where the 10.2 hash aggregation operates much more efficiently than the 10.1 sort aggregation. You are insisting that this means hash aggregation must always be faster than sort aggregation in all cases and versions. Don’t forget that the paper also points out that the 10.2 simple sort algorithm is faster than the 10.1 simple sort algorithm – might this also be relevant to sort aggregation in later releases?
Your results show figures for sort and hash aggregation at degree 4, 8, and 16. At degree 8 the performance is significantly better than at degree 4 OR 16 – and the hash aggregation is 25% faster than the sort aggregation at that degree. (It’s a few percent worse at degree 4 and a few percent better at degree 16.) Shouldn’t the drop from 8 minutes to 5:30 at degree eight give you reason to wonder where the time is going? I note also that the plan suggests you are aggregating 288 million rows down to 71M – as I pointed out in my comment on the other thread where you raised your point ( https://jonathanlewis.wordpress.com/2010/11/19/quiz-night-9/#comment-37904 ), the relative performance of sort and hash aggregation is dependent on the data pattern and level of aggregation – and the relatively small reduction in volume in your example offers the hash aggregation little room for improvement over the sort aggregate.
You might point out to your DBAs that the link on orainternals (which is about 10g, not 11g) has the conclusion that the feature is too good to miss in general, so they only disable it for specific statements.
Comment by Jonathan Lewis — December 2, 2010 @ 8:22 am GMT Dec 2,2010
Hi Jonathan,
Don’t get me wrong. There are several blogs explaining how “hash group by” behaves differently from “sort group by”, but I have been trying to search and read a lot (during the past two years) to find positive benchmark proof for hash group by, and that single line in the Oracle 10gR2 white paper is the only official statement I can use so far. If I had just posted my question referring to a statement from some personal blog, you might not have paid attention to the topic, right?
There are blogs talking about the problems with “hash group by”, even with a general statement saying that “hash group by” is a good feature, but I have not yet found any dedicated article explaining if or why “hash group by” is faster and useful.
I kept wondering if I had misused “hash group by” or if there was a bug fix for it that I was not aware of. I have encountered three 20TB+ data warehouse RAC systems (2 x 10g and 1 x 11g); “hash group by” is disabled in all of them by the DBAs, and I can’t convince them to enable it.
Some people add “order by” to their query to avoid “hash group by”, some people disable the feature at session level, some people continue running the SQL without noticing any difference… I hope to leverage the popularity of your blog to ask: is there anyone out there who has strong proof that “hash group by” is really good? what patch is required?
Thank you.
Comment by goldenorbit — December 3, 2010 @ 11:50 pm GMT Dec 3,2010
goldenorbit,
If you read the train of comments you now have a sample of code you can experiment with to learn more about the pros and cons of hash aggregation.
I’m not surprised that a DBA that’s set a (hidden) parameter is scared to remove it on a production system. But you could try persuading them to test a few queries with the ‘use_hash_aggregation’ hint.
Are you sure that people who “add an order by” are doing so to disable the feature – if so then there’s no guarantee that it will work. They’d be better off (temporarily) using the ‘no_use_hash_aggregation’ hint. Most people who add an “order by” do so because they forgot that a “group by” doesn’t imply an “order by”.
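A minimal sketch of the statement-level test suggested above, using the t1 table from the original post – run the same aggregation both ways and compare:

```sql
-- force hash aggregation
select	/*+ use_hash_aggregation */
	v1, sum(n1)
from	t1
group by
	v1
;

-- force sort aggregation
select	/*+ no_use_hash_aggregation */
	v1, sum(n1)
from	t1
group by
	v1
;
```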
If you think there is a patch for hash aggregation, I suggest you get one of your doubting DBAs to raise an SR for it.
Comment by Jonathan Lewis — December 4, 2010 @ 2:57 pm GMT Dec 4,2010
[…] Following a few comments on an old posting about the hash aggregation mechanism introduced by Oracle 10g (possibly 10.2) I’ve added a poll to my last comment to see how many people have had sufficient problems with it that they’ve decided to disable the feature. If you want to add your vote (and make a comment) go to: https://jonathanlewis.wordpress.com/2008/12/21/group-by/#comment-37975 […]
Pingback by Hash Aggregation « Oracle Scratchpad — December 1, 2010 @ 3:18 pm GMT Dec 1,2010 |
Hello Jonathan
Greetings!
Orainternals is my blog :-)
Yes, you are correct, my blog entry is more about the scientific approach than about disabling “Group by hash aggregation”. It seems to me that, in some corner cases, the HASH GROUP BY feature has a bug (10.2.0.3 and 10.2.0.4) due to which memory is allocated and de-allocated excessively, leading to longer run time. Of course, this is not going to show up in sqltrace or any other conventional tools; only a pstack loop pointed to the issue. My intention was to introduce those tools to approach the problem scientifically. I remember seeing a similar bug in MOS, but that bug was specific to queries involving a FILTER step with a GROUP BY operation in a subquery branch (and I can’t find the bug anymore). My client preferred not to follow the issue up with support all the way; we resolved the issue with a profile and closed the problem.
In my blog entry, I also said “Note that, I am not saying hash group by feature is bad. Rather, we seem to have encountered an unfortunate bug which caused performance issues at this client. I think, Hash Grouping is a good feature as the efficiency of grouping operations can be improved if you have ample amount of memory. That’s the reason why we disabled this feature at the statement level, NOT at the instance level.” in line with you.
Cheers
Riyaj
Comment by orainternals — December 2, 2010 @ 2:54 pm GMT Dec 2,2010 |
[…] For reference notes for a couple of these cases: Join Factorization Group by Placement […]
Pingback by Internal Views « Oracle Scratchpad — February 27, 2011 @ 6:12 pm GMT Feb 27,2011 |
I was doing a presentation in Ireland yesterday about things you might see, or have to plan for, when upgrading to 11gR2. This was a follow-on from the previous speaker who had mentioned a problem with “group by” queries returning the wrong results after the upgrade.
He’s sent me a follow-up email with some details – and this looks like the best place to publish them.
Recently the database was upgraded to 11.2.0.1.
Since then, a query returns wrong results (zero rows).
When disabling the hidden parameter “_optimizer_group_by_placement”, the query retrieves correct results.
There are cases in 11.2.0.1 where “group by placement” goes wrong when aggregation is applied to complex views (it isn’t a general problem).
Patch 10100766 is relevant (currently only available for AIX) and this seems to address bugs 9022839 (reported as fixed in 11.2.0.2) and 10028651.
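For reference, the session-level workaround mentioned above looks like this – a sketch only, since changing hidden parameters should be done only under Oracle Support’s guidance:

```sql
alter session set "_optimizer_group_by_placement" = false;
```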
Comment by Jonathan Lewis — March 31, 2011 @ 2:31 pm BST Mar 31,2011 |
[…] subquery (which should, therefore, produce the same plan) but then unnests the subquery and uses “group by placement” to resolve the plan as a hash join followed by hash aggregation. This gives us two completely […]
Pingback by Subquery Factoring « Oracle Scratchpad — February 1, 2012 @ 5:53 pm GMT Feb 1,2012 |
[…] of hints to make sure that the optimizer doesn’t try complex view merging (with subsequent group by placement), and I’ve enabled stats collections. Here’s the execution plan, with predicates and […]
Pingback by Bloom Filter | Oracle Scratchpad — August 5, 2013 @ 9:22 pm BST Aug 5,2013 |
[…] combining aspects of the star-transformation (but without the bitmap indexes), Bloom filters, and “group-by” placement to minimise the cost of aggregation over high-volume […]
Pingback by In-memory Aggregation | Oracle Scratchpad — August 24, 2014 @ 8:05 pm BST Aug 24,2014 |
[…] I’ve discussed before: (a) be selective about using a table twice to reduce the workload, (b) aggregate early if you can reduce the scale of the […]
Pingback by Cartesian join | Oracle Scratchpad — April 15, 2015 @ 6:41 pm BST Apr 15,2015 |
[…] performance by finding a way of rewriting the query to aggregate before doing the join. (See this note from 2008). The article then went one step further to optimise a “count distinct” by wrapping a […]
Pingback by CBO++ | Oracle Scratchpad — May 25, 2016 @ 1:23 pm BST May 25,2016 |
[…] your SQL in ways that might give it more options for finding an efficient access path – for example you might have a query that joins two tables and then aggregates the resulting data but the […]
Pingback by CBO Oddities – 2 | Oracle Scratchpad — December 10, 2019 @ 2:00 pm GMT Dec 10,2019 |
[…] cost-based optimizer is already able to do a similar query transformation with group by clauses since 11g . If the optimizer is able to do the transformation for you, then tuning this in live applications […]
Pingback by Improve performance of complex Top-N queries by adding an additional order by clause | CTAndrewSayer — May 14, 2022 @ 8:18 pm BST May 14,2022 |