“Why did my query go parallel?”
It’s a question that crops up from time to time, usually followed by a list of reasons why it shouldn’t have gone parallel – no hints in the query, table is not declared parallel, parallel_degree_policy is set to manual etc.
When the question appeared recently on the Oracle developer forum it turned out that the table in question was declared as “parallel (degree default)”, which prompted the OP to ask the question: “is parallel = default not equivalent to parallel = 1”.
The answer to the question is that the two options are not equivalent – but that’s not the point of this note. Here’s a little script to test the claim:
drop table t1 purge;
create table t1 pctfree 90 as select * from all_objects where rownum <= 50000;
select degree, instances from user_tables where table_name = 'T1';
explain plan for select sum(object_id) from t1;
select * from table(dbms_xplan.display);
alter table t1 parallel (degree default);
select degree, instances from user_tables where table_name = 'T1';
explain plan for select sum(object_id) from t1;
select * from table(dbms_xplan.display);
I’ve created a table in the simplest possible way, but picked a fixed number of rows (to help reproducibility) and – because parallel is usually about “big” objects – I’ve left a lot of empty space (90%) in each block.
Then I’ve checked the execution plan for a very simple query that can only do a full tablescan, with the two declarations of parallelism set.
Here are the outputs of the 4 queries I’ve run:
DEGREE INSTANCES
---------------------------------------- ----------------------------------------
1 1
1 row selected.
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 1275 (2)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
| 2 | TABLE ACCESS FULL| T1 | 50000 | 244K| 1275 (2)| 00:00:01 |
---------------------------------------------------------------------------
9 rows selected.
DEGREE INSTANCES
---------------------------------------- ----------------------------------------
DEFAULT 1
1 row selected.
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3110199320
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 350 (0)| 00:00:01 | | | |
| 1 | SORT AGGREGATE | | 1 | 5 | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 5 | | | Q1,00 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 5 | | | Q1,00 | PCWP | |
| 5 | PX BLOCK ITERATOR | | 50000 | 244K| 350 (0)| 00:00:01 | Q1,00 | PCWC | |
| 6 | TABLE ACCESS FULL| T1 | 50000 | 244K| 350 (0)| 00:00:01 | Q1,00 | PCWP | |
----------------------------------------------------------------------------------------------------------------
Note
-----
- automatic DOP: Computed Degree of Parallelism is 4 because of degree limit
17 rows selected.
Clearly “parallel default” does not have the same effect as “parallel 1”. Any time you’ve got a query unexpectedly running parallel it’s possible that some table (or index on the table) has been created with a parallel degree of default. (More commonly, someone may have rebuilt an index “parallel N” to get the job done more quickly then forgotten to alter the index back to parallel 1 – or noparallel – afterwards.)
The point of this note, though, is that there are some questions you should not ask until you’ve spent a few minutes thinking about how you might create a model that gives you the answer. There are several reasons for this
- The more you do it, the better and faster you get at modelling and understanding – and sometimes you really need to model a complex problem because you’re not allowed to show anything that looks like production in public.
- If the simple model seems to disagree with the behaviour you see in production it may give you some clues about where to look in the production system for the source of the difference.
- If the answer isn’t what you thought it would be you can change the question you put publicly to: “I thought Oracle would do X but it did Y; here’s how I tested, is there a flaw in the test?”
It took about 5 minutes for me to run up this demo – that might seem a bit quick but I’ve had a lot of practice (and it took a lot longer to write the note) – and it was, in this case, a waste of my time because I knew the answer; but I often run up little models before responding to questions on the forums or listservers because while I often think I know what the answer “ought” to be I do like to check before I say something that might be incorrect.
The next 2 questions would be:
1 What determines the default degree of parallelism?
2 Can it be configured?
It turns out that a quick google search (the other obvious but under used way to find answers to questions before asking them) results in https://docs.oracle.com/database/121/VLDBG/GUID-7D86C1A1-58B2-4D66-B7EF-A5B82BADF40F.htm
Comment by Johan Snyman — September 9, 2022 @ 11:47 am BST Sep 9,2022 |
Johan,
Thanks for the comment, it raises another can of worms about asking and answering questions.
In the context of the question I started with I think the follow-ups are likely to be more concrete, e.g. Why did my query use a DOP of 64; how do I make my queries run no higher than degree 4. But there are some queries which are so open-ended that they can’t be answer easily and won’t be good targets for simple modelling (at least, not if you want to be confident that you’ve inferred the correction answer) and a reference to the manual is the only good starting point. You may get lucky with a Google search (I’m guessing that you searched: oracle parallel default) but (a) you can get lots of early hits that are bad, (b) you can easily find out of date information even if you do find an Oracle reference manual (12.1 seems to come up quite a lot if you don’t include an indication of the version you want), or (c) it actually takes 3 or 4 different sections of manuals to piece together the answer you need.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — September 10, 2022 @ 10:48 am BST Sep 10,2022 |
I was running the above test on my local database (21.3) and got this note from plan.
Note
-----
- automatic DOP: Computed Degree of Parallelism is 8 because of degree limit
17 rows selected.
is that computed dop = 8 is not due to these parameter settings? ( cpu_count * parallel_threads_per_cpu )
demo@PDB1> show parameter parallel_threads_per_cpu
NAME TYPE VALUE
------------------------------------ ----------- --------
parallel_threads_per_cpu integer 1
demo@PDB1> show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ----------- --------
cpu_count integer 8
demo@PDB1>
Comment by Rajeshwaran Jeyabal (@rajeshwaranj) — September 22, 2022 @ 7:40 am BST Sep 22,2022 |
Rajeshwaran,
Thanks for the comment. I had avoided saying anything about why my note reported degree 4 and “because of degree limit” because there are several ways in which the degree could be derived. In my case the value 4 appeared because of all the parameter settings related to parallelism and their relevance to my cpu_count (which was 4). However, if I had issued:
I would have got the same note text, but reporting the value 3 – and various other parameters could also have had an effect, some of them changing the basic text of the note.
In many systems the parallel-related parameters are left at their defaults, so the limit that gets reported usually matches the parameter cpu_count (because the parallel_threads_per_cpu is often 1, and the system is not often RAC in which the limit is also scaled up by the number of instances available)
Regards
Jonathan Lewis
Comment by Jonathan Lewis — September 23, 2022 @ 10:16 am BST Sep 23,2022
Hello Jonathan,
Interestingly, the syntax
alter table t1 parallel (degree default);
seems to not appear in the SQL Language documentation, that only shows using either NOPARALLEL or
PARALLEL [N].
I think that a little misunderstanding here might arise from the “double sense” in which the word DEFAULT is used.
One is the “common” sense used in most places, where a DEFAULT value is one that is automatically used if not specifying explicitly another value. For example, a CREATE TABLE not specifying PARALLEL uses a default parallelism degree of 1 (aka NOPARALLEL).
The second sense is used when specifying PARALLEL, but without an explicit degree, which means to use
parallelism, but with the “default degree” that Oracle calculates for the database.
I remember a situation at my work when, after migrating an 11g R2 database from a Windows server
to a Linux server, several very simple queries, that were always and very obviously run as serial in the Windows database, suddenly started to be run using PARALLEL in the Linux database, while all of the involved objects did have a degree of 1 (no parallelism ever set explicitly).
The problem could only be solved by setting (or “correcting”) one of the PARALLEL-related initialization parameters in the new environment at the database level.
Thanks a lot and, during these sad times, all my sincere condolences go out to all the British people.
Iudith Mentzel
Comment by Iudith Mentzel — September 9, 2022 @ 6:19 pm BST Sep 9,2022 |
Iudith,
That (degree default) is just me letting my fingers do the thinking – harking back to the days when you specified the degree and the instances. (degree N instances M). I can’t remember how long ago Oracle Corp. deprecated the syntax and changed the algorithm to just multiply the two numbers together and ignore the implied directive about the number of instances to use.
It’s interesting to note the difference in code paths, though. My syntax gets reported as degree = default instances = 1 while simply stating “alter table xxx parallel” gets reported as degree = default instances = default.
I think you’re right about the ambiguity in the use of the word “default”. Does it represent a hard-coded but invisible constant, or does it mean “do whatever you think is the right thing given the settings of a number of parameters and the state of the machine at the moment you work out the strategy”. The latter gets particularly complicated when parallel execution is involved.
I was trying to remember which parameter was the one that would hvae been coauses your trouble – I’m guessing it was the parallel_degree_policy or may maybe the parallel_automatic_tuning – something that says “I don’t care if it’s not declared parallel, it’s pretty big, and I can, so I will”.
I know that I looked at a handful of parallel parameter changes on one of the major release and thought the change was going to cause problems for years to come.
Regards
Jonathan Lewis
Comment by Jonathan Lewis — September 10, 2022 @ 11:20 am BST Sep 10,2022 |
[…] Parallel default (Sept 2022): a little test and explanation showing “parallel default” and “parallel 1 / noparallel” are not the same thing. […]
Pingback by Parallel Execution Catalogue | Oracle Scratchpad — September 12, 2022 @ 4:41 pm BST Sep 12,2022 |