Here’s a little quiz that came to light on a database running 11g. It’s one of those anomalies that is likely to get noticed only in fairly extreme cases and it’s about DDL rather than DML so it shouldn’t be happening very often on anyone’s system.
Read through the following script, which simply creates a table, populates it with 4 million rows, and creates an index; then answer the question that follows:
rem
rem Script: pt_iot_oddity.sql
rem Author: Jonathan Lewis
rem Dated: March 2023
rem Purpose:
rem
rem Last tested
rem 19.11.0.0
rem 11.2.0.4
rem
create table t1 (
        key1            varchar2(20),
        secondary       varchar2(10),
        other           varchar2(10),
        constraint t1_pk primary key (key1)
)
/

insert into t1 (key1, secondary, other)
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        lpad(rownum,20),
        null,
        lpad(rownum,10)
from
        generator       v1,
        generator       v2
where
        rownum <= 4e6
/

begin
        dbms_stats.gather_table_stats(
                ownname    => null,
                tabname    => 'T1',
                method_opt => 'for all columns size 1'
        );
end;
/
create index t1_i1 on t1(secondary) parallel 4;
Take note that the index is a single-column B-tree index on a column which is NULL for every single row; and remind yourselves that completely null entries are not recorded in B-tree indexes – if you run the script you will find that the index has no entries: user_indexes.num_rows = 0. So the question is: how many rows does Oracle sort while creating the index?
To help you out, perhaps, here’s the execution plan:
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------
| 0 | CREATE INDEX STATEMENT | | 4000K| 26M| 1183 (7)| 00:00:06 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (ORDER) | :TQ10001 | 4000K| 26M| | | Q1,01 | P->S | QC (ORDER) |
| 3 | INDEX BUILD NON UNIQUE| T1_I1 | | | | | Q1,01 | PCWP | |
| 4 | SORT CREATE INDEX | | 4000K| 26M| | | Q1,01 | PCWP | |
| 5 | PX RECEIVE | | 4000K| 26M| 816 (10)| 00:00:05 | Q1,01 | PCWP | |
| 6 | PX SEND RANGE | :TQ10000 | 4000K| 26M| 816 (10)| 00:00:05 | Q1,00 | P->P | RANGE |
| 7 | PX BLOCK ITERATOR | | 4000K| 26M| 816 (10)| 00:00:05 | Q1,00 | PCWC | |
| 8 | TABLE ACCESS FULL| T1 | 4000K| 26M| 816 (10)| 00:00:05 | Q1,00 | PCWP | |
------------------------------------------------------------------------------------------------------------------
Note
-----
- estimated index size: 100M bytes
And here’s another little bit of information (cut-n-paste after creating the index) just in case you didn’t believe me when I said completely null keys don’t appear in Oracle’s B-tree indexes:
SQL> select index_name, num_rows from user_indexes where table_name = 'T1';
INDEX_NAME NUM_ROWS
-------------------- ----------
T1_PK 4059833
T1_I1 0
Special note for Richard Foote: you’re allowed to think about the question, but you’re not allowed to answer it in the comments.
Answer
The starting point for this one is this: we know that completely null entries don’t get into B-tree indexes, so where, in the plan above, does Oracle discard the nulls? There are three possible places where it might do so:
- At the table access full
- At the sort
- At the index build
I’m not counting the “PX send” as that’s more or less tied into the “table access full”; similarly, the “PX receive” is tied into the sort operation.
Ideally, of course, we’d hope that Oracle would discard the nulls as soon as possible – i.e. during the table access full. There are (at least) five ways we can check to see what actually happens:
- Looking at the SQL Monitor output
- Looking at the dbms_xplan.display_cursor() output with rowsource execution stats enabled
- Looking at the v$pq_tqstat information immediately after creating the index.
- Looking at the output from the 10046 trace
- Looking at the results of a 10032 (sort) trace
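For the last three options, here’s a sketch of the sequence you could use (the event levels shown are the common choices, not the only ones; session-level events set in the query co-ordinator’s session are propagated to the PX slaves, which is why the slave trace files appear later in the article):

```sql
rem
rem     Sketch: enable the wait-state (10046) and sort (10032) traces,
rem     build the index, then query v$pq_tqstat from the same session
rem     immediately afterwards - its content is session-local and
rem     describes only the most recent parallel statement.
rem

alter session set events '10046 trace name context forever, level 8';
alter session set events '10032 trace name context forever, level 1';

create index t1_i1 on t1(secondary) parallel 4;

select
        dfo_number, tq_id, server_type, process, num_rows, bytes
from
        v$pq_tqstat
order by
        dfo_number, tq_id, server_type desc, process
;
```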
Here’s the execution plan body from the SQL Monitor report in 11.2.0.4:
SQL Plan Monitoring Details (Plan Hash Value=2666861883)
===============================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Mem | Temp | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | (Max) | (Max) | (%) | (# samples) |
===============================================================================================================================================================================================
| 0 | CREATE INDEX STATEMENT | | | | 1 | +4 | 9 | 4 | | | | | | | | |
| 1 | PX COORDINATOR | | | | 4 | +1 | 9 | 4 | | | | | | | 18.18 | Cpu (2) |
| 2 | PX SEND QC (ORDER) | :TQ10001 | 4M | | 1 | +4 | 4 | 4 | | | | | | | | |
| 3 | INDEX BUILD NON UNIQUE | T1_I1 | | | 1 | +4 | 4 | 4 | | | | | | | | |
| 4 | SORT CREATE INDEX | | 4M | | 2 | +3 | 4 | 4M | 922 | 48MB | 243 | 48MB | 95M | 52M | 27.27 | Cpu (3) |
| 5 | PX RECEIVE | | 4M | 816 | 3 | +2 | 4 | 4M | | | | | | | 9.09 | Cpu (1) |
| 6 | PX SEND RANGE | :TQ10000 | 4M | 816 | 2 | +2 | 4 | 4M | | | | | | | 27.27 | Cpu (3) |
| 7 | PX BLOCK ITERATOR | | 4M | 816 | 1 | +3 | 8 | 4M | | | | | | | | |
| 8 | TABLE ACCESS FULL | T1 | 4M | 816 | 3 | +1 | 56 | 4M | 247 | 165MB | | | | | 18.18 | Cpu (1) |
| | | | | | | | | | | | | | | | | db file sequential read (1) |
===============================================================================================================================================================================================
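A text-mode report like the one above can be pulled shortly after the statement completes with dbms_sqltune.report_sql_monitor() – a sketch, where the sql_id substitution variable is a placeholder for the sql_id of the create index statement:

```sql
rem
rem     Sketch: fetch the SQL Monitor report for a recently completed
rem     statement. Requires the Tuning Pack licence; &m_sql_id is a
rem     placeholder to be filled in from v$sql_monitor or v$sql.
rem

set long 1000000 longchunksize 1000000 pagesize 0

select
        dbms_sqltune.report_sql_monitor(
                sql_id => '&m_sql_id',
                type   => 'TEXT'
        )
from dual
;
```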
Under “Rows (actual)” we can see operation 4 (Sort create index) supplying 4M rows to operation 3 (Index Build). If that’s not entirely convincing, here’s the output from v$pq_tqstat:
DFO_NUMBER TQ_ID SERVER_TYPE INSTANCE PROCESS NUM_ROWS BYTES ROW_SHARE DATA_SHARE WAITS TIMEOUTS AVG_LATENCY
---------- ---------- --------------- -------- --------------- ---------- ---------- --------- ---------- ---------- ---------- -----------
1 0 Ranger 1 QC 12 528 100.00 100.00 5 1 0
Producer 1 P004 1007528 14126554 25.19 25.19 10 1 0
1 P005 930555 13047324 23.26 23.26 8 1 0
1 P006 1008101 14134600 25.20 25.20 7 0 0
1 P007 1053828 14775738 26.35 26.35 5 0 0
Consumer 1 P000 1362213 19099470 34.06 34.06 3 0 0
1 P001 1155741 16204542 28.89 28.89 3 0 0
1 P002 695813 9755950 17.40 17.40 3 0 0
1 P003 786233 11023726 19.66 19.66 3 1 0
1 Producer 1 P000 1 298 25.00 25.00 0 0 0
1 P001 1 298 25.00 25.00 0 0 0
1 P002 1 298 25.00 25.00 0 0 0
1 P003 1 298 25.00 25.00 0 0 0
Consumer 1 QC 4 1192 100.00 100.00 2 0 0
Here we see the first producer (tablescan) PX processes passing 4 million rows to the first consumer processes, which then turn around to be producers and pass a single row each to the query co-ordinator. That doesn’t really tell us anything about when the consumers discarded the 4M rows, of course, but it does confirm that 4M rows passed at least some way up the tree; so here’s an extract from each of the trace files for the consumers-turned-producers showing how many rows each sorted:
grep -A+4 "\- Sort Statistics " test_p00[0-9]*165*.trc
test_p000_16583.trc:---- Sort Statistics ------------------------------
test_p000_16583.trc-Initial runs 2
test_p000_16583.trc-Number of merges 1
test_p000_16583.trc-Input records 1362213
test_p000_16583.trc-Output records 1362213
--
test_p001_16587.trc:---- Sort Statistics ------------------------------
test_p001_16587.trc-Initial runs 2
test_p001_16587.trc-Number of merges 1
test_p001_16587.trc-Input records 1155741
test_p001_16587.trc-Output records 1155741
--
test_p002_16591.trc:---- Sort Statistics ------------------------------
test_p002_16591.trc-Input records 695813
test_p002_16591.trc-Output records 695813
test_p002_16591.trc-Total number of comparisons performed 4230527
test_p002_16591.trc- Comparisons performed by in-memory sort 4230527
--
test_p003_16595.trc:---- Sort Statistics ------------------------------
test_p003_16595.trc-Input records 786233
test_p003_16595.trc-Output records 786233
test_p003_16595.trc-Total number of comparisons performed 4778408
test_p003_16595.trc- Comparisons performed by in-memory sort 4778408
Compare the Pnnn identifiers with the trace file names, and compare the count of consumed rows with the sort input/output record counts: the PX processes did sort the rows they received before going into the “index build” routine and discarding the null entries.
This is just one scenario of several – it’s a simple heap table, with a parallel create index of a non-unique index in an old version of Oracle. Would it be the same for a unique index, or for an index on a partitioned heap table (especially if the number of partitions matched the degree of parallelism)? Would the type of partitioning matter, and how would things change (if at all) for an index-organized table? And what about running all the combinations of tests on a version of Oracle that is still in support?
You may have wondered, in passing, why the script for this example was called pt_iot_oddity.sql. It’s because someone asked me if I could think of a reason why it took 15 minutes to create an empty index on a partitioned index-organized table when running parallel 8. Admittedly it was a table with 650 million rows, but the time was almost all CPU, not I/O.
Here’s one variation on the test: upgrading to 19c (19.11.0.0) and creating a unique index – I’m just going to show the v$pq_tqstat information and one of the 10032 trace file extracts:
DFO_NUMBER TQ_ID SERVER_TYPE INSTANCE PROCESS NUM_ROWS BYTES ROW_SHARE DATA_SHARE WAITS TIMEOUTS AVG_LATENCY
---------- ---------- --------------- -------- --------------- ---------- ---------- --------- ---------- ---------- ---------- -----------
1 0 Ranger 1 QC 12 9820 100.00 100.00 2 0 0
Producer 1 P004 1027485 14399730 25.67 25.67 19 4 0
1 P005 954287 13374652 23.84 23.84 19 4 0
1 P006 993201 13919984 24.82 24.82 19 4 0
1 P007 1027428 14399502 25.67 25.67 15 0 0
Consumer 1 P000 0 96 0.00 0.00 807 804 0
1 P001 4000000 56083760 100.00 100.00 808 803 0
1 P002 0 96 0.00 0.00 19740 19737 0
1 P003 0 96 0.00 0.00 19740 19737 0
1 Producer 1 P000 1 370 25.00 25.00 26 21 0
1 P001 1 370 25.00 25.00 210 94 0
1 P002 1 370 25.00 25.00 26 21 0
1 P003 1 370 25.00 25.00 28 23 0
Consumer 1 QC 4 1480 100.00 100.00 5 842 0
---- Sort Statistics ------------------------------
Initial runs 1
Input records 4000000
Output records 4000000
One thing that didn’t change – the consumer PX processes sorted the data they received; one thing that did change – only one of the consumers received data. The difference is in the change from non-unique to unique, not from 11g to 19c.
A non-unique index is “made unique” by the addition of the table’s rowid to the key value so that – amongst other things – if you have multiple rows with the same key value in a table block they are adjacent in the index. The rowid for a row pointed to by a unique index is “carried” as extra data and is not treated as if it were part of the key.
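You can see a side effect of this difference in storage on a scratch schema – a sketch, assuming the table name t2 and the 1,000-row data set are free to use: because the rowid in a non-unique index is stored as an index column with its own length byte, the non-unique index should report roughly one extra byte of used space per entry compared to the unique index on the same data.

```sql
rem
rem     Sketch (scratch schema assumed): compare entry storage for a
rem     unique and a non-unique index on the same data. index_stats
rem     holds only the results of the most recent "validate structure",
rem     so capture each result before the next analyze.
rem

create table t2 as
select rownum id, rpad('x',10) padding
from   dual
connect by level <= 1000
;

create unique index t2_u on t2(id);
analyze index t2_u validate structure;
select name, lf_rows, used_space from index_stats;

drop index t2_u;
create index t2_n on t2(id);
analyze index t2_n validate structure;
select name, lf_rows, used_space from index_stats;
```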
For parallel index creation this means that, for a unique index where every key is NULL, Oracle has to “distribute” all the key values to the same PX process. For a non-unique index Oracle tries to distribute the combination (key, rowid) evenly between the PX processes – so each PX process should get a different set of null keys corresponding to non-overlapping rowid ranges from the table.