Here’s a query that appeared on the MOS “SQL Performance” forum (needs a support account) with the request: “Can someone please assist me to optimize this query?”
It looks like a nice simple query (though looks can be deceiving) so I thought I'd use it as another example of how to think about tuning SQL.
SELECT
MSI.SEGMENT1, OL.SHIP_FROM_ORG_ID,
OL.ORDERED_QUANTITY SOLD_QTY,
OL.UNIT_SELLING_PRICE SELLING_PRICE
FROM
OE_ORDER_HEADERS_ALL OH,
OE_ORDER_LINES_ALL OL,
MTL_SYSTEM_ITEMS_B MSI
WHERE
OH.HEADER_ID = OL.HEADER_ID
AND OL.SHIP_FROM_ORG_ID = MSI.ORGANIZATION_ID
AND OL.INVENTORY_ITEM_ID = MSI.INVENTORY_ITEM_ID
AND NVL(oh.source_document_type_id,0) <> 10
AND SUBSTR(SEGMENT1,4,3) = 'FIF'
AND UPPER(OL.FLOW_STATUS_CODE) NOT IN ('ENTERED','CANCELLED')
I really dislike (and actually have trouble) reading blocks of text in upper case, so before I do anything else here’s the same code converted to (mostly) lower case, with one minor adjustment:
select
msi.segment1,
ol.ship_from_org_id,
ol.ordered_quantity sold_qty,
ol.unit_selling_price selling_price
from
oe_order_headers_all oh,
oe_order_lines_all ol,
mtl_system_items_b msi
where
oh.header_id = ol.header_id
and ol.ship_from_org_id = msi.organization_id
and ol.inventory_item_id = msi.inventory_item_id
and nvl(oh.source_document_type_id,0) <> 10
and substr(msi.segment1,4,3) = 'FIF'
and upper(ol.flow_status_code) not in ('ENTERED','CANCELLED')
I've highlighted line 15 to pick out the minor adjustment: I've added a table alias to the expression substr(segment1,4,3). Every column reference should include its table alias. The query was already pretty good at following this guideline, and it was fairly easy to work out the appropriate alias here because we can also see msi.segment1 in the select list; if there had been a column called segment1 in one of the other tables Oracle would have reported error "ORA-00918: column ambiguously defined".
We've been told that the query is a "sales query" against Oracle EBS R12, and we've been given an execution plan – though the plan, unfortunately, is an image from one of the non-Oracle GUIs:

Conveniently the plan tells us that the three tables in the query really are tables and not views that hide complex subqueries; it also gives us some idea of the size of two of the tables (very big). What it doesn’t give us is any information about how and where the original predicates have been used.
Things we don’t know
- What the query means (in business terms)
- How long it takes the query to run
- How long it probably should take the query to run
- How fast the OP would like the query to run
- How often the query will be run
- How many rows from each table have to participate in deriving the result
- How many rows and blocks there are in each table
- How many distinct values there are for each of the columns in the where clause.
- Whether there are any virtual columns (or extended stats) on the tables
- Whether there are any (numerically) skewed data distribution patterns
- What physical patterns there might be in the rows identified by the predicates.
- What the definitions of any available indexes are (including “function-based”)
- Whether or not we’re allowed to change the code, or whether we have to “hint” it somehow.
- Whether this is the only set of literal values that would appear or whether it’s generated code that allows many variations to appear.
- Whether this is a case where a production query using bind variables is being tested with one known set of values.
- Which version of Oracle is being used, the setting of optimizer_features_enable, and whether any optimizer parameters have been hacked
Basic threats
Every single predicate that compares a column with a literal hides the column inside a function call – which means the optimizer may be "losing" important statistical information.
Two of the three literal-based predicates are “negative”, i.e. “not equal” and “not in”, which is another detail that can mess up the optimizer’s arithmetic (though the specific effects may vary with version and the column statistics).
Observations and Guesswork
The optimizer cardinality estimate for upper(ol.flow_status_code) not in (‘ENTERED’,’CANCELLED’) is 76,240 with a tablescan cost of 989,658. That looks like a very small fraction of a very large table. But it’s possible that this is the standard estimate for “unknown value not in (list of 2 items)” which would be 5% of 5% in 19c. Certainly 76,240 * 20 * 20 = 30M sounds like a suitable number of rows for a table with a tablescan cost close to a million. Possibly a better cardinality estimate would change the plan.
The name flow_status_code suggests a column that would have a small number of distinct values with an enormous data skew. If we had extended stats, or a virtual column, on upper(flow_status_code) with a histogram in place the optimizer might pick a completely different path. If the estimate became very small it might be able to choose nested loops and an indexed access path all the way through the query; if the estimate became very large it might decide to use a different join order.
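If we wanted to test that idea, here's a sketch of one way to do it – the schema name is an assumption, and a real EBS site would normally drive stats collection through fnd_stats rather than calling dbms_stats directly – creating an extension on the expression and then gathering a histogram on the resulting hidden column:
select
        dbms_stats.create_extended_stats(
                ownname   => 'ONT',                     -- assumed owner of the table
                tabname   => 'OE_ORDER_LINES_ALL',
                extension => '(upper(flow_status_code))'
        )
from    dual
;

begin
        dbms_stats.gather_table_stats(
                ownname    => 'ONT',
                tabname    => 'OE_ORDER_LINES_ALL',
                method_opt => 'for all hidden columns size 254'
        );
end;
/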
The optimizer cardinality estimate for nvl(oh.source_document_type_id,0) <> 10 is 6.65M with a tablescan cost of 95,823. The selectivity for this predicate is derived as 1 – selectivity(nvl(oh.source_document_type_id,0)= 10), which is derived as 1 – selectivity( oh.source_document_type_id = 10).
At this point I could launch into all sorts of speculation about the column: the number of distinct values, the number of nulls, the existence (or not) of a histogram, a comparison between the blocks and cardinality of this "order headers" table and the estimates made above for the "order lines" table – but trying to cover all the options would make for a long and tangled document, so I'll just show you one example that might be a valid model for this predicate:
- t1 is a table with 100,000 rows
- data_type_id is a numeric column with 80 distinct values
- there are 1,000 rows in t1 where data_type_id is null
- I’ve gathered stats just before executing a couple of queries
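For reference, here's one way you could build a table matching that description (this is just a sample build, not necessarily the exact data set behind the figures below):
create table t1
as
select
        case
                when rownum <= 1000 then to_number(null)        -- 1,000 nulls
                when rownum <= 2793 then mod(rownum,79) + 51    -- a scattering of 79 "rare" values
                else                     50                     -- everything else gets the popular value
        end                     data_type_id,
        rpad('x',100)           padding
from
        dual
connect by
        level <= 1e5
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt => 'for all columns size 1')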
Here are the two queries, each followed by the critical line from its execution plan:
SQL> select count(*) from t1 where nvl(data_type_id,0) <> 50;
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|*  2 |   TABLE ACCESS FULL| T1   | 98763 |   289K|   553   (2)| 00:00:01 |
---------------------------------------------------------------------------
SQL> select count(*) from t1 where data_type_id = 50;
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|*  2 |   TABLE ACCESS FULL| T1   |  1238 |  3714 |   551   (1)| 00:00:01 |
---------------------------------------------------------------------------
A quick arithmetic check: add the two row estimates: 98,763 + 1,238 = 99,001. Allowing for rounding errors in the arithmetic that’s the 99,000 rows where data_type_id is not null.
Unfortunately the actual count from the first query is 2,793 – the optimizer’s estimate is out by a factor of more than 35; and the actual count from the second query is 97,207 – the optimizer’s estimate is out (in the opposite direction) by a factor of more than 78.
The statistics are “up to date” – except 50 is an extremely common value for data_type_id and I haven’t created a histogram on the column.
So perhaps the original query is asking for the "rare" order headers but there's no histogram to give the optimizer any clue that 10 (the unwanted value) is an extremely common value for source_document_type_id. Would a suitable histogram allow the optimizer to produce a much lower cardinality estimate and a significant change in the execution plan?
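If it would, the fix might be as simple as the following sketch (the schema name is an assumption, and in a real EBS system you would normally gather stats through fnd_stats rather than calling dbms_stats directly):
begin
        dbms_stats.gather_table_stats(
                ownname    => 'ONT',                    -- assumed owner of the table
                tabname    => 'OE_ORDER_HEADERS_ALL',
                method_opt => 'for columns size 254 source_document_type_id'
        );
end;
/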
It may be significant that the table order in the from clause and the predicate order in the where clause suggest that the original programmer may have been aiming for an execution plan that started at the oe_order_headers_all table.
Finally we come to the predicate substr(msi.segment1,4,3) = ‘FIF’. I don’t know much about EBS but I can guess what the table names oe_order_headers_all and oe_order_lines_all represent; on the other hand I can’t think what is really behind the name mtl_system_items_b, especially when the column names that join it to the order lines table make it look as if it should be called something more like oe_products_all; and there’s no way to guess what the significance of a little chunk of a column called segment1 might be.
The only thing we can infer from the execution plan is that it must be a fairly huge table since the optimizer has chosen to probe it through a nested loop 76,240 times, with a total incremental cost of 152,551 (= 1,244,546 – 1,091,995), rather than doing a tablescan and hash join.
The cardinality and cost estimates show a couple of interesting details. First, there's a clear error in the cardinality estimate of the nested loop join, since the number of rows produced by the join is (very slightly) larger than the number of rows supplied to it by the hash join, even though it's accessing the table by a unique scan of a unique index.
Secondly it's worth commenting on the cost of the indexed access, which might seem low at only 2 when I've claimed it's a "fairly huge" table, which would presumably require traversing an index with a blevel of 2 (root, branch, leaf) before visiting the required table block – so a cost of 4 for the table visit would seem reasonable. In fact 4 is what we'd see for a non-unique index (on this data) and a query for "id = {constant}"; the optimizer has subtracted one for the uniqueness, and one because it's a nested loop join.
If we know that there are only a very few rows where substr(segment1,4,3) = 'FIF', and if we think that this is a good starting table for the optimizer, then we need to create a virtual column (or extended stats) and might need to support that with a histogram. Even then, of course, it might not be a good table to have first in the join order.
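Purely as a sketch of what that might look like – the column and index names are my invention, the owner is an assumption, and adding even an invisible virtual column to a standard EBS table is something you should check with Oracle Support before doing in production:
alter table mtl_system_items_b
        add (
                segment1_code invisible
                generated always as (substr(segment1,4,3)) virtual
        )
;

begin
        dbms_stats.gather_table_stats(
                ownname    => 'INV',                    -- assumed owner of the table
                tabname    => 'MTL_SYSTEM_ITEMS_B',
                method_opt => 'for all hidden columns size 254'
        );
end;
/

create index msi_segment1_code_ix on mtl_system_items_b(segment1_code);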
Joins
So far we’ve only been considering ways to improve the optimizer’s cardinality estimates in the hope that better information would give it a better execution plan. We believe that all three tables are very large and suspect that if the actual volume of relevant data is small we can get a better path that uses nested loops from beginning to end. But if we want that path to be efficient we’re going to need suitable indexes, including a precision index into the first table in the join.
What we need to do now is consider the resources that might be needed to ensure we have the “perfect” indexes for the optimum path, compared with the excess resources that would be used if we could find a “good enough” path.
Looking at the query, my feeling is that there are two possible paths that might work reasonably well (assuming the required volume of data is sufficiently small):
- oe_order_headers_all -> oe_order_lines_all -> mtl_system_items_b
- oe_order_lines_all -> oe_order_headers_all -> mtl_system_items_b
I suspect (in the absence of any supplied information) that a path starting with mtl_system_items_b will do too much random I/O into the order lines table. (It's a standard example of the problem displayed by pairings like products -> order_lines and customers -> orders: the order lines for any given product are likely to be scattered widely across the table, as are the orders for any particular customer.) Starting with mtl_system_items_b might do a lot of random I/O before discarding a lot of unwanted order lines.
Looking at the oe_order_lines_all table it strikes me that only a small number of rows will be newly entered or cancelled, and most of them will be in states like "completed", "picked", "invoiced", etc. The predicate on flow_status_code is therefore probably not one that will eliminate a lot of data, so I'm just going to talk about the options for getting into the oe_order_headers_all table. Assuming the predicate "nvl(oh.source_document_type_id,0) <> 10" does identify a "small enough" number of rows then we probably have a good enough index (the foreign key index – which probably should be created with a degree of compression and will probably also be the primary key) into oe_order_lines_all, and we know we have a unique index from there into mtl_system_items_b.
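Just to be concrete about the sort of thing I mean by "the foreign key index", here's an illustrative definition – in a real EBS system a suitable index on header_id will already exist, so this is only a sketch of the shape (and compression) being described:
create index oe_order_lines_hdr_ix on
        oe_order_lines_all(header_id)
        compress 1
;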
So how do we access that small number of rows as efficiently as possible with a minimum of overhead? We've set up a histogram on source_document_type_id so that the optimizer gets a better idea of the number of rows – but that still leaves us with a tablescan of a huge table unless we create a tailored index. Here's an example of the type of approach we can take – based on a table t2 that has 50,000 rows of which almost all have a document_type of 'XX', which we want to ignore, and a few rows where the document_type is null which have to be included in the ones we keep:
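Here's one way you could build a t2 matching that description (a sample build only – the exact data set isn't important as long as the pattern matches):
create table t2
as
select
        case
                when rownum <=   100 then null                  -- a few nulls, which we need to keep
                when rownum <=   300 then 'YY' || rownum        -- a few more rows that we want to keep
                else                      'XX'                  -- the 49,700 rows we want to exclude
        end                     document_type,
        rpad('x',100)           padding
from
        dual
connect by
        level <= 50000
;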
SQL> execute dbms_stats.gather_table_stats(user,'t2',method_opt =>'for all columns size 1 for columns size 254 document_type')
SQL> select count(*) from t2 where nvl(document_type,'YY') <> 'XX';
COUNT(*)
----------
300
Execution Plan
----------------------------------------------------------
Plan hash value: 3321871023
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |   278   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T2   |   300 |  1200 |   278   (2)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(NVL("DOCUMENT_TYPE",'YY')<>'XX')
In keeping with the original supplied code I have a predicate which translates NULLs into a value that isn't the 'XX' that we want to exclude. With the histogram in place the optimizer has actually got exactly the right estimate.
So here’s a possible index definition that will allow us to create a very small index that identifies exactly those rows as efficiently as possible:
SQL> create index t2_i1 on t2 (case when nvl(document_type,'YY') <> 'XX' then 1 end);
SQL> execute dbms_stats.gather_table_stats(user,'t2',method_opt =>'for all hidden columns')
SQL> select count(*) from t2 where case when nvl(document_type,'YY') <> 'XX' then 1 end = 1;
COUNT(*)
----------
300
Execution Plan
----------------------------------------------------------
Plan hash value: 3503408237
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |     2 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |       |     1 |     2 |            |          |
|*  2 |   INDEX RANGE SCAN| T2_I1 |   300 |   600 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(CASE WHEN NVL("DOCUMENT_TYPE",'YY')<>'XX' THEN 1 END =1)
Two things to note: first, after I've created the "function-based" index I've gathered stats on "all hidden columns". This is one way of gathering stats on the system-generated, invisible, virtual column that supports the index, and is a step that is often missed until the stats are gathered overnight by the automatic stats collection job at some random point in the future. Secondly, I've had to change the query so that the predicate I use is an exact match for the index definition; this is why it's often nicer to create a virtual column for the expression and index the virtual column – and to keep 3rd party or legacy code safe it's often a good idea to declare such columns invisible.
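As a sketch of that alternative on the demo table (the column and index names here are arbitrary):
alter table t2
        add (
                wanted_flag invisible
                generated always as (case when nvl(document_type,'YY') <> 'XX' then 1 end) virtual
        )
;

create index t2_i2 on t2(wanted_flag);

execute dbms_stats.gather_table_stats(user,'t2',method_opt =>'for all hidden columns')

select count(*) from t2 where wanted_flag = 1;
The query can now reference a simple named column instead of repeating the full expression, and because the column is invisible any existing "select *" queries, or inserts that don't use an explicit column list, will carry on working unchanged.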
Finally, just to show the efficiency of this strategy, here are a couple of the interesting stats about the index:
SQL> select table_name, index_name, num_rows , leaf_blocks from user_indexes where table_name = 'T2';
TABLE_NAME                INDEX_NAME             NUM_ROWS LEAF_BLOCKS
------------------------- -------------------- ---------- -----------
T2                        T2_I1                       300           1
The table has 50,000 rows, but the index has only 300 entries (that fit in a single leaf block) as the expression is NULL for the other 49,700. As an important bonus it’s fairly safe to assume that there won’t be any code in the system that will decide to use this strangely defined index when it should be using some other index.
Conclusion
We started with a query that was “too slow”. It contained some predicates that would hide any useful statistical information from the optimizer. In two of the three cases we could give the optimizer some useful statistics by creating virtual columns or extended stats on the expressions; and in the third case the structure of the expression was a special “nvl()” case that could simply have been missing a histogram on the underlying column.
Some of the information in the execution plan gave us clues about the scale of the tables – including the table which was accessed by a unique index – but on-site DBAs wouldn't need to make guesses about some of the numbers I came up with; they could simply query the data dictionary or, for some details, query the data directly.
Reviewing table and column names, and making some assumptions about some of the data distributions (again details that could be extracted from the actual data), I picked a path that would probably be suitable if the required volume of data was relatively small, then demonstrated how we could add in an efficient index that would make this query as efficient as needed without requiring much overhead in index maintenance and without introducing the risk of other queries changing execution plans to use this new index.