Aliases

May 2, 2017

Aliases

Filed under: Execution plans,Oracle,Troubleshooting — Jonathan Lewis @ 9:23 am BST May 2,2017

Here’s a performance problem that came up on OTN recently. The following query (reformatted) takes “ages” to run – how do you address the problem:

SELECT
	COUNT(*) 
FROM
	smp_dbuser2.workflow_step_report
WHERE
	report_ID IN (
		SELECT	report_id
		FROM	smp_dbuser2.workflow_report
		WHERE	trunc(start_time) = '28-Apr-2017'
		AND	user_id = 'nbi_ssc'
	)
;

Various pieces of relevant information were supplied (the workflow_report table holds 1.4M rows the workflow_step_report table holds 740M rows and some indexes were described), but most significantly we were given the execution plan:

--------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                      |     1 |     6 |    10T  (1)|999:59:59 |       |       |
|   1 |  SORT AGGREGATE       |                      |     1 |     6 |            |          |       |       |
|*  2 |   FILTER              |                      |       |       |            |          |       |       |
|   3 |    PARTITION HASH ALL |                      |   731M|  4187M|  5363K  (1)| 17:52:47 |     1 |   128 |
|   4 |     TABLE ACCESS FULL | WORKFLOW_STEP_REPORT |   731M|  4187M|  5363K  (1)| 17:52:47 |     1 |   128 |
|*  5 |    FILTER             |                      |       |       |            |          |       |       |
|   6 |     PARTITION HASH ALL|                      |     2 |    38 | 14161   (1)| 00:02:50 |     1 |    32 |
|*  7 |      TABLE ACCESS FULL| WORKFLOW_REPORT      |     2 |    38 | 14161   (1)| 00:02:50 |     1 |    32 |
--------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( EXISTS (SELECT 0 FROM "SMP_DBUSER2"."WORKFLOW_REPORT" "WORKFLOW_REPORT" WHERE :B1=:B2
              AND "USER_ID"='nbi_ssc' AND TRUNC(INTERNAL_FUNCTION("START_TIME"))=TO_DATE(' 2017-04-28 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
   5 - filter(:B1=:B2)
   7 - filter("USER_ID"='nbi_ssc' AND TRUNC(INTERNAL_FUNCTION("START_TIME"))=TO_DATE(' 2017-04-28
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

You’ll notice that the optimizer has transformed the IN subquery into an EXISTS subquery – operation 2 is a FILTER operation, and you can see in the Predicate Information section that the filter() predicate at operation 2 shows the existence subquery that would be executed.

If you look carefully at the execution plan (all of it), what can you deduce from it ? What, then, should be your next step in dealing with this performance problem ?

Observations

I did emphasise the need to examine all of the execution plan – and the important feature appears not in the body of the plan but in the Predicate Information section.

The body tells us that Oracle has executed the query with a FILTER subquery approach, and we can see that the IN subquery has been transformed into an EXISTS subquery. In many cases Oracle could unnest the subquery and turn it into a join (often a semi-join) but it hasn’t in this case and we might wonder why not. Look closely at the text given for the subquery in the filter() predicate:


SELECT  0 
FROM   "SMP_DBUSER2"."WORKFLOW_REPORT" "WORKFLOW_REPORT" 
WHERE  :B1=:B2
AND    "USER_ID"='nbi_ssc' 
AND    TRUNC(INTERNAL_FUNCTION("START_TIME"))=TO_DATE(' 2017-04-28 00:00:00','syyyy-mm-dd hh24:mi:ss')

When an IN subquery is transformed into an EXISTS subquery we usually see: “outer.column in (select inner.column …)” turning into a predicate in the existence subquery of the form “inner.column = :B1”, so why does this example show “:B1 = :B2” when we’re expecting to see something like “workflow_report.report_id = :B1” ?

The (obvious, if you know your optimizer) answer is that there is no column report_id in table workflow_report but “column capture” means the optimizer has assumed that report_id in the subquery refers to workflow_step_report.report_id – hence “:B1 = :B2”. The consequence of this strange predicate is that the subquery may execute once for every row in the outer table (though scalar subquery caching may reduce the number of executions) performing a tablescan as it does so.

The correct next step is to check whether this was a simple typing error – the query as it stands is valid but not very sensible so what was it supposed to be doing.

It turned out that there was a column workflow_report.id, and that was the column that should have been selected in the subquery. (The OP also changed the trunc(start_date) to use a carefully constructed range-based clause – but that wasn’t really terribly important; and several people noted that some efficiency could be gained through suitable indexes – but that was missing the important point.)

Here’s the new query, with execution plan:


SELECT  COUNT(*) 
FROM    smp_dbuser2.workflow_step_report    wsr
WHERE   wsr.report_ID IN (
                SELECT  wr.id
                FROM    smp_dbuser2.workflow_report    wr
                WHERE   wr.start_time >= to_date( '28-Apr-2017','dd-mon-yyyy') 
                and     wr.start_time <  to_date( '28-Apr-2017','dd-mon-yyyy') + 1
                AND     wr.user_id = 'nbi_ssc'
        )
;

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name                           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                                |     1 |    31 | 22760   (1)| 00:04:34 |       |       |
|   1 |  SORT AGGREGATE           |                                |     1 |    31 |            |          |       |       |
|   2 |   NESTED LOOPS            |                                | 34458 |  1043K| 22760   (1)| 00:04:34 |       |       |
|   3 |    PARTITION HASH ALL     |                                |    72 |  1800 | 22759   (1)| 00:04:34 |     1 |    32 |
|*  4 |     TABLE ACCESS FULL     | WORKFLOW_REPORT                |    72 |  1800 | 22759   (1)| 00:04:34 |     1 |    32 |
|   5 |    PARTITION HASH ITERATOR|                                |   477 |  2862 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |     INDEX RANGE SCAN      | WORKFLOW_STEP_REPORT_REPORT_ID |   477 |  2862 |     1   (0)| 00:00:01 |   KEY |   KEY |
----------------------------------------------------------------------------------------------------------------------------

The modified query completed in 5 seconds – presumably because there were only a few relevant rows in the workflow_report table and the related workflow_step_report rows were well clustered and accessible through a suitable “foreign key” index (there’s also a hint in the plan costs that the partitioning column for workflow_step_report is the report_id)

The final point to note about the rewritten query is the use of table aliases – both tables have a short alias (wsr and wr), and every column is qualified by its table alias. If this approach had been taken in the original code then the attempt to run it would have resulted in an error like:

ERROR at line 7:
ORA-00904: "WR"."REPORT_ID": invalid identifier

Always give every table (or view) an alias.
Keep aliases short
Qualify every column with its table (or view) alias

Update – 11th July 2017

Here’s a worse example of the same failure – deleting everything from a table because you didn’t use aliases properly.

Comments (19)

19 Comments »

truncate table drop storage … for both tables
the trunc(date_column) is a direct hint for this :)

should give an immediate performance boost to sub second response

most reports are probably not used anyway
oh … and keep the backup close, just in case there are still some reports being used

if the users complain, slap them with some bind variables and indexes

Comment by Geert — May 2, 2017 @ 10:22 am BST May 2,2017 | Reply
Whithout and physical change on the tables, would try to transform the ‘exists’ subquery in a hash sj, examine any existing indexes on the user_id and start_time

Comment by attagianno — May 2, 2017 @ 10:26 am BST May 2,2017 | Reply
- attagianno,
  
  Checking why the subquery hasn’t unnested to a join, and whether that might help, is usually a good idea.
  
  Comment by Jonathan Lewis — May 8, 2017 @ 8:09 am BST May 8,2017 | Reply
Seems both tables have the same columns, so give them aliases.

Comment by ricsif — May 2, 2017 @ 12:41 pm BST May 2,2017 | Reply
- ricsif,
  
  Certainly a step anyone should take to eliminate accidents – assuming they used the aliases on every column
  
  Comment by Jonathan Lewis — May 8, 2017 @ 8:10 am BST May 8,2017 | Reply
Hi Jonathan.
You gave us a huge clue in the post title.
workflow_report doesn’t have a column named REPORT_ID, does it?

Comment by Oren Nakdimon (@DBoriented) — May 2, 2017 @ 12:57 pm BST May 2,2017 | Reply
- Oren,
  
  That’s the critical problem.
  
  Comment by Jonathan Lewis — May 8, 2017 @ 8:11 am BST May 8,2017 | Reply
I think Oren is right. Filter on the table that checks for existence on WORKFLOW_REPORT is :B1 = :B2, so I would think the REPORT_ID is not on the WORKFLOW_REPORT table.

Comment by Kaley — May 2, 2017 @ 1:19 pm BST May 2,2017 | Reply
- Kaley,
  
  Correct, the predicate section tells us what the problem is: there’s no predicate involving column report_id in the execution plan’s predicate section, so we ought to ask why not.
  
  Comment by Jonathan Lewis — May 8, 2017 @ 8:13 am BST May 8,2017 | Reply
maybe there is an index on start_time, so first thing I expect to help would be to get rid of the trunc
like so:
start_time between TO_DATE(‘2017-04-28 00:00:00’, ‘yyyy-mm-dd hh24:mi:ss’) and TO_DATE(‘2017-04-29 00:00:00’, ‘yyyy-mm-dd hh24:mi:ss’)

If needed create an index there.

Comment by Gerrit Haase — May 2, 2017 @ 4:03 pm BST May 2,2017 | Reply
2nd, why not use a join instead of the subselect?

select COUNT(report_id)
FROM
smp_dbuser2.workflow_step_report wsr, smp_dbuser2.workflow_report wr
WHERE
wsr.report_ID = wr.report_ID
AND user_id = ‘nbi_ssc’
AND start_time between TO_DATE(‘2017-04-28 00:00:00’, ‘yyyy-mm-dd hh24:mi:ss’) and TO_DATE(‘2017-04-29 00:00:00’, ‘yyyy-mm-dd hh24:mi:ss’)
;

Comment by Gerrit Haase — May 2, 2017 @ 4:08 pm BST May 2,2017 | Reply
Hi Jonathan,

As you already said that table WORKFLOW_REPORT has 2M rows. Here in the execution plan, at step 7 optimizer is performing FULL TABLE SCAN and expecting only 2 rows as a result. We can improve the query performance by creating an index on user_id column and index on start_time column should be used if any exists.

Thanks
Deepak Bhatnagar

Comment by Deepak Bhatnagar — May 2, 2017 @ 4:38 pm BST May 2,2017 | Reply
- Deepak,
  
  That’s a good observation to offer as justification for creating a suitable index (and, possibly, modifying the code to get rid of the trunc()) , but the cardinality estimate in this case is misleading.
  
  That part of the plan is implementing an existence subquery, and an existence subquery is optimized using first_rows(1) optimisation. We can still infer that there aren’t many rows that match the predicates because the optimizer thinks it will have to do a large part of the full tablescan before it finds the first one, but even that assumption is a little suspect because of the default selectivity of “function(col) = constant”.
  
  Comment by Jonathan Lewis — May 8, 2017 @ 8:21 am BST May 8,2017 | Reply
I’d rewrite this using the WITH syntax to force the subquery on the workflow_report table to be done once and only once to meet its where clause requirements

Comment by Bob — May 2, 2017 @ 6:06 pm BST May 2,2017 | Reply
“… Always give every table (or view) an alias. … ”

An alias is not necessary in order to completely qualify a column name, one can simply use table_name.column_name or view_name.column_name (without aliasing).
But I agree, always qualify every column.

Comment by Matthias Rogel — May 2, 2017 @ 7:03 pm BST May 2,2017 | Reply
- Matthias,
  
  “one can simply use …”
  
  True but, as the linked article shows, the resulting SQL can become very hard to read, especially if you have silly naming conventions and go for the full: schema_name.object_name.column_name.
  
  Comment by Jonathan Lewis — May 8, 2017 @ 8:25 am BST May 8,2017 | Reply
  - Slight correction: schema_name.object_name.column_name is not valid. It was previously allowed but that was fixed in 12.1.0.1. You can use schema_name.object_name.* if you really want to though
    
    Comment by Andrew Sayer — May 10, 2017 @ 12:45 pm BST May 10,2017 | Reply
    - Andrew,
      
      I can’t get hands on a 12.1.0.1, but here’s a cut-n-paste from 12.1.0.2:
      
      SQL> select test_user.t1.object_name 2 from test_user.t1 3 where test_user.t1.object_id = 100 4 ; OBJECT_NAME -------------------- SEQ$ 1 row selected.
      
      Is there a document that says it shouldn’t work, or might it be the case that 12.1.0.1 got broken and 12.1.0.2 fixed?
      
      Comment by Jonathan Lewis — May 10, 2017 @ 12:51 pm BST May 10,2017
    - It was from my interpretation of the select documentation https://docs.oracle.com/database/121/SQLRF/statements_10002.htm#i2126854 and a recollection of seeing some cases on the OTN failing with ORA-904 post upgrade (which of course I’m now failing to find). I probably generalised the issue in my head.
      I think in 11g and earlier you were allowed to do
      Select test_user.t1.object_name
      From t1
      (It silently removed test_user., and would do the same for anything else written there https://community.oracle.com/message/14121580 is an example where that seemed to have happened)
      
      But from 12 onwards you need to identify the schema in the from clause and not give it it’s own alias to be able to use the longer form.
      
      I’ll try and do some playing with an 11g db tonight and correct myself further.
      
      Comment by Andrew Sayer — May 10, 2017 @ 1:23 pm BST May 10,2017