How not to write subqueries:
AND sal.ticket_airline || sal.ticket_number NOT IN ( SELECT sub.dsd_airline || sub.dsd_ticket_number FROM ... )
If there had been any opportunity for the optimizer to do something clever with the NOT IN, you’ve just made sure it can’t happen. On top of that you may find that you don’t get the right results – consider the following cut-n-paste:
SQL> select user from dual where 1 || 23 = 12 || 3; USER ------------------------------ TEST_USER 1 row selected.
Sometimes people simply forget that you can have multiple columns in subqueries (or in “IN Lists”) – so it’s perfectly valid to write the subquery as:
AND (sal.ticket_airline, sal.ticket_number) NOT IN ( SELECT sub.dsd_airline, sub.dsd_ticket_number FROM ... )
It’s quite likely that Oracle will actually turn this into a NOT EXISTS, or ANTI-JOIN, of course. But if it doesn’t do something nice you could try doing a manual rewrite – provided it is actually logically equivalent:
AND not exists ( select null from .... where sub.dsd_airline = sal.ticket_airline and sub.dsd_ticket_number = sal.ticket_number )
Remember: NOT IN may not translate to NOT EXISTS – see also this.
Hi Jonathan,
good point. I think there is a little typo in the last where clause. It should be sal.ticket_airline.
Cheers Mathias
Comment by Mathias Zarick — December 10, 2013 @ 6:52 pm GMT Dec 10,2013 |
Mathias,
Thanks, now fixed.
Comment by Jonathan Lewis — December 10, 2013 @ 7:00 pm GMT Dec 10,2013 |
Dear Jonathan,
Trust you are fine. I was going through one of the queries and below is another interesting case where I believe the optimizer is not able to use Anti Join.
I have tried to make sure the post is with correct formatting, if it is not the case, please just post your comments and I will send you the test case via mail.
This is on 11.2.0.2.0 – 64bit on AIX.
With respect to the above, I have below observations and questions. It would be really great if you could shed some light on the same.
1) I believe in Query1 due to redundant decode() the optimizer was not able to use ANTI join, but in Query2 due to only 1 decode() (which is the way it should have been) the optimizer was able to use ANTI join.
2) In execution plan for Query1, line 4. I am not able to understand why the optimizer shows 1002 in starts and 2 in A-rows.
Thanking you.
Warm Regards,
Vijay Sehgal
Comment by vijay sehgal — December 17, 2013 @ 12:04 pm GMT Dec 17,2013 |
Your assumption is nearly correctly – you don’t get the anti-join because of the OR (rather than because of the decode() of itself). A check of the 10053 trace shows the following:
The second question is a demonstration of subquery caching – you only have two different input values for the first 30,000 rows, and they each find one row result which goes into the subquery cache, then you have 1000 distinct value that return no rows.
Comment by Jonathan Lewis — December 17, 2013 @ 12:53 pm GMT Dec 17,2013 |
Jonathan,
Thanks a lot for your help on the above, I have one more related question.
Is it possible to determine the number of times a filter subquery is executed other than using gather_plan_statisics ?
Warm Regards,
Vijay Sehgal
Comment by vijay sehgal — December 18, 2013 @ 1:32 pm GMT Dec 18,2013 |
Not really – although you could estimate if you knew your data well and had some figures about the total activity of the query.
(The 10046 trace could also help if it dumped STAT lines on every execution, but that’s not much different from using the hint and not so convenient.)
Comment by Jonathan Lewis — December 18, 2013 @ 2:14 pm GMT Dec 18,2013