The following question appeared on the Oracle Forums recently:
The use of functions – a function with other selects (eg. calculate availability of a part) – is slowing down our system when we do a select over our product file.
Is there some kind of rule when functions should be used or when we should try to create a more complex – combined – SQL that does not use the function.
Can functions be used in the where clause without loosing a lot of speed?
It’s a really good question, because it prompts such a lot of ideas that need to be tied together, so I thought I’d jot down a few thoughts.
First off – I’m not too surprised to hear that the function slows things down during a “select over the product table”. It’s relatively easy to get into situations where a function is being called once for every single row being returned – or worse, once for every single row examined. Even if the function is extremely efficient it’s still possible for the total resource usage to be huge.
Thoughts, then, about pl/sql functions that call SQL – make sure that the SQL is as efficient as possible – which means ensuring that it doesn’t even run unless necessary!
A standard optimisation method is to ‘remember’ the last set of values for the input parameters and output values so that you don’t have to run the SQL if the latest inputs are the same as the previous inputs. (Packaged global variables help). Extensions on this idea include building arrays of input and output values if the number of possible inputs isn’t too big.
If you have to call the SQL, you may be able to write it with stacks of hints so that Oracle always takes the most efficient path. (After all, it’s your function, you know how it’s supposed to be used – if you’re lucky there will be only one efficient option).
Of course, 10g release 2 has made deterministic functions work – nearly – so you may not have to worry too much about getting clever inside the PL/SQL. Another “near” solution to minimising the resource threats from functions without getting clever in the PL/SQL is to wrap them inside “scalar subqueries” – which will invoke ‘scalar subquery caching’ (which I have mentioned on my website in the past) in earlier versions of Oracle.
For example, change:
select func1(t1.owner) from t1 ;
to the slightly counter-intuitive:
select (select func1(t1.owner) from dual) from t1 ;
This is not guaranteed to reduce the number of times the function is called – there is a limit on the number of input and output values cached (for more details see Chapter 9 of Cost Based Oracle) but it may help if there are between a few dozen and a couple of hundred input values. (The cache is a fairly small hash-table – 256 or 1024 values for numbers, depending on Oracle versions – and hash collisions will occur).
This brings us to the last part of the question – what happens if the function is in the where clause? The results can be disastrous, although you may be rescued by the scalar subquery method. At one site I visited recently, a statspack snapshot of one hour showed that the most frequently executed statement has been called 35 million times; the next statement in the list had been executed just 35,000 times. That’s a good clue that the critical statement is in a function that appears in a where clause – and the table in question has been hit with a full tablescan.
If scalar subquery caching can’t help then you probably do need to take the statement out of the function and embed it in the main statement. There’s no telling what the best strategy may be – but you may find that the approach of modifying the statement from the function to access the entire data set and produce a result table – either with a no_merge view, or a materialized factored subquery – is a good starting step – then join this result into the original statement.
I’m going to leave it at that and see if anyone else wants to throw in other “generic” solutions for taking SQL out of functions.