Oracle Scratchpad

June 22, 2016

Conditional SQL – 6

Filed under: Conditional SQL,Execution plans,Oracle,Performance — Jonathan Lewis @ 2:16 pm BST Jun 22,2016

An odd little anomaly showed up on the OTN database forum a few days ago where a query involving a table covered by Oracle Label Security (OLS) seemed to wrap itself into a non-mergeable view when written using traditional Oracle SQL, but allowed for view-merging when accessed through ANSI standard SQL. I don’t know why there’s a difference but it did prompt a thought about non-mergeable views and what I’ve previously called “conditional SQL” – namely SQL which holds a predicate that should have been tested in the client code and not passed to the database engine.

The thought was this – could the database engine decide to do a lot of redundant work if you stuck a silly predicate inside a non-mergeable view: the answer turns out to be yes. Here’s a demonstration I’ve run on 11g and 12c:


rem
rem     Script:         conditional_fail.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2016
rem

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum,1e5)         n1,
        lpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                  id,
        mod(rownum,1e3)         n1,
        lpad('x',100,'x')       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

-- gather simple statistics (not needed in 12c)

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T1',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index t2_i1 on t2(id);

variable b1 number
variable b2 number

exec :b1 := 1; :b2 := 0

There’s nothing terribly significant about the data, beyond the fact that I’ve got a “small” table and a “large” table that I can use to encourage the optimizer to do a hash join. I’ve also created a couple of bind variables and set them to values that ensure that we can see that b1 is definitely not smaller than b2. So here’s a simple query – with a mergeable inline view in the first instance which is then hinted to make the view non-mergeable.


select
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;

select
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select /*+ no_merge */ t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;

Clearly, for our values of b1 and b2, the query will not return any data. In fact we can go further and say that the presence of the “bind variable predicate” in the inline view either has no effect on the volume of data returned or it eliminates all the data. But the presence of the no_merge hint makes a difference to how much work Oracle does for the “no data” option. Here are the two plans, pulled from the memory of an 11g instance after enabling rowsource execution statistics – first when the view is mergeable:


-----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |      1 |        |   508 (100)|      0 |00:00:00.01 |
|   1 |  SORT GROUP BY NOSORT          |       |      1 |      1 |   508   (2)|      0 |00:00:00.01 |
|*  2 |   FILTER                       |       |      1 |        |            |      0 |00:00:00.01 |
|   3 |    NESTED LOOPS                |       |      0 |    100 |   508   (2)|      0 |00:00:00.01 |
|   4 |     NESTED LOOPS               |       |      0 |    100 |   508   (2)|      0 |00:00:00.01 |
|*  5 |      TABLE ACCESS FULL         | T1    |      0 |    100 |   208   (4)|      0 |00:00:00.01 |
|*  6 |      INDEX RANGE SCAN          | T2_I1 |      0 |      1 |     2   (0)|      0 |00:00:00.01 |
|   7 |     TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |     3   (0)|      0 |00:00:00.01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:B1<:B2)
   5 - filter("T1"."N1"=0)
   6 - access("T2"."ID"="T1"."ID")
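
For anyone repeating the test, here is a minimal sketch of one way to get a plan in this format. It assumes a SQL*Plus session with access to the dynamic performance views that dbms_xplan.display_cursor() reads. The format string is an assumption and may not be exactly what was used for the listings in this post.

```sql
set serveroutput off    -- otherwise display_cursor() may report the dbms_output call instead

alter session set statistics_level = all;       -- enable rowsource execution statistics

variable b1 number
variable b2 number
exec :b1 := 1; :b2 := 0

select
        t1.n1, count(*), sum(v1.n1)
from
        t1,
        (select t2.id, t2.n1 from t2 where :b1 < :b2) v1
where
        t1.n1 = 0
and     v1.id = t1.id
group by
        t1.n1
;

-- report the last execution of the most recent statement run by this session
select * from table(dbms_xplan.display_cursor(null, null, 'allstats last cost'));

alter session set statistics_level = typical;   -- back to the default
```

The Starts, A-Rows, A-Time and Buffers columns are populated only when rowsource execution statistics were enabled while the statement ran.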

Notice how, despite t2 being the second table in the join, the bind variable predicate has worked its way to the top of the execution plan and execution has terminated after the run-time engine has determined that 1 is not less than zero.

Compare this with the plan when the view is non-mergeable:

------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |      1 |        |  2300 (100)|      0 |00:00:00.01 |    1599 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |         |      1 |      1 |  2300   (4)|      0 |00:00:00.01 |    1599 |       |       |          |
|*  2 |   HASH JOIN            |         |      1 |    100 |  2300   (4)|      0 |00:00:00.01 |    1599 |  2061K|  2061K| 1109K (0)|
|   3 |    JOIN FILTER CREATE  | :BF0000 |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|*  4 |     TABLE ACCESS FULL  | T1      |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |    VIEW                |         |      1 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
|*  6 |     FILTER             |         |      1 |        |            |      0 |00:00:00.01 |       0 |       |       |          |
|   7 |      JOIN FILTER USE   | :BF0000 |      0 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2      |      0 |   1000K|  2072   (3)|      0 |00:00:00.01 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)
   6 - filter(:B1<:B2)
   8 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

Thanks to the no_merge hint the bind variable predicate has not been promoted to the top of the plan, so the optimizer has produced a plan that requires the run-time engine to access data from table t1 before visiting table t2. In fact the optimizer has chosen a hash join, so the run-time engine gathers all the relevant data from t1 (1,599 buffer visits) and builds an in-memory hash table before it discovers that 1 is not less than zero and terminates the query.

Be careful if you write SQL that compares bind variables (or other pseudo-constants such as calls to sys_context) with bind variables (etc.); you may find that you’ve managed to produce code that forces the optimizer to do work that it could have avoided if only it had found a way of doing that comparison at the earliest possible moment.

These plans were from 11g, but 12c can behave the same way; with my specific data set, though, I had to add the no_push_pred() hint to the query before the hash join appeared and demonstrated the effect.
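
As an illustration of doing the comparison "in the client", here is a minimal PL/SQL sketch, assuming the t1 and t2 tables from the script above. The local variables m_b1 and m_b2 are hypothetical stand-ins for the two bind variables. The group by has been dropped so that the select ... into always returns exactly one row, which means this version reports a count of zero where the original query would return no rows.

```sql
set serveroutput on

declare
        m_b1    number := 1;    -- stands in for :b1
        m_b2    number := 0;    -- stands in for :b2
        m_ct    number;
        m_sum   number;
begin
        if m_b1 < m_b2 then
                -- the comparison has already been done, so the query carries no bind-to-bind predicate
                select  count(*), sum(t2.n1)
                into    m_ct, m_sum
                from    t1, t2
                where   t1.n1 = 0
                and     t2.id = t1.id
                ;
                dbms_output.put_line('rows: ' || m_ct || '  sum: ' || m_sum);
        else
                -- nothing is sent to the SQL engine at all
                dbms_output.put_line('condition is false, query not executed');
        end if;
end;
/
```

With the values shown the block takes the else branch, so neither table is visited.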

Footnote 1

An interesting side effect of this plan is that it has also allowed a Bloom filter to appear in a serial hash join – not something you’d normally expect to see, so I changed the predicate to :b1 > :b2 to see if the no_merge hint combined with the Bloom filter was faster than merging without the Bloom filter. Here are the two plans, pulled from memory. Running this on 11g I had to hint the hash join when I removed the no_merge hint:


Non-mergeable view - Bloom filter appears
------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |      1 |        |  2300 (100)|      1 |00:00:00.12 |   17725 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |         |      1 |      1 |  2300   (4)|      1 |00:00:00.12 |   17725 |       |       |          |
|*  2 |   HASH JOIN            |         |      1 |    100 |  2300   (4)|    100 |00:00:00.12 |   17725 |  2061K|  2061K| 1106K (0)|
|   3 |    JOIN FILTER CREATE  | :BF0000 |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|*  4 |     TABLE ACCESS FULL  | T1      |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |    VIEW                |         |      1 |   1000K|  2072   (3)|   1605 |00:00:00.10 |   16126 |       |       |          |
|*  6 |     FILTER             |         |      1 |        |            |   1605 |00:00:00.09 |   16126 |       |       |          |
|   7 |      JOIN FILTER USE   | :BF0000 |      1 |   1000K|  2072   (3)|   1605 |00:00:00.08 |   16126 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2      |      1 |   1000K|  2072   (3)|   1605 |00:00:00.07 |   16126 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)
   6 - filter(:B1>:B2)
   8 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."ID"))

View merging allowed - no Bloom filter
-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |  2300 (100)|      1 |00:00:07.56 |   17725 |       |       |          |
|   1 |  SORT GROUP BY NOSORT|      |      1 |      1 |  2300   (4)|      1 |00:00:07.56 |   17725 |       |       |          |
|*  2 |   FILTER             |      |      1 |        |            |    100 |00:00:07.56 |   17725 |       |       |          |
|*  3 |    HASH JOIN         |      |      1 |    100 |  2300   (4)|    100 |00:00:07.56 |   17725 |  2061K|  2061K| 1446K (0)|
|*  4 |     TABLE ACCESS FULL| T1   |      1 |    100 |   208   (4)|    100 |00:00:00.01 |    1599 |       |       |          |
|   5 |     TABLE ACCESS FULL| T2   |      1 |   1000K|  2072   (3)|   1000K|00:00:01.94 |   16126 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:B1>:B2)
   3 - access("T2"."ID"="T1"."ID")
   4 - filter("T1"."N1"=0)

Things to note – the Bloom filter eliminated all but 1,605 rows from the tablescan before passing them to the hash join operation to probe the hash table; the run time of the query without filtering was 7.56 seconds (!) compared to 0.12 seconds with the Bloom filter. Fortunately I decided that this was too good to be true BEFORE I published the results and decided to re-run the tests with statistics_level set back to typical and found that most of the difference was CPU time spent on collecting rowsource execution statistics. The query with the Bloom filter was still faster, but only just – the times were more like 0.09 seconds vs. 0.12 seconds.

Footnote 2

The source of the problem on OTN was that as well as using OLS the query in question included a user-defined function. Since you can write a user-defined function that “spies” on the data content and uses (e.g.) dbms_output to write data to the terminal this poses a security risk; if a predicate calling that function executed before the security predicate had been tested then your function could output data that your query shouldn’t be able to report. To avoid this security loophole Oracle restricts the way it merges views (unless you set optimizer_secure_view_merging to false). For a more detailed explanation and demonstration of the issues, see this item on Christian Antognini’s blog.

I still don’t know why the ANSI form of the query managed to bypass this issue, but the predicate with the user-defined function was applied as a filter at the very last step of the plan, so perhaps there was something about the transformation Oracle took to get from ANSI syntax to its internal syntax (with cascading lateral views) that made it possible for the optimizer to recognize and eliminate the security threat efficiently.

January 2, 2014

Conditional SQL – 4

Filed under: Conditional SQL,Execution plans,Oracle — Jonathan Lewis @ 6:14 pm BST Jan 2,2014

This is one of those posts where the investigation is left as an exercise – it’s not difficult, just something that will take a little time that I don’t have, and just might end up with me chasing half a dozen variations (so I’d rather not get sucked into looking too closely). It comes from an OTN question which ends up reporting this predicate:

WHERE ( LENGTH ( :b7) IS NULL OR
         UPPER (TRIM (CODFSC)) = UPPER (TRIM ( :b8)) or
         UPPER (TRIM (CODUIC)) = UPPER (TRIM ( :b9)))
       AND STATE = 0;

The three bind variables all hold the same value; there is a function-based index on upper(trim(codfsc)), and another on upper(trim(coduic)). The execution plan for this query is a full tablescan, but if you eliminate the first predicate Oracle can do a concatenation of two index range scans. This variation doesn’t surprise me, the optimizer’s ability to introduce concatenation is limited; however, I did wonder whether some small variation in the SQL would allow the optimizer to get just a little more clever.

Would you get concatenation if you changed the first predicate to (:b7 is null)? If not, would a similar query that didn’t depend on function-based indexes do concatenation? If not, is there any rewrite of this query that could do a tablescan ONLY for the case where :b7 was null?
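
One candidate for the last of these questions is a manual split with union all. This is an untested sketch, not a result: the table t_demo, its column sizes and the index names are hypothetical stand-ins for the table in the OTN question, and the select list is a guess. It relies on length(:b7) is null being the same test as :b7 is null for a character bind.

```sql
-- hypothetical stand-in for the table in the OTN question
create table t_demo (
        codfsc  varchar2(20),
        coduic  varchar2(20),
        state   number
);

create index t_demo_fsc on t_demo(upper(trim(codfsc)));
create index t_demo_uic on t_demo(upper(trim(coduic)));

variable b7 varchar2(20)
variable b8 varchar2(20)
variable b9 varchar2(20)

exec :b7 := 'abc'; :b8 := 'abc'; :b9 := 'abc'

-- branch 1 can only return rows when :b7 is null, branch 2 only when it is not,
-- so the two branches never both contribute and union all introduces no duplicates
select  *
from    t_demo
where   :b7 is null
and     state = 0
union all
select  *
from    t_demo
where   :b7 is not null
and     (
            upper(trim(codfsc)) = upper(trim(:b8))
         or upper(trim(coduic)) = upper(trim(:b9))
        )
and     state = 0
;
```

Whether the optimizer then puts a filter on each branch and uses concatenation of two index range scans for the second one is exactly the thing that would need checking on your own version.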

Demonstrations of any levels of success can be left in the comments if anyone’s interested. To get a fixed font that preserves space start the code with “sourcecode” and end with “/sourcecode” (removing the quotation marks and replacing them with square brackets).

August 13, 2013

NVL()

Filed under: CBO,Conditional SQL,Execution plans,NULL,Oracle — Jonathan Lewis @ 7:14 am BST Aug 13,2013

Here’s a little detail about how the optimizer can handle the nvl() function that I hadn’t noticed before (and it goes back to at least 8i). This is running on 11.2.0.3, and table t1 is just all_objects where rownum <= 20000:

(more…)

October 31, 2010

Conditional SQL – 3

Filed under: CBO,Conditional SQL,Execution plans,Tuning — Jonathan Lewis @ 5:54 pm BST Oct 31,2010

I’ve spent the last week in Spain – sightseeing, rather than working – with a minimum amount of access to the Internet.

Inevitably I now have to work hard to catch up with my email. As a moment of light relief in an otherwise irritating chore I thought I’d respond to an emailed request for help. (Regular readers of the blog will know that I don’t usually respond to private email requests for solutions, but sometimes someone gets lucky.)

The question was basically this:  why do I get different execution plans for the following two statements:

(more…)

February 14, 2007

Conditional SQL (2)

Filed under: CBO,Conditional SQL,Execution plans,Indexing,Tuning — Jonathan Lewis @ 8:48 am BST Feb 14,2007

Some time ago I wrote a note on conditional SQL, commenting on code with predicates like:

         where	t1.n1 = nvl(:n1,t1.n1)

In that article, I showed how Oracle can transform a query of this type using concatenation so that at run time one of two possible execution paths can be taken. But, as often happens,  a simple example leads to more complex questions.
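
As a reminder of what the "two possible execution paths" amount to, here is a hand-written equivalent of the nvl() predicate. It is a sketch, assuming a table t1 with a numeric column n1, and it is not the optimizer's internal form of the transformed query.

```sql
variable n1 number
exec :n1 := null

-- path 1: no value supplied. t1.n1 = nvl(null, t1.n1) reduces to t1.n1 = t1.n1,
-- which is true for every row except those where n1 is null
select  *
from    t1
where   :n1 is null
and     t1.n1 is not null
union all
-- path 2: a value was supplied, so this is a simple equality that could use an index on n1
select  *
from    t1
where   :n1 is not null
and     t1.n1 = :n1
;
```

Only one of the two branches can return rows for any given value of :n1, which is why union all is safe here.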

What happens if you have several similar predicates in your query – does Oracle keep doubling the number of possible execution paths for the query: if so, how far does it go, if not are there any side effects to consider? What happens with more complex queries, including joins?
(more…)

January 9, 2007

Conditional SQL

Filed under: CBO,Conditional SQL,Execution plans — Jonathan Lewis @ 7:57 pm BST Jan 9,2007

Here’s a simplified example of the type of SQL that I often used to see coming from screen generators:

select 	*
from 	t1
where	t1.n1 = nvl(:n1,t1.n1)
;

This code is basically trying to say: “if the user supplies an input value for some on-screen field then return data for that value, otherwise return all the data unconditionally.”
(more…)
