Oracle Scratchpad

April 3, 2014

Cache anomaly

Filed under: Bugs,Oracle,Performance — Jonathan Lewis @ 1:27 pm BST Apr 3,2014

Just a quick heads-up for anyone who likes to play around with the Keep and Recycle caches.

In 11g Oracle introduced the option for serial direct path reads for tablescans on tables that were sufficiently large – which meant more than the small_table_threshold – provided the table wasn’t already sufficiently well cached.  (The rules mean that the choice of mechanism can appear to be a little random in the production environment for tables that are near the threshold size – but if you test by doing “alter system flush buffer_cache” you’ll find that you always get direct path reads.)

I’ve just discovered a little oddity about this, though.  I have a table of about 50MB which is comfortably over the threshold for direct path reads. But if I create a KEEP cache (db_keep_cache_size) that is a little larger than the table and then assign the table to the KEEP cache (alter table xxx storage(buffer_pool keep)) then 11.2.0.4 stops doing direct path reads, and caches the table.

Now this doesn’t seem unreasonable – if you’ve assigned an object to the KEEP cache you probably want it (or once wanted it) to be kept in cache as much as possible; so using the KEEP cache if it’s defined and specified makes sense. The reason I mention this as an oddity, though, is that it doesn’t reproduce in 11.1.0.7.

I think I saw a bug note about this combination a few months ago – I was looking for something else at the time and, almost inevitably, I can’t find it when I want it – but I don’t remember whether it was the 11.1 or 11.2 behaviour that was deemed to be correct.

 Update

See comments 1 and 2 below.  I’ve written about this previously, and the caching behaviour is the correct behaviour. The patch is in 11.2.0.2 and backports are available for 11.1.0.7 and 11.2.0.1. The patch ensures that the table will be read into the cache if it is smaller than the db_keep_cache_size.  (Although we might want to check – see Tanel’s notes – whether this is based on the high water mark recorded in the segment header or on the optimizer stats for the table; and I suppose it might be worth checking whether the same feature applies to index fast full scans). From the MoS document:

With the bug fix applied, any object in the KEEP buffer pool, whose size is less than DB_KEEP_CACHE_SIZE, is considered as a small or medium sized object. This will cache the read blocks and avoid subsequent direct read for these objects.
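Putting the rule in the MoS note together with the basic direct-path rule, the decision can be sketched like this (a toy model only, not Oracle internals – the function name, parameters and numbers are all illustrative, and the “already well cached” check is ignored):

```python
def serial_direct_path_read(object_blocks, small_table_threshold,
                            in_keep_pool=False, keep_cache_blocks=0):
    """Toy model of the post-fix rule from the MoS note: an object in the
    KEEP pool that is smaller than db_keep_cache_size is treated as a
    small/medium object, so it is read into the cache rather than by
    direct path.  (Simplified: ignores the 'sufficiently well cached'
    check mentioned above.)"""
    if in_keep_pool and object_blocks < keep_cache_blocks:
        return False          # cached read, per the bug fix
    return object_blocks > small_table_threshold

# 50MB table = 6,400 x 8KB blocks, against an illustrative threshold
serial_direct_path_read(6400, 2000)               # direct path read
serial_direct_path_read(6400, 2000, True, 8192)   # cached: KEEP pool larger than table
```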


April 2, 2014

Easy – Oops.

Filed under: Bugs,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 7:47 pm BST Apr 2,2014

A question came up on OTN today asking for suggestions on how to enforce uniqueness on a pair of columns only when the second column was not null. There’s an easy and obvious solution – but I decided to clone the OP’s example and check that I’d typed my definition up correctly before posting it; and the result came as a bit of a surprise. Here’s a demo script (not using the OP’s table):


create table t1  
(  
	col1	int not null,
	col2	varchar2(1)
);  

create unique index t1_i1 on t1( 
--	case col2 when null then cast(null as int) else col1 end,
--	case when col2 is null then cast(null as int) else col1 end,
	case when col2 is not null then col1 end,
	col2
)
;

insert into t1 values(1,null);
insert into t1 values(1,null);
insert into t1 values(1,'x');
insert into t1 values(1,'y');
insert into t1 values(1,'y');

commit;

column ind1_is   format a5
column ind1_when format 9999

set null N/A

select
	case when col2 is null then cast (null as int) else col1 end	ind1_is,
	case col2 when null then cast (null as int)  else col1 end	ind1_when
from 
	t1
;

The strategy is simple: you create a unique function-based index with two columns; the first column of the index is defined to show the first column of the table if the second column of the table is not null, and the second column of the index is simply the second column of the table. So if the second column of the table is null, both columns in the index are null and there is no entry in the index; but if the second column of the table is not null then the index copies both columns from the table and a uniqueness test applies.

Based on the requirement and definition you would expect the first 4 of my insert statements to succeed and the last one to fail. The index will then have two entries, corresponding to my 3rd and 4th insertions.

I’ve actually shown three ways to use the case statement to produce the first column of the index. The last version is the cleanest, but the first option is the one I first thought of – it’s virtually a literal translation of the original requirement. The trouble is, with my first definition the index acquired an entry it should not have got, and the second insert raised a “duplicate key” error; the error didn’t appear when I switched the syntax of the case statement to the second version.

That’s why the closing query of the demo is there – when you run it the two values reported should be the same as each other for all four rows in the table – but they’re not. This is what I got on 11.2.0.4:


IND1_IS IND1_WHEN
------- ---------
N/A             1
N/A             1
      1         1
      1         1


I’m glad I did a quick test before suggesting my original answer.

Anyone who has other versions of Oracle available is welcome to repeat the test and report back which versions they find working correctly (or not).

Update

It’s not a bug (see note 2 below from Jason Bucata), it’s expected behaviour.
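The reason it’s expected behaviour: the simple CASE form compares with equality, and in SQL an equality comparison with NULL is never true, so “WHEN NULL” can never match and the ELSE branch always fires. A quick model of that three-valued logic (Python, with None standing in for NULL; the function names echo the column aliases in the demo query):

```python
def sql_eq(a, b):
    # SQL equality: any comparison involving NULL is not true
    return False if (a is None or b is None) else a == b

def ind1_when(col1, col2):
    # CASE col2 WHEN NULL THEN NULL ELSE col1 END  -- simple CASE: equality test
    return None if sql_eq(col2, None) else col1

def ind1_is(col1, col2):
    # CASE WHEN col2 IS NULL THEN NULL ELSE col1 END  -- searched CASE: IS NULL
    return None if col2 is None else col1

# col2 null: the simple CASE still returns col1, so the index gets an entry
print(ind1_when(1, None), ind1_is(1, None))   # 1 None
```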


March 20, 2014

RLS bug

Filed under: Bugs,Infrastructure,Oracle — Jonathan Lewis @ 1:21 pm BST Mar 20,2014

RLS – row level security, aka VPD (virtual private database) or FGAC (fine grained access control) has a critical bug in 11g. The bug is unpublished, but gets mentioned in various other documents, so can be identified as: Bug: 7828323 “SYS_CONTEXTS RETURNS WRONG VALUE WITH SHARED_CONTEXT_SENSITIVE”

The title tells you nearly everything you need to know – if you’ve declared a security policy as context_sensitive or shared_context_sensitive then a change to the context ought to result in the associated predicate function being called to generate a new security predicate the next time the policy becomes relevant. Thanks to bug 7828323 this doesn’t always happen – so queries can return the wrong set of results.

There are some patches for older versions (11.1.0.7 and 11.2.0.2 as far as I’ve checked), but if you don’t have, or can’t get, a patch the “workaround” is to change any relevant policies to dynamic; unfortunately the consequence of this is that the predicate function will then be called for every execution of any statement against any objects protected by that policy.

Depending on how your application has been written and how many queries are likely to invoke security policies this could easily increase your CPU usage by several percent (and if it’s a badly written application maybe a lot more).

Footnote:

It has occurred to me to wonder what happens if you use a (normal) pl/sql function in a select list, the function executes a statement against a table protected by a context_sensitive security policy – and you decide to use the pl/sql result cache on the function. How long is an item supposed to stay in the result cache, and if that’s longer than a single execution of a statement will the result cache be invalidated when your context changes in a way that invalidates the current security predicate ? No time to check or test at present, but I’d be very cautious about putting RLS predicate functions into the result cache until I’d played around with that idea for a bit.

March 13, 2014

Shrink Space

Filed under: Bugs,deadlocks,fragmentation,Index Rebuilds,Indexing,Locks,Oracle — Jonathan Lewis @ 7:08 am BST Mar 13,2014

Here’s a lovely effect looking at v$lock (on 11.2.0.4)

select
        sid, type, id1, id2, lmode, request, ctime, block
from
        V$lock
where   sid in (
                select  sid
                from    V$session
                where   username = 'TEST_USER'
        )
order by
        sid, type desc
;

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------- -- ---------- ---------- ---------- ---------- ---------- ----------
       145 TX     589833       7518          6          0        489          0
           TM      92275          0          2          6        489          1
           SK          7   25165955          6          0        489          0
           AE        100          0          4          0        582          0

       148 TX     524308       7383          6          0        490          0
           TM      92275          0          2          6        490          1
           SK          7   25173379          6          0        490          0
           AE        100          0          4          0        998          0

You’ll notice I’ve got two sessions holding a TM lock on the same table (object 92275) in mode 2 (sub-share) and waiting for an exclusive lock on the same table. Both sessions are recording the fact that they are blocking something else. You’ll have to trust me when I say there are no other user sessions on the system at this point, and none of the background sessions is doing anything with that table.

The clue to what’s happening is the SK lock – it’s the “segment shrink” lock. I had two sessions start an “alter index I_n shrink space” (two different indexes on the same table) at the same time. The problem is that “shrink space” without the “compact” tries to drop the highwater mark on the index’s allocated space after completing the shrink phase – and there’s a defect in the internal code that tries to get the required exclusive lock on the underlying table: it doesn’t seem to allow for all the possible ways you can fail to get the lock. If you look at v$session_wait_history for either of these sessions, you’ll see something like the following:


  SID  SEQ#     EVENT# EVENT                            P1         P2         P3  WAIT_TIME WAIT_TIME_MICRO TIME_SINCE_LAST_WAIT_MICRO
----- ----- ---------- ------------------------ ---------- ---------- ---------- ---------- --------------- --------------------------
  145     1        798 Wait for shrink lock              0          0          0         10           99954                      70137
          2        235 enq: TM - contention     1414332422      92275          0        600         6002022                        101
          3        798 Wait for shrink lock              0          0          0         10          100723                      69335
          4        235 enq: TM - contention     1414332422      92275          0        600         6001589                         81
          5        798 Wait for shrink lock              0          0          0         10          100500                      69542
          6        235 enq: TM - contention     1414332422      92275          0        600         6002352                         86
          7        798 Wait for shrink lock              0          0          0         10          100618                      69145
          8        235 enq: TM - contention     1414332422      92275          0        600         6001545                        144
          9        798 Wait for shrink lock              0          0          0         10          100996                      69445
         10        235 enq: TM - contention     1414332422      92275          0        600         6002744                        310

The attempt to acquire the TM enqueue (lock) times out every three seconds – and I think the session then releases and re-acquires the SK lock before trying to re-acquire the TM lock – and it’s never going to get it.

The alert log, by the way, looked like this:


Wed Mar 12 12:53:27 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
Wed Mar 12 12:53:30 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
Wed Mar 12 12:53:40 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.
Wed Mar 12 12:53:43 2014
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_1832.trc.
ORA-00060: Deadlock detected. More info in file /u01/app/oracle/diag/rdbms/test/test/trace/test_ora_2242.trc.

I’ve said it before, and I keep repeating it when people say “Oracle resolves deadlocks automatically”: Oracle does NOT resolve deadlocks automatically – one of the sessions will rollback its last DML statement to clear the deadlock, but the other session will (almost invariably) still be waiting. It’s up to the application to do something sensible to resolve the deadlock after it receives the ORA-00060 error.
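In application terms that means the code catching ORA-00060 has to do the cleanup itself. A sketch of the idea (hypothetical Python wrapper – DeadlockError, run_with_deadlock_handling and the retry count are all inventions for illustration, not any real driver’s API):

```python
class DeadlockError(Exception):
    """Stand-in for the driver exception that wraps ORA-00060."""

def run_with_deadlock_handling(conn, do_transaction, retries=3):
    # Oracle only rolls back the statement that hit ORA-00060; the
    # session still holds its other locks, so the application must
    # roll back (or take some other sensible action) and retry.
    for attempt in range(retries):
        try:
            do_transaction(conn)
            conn.commit()
            return True
        except DeadlockError:
            conn.rollback()        # clear our side of the deadlock
    return False
```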

Don’t ask about the trace files – but they had both reached 400MB by the time I finished this note.

Strategy Note

If you are going to shrink objects, it’s probably best to do it in two steps: “shrink space compact”, followed by “shrink space”.
If you’re going to try to use multiple sessions to shrink several indexes as quickly as possible, make sure there’s no way that two sessions can try to “shrink space” on the indexes on the same table at the same time.

Footnote:

This article was prompted by the recent appearance of bug 18388128 on MoS.

March 7, 2014

Subquery Anomaly

Filed under: Bugs,CBO,Execution plans,Oracle,Upgrades — Jonathan Lewis @ 8:57 am BST Mar 7,2014

Here’s an oddity that appeared on the OTN database forum last night:

We have this query in our application which works fine in 9i but fails in 11gR2 (on Exadata) giving an “ORA-00937: not a single-group group function” error….

… The subquery is selecting a column and it doesn’t have a group by clause at all. I am not sure how is this even working in 9i. I always thought that on a simple query using an aggregate function (without any analytic functions / clause), we cannot select a column without having that column in the group by clause. So, how 11g behaves was not a surprise but surprised to see how 9i behaves. Can someone explain this behaviour?

The poster supplied the suspect query, and it certainly looked as if it should never have worked – but I took a guess that the optimizer was doing some sort of transformation that concealed the problem before the optimizer managed to see the error. The subquery was a little odd because it was doing something it didn’t need to do, and my guess was that the optimizer had recognised the option to simplify the query and the simplification had “accidentally” removed the error. This turned out to be correct, but my guess about exactly what had happened to hide the error was wrong.

Having created a hypothesis I couldn’t resist checking it this morning, so here’s the test case (don’t pay any attention to the actual data I’ve generated, it was a cut-n-paste from a script that I had previously used for something completely different):

create table t1
as
select
	trunc((rownum-1)/15)	n1,
	trunc((rownum-1)/15)	n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000
;

create table t2
as
select
	mod(rownum,200)		n1,
	mod(rownum,200)		n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000
;

begin
 	dbms_stats.gather_table_stats(
 		user,
 		't1',
 		method_opt => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		user,
		't2',
		method_opt => 'for all columns size 1'
	);
end;
/

explain plan for
select
	/*+ qb_name(main) */
	*
from t1
where (n2,n1) in (
	select /*+
			qb_name(subq)
			unnest
		*/
		max(t2.n2), t2.n1
	from t2
	where t2.n1 = t1.n1
)
;

You’ll notice, of course, that I don’t have a group by clause at all, so the presence of the t2.n1 in the select list should lead to Oracle error: “ORA-00937: not a single-group group function”.

In versions from 8i to 11.1.0.7, this query could run, and its execution plan looked like this:


----------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |   200 | 45200 |    46 |
|*  1 |  HASH JOIN           |         |   200 | 45200 |    46 |
|   2 |   VIEW               | VW_SQ_1 |   200 |  7800 |    31 |
|   3 |    HASH GROUP BY     |         |   200 |  2400 |    31 |
|   4 |     TABLE ACCESS FULL| T2      |  3000 | 36000 |    14 |
|   5 |   TABLE ACCESS FULL  | T1      |  3000 |   547K|    14 |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("N2"="MAX(T2.N2)" AND "N1"="N1" AND "ITEM_1"="T1"."N1")

Notice how the optimizer has produced an inline view (VW_SQ_1) from the subquery, using it to drive a hash join; notice how that inline view has an aggregation operation (HASH GROUP BY) in it. In effect the optimizer has rewritten my query like this:

select
	t1.*
from	(
		select
			distinct max(t2.n2) max_n2, t2.n1 item_1, t2.n1
		from	t2
		group by
			t2.n1
	)	vw_sq_1,
	t1
where
	t1.n2 = vw_sq_1.max_n2
and	t1.n1 = vw_sq_1.n1
and	t1.n1 = vw_sq_1.item_1
;

There’s a clue about why this succeeded in the 10053 trace file, which includes the lines:

Subquery Unnesting on query block SEL$1 (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block SEL$1 (#1).
SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:   Passed validity checks.

Compared to the 11.2 lines:

Subquery Unnesting on query block MAIN (#1)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block MAIN (#1).
SU:   Checking validity of unnesting subquery SUBQ (#2)
SU:     SU bypassed: Failed basic validity checks.
SU:   Validity checks failed.

Whatever check it was that Oracle introduced in 11.2 (maybe a check that the query block was inherently legal), unnesting failed – and if I add an /*+ no_unnest */ hint to the original subquery in the earlier versions of Oracle I get the expected ORA-00937.

The philosophical argument is left to the reader: was the original behaviour a bug, or is the new behaviour the bug ?


March 3, 2014

Flashback Fail ?

Filed under: Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 4:19 pm BST Mar 3,2014

Sitting in an airport, waiting for a plane, I decided to read a note (pdf) about Flashback data archive written by Beat Ramseier from Trivadis.  I’d got about three quarters of the way through it when I paused for thought and figured out that on the typical database implementation something nasty is going to happen after approximately 3 years and 9 months.  Can you guess why ?

It’s all about smon_scn_time – which normally records one row every five minutes (created by smon) with a continuous cycle of 24 hours – typically giving you about 1,440 rows in the table. The table is in a cluster, and the cluster key is the instance (thread) number. Clearly this was originally a clever idea from someone who realised that a cluster key of thread number would be beneficial if you had a RAC system with multiple instances – each instance gets its own blocks and the data for any one instance is as well clustered as possible.

The trouble is, when you enable flashback data archive smon no longer sticks to a 24 hour cycle, it just keeps adding rows. Now on my 8KB block tablespace I see 6 rows per block in the table/cluster – which means I get through 48 blocks per day,  17,520 blocks per year, and in 3 years and 9 months I’ll get to roughly 65,700 blocks – and that’s the problem.  An index entry in a cluster index points to a chain of cluster blocks, and the last two bytes of the “rowid” in the index entry identify which block within the chain the cluster key scan should start at – and two bytes means you can only have 65,536 blocks for a single cluster key.

I don’t know what’s going to happen when smon tries to insert a row into the 65,535th (-ish) block for the current thread – but it ought to raise an Oracle error, and then you’ll probably have to take emergency action to make sure that the flashback mechanisms carry on running.

Although oraus.msg indicates that it’s an error message about hash clusters, the first sight of the problem will probably be: “ORA-02475 maximum cluster chain block count of %s has been exceeded”. If you’re using a 16KB block size then you’ve got roughly 7.5 years, and a 32KB block size gives you about 15 years (not that that’s a good argument for selecting larger block sizes, of course).
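The arithmetic behind those figures is easy to check (assuming, as above, one smon_scn_time row every five minutes, 6 rows per 8KB block, and rows-per-block scaling linearly with block size):

```python
ROWS_PER_DAY = 24 * 60 // 5    # one smon_scn_time row every 5 minutes = 288/day
MAX_BLOCKS   = 2 ** 16         # two-byte block number in the cluster index entry

def years_to_limit(rows_per_block):
    # days until the 65,536-block chain limit, expressed in years
    blocks_per_day = ROWS_PER_DAY / rows_per_block
    return MAX_BLOCKS / blocks_per_day / 365

years_to_limit(6)     # 8KB block:  ~3.74 years (3 years 9 months)
years_to_limit(12)    # 16KB block: ~7.5 years
years_to_limit(24)    # 32KB block: ~15 years
```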

Footnote:

Searching MoS for related topics (smon_scn_time flashback) I found doc ID: 1100993.1 from which we could possibly infer that the 1,440 rows was a fixed limit in 10g but that the number of rows allowed in smon_scn_time could increase in 11g if you enable automatic undo management. I also found a couple of bugs relating to index or cluster corruption – fixed by 11.2.0.4, though.


February 28, 2014

Empty Hash

Filed under: Bugs,CBO,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 6:45 pm BST Feb 28,2014

A little while ago I highlighted a special case with the MINUS operator (that one of the commentators extended to include the INTERSECT operator) relating to the way the second subquery would take place even if the first subquery produced no rows. I’ve since had an email from an Oracle employee letting me know that the developers looked at this case and decided that it wasn’t feasible to address it because – taking a wider view point – if the query were to run parallel they would need a mechanism that allowed some synchronisation between slaves so that every slave could find out that none of the slaves had received any rows from the first subquery, and this was going to lead to hanging problems.

The email reminded me that there’s another issue of the same kind that I discovered several years ago (I thought I’d written it up, but maybe it was on a newsgroup or forum somewhere – I can’t find it on my blog or old website). The problem can be demonstrated by this example:


create table t1
as
select
	rownum			id,
	mod(rownum,25)		rep_col,
	rpad('x',50)		padding
from
	all_objects
where
	rownum <= 3000
;

delete from t1 where rep_col = 12;

create table t2
as
select
	rownum			id,
	mod(rownum,25)		rep_col,
	rpad('x',50)		padding
from
	all_objects
where
	rownum <= 10000
;

-- collect stats, no histograms

select
	/*+
		leading(t1 t2)
		use_hash(t2)
		no_swap_join_inputs(t2)
		pq_distribute(t2, hash, hash)
	*/
	count(*)
from
	t1,t2
where
	t1.rep_col = 12
and	t2.id = t1.id
;

You’ll notice that I’ve created a data set for table t1 where the value 12 does not appear in column rep_col; so my query will return no rows. However, for the purposes of demonstration, I’ve hinted an execution path that will scan t1 and distribute it by hash, then scan t2 to distribute that by hash before doing the join. Here’s the plan – which I’ve generated with rowsource execution statistics enabled and pulled from memory after executing the query:


-------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name     | Starts | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |          |      1 |      1 |00:00:00.01 |       6 |      2 |
|   1 |  SORT AGGREGATE            |          |      1 |      1 |00:00:00.01 |       6 |      2 |
|   2 |   PX COORDINATOR           |          |      1 |      2 |00:00:00.01 |       6 |      2 |
|   3 |    PX SEND QC (RANDOM)     | :TQ10002 |      0 |      0 |00:00:00.01 |       0 |      0 |
|   4 |     SORT AGGREGATE         |          |      2 |      2 |00:00:00.01 |       0 |      0 |
|*  5 |      HASH JOIN             |          |      2 |      0 |00:00:00.01 |       0 |      0 |
|   6 |       JOIN FILTER CREATE   | :BF0000  |      2 |      0 |00:00:00.01 |       0 |      0 |
|   7 |        PX RECEIVE          |          |      2 |      0 |00:00:00.01 |       0 |      0 |
|   8 |         PX SEND HASH       | :TQ10000 |      0 |      0 |00:00:00.01 |       0 |      0 |
|   9 |          PX BLOCK ITERATOR |          |      2 |      0 |00:00:00.01 |      54 |     27 |
|* 10 |           TABLE ACCESS FULL| T1       |     27 |      0 |00:00:00.01 |      54 |     27 |
|  11 |       PX RECEIVE           |          |      0 |      0 |00:00:00.01 |       0 |      0 |
|  12 |        PX SEND HASH        | :TQ10001 |      0 |      0 |00:00:00.01 |       0 |      0 |
|  13 |         JOIN FILTER USE    | :BF0000  |      2 |      2 |00:00:00.01 |     116 |     87 |
|  14 |          PX BLOCK ITERATOR |          |      2 |      2 |00:00:00.01 |     116 |     87 |
|* 15 |           TABLE ACCESS FULL| T2       |     29 |      2 |00:00:00.01 |     116 |     87 |
-------------------------------------------------------------------------------------------------

Since this was 11.2.0.4 Oracle has used Bloom filtering to reduce the traffic between slave sets – but you can see that despite returning no rows from t1 (lines 7 – 10), Oracle still performed the parallel tablescan of t2 (lines 12 – 15). Thanks to the Bloom filter we don’t transfer 10,000 rows between slave sets, but we can see from the Buffers and Reads columns that we really did do the tablescan – and if we take a snapshot of instance activity we would have seen 10,000 rows fetched by tablescan at that point.

If I ran this query serially Oracle would stop after discovering that the first tablescan returned no rows – why bother scanning the probe table when the hash table is empty ? But because the query is running in parallel, a single slave that receives no data from the first tablescan cannot assume that every other slave in the same set has also received no data – there’s no cross-chat that allows the slaves to discover that every slave has no data – so the second scan goes ahead.
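The serial short-circuit is easy to picture: the probe scan only makes sense if the hash (build) table has at least one row. A toy model (not Oracle’s implementation – just the shape of the optimisation):

```python
def hash_join_count(build_rows, probe_table_scan):
    """Toy serial hash join: build a hash table from the first input;
    if it's empty, return without ever calling the probe-side scan."""
    hash_table = {}
    for key in build_rows:
        hash_table[key] = hash_table.get(key, 0) + 1
    if not hash_table:
        return 0                      # empty build side: skip the probe scan
    return sum(hash_table.get(k, 0) for k in probe_table_scan())

def t2_scan():
    # stand-in for the probe tablescan; a parallel slave can't know
    # whether its siblings also got zero build rows, so in parallel
    # this scan runs anyway
    return range(10000)

hash_join_count([], t2_scan)          # 0, and t2_scan is never called
```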

I was a little surprised by this when I first found it since I thought (from looking at some of the 1039x trace information) that the parallel slaves were telling the query coordinator how many rows they had acquired on each granule – which would allow the coordinator to spot the zero total. But it looks as if I was misinterpreting the trace.

On the plus side for this example – it’s probably the case that if zero is a possible volume of data returned by the query then the optimizer will have decided that it was going to get a “small” data set for the build table and therefore do a broadcast distribution – and if that happens the second tablescan won’t occur – as we see below (note the zero Reads, zero Buffers):


------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Starts | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |      1 |      1 |00:00:00.01 |       6 |      2 |
|   1 |  SORT AGGREGATE           |          |      1 |      1 |00:00:00.01 |       6 |      2 |
|   2 |   PX COORDINATOR          |          |      1 |      2 |00:00:00.01 |       6 |      2 |
|   3 |    PX SEND QC (RANDOM)    | :TQ10001 |      0 |      0 |00:00:00.01 |       0 |      0 |
|   4 |     SORT AGGREGATE        |          |      2 |      2 |00:00:00.01 |       0 |      0 |
|*  5 |      HASH JOIN            |          |      2 |      0 |00:00:00.01 |       0 |      0 |
|   6 |       PX RECEIVE          |          |      2 |      0 |00:00:00.01 |       0 |      0 |
|   7 |        PX SEND BROADCAST  | :TQ10000 |      0 |      0 |00:00:00.01 |       0 |      0 |
|   8 |         PX BLOCK ITERATOR |          |      2 |      0 |00:00:00.01 |      54 |     27 |
|*  9 |          TABLE ACCESS FULL| T1       |     27 |      0 |00:00:00.01 |      54 |     27 |
|  10 |       PX BLOCK ITERATOR   |          |      0 |      0 |00:00:00.01 |       0 |      0 |
|* 11 |        TABLE ACCESS FULL  | T2       |      0 |      0 |00:00:00.01 |       0 |      0 |
------------------------------------------------------------------------------------------------

Unfortunately the same type of argument can’t be used to dismiss the minus/intersect cases.


February 26, 2014

Predicate Order

Filed under: Bugs,CBO,Execution plans,Oracle — Jonathan Lewis @ 8:14 am BST Feb 26,2014

Common internet question: does the order of predicates in the where clause make a difference ?
General answer: it shouldn’t, but sometimes it will, thanks to defects in the optimizer.

There’s a nicely presented example on the OTN database forum where predicate order does matter (between 10.1.x.x and 11.1.0.7). Note particularly – there’s a script to recreate the issue; note, also, the significance of the predicate section of the execution plan.
It’s bug 6782665, fixed in 11.2.0.1.

February 25, 2014

FBI Skip Scan

Filed under: Bugs,Function based indexes,Indexing,Oracle — Jonathan Lewis @ 6:45 pm BST Feb 25,2014

A recent posting on the OTN database forum highlighted a bug (or defect, or limitation) in the way that the optimizer handles index skip scans with “function-based” indexes – it doesn’t do them. The defect has probably been around for a long time and demonstrates a common problem with testing Oracle – it’s very easy for errors in the slightly unusual cases to be missed; it also demonstrates a general principle that it can take some time for a (small) new feature to be applied consistently across the board.

The index definitions in the original posting included expressions like substr(nls_lower(colX), 1, 25), and it’s possible for all sorts of unexpected effects to appear when your code starts running into NLS  settings, so I’ve created a much simpler example. Here’s my table definition, with three index definitions:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	mod(rownum - 1,10)		mod1,
	rownum				id1,
	1 + mod(rownum - 1,10)		mod2,
	1 + rownum 			id2,
	lpad(rownum,7)			small_vc,
	rpad('x',100)			padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e5
;

create index t1_normal on t1(mod2, id2);
create index t1_split  on t1(mod2, id1 + 1);
create index t1_fbi    on t1(mod1 + 1, id1 + 1);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

Note that I gathered stats after creating the indexes (generally you would be safe creating the indexes after gathering the stats from 10g onwards) because I needed to gather stats on the virtual columns that support the function-based indexes. As you can see, mod2 = 1 + mod1, and id2 = 1 + id1, and I’ve created three indexes which (internally) are the same although they are defined in three different ways.

So here are three queries which select the same data although, again, the queries are not all exactly the same. Since the queries are doing the same things with structures that are the same we might hope to see the optimizer using the same strategy for all three. In fact, to make things easier for the optimizer, I’ve even told it exactly what to do in two of the cases:


select	*
from	t1
where
	id2 = 50000
;

select	/*+ index_ss(t1(mod2)) */
	*
from	t1
where
	id1 + 1 = 50000
;

select	/*+ index_ss(t1 t1_fbi) */
	*
from	t1
where
	id1 + 1 = 50000
;

In the first query I’ve referenced only the second column of the “normal” index with a high-precision query – expecting the optimizer to find the index skip scan path. In the second and third queries I’ve referenced the (id1 + 1) index expression which appears as the second column in two different indexes; to help the optimizer along I’ve hinted the index skip scan for each of the indexes in turn.

If the first query can do an index skip scan then the second and third queries should be able to do the same because the physical structures and the statistics (and the actual stored and requested values) are all the same. Here are the three plans (pulled from memory in an 11.2.0.4 instance).


SQL_ID  bvjstz9xa3n7a, child number 0
-------------------------------------
select * from t1 where  id2 = 50000

Plan hash value: 2078113469

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |    12 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |     1 |   125 |    12   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | T1_NORMAL |     1 |       |    11   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID2"=50000)
       filter("ID2"=50000)

SQL_ID  95573qx3w62q2, child number 0
-------------------------------------
select /*+ index_ss(t1(mod2)) */  * from t1 where  id1 + 1 = 50000

Plan hash value: 1422030023

----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |       |       |   271 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1       |     1 |   130 |   271   (2)| 00:00:02 |
|*  2 |   INDEX FULL SCAN           | T1_SPLIT |     1 |       |   270   (2)| 00:00:02 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."SYS_NC00007$"=50000)
       filter("T1"."SYS_NC00007$"=50000)

SQL_ID  d4t48va4zrk1q, child number 0
-------------------------------------
select /*+ index_ss(t1 t1_fbi) */  * from t1 where  id1 + 1 = 50000

Plan hash value: 3151334857

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |   271 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |     1 |   130 |   271   (2)| 00:00:02 |
|*  2 |   INDEX FULL SCAN           | T1_FBI |     1 |       |   270   (2)| 00:00:02 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."SYS_NC00007$"=50000)
       filter("T1"."SYS_NC00007$"=50000)

The first query automatically does an index skip scan on the normal index.

The second and third queries calculate the cardinality correctly, reference the right “virtual” column correctly, and come close to obeying the hint by using the required index – and then spoil things by doing an index full scan instead of an index skip scan. Checking the 10053 trace file I found that the optimizer hadn’t even considered the possibility of a skip scan for the last two queries – it had jumped straight to the full scan. (If the optimizer had ignored the hint completely I would have said that this behaviour was simply a limitation of FBIs – but since the optimizer has got partway there I think it’s probably a bug.)
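For anyone who wants to repeat the check, this is (roughly) how I generate a 10053 trace for a single statement – the tracefile_identifier is just a convenience to make the trace file easy to find in the trace directory:

alter session set tracefile_identifier = 'skip_scan';
alter session set events '10053 trace name context forever, level 1';

explain plan for
select	/*+ index_ss(t1 t1_fbi) */
	*
from	t1
where
	id1 + 1 = 50000
;

alter session set events '10053 trace name context off';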

Once you’ve got this far, of course, you might wonder if you could work around the problem in 11g by using virtual columns – so that’s the next test.

drop index t1_split;
drop index t1_fbi;

alter table t1 add (mod_virtual generated always as (mod1 + 1) virtual);
alter table t1 add (id_virtual generated always as (id1 + 1) virtual);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

create index t1_virtual on t1(mod_virtual, id_virtual);

select
	/*+ index_ss(t1) */
	*
from	t1
where
	id_virtual = 50000
;

I started by dropping the two function-based indexes (because I’m about to create “proper” virtual columns which are identical to the hidden “virtual columns” that were supporting the function-based indexes – if I didn’t do this I’d hit Oracle error “ORA-54015: Duplicate column expression was specified”). Then I declared my virtual columns, collected stats on the whole table (I could have used method_opt=>’for all hidden columns size 1′) so that there would be stats on those columns, then created an index using the virtual columns and tried to get a skip scan on that index. Any bets?

select  /*+ index_ss(t1) */  * from t1 where  id_virtual = 50000

Plan hash value: 4250443541

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |       |       |   271 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1         |     1 |   133 |   271   (2)| 00:00:02 |
|*  2 |   INDEX FULL SCAN           | T1_VIRTUAL |     1 |       |   270   (2)| 00:00:02 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID_VIRTUAL"=50000)
       filter("ID_VIRTUAL"=50000)

Bad luck – even though “proper” virtual columns can do wonders, they don’t get around this problem. Anyone who’s used multi-column function-based indexes or added virtual columns to indexes might want to look at their code and execution paths to see if there are opportunities for skip scans that the optimizer is failing to take. If there are you might like to raise an SR with Oracle – and point them to this blog as a repeatable test case.
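As a starting point for that check, a query like this (just a sketch) lists the expressions behind any function-based indexes in your schema, so you can pick out the multi-column ones that might be candidates for missed skip scans:

select
	e.index_name, e.column_position, e.column_expression
from
	user_indexes		i,
	user_ind_expressions	e
where
	e.index_name = i.index_name
and	i.index_type like 'FUNCTION-BASED%'
order by
	e.index_name, e.column_position
;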

I tried several searches for a relevant bug on MoS – unfortunately it was one of those occasions where the search result was either very small and didn’t have what I wanted, or was so large that I wasn’t going to check all the hits. If anyone does find a bug number, please tell us about it in the comments.

 

February 21, 2014

Index Compression – aargh

Filed under: Bugs,compression,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 7:57 am BST Feb 21,2014

The problem with telling people that some feature of Oracle is a “good thing” is that some of those people will go ahead and use it; and if enough people use it some of them will discover a hitherto undiscovered defect. Almost inevitably the bug will turn out to be one of those “combinations” bugs that leaves you thinking: “Why the {insert preferred expression of disbelief here} should {feature X} have anything to do with {feature Y}”.

Here – based on index compression, as you may have guessed from the title – is one such bug. I got it first on 11.1.0.7, but it’s still there on 11.2.0.4 and 12.1.0.1.


create table t1 (
	id1	number not null,
	id2	number not null,
	n1	number,
	v1	varchar2(10)
)
rowdependencies
;

alter table t1 add constraint t1_pk primary key(id1, id2) using index (
	create index t1_pk on t1(id1, id2) compress 1
);

create table t2(
	id1	number not null,
	id2	number not null,
	id3	number not null,
	n1	number,
	v1	varchar2(10)
)
rowdependencies
;

alter table t2 add constraint t2_fk_t1 foreign key(id1, id2) references t1;

It’s quite simple – I’ve got a multi-column primary key, and it’s worth compressing on the first column because that column is fairly repetitive. Then I’ve created another table that has a foreign key constraint referencing my first table. Because I’ve got some replication going on and want to enable parallelism I’ve enabled rowdependencies all over the place. So let’s insert a couple of rows and see what happens – the next bit of text is cut-n-pasted from an 11.2.0.4 SQL*Plus session running a script after a call to set echo on:


SQL> insert into t1 values(1,1,1,'x');

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> insert into t2 values(1,1,1,1,'x');
insert into t2 values(1,1,1,1,'x')
            *
ERROR at line 1:
ORA-00600: internal error code, arguments: [25027], [5], [394178], [], [], [], [], [], [], [], [], []

For further details, and before you get completely thrilled at the possibility of compressing lots of indexes, keep an eye on:

Bug 18125878 : ORA-600 [25027] ON INSERT IF TABLE HAS ROWDEPENDENCIES AND COMPRESS ON PK

“Inserting into a table with a foreign key where the base table has a primary key using index key compression and the table also has row dependencies enabled. Stack will include kdsgrds and kdiexi0 (in 12) / kdiexi (in 11, 10)”

As implied by that note from the bug, it also affects 10g.  The bug note reports it as fixed in 12.2.
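If you can’t apply the patch, the obvious (if unattractive) workaround is to take the compression off the primary key index – this is only a sketch, of course, and it gives back the space saving that prompted the compression in the first place, so you’d want to re-run the failing insert to confirm the ORA-600 disappears on your version:

alter index t1_pk rebuild nocompress;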

January 26, 2014

Pagination

Filed under: Bugs,Indexing,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 12:08 pm BST Jan 26,2014

I was involved in a thread on Oracle-L recently that started with the question: “How many LIOs is too many LIOs”. Rather than rewrite the whole story, I’ve supplied a list of links to the contributions I made, in order – the final “answer” is actually the answer to a different question – but it travels an interesting path to get there.

I’ve got a script to emulate the requirement so that people can see for themselves the bug that I mention in post 15; I’ll try to add a couple of notes to it and publish it some time, but for the moment I’ll just remind myself that it’s called (slightly counter-intuitively) no_sort_problem.sql.

January 3, 2014

Index Hash

Filed under: Bugs,CBO,Hints,Ignoring Hints,Index Joins,Indexing,Oracle — Jonathan Lewis @ 6:56 pm BST Jan 3,2014

I’m afraid this is one of my bad puns again – an example of the optimizer making a real hash of the index hash join. I’m going to create a table with several indexes (some of them rather similar to each other) and execute a query that should do an index join between the obvious two indexes. To show how obvious the join should be I’m going to start with a couple of queries that show the cost of simple index fast full scans.

Here’s the data generating code:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 10000
)
select
	rownum			id,
	mod(rownum-1,20)	flag,
	lpad(rownum,10,'0')	v10,
	lpad(rownum,20,'0')	v20,
	lpad(rownum,30,'0')	v30,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 100000
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		estimate_percent => 100,
		method_opt	 => 'for all columns size 1'
	);
end;
/
alter table t1 add constraint t1_pk primary key(id)
	using index (create index t1_pk on t1(id))
;

create index t1_flag on t1(flag);

create index t1_ia on t1(id, v20);
create index t1_ib on t1(id, v10);
create index t1_ic on t1(id, v30);

select
	index_name, leaf_blocks
from
	user_indexes
where
	table_name = 'T1'
order by
	index_name
;

/*  output from the query */
/*
INDEX_NAME           LEAF_BLOCKS
-------------------- -----------
T1_FLAG                      195
T1_IA                        515
T1_IB                        375
T1_IC                        657
T1_PK                        222

*/

Given the definitions of the primary key index and the three indexes that start with the ID column, their relative sizes shouldn’t surprise you. The cost of an index fast full scan on these indexes will depend on your parameter settings and values for system stats; here are the figures (from autotrace) on one system running 12.1 – the behaviour is consistent across several versions:


select /*+ index_ffs(t1 t1_pk) */ count(*) from t1;
select /*+ index_ffs(t1 t1_flag) */ count(*) from t1 where flag is not null;
select /*+ index_ffs(t1 t1_ia) */ count(*) from t1;
select /*+ index_ffs(t1 t1_ib) */ count(*) from t1;
select /*+ index_ffs(t1 t1_ic) */ count(*) from t1;

-- with autotrace results:

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |    63   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_PK |   100K|    63   (2)| 00:00:01 |
-----------------------------------------------------------------------

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |     3 |    56   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE       |         |     1 |     3 |            |          |
|*  2 |   INDEX FAST FULL SCAN| T1_FLAG |   100K|   292K|    56   (2)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG" IS NOT NULL)

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |   142   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_IA |   100K|   142   (1)| 00:00:01 |
-----------------------------------------------------------------------

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |   104   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_IB |   100K|   104   (1)| 00:00:01 |
-----------------------------------------------------------------------

-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |   181   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| T1_IC |   100K|   181   (1)| 00:00:01 |
-----------------------------------------------------------------------

If you compare the different costs of the fast full scans they’re consistent with the different sizes (leaf_blocks) of the indexes; so you might expect the following query to do either a tablescan or an index join between the t1_flag index and the t1_pk index (which is the smallest candidate index to find the id column):


select	sum(id)
from
	t1
where
	flag = 0
;

But here’s the plan I got:


--------------------------------------------------------------------------------------------
| Id  | Operation               | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                  |     1 |     8 |   528   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE         |                  |     1 |     8 |            |          |
|*  2 |   VIEW                  | index$_join$_001 |  5000 | 40000 |   528   (1)| 00:00:01 |
|*  3 |    HASH JOIN            |                  |       |       |            |          |
|*  4 |     INDEX RANGE SCAN    | T1_FLAG          |  5000 | 40000 |    10   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN| T1_IA            |  5000 | 40000 |   646   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=0)
   3 - access(ROWID=ROWID)
   4 - access("FLAG"=0)

Four things to notice:

  1. The optimizer has picked the wrong index
  2. The fast full scan of t1_ia is 646 in this plan when (on its own) it was only 142
  3. The cost of the whole query is less than the cost of one of the lines
  4. The index chosen looks as if it might have been selected on the basis of alphabetical order

Oops.

Fortunately, of course, we can always add hints to get the right plan – so let’s try this – and this time the plan is what I got by using explain plan followed by a call to dbms_xplan() with the ‘outline’ option:

explain plan for
select
	/*+
		qb_name(main)
		index_join(@main t1 t1_flag t1_pk)
	*/
	sum(id)
from
	t1
where
	flag = 0
;

select * from table(dbms_xplan.display(null,null,'outline alias'));
--------------------------------------------------------------------------------------------
| Id  | Operation               | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                  |     1 |     8 |   528   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE         |                  |     1 |     8 |            |          |
|*  2 |   VIEW                  | index$_join$_001 |  5000 | 40000 |   528   (1)| 00:00:01 |
|*  3 |    HASH JOIN            |                  |       |       |            |          |
|*  4 |     INDEX RANGE SCAN    | T1_FLAG          |  5000 | 40000 |    10   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN| T1_IA            |  5000 | 40000 |   646   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - MAIN
   2 - SEL$998059AF / T1@MAIN
   3 - SEL$998059AF
   4 - SEL$998059AF / indexjoin$_alias$_001@SEL$998059AF
   5 - SEL$998059AF / indexjoin$_alias$_002@SEL$998059AF

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      INDEX_JOIN(@"MAIN" "T1"@"MAIN" ("T1"."FLAG") ("T1"."ID" "T1"."V20"))
      OUTLINE(@"MAIN")
      OUTLINE_LEAF(@"MAIN")
      OUTLINE_LEAF(@"SEL$998059AF")
      ALL_ROWS
      DB_VERSION('12.1.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Ouch – the optimizer has ignored the hint and is still using the wrong index.
Here’s something really odd, though – and I’ll get around to looking at the 10053 eventually – let’s add an (undocumented) outline_leaf() hint to the query, a hint that is already in the Outline Data:

explain plan for
select
	/*+
		qb_name(main)
		outline_leaf(@main)
		index_join(@main t1 t1_flag t1_pk)
	*/
	sum(id)
from
	t1
where
	flag = 0
;

select * from table(dbms_xplan.display(null,null,'outline alias'));
--------------------------------------------------------------------------------------------
| Id  | Operation               | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                  |     1 |     8 |   235   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE         |                  |     1 |     8 |            |          |
|*  2 |   VIEW                  | index$_join$_001 |  5000 | 40000 |   235   (1)| 00:00:01 |
|*  3 |    HASH JOIN            |                  |       |       |            |          |
|*  4 |     INDEX RANGE SCAN    | T1_FLAG          |  5000 | 40000 |    10   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN| T1_PK            |  5000 | 40000 |   280   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - MAIN
   2 - SEL$998059AF / T1@MAIN
   3 - SEL$998059AF
   4 - SEL$998059AF / indexjoin$_alias$_001@SEL$998059AF
   5 - SEL$998059AF / indexjoin$_alias$_002@SEL$998059AF

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      INDEX_JOIN(@"MAIN" "T1"@"MAIN" ("T1"."FLAG") ("T1"."ID"))
      OUTLINE(@"MAIN")
      OUTLINE_LEAF(@"MAIN")
      OUTLINE_LEAF(@"SEL$998059AF")
      ALL_ROWS
      DB_VERSION('12.1.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("FLAG"=0)
   3 - access(ROWID=ROWID)
   4 - access("FLAG"=0)

We get the plan we want, and it’s cheaper than the default one. It does still suffer from the problem that the cost of the fast full scan is larger than it should be (it seems to be the cost of an index fast full scan plus the cost of an index full scan) and the cost of the whole plan is still less than the cost of that one line.

There have been a number of cases where I’ve thought that the optimizer hasn’t chosen an index join when it was a sensible choice – this is probably one of the reasons why.

November 3, 2013

Not Pushing

Filed under: Bugs,Execution plans,Oracle,subqueries — Jonathan Lewis @ 5:24 pm BST Nov 3,2013

Here’s an odd little optimizer glitch – probably irrelevant to most people, but an indication of the apparent randomness that appears as you combine features. I’ve created an example which is so tiny that the only explanation I can come up with for the optimizer not “behaving properly” is that I’ve found an undocumented restriction relating to a particular feature.

Here’s the basic schema structure with query and execution plan – there’s nothing particularly significant about the object definitions – they’re just a couple of (reduced) structures from a client site I visited a few years ago:

drop view v_item;
drop table part;
drop table item;

create table item (
	item_id number not null,
	company number not null,
	part_id number not null,
	quantity number
);

alter table item add constraint item_pk primary key (company,part_id);

create table part(
	part_id number not null,
	company number not null,
	part_name varchar2(30) not null
);

create unique index part_idx on part(company,part_name);

alter table part
	add constraint part_fk foreign key (company, part_id)
	references item (company, part_id)
;

create or replace view v_item
as
select * from item
;

begin
	dbms_stats.gather_table_stats(user, 'part', method_opt=>'for all columns size 1');
	dbms_stats.gather_table_stats(user, 'item', method_opt=>'for all columns size 1');

end;
/

explain plan for
update v_item i
   set quantity = 0
 where i.company = to_number(:b1)
   and i.part_id = (select /*+ push_subq */
                           part_id
                      from part p
                     where part_name=:b2
                       and p.company=i.company
                   )
;

select * from table(dbms_xplan.display);

-- Initial Plan
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT              |          |     1 |    39 |     0   (0)| 00:00:01 |
|   1 |  UPDATE                       | ITEM     |       |       |            |          |
|*  2 |   INDEX RANGE SCAN            | ITEM_PK  |     1 |    39 |     0   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| PART     |     1 |    43 |     0   (0)| 00:00:01 |
|*  4 |     INDEX UNIQUE SCAN         | PART_IDX |     1 |       |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("COMPANY"=TO_NUMBER(:B1))
       filter("PART_ID"= (SELECT "PART_ID" FROM "PART" "P" WHERE "PART_NAME"=:B2
              AND "P"."COMPANY"=:B1))
   4 - access("P"."COMPANY"=:B1 AND "PART_NAME"=:B2)

You can see from this plan that Oracle has used simple view merging to eliminate the view v_item and has then “pushed” the subquery to a point where it can be applied at the earliest possible moment – which is as Oracle reads an index entry from item_pk and before it visits the table. (This plan came from 11.1, and if you printed the outline information you would see that it includes a push_subq() hint.)

Now let’s make a little change to the object definition – and see what happens to the execution plan (two versions – the second one includes a hint):


create or replace trigger v_trig
instead of insert or update or delete on v_item
begin
	null;
end;
/

-- New plan without hinting:
-------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT               |          |     1 |    39 |     0   (0)| 00:00:01 |
|   1 |  UPDATE                        | V_ITEM   |       |       |            |          |
|*  2 |   FILTER                       |          |       |       |            |          |
|   3 |    VIEW                        | V_ITEM   |     1 |    39 |     0   (0)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| ITEM     |     1 |    39 |     0   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | ITEM_PK  |     1 |       |     0   (0)| 00:00:01 |
|   6 |    TABLE ACCESS BY INDEX ROWID | PART     |     1 |    43 |     0   (0)| 00:00:01 |
|*  7 |     INDEX UNIQUE SCAN          | PART_IDX |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("I"."PART_ID"= (SELECT "PART_ID" FROM "PART" "P" WHERE
              "PART_NAME"=:B2 AND "P"."COMPANY"=:B1))
   5 - access("COMPANY"=TO_NUMBER(:B1))
   7 - access("P"."COMPANY"=:B1 AND "PART_NAME"=:B2)

-- New plan with a push_subq hint in the subquery:
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT              |          |     1 |    39 |     0   (0)| 00:00:01 |
|   1 |  UPDATE                       | V_ITEM   |       |       |            |          |
|*  2 |   VIEW                        | V_ITEM   |     1 |    39 |     0   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| ITEM     |     1 |    39 |     0   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | ITEM_PK  |     1 |       |     0   (0)| 00:00:01 |
|   5 |    TABLE ACCESS BY INDEX ROWID| PART     |     1 |    43 |     0   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN         | PART_IDX |     1 |       |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("I"."PART_ID"= (SELECT /*+ PUSH_SUBQ */ "PART_ID" FROM "PART" "P"
              WHERE "PART_NAME"=:B2 AND "P"."COMPANY"=:B1))
   4 - access("COMPANY"=TO_NUMBER(:B1))
   6 - access("P"."COMPANY"=:B1 AND "PART_NAME"=:B2)

With the existence of the trivial “instead of” trigger on the view, view merging no longer takes place; moreover, with view merging blocked, subquery pushing doesn’t happen unless hinted – the subquery acts as a filter after Oracle starts generating the view rowsource.

Even when we use the push_subq hint in the subquery – which clearly has some effect, as the filter is now applied “inside” the view – the subquery isn’t applied during the index range scan; it is only applied after we visit the table.

Conclusion

I haven’t done any sort of exhaustive analysis of this example (a large data set, and checks on rowsource execution statistics would be the next stage), but it does look as if the presence of an “instead of” trigger blocks some query transformations that look (at first sight, at least) as if they should be legal.

If you’re using updateable views that depend on triggers and don’t see the optimizer taking the path you expect, you might want to try testing the query on a copy schema with the trigger removed before you spend too much time trying to work out how to change the execution path. The case that particularly springs to mind (though I’ve rarely seen the feature in use) is “Object Views”, where a view may be used with triggers to translate between the relational database and an object-oriented programming environment.

 

 

October 18, 2013

Deferrable RI

Filed under: Bugs,Execution plans,Oracle — Jonathan Lewis @ 6:08 pm BST Oct 18,2013

Here’s a lovely little example that just came up on the OTN database forum of how things break when features collide. It’s a bug (I haven’t looked for the number) that seems to be fixed in 12.1.0.1. All it takes is a deferrable foreign key and an outer join. I’ve changed the table and column names from the original, and limited the deferability to just the foreign key:


create table parent(id_p date primary key);
create table child(id_c date not null references parent(id_p) deferrable);
 
alter session set constraints = deferred;
 
insert into child values(sysdate);
insert into child values(sysdate);
insert into child values(sysdate);
 
select	
	par.id_p, chi.id_c 
from
	child chi 
left join 
	parent par 
on 	par.id_p = chi.id_c
where	par.id_p is null 
and	chi.id_c is not null
;

select
	chi.id_c 
from
	child chi 
left join 
	parent par 
on	par.id_p = chi.id_c
where	par.id_p is null 
and	chi.id_c is not null
;

You’ll notice that the difference between the two queries is that the first one selects columns from both the parent and child tables; the second selects only from the child. Since the join is across a parent/child referential integrity constraint, and the primary key is a single column, and no columns from the parent appear in the select list, the optimizer is able to invoke “table elimination” in the second case – except that it shouldn’t because the side effect is to produce the wrong answer. Here are the two sets of results when running 11.2.0.4:

ID_P      ID_C
--------- ---------
          18-OCT-13
          18-OCT-13
          18-OCT-13

3 rows selected.

no rows selected

In 12.1.0.1 both queries produce the same (first) set of results.

In the second query 11g (incorrectly) removes the join to the parent table and replaces it with the predicate “id_c is null” (since the only effect of the join would normally be to eliminate any rows where id_c is null). This predicate is then combined with the existing “id_c is not null” predicate to produce “null is not null” – which is why we get no rows returned.
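You can see the effect of that contradictory predicate in isolation – a filter of “null is not null” can never evaluate to true, so Oracle never returns a row:

```sql
rem
rem  The generated filter predicate, applied on its own:
rem  "null is not null" is never true, so no rows appear.
rem

select	'x'
from	dual
where	null is not null
;

-- no rows selected
```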

The problem, of course, is that the removal and substitution are only valid if the constraint is known to be satisfied – and at this point we have deferred the constraint and got some bad data into the child table, so Oracle should not do table elimination. (As a side note, this is why you do not see table elimination occurring when the critical constraints are in the “enable novalidate” state: the tables may contain invalid data, which means the predicate substitution could change the result.)
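If you want to check whether a constraint is in a state that ought to permit this sort of transformation, the deferrability and validation details are exposed in user_constraints – a quick sketch of the kind of query I mean:

```sql
rem
rem  Check the state of the constraints on the child table:
rem  deferrable/deferred and status/validated are the columns
rem  that matter for join elimination.
rem

select	constraint_name, constraint_type,
	deferrable, deferred,
	status, validated
from	user_constraints
where	table_name = 'CHILD'
;
```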

Here are the two sets of plans, first 11.2.0.4


------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              |    82 |  1476 |     2   (0)| 00:00:01 |
|*  1 |  FILTER             |              |       |       |            |          |
|   2 |   NESTED LOOPS OUTER|              |    82 |  1476 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| CHILD        |    82 |   738 |     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN| SYS_C0010228 |     1 |     9 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("PAR"."ID_P" IS NULL)
   4 - access("PAR"."ID_P"(+)="CHI"."ID_C")



----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |     9 |     0   (0)|          |
|*  1 |  FILTER            |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL| CHILD |    82 |   738 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(NULL IS NOT NULL)


And now 12.1.0.1


------------------------------------------------------------------------------------
| Id  | Operation           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |              |    82 |  1476 |     2   (0)| 00:00:01 |
|*  1 |  FILTER             |              |       |       |            |          |
|   2 |   NESTED LOOPS OUTER|              |    82 |  1476 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| CHILD        |    82 |   738 |     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN| SYS_C0011569 |     1 |     9 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("PAR"."ID_P" IS NULL)
   4 - access("PAR"."ID_P"(+)="CHI"."ID_C")


-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |    82 |  1476 |     2   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI |              |    82 |  1476 |     2   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| CHILD        |    82 |   738 |     2   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| SYS_C0011569 |     1 |     9 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PAR"."ID_P"="CHI"."ID_C")

It’s interesting to note that 12c is able to convert the second query into an anti join (in other words it has changed an outer join to a (not exists) subquery, and then transformed it back into a different type of join).
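In effect (this is my reconstruction of the transformation, not something taken from an optimizer trace) 12c has treated the second query as if it had been written as a not exists subquery, which it could then unnest into the anti join we see in the plan:

```sql
rem
rem  My hand-written "not exists" equivalent of the second
rem  query - the shape the anti join corresponds to.
rem

select	chi.id_c
from	child chi
where	chi.id_c is not null
and	not exists (
		select	null
		from	parent par
		where	par.id_p = chi.id_c
	)
;
```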

One little aside – my first thought about the error was that it might be a side effect of the ANSI join syntax and something the optimizer was messing up in the transformation to “traditional” syntax, so I repeated the test using traditional Oracle syntax, and the problem persists.
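For completeness, this is the sort of traditional-syntax rewrite I mean (my version of the second query, using Oracle’s (+) outer join notation):

```sql
rem
rem  The second query rewritten in traditional Oracle
rem  outer-join syntax - it shows the same wrong result in 11g.
rem

select	chi.id_c
from	child chi, parent par
where	par.id_p(+) = chi.id_c
and	par.id_p is null
and	chi.id_c is not null
;
```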

Update 20/10/2013:

Corrected thanks to a twitter comment, I had swapped the ‘id_c is null’ and ‘id_c is not null’ in my explanation of the error.

October 16, 2013

Hash Clusters – 3

Filed under: 12c,Bugs,dbms_xplan,Oracle,Upgrades — Jonathan Lewis @ 1:03 pm BST Oct 16,2013

This note is a quick summary of an oddity that came to light after a twitter conversation with Christian Antognini yesterday. First a little test script to get things going:


set serveroutput off

define m_keys = 10000
define m_keys = 60

create cluster hash_cluster(
	 id number(5)
)
single table
hashkeys &m_keys
hash is id
size 120
;

create table hash_table (
	id                      number(5),
		constraint ht_pk primary key(id),
	owner                   varchar2(30),
	object_name             varchar2(128),
	object_id               number,
	object_type             varchar2(18),
	created                 date,
	last_ddl_time           date,
	timestamp               varchar2(19),
	status                  varchar2(7)
) cluster hash_cluster(id)
;

insert into hash_table
select
	rownum id, owner, object_name, rownum,
	object_type, created, last_ddl_time,
	timestamp, status
from
	all_objects
where
	rownum <= &m_keys
;

commit;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 => 'hash_table',
		method_opt	 => 'for all columns size 1',
		cascade		 => true
	);
end;
/

--
--  Now we can run a query and pull the execution plan from memory and
--  because of an oddity I noticed the first time around, we'll also do
--  an explain plan on it to see what Oracle puts in the plan_table
--

select	owner, object_type, object_name
from	hash_table
where	id = 50
;

select * from table(dbms_xplan.display_cursor);

explain plan set statement_id = 'Pre-stats' for
select	owner, object_type, object_name
from	hash_table
where	id = 50
;

--
--  Now we'll use the analyse command (you know, the one that you're
--  not supposed to use to collect statistics) to collect statistics
--  and see if anything changes.
--

analyze cluster hash_cluster compute statistics;

select	owner, object_type, object_name
from	hash_table
where	id = 50
;

select * from table(dbms_xplan.display_cursor);

explain plan set statement_id = 'Post-stats' for
select	owner, object_type, object_name
from	hash_table
where	id = 50
;

--
--  And here are the results from the plan_table for the two
--  runs of the test: showing just the cost-related columns
--

break on statement_id skip 1
select statement_id, operation, cost, io_cost, cpu_cost from plan_table;

If you compare the two sets of results then, after collecting cluster stats (analyse cluster), you’ll see that the optimizer reports the cost of getting the row as 1, which is an I/O cost of one block visit (because that’s the way we hope a single table hash cluster will work) plus a tiny CPU cost:

SQL_ID  5xrn2dfr7dbyv, child number 0
-------------------------------------
select owner, object_type, object_name from hash_table where id = 50

Plan hash value: 4047687169

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |     1 (100)|          |
|*  1 |  TABLE ACCESS HASH| HASH_TABLE |     1 |    19 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=50)

STATEMENT_ID                   OPERATION                            COST    IO_COST   CPU_COST
------------------------------ ------------------------------ ---------- ---------- ----------
Post-stats                     SELECT STATEMENT                        1          1       7121
                               TABLE ACCESS                            1          1       7121

But what does Oracle tell us if we don’t analyse the cluster ?

SQL_ID  5xrn2dfr7dbyv, child number 0
-------------------------------------
select owner, object_type, object_name from hash_table where id = 50

Plan hash value: 4047687169

----------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost  |
----------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |     1 |
|*  1 |  TABLE ACCESS HASH| HASH_TABLE |     1 |    23 |       |
----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ID"=50)

Note
-----
   - cpu costing is off (consider enabling it)

STATEMENT_ID                   OPERATION                            COST    IO_COST   CPU_COST
------------------------------ ------------------------------ ---------- ---------- ----------
Pre-stats                      SELECT STATEMENT                        0          0          0
                               TABLE ACCESS

Despite the absence of a cost on the table access line, the statement as a whole is given a cost when we call dbms_xplan – but we’re told that we haven’t enabled CPU costing (when, actually, take my word for it, we have). Looking in the plan_table (the results are the same if we look in v$sql_plan) we can see nulls for the table access costs and zeros for the statement cost – the observation that cpu_cost = 0 may be the reason we see the note about CPU costing.

So there are some anomalies going on in the costing of table access by hash cluster. I suspect that very few people actually call analyse cluster, and I’m not going to try to predict what side effects may appear in more complex statements if you introduce cluster statistics; but I will say it’s worth checking carefully any code that has anything to do with clustered (or, at least, hash clustered) tables every time you upgrade, in case this particular anomaly suddenly disappears, the cost of hash table access increases, and execution plans change as a consequence.
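If you want to check whether anyone has collected cluster statistics on your system, one quick test (I believe avg_blocks_per_key is populated only by the analyze command, so it stays null until the cluster has been analysed) is a query like:

```sql
rem
rem  avg_blocks_per_key appears after "analyze cluster ...
rem  compute statistics"; a null here suggests no cluster stats.
rem

select	cluster_name, cluster_type,
	hashkeys, avg_blocks_per_key
from	user_clusters
where	cluster_name = 'HASH_CLUSTER'
;
```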

 
