Oracle Scratchpad

February 16, 2017

Truncate 12c

Filed under: 12c,Infrastructure,Oracle — Jonathan Lewis @ 12:52 pm BST Feb 16,2017

Here’s one of those little improvements in 12c (including 12.1) that will probably end up being described as “little known features” in about 3 years time. Arguably it’s one of those little things that no-one should care about because it’s not the sort of thing you should do on a production system, but that doesn’t mean it won’t be seen in the wild.

Rather than simply state the feature I’m going to demonstrate it, starting with a little code to build a couple of tables with referential integrity:


create table parent (
        id      number(4),
        name    varchar2(10),
        constraint par_pk primary key (id)
)
;

create table child(
        id_p    number(4)
                        constraint chi_fk_par
                        references parent
                        on delete cascade,
        id      number(4),
        name    varchar2(10),
        constraint chi_pk primary key (id_p, id)
)
;

insert into parent values (1,'Smith');
insert into parent values (2,'Jones');

insert into child values(1,1,'Sally');
insert into child values(1,2,'Simon');

insert into child values(2,1,'Jack');
insert into child values(2,2,'Jill');

commit;


There’s one important detail in this code that isn’t taking the default and isn’t used very frequently – it’s the option on the foreign key to take the action “on delete cascade”. If you delete a row from the parent table then Oracle will automatically delete any referenced rows from the child table first thus avoiding the error ORA-02292: integrity constraint (TEST_USER.CHI_FK_PAR) violated – child record found. (Conveniently I have a suitable index on the child table that will bypass the problem of a mode 4 (or, where child rows already exist, mode 5) TM lock being taken on the child as the parent row is deleted.)

And here’s the demonstration of the new feature – working in 12.1 onwards:


truncate table parent;

truncate table parent cascade;

The first command will raise Oracle error ORA-02266: unique/primary keys in table referenced by enabled foreign keys, but the second command will truncate the parent and child tables “simultaneously”: but only if the referential integrity constraint is set to “on delete cascade”. If the referential integrity constraint is left to its default action then the second command will raise error: ORA-14705: unique or primary keys referenced by enabled foreign keys in table “TEST_USER”.”CHILD”

This feature (and several broadly similar features) also works with matching partitions of equi-partitioned (or ref partitioned) tables – and that’s a context where the requirement  is much more likely to appear than with non-partitioned tables.

 

February 13, 2017

Band Join 12c

Filed under: 12c,Execution plans,Oracle,Performance,Upgrades — Jonathan Lewis @ 1:53 pm BST Feb 13,2017

One of the optimizer enhancements that appeared in 12.2 for SQL is the “band join”. that makes certain types of merge join much more  efficient.  Consider the following query (I’ll supply the SQL to create the demonstration at the end of the posting) which joins two tables of 10,000 rows each using a “between” predicate on a column which (just to make it easy to understand the size of the result set)  happens to be unique with sequential values though there’s no index or constraint in place:

select
        t1.v1, t2.v1
from
        t1, t2
where
        t2.id between t1.id - 1
                  and t1.id + 2
;

This query returns nearly 40,000 rows. Except for the values at the extreme ends of the range each of the 10,000 rows in t2 will join to 4 rows in t1 thanks to the simple sequential nature of the data. In 12.2 the query, with rowsource execution stats enabled, completed in 1.48 seconds. In 12.1.0.2 the query, with rowsource execution stats OFF, took a little over 14 seconds. (With rowsource execution stats enabled it took 12.1.0.2 a little over 1 minute to return the first 5% of the data – I didn’t bother to wait for the rest, though the rate would have improved over time.)

Here are the two execution plans – spot the critical difference:


12.1.0.2
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |    25M|   715M|  1058  (96)| 00:00:01 |
|   1 |  MERGE JOIN          |      |    25M|   715M|  1058  (96)| 00:00:01 |
|   2 |   SORT JOIN          |      | 10000 |   146K|    29  (11)| 00:00:01 |
|   3 |    TABLE ACCESS FULL | T1   | 10000 |   146K|    27   (4)| 00:00:01 |
|*  4 |   FILTER             |      |       |       |            |          |
|*  5 |    SORT JOIN         |      | 10000 |   146K|    29  (11)| 00:00:01 |
|   6 |     TABLE ACCESS FULL| T2   | 10000 |   146K|    27   (4)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("T2"."ID"<="T1"."ID"+2)   -- > had to add GT here to stop WordPress spoiling the format 
   5 - access("T2"."ID">="T1"."ID"-1)
       filter("T2"."ID">="T1"."ID"-1)

12.2.0.1
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 40000 |  1171K|    54  (12)| 00:00:01 |
|   1 |  MERGE JOIN         |      | 40000 |  1171K|    54  (12)| 00:00:01 |
|   2 |   SORT JOIN         |      | 10000 |   146K|    27  (12)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T1   | 10000 |   146K|    25   (4)| 00:00:01 |
|*  4 |   SORT JOIN         |      | 10000 |   146K|    27  (12)| 00:00:01 |
|   5 |    TABLE ACCESS FULL| T2   | 10000 |   146K|    25   (4)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."ID">="T1"."ID"-1)
       filter("T2"."ID"<="T1"."ID"+2 AND "T2"."ID">="T1"."ID"-1)

Notice how operation 4, the FILTER, that appeared in 12.1 has disappeared in 12.2 and the filter predicate that it used to hold is now part of the filter predicate of the SORT JOIN that has been promoted to operation 4 in the new plan.

As a reminder – the MERGE JOIN operates as follows: for each row returned by the SORT JOIN at operation 2 it calls operation 4. In 12.1 this example will then call operation 5 so the SORT JOIN there happens 10,000 times. It’s important to know, though, that the name of the operation is misleading; what’s really happening is that Oracle is “probing a sorted result set in local memory” 10,000 times – it’s only on the first probe that it finds it has to call operation 6 to read and move the data into local memory in sorted order.

So in 12.1 operation 5 probes (accesses) the in-memory data set starting at the point where t2.id >= t1.id – 1; I believe there’s an optimisation here because Oracle will recall where it started the probe last time and resume searching from that point; having found the first point in the in-memory set where the access predicate it true Oracle will walk through the list passing each row back to the FILTER operation as long as the access predicate is still true, and it will be true right up until the end of the list. As each row arrives at the FILTER operation Oracle checks to see if the filter predicate there is true and passes the row up to the MERGE JOIN operation if it is. We know that on each cycle the FILTER operation will start returning false after receiving 4 rows from SORT JOIN operation – Oracle doesn’t.  On average the SORT JOIN operation will send 5,000 rows to the FILTER operation (for a total of 50,000,000 values passed and discarded).

In 12.2, and for the special case here where the join predicate uses constants to define the range, Oracle has re-engineered the code to eliminate the FILTER operation and to test both parts of the between clause in the same subroutine it uses to probe and scan the rowsource. In 12.2 the SORT JOIN operation will pass 4 rows up to the MERGE JOIN operation and stop scanning on the fifth row it reaches. In my examples that’s an enormous (CPU) saving in subroutine calls and redundant tests.

Footnote:

This “band-join” mechanism only applies when the range is defined by constants (whether literal or bind variable). It doesn’t work with predicates like (e.g.):

where t2.id between t1.id - t1.step_back and t1.id + t1.step_forward

The astonishing difference in performance due to enabling rowsource execution statistics is basically due to the number of subroutine calls eliminated – I believe (subject to a hidden parameter that controls a “sampling frequency”) that Oracle will call the O/S clock twice each time it calls the second SORT JOIN operation from the FILTER operation to acquire the next row. In 12.1 we’re doing roughly 50M redundant calls to that SORT JOIN.

The dramatic difference in performance even when rowsource execution statistics isn’t enabled is probably something you won’t see very often in a production system – after all, I engineered a fairly extreme data set and query for the purposes of demonstration. Note, however, the band join does seemt to introduce a change in cost, so it’s possible that on the upgrade you may find a few cases where the optimizer will switch from a nested loop join to a merge join using a band-join.

January 30, 2017

ASSM Help

Filed under: 12c,Oracle,Troubleshooting — Jonathan Lewis @ 12:33 pm BST Jan 30,2017

I’ve written a couple of articles in the past about the problems of ASSM spending a lot of time trying to find blocks with usable free space. Without doing a bit of rocket science with some x$ objects, or O/S tracing for the relevant calls, or enabling a couple of nasty events, it’s not easy proving that ASSM might be a significant factor in a performance problem – until you get to 12c Release 2 where a staggering number of related statistics appear in v$sysstat.

I’ve published the full list of statistics (without explanation) at the end of this note, but here’s just a short extract showing the changes in my session’s ASSM stats due to a little PL/SQL loop inserting 10,000 rows, one row at a time into an empty table with a single index:

Name                                  Value
----                                  -----
ASSM gsp:get free block                 185
ASSM cbk:blocks examined                185
ASSM gsp:L1 bitmaps examined            187
ASSM gsp:L2 bitmaps examined              2
ASSM gsp:Search hint                      2
ASSM gsp:good hint                      185

It looks like we’ve checked a couple of “level 2” bitmap blocks (one for the table, one for the index, presumably) to pick a sequence of “level 1” bitmap blocks that have been very good at taking us to a suitable data (table or index) block that can be used.

You might have expected to see numbers more like 10,000 in the output, but remember that PL/SQL has lots of little optimisations built into it and one of those is that it pins a few blocks while the anonymous block is running so it doesn’t have to keep finding blocks for every single row.

In comparison here’s the effect of the same data load when operated as 10,000 separate insert statements called from SQL*Plus:

Name                                  Value
----                                  -----
ASSM gsp:get free block              10,019
ASSM cbk:blocks examined             10,220
ASSM cbk:blocks marked full             201
ASSM gsp:L1 bitmaps examined         10,029
ASSM gsp:L2 bitmaps examined              6
ASSM gsp:L2 bitmap full                   1
ASSM gsp:Search all                       1
ASSM gsp:Search hint                      2
ASSM gsp:Search steal                     1
ASSM gsp:bump HWM                         1
ASSM gsp:good hint                   10,016
ASSM rsv:fill reserve                     1

It’s interesting to note that in this case we see (I assume) a few cases where we’ve done the check for an L1 bitmap block, gone to a data blocks that was apparently free, and discovered that our insert would make to over full – hence the 201 “blocks marked full”.

Critically, of course, this is just another of the many little indications of how “client/server” chatter introduces lots of little bits of extra work when compared to the “Thick DB “ approach.

One final set of figures. Going back to an example that first alerted me to the type of performance catastrophes that ASSM could contribute to, I re-ran my test case on 12.2 and checked the ASSM figures reported. The problem was that a switch from a 4KB or 8KB blocks size to a 16KB bblock size produced a performance disaster. A version of my  test case and some timing results are available on Greg Rahn’s site.

In my test case I have 830,000 rows and do an update that sets column2 to column1 changing it from null to an 8-digit value. With a 16KB block size and PCTFREE set to a highly inappropriate value (in this case the default value of 10) this is what the new ASSM statistics looks like:


Name                                  Value
----                                  -----
ASSM gsp:get free block             668,761
ASSM cbk:blocks examined            671,404
ASSM cbk:blocks marked full           2,643
ASSM gsp:L1 bitmaps examined      1,338,185
ASSM gsp:L2 bitmaps examined        672,413
ASSM gsp:Search all                     332
ASSM gsp:Search hint                668,760
ASSM gsp:Search steal                   332
ASSM gsp:bump HWM                       332
ASSM wasted db state change         669,395

I’d love to know what the figures would have looked like if they had been available in the original Oracle 9.2.0.8 case (my guess is that the “blocks examined” statistic would have been in the order of hundreds of millions); they look fairly harmless in this case even though the database (according to some of the other instance activity stats) did roughly 10 times the work you might expect from a perfect configuration.

Even here, though, where the original catastrophic bug has been addressed, the ASSM stats give you an important clue: we’ve been doing a simple update so why have we even been looking for free space (get free block); even stranger, how come we had to examine 1.3M L1 bitmaps when we’ve only updated 830,000 rows surely the worst case scenario shouldn’t have been worse that 1 to 1; and then there’s that “wasted db state change” – I don’t understand exactly what that last statistic is telling me but when I’m worried about performance I tend to worry about anything that’s being wasted.

In passing – if you want to insert a single row into an unindexed table you can expect Oracle to examine the segment header, then an L2 bitmap block, then an L1 bitmap block to find a data block for the insert. (In rare cases that might be segment header, L3, L2, L1). There are then optimisation strategies for pinning blocks – the session will pin the L1 bitmap block briefly because it may have to check several data blocks it references because they may be full even though they are flagged as having space; similarly the session will pin the L2 bitmap block because it may need to mark an L1 bitmap block as full and check another L1 block. The latter mechanism probably explains why we have examined more L1 bitmaps than L2 bitmaps.

Finally, the full monty

Just a list of all the instance statistics that start with “ASSM”:

ASSM bg: segment fix monitor
ASSM bg:create segment fix task
ASSM bg:mark segment for fix
ASSM bg:slave compress block
ASSM bg:slave fix one segment
ASSM bg:slave fix state
ASSM cbk:blocks accepted
ASSM cbk:blocks examined
ASSM cbk:blocks marked full
ASSM cbk:blocks rejected
ASSM fg: submit segment fix task
ASSM gsp:Alignment unavailable in space res
ASSM gsp:L1 bitmaps examined
ASSM gsp:L2 bitmap full
ASSM gsp:L2 bitmaps examined
ASSM gsp:L3 bitmaps examined
ASSM gsp:Optimized data block rejects
ASSM gsp:Optimized index block rejects
ASSM gsp:Optimized reject DB
ASSM gsp:Optimized reject l1
ASSM gsp:Optimized reject l2
ASSM gsp:Search all
ASSM gsp:Search hint
ASSM gsp:Search steal
ASSM gsp:add extent
ASSM gsp:blocks provided by space reservation
ASSM gsp:blocks rejected by access layer callback
ASSM gsp:blocks requested by space reservation
ASSM gsp:bump HWM
ASSM gsp:get free block
ASSM gsp:get free critical block
ASSM gsp:get free data block
ASSM gsp:get free index block
ASSM gsp:get free lob block
ASSM gsp:good hint
ASSM gsp:reject L1
ASSM gsp:reject L2
ASSM gsp:reject L3
ASSM gsp:reject db
ASSM gsp:space reservation success
ASSM gsp:use space reservation
ASSM rsv:alloc from reserve
ASSM rsv:alloc from reserve fail
ASSM rsv:alloc from reserve succ
ASSM rsv:clear reserve
ASSM rsv:fill reserve
ASSM wasted db state change

January 27, 2017

DBaaS Performance

Filed under: 12c,Cloud,Oracle — Jonathan Lewis @ 7:58 am BST Jan 27,2017

I don’t know how I missed it but Randolf Geist has been doing writing a series of posts on the performance of Oracle’s DBaaS offering, using a series of long-running tests to capture not only raw performance figures but also an indication of consistency. You can find all of these tests with a search URL on his blog, but I’ve also created a little index here to make it easier for me to access them in order.

Oracle Database Cloud (DBaaS) Performance Consistency

Oracle Database Cloud (DBaaS) Performance

h/t to Connor McDonald for the tweet that took me back to Randolf’s blog.

January 10, 2017

Join Elimination 12.2

Filed under: 12c,Bugs,Execution plans,Oracle — Jonathan Lewis @ 1:03 pm BST Jan 10,2017

From time to time someone comes up with the question about whether or not the order of tables in the from clause of a SQL statement should make a difference to execution plans and performance. Broadly speaking the answer is no, although there are a couple of boundary cases were a difference can appear unexpectedly.

When considering join permutations the optimizer has a few algorithms for picking an initial join order and then deciding how to permute from that order, and one of the criteria with the very lowest priority (i.e. when all other factors are equal) is dictated by the order the tables appear in the from clause so if you have enough tables in the from clause it’s possible for the subset of join orders considered to change if you change the from clause in a way that causes the initial join order to change.

It’s been over 11 years since I wrote the article I’ve linked to in the previous paragraph and in that time no-one has yet approached me with other examples of a plan changing due to a change in the from clause order (though, with all the transformations now available to the optimizer, I wouldn’t be surprised if a few cases have appeared occasionally, so feel free to let me know if you think you’ve got an interesting example that I can experiment on).

A little while ago, though, while testing a feature enhancement in 12.2, I finally came across a case where a real difference appeared. Here’s the query I was using – I’ll give you the SQL to reproduce the tables at the end of the article:


select 
	count(c.small_vc_c)
from 
	grandparent	g, 
	parent		p,
	child		c
where
	c.small_num_c between 200 and 215
and	p.id   = c.id_p
and	p.id_g = c.id_g
and	g.id   = p.id_g
;

As you will see later on the three tables grandparent, parent, child have the obvious primary keys and referential integrity constraints. This means that grandparent has a single-column primary key, parent has a two-column primary key, and child has a three-column primary key. The query joins the three tables along their primary keys and then selects data only from the child table, so it’s a good candidate for join elimination.

In earlier versions of Oracle join elimination could take place only if the primary key you joined to was a single column key, so 12.1 and earlier would be able to eliminate just the grandparent from this three-table join; but in 12.2 multi-column primary keys also allow join elimination to take place, so we might hope that the plan we get from this query would eliminate both the grandparent and parent tables. Here’s the plan (pulled from memory after execution):

SQL_ID  8hdybjpn2884b, child number 0
-------------------------------------
select  count(c.small_vc_c) from  grandparent g,  parent  p,  child  c
where  c.small_num_c between 200 and 215 and p.id   = c.id_p and p.id_g
= c.id_g and g.id   = p.id_g

Plan hash value: 4120004759

-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |       |       |    26 (100)|          |
|   1 |  SORT AGGREGATE     |       |     1 |    23 |            |          |
|   2 |   NESTED LOOPS      |       |    85 |  1955 |    26   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| CHILD |    85 |  1615 |    26   (4)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN| G_PK  |     1 |     4 |     0   (0)|          |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("C"."SMALL_NUM_C"<=215 AND "C"."SMALL_NUM_C">=200))
   4 - access("G"."ID"="C"."ID_G")

It didn’t work quite as expected. The optimizer has managed to eliminate table parent – so that looks like “single column primary key” join elimination has worked, but “multi-column” join elimination hasn’t appeared. On the other hand, I’ve not followed my usual rules for writing SQL so let’s try again. If I follow the pattern I usually follow, my from clause would have been in the order child  -> parent -> grandparent – listing the tables in the order I expect to visit them. Here’s the plan – again pulled from memory – after making this visual change the SQL:


SQL_ID  1uuq5vf4bq0gt, child number 0
-------------------------------------
select  count(c.small_vc_c) from  child  c,  parent  p,  grandparent g
where  c.small_num_c between 200 and 215 and p.id   = c.id_p and p.id_g
= c.id_g and g.id   = p.id_g

Plan hash value: 1546491375

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |       |       |    26 (100)|          |
|   1 |  SORT AGGREGATE    |       |     1 |    15 |            |          |
|*  2 |   TABLE ACCESS FULL| CHILD |    85 |  1275 |    26   (4)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("C"."SMALL_NUM_C"<=215 AND "C"."SMALL_NUM_C">=200))

So join elimination based on multi-column primary keys does work – but you might have to get a bit lucky in the order you list the tables in the from clause.

Footnote.

If you’re wondering whether or not switching from Oracle syntax to ANSI syntax would make a difference, it does: with ANSI syntax both grandparent and parent are eliminated if the SQL lists the tables in the order grandparent -> parent -> child (i.e. the order which doesn’t work properly for Oracle syntax) and only the parent is eliminated for the order child -> parent -> grandparent. In other words, both syntax options have a point of failure but they fail the opposite way around.

Code:


rem
rem	Script:		join_elimination_12c2.sql
rem	Author:		Jonathan Lewis
rem	

-- Environment details eliminated

define m_pad=100

/*
	IDs will be 1 to 1000
*/

create table grandparent 
as
select 
	rownum			id,
	trunc((rownum-1)/5)	small_num_g,
	rpad(rownum,10)		small_vc_g,
	rpad(rownum,&m_pad)	padding_g
from 
	all_objects 
where 
	rownum <= 1000
;

/*
	Each GP has two (scattered) children here
	Parent IDs are 1 to 2,000
*/

create table parent 
as
select 
	1+mod(rownum,1000)	id_g,
	rownum			id,
	trunc((rownum-1)/5)	small_num_p,
	rpad(rownum,10)		small_vc_p,
	rpad(rownum,&m_pad)	padding_p
from 
	all_objects 
where 
	rownum <= 2000
;

/*
	Simple trick to get 5 (clustered) children per parent
	Child IDs are 1 to 12,000
*/

create table child 
as
select 
	id_g,
	id			id_p,
	rownum			id,
	trunc((rownum-1)/5)	small_num_c,
	rpad(rownum,10)		small_vc_c,
	rpad(rownum,&m_pad)	padding_c
from 
	parent	p,
	(
		select /*+ no_merge */ 
			rownum 
		from	parent p 
		where	rownum <= 5
	)	d
;

create unique index g_pk on grandparent(id);
create unique index p_pk on parent     (id_g, id)       compress 1;
create unique index c_pk on child      (id_g, id_p, id) compress 2;

alter table grandparent add constraint g_pk primary key (id);
alter table parent      add constraint p_pk primary key (id_g, id);
alter table child       add constraint c_pk primary key (id_g, id_p, id);

alter table parent add constraint p_fk_g foreign key (id_g)       references grandparent;
alter table child  add constraint c_fk_p foreign key (id_g, id_p) references parent;

rem
rem	Don't need to collect stats because it's 12c
rem

prompt	===============================================================
prompt	Join all three tables with the FROM clause ordered gp -> p -> c
prompt	The final plan is GP->C, The optimizer eliminated P before
prompt	considering GP
prompt	===============================================================

select 
	count(c.small_vc_c)
from 
	grandparent	g, 
	parent		p,
	child		c
where
	c.small_num_c between 200 and 215
and	p.id   = c.id_p
and	p.id_g = c.id_g
and	g.id   = p.id_g
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

prompt	===============================================================
prompt	Join all three tables with the FROM clause ordered c -> p -> gp
prompt	The final plan is a tablescan of C only. The optimizer managed 
prompt	to eliminate GP first and P second
prompt	===============================================================

select 
	count(c.small_vc_c)
from 
	child		c,
	parent		p,
	grandparent	g 
where
	c.small_num_c between 200 and 215
and	p.id   = c.id_p
and	p.id_g = c.id_g
and	g.id   = p.id_g
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

prompt	==================================================
prompt	Convert to ANSI standard in the order gp -> p -> c
prompt	and both gp and p eliminated.
prompt	==================================================

select 
	count(c.small_vc_c)
from 
	grandparent	g
join
	parent		p
on	p.id_g = g.id
join
	child		c
on	c.id_p = p.id
and	c.id_g = p.id_g
where
	c.small_num_c between 200 and 215
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

prompt	===================================================
prompt	Convert to ANSI standard in the order c -> p -> gp
prompt	and only p is eliminated. 
prompt	===================================================

select 
	count(c.small_vc_c)
from 
	child		c
join
	parent		p
on      p.id   = c.id_p 
and	p.id_g = c.id_g 
join
	grandparent	g
on	g.id = p.id_g 
where
	c.small_num_c between 200 and 215
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

It’s possible, of course, that with different system stats, or I/O calibration, or extent sizes, or segment space management, or block sizes, sundry other parameter details that you won’t be able to reproduce the results without messing about a little bit, but I don’t think I’ve done anything special in the setup that would make a real difference.

Footnote:

If you’re wondering why the “traditional” and “ANSI” syntax should exhibit this flaw for joins in the opposite direction – remember that ANSI SQL is first transformed into an equivalent Oracle form and – in the simple cases – the first two tables form the first query block then each table after that introduces a new query block, so the optimizer strategy does (approximately) the following translation:


select ... from grandparent join parent join child

==>

select ... from (select ... from grandparent join parent) join child

The optimizer then optimizes the inline query, which eliminates grandparent leaving a join between parent and child, which then allows parent to be eliminated.

Conversely we get:

select ... form child join parent join grandparent 

==>

select ... from (select ... from child join parent) join grandparent

In this form the optimizer eliminates parent from the inline view and is left with a join between child and grandparent – so no further elimination.

December 13, 2016

Index Compression

Filed under: 12c,Indexing,Oracle — Jonathan Lewis @ 1:11 pm BST Dec 13,2016

Richard Foote has published a couple of articles in the last few days on the new (licensed under the advanced compression option) compression mechanism in 12.2 for index leaf blocks. The second of these pointed out that the new “high compression” mechanism was even able to compress single-column unique indexes – a detail that doesn’t make sense and isn’t allowed for the older style “leading edge deduplication” mechanism for index compression.

In 12.2 an index can be created (or rebuilt) with the option “compress advanced high” – and at the leaf-block level this will create “compression units” (possibly just one per leaf block – based on my early testing) that takes the complexity of compression far beyond the level of constructing a directory of prefixes. Richard demonstrated the benefit by creating a table with a numeric unique index – then compressed the index, reducing its size from 2,088 leaf blocks to 965 leaf blocks, which is pretty dramatic difference.

It crossed my mind, though, to wonder whether the level of compression was a side effect of the very straightforward code that Richard had used to create the data set: the table was a million rows with a primary key that had been generated as the rownum selected from the now-classic “connect by..” query against dual, and the row length happened to allow for 242 rows per 8KB table block.

If you pause to think about this data set you realise that if you pick the correct starting point and walk through 242 consecutive entries of the index you will be walking through 242 consecutive rows in the table starting from the zeroth row in a particular table block and ending at the 241st row in that block. A rowid (as stored by Oracle in a simple B-tree index) consists of 6 bytes and the first five bytes of the rowid will be the same for the first 256 rows in any one table block (and the first four will still be the same for the remaining rows in the block). Richard’s data set will be very close to ideal for any byte-oriented, or bit-oriented, compression algorithm because (to use Oracle terminology) the data will have a perfect clustering_factor. (He might even have got a slightly better compression ratio if he’d used an /*+ append */ on the insert, or done a CTAS, and reduced the rowsize to get a uniform 256 rows per table block.)

So how do things change if you randomise the ordering of the key ? Here’s a variant on Richard’s code:


rem
rem	Script:		index_compression_12c_2.sql
rem	Author:		Jonathan Lewis
rem	Dated:		Dec 2016
rem
rem	Last tested 
rem		12.2.0.1
rem

execute dbms_random.seed(0)

create table t1
nologging
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4
)
select
	rownum				id,
	lpad('x',130,'x')		padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
order by
	dbms_random.value
;

select table_name, blocks, round(num_rows/blocks,0) rpb from user_tables where table_name = 'T1';

drop index t1_i1;
create unique index t1_i1 on t1(id);
execute dbms_stats.gather_index_stats(user,'t1_i1');
select index_name, compression, pct_free, leaf_blocks from user_indexes where index_name = 'T1_I1';

drop index t1_i1;
create unique index t1_i1 on t1(id) compress advanced high;
execute dbms_stats.gather_index_stats(user,'t1_i1');
select index_name, compression, pct_free, leaf_blocks from user_indexes where index_name = 'T1_I1';

The initial drop index is obviously redundant, and the calls to gather_index_stats should also be redundant – but they’re there just to make it obvious I haven’t overlooked any checks for correctness in the build and stats.

You’ll notice that my row length is going to be slightly more “real-world” than Richard’s so that the degree of compression I get from nearly identical rowid values is likely to be reduced slightly, and I’ve completely randomised the order of key values.

So what do the results like ?

With the default pctfree = 10, and in a tablespace of uniform 1MB extents, 8KB blocks, utilising ASSM I get this:


TABLE_NAME               BLOCKS        RPB
-------------------- ---------- ----------
T1                        19782         51

INDEX_NAME           COMPRESSION     PCT_FREE LEAF_BLOCKS
-------------------- ------------- ---------- -----------
T1_I1                DISABLED              10        2088
T1_I1                ADVANCED HIGH         10        1303

Unsurprisingly the uncompressed index is exactly the same size as Richard’s (well, it was just the integers from 1 to 1M in both cases) but the compression ratio is significantly less – though still pretty impressive.

Of course, for this type of index my example probably goes to the opposite extreme from Richard’s. Realistically if you have a sequence based key with an OLTP pattern of data arrival then consecutive key values are likely to be scattered within a few blocks of each other rather than being scattered complely randomly across the entire width of the table; so a more subtle model (using a suitable number of concurrent processes to insert ids based on a sequence, perhaps) would probably get a better compression ratio than I did, though a worse one than Richard’s.There’s also the issue of the size of the key value itself – once you get to values in the order of 10 million to 100 million you’re looking at mostly 4 bytes (internal format) storage where for large runs of values the first 3 bytes match, possibly leading to a better compression ratio.

Of course the question of globally partitioned indexes will be relevant for some people since the principle reason for global indexes on partitioned tables is to enforce uniqueness on column combinations that don’t include the partition key, and that introduces another possible benefit – the rowid goes up to 10 bytes, of which the first 4 bytes are the object id of the table partition: depending on the nature of the partitioning that repeated 4 bytes per row may be close to non-existent after compression, giving you a better compression ratio on globally partitioned than you get on any other type of single column unique index.

Once you start looking at the details there are a surpising number of factors that can affect how well the advanced compression high can work.

Footnote:

Once you’ve created the index, you can start poking around in all sorts of ways to try and work out what the compression algorithm does. A simple block dump is very informative, with lots of clues in the descriptive details – and lots of puzzles when you start looking too closely. There are hints that this type of index compression adopts a strategy similar to “oltp comprssion” for tables in that compression occurs only as the leaf block becomes full – and possibly allows some sort of batching process within a leaf block before finally compressing to a single contiguous unit. (This is just conjecture, at present: the only firm statement I’ll make about the actual implementation of index compression is that it uses a lot of CPU; in my example the baseline create index took about 1.5 seconds of CPU, the compressed create took about 4.5 seconds of CPU.)

There are also a couple of amusing side effects that may confound those who use the old “validate index / query index_stats” two-step to decide when to rebuild indexes. Here’s what I got on the compressed index:


SQL> validate index t1_i1;

SQL> select blocks, lf_rows, lf_rows_len, btree_space, used_space, pct_used from index_stats;

    BLOCKS    LF_ROWS LF_ROWS_LEN BTREE_SPACE USED_SPACE   PCT_USED
---------- ---------- ----------- ----------- ---------- ----------
      1408    1000000	 14979802    10416812	14994105	144

My index is using 144% of the space that it has been allocated. You don’t have to be very good at maths (or math, even) to realise that something’s wrong with that number.

October 20, 2016

Conjuctive Normal Form

Filed under: 12c,Exadata,in-memory,Oracle — Jonathan Lewis @ 1:00 pm BST Oct 20,2016

I recently tweeted about a comment I’d picked up at the Trivadis performance days regarding tablescans and performance.

“If you can write your SQL in conjunctive normal form it can help the optimizer to offload more predicates”

Inevitably someone asked me if I had an example to demonstrate this – I didn’t, and still don’t really, but here’s an interesting demo based on an example from the Oracle In-Memory blog showing how the optimizer will rearrange your filter predicates before passing them to the tablescan code for evaluation against an inmemory table.


rem
rem     Script:         in_memory_conjunctive.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2016
rem     Purpose:
rem
rem     Last tested
rem             12.1.0.2
rem

create table t1
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                          id,
        trunc(dbms_random.value(1,501)) qty,
        mod(rownum,200) + 1             part_no,
        lpad(rownum,10,'0')             v1,
        lpad('x',50,'x')                padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e7
;
prompt  ==========
prompt  Base query
prompt  ==========

select
        count(v1)
from
        t1
where
        (qty > 495 or (qty < 3 and part_no = 50))
;
prompt  ===============
prompt  predicate added
prompt  ===============

select
        count(v1)
from
        t1
where
        (qty > 495 or qty < 3) and (qty > 495 or (qty < 3 and part_no = 50))
;
prompt  =================
prompt  Ordered predicate
prompt  =================

select  /*+ ordered_predicates */
        count(v1)
from
        t1
where
        (qty > 495 or qty < 3) and (qty > 495 or (qty < 3 and part_no = 50))
;

The 2nd and 3rd queries add a predicate to the first query – which, unfortunately, changes the estimated cardinality even though it has no effect on the result. This predicate is one that would be added by the inmemory code path if the table were declared to be inmemory. I’ve got two versions of the query, one with the (deprecated) ordered_predicates hint because in my initial tests the optimizer swapped the order of the predicates and I wanted to see if the ordering was at all critical.

Here’s the plan for the base query – first before declaring the table inmemory, then after declaring the table inmemory:


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       | 14739 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    19 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   100K|  1862K| 14739   (6)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("QTY">495 OR ("QTY"<3 AND "PART_NO"=50)))
------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |       |       |  1974 (100)|          |
|   1 |  SORT AGGREGATE             |      |     1 |    19 |            |          |
|*  2 |   TABLE ACCESS INMEMORY FULL| T1   |   100K|  1862K|  1974  (44)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - inmemory((("QTY">495 OR "QTY"<3) AND ("QTY">495 OR ("QTY"<3 AND "PART_NO"=50)))) filter(("QTY">495 OR ("QTY"<3 AND "PART_NO"=50)))

And here, after putting the table back to no inmemory are the plans for the second and third queries; note, particularly the different order of the predicates in the predicate section: the predicate order matches the inmemory predicate order only if I use the ordered_predicates hint:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       | 14741 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    19 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  1404 | 26676 | 14741   (6)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((("QTY">495 OR ("QTY"<3 AND "PART_NO"=50)) AND ("QTY">495
              OR "QTY"<3)))
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       | 14741 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    19 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |  1404 | 26676 | 14741   (6)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((("QTY">495 OR "QTY"<3) AND ("QTY">495 OR ("QTY"<3 AND
              "PART_NO"=50))))

Finally the run times – after running the queries a few times each to check for consistency:

  • Base query: 0.82 seconds
  • Query with extra predicate: 0.86 seconds
  • Query with extra predicate and forced order of predicate evaluation: 0.71 seconds

The query with the predicate arrangement matching the inmemory rewrite actually ran 13% faster than the original. Unfortunatly the rewrite without the ordered_predicates hint ran slower – which is a bit of a shame but understandable – the first predicate is the more complex, and then the code has to run a completely redundant second predicate; I was a little surprised at how much slower it was, but the table is 10M rows and we’re only looking at sub-second times anyway.

My table was fully cached and just under 112,000 blocks, so not very large, and this was running a serial query on a basic Oracle instance. Nevetheless there is a difference in execution time that is more than just “random noise” – If this is an indication of how a little unsightly tweaking of SQL for small data sets can make a difference, you can imagine that there might be a worthwhile benefit to considering ways of tweaking your predicates that make a significant difference to execution time if the extra predicates end up being pushed down to storage on an Exadata machine.

Footnote:

Another “not quite” example I happen to have written about a few months ago is a case where rewriting “not exists() OR not exists() OR not exists()” as “not (exists() AND exists() AND exists())” allowed Oracle to rewrite three subqueries as a single subquery with three-table join.

 

October 10, 2016

InMemory Bonus

Filed under: 12c,in-memory,Oracle — Jonathan Lewis @ 1:13 pm BST Oct 10,2016

It should be fairly well known by now that when you enable the 12c InMemory (Columnar Store) option (and set the inmemory_size) your SQL may take advantage of a new optimizer transformation know as the Vector Transformation, including Vector Aggregation. You may be a little surprised to learn, though, that some of your plans may change even when they don’t produce any sign of a vector transformation as a consequence. This is because In-Memory Column Store isn’t just about doing tablescans very quickly it’s also about new code paths for doing clever things with predicates to squeeze all the extra benefits from the technology. Here’s an example:


rem
rem     Script:         12c_inmemory_surprise.sql
rem     Author:         Jonathan Lewis
rem     Dated:          July 2016
rem
rem     Last tested
rem             12.1.0.2
rem

drop table t2 purge;
drop table t1 purge;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                                          id,
        trunc((rownum - 1)/100)                         n1,
        trunc((rownum - 1)/100)                         n2,
        trunc(dbms_random.value(1,1e4))                 rand,
        cast(lpad(rownum,10,'0') as varchar2(10))       v1,
        cast(lpad('x',100,'x') as varchar2(100))        padding
from
        generator       v1
;

create table t2
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                                          id,
        trunc((rownum - 1)/100)                         n1,
        trunc((rownum - 1)/100)                         n2,
        trunc(dbms_random.value(1,1e4))                 rand,
        cast(lpad(rownum,10,'0') as varchar2(10))       v1,
        cast(lpad('x',100,'x') as varchar2(100))        padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create index t1_n1   on t1(n1)   nologging;
create index t2_rand on t2(rand) nologging;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for columns (n1,n2) size 1'
        );
end;
/

There’s nothing particularly special about these two tables – I engineered them for a different demo, and the call to gather extended stats on the column group (n1, n2) is just a minor detail in getting better cardinality estimates in the upcoming plans. At this point I haven’t specified that either table should be in memory, so let’s see what plan I get from dbms_xplan.display_cursor() when I run a query that should do a hash join using t1 as the build table and t2 as the probe table:


select
        /*+
                qb_name(main)
        */
        count(*)
from    (
        select
                /*+ qb_name(inline) */
                distinct t1.v1, t2.v1
        from
                t1,t2
        where
                t1.n1 = 50
        and     t1.n2 = 50
        and     t2.rand = t1.id
        )
;

select * from table(dbms_xplan.display_cursor);

Plan hash value: 1718706536

-------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |       |       |       |  2441 (100)|          |
|   1 |  SORT AGGREGATE                         |       |     1 |       |            |          |
|   2 |   VIEW                                  |       | 10001 |       |  2441   (4)| 00:00:01 |
|   3 |    HASH UNIQUE                          |       | 10001 |   351K|  2441   (4)| 00:00:01 |
|*  4 |     HASH JOIN                           |       | 10001 |   351K|  2439   (4)| 00:00:01 |
|*  5 |      TABLE ACCESS BY INDEX ROWID BATCHED| T1    |   100 |  2100 |     3   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN                  | T1_N1 |   100 |       |     1   (0)| 00:00:01 |
|   7 |      TABLE ACCESS FULL                  | T2    |  1000K|    14M|  2416   (4)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."RAND"="T1"."ID")
   5 - filter("T1"."N2"=50)
   6 - access("T1"."N1"=50)

Thanks to the column group the optimizer has estimated correctly that the number of rows selected from t1 would be 100. Beyond that there’s very little exciting about this execution plan.

So let’s modify t2 to be in-memory and see how the plan changes as we re-execute the query:


alter table t2 inmemory;

select
        /*+ qb_name(main) */
        count(*)
...

select * from table(...);


Plan hash value: 106371239

----------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |         |       |       |   259 (100)|          |
|   1 |  SORT AGGREGATE                          |         |     1 |       |            |          |
|   2 |   VIEW                                   |         | 10001 |       |   259  (27)| 00:00:01 |
|   3 |    HASH UNIQUE                           |         | 10001 |   351K|   259  (27)| 00:00:01 |
|*  4 |     HASH JOIN                            |         | 10001 |   351K|   257  (26)| 00:00:01 |
|   5 |      JOIN FILTER CREATE                  | :BF0000 |   100 |  2100 |     3   (0)| 00:00:01 |
|*  6 |       TABLE ACCESS BY INDEX ROWID BATCHED| T1      |   100 |  2100 |     3   (0)| 00:00:01 |
|*  7 |        INDEX RANGE SCAN                  | T1_N1   |   100 |       |     1   (0)| 00:00:01 |
|   8 |      JOIN FILTER USE                     | :BF0000 |  1000K|    14M|   234  (20)| 00:00:01 |
|*  9 |       TABLE ACCESS INMEMORY FULL         | T2      |  1000K|    14M|   234  (20)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T2"."RAND"="T1"."ID")
   6 - filter("T1"."N2"=50)
   7 - access("T1"."N1"=50)
   9 - inmemory(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."RAND"))

The cost of the tablescan drops dramatically as the optimizer assumes the table will be in the in-memory column store (IMCS) but (in manys way more significantly) we suddenly have a (serial) Bloom filter for a hash join – which eliminates (most of) the data that wouldn’t have survived the hash join without having to use the CPU that would normally be spent probing the build table.

This is an interesting example of what the in-memory code path can do for us. There’s no point in using the Bloom filter for a serial hash join in “classic” Oracle because the Bloom filter is basically the bitmap of the hash table that is the first thing Oracle examines when probing the hash table – but with “in-memory” Oracle there are some enhancements to the Bloom filter and the code path that make using the Bloom filter an effective strategy. Most significant, perhaps, is that the in-memory code path can use SIMD instructions to perform multiple probes from t2 simultaneously, so not only do we get the benefits of avoiding disk access, buffer activity and row-by-row access to columns, we also reduce the CPU time spent on making the first-stage comparisons of the hash join. (And shared columnar dictionaries in 12.2 could reduce this even further!)

Footnote: I also have a note I scribbled dowsn at the Trivadis performance days last month that the Bloom filter used with IMCS carries the actual low and high values from the build table.  I may have misinterpreted this as I wrote it, but if that’s correct then it’s another step in eliminating data very quickly and very early when using IMCS (or Exadata if the same feature exists in the Bloom filters that get pushed to the storage servers.)

 

August 2, 2016

Adaptive mayhem

Filed under: 12c,Oracle — Jonathan Lewis @ 4:29 pm BST Aug 2,2016

So you run a query and it gives you a plan with a note that says “This is an adaptive plan”.

So you run it again and the plan changes,  with a note that says “Statistics feedback used for this statement”

So you pause to think for a bit, then run the query again and the plan changes, with a note that says “One SQL Directive used, dynamic statistics used”. (You waited too long and the internal re-optimization hints got flushed down into an SQL directive.)

So you decide to think about it the following morning when you’re feeling bright and fresh, and when you run it you get another plan because overnight the automatic stats job gathered stats on the critical table and created a column group that was indicated by the (now defunct) directive.

Happy optimisation!

See also:

Update 1:

This just in from fellow Oak Table member Stefan Koehler:If the automatic stats collection task doesn’t finish inside its window that you may get unlucky and find that Oracle has created extended stats on a table, failed to complete the stats collection, and dropped the extended stats. Bug 19450314 mentioned above avoids the problems of unnecessary invalidation when the extended stats are added and collected – it doesn’t deal with the problem if the stats are dropped.

Update 2(Sep 2016):

Fresh news from OpenWorld 2016 on 12.2, courtesy of DBI Services’ Franck Pachot: The parameter contolling adaptive optimisation has been split into two parts, one for the adaptive plans the other for adaptive statistics – so you can officially switch of the bit that’s more likely to cause unexpected side effects while leaving the bit that’s more likely to be beneficial active.

Update 3 (Nov 2016):

The latest news from Stefan Koelher on Bug 19450314 (see update 1 above) is that the unfinished bits of the fix for this bug are now known as bug 21418655. “Not done were the following: MODIFY, RENAME and DROP column (including work on Drop_Extended_Stats).”

Update 4 (Dec 2016)

And now a potential threat (about column groups in general) described in a series of three blog posts from Magnus Johansson. With reference to a relevant MoS Doc Id and patch.

July 7, 2016

Invisible Bug

Filed under: 12c,Bugs,CBO,Indexing,Oracle — Jonathan Lewis @ 5:27 pm BST Jul 7,2016

At this Wednesday’s Oracle Midlands event someone asked me if Oracle would use the statistics on invisible indexes for the index sanity check. I answered that there had been a bug in the very early days of invisible indexes when the distinct_key statistic on the index could be used even though the index itself would not be considered as a candidate in the plan (and the invisible index is still used to avoid foreign key locking – even in 12c – it’s only supposed to be invisible to the optimizer).

The bug was fixed quite a long time ago – but a comment on the “Index Sanity” article has introduced me to a related bug that is still present in 11.2.0.4 where the presence of an invisible index can affect an execution plan. Here’s a little model (run under 11.2.0.4) to demonstrate:

rem
rem     Script:         invisible_index_bug.sql
rem     Author:         Jonathan Lewis
rem

execute dbms_random.seed(0)

drop table t2;
drop table t1;

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        cast(rownum as number(8,0))                     id,
        cast(mod(rownum,1000) as number(8,0))           n1,
        cast(lpad(rownum,10,'0') as varchar2(10))       v1,
        cast(lpad('x',100,'x') as varchar2(100))        padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table t2
as
select
        rownum id,
        trunc(dbms_random.value(0,10000)) n1
from
        dual
connect by
        level <= 100
;
begin 
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

column n1 new_value m_n1
select n1 from t2 where id = 50;
clear columns

set autotrace traceonly explain

select
        t1.*
from
        t1, t2
where
        t2.n1 = &m_n1
;

create unique index t2_i1 on t2(n1)
-- invisible
;

select
        t1.*
from
        t1, t2
where
        t2.n1 = &m_n1
;

set autotrace off

All I’ve done is create a couple of tables then do a join that we might expect to see executed as a cartesian merge join; at one point I was going to make the data more complicated and include a join condition, but decided to keep things small and simple so it’s a silly example but it is sufficient to make the point. The funny little bit about selecting an n1 value from t2 was also in anticipation of a more complex example but it does, at least, ensure I query for a value that is in range.

Here are the two execution plans from 11.2.0.4 – the key feature is that the plan changes after the invisible index is created:


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000K|   119M|  2263   (3)| 00:00:12 |
|   1 |  MERGE JOIN CARTESIAN|      |  1000K|   119M|  2263   (3)| 00:00:12 |
|*  2 |   TABLE ACCESS FULL  | T2   |     1 |     4 |     2   (0)| 00:00:01 |
|   3 |   BUFFER SORT        |      |  1000K|   115M|  2261   (3)| 00:00:12 |
|   4 |    TABLE ACCESS FULL | T1   |  1000K|   115M|  2261   (3)| 00:00:12 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T2"."N1"=5308)


---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  1000K|   119M|  2263   (3)| 00:00:12 |
|   1 |  NESTED LOOPS      |      |  1000K|   119M|  2263   (3)| 00:00:12 |
|*  2 |   TABLE ACCESS FULL| T2   |     1 |     4 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   |  1000K|   115M|  2261   (3)| 00:00:12 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T2"."N1"=5308)

Notice how the plan without the invisible index does a “sort” (actually a “buffer sort” so it’s just copying the data into local memory), while the plan with the not quite invisible enough index in place gets away with just a full tablescan. This is bug 16564891, associated with base bug 16544878.

The bug notes say “fixed in 12.2”, but in Oracle 12.1.0.2 the first plan appears in both cases, and we have to make the index visible to get the second plan. (Take note of the need for the “negative” test to prove the point; the fact that the same plan appears for both cases doesn’t, by itself, prove that the bug was fixed, we have to show that the plan would have changed if the bug had still been present).

I believe the problem isn’t the problem of Oracle using the statistics when it shouldn’t; the change appears because in 11g Oracle incorrectly allows itself to see the uniqueness of the index and infer that table t2 is a “single row” table. In 12c the optimizer calculates that there will probably be only one row but that doesn’t stop it choosing the merge join cartesian as the “insurance bet” against having to do more than one tablescan of the t1 table. We can see this difference in the 10053 trace files, the 11g file has an entry for the “Single Table Access Path” for t2 that reads:

1-ROW TABLES:  T2[T2]#0

If you read the bug note for bug 16564891 you’ll see that it has a more realistic example of the problem – and it may give you some idea of where you might run into the bug. In general I don’t think many people are likely to come across the problem since it revolves around uniqueness, which is rather an important property, and there can’t be many occasions when someone decides to add (or test dropping) a unique index. Given that the example in the bug looks like “add a unique index to a dimension table that’s joining to a fact table” that may be a good pointer to where you’re most likely to run into the problem — when you’re trying to enforce data correctness in a data warehouse.

 

June 10, 2016

Uniquely parallel

Filed under: 12c,distributed,Execution plans,Oracle,Parallel Execution,Performance,Upgrades — Jonathan Lewis @ 7:36 am BST Jun 10,2016

Here’s a surprising (to me) execution plan from 12.1.0.2 – parallel execution to find one row in a table using a unique scan of a unique index – produced by running the following script (data creation SQL to follow):


set serveroutput off
set linesize 180
set trimspool on
set pagesize 60

alter session set statistics_level = all;

variable b1 number
exec :b1 := 50000

select /*+ parallel (3) */ id, v1 from t2 where id=:b1;

select * from table(dbms_xplan.display_cursor(null,null,'allstats parallel'));

break on dfo_number skip 1 on tq_id skip 1 on server_type

select
        dfo_number, tq_id, server_type, instance, process, num_rows
from
        v$pq_tqstat
order by
        dfo_number, tq_id, server_type desc, instance, process
;

All I’ve done is enable rowsource execution statistics, set a bind variable to a value, query a table with a /*+ parallel(3) */ hint to find the one row that will be identified by primary key, and then reported the actual execution plan. When I first ran the test Oracle didn’t report the execution statistics correctly so I’ve also queried v$pq_tqstat to show the PX servers used and the flow of data through the plan. Here’s the plan, followed by the  results from v$pq_tqstat:


SQL_ID  0dzynh9d29pt9, child number 0
-------------------------------------
select /*+ parallel (3) */ id,v1 from t2 where id=:b1

Plan hash value: 247082613

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name     | Starts | E-Rows |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |          |      1 |        |        |      |            |      1 |00:00:00.02 |
|   1 |  PX COORDINATOR                   |          |      1 |        |        |      |            |      1 |00:00:00.02 |
|   2 |   PX SEND QC (RANDOM)             | :TQ10001 |      0 |      1 |  Q1,01 | P->S | QC (RAND)  |      0 |00:00:00.01 |
|   3 |    TABLE ACCESS BY INDEX ROWID    | T2       |      0 |      1 |  Q1,01 | PCWP |            |      0 |00:00:00.01 |
|   4 |     BUFFER SORT                   |          |      0 |        |  Q1,01 | PCWC |            |      0 |00:00:00.01 |
|   5 |      PX RECEIVE                   |          |      0 |      1 |  Q1,01 | PCWP |            |      0 |00:00:00.01 |
|   6 |       PX SEND HASH (BLOCK ADDRESS)| :TQ10000 |      0 |      1 |  Q1,00 | S->P | HASH (BLOCK|      0 |00:00:00.01 |
|   7 |        PX SELECTOR                |          |      0 |        |  Q1,00 | SCWC |            |      0 |00:00:00.01 |
|*  8 |         INDEX UNIQUE SCAN         | T2_PK    |      0 |      1 |  Q1,00 | SCWP |            |      0 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access("ID"=:B1)

Note
-----
   - Degree of Parallelism is 3 because of hint

DFO_NUMBER      TQ_ID SERVER_TYP   INSTANCE PROCES   NUM_ROWS
---------- ---------- ---------- ---------- ------ ----------
         1          0 Producer            1 P003            0
                                          1 P004            1
                                          1 P005            0
                      Consumer            1 P000            0
                                          1 P001            1
                                          1 P002            0

                    1 Producer            1 P000            0
                                          1 P001            1
                                          1 P002            0
                      Consumer            1 QC              1

As you can see the table access follows a unique scan of an index and, although the rowsource execution stats report zero starts for the unique scan, we can see from v$pq_tqstat that slave P004 acquired a “row” (actually a rowid) and passed it to slave P001 which then acquired a row from the table and passed that row to the query coordinator. Oracle really did execute a parallel query, starting and stopping a total of 6 sessions to perform a single unique index access.

You’ll notice operation 7 is one you’ve only seen in the latest version of Oracle. The PX SELECTOR was introduced in 12c to reduce the number of times a complex parallel query would funnel into the query coordinator (parallel to serial) and then fan out again (serial to parallel) generating a new data flow operation tree (DFO tree) spawning one or two new parallel server groups as it did so. To stop this happening a step that needs to serialise in a 12c parallel plan can nominate one of the existing PX server processes (from each set, if necessary) to do the job so that the same set of PX servers can carry on running the query without the need for a new DFO tree to appear.

This enhancement to parallel execution plans is a good idea – except when it appears in my silly little query and turns something that ought to be quick and cheap into a job that is far more resource-intensive than it should be.

At this point, of course, you’re probably wondering what kind of idiot would put a parallel() hint into a query that was doing nothing but selecting one row by primary key – the answer is: “the Oracle optimizer in 12c”. I discovered this anomaly while creating a demonstration of the way that a distributed parallel query has to serialise through a single database link even if the operations at the two ends of the link run parallel. Here’s the SQL I wrote for the full demonstration:


rem     Script:         distributed_pq.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2016

define m_remote='test@loopback'
define m_remote='orcl@loopback'

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        cast(rownum as number(8,0))                     id,
        cast(lpad(rownum,8,'0') as varchar2(8))         v1,
        cast(rpad('x',100) as varchar2(100))            padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

create table t2
as
select  *
from    t1
where   mod(id,100) = 0
;

alter table t2 add constraint t2_pk primary key(id);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

set serveroutput off

select
        /*+ parallel(3) */
        t1.v1, t2.v1
from
        t1,
        t2@&m_remote
where
        mod(t1.id,10) = 0
and     t2.id = t1.id
and     mod(to_number(t2.v1),10) = 1
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats parallel'));

If you want to run this demo you’ll need to do something about formatting the output; more importantly you’ll have to create a database link (with a loopback link) and set up a define identifying it at the line where I’ve got orcl@loopback and test@loopback (which are my 12c and 11g loopback links respectively).

Here’s the plan (with rowsource stats) I got from the 12c test:


----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Starts | E-Rows |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |      1 |        |        |      |            |      0 |00:01:14.67 |       7 |      0 |
|   1 |  NESTED LOOPS         |          |      1 |     10 |        |      |            |      0 |00:01:14.67 |       7 |      0 |
|   2 |   PX COORDINATOR      |          |      1 |        |        |      |            |  10000 |00:00:00.11 |       7 |      0 |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 |      0 |   1000 |  Q1,00 | P->S | QC (RAND)  |      0 |00:00:00.01 |       0 |      0 |
|   4 |     PX BLOCK ITERATOR |          |      3 |   1000 |  Q1,00 | PCWC |            |  10000 |00:03:17.72 |    1745 |   1667 |
|*  5 |      TABLE ACCESS FULL| T1       |     39 |   1000 |  Q1,00 | PCWP |            |  10000 |00:00:00.06 |    1745 |   1667 |
|   6 |   REMOTE              | T2       |  10000 |      1 |        |      |            |      0 |00:01:14.44 |       0 |      0 |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access(:Z>=:Z AND :Z<=:Z)
       filter(MOD("T1"."ID",10)=0)

Remote SQL Information (identified by operation id):
----------------------------------------------------
   6 - SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND MOD(TO_NUMBER("V1"),10)=1
       (accessing 'ORCL@LOOPBACK' )

Note
-----
   - Degree of Parallelism is 3 because of hint

I have hacked this output a little – the “Remote SQL” section didn’t get reported by display_cursor(), so I’ve inserted the remote sql I got from a call to dbms_xplan.display() after using explain plan to generate a plan. Note the /*+ shared(3) */ hint that appears in the remote SQL – that’s the internal version of a parallel(3) hint.

In 11g the query complete in 2.4 seconds, in 12c the query took nearly 75 seconds to run thanks to the 12c enhancement that allowed it to obey the hint! Looking at the time column (and ignoring the anomalous 3:17 at operation 4 – which might roughly be echoing 3 * 1:14) we can see that the time goes on the calls to the remote database (and a check of v$session_event shows this time spent in “SQL*Net message from db link”), so the obvious thing to do is check what actually happened at the remote database and we can do that by searching the library cache for a recognizable piece of the remote SQL – here’s the SQL to do that, with the results from 11g followed by the results from 12c:


SQL> select sql_id, child_number, executions, px_servers_executions, sql_text from v$sql
  2  where sql_text like '%SHARED%' and sql_text not like 'select sql_id%';

11g results
SQL_ID        CHILD_NUMBER EXECUTIONS PX_SERVERS_EXECUTIONS
------------- ------------ ---------- ---------------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
c0f292z5czhwk            0      10000                     0
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE MOD(TO_NUMBER("V1"),10)=1 AND "ID"=:1


12c results
SQL_ID        CHILD_NUMBER EXECUTIONS PX_SERVERS_EXECUTIONS
------------- ------------ ---------- ---------------------
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------------------
7bk51w7vtagwd            0      10000                     0
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND MOD(TO_NUMBER("V1"),10)=1

7bk51w7vtagwd            1          0                 59995
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND MOD(TO_NUMBER("V1"),10)=1

It’s not surprising to see that the query has executed 10,000 times – that’s what we were told by the Starts statistic from dbms_output.display_cursor(), but 12c has 60,000 (with a little error) PX Servers executions of the statement. That’s 10,000 executions * degree 3 * the 2 slave sets we saw in my original execution plan. (It’s an odd little quirk of the two versions of Oracle that the order of predicates in the remote SQL was reversed between 11g and 12c – leading to two different SQL_IDs).

By enabling rowsource execution stats at the system level I was able to capture the remote execution plan with its stats:


SQL_ID  7bk51w7vtagwd, child number 0
-------------------------------------
SELECT /*+ SHARED (3) */ "ID","V1" FROM "T2" "T2" WHERE "ID"=:1 AND
MOD(TO_NUMBER("V1"),10)=1

--------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |          |      0 |        |      0 |00:00:00.01 |       0 |
|   1 |  PX COORDINATOR                   |          |      0 |        |      0 |00:00:00.01 |       0 |
|   2 |   PX SEND QC (RANDOM)             | :TQ10001 |      0 |      1 |      0 |00:00:00.01 |       0 |
|*  3 |    TABLE ACCESS BY INDEX ROWID    | T2       |  29983 |      1 |      0 |00:00:22.21 |    1000 |
|   4 |     BUFFER SORT                   |          |  29995 |        |    999 |00:00:21.78 |       0 |
|   5 |      PX RECEIVE                   |          |  29924 |      1 |    994 |00:00:21.21 |       0 |
|   6 |       PX SEND HASH (BLOCK ADDRESS)| :TQ10000 |      0 |      1 |      0 |00:00:00.01 |       0 |
|   7 |        PX SELECTOR                |          |  29993 |        |    999 |00:00:06.08 |   19992 |
|*  8 |         INDEX UNIQUE SCAN         | T2_PK    |  29999 |      1 |   1000 |00:00:00.24 |   20000 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(MOD(TO_NUMBER("V1"),10)=1)
   8 - access("ID"=:1)

Unlike the test case I started with, this output did show the number of starts (with a few missing) and the work done across the slaves. Our index probe had to do two buffer gets on every execution, and we have 10,000 executions of the query so 20,000 buffer gets on the index unique scan. Even though only one slave actually does any work with the PX Selector, all three slaves in that set seem to “start” the relevant operations. The definition of the data meant that only one index probe in 10 was successful so we only visited 1,000 buffers from the table. If you’re familiar with reading plans with rowsource execution stats you’ll appreciate that something has gone wrong in the reporting here – that 1,000 at operation 3 should read 21,000 because it ought to include the 20,000 from the index scan (at least, that’s what a serial plan would do).

If you’re still wondering why running this query as a parallel query should take so long – after all it’s only 10,000 executions in 70 seconds – bear in mind that Oracle has to allocate and deallocate 6 PX servers to new sessions each time it starts; the instance activity stats showed “logons cumulative” going up by 60,000 each time I ran the driving query: that’s about 850 logons (and log offs) per second. I don’t think my test machine would give a realistic impression of the impact of a couple of copies of this query running simultaneously, but when I tried the contention introduce increased the run time to 93 seconds.

tl;dr

Watch out for poor performance becoming disastrous for distributed parallel queries when you upgrade from 11g to 12c

 

 

.

June 7, 2016

Quiz Night

Filed under: 12c,Execution plans,Oracle,subqueries — Jonathan Lewis @ 7:35 pm BST Jun 7,2016

Here’s an execution plan from a recent OTN database forum posting:

 
------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT          |                    |     1 |   231 |   160   (6)| 00:00:02 |
|   1 |  UPDATE                   | GS_TABLE           |       |       |            |          |
|*  2 |   HASH JOIN SEMI          |                    |     1 |   231 |   130   (0)| 00:00:02 |
|*  3 |    TABLE ACCESS FULL      | GS_TABLE           |     5 |   895 |   123   (0)| 00:00:02 |
|*  4 |    TABLE ACCESS FULL      | UPDATEDPROGRAMCODE |   850 | 44200 |     7   (0)| 00:00:01 |
|*  5 |   VIEW                    |                    |    11 |  2024 |     8  (13)| 00:00:01 |
|*  6 |    WINDOW SORT PUSHED RANK|                    |    11 |   440 |     8  (13)| 00:00:01 |
|*  7 |     FILTER                |                    |       |       |            |          |
|*  8 |      TABLE ACCESS FULL    | UPDATEDPROGRAMCODE |    11 |   440 |     7   (0)| 00:00:01 |
|   9 |   VIEW                    |                    |   850 |  1138K|     9  (23)| 00:00:01 |
|  10 |    SORT ORDER BY          |                    |   850 |   685K|     9  (23)| 00:00:01 |
|* 11 |     VIEW                  |                    |   850 |   685K|     8  (13)| 00:00:01 |
|  12 |      WINDOW SORT          |                    |   850 | 47600 |     8  (13)| 00:00:01 |
|* 13 |       FILTER              |                    |       |       |            |          |
|* 14 |        TABLE ACCESS FULL  | UPDATEDPROGRAMCODE |   850 | 47600 |     7   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Unfortunately the originator of this plan declined to show us the query or answer any questions about where the work was going, but did confirm a speculative comment I had made that the instance was 12c. So the question is this: can you spot what it was that made me think that the plan came from 12c ?

I have to say, by the way, that there may be ways to get this plan from 11g, it was just that my first impression was that it was probably 12c and I didn’t attempt to come up with a way of getting a similar plan from 11g. (And, as far as the general shape of the plan is concerned, I can think of two different types of query that could produce it.)

Footnote

You are allowed to prove me wrong.

Answer

(Which might be me showing ignorance rather than inspiration)

The basic shape of the plan suggests to me that the query is of the form:

update gs_table 
set     col1 = (select from updatedprogramcode),
        col2 = (select from updatedprogramcode)
where   exists (select from updatedprogramcode)         -- possibly "where IN (subquery)"
;

There are a couple of variations in how the “set” subquery content might vary, and I’ll write up a short blog about that later.

Having noted this basic shape, I then noted that the subqueries involved analytic functions – as indicated by the WINDOW SORT operations; moreover one of them used a PUSHED RANK option and the other was embedded in a non-mergeable VIEW (operation 11). Updates with subqueries generally involve correlated columns – and prior to 12c there are some restrictions on how far up the tree the correlation can go. Here’s a sample query (using two tables that I’ve cloned from all_objects) to demonstrate:


update t1 set
        data_object_id = (
                select  objno
                from    (
                        select
                                object_id objno,
                                row_number() over (order by  object_id desc) rn
                        from
                                t2
                        where
                                t2.object_type = t1.object_type
                        )
                where rn = 1
        )
/

We need to embed the inner select statement in an inline view because we want to use the result of the row_number() analytic function in a filter predicate, but in Oracle 11g the reference to t1.object_id can’t correlate back to the outer t1 table, while in 12c this isn’t a problem. Here’s the 12c plan, followed by the 11g error:


12c Plan (autotrace)
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT          |      |  5000 | 45000 | 70012  (15)| 00:04:34 |
|   1 |  UPDATE                   | T1   |       |       |            |          |
|   2 |   TABLE ACCESS FULL       | T1   |  5000 | 45000 |    12   (0)| 00:00:01 |
|*  3 |   VIEW                    |      |     1 |    26 |    13   (8)| 00:00:01 |
|*  4 |    WINDOW SORT PUSHED RANK|      |   556 |  6116 |    13   (8)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL     | T2   |   556 |  6116 |    12   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("RN"=1)
   4 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("OBJECT_ID")
              DESC )<=1)
   5 - filter("T2"."OBJECT_TYPE"=:B1)


11g Error
---------
                                t2.object_type = t1.object_type
                                                 *
ERROR at line 11:
ORA-00904: "T1"."OBJECT_TYPE": invalid identifier

Notice, by the way that my predicate “rn = 1” has resulted in the WINDOW SORT PUSHED RANK that appeared in the original plan.

In case I haven’t said it enough times: this is a just a rapid inference I drew from looking briefly at the plan and I haven’t tried hard to work out whether there is a way to get a plan like this in 11g. It was nice being proved right by the follow-up post from the OP, but my guess may have been right by accident – I’d rather be proved wrong than carry on thinking I’d got it right when I hadn’t … so feel free to supply an example in the comments.

 

May 23, 2016

Virtual Partitions

Filed under: 12c,CBO,Infrastructure,Oracle,Partitioning — Jonathan Lewis @ 1:16 pm BST May 23,2016

Here’s a story of (my) failure prompted by a recent OTN posting.

The OP wants to use composite partitioning based on two different date columns – the table should be partitioned by range on the first date and subpartitioned by month on the second date. Here’s the (slightly modified) table creation script he supplied:


rem
rem     Script: virtual_partition.sql
rem     Dated:  May 2016
rem

CREATE TABLE M_DTX
(
        R_ID    NUMBER(3),
        R_AMT   NUMBER(5),
        DATE1   DATE,
        DATE2   DATE,
        VC GENERATED ALWAYS AS (EXTRACT(MONTH FROM DATE2))
)
PARTITION BY RANGE (DATE1) interval (numtoyminterval(1,'MONTH'))
SUBPARTITION BY LIST (VC)
        SUBPARTITION TEMPLATE (
                SUBPARTITION M1 VALUES (1),
                SUBPARTITION M2 VALUES (2),
                SUBPARTITION M3 VALUES (3),
                SUBPARTITION M4 VALUES (4),
                SUBPARTITION M5 VALUES (5),
                SUBPARTITION M6 VALUES (6),
                SUBPARTITION M7 VALUES (7),
                SUBPARTITION M8 VALUES (8),
                SUBPARTITION M9 VALUES (9),
                SUBPARTITION M10 VALUES (10),
                SUBPARTITION M11 VALUES (11),
                SUBPARTITION M12 VALUES (12)
        )
        (
        PARTITION M_DTX_2015060100 VALUES LESS THAN (TO_DATE('2015-06-01 00:00:01', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
        )
;

There’s nothing particularly exciting about this – until you get to the query requirement – the user wants to query on date1 and date2, and doesn’t know about the virtual month column, e.g. (and, I know that there should be a to_date() or ANSI equivalent here):

SELECT * FROM m_dtx WHERE date1 = trunc(sysdate) AND date2 = '01-Jun-2016';

Now, as a general rule, you don’t expect partition elimination to occur unless the partitioning column appears with a predicate that make elimination possible, so your first response to this query is that it could eliminate on date1, but can’t possibly eliminiate on vc because vc isn’t in the where clause. However it’s possible that the partitioning code might be coded to recognise that the subpartition is on a virtual column that is derived from date2, so perhaps it could generate a new predicate before optimising, for example:

date2 = '01-Jun-2016'  => vc = 6

Unfortunately, your first response is correct – the optimizer doesn’t get this clever, and doesn’t do the sub-partition elimination. Here’s the execution plan from 12.1.0.2 for the sample query, followed by the execution plan when I explicitly add the predicate vc = 6.


SQL_ID  8vk1a05uv16mb, child number 0
-------------------------------------
SELECT /*+ dynamic_sampling(0) */  * FROM m_dtx WHERE date1 =
trunc(sysdate) AND date2 = to_date('01-Jun-2016','dd-mon-yyyy')

Plan hash value: 3104206240

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |       |       |    15 (100)|          |       |       |
|   1 |  PARTITION RANGE SINGLE|       |     1 |    57 |    15   (7)| 00:00:01 |   KEY |   KEY |
|   2 |   PARTITION LIST ALL   |       |     1 |    57 |    15   (7)| 00:00:01 |     1 |    12 |
|*  3 |    TABLE ACCESS FULL   | M_DTX |     1 |    57 |    15   (7)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("DATE2"=TO_DATE(' 2016-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "DATE1"=TRUNC(SYSDATE@!)))



SQL_ID  33q012bdhjrpn, child number 0
-------------------------------------
SELECT /*+ dynamic_sampling(0) */  * FROM m_dtx WHERE date1 =
trunc(sysdate) AND date2 = to_date('01-Jun-2016','dd-mon-yyyy') and vc
= 6

Plan hash value: 938710559

------------------------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |       |       |    15 (100)|          |       |       |
|   1 |  PARTITION RANGE SINGLE|       |     1 |    57 |    15   (7)| 00:00:01 |   KEY |   KEY |
|   2 |   PARTITION LIST SINGLE|       |     1 |    57 |    15   (7)| 00:00:01 |     6 |     6 |
|*  3 |    TABLE ACCESS FULL   | M_DTX |     1 |    57 |    15   (7)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("DATE2"=TO_DATE(' 2016-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "DATE1"=TRUNC(SYSDATE@!)))


Note how the predicate vc = 6  doesn’t show up in the predicate section in either case, but the execution plan shows PARTITION LIST ALL at operation 2 when we omit the predicate and PARTITION LIST SINGE when we include it (with suitable values also appearing for Pstart and Pstop). (The cost, by the way, is the cost of scanning a whole (range)partition whether or not the optimizer expects to restrict that scan to just one sub-partition.)

So the optimizer isn’t quite clever enough (yet). BUT … the optimizer can be very clever with constraints, combining constraints with predicates and applying transitive closure to produce new predicates – so maybe we could get the optimizer to do this if we helped it a little bit. Given the table definition supplied I’m going to assume that the date2 column is supposed to be non-null, so let’s add some truthful constraints/declarations to the table definition:


alter table m_dtx modify date2 not null;
alter table m_dtx modify vc  not null;
alter table m_dtx add constraint md_ck_vc check (vc = extract(month from date2));

Alas, this didn’t make any difference to the execution plan. But it did do something surprising to my attempts to load data into the table:


insert into m_dtx (r_id, r_amt, date1, date2)
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        mod(rownum, 1000),
        rownum,
        trunc(sysdate,'yyyy') + dbms_random.value(0,365),
        trunc(sysdate,'yyyy') + dbms_random.value(0,365)
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4
;

insert into m_dtx (r_id, r_amt, date1, date2)
*
ERROR at line 1:
ORA-01400: cannot insert NULL into (???)

So the array insert with the virtual column doesn’t like the NOT NULL constraint on the virtual column because vc is, presumably, still null when the constraint is checked (though there’s no problem with single row inserts with the values() clause – I wonder what happens with the PL/SQL “FORALL” clause) – so let’s remove the not null constraint on vc and see what happens.


insert into m_dtx (r_id, r_amt, date1, date2)
*
ERROR at line 1:
ORA-02290: check constraint (TEST_USER.MD_CK_VC) violated

Unsurprisingly, given the fact that Oracle didn’t like the not null constraint, the critical check constraint also fails. This, by the way, is odd because a check constraint should accept a row when the constraint doesn’t evaluate to FALSE, so (a) vc can’t have been evaluated at this point or the constraint would evaluate to TRUE – which is not FALSE, and (b) vc at this point can no longer be null or the constraint would evaluate to NULL – which is not FALSE: so what “value” has vc got that makes the constraint check return FALSE ?

Bottom line:

I can see some scope for an optimizer enhancement that tries to find eliminating predicates from virtual columns; and I think there’s a need for ensuring that we can safely add constraints to virtual columns – after all we might want to create an index on a virtual column and sometimes we need a NOT NULL declaration to ensure that an index-only execution path can be found. Unfortunately I have to end this blog without finding an immediate solution for the OP.

Despite this failure, though, there are cases (as I showed a couple of years ago) where the optimizer in 12c can get clever enough to recognize the connection between a queried date column and the virtual partitioning column based on that date column.

May 16, 2016

Cursor_Sharing problem

Filed under: 12c,Bugs,Oracle,Troubleshooting — Jonathan Lewis @ 12:49 pm BST May 16,2016

Here’s a possible bug (though maybe “not a bug”) that came up over the weekend on the OTN database forum. An application generating lots of “literal string” SQL was tested with cursor_sharing set to force. This successfully forced the use of bind variable substitution, but a particular type of simple insert statement started generating very large numbers of child cursors – introducing a lot of mutex waits and library cache contention. Here’s a (substituted) statement that was offered as an example of the problem:


INSERT INTO schema.tableName (column1,columns2,..)
VALUES (:"SYS_B_0",:"SYS_B_1",:"SYS_B_2",:"SYS_B_3",:"SYS_B_4",:"SYS_B_5",:"SYS_B_6",timestamp:"SYS_B_7",SYSTIMESTAMP,:"SYS_B_8")

Note particularly the slightly odd looking detail: timestamp:”SYS_B_7″; this is how bind variable substitution looks if you’ve used the “compact” ANSI mechanism for handling datetime literals. We were told, in fact, that the tables had only number, varchar2, and date columns – so it looks a little suspicious when see timestamp values being inserted but the implied coercion wasn’t the source of the problem. Here’s all it takes to see the problem (tested only on 12.1.0.2):


rem
rem     Script:         ansi_datetime_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2016
rem

create table t1 (
        n1      number(8,0),
        v1      varchar2(10),
        d1      date,
        t1      timestamp
);

insert into t1 values(-1,'x',sysdate, systimestamp);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

set serveroutput off
alter session set cursor_sharing = force;

prompt  ============================
prompt  Testing DATE literals
prompt  Expect to see child number 2
prompt  ============================

insert into t1 values(1, 'A', date'2016-01-01', null);
insert into t1 values(1, 'A', date'2016-01-02', null);
insert into t1 values(1, 'A', date'2016-01-03', null);

select * from table(dbms_xplan.display_cursor(null,null,'-note -plan_hash'));

prompt  ============================
prompt  Testing TIMESTAMP literals
prompt  Expect to see child number 2
prompt  ============================

insert into t1 values(1, 'A', null, timestamp'2016-01-01 00:00:00');
insert into t1 values(1, 'A', null, timestamp'2016-01-02 00:00:00');
insert into t1 values(1, 'A', null, timestamp'2016-01-03 00:00:00');

select * from table(dbms_xplan.display_cursor(null,null,'-note -plan_hash'));

prompt  ===============================
prompt  Need privilege to see this view
prompt  Pre-coded for the sql_ids above
prompt  ===============================

break on sql_id skip 1

select
        sql_id, child_number, hash_match_failed
from
        v$sql_shared_cursor
where
        sql_id in ('58udhcm270bhn', 'gssz5cbnt7mgn')
order by
        sql_id, child_number
;

A couple of points – there are several more tests in the script demonstrating things that do NOT cause multiple child cursors to appear. I probably didn’t cover all the options that I could have covered but I hit a number of common cases to check that it wasn’t simply that cursor_sharing being severely broken in 12c. I’ve also allowed a side effect to demonstrate the presence of multiple child cursors rather than explcitly listing the child cursors. If the three statements (of each type) had produced shareable cursors then the child number reported by dbms_xplan.display_cursor() would have been zero in both cases. Here are the actual outputs:


SQL_ID  58udhcm270bhn, child number 2
-------------------------------------
insert into t1 values(:"SYS_B_0", :"SYS_B_1", date:"SYS_B_2", null)

-------------------------------------------------
| Id  | Operation                | Name | Cost  |
-------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |
|   1 |  LOAD TABLE CONVENTIONAL | T1   |       |
-------------------------------------------------


SQL_ID  gssz5cbnt7mgn, child number 2
-------------------------------------
insert into t1 values(:"SYS_B_0", :"SYS_B_1", null, timestamp:"SYS_B_2")


-------------------------------------------------
| Id  | Operation                | Name | Cost  |
-------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |
|   1 |  LOAD TABLE CONVENTIONAL | T1   |       |
-------------------------------------------------


And, having pre-coded the script with the SQL_IDs of the two guilty statements, here’s the output identifying the cause of the failure to share from v$sql_shared_cursor:


SQL_ID        CHILD_NUMBER H
------------- ------------ -
58udhcm270bhn            0 N
                         1 Y
                         2 Y

gssz5cbnt7mgn            0 N
                         1 Y
                         2 Y

There is a bug on MoS relating to timestamp columns and failure to share cursors – it doesn’t really look like the same problem but it could be related in some way: Bug 13407937 : HIGH VERSION COUNT FOR INSERT WITH CURSOR_SHARING=FORCE AND TIMESTAMP COLUMN. It’s described as “not a bug” :(

April 25, 2016

DDL logging

Filed under: 12c,Oracle,Partitioning — Jonathan Lewis @ 1:05 pm BST Apr 25,2016

I was presenting at the UKOUG event in Manchester on Thursday last week (21st April 2016), and one of the sessions I attended was Carl Dudley’s presentation of some New Features in 12c. The one that caught my eye in particular was “DDL Logging” because it’s a feature that has come up fairly frequently in the past on OTN and other Oracle forums.

So today I decided to write a brief note about DDL Logging – and did a quick search of my blog to see if I had mentioned it before: and I found this note that I wrote in January last year but never got around to publishing – DDL Logging is convenient, but doesn’t do the one thing that I really want it to do:

DDL Logging – 12c

One of the little new features that should be most welcome in 12c is the ability to capture all DDL executed against the database. All it takes is a simple command (if you haven’t set the relevant parameter in the parameter file):

alter system set enable_ddl_logging = true;

All subsequent DDL will be logged to two different places (in two formats)

  • $ORACLE_BASE/diag/rdbms/{database}/{instance}/log/ddl/log.xml
  • $ORACLE_BASE/diag/rdbms/{database}/{instance}/log/ddl_{instance}.log

Unfortunately the one thing I really wanted to see doesn’t appear – probably because it doesn’t really count as DDL –  it’s the implicit DDL due to inserting into not-yet-existing partitions of an interval partitioned table.

Note: If you’re using a container database with pluggable databases then the DDL for all the pluggable databases goes into the same log file.

Update – Licensing

The following text in the Oracle 12c Database Licensing document has just been brought to my attention:

Licensed Parameters
The init.ora parameter ENABLE_DDL_LOGGING is licensed as part of the Database Lifecycle Management Pack when set to TRUE. When set to TRUE, the database reports schema changes in real time into the database alert log under the message group schema_ddl. The default setting is FALSE.

The licensing document is also linked to from the 12c online html page for the parameter.

The 11g parameter definition makes no mention of licensing, and the 11g “New Features” manual don’t mention the feature at all, but the parameter does get a special mention in the 11g licensing document where it is described as being part of the Change Management Pack.

init.ora Parameters
The use of the following init.ora parameter is licensed under Oracle Change Management Pack:
■ ENABLE_DDL_LOGGING: when set to TRUE (default: FALSE)

 

Next Page »

Powered by WordPress.com.