Oracle Scratchpad

May 28, 2012

Ch-ch-ch-ch-changes

Filed under: Index Rebuilds,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 5:31 pm UTC May 28,2012

For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.

It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while read the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment to the effect that:

In NOARCHIVELOG mode all bulk operations (such as INSERT, APPEND, index REBUILD and ALTER TABLE MOVE) are automatically nologging.

The obvious error there is the reference to “index REBUILD”. Although create table as select and alter table move default to nologging (when running in noarchivelog mode) the equivalent commands for indexes have always been logged. On the other hand, pausing for thought here, I wouldn’t expect such an obvious error to slip past all three authors and the technical reviewers so, before opening my mouth and putting my foot firmly into it, I decided to run a quick test and, almost inevitably, I have a handy test script that I’ve been running intermittently for years for exactly this test case.

execute snap_redo.start_snap

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 10000
)
select
	rownum			id,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 100000
;

execute snap_redo.end_snap

execute snap_redo.start_snap
create index t1_i1 on t1(padding);
execute snap_redo.end_snap

execute snap_redo.start_snap
alter index t1_i1 rebuild;
execute snap_redo.end_snap

The snap_redo package is a simple bit of pl/sql I wrote to report the changes in a few of the current session stats (view v$mystat) over time. Specfically it looks at the statistics containing the word redo (and, in a more sophisticated form, a few others related to transaction management). Here are the key results for a test run on 10.2.0.3 and 11.1.0.7 – first from 10g

============
Create Table
============
redo synch writes                                                            2
redo entries                                                               410
redo size                                                               71,792
redo wastage                                                             1,836
redo writes                                                                  6
redo blocks written                                                        133
redo ordering marks                                                          3
redo subscn max counts                                                      10

============
Create Index
============
redo synch writes                                                            3
redo synch time                                                              2
redo entries                                                             1,887
redo size                                                           13,129,888
redo buffer allocation retries                                               4
redo wastage                                                             2,144
redo writes                                                                 11
redo blocks written                                                     26,428
redo write time                                                             67
redo log space requests                                                      3
redo log space wait time                                                    31
redo ordering marks                                                          3
redo subscn max counts                                                       3

=============
Rebuild index
=============
redo synch writes                                                            2
redo entries                                                             1,978
redo size                                                           13,132,052
redo buffer allocation retries                                               2
redo wastage                                                             4,084
redo writes                                                                 16
redo blocks written                                                     26,453
redo write time                                                             66
redo ordering marks                                                          3
redo subscn max counts                                                       4

Now from 11g

============
Create Table
============
redo synch writes                                                            1
redo synch time                                                              1
redo entries                                                               428
redo size                                                               67,228
redo size for direct writes                                              2,912
redo wastage                                                             1,072
redo writes                                                                  5
redo blocks written                                                        138
redo write time                                                              1
redo ordering marks                                                          3
redo subscn max counts                                                       5

============
Create Index
============
redo synch writes                                                            1
redo entries                                                               601
redo size                                                               71,092
redo size for direct writes                                             14,916
redo wastage                                                               972
redo writes                                                                  4
redo blocks written                                                        146
redo blocks checksummed by FG (exclusive)                                    6
redo ordering marks                                                          3
redo subscn max counts                                                       4

=============
Rebuild index
=============
redo synch writes                                                            1
redo entries                                                               684
redo size                                                               80,400
redo size for direct writes                                             14,916
redo wastage                                                             1,988
redo writes                                                                  7
redo blocks written                                                        167
redo write time                                                              5
redo blocks checksummed by FG (exclusive)                                    6
redo ordering marks                                                          4
redo subscn max counts                                                       5

As you can see, somewhere between 10.2.0.3 and 11.1.0.7 index creation and rebuild finally became consistent with table creation and move when running in noarchivelog mode. The book was right – it was, after all, talking about Exadata which means it’s implicitly talking about 11g and doesn’t really have to qualify the comment with references to earlier versions.

There’s a secondary moral to this story: instead of saying: “You’re wrong”, you might look a little wiser if you start with “Are you sure about that?” or even “Which version are you thinking of?”

There’s another corollary, of course – if you decide to test out the time and impact of rebuilding a very large index by using a backup copy of your production system, make sure that you are running in archivelog mode or you won’t be doing a test that is anything like valid for the production system (assuming your production systems are running in archivelog mode, of course).

May 24, 2012

Subquery Factoring

Filed under: Execution plans,Hints,Oracle,Subquery Factoring,Tuning — Jonathan Lewis @ 6:37 pm UTC May 24,2012

I have a small collection of postings where I’ve described anomalies or limitations in subquery factoring (the “with subquery”, or Common Table Expression (CTE) to give it the official ANSI name). Here’s another example of Oracle’s code not behaving consistently. You may recognise the basic query from yesterday’s example of logical tuning – so I won’t reprint the code to generate the data sets. This examples in this note were created on 11.2.0.2 – we start with a simple query and its execution plan:

select
	*
from
	t2
where
	not exists (
		select	null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2
	)
and	not exists (
		select	null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2 + 1000
	)
;

-------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |    50 |  7200 |    16 |
|   1 |  NESTED LOOPS ANTI  |       |    50 |  7200 |    16 |
|   2 |   NESTED LOOPS ANTI |       |  4999 |   644K|    16 |
|   3 |    TABLE ACCESS FULL| T2    |  5000 |   585K|    16 |
|*  4 |    INDEX UNIQUE SCAN| T3_PK |     1 |    12 |       |
|*  5 |   INDEX UNIQUE SCAN | T3_PK |  5000 | 60000 |       |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2"+1000)
   5 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2")

Now what happens if I take this piece of SQL (which is really the core of a much more complex query), put it into a CTE, and reference it later ? In one of the queries in the client’s application I wanted to materialize a piece of code like this because the result set was used multiple times in the body of the query. You’ll notice that I got two anti-joins when my test case ran as a “naked” SQL statement, but look what happens when I try to use the CTE mechanism in the simplest possible way:

with	cte as (
select
	/*+ materialize */
	*
from
	t2
where
	not exists (
		select
			null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2
	)
and	not exists (
		select
			null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2 + 1000
	)
)
select
	*
from
	cte
;

------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Rows  | Bytes | Cost  |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |  5000 |   468K|  5029 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |       |       |       |
|   2 |   LOAD AS SELECT           | SYS_TEMP_0FD9D6641_40101879 |       |       |       |
|*  3 |    FILTER                  |                             |       |       |       |
|   4 |     TABLE ACCESS FULL      | T2                          |  5000 |   585K|    15 |
|*  5 |     INDEX UNIQUE SCAN      | T3_PK                       |     1 |    12 |     1 |
|*  6 |     INDEX UNIQUE SCAN      | T3_PK                       |     1 |    12 |     1 |
|   7 |   VIEW                     |                             |  5000 |   468K|    14 |
|   8 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6641_40101879 |  5000 |   585K|    14 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter( NOT EXISTS (SELECT 0 FROM "T3" "T3" WHERE "ID3"=:B1 AND "N3B"=:B2
              AND "N3A"=:B3) AND  NOT EXISTS (SELECT 0 FROM "T3" "T3" WHERE "ID3"=:B4+1000 AND
              "N3B"=:B5 AND "N3A"=:B6))
   5 - access("N3A"=:B1 AND "N3B"=:B2 AND "ID3"=:B3)
   6 - access("N3A"=:B1 AND "N3B"=:B2 AND "ID3"=:B3+1000)

When I try to materialize the view Oracle reverts to a pair of filter subqueries. If I take away the /*+ materialize */ hint the CTE goes inline and the query optimizes with anti-joins – just as it did in the original form – so the behaviour isn’t as inconsistent as some of the earlier cases I’ve documented. I can get the behaviour I want by adding /*+ unnest */ hints to the two subqueries so the problem isn’t a show-stopper, but it’s just a little irritating to have to do this.

Here’s what I found in the 10053 trace file with the attempt to materialize:

SU:   Checking validity of unnesting subquery SEL$2 (#0)
SU:     SU bypassed: invalidated.
SU:   Validity checks failed.
SU:   Checking validity of unnesting subquery SEL$3 (#0)
SU:     SU bypassed: invalidated.
SU:   Validity checks failed.

That’s just a little strange, considering that the equivalent section of the trace for the version where the CTE goes inline reads as folllows:

SU:   Checking validity of unnesting subquery SEL$2 (#3)
SU:   Passed validity checks.
SU:   Unnesting subquery query block SEL$2 (#3)SU: Transform ALL/NOTEXISTS subquery into a regular antijoin.
SU:   Checking validity of unnesting subquery SEL$3 (#2)
SU:   Passed validity checks.
SU:   Unnesting subquery query block SEL$3 (#2)SU: Transform ALL/NOTEXISTS subquery into a regular antijoin.

Thank goodness we can still hint – and then generate SQL Baselines.

 

May 23, 2012

Logical tuning

Filed under: Hints,Ignoring Hints,Oracle,Performance,Tuning — Jonathan Lewis @ 6:22 pm UTC May 23,2012

Here’s a model of a problem I solved quite recently at a client site. The actual query involved was much more complex and the volume of data much larger, but this tiny, two table, example was enough to demonstrate the key principle. (Originally I thought I’d have to use three tables to model the problem, which is why you my be puzzled by my numerically odd choice of table names). I ran this example on 11.2.0.2 – the client version:

create table t2
as
select
	rownum					id2,
	trunc(dbms_random.value(0,1000))	n2a,
	trunc(dbms_random.value(0,1000))	n2b,
	lpad(rownum,6,'0')			vc2,
	lpad('x',100,'x')			padding
from
	all_objects
where
	rownum <= 5000;

alter table t2 add constraint t2_pk primary key(id2, n2a);

create table t3
as
select
	rownum					id3,
	trunc(dbms_random.value(0,1000))	n3a,
	trunc(dbms_random.value(0,1000))	n3b,
	lpad(rownum,6,'0')			vc3,
	lpad('x',100,'x')			padding
from
	all_objects
where
	rownum <= 5000;

alter table t3 add constraint t3_pk primary key(n3a, n3b, id3);

-- now collect stats on the table and execute this query (with autotrace enabled)

select
	*
from
	t2
where
	not exists (
		select	/*+ unnest */
			null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	(id3 = id2 or id3 = id2 + 1000)
	)
;

select * from table(dbms_xplan.display);

You’ll note that I’ve included the /*+ unnest */ hint in the subquery because I want Oracle to run this as a hash anti-join; and it would appear to be legal (and simple) to do this given the various not null contraints and primary key information. In fact the plan uses a filter subquery:

------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost  |
------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |  5000 |   585K|  5015 |
|*  1 |  FILTER            |       |       |       |       |
|   2 |   TABLE ACCESS FULL| T2    |  5000 |   585K|    15 |
|*  3 |   INDEX RANGE SCAN | T3_PK |     1 |    12 |     2 |
------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( NOT EXISTS (SELECT /*+ UNNEST */ 0 FROM "T3" "T3" WHERE
              "N3B"=:B1 AND "N3A"=:B2 AND ("ID3"=:B3 OR "ID3"=:B4+1000)))
   3 - access("N3A"=:B1 AND "N3B"=:B2)
       filter("ID3"=:B1 OR "ID3"=:B2+1000)

I suspect that the optimizer code bypasses the anti-join because of the (carefully bracketed) disjunct (OR) predicate. On the client site this resulted in the subquery being executed 9 million times, reducing an intermediate data set from 9M rows to 2M rows at a cost of 27 million buffer visits and about 60 CPU seconds. Fortunately I was able to dredge up a little bit of propositional calculus and quote the following equivalence:

        not( A or B )  (not A and not B)

Equally fortunately I didn’t have to worry about three-valued logic (all relevant columns were declared not null), so I was able to rewrite the query in the form:

select
	*
from
	t2
where
	not exists (
		select	null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2
	)
and	not exists (
		select	null
		from 	t3
		where	n3a = n2a
		and	n3b = n2b
		and	id3 = id2 + 1000
	)
;

With this code Oracle did two unnests and converted to hash anti-joinsin both cases (at least, that’s what happened on the client site – my small sample switched to nested loop anti-joins):

-------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |    50 |  7200 |    15 |
|   1 |  NESTED LOOPS ANTI  |       |    50 |  7200 |    15 |
|   2 |   NESTED LOOPS ANTI |       |  4999 |   644K|    15 |
|   3 |    TABLE ACCESS FULL| T2    |  5000 |   585K|    15 |
|*  4 |    INDEX UNIQUE SCAN| T3_PK |     1 |    12 |       |
|*  5 |   INDEX UNIQUE SCAN | T3_PK |  5000 | 60000 |       |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2"+1000)
   5 - access("N3A"="N2A" AND "N3B"="N2B" AND "ID3"="ID2")

The change to hash anti-joins was a huge benefit (the nested loop anti-join would have improved things for the client to a degree, but there’s not really an enormouse difference in some cases between a filter subquery and an equivalent nested loop anti/semi-join). In this case the query run time dropped from 95 seconds to 27 seconds – all of it CPU time.

Update: a check of the 10053 trace file for 10.2.0.3 shows the following:

SU:   Checking validity of unnesting subquery SEL$2 (#2)
SU:     SU bypassed: Invalid correlated predicates.
SU:   Validity checks failed.

On the other hand, it is possible to get unnesting and a hash anti-join with the predicate: id3 between id2 and id2 + 1000. (But see comment 6 and its reply)

May 17, 2012

Index Sizing

Filed under: Indexing,Infrastructure,Oracle — Jonathan Lewis @ 8:53 am UTC May 17,2012

I was in a discussion recently about how to estimate the size of a bitmap index before you build it, and why it’s much harder to do this for bitmap indexes than it is for B-tree indexes. Here’s what I wrote in “Practical Oracle 8i”:

An interesting feature of bitmap indexes is that it is rather hard to predict how large the index segment will be. The size of a b-tree index is based very closely on the number of rows and the typical size of the entries in the index column.  The size of a bitmap index is dictated by a fairly small number of bit-strings which may have been compressed to some degree depending upon the number of consecutive 1’s and 0’s.   To pick an extreme example, imagine a table of one million rows that has one column that may contain one of eight values ‘A’ to ‘H’ say, which has been generated in one of of the two following extreme patterns:

  • All the rows for a given value appear together, so scanning down the table we get 125,000 rows with ‘A’ followed by 125,000 rows of ‘B’ and so on.
  • The rows cycle through the values in turn, so scanning down the table we get ‘A’,’B’. . . ‘H’ repeated 125,000 times.  

What will the bitmap indexes look like in the two cases?  

For the first example, the basic map for the ‘A’ value will be 125,000 one-bits, followed by 875,000 zero bits – which will be trimmed off.  Splitting the 125,000 bits into bytes and adding the necessary overhead of about 12% we get an entry for the ‘A’ rows of 18K.  A similar argument applies for each of the values ‘B’ to ‘H’, so we get a total index size of around 8 x 18K – giving 156K.  

For the second example, the basic map for the ‘A’ value will be a one followed by 7 zeros, repeated 125,000 times.  There is no chance of compression here, so the ‘A’ entry will start at 125,000 bytes.  Adding the overhead this goes up to 140K, and repeating the argument for the values ‘B’ to ‘H’ we get a total index of 1.12 MB.  

This wild variation in size looks like a threat, but to put this into perspective, a standard B-tree index on this column would run to about 12 Mb irrespective of the pattern of the data.  It would probably take about ten times as long to build as well.

I wrote up a test case to recreate this model some time ago, so here it is with the results from an instance of 11.1.0.7:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 10000
)
select
	chr(65 + mod(rownum-1,8))		bit_scattered,
	chr(65+trunc((rownum-1)/125000))	bit_clustered
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

create bitmap index t1_b1_scattered on t1(bit_scattered);
create bitmap index t1_b1_clustered on t1(bit_clustered);

select
	index_name, leaf_blocks, 8 * leaf_blocks KB
from
	user_indexes
where
	table_name = 'T1'
;

set doc off
doc

Results from 11.1.0.7
---------------------

INDEX_NAME           LEAF_BLOCKS         KB
-------------------- ----------- ----------
T1_B1_SCATTERED              164       1312
T1_B1_CLUSTERED               24        192

2 rows selected.
#

So, no big change there, then.
If you modify the code to create B-tree indexes you’ll find they are 14MB each if you don’t use compression, 12MB if you do.

May 11, 2012

Exadata

Filed under: Advertisements,Exadata,Oracle — Jonathan Lewis @ 2:12 pm UTC May 11,2012

Enkitec have just announced the call for papers for their E4 conference to be held in Irving, Texas this August. (See this link for the announcement.)

I’ll be there … playing the devil’s advocate.

May 1, 2012

Web Cams

Filed under: Uncategorized — Jonathan Lewis @ 5:04 pm UTC May 1,2012

A little something to revive flagging spirits at the end of the day:

Red-tailed Hawk nest

Great Blue Heron nest / alternative view

It’s wonderful to “be” this close to the nests and see such detail (ymmv, depending on time of day).

BT/Yahoo

Filed under: Uncategorized — Jonathan Lewis @ 8:00 am UTC May 1,2012

If anyone is expecting email from me any time in the near future, you may have to hold your breath.

In a staggering feat of customer relations BT (British Telecomms), or Yahoo acting on their behalf, has managed to introduce random blocks to any email I send out. At present I have something like a 30% chance of email being blocked by their smtp servers with reports like the following:

An unknown error has occurred. Subject ‘Re: {deleted}‘, Account: ‘{deleted}‘, Server: ‘mail.btinternet.com’, Protocol: SMTP, Server Response: ’554 Transaction failed : Cannot send message due to possible abuse; please visit http://postmaster.yahoo.com/abuse_smtp.html for more information’, Port: {deleted}, Secure(SSL): {deleted}, Server Error: 554, Error Number: 0x800CCC6F

Take a look at the URL supplied in the response – I’ve highlighted it to make it easy to spot – and then try going to that URL.  (Page doesn’t exist – excellent service, isn’t it?)

Sometimes I can get my reply through if I delete any of the original text, sometimes I can get my reply through if I create a whole new message (rather than hitting reply) and cut just my text into the new message. Sometimes I can hit reply and get a reply through if it contains nothing but the comment (with no signature, and nothing from the incoming post):

British Telecomm is blocking virtually every email I try to send to you as “possible abuse”.
I can’t figure out why, so it may be some time before I get a proper reply through.

I tried calling customer support and was told that the problem  was obviously my configuration and not their mail servers; but they could email me the URL of the BT broadband help page so that I could find out how to set up mail program properly; or I could let the call centre employee log on to my machine over the internet; or I could pay them to get a member of the support staff  to help.

Of course there was no answer to the question:  “so what have BT/Yahoo changed on their mail servers in the last few days?” and there was no answer to the question: “why have several other people suddenly hit the same problem at the same time if it’s my configuration?”.

So, BT, get your act together – send an email to your customers explaining why your email servers might choose to block apparently random email as “possible abuse”; and brief your call centre staff to recognise the description of the problem so they don’t try the standard brush-off

I hear that using twitter, facebook, blogs etc. to make a fuss about  poor quality service is more effective nowadays than phoning call centres in distant countries – so I thought I’d give it a go. I know I’m not the only one in the UK facing this problem at the moment, so I’m curious to see how many more people I can get complaining about it.

Footnote: I have a yahoo mail account via  BT – and the fact that I could send an email from it “proved” to the call centre person that it wasn’t BT’s problem. But it doesn’t, and I don’t want to use a service that floods half my screen with moving adverts all the time.

 

April 30, 2012

Clustered Indexes

Filed under: SQL Server — Jonathan Lewis @ 2:15 pm UTC Apr 30,2012

… which, for those in the know, means something to do with SQL Server. (The closest Oracle equivalent is index-organized tables, the closest sounding Oracle feature is “index clusters” but that means something rather different.)

Redgate has arranged for an online debate between an Oracle specialist (me) and a SQL Server specialist (Grant Fritchey) to talk about the strengths and weaknesses of the two mechanisms, discuss why Oracle users seem to be biased towards one implementation and SQL Server users towards the other, and then see where the conversation takes us. This will be followed by a Q&A session.

If this sounds interesting – and I think it’s a great idea, you can learn an awful lot more from a discussion than you can by listening to monologue – you need to sign up early. The event will be held on 7th June at 8:00 am (which is 16:00 BST as far as I’m concerned).

Further details and registration at this URL.

April 28, 2012

SQLTXPLAIN

Filed under: Uncategorized — Jonathan Lewis @ 11:53 am UTC Apr 28,2012

Just in on the Oracle-L list server – if you want to hear from the author of SQLTXPLAIN (MOS 215187.1) how to install and use the main features, Carlos Sierra is presenting a one hour seminar on May 15th. In his words:

If you or someone you know may want to attend, please register following link below. Capacity for this 1hr SQLTXPLAIN Webinar on May 15 is limited, so please register early so you can reserve one connection. Feel free to share this link with your customer(s).

https://oracleaw.webex.com/oracleaw/onstage/g.php?d=590415036&t=a

Event Information: Using SQLTXPLAIN to diagnose SQL statements performing poorly

The event is scheduled for 8:00 am in San Francisco, which is 4:00 pm BST (GMT+1) if you’re in the UK.

 

April 23, 2012

NVL2()

Filed under: Function based indexes,Indexing,Oracle,Performance — Jonathan Lewis @ 5:43 pm UTC Apr 23,2012

There are many little bits and pieces lurking in the Oracle code set that would be very useful if only you had had time to notice them. Here’s one that seems to be virtually unknown, yet does a wonderful job of eliminating calls to decode().

The nvl2() function takes three parameters, returning the third if the first is null and returning the second if the first is not null. This is  convenient for all sorts of example where you might otherwise use an expression involving  case or decode(), but most particularly it’s a nice little option if you want to create a function-based index that indexes only those rows where a column is null.

Here’s a code fragment to demonstrate the effect:

select nvl2(1,2,3) from dual;

select nvl2(null,2,3) from dual;

select nvl2(1,null,3) from dual;

select nvl2(null,null,3) from dual;

And here’s the resulting output – conveniently the function call is also the column heading in the output:

NVL2(1,2,3)
-----------
          2

NVL2(NULL,2,3)
--------------
             3

NVL2(1,NULL,3)
--------------

NVL2(NULL,NULL,3)
-----------------
                3

Note, particularly, from the last two that a non-null input (first parameter) turns into the null second parameter, and the null input turns into the non-null third parameter. To create a function-based index on rows where columnX is null, and be able to access them by index, you need only do the following:

create index t1_f1 on t1(nvl2(columnX,null,1));

select * from t1 where nvl2(columnX,null,1) = 1;

(Don’t forget, of course, that you will need to gather stats on the hidden column underpinning the function-based index before you can expect the optimizer to use it in the correct cases.)

April 20, 2012

Chris Date

Filed under: Uncategorized — Jonathan Lewis @ 10:58 am UTC Apr 20,2012

Just in from Peter Robson on the Oracle-L listserver:

Chris Date on Database Design and Relational Theory, Edinburgh, June 7/8 2012

I’m using the ‘good offices’ (thank you Mark) of the list to inform you all that Chris Date is making a rare appearance in the UK (he was last here two years ago) for a 2 day seminar on the above topic. It is planned that all delegates will receive a complimentary copy of his very latest book on the subject, just being published by O’Reilly. Two years ago people attended his seminar on SQL from as far away as Benelux and Spain, so lets agree that this posting could be of interest to anyone in Western Europe!

More details are here:  http://www.justsql.co.uk/chris_date/cjd_edin_june_2012.htm

I’ve heard Chris Date speak on the topic before and it’s a real treat to hear such clarity of thought and presentation (the only person in the Oracle community that comes close, in my view, is Cary Millsap); so if you’ve got the opportunity to attend, take it.

 

April 19, 2012

Drop Constraint

Filed under: Indexing,Infrastructure,Oracle — Jonathan Lewis @ 4:22 pm UTC Apr 19,2012

If you drop a unique or primary key constraint the index that supports it may be dropped at the same time – but this doesn’t always happen. Someone asked me recently if it was possible to tell whether or not an index would be dropped without having to find out the hard way by dropping the constraint. The answer is yes – after all, Oracle has to make a decision somehow, so if we can find out how it makes the decision we can predict the decision.

So here’s my best theory so far – along with the observations that led to it. First, run a trace while dropping a primary key constraint and see if this gives you any clues; on an instance running 10gR2 I noticed the following statement appearing in the trace file immediately after the delete from cdef$ (constraint definitions).

select
        o.owner#, i.property, o.name
from
        obj$ o,
        ind$ i
where
        i.obj# = o.obj#
and     o.obj#=:1
;

Given this clue, the obvious next step is to look in sql.bsq (or dcore.bsq if you’re running 11g) to see what it says about column ind$.property, and we find the following comment:

/* The index was created by a constraint : 0x1000 */

This looks promising, so let’s try a few experiments to see what happens in a few cases of creating indexes and constraints – here’s a starting sample:

create table t1 (n1 number);
alter table t1 add constraint t1_pk primary key (n1);

create table t1 (n1 number);
create unique index t1_pk on t1(n1);
alter table t1 add constraint t1_pk primary key (n1);

In the first case (which generated the index automatically) ind$.property held the value 4097; in the second case, where I created the index explicitly, it held the value 1. (For the less mathematically inclined, 0×1000 = decimal 4096). So it would appear that if we want to know if an index would be dropped as the constraint was dropped, we need only check to see if bitand(ind$.property,4096) = 4096. If the predicate evaluates to true the index will be dropped. (But see comment 3 and the reply below)

The exact SQL for the check is left as an exercise to the reader – you will probably want to join ind$ to obj$ to user$.

Footnote: this particular bit of the property column doesn’t seem to be exposed in any of the data dictionary views; but if anyone knows of any other way to see it then I’d be interested to hear about it, especially if you don’t have to have access to the sys schema to see it.

April 11, 2012

Extended Stats

Filed under: CBO,Oracle,Statistics — Jonathan Lewis @ 11:37 pm UTC Apr 11,2012

I’m very keen on the 11g extended stats feature, but I’ve just discovered a critical weakness in one of the implementation details that could lead to some surprising instability in execution plans. It’s a combination of “column group” statistics and “out of range” predicates. Let’s start with  some sample data. (Note: I was running this test on 11.2.0.3):

create table t1
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 10000
)
select
	mod(rownum,100)		col1,
	mod(rownum,10)		col2
from
	generator	v1,
	generator	v2
where
	rownum <= 50000
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

I’ve got 100 distinct values for col1 and 10 distinct values for col2 – so the optimizer will assume that I’ve got 1,000 distinct values for the combination. Since there are 50,000 rows this means 50 rows per combination. So let’s see this demonstrated:

set autotrace traceonly explain

select	*
from	t1
where
	col1 = 50
and	col2 = 5
;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

----------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    50 |   300 |    13 |
|*  1 |  TABLE ACCESS FULL| T1   |    50 |   300 |    13 |
----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("COL1"=50 AND "COL2"=5)

If you examine my data definition, though, you will see that the col1 and col2 values align in a way that means there are really only 100 distinct combination, with 500 rows per combination. This looks like one of those cases where extended statistics might be useful. So let’s create a suitable set:

begin
	dbms_output.put_line(
		dbms_stats.create_extended_stats(
			ownname		=> user,
			tabname		=> 'T1',
			extension	=> '(col1, col2)'
		)
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		block_sample 	 => true,
		method_opt 	 => 'for all hidden columns size 1'
	);
end;
/

With these stats in place, I repeat my query with the following effect:


Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

----------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500 |  3000 |    13 |
|*  1 |  TABLE ACCESS FULL| T1   |   500 |  3000 |    13 |
----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("COL1"=50 AND "COL2"=5)

Oracle has used the “column group” statistics to derive a good estimate of the combined selectivity and produced an estimate of 500 rows. But what happens as new data (following the same pattern) is entered, and the queries change to keep up with the data; consider, for example, the slightly artificial query:

select	*
from	t1
where
	col1 = 50
and	col2 = 9.001
;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

----------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    50 |   300 |    13 |
|*  1 |  TABLE ACCESS FULL| T1   |    50 |   300 |    13 |
----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("COL1"=50 AND "COL2"=9.001)

I’ve used the value 9.001 because it is only just outside the known value range for col2 (which ranges from 0 to 9) – and the optimizer has immediately switched from the extended stats back to the single column stats and the “independent predicates” mechanism to estimate the cardinality, leading to a dramatic change in the estimate.

This strategy may be a deliberate choice on the part of the developers, but personally I think that it’s not appropriate. If you’ve created “column group” stats like this it seems reasonably likely that you’re trying to show Oracle a consistent pattern in the data; however it’s quite possible that the data you’re using shows some sort of time-dependency such that the pattern doesn’t change, but the actual values in (at least one) of the columns keeps increasing.

In this case, applying the normal “linear decay” strategy to the extended stats seems much more sensible – if one (or both) of the predicates is slightly out of range then use the extended stats to derive an estimate then apply the decay factor(s) to that value, rather than falling back on multiplying the individual selectivities.

(The case were I saw this problem was one where rows were constantly being inserted and deleted. One column was a time-based audit id, the other was an action id, and every audit id produced a few dozen action ids, with some overlap of actions across audit ids. New, higher valued, audit rows were constantly being added to this table, while old, lower ones, were being deleted. The effect was that the extended stats because useless within seconds of being generated.)

Footnote:

Coincidentally, Randolf Geist highlighted another edge case a couple of days ago – but it hasn’t stopped me thinking that extended stats, and “column group” stats especially are a terrific feature.

April 10, 2012

Stats Collection

Filed under: Uncategorized — Jonathan Lewis @ 7:13 am UTC Apr 10,2012

This is just a temporary note to point you to a poll set up by Timur Akhmadeev to get an idea of how people are handling stats collection in newer versions of Oracle.

 

April 4, 2012

Philosophy 16

Filed under: Philosophy — Jonathan Lewis @ 5:49 pm UTC Apr 4,2012

I couldn’t help laughing when I saw this.

Next Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 966 other followers