Oracle Scratchpad

July 27, 2014

Analogy

Filed under: 12c,Oracle,Uncategorized — Jonathan Lewis @ 8:02 am BST Jul 27,2014

So 12.1.0.2 is out with a number of interesting new features, of which the most noisily touted is the “in-memory columnar storage” feature. As ever the key to making best use of a feature is to have an intuitive grasp of what it gives you, and it’s often the case that a good analogy helps you reach that level of understanding; so here’s the first thought I had about the feature during one of the briefing days run by Maria Colgan.

“In-memory columnar storage gives you bitmap indexes on OLTP systems without the usual disastrous locking side effects.”

Obviously the analogy isn’t perfect … but I think it’s very close:  for each column stored you use a compression technique to pack the values for a large number of rows into a very small space, and for each stored row you can derive the rowid by arithmetic.  In highly concurrent OLTP systems there’s still room for some contention as the session journals are applied to the globally stored compressed columns (but then, private redo introduces some effects of that sort anyway); and the “indexes” have to be created dynamically as tables are initially accessed (but that’s a startup cost, it’s timing can be controlled, and it’s basically limited to a tablescan).

Whatever the technical hand-waving it introduces – thinking of the in-memory thing as enabling real-time bitmaps ought to help you think of ways to make good use of the feature.

 

 

May 15, 2014

Subquery with OR

Filed under: 12c,Bugs,CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 6:23 pm BST May 15,2014

Prompted by a pingback on this post, followed in very short order by a related question (with a most gratifying result) on Oracle-L, I decided to write up a note about another little optimizer enhancement that appeared in 12c. Here’s a query that differs slightly from the query in the original article:


select
	id, modded, mod_15
from
	t1
where
	t1.mod_15 = 1                     -- originally t1.mod_15 > 0
and	(   t1.modded is null             -- originally t1.modded = 0
	 or exists (
		select	null
		from	t2
		where	t2.id = t1.modded
	    )
	)
;

As a general principle, the “OR EXISTS” stops the optimizer from unnesting the subquery, so my original article suggested a workaround that required you to rewrite the query with a UNION ALL, using the lnnvl() function (where possible) as the easy way to eliminate accidental duplication. Take a look at the plans for my new query, though – first in 11.2.0.4, then in 12.1.0.1:


Execution Plan for 11.2.0.4
----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |    34 |   374 |    50   (0)| 00:00:01 |
|*  1 |  FILTER            |       |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| T1    |   667 |  7337 |    50   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| T2_PK |     1 |     3 |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T1"."MODDED" IS NULL OR  EXISTS (SELECT 0 FROM "T2" "T2"
              WHERE "T2"."ID"=:B1))
   2 - filter("T1"."MOD_15"=1)
   3 - access("T2"."ID"=:B1)

Execution Plan for 12.1.0.1
------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |    27 |   378 |    50   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI NA|       |    27 |   378 |    50   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | T1    |   667 |  7337 |    50   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN  | T2_PK |     1 |     3 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."MOD_15"=1)
   3 - access("T2"."ID"="T1"."MODDED")

As expected, 11.2.0.4 has had to use a filter subquery approach – but 12.1.0.1 has found a different path. For this special “is null” case the optimizer has unnested the subquery and used a “null aware (NA) semi-join”. In this very small example there is no change in the reported cost, and the mechanics of the execution plan will be quite similar at run time – but in real systems there are bound to be cases where the new strategy is more efficient.

Unfortunately …

Bug 18650065 (fixed in 12.2) rears it’s ugly head: WRONG RESULTS ON QUERY WITH SUBQUERY USING OR EXISTS.
I can demonstrate this with the following code:


update t1 set modded = null
where id <= 30;
commit;

select
	id, modded, mod_15
from
	t1
where
	t1.id = 1                     -- previously mod_15 = 1
and	(   t1.modded is null
	 or exists (
		select	null
		from	t2
		where	t2.id = t1.modded
	    )
	)
;

alter table t1 add constraint t1_pk primary key(id);

select
	id, modded, mod_15
from
	t1
where
	t1.id = 1                     -- previously mod_15 = 1
and	(   t1.modded is null
	 or exists (
		select	null
		from	t2
		where	t2.id = t1.modded
	    )
	)
;

And here’s the output from the above script:


30 rows updated.

Commit complete.

        ID     MODDED     MOD_15
---------- ---------- ----------
         1                     1

1 row selected.

Table altered.

no rows selected

I’ve modified a few rows so that the “null-aware” bit of the new transformation matters, but I’ve now got a data set and transformation where I get the wrong results because I’ve defined a primary key (unique would have done) on a critical column in the query. If you check the execution plan you’ll find that the optimizer has switched from a null aware semi-join to a simple nested loop join.

There is a workaround for this problem – disable the relevant feature:

alter session set "_optimizer_null_accepting_semijoin"=false;

For Reference:

Here’s the SQL to generate the data for the above demonstration:

create table t1
as
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 5000
)
select
	rownum			id,
	mod(rownum,999)		modded,
	mod(rownum,15)		mod_15,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 10000
;

update t1 set modded = null where modded = 26;

create index t1_i1 on t1(id);
create index t1_i2 on t1(modded);

create table t2
as
select
	2 * rownum		id,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	all_Objects
where
	rownum <= 20
;	

alter table t2 add constraint t2_pk primary key(id);

May 4, 2014

Extended stats

Filed under: 12c,Histograms,Oracle,Statistics — Jonathan Lewis @ 1:24 pm BST May 4,2014

Like the recent article on deleting histograms this is another draft that I rediscovered while searching for some notes I had written on a different topic – so I’ve finally finished it off and published it.

Here’s a quirky little detail of extended stats that came up in an OTN thread earlier on this week [ed: actually 8th Jan 2014]. When you create column group stats, Oracle uses an undocumented function sys_op_combined_hash() to create a hash value, and if you gather simple stats on the column (i.e. no histogram) you can get some idea of the range of values that Oracle generates through the hash function. For example:


create table t1 as
select  1 n1, 2 n2
from dual
connect by level<=5000
union all
select  2, 1
from dual
connect by level<=5000
;

select dbms_stats.create_extended_stats(user,'t1','(n1, n2)') name from dual;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 1'
	);
end;
/

break on column_name skip 1

column column_name format a32
column endpoint_number heading "EP No."
column endpoint_value  heading "Value" format 999,999,999,999,999,999,999

select
	column_name, endpoint_number, endpoint_value
from
	user_tab_histograms
where
	table_name = 'T1'
order by
	column_name, endpoint_value
;

COLUMN_NAME                          EP No.                        Value
-------------------------------- ---------- ----------------------------
N1                                        0                            1
                                          1                            2

N2                                        0                            1
                                          1                            2

SYS_STUBZH0IHA7K$KEBJVXO5LOHAS            0      298,332,787,864,733,000
                                          1    8,095,781,421,167,520,000

I could have selected low_value and high_value from user_tab_cols, using utl_raw.cast_to_number() to display them in numeric format, but the view user_tab_histograms display the low and high as a two-bucket histogram if there is no actual histogram data for the column in the histogram (histgrm$) table.

We probably don’t need to worry about what the low and high values might be because taking hash values destroys any meaning that a range might have (the optimizer can’t use column group stats in range-based predicates, only in equality predicates). However, we might collect a frequency histogram (or Top-N histogram in 12c) on the column group because there might be some data skew in the sets of values that we need to tell the optimizer about – so let’s gather a histogram with 2 buckets on our sample data set and see what we get:


begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 2'
	);
end;
/

select
	column_name, endpoint_number, endpoint_value
from
	user_tab_histograms
where
	table_name = 'T1'
order by
	column_name, endpoint_value
;

COLUMN_NAME                          EP No.                        Value
-------------------------------- ---------- ----------------------------
N1                                     5000                            1
                                      10000                            2

N2                                     5000                            1
                                      10000                            2

SYS_STUBZH0IHA7K$KEBJVXO5LOHAS         5000                1,977,102,303
                                      10000                7,894,566,276

The histogram values have changed! As one of the posters on the OTN thread points out, what Oracle has actually stored in this case is mod(sys_op_combined_hash() ,9999999999).

So far I haven’t done any exhaustive testing to work out whether the change in the strategy for storing numbers makes any difference to the typical optimizer arithmetic – but I do have at least one case (relating to “missing values” behaviour where the presence or absence of a column group histogram does make a difference to the estimated cardinality in a way that seems inconsistent with other patterns of behaviour: I doubt if it’s actually due to the change in what’s stored, and one day I may come across a client where I actually need to work out what’s going on and how best to work with the anomaly.

Bonus thought:

In 12c Oracle collects column stats automatically as it loads data into an empty table; but not if it’s got extended stats defined on it.  (This is bug no 18425876, fixed in 12.2). Here’s some code modelling a client scenario where we truncate and reload a table every day. In the first part of the demonstration I’ve loaded the table twice to show that after truncating and reloading I get new stats on the table – the first load is 10,000 rows, the second is 20,000 rows and the stats reflect this automatically. In the second part of the code, after adding a set of column group stats, truncating and loading 5,000 rows, the stats from the previous cycle are still in place. (The code is only relevant to 12c, of course)

create table t1 (n1 number, n2 number);

insert	/*+ append */
into	t1
select	object_id, data_object_id
from	all_objects
where
	rownum <= 10000
;
commit;

-- stats have appeared without a call to dbms_stats to gather them.

select blocks, num_rows from user_tables where table_name = 'T1';
select column_name, num_distinct, num_nulls, density, low_value, high_value from user_tab_cols where table_name = 'T1';

truncate table t1;

insert	/*+ append */
into	t1
select	object_id, data_object_id
from	all_objects
where
	rownum <= 20000
;
commit;

-- Stats now show the latest data 

select blocks, num_rows from user_tables where table_name = 'T1';
select column_name, num_distinct, num_nulls, density, low_value, high_value from user_tab_cols where table_name = 'T1';

-- Add a column group to the stats 

select dbms_stats.create_extended_stats(user,'t1','(n1, n2)') name from dual;

truncate table t1;

insert	/*+ append */
into	t1
select	object_id, data_object_id
from	all_objects
where
	rownum <= 5000
;
commit;

-- The stats have not been updated to reflect the new data, and the column group stats are empty

select blocks, num_rows from user_tables where table_name = 'T1';
select column_name, num_distinct, num_nulls, density, low_value, high_value from user_tab_cols where table_name = 'T1';

--------------------------------------------------------
-- Here are the three consecutive sets of results
--------------------------------------------------------

    BLOCKS   NUM_ROWS
---------- ----------
        16      10000

COLUMN_NAME                      NUM_DISTINCT  NUM_NULLS    DENSITY LOW_VALUE                  HIGH_VALUE
-------------------------------- ------------ ---------- ---------- -------------------------- --------------------------
N1                                      10000          0      .0001 C103                       C3020C60
N2                                       2534       7429 .000394633 C103                       C30B2929

    BLOCKS   NUM_ROWS
---------- ----------
        32      20000

COLUMN_NAME                      NUM_DISTINCT  NUM_NULLS    DENSITY LOW_VALUE                  HIGH_VALUE
-------------------------------- ------------ ---------- ---------- -------------------------- --------------------------
N1                                      20000          0     .00005 C103                       C30A4553
N2                                       3115      16848 .000321027 C103                       C30B2929

    BLOCKS   NUM_ROWS
---------- ----------
        32      20000

COLUMN_NAME                      NUM_DISTINCT  NUM_NULLS    DENSITY LOW_VALUE                  HIGH_VALUE
-------------------------------- ------------ ---------- ---------- -------------------------- --------------------------
N1                                      20000          0     .00005 C103                       C30A4553
N2                                       3115      16848 .000321027 C103                       C30B2929
SYS_STUBZH0IHA7K$KEBJVXO5LOHAS

The workaround given in the bug is “add the extended stats after loading the table” – but if you’re constantly truncating and reloading that means you have to drop and add the extended stats and do a tablescan to gather the column group stats every time you reload.

Note: the limitation applies whether you create a column group, “ordinary” extended stats, a virtual column, or an implicit virtual column underlying a function-based index.

Just as a little aside – when I first wrote the demo script I forgot to put in the commit; after the insert/append – which meant I was trying to create column group stats on a table which should have given me Oracle error: “ORA-12838: cannot read/modify an object after modifying it in parallel”; instead this has been trapped by the dbms_stats package and shows up as a slightly confusing:


select dbms_stats.create_extended_stats(user,'t1','(n1, n2)') name from dual
       *
ERROR at line 1:
ORA-20001: Error when processing extension -  resource busy and acquire with NOWAIT specified or timeout expired
ORA-06512: at "SYS.DBMS_STATS", line 12977
ORA-06512: at "SYS.DBMS_STATS", line 44967
ORA-06512: at "SYS.DBMS_STATS", line 44986

March 14, 2014

12c Temporary

Filed under: 12c,Infrastructure,Oracle,undo — Jonathan Lewis @ 6:41 pm BST Mar 14,2014

Just one of those little snippets to cover something new and remind of something old. A single session can now have three (or more) temporary tablespaces in use at the same time for different reasons.

  • In 12c you can set parameter temp_undo_enabled to true, at which point the undo relating to global temporary tables (GTTs) will be written into the database default temporary tablespace, which means it won’t generate redo. As an interesting side effect this also means that you can do DML on temporary objects in a physical standby database. Currently the undo segment shows as type “UNDEFINED” in v$sort_usage. A detail to watch out for, though – it looks as if each session may get its own temporary undo segment – so be careful about specifying the extent size and tempfile size for the database default temporary tablespace.
  • In 11g you can specify a tablespace (though not a tablespace group) in the ‘create global temporary table’ statement, which means you keep activity about GTTs separated from the I/O resulting from sorts and hash joins etc. spilling to disc; in fact you could specify a different temporary tablespace for every GTT if you wanted to – and I could imagine a case for having a couple of different temporary tablespaces to accommodate GTTs with very different usage characteristics. (Unfortunately you still can’t specify a tablespace in the dbms_lob.create_temporary() function).  If you don’t specify a tablespace for a GTT it will go into the default temporary tablespace of the user who is using it (not the database default, and not the default for the definer of the GTT). If you create indexes on a GTT they will automatically go into the same tablespace as the table.
  • Finally, of course, there’s the default temporary tablespace for the user and this is where GTTs will go if they don’t have a tablespace specified, and where all the scratch data (sorts, hash tables, factored subqueries et. al.) will go.

This combination means, of course, that you could manage to do a single “insert as select” writing a GTT to one temporary tablespace, with its undo going to a second temporary tablespace, and the spill from a sort or hash join in the select going to a third. The flexibility probably won’t make much difference to performance (for most people), but it’s possible that it will make it easier to monitor where the work is coming from if you’r ever in the position where your single temporary tablespace is subject to a lot of I/O.

Footnote:

In the past I’ve advised DBAs to set up a small number of tablespaces (or tablespace groups) so that they can allocate different classes of users – typically grouped by business function – to different temporary tablespaces. The ability to allocate GTTs to temporary tablespaces allows a further degree of refinement to this strategy.

March 5, 2014

12c pq_replicate

Filed under: 12c,Exadata,Execution plans,Oracle,Parallel Execution — Jonathan Lewis @ 5:39 pm BST Mar 5,2014

Another day, another airport lounge – another quick note: one of the changes that appeared in 12c was a tweak to the “broadcast” distribution option of parallel queries. I mentioned this in a footnote to a longer article a couple of months ago; this note simply expands on that brief comment with an example. We’ll start with a simple two-table hash join – which I’ll first construct and demonstrate in 11.2.0.4:


create table t1
as
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 3000
)
select
	rownum				n1,
	lpad(rownum,6,'0')		small_vc,
	lpad(rownum,200,'0')		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1000
;

create table t2
as
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 3000
)
select
	1 + mod(rownum,10000)			n1,
	lpad(1 + mod(rownum,10000),6,'0')	small_vc,
	lpad(rownum,500,'0')			padding
from
	generator	v1,
	generator	v2
where
	rownum <= 20000 ;

-- collect stats, no histograms.

select
  	/*+
  		leading(t1 t2)
 		parallel(t1 2)
 		parallel(t2 2)
 		use_hash(t2)
 	*/
 	t1.padding,
 	t2.padding
from 	t1, t2
where	t2.n1 = t1.n1
and	t2.small_vc = t1.small_vc
;

-------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost  |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |  1000 |   707K|   135 |        |      |            |
|   1 |  PX COORDINATOR         |          |       |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10001 |  1000 |   707K|   135 |  Q1,01 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN            |          |  1000 |   707K|   135 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE          |          |  1000 |   207K|     4 |  Q1,01 | PCWP |            |
|   5 |      PX SEND BROADCAST  | :TQ10000 |  1000 |   207K|     4 |  Q1,00 | P->P | BROADCAST  |
|   6 |       PX BLOCK ITERATOR |          |  1000 |   207K|     4 |  Q1,00 | PCWC |            |
|   7 |        TABLE ACCESS FULL| T1       |  1000 |   207K|     4 |  Q1,00 | PCWP |            |
|   8 |     PX BLOCK ITERATOR   |          | 20000 |     9M|   131 |  Q1,01 | PCWC |            |
|   9 |      TABLE ACCESS FULL  | T2       | 20000 |     9M|   131 |  Q1,01 | PCWP |            |
-------------------------------------------------------------------------------------------------

In this plan slave set 2 scans table t1 in parallel and broadcasts the result set to slave set 1 (lines 5 – 7). The significance of the broadcast option is that each slave in slave set 2 sends all the rows it has read to every slave in slave set 1. For a fairly large table with a high degree of parallelism this could be a lot of inter-process communication; the total number of rows passing through the PX message pool is “DOP x number of row filtered from t1″.

After a slave in slave set 1 has receive the whole of the t1 result set it builds an in-memory hash table and starts scanning rowid ranges (PX BLOCK ITERATOR) from table t2, probing the in-memory hash table to effect the join (lines 3,4, 8,9). Since each slave has a copy of the whole result set from t1 it can scan any chunk of t2 and handle the contents locally. Moreover, because slave set 1 isn’t reading its second input from a virtual table it is able to write its output immediately the virtual table (:TQ10001) that feeds the query coordinator with the result (lines 1,2) – we don’t have to do a “hash join buffered” operation and buffer the entire second input before starting to execute the join.

So how does 12c change things. With the same starting data and query, here’s the execution plan:

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost  |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |  1000 |   707K|   135 |        |      |            |
|   1 |  PX COORDINATOR       |          |       |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM) | :TQ10000 |  1000 |   707K|   135 |  Q1,00 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN          |          |  1000 |   707K|   135 |  Q1,00 | PCWP |            |
|   4 |     TABLE ACCESS FULL | T1       |  1000 |   207K|     4 |  Q1,00 | PCWP |            |
|   5 |     PX BLOCK ITERATOR |          | 20000 |     9M|   131 |  Q1,00 | PCWC |            |
|   6 |      TABLE ACCESS FULL| T2       | 20000 |     9M|   131 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------

Notice, in particular, that we only have one virtual table (or table queue :TQ10000) rather than two – and that’s from a parallel query slave set to the query co-ordinator, parallel to serial; the query only uses one set of parallel query slaves. Until you run the query with rowsource execution statistics enabled and look at the output from v$pq_tqstat it’s not going to be immediately obvious what has happened, but we should see that somehow Oracle is no longer broadcasting the first table even though it’s still doing something in parallel with both tables.

The run-time statistics confirm that we’ve only used one set of slaves, and each slave in the slave set has scanned the whole of table t1. This means each slave can build the full hash table and then go on to read rowid ranges from table t2. We’ve managed to get the benefit of broadcasting t1 (every slave has the whole of t1 so we don’t have to scan and distribute the big table t2 through the PX message pool) but we haven’t had to clone it multiple times through the PX message pool.

Clearly there’s a trade-off here that Oracle Corp. has decided is worth considering. I’m guessing it’s biased towards Exadata where you might run queries with a very high degree of parallelism. In that case the overhead of task switching as large numbers of messages are passed around may (and this is pure supposition) be greater than the added cost of loading the table into the buffer cache (of each instance) and having each slave scan it from there. (Reminder – 11g introduced two “opposite” changed to tablescans: “serial direct reads” and “parallel in-memory scans”.)

There’s one little oddity in this replication – there’s a pair of hints: pq_replicate and no_pq_replicate to control the effect if you think the optimizer is making the wrong choice. I would have guessed that in my example the hint would read: /*+ pq_replicate(t1) */ as it’s table t1 that is read by every single slave. Strangely, though, this is what the outline section of the execution plan showed:


  /*+
      BEGIN_OUTLINE_DATA
      PQ_REPLICATE(@"SEL$1" "T2"@"SEL$1")
      PQ_DISTRIBUTE(@"SEL$1" "T2"@"SEL$1" BROADCAST NONE)
      USE_HASH(@"SEL$1" "T2"@"SEL$1")
      LEADING(@"SEL$1" "T1"@"SEL$1" "T2"@"SEL$1")
      FULL(@"SEL$1" "T2"@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      OPT_PARAM('_optimizer_cost_model' 'io')
      DB_VERSION('12.1.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Notice how the hint specifies table t2, not table t1 !

Footnote

Here’s a little anomaly,  and a generic warning about “optimizer_features_enable”: I found that if I used the hint /*+ optimizer_features_enable(‘11.2.0.4′) */ in 12c I could still get the pq_replicate() hint to work. Unfortunately there are a few places where the hint (or parameter) isn’t guaranteed to take the optimizer code backwards the full 100%.

February 14, 2014

12c Subquery Factoring

Filed under: 12c,Oracle,Subquery Factoring,Tuning — Jonathan Lewis @ 11:44 am BST Feb 14,2014

From time to time I’ve posted a reminder that subquery factoring (“with subquery”) can give you changes in execution plans even if the subquery that you’ve taken out of line is written back inline by Oracle rather than being materialized. This can still happen in 12c – here’s a sample query in the two forms with the result sets and execution plans.  First, the “factored” version:

(more…)

February 6, 2014

12c fixed subquery

Filed under: 12c,Execution plans,Oracle,subqueries — Jonathan Lewis @ 2:25 pm BST Feb 6,2014

Here’s a simple little demonstration of an enhancement to the optimizer in 12c that may result in some interesting changes in execution plans as cardinality estimates change from “guesses” to accurate estimates.

(more…)

December 8, 2013

12c Subqueries

Filed under: 12c,CBO,Execution plans,Oracle,subqueries — Jonathan Lewis @ 11:32 am BST Dec 8,2013

When you upgrade you often find that some little detail (of the optimizer) that didn’t receive a lot of attention in the “New Features” manuals introduces a few dramatic changes in execution plans. Here’s one example of a detail that is likely to catch a few unlucky people. We start with a very simple table which is just and id column with some padding, and then show the effect of a change in the handling of “constant subqueries”. Here’s my data set:

(more…)

November 14, 2013

32K Columns

Filed under: 12c,Function based indexes,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 8:06 am BST Nov 14,2013

Oracle 12c has increased the maximum length of character-based columns to 32K bytes – don’t get too excited, they’re stored out of lines (so similar in cost to LOBs) and need some modification to the parameter file and data dictionary (starting the database in upgrade mode) before you can use them.

Richard Foote has a pair of articles on indexing such columns:

Be cautious about enabling this option and test carefully – there are going to be a number of side effects, and some of them may require a significant investment in time to resolve. The first one that came to my mind was that if you’ve created a function-based index on a pl/sql function that returns a varchar2() type and haven’t explicitly created the index on a substr() of the return value then the data type of the function’s return value will change from the current default of varchar2(4000) to varchar2(32767) – which means the index will become invalid and can’t be rebuilt or recreated.

Obviously you can redefine the index to include an explicit substr() call – but then you have to find all the code that was supposed to use the index and modify it accordingly.

November 6, 2013

12c In-memory

Filed under: 12c,compression,Indexing,Infrastructure,Oracle — Jonathan Lewis @ 6:53 pm BST Nov 6,2013

I wrote a note about the 12c “In-Memory” option some time ago on the OTN Database forum and thought I’d posted a link to it from the blog. If I have I can’t find it now so, to avoid losing it, here’s a copy of the comments I made:

(more…)

October 16, 2013

Hash Clusters – 3

Filed under: 12c,Bugs,dbms_xplan,Oracle,Upgrades — Jonathan Lewis @ 1:03 pm BST Oct 16,2013

This note is a quick summary of a costing oddity that came to light after a twitter conversation with Christian Antognini yesterday. First a little test script to get things going:

(more…)

October 9, 2013

12c Histograms pt.3

Filed under: 12c,Histograms,Oracle,Statistics — Jonathan Lewis @ 8:13 pm BST Oct 9,2013

It has taken much longer than I anticipated to get around to writing part 3 of this mini-series on what Oracle has done about histograms in 12c.
In part 1 I gave a thumbnail sketch of the three types of histogram available in 12c
In part 2 I described in some detail the improvements in performance and accuracy for the frequency and top-frequency histograms

In part 3 of this mini-series I’ll be describing how the implementation of the “hybrid” histogram that Oracle produces if the “approximate NDV” mechanism has been enabled and you’ve left the estimate_percent to auto_sample_size. There is little difference between the work needed to create a hybrid histogram and the work needed to generate the old “height-balanced” histogram, but the degree of information captured by the hybrid is much greater than that of the height-balanced.

(more…)

Hinting

Filed under: 12c,Hints,Ignoring Hints,Oracle — Jonathan Lewis @ 5:33 pm BST Oct 9,2013

I’ve spent so many years trying to explain that a “hint” to the Oracle optimizer is an order – if you know how to do it properly – that I finally decided to list the manual references that have made this point over the last 15 or so years. Here’s the list, which ends with a surprising change of flavour. (Emphasis in the body of the text is mine).

(more…)

September 3, 2013

Bitmap / Btree

Filed under: 12c,Indexing,Oracle — Jonathan Lewis @ 7:28 am BST Sep 3,2013

Here’s a little note that came about after I tweeted an idle thought on Twitter yesterday

  • 12c allows you to have multiple indexes on the same columns on a table, although only one of them is allowed to be visible at any one time – you can do the same with any recent versions of Oracle “almost”, and without the invisibility requirements. (Thanks to Jason Bucata for suggesting the critical detail on this one.)
  • 12c allows you to have “partial” indexing on partitioned tables –  you can do the same with earlier versions of Oracle “almost” but only if the indexes are local indexes or globally partitioned.
  • 12c doesn’t officially allow you to create an index that is a bitmap in the past and a btree in the present (yet) – although you can almost do this in any recent versions of Oracle.

(more…)

August 9, 2013

12c Join Views

Filed under: 12c,Oracle,Performance,Tuning — Jonathan Lewis @ 6:36 pm BST Aug 9,2013

There are a couple of posts on the blog describing problems with updateable join views or, to be more precise, join views which were key-preserved but which the optimizer did not recognize as key-preserved. Both scenarios are addressed in 12c:

Next Page »

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,909 other followers