I gather that journalistic style dictates that if the headline is a question then the answer is no. So, following on from a discussion of possible side effects of partition exchange, let’s look at an example which doesn’t involve partitions. I’ve got a schema that holds nothing by two small, simple heap tables, parent and child, (with declared primary keys and the obvious referential integrity constraint) and I run a couple of very similar queries that produce remarkably different results:
select par.id parent_id, chi.id child_id, chi.name child_name from parent par, child chi where chi.id_p = par.id order by par.id, chi.id ; PARENT_ID CHILD_ID CHILD_NAME ---------- ---------- ---------- 1 1 Simon 1 2 Sally 2 1 Janet 2 2 John 3 1 Orphan 5 rows selected.
Having got this far with my first query I’ve decided to add the parent name to the report:
select par.id parent_id, par.name parent_name, chi.id child_id, chi.name child_name from parent par, child chi where chi.id_p = par.id order by par.id, chi.id ; PARENT_ID PARENT_NAM CHILD_ID CHILD_NAME ---------- ---------- ---------- ---------- 1 Smith2 1 Simon 1 Smith 1 Simon 1 Smith2 2 Sally 1 Smith 2 Sally 2 Jones 1 Janet 2 Jones 2 John 6 rows selected.
How could adding a column to the select list result in one child row disappearing and two child rows being duplicated; and is this a bug ?
To avoid any confusion, here’s the complete script I used for creating the schema owner, in 126.96.36.199, with no extra privileges granted to PUBLIC:
create user u1 identified by u1 default tablespace test_8k quota unlimited on test_8k ; grant create session, create table to u1 ;
It didn’t take long for a couple of people to suggest that the oddity was the consequence of constraints that had not been enabled and validated 100% of the time, but the suggestions offered were a little more convoluted than necessary. Here’s the code I ran from my brand new account before running the two select statements:
create table parent ( id number(4), name varchar2(10), constraint par_pk primary key (id) rely disable novalidate ) ; create table child( id_p number(4) constraint chi_fk_par references parent on delete cascade rely disable novalidate, id number(4), name varchar2(10), constraint chi_pk primary key (id_p, id) rely disable novalidate ) ; insert into parent values (1,'Smith'); insert into parent values (1,'Smith2'); insert into parent values (2,'Jones'); insert into child values(1,1,'Simon'); insert into child values(1,2,'Sally'); insert into child values(2,1,'Janet'); insert into child values(2,2,'John'); insert into child values(3,1,'Orphan'); commit; begin dbms_stats.gather_table_stats(user,'child'); dbms_stats.gather_table_stats(user,'parent'); end; /
In a typical data warehouse frame of mind I’ve added plenty of constraints, but left them all disabled and novalidated, but told Oracle to rely on them for optimisation strategies. This means all sorts of incorrect data could get into the tables, with all sorts of unexpected side effects on reporting. The example above shows duplicates on primary keys (and if you checked the table definition you’d find that the primary key columns were nullable as well), child rows with no parent key.
In fact 11g and 12c behave differently – the appearance of the Orphan row in the first sample query is due, as Chris_cc pointed out in the first comment, to the optimizer deciding that it could use join elimination because it was joining to a single-column primary key without selecting any other columns from the referenced table. In 12c the optimizer doesn’t use join elimination for this query, so both queries have the same (duplicated) output.
Make sure you read the articles linked to by Dani Schneider’s comment below, and note especially the impact on the query_rewrite_integrity parameter.