You might have expected the following query ought to run reasonably efficiently, after all it seems to be targeted very accurately at precisely the few rows of information I’m interested in:
select column_name, avg_col_len from dba_subpart_col_statistics where owner = 'TEST_USER' and table_name = 'TEST_COMP' and subpartition_name = 'P_MAX_D'
I’m after some subpartition column stats (so that I can work out whether a subpartition of a local index on a composite partition is roughly the right size) and I’m querying the view by the only columns that seem to be there to allow me to access the data efficiently. Unfortunately the execution plan isn’t doing what I need it to do. The following plan is coming from a small 11.2.0.3 database with up to date statistics:
------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 2141 | 18 (6)| 00:00:01 | | 1 | NESTED LOOPS OUTER | | 1 | 2141 | 18 (6)| 00:00:01 | |* 2 | HASH JOIN | | 1 | 2128 | 16 (7)| 00:00:01 | | 3 | NESTED LOOPS | | 1 | 100 | 3 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID | USER$ | 1 | 18 | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | I_USER1 | 1 | | 0 (0)| 00:00:01 | |* 6 | INDEX RANGE SCAN | I_OBJ2 | 1 | 82 | 2 (0)| 00:00:01 | | 7 | VIEW | TSP$ | 954 | 1889K| 13 (8)| 00:00:01 | |* 8 | HASH JOIN | | 954 | 83952 | 13 (8)| 00:00:01 | | 9 | NESTED LOOPS OUTER | | 82 | 6396 | 10 (0)| 00:00:01 | | 10 | NESTED LOOPS | | 82 | 2788 | 9 (0)| 00:00:01 | | 11 | TABLE ACCESS FULL | TABCOMPART$ | 7 | 70 | 2 (0)| 00:00:01 | |* 12 | TABLE ACCESS CLUSTER | COL$ | 12 | 288 | 1 (0)| 00:00:01 | |* 13 | INDEX UNIQUE SCAN | I_OBJ# | 1 | | 0 (0)| 00:00:01 | | 14 | TABLE ACCESS BY INDEX ROWID| ATTRCOL$ | 1 | 44 | 1 (0)| 00:00:01 | |* 15 | INDEX UNIQUE SCAN | I_ATTRCOL1 | 1 | | 0 (0)| 00:00:01 | | 16 | TABLE ACCESS FULL | TABSUBPART$ | 81 | 810 | 2 (0)| 00:00:01 | | 17 | TABLE ACCESS BY INDEX ROWID | HIST_HEAD$ | 1 | 13 | 2 (0)| 00:00:01 | |* 18 | INDEX RANGE SCAN | I_HH_OBJ#_INTCOL# | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("O"."OBJ#"="TSP"."OBJ#") 5 - access("U"."NAME"='TEST_USER') 6 - access("O"."OWNER#"="U"."USER#" AND "O"."NAME"='TEST_COMP' AND "O"."NAMESPACE"=1 AND "O"."REMOTEOWNER" IS NULL AND "O"."LINKNAME" IS NULL AND "O"."SUBNAME"='P_MAX_D' AND "O"."TYPE#"=34) filter("O"."TYPE#"=34 AND "O"."SUBNAME"='P_MAX_D' AND "O"."LINKNAME" IS NULL) 8 - access("TSP"."POBJ#"="TCP"."OBJ#") 12 - filter(BITAND("C"."PROPERTY",32768)<>32768) 13 - access("TCP"."BO#"="C"."OBJ#") 15 - access("C"."OBJ#"="A"."OBJ#"(+) AND "C"."INTCOL#"="A"."INTCOL#"(+)) 18 - access("TSP"."OBJ#"="H"."OBJ#"(+) AND "TSP"."INTCOL#"="H"."INTCOL#"(+))
Notice particularly the full tablescan of the tabcompart$ table (the thing that identifies the (logical) partition objects that make up a composite partitioned table). This is happening inside the view tsp$, which Oracle is treating as a non-mergeable view, so we have to ask why the optimizer isn’t using one of two features that would allow this query to run more efficiently:
- view merging so that it can index into tabsubpart$ using the object id I get in line 6 of the plan
- predicate pushing to achieve a similar effect in a different way
Let’s look at the definition of dba_subpart_col_statistics (you can find this in the dictionary script $ORACLE_HOME/rdbms/admin/cdoptim.sql if you’re running a recente version of Oracle) to see what we can discover:
create or replace view DBA_SUBPART_COL_STATISTICS (OWNER, TABLE_NAME, SUBPARTITION_NAME, COLUMN_NAME, NUM_DISTINCT, LOW_VALUE, HIGH_VALUE, DENSITY, NUM_NULLS, NUM_BUCKETS, SAMPLE_SIZE, LAST_ANALYZED, GLOBAL_STATS, USER_STATS, AVG_COL_LEN, HISTOGRAM) as select u.name, o.name, o.subname, tsp.cname, h.distcnt, case when SYS_OP_DV_CHECK(o.name, o.owner#) = 1 then h.lowval else null end, case when SYS_OP_DV_CHECK(o.name, o.owner#) = 1 then h.hival else null end, h.density, h.null_cnt, case when nvl(h.distcnt,0) = 0 then h.distcnt when h.row_cnt = 0 then 1 when (h.bucket_cnt > 255 or (h.bucket_cnt > h.distcnt and h.row_cnt = h.distcnt and h.density*h.bucket_cnt < 1)) then h.row_cnt else h.bucket_cnt end, h.sample_size, h.timestamp#, decode(bitand(h.spare2, 2), 2, 'YES', 'NO'), decode(bitand(h.spare2, 1), 1, 'YES', 'NO'), h.avgcln, case when nvl(h.row_cnt,0) = 0 then 'NONE' when (h.bucket_cnt > 255 or (h.bucket_cnt > h.distcnt and h.row_cnt = h.distcnt and h.density*h.bucket_cnt < 1)) then 'FREQUENCY' else 'HEIGHT BALANCED' end from sys.obj$ o, sys.hist_head$ h, tsp$ tsp, user$ u where o.obj# = tsp.obj# and tsp.obj# = h.obj#(+) and tsp.intcol# = h.intcol#(+) and o.type# = 34 /* TABLE SUBPARTITION */ and o.owner# = u.user# and o.namespace = 1 and o.remoteowner IS NULL and o.linkname IS NULL /
This doesn’t look too bad. Although the layout of the code makes it a little hard to read it looks as if we could start at obj$ with the object name, join to user$ to check we’ve got the right object owner, then join to tsp$ to get some “denormalised” details about the Table Sub Partition (like the subpartition object number and internal column numbers) and then join to hist_head$ where the interesting column statistics are kept. The join to hist_head$ will have to appear last in the join order because it’s an outer join (which probably has to be there so that the query can return rows that report NULLs for columns where the statistics haven’t been collected).
So let’s take a closer look at tsp$ (again it’s in script cdoptim.sql) – perhaps that’s where the problem is:
create or replace view TSP$ as select tsp.obj#, tcp.bo#, c.intcol#, decode(bitand(c.property, 1), 1, a.name, c.name) cname from sys.col$ c, sys.tabsubpart$ tsp, sys.tabcompart$ tcp, attrcol$ a where tsp.pobj# = tcp.obj# and tcp.bo# = c.obj# and bitand(c.property,32768) != 32768 /* not unused columns */ and c.obj# = a.obj#(+) and c.intcol# = a.intcol#(+) /
Again it doesn’t look too bad; we seem to have possible access paths through tsp.obj# (the subpartition’s object id), tcp.bo# (the base object id for the partition – which means the object id of the owning table), and c.intcol# (the internal column position for the column). Again, though, we see an outer join – we have a join to table attrcol$ (the thing that defines attributes for object types). But again there’s no obvious reason why this should cause a problem.
It’s only when you look closely at the whole picture that you can spot the problem. Here’s the critical outer join from the main query:
and tsp.obj# = h.obj#(+) and tsp.intcol# = h.intcol#(+)
But the relevant columns in the select list for view tsp$ are tsp.obj# and c.intcol# – we’re trying to do an outer join to two different tables, and there’s a limitation in the optimizer that makes this difficult. Here’s a brief demo of the basic problem:
rem rem Script: outer_limitation.sql rem Author: Jonathan Lewis rem Dated: May 2012 rem create table t1a ( n1a number, v1a varchar(10) ); create table t1b ( n1b number, v1b varchar(10) ); create table t2 ( n2a number, n2b number, v2 varchar2(10) ); create or replace view v1 as select t1a.n1a, t1a.v1a, t1b.n1b, t1b.v1b from t1a, t1b where t1b.n1b = t1a.n1a ; set autotrace traceonly explain select t1a.v1a, t1b.v1b, t2.v2 from t1a, t1b, t2 where t1b.n1b = t1a.n1a and t2.n2a(+) = t1a.n1a and t2.n2b(+) = t1b.n1b ; select v1a, v1b, t2.v2 from v1, t2 where t2.n2a(+) = v1.n1a and t2.n2b(+) = v1.n1b ; set autotrace off
When I try to run the simple query (any version up to 11.2.0.3) Oracle raises error “ORA-01417: a table may be outer joined to at most one other table”. When I hide the join between t1a and t1b inside the view v1, Oracle can execute the query but it has to make v1 a non-mergeable view that is also “non-pushable” (i.e. the optimizer can’t push the join predicates inside the view). Here’s the resulting execution plan; note, particularly, the VIEW operator at line 2 which tells us that v1 was not merged:
------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 73 | 8 | |* 1 | HASH JOIN OUTER | | 1 | 73 | 8 | | 2 | VIEW | V1 | 1 | 40 | 5 | |* 3 | HASH JOIN | | 1 | 40 | 5 | | 4 | TABLE ACCESS FULL| T1A | 1 | 20 | 2 | | 5 | TABLE ACCESS FULL| T1B | 1 | 20 | 2 | | 6 | TABLE ACCESS FULL | T2 | 1 | 33 | 2 | ------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T2"."N2A"(+)="V1"."N1A" AND "T2"."N2B"(+)="V1"."N1B") 3 - access("T1B"."N1B"="T1A"."N1A")
In the case of my client there was no reasonable workaround (although I could, perhaps, have designed a more complex piece of PL/SQL that avoided this particular query). Fortunately the client doesn’t use objects in the database, and we always collect for stale stats, so we decided that the simplest way to avoid the problem was to create a replacement view (jpl_subpart_col_stats) that copied the original view but turned the join to hist_head$ into an inner join – this made it possible for Oracle to merge the tsp$view, and the resulting query was fast enough for the job we had to do.
Footnote:
Don’t try to compare the behaviour of this query too closely with the behaviour of view dba_subpart_col_statistics, I’m just trying to demonstrate one little point with this example – the way in which the double outer join is currently illegal, and how it may be hidden (at a price) inside a view.
This restriction on outer joins disappears in Oracle 12c
Maybe you could change the title of this post. “double outer join and hiding behind a view”.
Comment by Hemant K Chitale — June 5, 2012 @ 4:00 pm BST Jun 5,2012 |
Why does Oracle have the restriction that “a table may be outer joined to at most one other table”? When I re-write the SQL using ANSI SQL it permits the join.
Comment by Ashley — June 6, 2012 @ 8:02 am BST Jun 6,2012 |
Ashley,
You’re correct, but the reason why you can do it in ANSI is that Oracle transforms your ANSI, in a form which basically contains the necessary non-mergeable in-line view that makes it possible, e.g.
Note the VIEW operator in line 2 of the plan, where the non-mergeable in-line view makes it presence felt.
Comment by Jonathan Lewis — June 7, 2012 @ 5:57 pm BST Jun 7,2012 |