Oracle Scratchpad

February 14, 2014

12c Subquery Factoring

Filed under: 12c,Oracle,Subquery Factoring,Tuning — Jonathan Lewis @ 11:44 am BST Feb 14,2014

From time to time I’ve posted a reminder that subquery factoring (“with subquery”) can give you changes in execution plans even if the subquery that you’ve taken out of line is written back inline by Oracle rather than being materialized. This can still happen in 12c – here’s a sample query in the two forms with the result sets and execution plans. ¬†First, the “factored” version:


with e as (
	select
		deptno
	from	emp
	order by
		deptno
)
select
	deptno, count(*)
from
	e
group by deptno
;

    DEPTNO   COUNT(*)
---------- ----------
        10          3
        20          5
        30          6

3 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1536940522

-------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost  |
-------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     3 |     9 |     6 |
|   1 |  SORT GROUP BY NOSORT|      |     3 |     9 |     6 |
|   2 |   VIEW               |      |    14 |    42 |     6 |
|   3 |    SORT ORDER BY     |      |    14 |    42 |     6 |
|   4 |     TABLE ACCESS FULL| EMP  |    14 |    42 |     2 |
-------------------------------------------------------------

Note that the view has not been materialized (no “TEMP TABLE TRANSFORMATION”). There is a “sort group by” with NOSORT, and the data has been reported in sorted order as a side effect of the “sort order by” in line 3.

Now with the inline view manually written in place:


select   deptno, count(*)
from (
	select
		/*+ gather_plan_statistics */
		deptno
	from	emp
	order by
		deptno
) e
group by deptno
;

    DEPTNO   COUNT(*)
---------- ----------
        30          6
        20          5
        10          3

3 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4067220884

-----------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     3 |     9 |     6 |
|   1 |  HASH GROUP BY     |      |     3 |     9 |     6 |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |    42 |     2 |
-----------------------------------------------------------

In principle, the optimizer should have rewritten the first query in the second form and optimized it – but there’s clearly something out of order in the optimisation. As you can see, the result is no longer in the same order (but we don’t expect any specific order, of course, since we didn’t end with an “order by” clause) and the aggregation step is a hash group by rather than a sort group by.

Footnote:

I know that I got this example off someone else, but all I’ve written into my script is a URL, and it’s the wrong URL – my clipboard buffer seems to lag a little sometimes when I’m copying from the host to a virtual machine. The URL I’ve got is to an article by Steven Pinker in the New Republic, and it had nothing to do with Oracle (though it was comparing science and the humanities).

If you’re the original source of the example, please let me know so that I can add an acknowledgement.

[Further reading on "subquery factoring"]

Test case

If you would like to reproduce the example, here’s the script – it’s a simple hack of the Scott emp/dept schema.

CREATE TABLE EMP (
	EMPNO		NUMBER(4) NOT NULL,
        ENAME		VARCHAR2(10),
        JOB		VARCHAR2(9),
        MGR		NUMBER(4),
        HIREDATE	DATE,
        SAL		NUMBER(7, 2),
        COMM		NUMBER(7, 2),
        DEPTNO		NUMBER(2)
);

insert into emp values
        (7369, 'SMITH',  'CLERK',     7902,
        to_date('17-DEC-1980', 'DD-MON-YYYY'),  800, NULL, 20);

insert into emp values
        (7499, 'ALLEN',  'SALESMAN',  7698,
        to_date('20-FEB-1981', 'DD-MON-YYYY'), 1600,  300, 30);

insert into emp values
        (7521, 'WARD',   'SALESMAN',  7698,
        to_date('22-FEB-1981', 'DD-MON-YYYY'), 1250,  500, 30);

insert into emp values
        (7566, 'JONES',  'MANAGER',   7839,
        to_date('2-APR-1981', 'DD-MON-YYYY'),  2975, NULL, 20);

insert into emp values
        (7654, 'MARTIN', 'SALESMAN',  7698,
        to_date('28-SEP-1981', 'DD-MON-YYYY'), 1250, 1400, 30);

insert into emp values
        (7698, 'BLAKE',  'MANAGER',   7839,
        to_date('1-MAY-1981', 'DD-MON-YYYY'),  2850, NULL, 30);

insert into emp values
        (7782, 'CLARK',  'MANAGER',   7839,
        to_date('9-JUN-1981', 'DD-MON-YYYY'),  2450, NULL, 10);

insert into emp values
        (7788, 'SCOTT',  'ANALYST',   7566,
        to_date('09-DEC-1982', 'DD-MON-YYYY'), 3000, NULL, 20);

insert into emp values
        (7839, 'KING',   'PRESIDENT', NULL,
        to_date('17-NOV-1981', 'DD-MON-YYYY'), 5000, NULL, 10);

insert into emp values
        (7844, 'TURNER', 'SALESMAN',  7698,
        to_date('8-SEP-1981', 'DD-MON-YYYY'),  1500,    0, 30);

insert into emp values
        (7876, 'ADAMS',  'CLERK',     7788,
        to_date('12-JAN-1983', 'DD-MON-YYYY'), 1100, NULL, 20);

insert into emp values
        (7900, 'JAMES',  'CLERK',     7698,
        to_date('3-DEC-1981', 'DD-MON-YYYY'),   950, NULL, 30);

insert into emp values
        (7902, 'FORD',   'ANALYST',   7566,
        to_date('3-DEC-1981', 'DD-MON-YYYY'),  3000, NULL, 20);

insert into emp values
        (7934, 'MILLER', 'CLERK',     7782,
        to_date('23-JAN-1982', 'DD-MON-YYYY'), 1300, NULL, 10);

CREATE TABLE DEPT(
	DEPTNO	NUMBER(2),
        DNAME	VARCHAR2(14),
        LOC	VARCHAR2(13)
);

insert into dept values (10, 'ACCOUNTING', 'NEW YORK');
insert into dept values (20, 'RESEARCH',   'DALLAS');
insert into dept values (30, 'SALES',      'CHICAGO');
insert into dept values (40, 'OPERATIONS', 'BOSTON');

commit;

alter table emp add constraint emp_pk primary key(empno);
alter table dept add constraint dept_pk primary key(deptno);

1 Comment »

  1. Jonathan,

    I’ve seen so many blog posts about the WITH subquery that suggest it produces “bad” optimization plans that I hesitate to use it. I have a couple times but only when embedding the subquery would be too painful (because of duplications in the main query).

    Do you have any guidelines for when you would and absolutely would not use this technique?

    Thanks,

    Stew

    Comment by Stew Stryker — February 16, 2014 @ 5:49 pm BST Feb 16,2014 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 3,453 other followers