Oracle Scratchpad

October 11, 2010

Distributed Objects

Filed under: distributed,Performance,Troubleshooting — Jonathan Lewis @ 7:12 pm BST Oct 11,2010

I recently came across a tidy solution to a common problem – how to minimise code maintenance in a procedure while maximising its flexibility. The task was fairly simple – create a ref cursor for a calling program, returning data that (a) followed complex selection rules and (b) allowed the user to specify numerous types of input.

The principle was simple – the final ref cursor was driven by a list of (say) order ids, and the details to be returned about those orders required some fairly complex SQL. To separate the complexity of constructing the list of columns from the complexity of identifying the required rows, the developers had split the procedure into two stages. First, select the list of relevant order ids using one of several possible statements, the appropriate statement being derived from an analysis of the inputs to the procedure; second, open a ref cursor using that list of order ids. In this way, if a new set of selection rules appeared, the only new code needed was a new query to select the ids – the main body of code didn’t need to be modified and re-optimised.

They could have inserted the list of order ids into a global temporary table (GTT), of course; but what they actually did was to “bulk collect” them into an object table type, and then use the table() operator to reference them in their ref cursor. Here’s a very simple example demonstrating the concept:


create type number_type_table as table of number;
/

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		rownum <= 10000
)
select
	cast(rownum as number(8,0))	id,
	lpad(rownum,10,'0')		small_vc,
	rpad('x',100)			padding
from
	generator	v1,
	generator	v2
where
	rownum <= 100000
;

alter table t1 add constraint t1_pk primary key(id);

-- collect some stats on the table
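-- A typical call would be something like the following (my choice of
-- options, not necessarily what any original script used):

begin
	dbms_stats.gather_table_stats(
		ownname		=> user,
		tabname		=> 'T1',
		method_opt	=> 'for all columns size 1'
	);
end;
/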

declare
	m_n1_tab	number_type_table;

begin
	select
		id
	bulk collect into m_n1_tab
	from
		t1
	where
		rownum <= 10
	;

	for c1 in (
		select
			/*+
				cardinality(l1 10)
			*/
			r1.small_vc
		from
			table(m_n1_tab)		l1,
			t1			r1
		where
			r1.id = l1.column_value
	) loop
		dbms_output.put_line(c1.small_vc);
	end loop;
end;
/

I’ve used the table t1 both as the source of a few id values and as the target for the final select – the client code was obviously more subtle and complex.

Key details – from the top down – are: the “table type”, the efficiency of using a “bulk collect”, and the table(m_n1_tab) that appears in the cursor loop. (I’ve used a simple cursor loop in my demonstration rather than opening a ref cursor). You’ll notice that I’ve included a /*+ cardinality */ hint to give the optimizer a rough idea of the volume of data in the collection.
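For the record, the client’s two-stage structure looked roughly like the following – a sketch only, built on my demo objects, with the procedure and parameter names (get_orders, p_order_id, p_max_id, p_results) invented for illustration:

create or replace procedure get_orders (
	p_order_id	in number,
	p_max_id	in number,
	p_results	out sys_refcursor
) as
	m_id_tab	number_type_table;
begin
	--
	-- Stage 1: derive the list of ids from the inputs.
	-- One branch per set of selection rules - a new rule
	-- means a new branch, and nothing else changes.
	--
	if p_order_id is not null then
		select id bulk collect into m_id_tab
		from	t1
		where	id = p_order_id;
	else
		select id bulk collect into m_id_tab
		from	t1
		where	id <= p_max_id;
	end if;

	--
	-- Stage 2: the single (complex) reporting query,
	-- driven by the collection we have just populated.
	--
	open p_results for
		select	/*+ cardinality(l1 10) */
			r1.small_vc
		from
			table(m_id_tab)		l1,
			t1			r1
		where
			r1.id = l1.column_value;
end;
/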

It’s clean, it’s simple, and it works well – until you start using distributed queries, and then sometimes the performance is dire. In this example my target table has a primary key, and I’m joining 10 rows on the primary key – the code is very fast and the plan (running 11.1.0.6) is as follows:

SELECT /*+     cardinality(l1 10)    */ R1.SMALL_VC FROM TABLE(:B1 )
L1, T1 R1 WHERE R1.ID = L1.COLUMN_VALUE

Plan hash value: 609144676

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |       |       |    39 (100)|          |
|   1 |  NESTED LOOPS                       |       |       |       |            |          |
|   2 |   NESTED LOOPS                      |       |    10 |   180 |    39   (0)| 00:00:01 |
|   3 |    COLLECTION ITERATOR PICKLER FETCH|       |       |       |            |          |
|*  4 |    INDEX UNIQUE SCAN                | T1_PK |     1 |       |     0   (0)|          |
|   5 |   TABLE ACCESS BY INDEX ROWID       | T1    |     1 |    16 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("R1"."ID"=VALUE(KOKBF$))

But when I changed the code so that the second use of t1 was referenced (through a loopback database link) as a remote table, the query took about four minutes to complete, with a plan that looked like this:

SELECT /*+     cardinality(l1 10)    */ R1.SMALL_VC FROM TABLE(:B1 )
L1, T1@D11G@LOOPBACK R1 WHERE R1.ID = L1.COLUMN_VALUE

Plan hash value: 3896059973

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |      |       |       |    39 (100)|          |        |      |
|   1 |  NESTED LOOPS                      |      |    10 |   220 |    39   (0)| 00:00:01 |        |      |
|   2 |   COLLECTION ITERATOR PICKLER FETCH|      |       |       |            |          |        |      |
|*  3 |   FILTER                           |      |     1 |    20 |     1   (0)| 00:00:01 |        |      |
|   4 |    REMOTE                          | T1   |       |       |            |          | D11G@~ | R->S |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("R1"."ID"=VALUE(KOKBF$))

Remote SQL Information (identified by operation id):
----------------------------------------------------
   4 - SELECT "ID","SMALL_VC" FROM "T1" "R1" (accessing 'D11G@LOOPBACK' )

(Notice that the second use of table t1 is now “t1@d11g@loopback” – a remote table).
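If you want to reproduce the test, a loopback database link like the one below is all it takes. I’m assuming a tnsnames.ora alias d11g pointing back at the local database; the credentials are placeholders to be adjusted to suit:

create database link d11g@loopback
	connect to test_user identified by test_user
	using 'd11g'
;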

The optimizer has decided to take the same nested loop execution path, presumably because it can see the high precision indexed access path – but then something has gone wrong. The SQL that gets sent to the remote database doesn’t include an access predicate on table t1 – so the remote database does a full tablescan for each row in the collection. The local database pulls the entire set of rows across the database link and then uses a filter operation at line 3 to eliminate all but the one row it needed.

What’s gone wrong? The problem (I think) is that the value we want to pass to the remote database comes from an abstract data type (ADT), and although we know that the base type is a simple number type the code isn’t designed to deduce that. Since the remote database may not know how to handle our abstract data types, we can’t possibly send them across the network. (The client’s immediate problem was that the remote table was 1.2M blocks long – ca. 10GB – and it took about 90 seconds to get one row by tablescan.)

The client could have switched to using GTTs in the cases that went wrong, of course, but didn’t really want to have two different code mechanisms. Fortunately, once I’d modelled the problem I found a workaround. It’s a method that you may not be able to take advantage of in every case, but it was perfect for the client. Since the basic problem is that we don’t seem to be allowed to send abstract types across the network, all we have to do is turn the abstract type into an Oracle base type:

declare
	m_n1_tab	number_type_table;

begin
	select
		id
	bulk collect into m_n1_tab
	from
		t1
	where
		rownum <= 10
	;

	for c1 in (
		select
			r1.small_vc
		from
			(
			select
				/*+
					no_merge
					cardinality(l1 10)
				*/
				cast(column_value as number(8,0))	v1
			from
				table(m_n1_tab)		l1
			)				l1,
			t1@d11g@loopback	r1
		where
			r1.id = l1.v1
	) loop
		dbms_output.put_line(c1.small_vc);
	end loop;
end;
/

We replace the naked table() operator with a non-mergeable inline view that selects the abstract data type from the table() operator and casts it to an Oracle base type. With this code change the query runs quickly again, and we see the following execution plan.

SELECT R1.SMALL_VC FROM ( SELECT /*+      no_merge      cardinality(l1
10)     */ CAST(COLUMN_VALUE AS NUMBER(8,0)) V1 FROM TABLE(:B1 ) L1 )
L1, T1@D11G@LOOPBACK R1 WHERE R1.ID = L1.V1

Plan hash value: 966067288

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |       |       |    39 (100)|          |        |      |
|   1 |  NESTED LOOPS                       |      |    10 |   330 |    39   (0)| 00:00:01 |        |      |
|   2 |   VIEW                              |      |    10 |   130 |    29   (0)| 00:00:01 |        |      |
|   3 |    COLLECTION ITERATOR PICKLER FETCH|      |       |       |            |          |        |      |
|   4 |   REMOTE                            | T1   |     1 |    20 |     1   (0)| 00:00:01 | D11G@~ | R->S |
------------------------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------
   4 - SELECT "ID","SMALL_VC" FROM "T1" "R1" WHERE "ID"=:1 (accessing 'D11G@LOOPBACK' )

As you can see from the view operator at line 2, we have created a result set from the collection operation, and the query that goes to the remote site at line 4 now includes our high-precision predicate. It’s also rather convenient that we can include the cardinality hint in the in-line view – the client loves ANSI SQL, and for complex ANSI statements it can be quite difficult to get hints into the right query blocks.
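As a sketch of that convenience – this is my transcription, not the client’s code – the cursor query from the block above could be written in ANSI style with the hints still tucked safely inside the inline view:

	select
		r1.small_vc
	from
		(
		select	/*+ no_merge cardinality(l1 10) */
			cast(column_value as number(8,0))	v1
		from	table(m_n1_tab)	l1
		)			l1
	join
		t1@d11g@loopback	r1
	on	r1.id = l1.v1
	;

However the outer joins are rearranged or transformed, the hints travel with the query block they belong to.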

 

4 Comments

  1. Genius! How long did it take you to figure out a solution?

    Comment by Mike Johnson — October 13, 2010 @ 3:22 am BST Oct 13,2010

    • As often happens, it took more time to decide that Oracle was behaving the way it was supposed to, rather than revealing a bug – and that took two or three hours. Once I realised what the limitation was it took about 10 minutes to come up with a workaround.

      Mind you, my first workaround was a bit daft – and it was only while explaining the problem to the developer the following morning that I realised that the solution I’ve shown above was much simpler.

      Comment by Jonathan Lewis — October 13, 2010 @ 7:24 am BST Oct 13,2010

  2. In older versions of Oracle the construct “TABLE(m_n1_tab)” was not allowed and you had to use “TABLE( SELECT CAST( m_n1_tab AS number_type_table ) FROM DUAL )” instead. I wonder if that older way would have given Oracle the data type information it needed as well.

    Comment by Milo — October 13, 2010 @ 4:54 pm BST Oct 13,2010

    • Milo,

      I suspect not – I think the cast() here tells Oracle which abstract data type to use when converting the variable to a table. So the thing that comes out is still an abstract data type.

      Comment by Jonathan Lewis — October 14, 2010 @ 3:17 pm BST Oct 14,2010

