I have often said that the optimizer “forgets” that it is dealing with a distributed query once it has collected the stats that it can about the objects in the query, and that as a consequence the driving site for a distributed query will be the local database unless you use the /*+ driving_site */ hint to change it.
While investigating an oddity with a distributed query between two 18.104.22.168 databases a few days, I noticed something in the 10053 trace file that made me change my mind, and go back to look at earlier versions of Oracle.
Here are two sections extracted from a 10053 trace file running under 10.2.0.3 with CPU costing (system statistics) enabled:
SINGLE TABLE ACCESS PATH Table: T1 Alias: AWAY Card: Original: 3240 Rounded: 41 Computed: 40.50 Non Adjusted: 40.50 Access Path: TableScan Cost: 53.22 Resp: 53.22 Degree: 0 Cost_io: 53.00 Cost_cpu: 2073815 Resp_io: 53.00 Resp_cpu: 2073815 Access Path: index (AllEqRange) Index: 0 resc_io: 4.00 resc_cpu: 29536 ix_sel: 0.0125 ix_sel_with_filters: 0.0125 Cost: 4.00 Resp: 4.00 Degree: 1 Remote table cost added, new values: cost 4.00 resc 4.00 resp .2f Best:: AccessPath: IndexRange Index: 0 <<=== Cost: 4.00 Degree: 1 Resp: 4.00 Card: 40.50 Bytes: 0 ... HA cost: 50.54 resc: 50.54 resc_io: 50.00 resc_cpu: 5187262 resp: 50.54 resp_io: 50.00 resp_cpu: 5187262 Cost adjustment for NL join with remote table: 0.72 <<=== Join order aborted: cost > best plan cost ***********************
Note the two lines with the reference to “remote” (I’d highlight them properly, but you can’t do highlighing and code in the same text). Notice, also that one of the programmers made a bit of a mistake with their printf() call in the first of the lines – a bug that is still there in 22.214.171.124
Clearly Oracle is doing some arithmetic relating to the costs of accessing distributed data from at least 10.2.0.3 (there was nothing similar in the equivalent trace file for 126.96.36.199, and I don’t have a 10.1 available for testing). Unfortunately I have yet to see a single distributed execution plan where it does the right thing – but that might be a problem related to histograms (and the failure to use them) rather than a defect in the algorithms for distributed cost.
I’ll have to spend some time looking at what it does before I can write any more about it – but given the number of times I’ve said the optimizer doesn’t do any arithmetic I thought it was important to point out that I was wrong as soon as I discovered the change.