A question came up on the Oracle-L list-server a few days ago about how Oracle calculates costs for a scalar subquery in the select list. The question included an example to explain the point of the question. I’ve reproduced the test below, with the output from an 18.3 test system. The numbers don’t match the numbers produced in the original posting but they are consistent with the general appearance.

```sql
rem
rem     Script:         ssq_costing.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2019
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0
rem             12.2.0.1
rem

create table t_1k   ( n1 integer );
create table t_100k ( n1 integer );

insert into t_1k   select level from dual connect by level <= 1e3;
insert into t_100k select level from dual connect by level <= 1e5;

commit;

begin
        dbms_stats.gather_table_stats ( null, 'T_1K');
        dbms_stats.gather_table_stats ( null, 'T_100K');
end;
/

explain plan for
select
        /*+ qb_name(QB_MAIN) */
        (
        select  /*+ qb_name(QB_SUBQ) */
                count(*)
        from    t_1k
        where   t_1k.n1 = t_100k.n1
        )
from
        t_100k
;

select * from table(dbms_xplan.display);
```

```
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |   100K|   488K|  1533K  (2)| 00:01:00 |
|   1 |  SORT AGGREGATE    |        |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T_1K   |     1 |     4 |    17   (0)| 00:00:01 |
|   3 |  TABLE ACCESS FULL | T_100K |   100K|   488K|    36   (9)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T_1K"."N1"=:B1)
```

The key point to note is this – the scalar subquery has to execute 100,000 times because that’s the number of rows in the driving table. The cost of executing the scalar subquery once is 17 – so the total cost of the query should be 1,700,036 – not 1,533K (and for execution plans the K means x1,000, not x1,024). There’s always room for rounding errors, of course, but a check of the 10053 (CBO) trace file shows the numbers to be 17.216612 for the *t_1k* tablescan, 36.356072 for the *t_100k* tablescan, and 1533646.216412 for the whole query. So how is Oracle managing to get a cost that looks lower than it ought to be?

There’s plenty of scope for experimenting to see how the numbers change – and my first thought was simply to see what happens as you change the number of distinct values in the *t_100k.n1* column. It would be rather tedious to go through the process of modifying the data a few hundred times to see what happens, so I took advantage of the *get_column_stats()* and *set_column_stats()* procedures in the *dbms_stats* package to create a PL/SQL loop that faked a number of different scenarios that lied about the actual table data.
```sql
delete from plan_table;
commit;

declare
        srec            dbms_stats.statrec;
        n_array         dbms_stats.numarray;

        m_distcnt       number;
        m_density       number;
        m_nullcnt       number;
        m_avgclen       number;

begin

        dbms_stats.get_column_stats(
                ownname         => user,
                tabname         => 't_100k',
                colname         => 'n1',
                distcnt         => m_distcnt,
                density         => m_density,
                nullcnt         => m_nullcnt,
                srec            => srec,
                avgclen         => m_avgclen
        );

        for i in 1 .. 20 loop

                m_distcnt := 1000 * i;
                m_density := 1/m_distcnt;

                dbms_stats.set_column_stats(
                        ownname         => user,
                        tabname         => 't_100k',
                        colname         => 'n1',
                        distcnt         => m_distcnt,
                        density         => m_density,
                        nullcnt         => m_nullcnt,
                        srec            => srec,
                        avgclen         => m_avgclen
                );

                execute immediate
                '
                explain plan set statement_id = ''' || m_distcnt || '''
                for
                select
                        /*+ qb_name(QB_MAIN) */
                        (
                        select  /*+ qb_name(QB_SUBQ) */
                                count(*)
                        from    t_1k
                        where   t_1k.n1 = t_100k.n1
                        )
                from    t_100k
                ';

        end loop;

end;
/
```

The code is straightforward. I’ve declared a few variables to hold the column stats from the *t_100k.n1* column, called *get_column_stats()*, then looped 20 times through a process that changes the number of distinct values (and corresponding density) recorded in the column stats, then used execute immediate to call *“explain plan”* for the original query.

You’ll notice I’ve given each plan a separate *statement_id* that corresponds to the *num_distinct* that generated the plan. In the code above I’ve changed the *num_distinct* from 1,000 to 20,000 in steps of 1,000.

Once the PL/SQL block ends I’ll have a plan table with 20 execution plans stored in it and, rather than reporting those plans with calls to *dbms_xplan.display()*, I’m going to be selective about which rows and columns I report.

```sql
select
        statement_id,
        io_cost,
        io_cost  - lag(io_cost,1)  over (order by to_number(statement_id))  io_diff,
        cpu_cost,
        cpu_cost - lag(cpu_cost,1) over (order by to_number(statement_id))  cpu_diff,
        cost
from
        plan_table
where
        id = 0
order by
        to_number(statement_id)
;
```

I’ve picked *id = 0* (the top line of the plan) for each *statement_id*, and I’ve reported the *cost* column, which is made up of the *io_cost* column plus a scaled down value of the *cpu_cost* column. I’ve also used the analytic *lag()* function to calculate how much the *io_cost* and *cpu_cost* changed from the previous *statement_id*. Here are my results from 18c:

```
STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
1000                                17033            1099838920                 17253
2000                                34033      17000 2182897480 1083058560      34470
3000                                51033      17000 3265956040 1083058560      51686
4000                                68033      17000 4349014600 1083058560      68903
5000                                85033      17000 5432073160 1083058560      86119
6000                               102033      17000 6515131720 1083058560     103336
7000                               119033      17000 7598190280 1083058560     120553
8000                               136033      17000 8681248840 1083058560     137769
9000                               153033      17000 9764307400 1083058560     154986
10000                              170033      17000 1.0847E+10 1083058560     172202
11000                              197670      27637 1.2608E+10 1760725019     200191
12000                              338341     140671 2.1570E+10 8962036084     342655
13000                              457370     119029 2.9153E+10 7583261303     463200
14000                              559395     102025 3.5653E+10 6499938259     566525
15000                              647816      88421 4.1287E+10 5633279824     656073
16000                              725185      77369 4.6216E+10 4929119846     734428
17000                              793452      68267 5.0565E+10 4349223394     803565
18000                              854133      60681 5.4431E+10 3865976350     865019
19000                              908427      54294 5.7890E+10 3459031472     920005
20000                              957292      48865 6.1003E+10 3113128324     969492
```

The first pattern that hits the eye is the constant change of 17,000 in the *io_cost* in the first few lines of the output. For “small” numbers of distinct values the (IO) cost of the query is (33 + 17 * num_distinct) – in other words, the arithmetic seems to assume that it will execute the subquery once for each distinct value and then cache the results so that repeated executions for any given value will not be needed. This looks as if the optimizer is trying to match its arithmetic to the *“scalar subquery caching”* mechanism.
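
As a quick sanity check of that reading (my arithmetic, not part of the original test): the io_cost of 33 for the driving tablescan plus one subquery execution per distinct value should reproduce the first row of the output exactly.

```sql
-- predicted io_cost for num_distinct = 1000, assuming one subquery execution per value
select 33 + 1000 * 17 predicted_io_cost from dual;
-- 17033, matching the io_cost reported for statement_id 1000 above
```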

But things change somewhere between 10,000 and 11,000 distinct values. The point comes where adding one more distinct value causes a much bigger jump in cost than 17, and that’s because Oracle assumes it’s reached a point where there’s a value that it won’t have room for in the cache and will have to re-run the subquery multiple times for that value as it scans the rest of the table. Let’s find the exact break point where that happens.

Changing my PL/SQL loop so that we calculate *m_distcnt* as *“10910 + i”*, this is the output from the final query:

```
-- m_distcnt := 10910 + i;

STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
10911                              185520            1.1834E+10                187887
10912                              185537         17 1.1835E+10    1083059     187904
10913                              185554         17 1.1836E+10    1083058     187921
10914                              185571         17 1.1837E+10    1083059     187938
10915                              185588         17 1.1838E+10    1083058     187956
10916                              185605         17 1.1839E+10    1083059     187973
10917                              185622         17 1.1841E+10    1083059     187990
10918                              185639         17 1.1842E+10    1083058     188007
10919                              185656         17 1.1843E+10    1083059     188025
10920                              185673         17 1.1844E+10    1083058     188042
10921                              185690         17 1.1845E+10    1083059     188059
10922                              185707         17 1.1846E+10    1083058     188076
10923                              185770         63 1.1850E+10    4027171     188140
10924                              185926        156 1.1860E+10    9914184     188298
10925                              186081        155 1.1870E+10    9912370     188455
10926                              186237        156 1.1880E+10    9910555     188613
10927                              186393        156 1.1890E+10    9908741     188770
10928                              186548        155 1.1900E+10    9906928     188928
10929                              186703        155 1.1909E+10    9905114     189085
10930                              186859        156 1.1919E+10    9903302     189243
```

If we have 10,922 distinct values in the column the optimizer calculates as if it will be able to cache them all; but if we have 10,923 distinct values the optimizer thinks that there’s going to be one value where it can’t cache the result and will have to run the subquery more than once.

Before looking at this in more detail let’s go to the other interesting point – when does the cost **stop** changing? We can see the cost increasing as the number of distinct values grows, but we saw at the start that the cost didn’t seem to get as large as we expected, so there must be a point where it stops increasing before it “ought” to.

I’ll jump straight to the answer: here’s the output from the test when I start *num_distinct* off at slightly less than half the number of rows in the table:

```
-- m_distcnt := (50000 - 10) + i;

STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
49991                             1514281            9.6488E+10               1533579
49992                             1514288          7 9.6489E+10     473357    1533586
49993                             1514296          8 9.6489E+10     473337    1533594
49994                             1514303          7 9.6490E+10     473319    1533601
49995                             1514311          8 9.6490E+10     473299    1533609
49996                             1514318          7 9.6491E+10     473281    1533616
49997                             1514325          7 9.6491E+10     473262    1533624
49998                             1514333          8 9.6492E+10     473243    1533631
49999                             1514340          7 9.6492E+10     473224    1533639
50000                             1514348          8 9.6493E+10     473205    1533646
50001                             1514348          0 9.6493E+10          0    1533646
50002                             1514348          0 9.6493E+10          0    1533646
50003                             1514348          0 9.6493E+10          0    1533646
50004                             1514348          0 9.6493E+10          0    1533646
50005                             1514348          0 9.6493E+10          0    1533646
50006                             1514348          0 9.6493E+10          0    1533646
50007                             1514348          0 9.6493E+10          0    1533646
50008                             1514348          0 9.6493E+10          0    1533646
50009                             1514348          0 9.6493E+10          0    1533646
50010                             1514348          0 9.6493E+10          0    1533646
```

The cost just stops changing when *num_distinct* = half the rows in the table.

### Formulae

During the course of these experiments I had been exchanging email messages with *Nenad Noveljic* via the Oracle-L list-server (*full monthly archive here*) and he came up with the suggestion of a three-part formula that assumed a cache size and:

- gave a cost of *“tablescan cost + num_distinct * subquery unit cost”* for values of *num_distinct* up to the cache size;
- then, for values of *num_distinct* greater than the cache size and up to half the size of the table, added a marginal cost representing the probability that some values would not be cached;
- then, for values of *num_distinct* greater than half the number of rows in the table, reported the cost associated with *num_distinct* = half the number of rows in the table.

Hence:

- for 1 <= num_distinct <= 10,922, cost = (33 + num_distinct * 17)
- for 10,923 <= num_distinct <= 50,000, cost = (33 + 10,922 * 17) + (1 – 10,922/num_distinct) * 100,000 * 17
- for 50,000 <= num_distinct <= 100,000, cost = cost(50,000).

The middle line needs a little explanation: (1 – 10,922/num_distinct) is the probability that a value will not be in the cache; this has to be multiplied by 100,000 to give the expected number of rows that will not be cached, and then multiplied by 17 as the cost of running the subquery for those rows.

The middle line can be re-arranged as 33 + 17 * (10,922 + (1 – 10,922/num_distinct) * 100,000)
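
To make the middle line concrete, here’s a worked check (my arithmetic): plugging num_distinct = 20,000 into the re-arranged formula gives a prediction close to, but not exactly, the io_cost of 957,292 reported for statement_id 20000 above. The small discrepancy is what the next section deals with.

```sql
-- middle-range prediction for num_distinct = 20,000, using the uncorrected 10,922
select round(33 + 17 * (10922 + (1 - 10922/20000) * 100000)) predicted_io_cost
from   dual;
-- 957337 - the observed io_cost was 957292
```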

### Tweaking

At this point I could modify my code loop to report the calculated value for the cost and compare it with the actual cost to show you that the two values didn’t quite match. Instead I’ll jump forward a little bit to a correction that needs to be made to the formula above. It revolves around how Oracle determines the cache size. There’s a hidden parameter (which I mentioned in CBO Fundamentals) that controls scalar subquery caching. In the book I think I only referenced it in the context of subqueries in the *“where”* clause. The parameter is **“_query_execution_cache_max_size”** and has a default value of 131072 (*power(2,17)*) – so when I found that the initial formula didn’t quite work I made the following observation:

- 131072 / 10922 = 12.00073
- 131072 / 12 = 10922.666…

So I put 10922.66667 into the formula to see if that would improve things.

For the code change I added a variable *m_cost* to the PL/SQL block, and set it inside the loop as follows:

```sql
m_cost := round(33 + 17 * (10922.66667 + 100000 * (1 - (10922.66667 / m_distcnt))));
```

Then in the *“execute immediate”* I changed the *“explain plan”* line to read:

```sql
explain plan set statement_id = ''' || lpad(m_distcnt,7) || ' - ' || lpad(m_cost,8) ||
```

This allowed me to show the formula’s prediction of (IO) cost in the final output, and here’s what I got for values of *num_distinct* in the region of 10,922:

```
STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
  10911 -   183901                 185520            1.1834E+10                187887
  10912 -   184057                 185537         17 1.1835E+10    1083059     187904
  10913 -   184212                 185554         17 1.1836E+10    1083058     187921
  10914 -   184368                 185571         17 1.1837E+10    1083059     187938
  10915 -   184524                 185588         17 1.1838E+10    1083058     187956
  10916 -   184680                 185605         17 1.1839E+10    1083059     187973
  10917 -   184836                 185622         17 1.1841E+10    1083059     187990
  10918 -   184992                 185639         17 1.1842E+10    1083058     188007
  10919 -   185147                 185656         17 1.1843E+10    1083059     188025
  10920 -   185303                 185673         17 1.1844E+10    1083058     188042
  10921 -   185459                 185690         17 1.1845E+10    1083059     188059
  10922 -   185615                 185707         17 1.1846E+10    1083058     188076
  10923 -   185770                 185770         63 1.1850E+10    4027171     188140
  10924 -   185926                 185926        156 1.1860E+10    9914184     188298
  10925 -   186081                 186081        155 1.1870E+10    9912370     188455
  10926 -   186237                 186237        156 1.1880E+10    9910555     188613
  10927 -   186393                 186393        156 1.1890E+10    9908741     188770
  10928 -   186548                 186548        155 1.1900E+10    9906928     188928
  10929 -   186703                 186703        155 1.1909E+10    9905114     189085
  10930 -   186859                 186859        156 1.1919E+10    9903302     189243
```

The formula is only supposed to work in the range 10,923 – 50,000, so the first few results don’t match; but in the range 10,923 to 10,930 the match is exact. Then, in the region of 50,000 we get:

```
STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
  49991 -  1514281                1514281            9.6488E+10               1533579
  49992 -  1514288                1514288          7 9.6489E+10     473357    1533586
  49993 -  1514296                1514296          8 9.6489E+10     473337    1533594
  49994 -  1514303                1514303          7 9.6490E+10     473319    1533601
  49995 -  1514311                1514311          8 9.6490E+10     473299    1533609
  49996 -  1514318                1514318          7 9.6491E+10     473281    1533616
  49997 -  1514325                1514325          7 9.6491E+10     473262    1533624
  49998 -  1514333                1514333          8 9.6492E+10     473243    1533631
  49999 -  1514340                1514340          7 9.6492E+10     473224    1533639
  50000 -  1514348                1514348          8 9.6493E+10     473205    1533646
  50001 -  1514355                1514348          0 9.6493E+10          0    1533646
  50002 -  1514363                1514348          0 9.6493E+10          0    1533646
  50003 -  1514370                1514348          0 9.6493E+10          0    1533646
  50004 -  1514377                1514348          0 9.6493E+10          0    1533646
  50005 -  1514385                1514348          0 9.6493E+10          0    1533646
  50006 -  1514392                1514348          0 9.6493E+10          0    1533646
  50007 -  1514400                1514348          0 9.6493E+10          0    1533646
  50008 -  1514407                1514348          0 9.6493E+10          0    1533646
  50009 -  1514415                1514348          0 9.6493E+10          0    1533646
  50010 -  1514422                1514348          0 9.6493E+10          0    1533646
```

Again, the formula applies only in the range up to 50,000 (half the rows in the table) – and the match is perfect in that range.
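
As a final cross-check (again my arithmetic, using the corrected 10,922.66667): at the 50,000 cap the formula should reproduce the limiting io_cost exactly.

```sql
-- corrected formula at num_distinct = 50,000 (half the rows in the table)
select round(33 + 17 * (10922.66667 + 100000 * (1 - 10922.66667/50000))) io_cost
from   dual;
-- 1514348, matching the flat-lined io_cost above
```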

### Next steps

The work so far gives us some idea of the algorithm that the optimizer is using to derive a cost, but this is just one scenario and there are plenty of extra questions we might ask. The most pressing one: what is the significance of the number 12 in the calculation 131,072 / 12? From previous experience I guessed that it was related to the lengths of the input and output values of the scalar subquery – as in “value *X* for n1 returns value *Y* for count(*)”.

To pursue this idea I recreated the data sets using *varchar2(10)* as the definition of *n1* and *lpad(rownum,10)* as the value – the “breakpoint” dropped from 10,922 down to 5,461. Checking the arithmetic: 131,072 / 5,461 = 24.001456, then 131,072 / 24 = 5,461.333… And that’s the number that made the formula work perfectly for the modified data set.
Then I used *set_column_stats()* to hack the *avg_col_len* of *t_100k.n1* to 15 and the break point dropped to 4,096. Again we do the two arithmetic steps: 131,072 / 4,096 = 32 (but then we don’t need to do the reverse step since the first result is integral).

Checking the original data set, when *n1* was numeric the *avg_col_len* was 5, so we have three reference points:

- avg_col_len = 5: “cache unit size” = 12
- avg_col_len = 11: “cache unit size” = 24 (don’t forget the *avg_col_len* includes the length byte, so our padded *varchar2(10)* has a length of 11)
- avg_col_len = 15: “cache unit size” = 32

There’s an obvious pattern here: *“cache unit size”* = (2 * avg_col_len + 2). Since I hadn’t been changing the *t_1k.n1* column at the same time, that really does look like a deliberate factor of 2 (I’d thought initially that maybe the 12 was affected by the lengths of both columns in the predicate – but that doesn’t seem to be the case).

The scientific method says I should now make a prediction based on my hypothesis – so I set the *avg_col_len* for *t_100k.n1* to 23 and guessed that the break point would be at 2,730 – and it was. (131,072 / (2 * 23 + 2) = 2,730.6666…)
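
Pulling the four reference points together, a simple query (mine, not part of the original tests) shows how the “2 * avg_col_len + 2” hypothesis predicts every break point observed so far:

```sql
-- predicted cache unit size and break point for each avg_col_len tested
select  column_value                              avg_col_len,
        2 * column_value + 2                      cache_unit_size,
        trunc(131072 / (2 * column_value + 2))    predicted_break_point
from    table(sys.odcinumberlist(5, 11, 15, 23));
-- 5 -> 12 -> 10922;  11 -> 24 -> 5461;  15 -> 32 -> 4096;  23 -> 48 -> 2730
```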

The next question, of course, is *“where does the ‘spare 2’ come from?”* Trying to minimize the change in the code I modified my subquery to select *sum(to_number(n1))* rather than count(*), then to *avg(to_number(n1))* – remember I had changed *n1* to a *varchar2(10)* that looked like a number left-padded with spaces. In every variant of the tests I’d done so far all I had to do to get an exact match between the basic formula and the optimizer’s cost calculation was to use “2 * avg_col_len + *22*” as the cache unit size – and 22 is the nominal maximum length of an internally stored numeric column.

Bottom line: the cache unit size seems to be related to the input and output values, but I don’t know why there’s a factor of 2 applied to the input column length, and I don’t know why the length of count(*) is deemed to be 2 when other derived numeric outputs use the more intuitive 22 for their length.

### tl;dr

The total cost calculation for a scalar subquery in the select list is largely affected by:

- a fixed cache size (131,072 bytes) possibly set by hidden parameter _query_execution_cache_max_size
- the avg_col_len of the input (correlating) column(s) from the driving table
- the nominal length of the output (select list) of the subquery

There is an unexplained factor of 2 used with the avg_col_len of the input, and a slightly surprising value of 2 if the output is simply count(*).

If the number N of distinct values for the driving column(s) is less than the number of possible cache entries, the effect of the scalar subquery is to add N * (estimated cost of executing the subquery once). As the number of distinct values for the driving column(s) goes above the limit, the incremental effect of the subquery is based on the expected number of times an input value will not be cached. When the number of distinct values in the driving column(s) exceeds half the number of rows in the driving table the cost stops increasing – there is no obvious reason why the algorithm does this.

There are many more cases that I could investigate at this point – but I think this model is enough as an indication of general method. If you come across a variation where you actually need to work out how the optimizer derived a cost then this framework will probably be enough to get you started in the right direction.
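
For anyone who wants to experiment further, here’s the whole model rolled up into one PL/SQL function. This is a speculative sketch of the behaviour described above – the function name, parameter names and defaults are mine, not anything published by Oracle:

```sql
create or replace function ssq_io_cost_estimate (
        p_driver_io_cost  number,                  -- io_cost of the driving tablescan (33 here)
        p_subq_io_cost    number,                  -- io_cost of one subquery execution (17 here)
        p_num_rows        number,                  -- rows in the driving table (100,000 here)
        p_num_distinct    number,                  -- num_distinct of the correlating column
        p_avg_col_len     number,                  -- avg_col_len of the correlating column
        p_output_len      number default 2,        -- nominal output length: 2 for count(*),
                                                   -- 22 for other derived numeric outputs
        p_cache_size      number default 131072    -- _query_execution_cache_max_size
) return number is
        -- cache entries = cache size / (2 * input length + output length)
        m_entries number := p_cache_size / (2 * p_avg_col_len + p_output_len);
        -- the cost stops changing at half the number of rows in the table
        m_ndv     number := least(p_num_distinct, p_num_rows / 2);
begin
        if m_ndv <= trunc(m_entries) then
                -- everything cached: one subquery execution per distinct value
                return round(p_driver_io_cost + m_ndv * p_subq_io_cost);
        else
                -- partial caching: add the expected number of cache misses
                return round(
                        p_driver_io_cost + p_subq_io_cost *
                        (m_entries + (1 - m_entries / m_ndv) * p_num_rows)
                );
        end if;
end ssq_io_cost_estimate;
/
```

As a check against the figures above: *ssq_io_cost_estimate(33, 17, 100000, 1000, 5)* returns 17,033 and *ssq_io_cost_estimate(33, 17, 100000, 50000, 5)* returns 1,514,348.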

Jonathan,

The cost becomes higher by around 307 across the whole NDV range after adding DISTINCT to QB_MAIN on 12.2. At first I thought this was due to the HASH UNIQUE overhead, but the value seems too high for that.

Would you know the reason?

On 18.5 this difference is insignificant – 25.

Below are the costs for the query without and with DISTINCT on 12.2, respectively:

Comment by Nenad Noveljic (@NenadNoveljic) — June 19, 2019 @ 2:45 pm BST

Nenad,

I’ve done a couple of quick experiments. My results aren’t a perfect match for yours, of course. In my case I get a fixed increment of 286 in 12.2 and a fixed increment of 287 in 18.3

If you check your system stats you may find that they vary between your versions – mine are set to match – and that may explain the discrepancy you see.

When I check the 10053 trace file I can find the following:

And when I check the contents of the plan table for the initial query I see the following – with difference calculated – between the query without and with the distinct:

So, though it doesn’t seem like a reasonable estimate – the fixed difference is the cost of “sorting” 100,000 rows of “count(*)” values.

Comment by Jonathan Lewis — June 24, 2019 @ 3:50 pm BST

Jonathan,

The system statistics were the same. But the sort statistics provide a hint of what was different:

18: Blocks to Sort: 46

12.2: Blocks to Sort: 184

Now I see that I was comparing apples and oranges – the 18c database was created with a 32k block size and the 12.2 database with 8k. The 18c database accordingly has a quarter as many blocks to sort. Consequently, its cost is significantly lower.

Comment by Nenad Noveljic (@NenadNoveljic) — June 27, 2019 @ 11:10 am BST

Nenad,

That’s a little surprising – I would have expected the sort cost to be dependent on the number of rows / bytes, not the number of blocks.

Thanks for the observation.

Comment by Jonathan Lewis — June 28, 2019 @ 10:54 am BST

Jonathan,

A single 32k IO should be more efficient than four 8k IOs, shouldn’t it? This could be the reason why the IO sort cost depends on the block size.

Comment by Nenad Noveljic (@NenadNoveljic) — July 1, 2019 @ 10:10 am BST

Nenad,

I think it would be reasonable to expect a very slight improvement in efficiency but not a lot of difference between a 4 * 8K block direct path write / read and a single 32KB block write / read. In other areas of costing the variation in the arithmetic you get from using a different block size is far less significant.

Comment by Jonathan Lewis — July 1, 2019 @ 2:27 pm BST

Jonathan,

Interestingly, the IO sort cost is about 2.5 times higher on an 8k block size database than on a 32k block size database. Both databases are 18.5.

32k:

```
SORT ressource                      Sort statistics
Sort width:         306 Area size:      268288 Max Area size: 53686272
Degree:               1
Blocks to Sort:      49 Row size:           16 Total Rows:      100000
Initial runs:         2 Merge passes:        1 IO Cost / pass:      74
Total IO sort cost: 123.000000      Total CPU sort cost: 139966334
Total Temp space used: 1246000
```

8k:

```
SORT ressource                      Sort statistics
Sort width:         305 Area size:      268288 Max Area size: 53686272
Degree:               1
Blocks to Sort:     196 Row size:           16 Total Rows:      100000
Initial runs:         2 Merge passes:        1 IO Cost / pass:     108
Total IO sort cost: 304.000000      Total CPU sort cost: 119864460
Total Temp space used: 1221000
```

Comment by Nenad Noveljic (@NenadNoveljic) — July 1, 2019 @ 3:40 pm BST

I’ve had a thought on the significant difference between the cost of sort on the 32KB block compared to the 8KB block.

There’s a hidden parameter _sort_multiblock_read_count which defaults to 2, and if your system stats are default (i.e. seek time = 10ms and transfer rate = 4KB/ms) then you have a 64KB sort I/O size on one system and a 16KB sort I/O size on the other.

sort-multiblock read cost (32KB) = (10 + 16 ) / (10 + 8) = 1.44444

sort multiblock read cost (8KB) = (10 + 4) / (10 + 2) = 1.16666

Now check the total I/O sort costs: (123 * 1.4444 / 1.16666) * 2 = 304.

I can’t explain the “times 2” unfortunately – but there’s a bit of a coincidence in the numbers.

Comment by Jonathan Lewis — July 6, 2019 @ 5:16 pm BST

Jonathan,

First of all, we can derive the following formula from the optimizer trace:

“Total IO sort cost” = “Blocks to Sort” + “IO Cost / pass” (in the traces above: 49 + 74 = 123 and 196 + 108 = 304)

Somewhat surprisingly, “Blocks to Sort” is the fixed part of the cost calculation and, therefore, the main contributor to the relatively large difference between cost calculations for different block sizes.

Further, “IO Cost / pass”, the other summand in the “Total IO sort cost” addition, is calculated as follows:

```perl
my $io_cost_per_pass = 2 * ( ( $blocks_to_sort + 1 ) / $io_scale_factor + 1 );
```

where:

```perl
$io_scale_factor = ( $seek_time + $block_size_k / $transfer_rate_k )
                 * ( 64 / $block_size_k )
                 / ( $seek_time + 64 / $transfer_rate_k );
```

Simply put, “IO Cost / pass” is “Blocks to Sort” “softened” by dividing it by io_scale_factor, which is higher for smaller block sizes. In other words, it’s this division that prevents an unjust cost explosion for a large number of smaller blocks.

The multiplier 2 isn’t _sort_multiblock_read_count. The cost calculation doesn’t seem to depend on this parameter.

Both “+1”s in the “IO Cost / pass” calculation are just safeguards against multiplication by zero, an edge condition which would lead the whole calculation astray.
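
Plugging the figures from the two traces above into that formula (my check, assuming the default seek time of 10ms and transfer rate of 4KB/ms mentioned earlier) reproduces both “IO Cost / pass” values, provided the result is truncated – the rounding mode is my guess:

```sql
-- 8k block size: blocks_to_sort = 196, expecting "IO Cost / pass" = 108
select trunc(
         2 * ( (196 + 1) / ( (10 + 8/4) * (64/8) / (10 + 64/4) ) + 1 )
       ) io_cost_per_pass
from   dual;
-- 108; the same expression with 49 blocks and a 32k block size gives 74
```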

Comment by Nenad Noveljic (@NenadNoveljic) — July 11, 2019 @ 3:45 pm BST

Nenad,

Nice bit of research.

Thanks for sharing.

Comment by Jonathan Lewis — July 11, 2019 @ 7:20 pm BST

Thank you, Jonathan. Wouldn’t have come that far without your guidance.

Comment by Nenad Noveljic (@NenadNoveljic) — July 11, 2019 @ 7:42 pm BST