Nice article – it’s a clever bit of analysis and I liked the stepwise explanation.

]]>One of the problems of creating small tests to demonstrate a principle is that the sample data may introduce boundary conditions if you try to use the same data to extend the investigation. The contradiction between the predicted cost and the actual buffer gets in the two cases arises, in part, because of the differences between the generic range scan model and the specific data set I’ve constructed; and in part because of the differences between the skip scan model and its physical implementation. (Physical implementation, of course, explains why the cost model for subquery filters can be very misleading – the optimizer doesn’t allow for scalar subquery caching.)

The difference in buffer gets is because of the root and branch pinning that takes place as you run the skip scan and travel up and down the index; so the block visits are still there, but they don’t cause the same level of latch activity that the iteration does, and aren’t “gets”. It would be instructive to finish off the comparison by creating a “driving table” with just the 100 rows in it to drive a nested loop join and see how that affected the cost of probing the table 100 times.

]]>It’s interesting to know that, when we “simulate” the skip scan by adding the leading column to the predicate, the CBO does a range scan instead of an index skip scan of the same index.

SQL_ID  87t23bskhh051, child number 0
-------------------------------------
select * from t1 where id1 between 501 and 502

Plan hash value: 266987255

--------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |      1 |        |    200 |00:00:00.33 |     311 |    101 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |      1 |    299 |    200 |00:00:00.33 |     311 |    101 |
|*  2 |   INDEX SKIP SCAN           | T1_I2 |      1 |    285 |    200 |00:00:00.33 |     210 |    101 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID1">=501 AND "ID1"<=502)
       filter(("ID1"<=502 AND "ID1">=501))

---------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |      1 |        |    200 |00:00:00.01 |     402 |    105 |
|   1 |  INLIST ITERATOR             |       |      1 |        |    200 |00:00:00.01 |     402 |    105 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |    100 |    296 |    200 |00:00:00.01 |     402 |    105 |
|*  3 |    INDEX RANGE SCAN          | T1_I2 |    100 |    297 |    200 |00:00:00.01 |     301 |    105 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access((("ID2"=0 OR "ID2"=1 OR "ID2"=2 OR "ID2"=3 OR "ID2"=4 OR "ID2"=5 OR "ID2"=6 OR "ID2"=7 OR "ID2"=8 OR "ID2"=9 OR
              "ID2"=10 OR "ID2"=11 OR "ID2"=12 OR "ID2"=13 OR "ID2"=14 OR "ID2"=15 OR "ID2"=16 OR "ID2"=17 OR "ID2"=18 OR "ID2"=19 OR
              "ID2"=20 OR "ID2"=21 OR "ID2"=22 OR "ID2"=23 OR "ID2"=24 OR "ID2"=25 OR "ID2"=26 OR "ID2"=27 OR "ID2"=28 OR "ID2"=29 OR
              "ID2"=30 OR "ID2"=31 OR "ID2"=32 OR "ID2"=33 OR "ID2"=34 OR "ID2"=35 OR "ID2"=36 OR "ID2"=37 OR "ID2"=38 OR "ID2"=39 OR
              "ID2"=40 OR "ID2"=41 OR "ID2"=42 OR "ID2"=43 OR "ID2"=44 OR "ID2"=45 OR "ID2"=46 OR "ID2"=47 OR "ID2"=48 OR "ID2"=49 OR
              "ID2"=50 OR "ID2"=51 OR "ID2"=52 OR "ID2"=53 OR "ID2"=54 OR "ID2"=55 OR "ID2"=56 OR "ID2"=57 OR "ID2"=58 OR "ID2"=59 OR
              "ID2"=60 OR "ID2"=61 OR "ID2"=62 OR "ID2"=63 OR "ID2"=64 OR "ID2"=65 OR "ID2"=66 OR "ID2"=67 OR "ID2"=68 OR "ID2"=69 OR
              "ID2"=70 OR "ID2"=71 OR "ID2"=72 OR "ID2"=73 OR "ID2"=74 OR "ID2"=75 OR "ID2"=76 OR "ID2"=77 OR "ID2"=78 OR "ID2"=79 OR
              "ID2"=80 OR "ID2"=81 OR "ID2"=82 OR "ID2"=83 OR "ID2"=84 OR "ID2"=85 OR "ID2"=86 OR "ID2"=87 OR "ID2"=88 OR "ID2"=89 OR
              "ID2"=90 OR "ID2"=91 OR "ID2"=92 OR "ID2"=93 OR "ID2"=94 OR "ID2"=95 OR "ID2"=96 OR "ID2"=97 OR "ID2"=98 OR "ID2"=99))
              AND "ID1">=501 AND "ID1"<=502)

It is also interesting to note that the index range scan operation requires about 90 more logical reads (301) than the skip scan of the same index (210). This might be due to the number of times (Starts=100) the index range scan has been started, while the index skip scan has been started only once. Given that the index skip scan is supposed to be more expensive, because it requires special pinning and has to go up and down the branch levels of the index, the figures above do not suggest such a conclusion, do they?
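To make the comparison concrete, here is a small Python sketch that breaks down the Buffers column of the two plans above. It relies on the fact that Buffers in rowsource statistics are cumulative, so a parent line includes its children; the dictionary keys and the function name are my own labels, not anything Oracle reports.

```python
# Figures copied from the Starts and Buffers columns of the two plans above.
# Buffers are cumulative: the table access line includes the index line below it.
skip_scan = {"index_buffers": 210, "total_buffers": 311, "starts": 1}
range_scan = {"index_buffers": 301, "total_buffers": 402, "starts": 100}


def breakdown(plan):
    """Return (table-only gets, index gets per start) for one plan."""
    table_gets = plan["total_buffers"] - plan["index_buffers"]
    index_gets_per_start = plan["index_buffers"] / plan["starts"]
    return table_gets, index_gets_per_start


skip_table_gets, skip_index_per_start = breakdown(skip_scan)
range_table_gets, range_index_per_start = breakdown(range_scan)
extra_index_gets = range_scan["index_buffers"] - skip_scan["index_buffers"]

print("table gets (skip scan): ", skip_table_gets)
print("table gets (range scan):", range_table_gets)
print("index gets per start (range scan):", range_index_per_start)
print("extra index gets for the range scan:", extra_index_gets)
```

On these figures the table-only gets are identical in the two plans, so the whole difference sits in the index line, where the iterated range scan averages roughly three gets per start.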

I have also tried compressing the index T1_I2 to see whether, for the original query, the index range scan would be chosen automatically, but it wasn’t.

Best regards

]]>A couple of weeks ago I logged an SR on MOS to address the use of the clustering factor when the index skip scan access path is used. Applying “table_sel * CF” is really the wrong idea for determining the cost of finding rows in table blocks. I have suggested that development use MIN(table_sel * CF(t1_i2) + NDV(id2), table_sel * table_num_rows) instead. Your idea is very simple and probably much smarter; however, there is probably a reason why Oracle is not using this straightforward formula.
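As a rough illustration of how the two expressions behave, here is a minimal Python sketch. The function names are mine, and the input numbers are made up for the example (only the 100 distinct id2 values match the inlist shown earlier); this is not how the optimizer is actually coded.

```python
def table_cost_current(table_sel, clustering_factor):
    """The 'table_sel * CF' expression criticised above."""
    return table_sel * clustering_factor


def table_cost_suggested(table_sel, clustering_factor, ndv_leading, num_rows):
    """MIN(table_sel * CF + NDV(leading column), table_sel * num_rows)."""
    return min(table_sel * clustering_factor + ndv_leading,
               table_sel * num_rows)


# Hypothetical statistics, chosen only to show both branches of the MIN.
table_sel = 0.25   # assumed table selectivity
num_rows = 4000    # assumed table row count
ndv_id2 = 100      # distinct values of the leading column id2

well_clustered_cf = 400     # assumed: CF close to the number of table blocks
poorly_clustered_cf = 4000  # assumed: CF equal to the number of rows

current_well = table_cost_current(table_sel, well_clustered_cf)
suggested_well = table_cost_suggested(table_sel, well_clustered_cf, ndv_id2, num_rows)
current_poor = table_cost_current(table_sel, poorly_clustered_cf)
suggested_poor = table_cost_suggested(table_sel, poorly_clustered_cf, ndv_id2, num_rows)

print("well clustered:   current", current_well, "suggested", suggested_well)
print("poorly clustered: current", current_poor, "suggested", suggested_poor)
```

With these made-up inputs the NDV term raises the estimate when the clustering factor is low, while the second argument of the MIN caps the estimate at the number of rows selected.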

There are a lot of changes in INDEX_SS costing in 11gR2. The fix 9195582 (introduced in 11.2.0.2) caused a performance issue in our application, since the estimate for the index scan (but only the index part) will be no more than the number of leaf blocks. So it decreased the cost of index block visits (probably correctly), but made the CF effect even worse.

Regards

Pavol Babel

I found a tolerable workaround for this using recursive subquery factoring, although it would be much better if the CBO could do it by itself. I wrote about it here: http://orasql.org/2012/09/21/distinct-values-by-index-topn/ ]]>