Oracle Scratchpad

September 5, 2021

Quiz Night

Filed under: Oracle,Problem Solving — Jonathan Lewis @ 6:35 pm BST Sep 5,2021

This little observation came from running a couple of tests while looking at a problem on OTN – hence the odd bit of PL/SQL.

declare
        cursor c1
        is
        select * from t1;

        rec c1%rowtype;
        ch char(1);

begin
        open c1;
        loop
              fetch c1 into rec;
                  exit when c1%notfound;
                  select null into ch from t1 where id  = rec.id;
                  select null into ch from t1 where id  = rec.id;
                  select null into ch from t1 where id  = rec.id;
        end loop;
        close c1;
end;

There is a simple B-tree index on t1(id), and t1 was populated with the values from 1 to 10,000 in order. A query against user_tables (immediately after gathering stats) reported 20 blocks and 10,000 rows. This is NOT an attempt to trick anyone, it’s simple demonstrating a surprising result.

The trace file shows a tablescan for the driving cursor and an index(-only) path for the three identical queries inside the loop.

Question: if you look at the session activity stats (v$mystat) from running this anonymous PL/SQL block what value (ballpark figure) would you expect to see for the following statistics:

table scans (short tables)
table scan rows gotten
table scan blocks gotten

If it’s any help: when I counted rows per block I had 15 blocks with 657 rows and one block with 145 rows. (The 20 blocks reported in the table stats is the number of formatted blocks below the highwater mark, but some may be empty and some may be space management (bitmap) blocks).

Answers

For my test data, block size, and segment space management the results I got for 19.11.0.0 (and they’d be pretty much the same for any version back to 10g) were as follows:

table scans (short tables)                  1
table scan rows gotten              6,495,760
table scan blocks gotten               10,000

Sneak insight of where that 6.5M comes from – I said I had 15 blocks with 657 rows and one with 145 rows:

15 * (657 * 657) + (145 * 145) = 6,495,760

When the PL/SQL loop is doing single row fetches it release the table block after each row, so has to do 10,000 gets on the table – which are all counted towards the “table scan blocks gotten”. Then, to fetch one row from a block the code seems to believe that it has to visit every single row in the block, hence the bit of arithmetic that produced a row scan count of around 6.5M for this “short table” tablescan.

When I ran up this test initially the three indexed accesses in the middle were emulating the code supplied in the original statement of the problem but, as Viacheslav Andzhich noted out in his follow-up to comment #3, they don’t have anything to do with this effect.

We could change from an explicit cursor loop to an implicit cursor loop as follows, of course:

declare
        cursor c1
        is
        select * from t1;

        ch char(10) := '0';

begin
        for rec in c1 loop
                  select null into ch from t1 where id  = rec.id;
                  select null into ch from t1 where id  = rec.id;
                  select null into ch from t1 where id  = rec.id;
        end loop;
end;
/

While you might think this won’t make much (or any) difference you need to remember that quite a long time ago Oracle introduced a PL/SQL optimisation behind the scenes that turns the single row loop into an array fetch of 100 rows (see footnote), so the table scan results I got were as follows:

table scans (short tables)                 1
table scan rows gotten                74,531
table scan blocks gotten                 115

Again the numbers are higher than we might expect but rather more realistic. Again the results are due to double (and treble, and quadruple…) counting at the start of each array fetch, but I haven’t yet tried to work out the exact numbers in detail – though you can see that since I have roughly 7 arrays per block it’s not entirely surprising that the “blocks gotten” should be somewhere in the region of 7 * “data blocks below hwm”.

Addendum

It’s hard to leave out a little detail when the numbers don’t quite fit expectations – so I enabled the 10202 trace at level 4 while running the second loop and checked the resulting trace file for the buffer header dumps as the tablescan took place. (I commented out the 3 indexed accesses before I did this). Here’s a sample of the dump for one buffer access:

buffer tsn: 6 rdba: 0x03400084 (13/132)
scn: 0x82af1e seq: 0x02 flg: 0x04 tail: 0x0201002c
frmt: 0x02 chkval: 0x716f type: 0x06=trans data
Hex dump of corrupt header 4 = CORRUPT
Dump of memory from 0x000000009B0BE000 to 0x000000009B0BE014
09B0BE000 0000A206 03400084 0082AF1E 04020000  [......@.........]
09B0BE010 0000716F                             [oq..]
Consistent read complete...

Once I’d got this trace file I just extracted all the lines starting with “buffer tsn”, did a sort with count on them, and then worked out (and annotated each block address in turn) how each block had contributed to multiples of 100 rows. The results were as follows – and the answer was 115 buffer visits as reported:

grep "^buffer tsn" or21_ora_10141.trc | sort | uniq -c >temp.txt
vi temp.txt

      7 buffer tsn: 6 rdba: 0x03400084 (13/132)              600 + 57
      8 buffer tsn: 6 rdba: 0x03400085 (13/133)         43 + 600 + 14
      7 buffer tsn: 6 rdba: 0x03400086 (13/134)         86 + 500 + 71
      8 buffer tsn: 6 rdba: 0x03400087 (13/135)         29 + 600 + 28
      7 buffer tsn: 6 rdba: 0x03400088 (13/136)         72 + 500 + 85
      8 buffer tsn: 6 rdba: 0x03400089 (13/137)         15 + 600 + 42
      7 buffer tsn: 6 rdba: 0x0340008a (13/138)         58 + 500 + 99
      8 buffer tsn: 6 rdba: 0x0340008b (13/139)          1 + 600 + 56
      8 buffer tsn: 6 rdba: 0x0340008c (13/140)         44 + 600 + 13
      7 buffer tsn: 6 rdba: 0x0340008d (13/141)         87 + 500 + 70
      8 buffer tsn: 6 rdba: 0x0340008e (13/142)         30 + 600 + 27
      7 buffer tsn: 6 rdba: 0x0340008f (13/143)         73 + 500 + 84
      8 buffer tsn: 6 rdba: 0x03400090 (13/144)         16 + 600 + 41
      7 buffer tsn: 6 rdba: 0x03400091 (13/145)         59 + 500 + 98
      8 buffer tsn: 6 rdba: 0x03400092 (13/146)          2 + 600 + 55
      2 buffer tsn: 6 rdba: 0x03400093 (13/147)         45 + 100

To finish off the arithmetic, this also shows 113 visits to blocks with 657 rows and 2 visits to blocks with 145 rows: (113 * 657) + (2 * 145) = 74,531 (as reported by “table scan rows gotten”).

Setup

If you want to do some experimentation on variations, here’s the code I used to create my test data.

rem
rem     Script:         tablescan_joke.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sept 2021
rem
rem     Last tested 
rem             21.3.0.0
rem             19.11.0.0
rem
rem     Notes:
rem     When you do a tablescan with single row fetches 
rem     inside a pl.sql loop, evey row fetch is a block 
rem     fetch and the row count increments by the number
rem     of rows in the block.
rem
rem     In 19.3 a call to rowid_count gave me the result
rem     ROWS_STARTING_IN_BLOCK     BLOCKS
rem     ---------------------- ----------
rem                        145          1
rem                        657         15
rem     So total rows =
rem             657 * (10000 - 145) +
rem             145 * 145
rem

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4    -- > comment to avoid WordPress format issue
)
select
        rownum                          id
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4   -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(id);

select  blocks, num_rows 
from    user_tables
where   table_name = 'T1'
;

The code I used to check the number of rows per block is at this URL.

Footnote

In 10g Oracle introduced the plsql_optimize_level to affect the way in which the pl/sql optimizer was allowed to manipulate your code. The default level is 2, and at this level an implicit cursor for loop will use an array fetch size of 100 behind the scenes. If you want to disable this optimisation you need to set the parameter value down to 1 (or zero).

6 Comments »

  1. Well, that was interesting. I won’t say at this time what I predicted, but suffice it to say, my prediction was incorrect.

    Comment by jkstill — September 5, 2021 @ 6:59 pm BST Sep 5,2021 | Reply

  2. I expected 1 table scans (short tables) and 1000 table scan blocks gotten because fetch size is 1
    But I didn’t expect reading many table scan rows gotten on each fetch 🤔

    Comment by @FranckPachot — September 5, 2021 @ 8:23 pm BST Sep 5,2021 | Reply

  3. Whatever common sense might suggest, I believe, the correct answer is way-way above that :)
    I suspect, that those three index-only statements are there for a reason. Probably, every time they launched, they bump up the table scans stat, even though they shouldn’t. If that’s the case, we have 3 scans per each row, 3*10000 + 1 for the cursor = 30001.
    I guess, then the blocks gotten should be around 16*300001 = 480k, and rows gotten 30001*15*647+30001*1*145 it’s something like 295M

    Comment by Viacheslav Andzhich — September 5, 2021 @ 10:04 pm BST Sep 5,2021 | Reply

    • I’ve built a test case this morning in local 19.3.To not spoil fun to anyone, I’d just say, that
      a) my assumption was wrong, none of the numbers is correct;
      b) the index-only statements seem to be a decoy :)

      Comment by Viacheslav Andzhich — September 6, 2021 @ 11:38 am BST Sep 6,2021 | Reply

  4. Might be related to this oddity I video’d a while back https://youtu.be/jlcKTzj-tiI

    Comment by Connor McDonald — September 5, 2021 @ 11:49 pm BST Sep 5,2021 | Reply

  5. I’ll take a guess (I’m probably way off). I thought I heard somewhere that PL/SQL Cursor loops cache 100 rows. So with each fetch, you would hit each of the blocks in t1 6 or 7 times. So say 100ish for table scan blocks gotten.

    Table scans rows gotten…i think i remember someplace that it has to re-traverse rows when it revisits the same block? So 1st scan would be 100 rows, second scan would be an additional 200 rows (300 rows total at this point). Third scan 300 rows (600 total) fourth scan 400 rows (1000 total). Fifth scan 500 rows (1.5k) sixth scan 600 rows (2.1k). Repeat for 16ish blocks, table scan rows gotten would be ~30k rows.

    Table scans (short tables) I would expect to be 1, but the fact that you’re listing it tells me that it’s probably not. Lol.

    I’m likely way off on everything though, just guessing for funzies ¯\_(ツ)_/¯

    Comment by Kaley Crum — September 6, 2021 @ 3:05 am BST Sep 6,2021 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.