Oracle Scratchpad

April 2, 2024

Index Usage – 3

Filed under: Indexing,Oracle,Troubleshooting,Tuning — Jonathan Lewis @ 4:34 pm BST Apr 2,2024

In the second part of this series I described some of the technicalities of Index Usage Tracking and showed an example of what I was doing to test the feature. In this episode I’ll describe some of the index access methods I’ve tested and report the results. I’ve listed the tests I’ve planned so far and will show results as I find time to run the tests – if you can think of more cases add them in the comments and I’ll extend the list. (If you think a test is a very good idea, “upvote” it in the comments and I’ll try to run it sooner.

Before I’ve done all the tests I’ll add a section on Conclusions and Recommendations. This will be addressing the questions: “Is the feature worth using?” and “What’s the best way to use it”. This section of the note may develop over time as special cases or boundary conditions show up.

Setup (recap from pt.2)

I’ll be using three session to do my testing:

  • Session 1 logged in as the test user to execute test statements and query dba_index_usage.
  • Session 2 logged in as sys in the same pdb to query x$keiut and x$keiut_info/v$index_usage_info (real time summaries)
  • Session 3 logged in as the oracle s/w owner and connected as sys in the cdb$root to call keiutflush to make mmon flush x$keiut to disk. The session may have to “alter session set container = {the test pdb}” to flush the right x$keiut.
  • CDB SYS session
    • call keiutFlush twice to clear all active elements from x$keiut_info/v$index_usage_info (check contents)
  • PDB SYS session
    • check x$keiut is empty
  • User session:
    • check dba_index_usage for targetted index(es)
    • execute some statements forcing use of index and check execution plan
  • PDB sys session:
    • check contents of x$keiut and x$keiut_info/v$index_usage_Info
  • CDB sys session
    • call keiutFlush
  • End user session
    • Check contents of dba_index_usage (report changes)

Presentation

For each test I’ll show (in collapsed form) the SQL I used to define the objects being tested, and describe the purpose and method of the test. Then I’ll simply show the critical changes in dba_index_usage and, in some cases, x$keiut_info/v$index_usage_info, x$keiut as a result of the test.

Initial list of tests

  • Gathering index stats
  • “Foreign Key” indexes and primary key indexes during referential integrity
  • Single table, single column index, multiple statements
    • Unique index (equality and range predicates)
    • non-unique index covering unique constraint
    • non-unique index generally
  • Inlist iterator and union all.
  • Nested loop joins
  • Single table, Multi-column index (access/filter, and skip scan)
  • Index-only query (range scan, full scan, fast full scan, index joins)
  • Bitmap access (with multiple bitmaps per key value)
    • Bitmap and / or / minus
  • IOT accessed by primary key
  • IOT accessed by secondary key – ensuring “misses on guesses”
  • DML – single table access
  • DML – with subquery / join methods embedded
  • Locally partitioned index (single table partition, multiple table partition)
  • Globally partitioned index

Test Data

I’ve used the following script to generate common data for many of the initial tests in the list above – I may extend this script as I complete more and more of the tests, though I may also introduce separate scripts for some specific cases:

rem
rem     Script:         iut_02.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2024
rem     Purpose:        
rem
rem     Last tested 
rem             19.11.0.0
rem     To be tested
rem             23.3.0.0
rem             12.2.0.1
rem
rem     Notes:
rem     Create some test data to demonstrate index usage tracking
rem

create table t1 as
select
        rownum - 1                              id,
        mod(rownum-1,10000)                     n1,
        trunc((rownum - 1)/20)                  col1,
        trunc((rownum - 1)/10)                  col2,
        rownum - 1                              col3,
        round(100 * dbms_random.normal())       rand,
        cast(rpad(rownum,25) as varchar2(25))   v1,
        cast(rpad('x',80,'x') as varchar2(80))  padding
from
        all_objects
where
        rownum <= 50000
/

create table t2 as select * from t1;

-- single column primary key index (unique)

alter table t1 add constraint t1_pk primary key (id) 
        using index (
        create unique index t1_pk on t1(id)
        );

create index t1_col3 on t1(col3);
alter table t1 add constraint t1_uk unique (col3);

create index t1_3col on t1(col1, col2, col3) compress 2;
create index t1_rand on t1(rand);

-- multi-column primary key index (unique)

alter table t2 add constraint t2_pk primary key (col1, id) 
        using index (
        create unique index t2_pk on t2(col1, id) compress
        );

-- "foreign key index" (one to many)

alter table t2 add constraint t2_fk_t1 foreign key (col3) references t1(id);
create index t2_col3 on t2(col3);

Gathering Stats

The SQL used to gather index stats from any of the calls to dbms_stats.gather_xxx_stats() does a simple select statement that is hinted to access the index. In my case the indexes were all fairly small – smaller than the critical number of blocks that trigger sampling methods – so Oracle examined every block and row in the index, leading to (a check in) index usage stats looking like the following t1 primary key example:

OBJECT_ID                      : 209180
NAME                           : T1_PK
OWNER                          : TEST_USER
TOTAL_ACCESS_COUNT             : 1
TOTAL_EXEC_COUNT               : 1
TOTAL_ROWS_RETURNED            : 50000
BUCKET_1000_PLUS_ACCESS_COUNT  : 1
BUCKET_1000_PLUS_ROWS_RETURNED : 50000
LAST_USED                      : 01-apr-2024 13:48:51

So gathering stats does flag the index as used – but with the enhanced tracking it’s going to be relatively easy to spot cases where a small number of accesses account for a large number of rows – hinting that these accesses may be only for stats collection.

If you’ve read the linked article you’ll see how Oracle’s sampling strategy has changed in recent years, so a check that will be relevant to some of your indexes is whether or not the average rows returned is equivalent to roughly 1,140 leaf blocks. Other simple checks that might be appropriate are: “is the very large access happening once per day/week” in line with your expected stats collection strategy.

Of course, if large access counts are happening fairly frequently you can raise the question – does this look an appropriate result for the table or index, or does it hint at an index that is used when it should not be, or an index that needs to be refined (e.g. by the addition of extra columns to allow better elimination before visiting the table).

Referential Integrity (and DML)

With referential integrity in place Oracle will (in the most commonly used setup)

  • check that matching child rows do not exist when you try to delete a parent or modify its key value – will either operation flag a supporting “foreign key” index as used.
  • check that the parent row exists if you try to insert a child row or update a child row to associate it with a different parent value – will either operation flag the parent primary key index as used

In my test script t1 is the parent and t2 is the child. The referential integrity is from t2.col3 to t1.id and the relevant indexes are unique indexes named t2_col3 and t1_pk respectively. Test statements are:

-- Delete parent when child exists
-- Delete parent when child deleted
-- --------------------------------

delete from t1 where id = 17000;
-- ORA-02292: integrity constraint (TEST_USER.T2_FK_T1) violated - child record found

delete from t2 where col3 = 17000;
delete from t1 where id   = 17000;

commit;

-- Insert into child without parent
-- Insert into child with    parent
-- --------------------------------

insert into t2 values(60000, 1,1,1,60000,1,'x','x');
-- ORA-02291: integrity constraint (TEST_USER.T2_FK_T1) violated - parent key not found

insert into t2 values (18000,1,1,1,15000,1,'x','x');

commit;

-- update parent to "abandon" child
-- --------------------------------

update t1 set id = 60000 where id = 25000;
-- ORA-02292: integrity constraint (TEST_USER.T2_FK_T1) violated - child record found

-- update child to change to non-existent parent
-- update child to change to pre-existing parent
-- ---------------------------------------------

update t2 set col3 = 60000 where id = 25000;
-- ORA-02291: integrity constraint (TEST_USER.T2_FK_T1) violated - parent key not found

update t2 set col3 = 30000 where id = 25000;

commit;

After calling keiutFlush and checking that there were no active elements in x$keiutinfo/v$index_usage_Info, and no rows in x$keiut I executed all the statements above one after the other (some failed, of course, with their errors shown above). Nothing was captured in x$keiut.

Apart from the implications of “foreign key” indexes not being flagged as used during referential integrity checks, the tests above also show us that updates and deletes driven by index access do not show the driving index flagged as used: t1.id = constant, t2.col3 = constant, t2.id = constant (which used an index skip scan on t2_pk)).

Single table, single column index

This set of tests is close to a repeat of the first demonstration in part 2. Here’s the list of statements aimed at index t1_pk. Note that t1_pk(id) is a unique index on a primary key constraint, t1_uk(col3) is a non-unique index covering a unique constraints, t2_col3(col3) is a single column non-unique index.

--
--      Out of range, no rows
--
select v1 from t1 where id = -1;
select v1 from t1 where id = 60001;

--
--      Single row
--
select v1 from t1 where id = 1000;
select v1 from t1 where id = 2000;

--
--      multiple rows: 5, 50, 500, 1500
--
select max(v1) from t1 where id between 11 and 15;
select max(v1) from t1 where id between 101 and 150;
select max(v1) from t1 where id between 1001 and 1500;
select max(v1) from t1 where id between 10001 and 11500;

Here are the results after the sequence: call keiutFlush, execute test SQL, report x$keiut, call keiutFlush, report change in dba_index_usage:

SQL> select objname, num_starts, num_execs, rows_returned from x$keiut;

OBJNAME                          NUM_STARTS  NUM_EXECS ROWS_RETURNED
-------------------------------- ---------- ---------- -------------
TEST_USER.T1_PK                           8          8          2057


SQL> host expand temp1.lst
OBJECT_ID                      : 209180
NAME                           : T1_PK
OWNER                          : TEST_USER
TOTAL_ACCESS_COUNT             : 8
TOTAL_EXEC_COUNT               : 8
TOTAL_ROWS_RETURNED            : 2075
BUCKET_0_ACCESS_COUNT          : 2
BUCKET_1_ACCESS_COUNT          : 2
BUCKET_2_10_ACCESS_COUNT       : 1
BUCKET_2_10_ROWS_RETURNED      : 5
BUCKET_11_100_ACCESS_COUNT     : 1
BUCKET_11_100_ROWS_RETURNED    : 50
BUCKET_101_1000_ACCESS_COUNT   : 1
BUCKET_101_1000_ROWS_RETURNED  : 500
BUCKET_1000_PLUS_ACCESS_COUNT  : 1
BUCKET_1000_PLUS_ROWS_RETURNED : 1500
LAST_USED                      : 02-apr-2024 15:29:06

You’ll note that I executed 8 select statements, and expected a total of 2,057 rows (index rowids) being passed to the table access operation, and the changes in stats shown in dba_index_usage are an exact match for the predictions above the listed SQL statements.

Since t1.col3 is an exact match of t1.id, and since t2 is a duplicate of t1, it seems likely that tests that start by cloning the SQL and changing the column or table name as appropriate would give the matching results – and they do, so I won’t bother to print them all up.

There is one final test of a single column index before I move on to simple queries targeting a multi-column index. Here’s the statement I want to test to confirm an important point:

select v1 from t1 where id between 101 and 150 and mod(id,10) = 0;

This query will scan through 50 index entries, discarding all but 5 of them, returning 5 rows from the table. The key question is this – will dba_index_usage report 50 rows accessed or 5 rows accessed. Here’s what the changes in stats looked like after the test

TOTAL_ACCESS_COUNT             : 1
TOTAL_EXEC_COUNT               : 1
TOTAL_ROWS_RETURNED            : 5
BUCKET_0_ACCESS_COUNT          : 0
BUCKET_1_ACCESS_COUNT          : 0
BUCKET_2_10_ACCESS_COUNT       : 1
BUCKET_2_10_ROWS_RETURNED      : 5
BUCKET_11_100_ACCESS_COUNT     : 0
BUCKET_11_100_ROWS_RETURNED    : 0
BUCKET_101_1000_ACCESS_COUNT   : 0
BUCKET_101_1000_ROWS_RETURNED  : 0
BUCKET_1000_PLUS_ACCESS_COUNT  : 0
BUCKET_1000_PLUS_ROWS_RETURNED : 0

This is a very important point: the stats in dba_index_usage do not tell us how many rows (index entries) we visited in the index, they tell us how many rowids (or, possibly, key values) survived to be passed to the parent operation (typically the table access). So you might look at some stats that say: “25,000 executions, all in the 2 to 10 range – good index” when you’ve got a badly designed index does 90% of the total work of a query and discards 2,000 index entries for each rowid it uses to access a table.

Inlist Iterators, Union All and mutiple references

Here’s a sample query (with its result set, and actual execution plan pulled from memory) accessing the t1 table through the index on column rand. If you try to repeat this example it probably won’t give you exactly the same results because I used Oracle’s random number generator to generate a normal distribution of integer values (with mean zero and standard deviation of 100), but there’s a reasonable chance that you’ll see similar numbers in your output as I’ve been careful to pick three values that should return significantly different numbers of rows:

select
        rand, count(*) 
from    t1 
where   rand in (100, 200, 300)
and     v1 is not null
group by 
        rand 
order by 
        count(*)
/


      RAND   COUNT(*)
---------- ----------
       300          2
       200         25
       100        114

3 rows selected.

select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------
SQL_ID  fbwmnax39jqtq, child number 0
-------------------------------------
select         rand, count(*) from    t1 where   rand in (100, 200,
300) and     v1 is not null group by         rand order by
count(*)

Plan hash value: 874747030

-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |         |      1 |        |      3 |00:00:00.01 |     143 |       |       |          |
|   1 |  SORT ORDER BY                 |         |      1 |      3 |      3 |00:00:00.01 |     143 |  2048 |  2048 | 2048  (0)|
|   2 |   SORT GROUP BY NOSORT         |         |      1 |      3 |      3 |00:00:00.01 |     143 |       |       |          |
|   3 |    INLIST ITERATOR             |         |      1 |        |    141 |00:00:00.01 |     143 |       |       |          |
|*  4 |     TABLE ACCESS BY INDEX ROWID| T1      |      3 |    220 |    141 |00:00:00.01 |     143 |       |       |          |
|*  5 |      INDEX RANGE SCAN          | T1_RAND |      3 |    220 |    141 |00:00:00.01 |       5 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("V1" IS NOT NULL)
   5 - access(("RAND"=100 OR "RAND"=200 OR "RAND"=300))

This plan is probably the simplest demonstration of the difference between total_exec_count and total_access_count. Here’s the output from x$keiut after this test, followed by the details available from dba_index_usage after calling keiutFlush:

OBJNAME                          NUM_STARTS  NUM_EXECS ROWS_RETURNED
-------------------------------- ---------- ---------- -------------
TEST_USER.T1_RAND                         3          1           141

Given that x$keiut has reported three “starts”, and we can see that the plan shows three “Starts” and “A-rows” is 141, and the output shows three rows totalling 141 we might hope that dba_index_usage is going to show us exactly what happened with this query. Here are the results:

SQL> execute print_tABLE(q'(select * from dba_index_usage where name = 'T1_RAND')')
OBJECT_ID                      : 209738
NAME                           : T1_RAND
OWNER                          : TEST_USER
TOTAL_ACCESS_COUNT             : 3
TOTAL_EXEC_COUNT               : 1
TOTAL_ROWS_RETURNED            : 141
BUCKET_0_ACCESS_COUNT          : 0
BUCKET_1_ACCESS_COUNT          : 0
BUCKET_2_10_ACCESS_COUNT       : 0
BUCKET_2_10_ROWS_RETURNED      : 0
BUCKET_11_100_ACCESS_COUNT     : 3
BUCKET_11_100_ROWS_RETURNED    : 141
BUCKET_101_1000_ACCESS_COUNT   : 0
BUCKET_101_1000_ROWS_RETURNED  : 0
BUCKET_1000_PLUS_ACCESS_COUNT  : 0
BUCKET_1000_PLUS_ROWS_RETURNED : 0
LAST_USED                      : 02-apr-2024 19:02:03

The three “starts” from x$keiut show up in the total_access_count but the three separate accesses have been accumulated and averaged to appear as 3 accesses in the 11-100 range. (141 / 3 = 47). So there’s some loss of precision in the information. It’s better than just the old “yes/no” report, and in many cases it’s likely to give you numbers that close enough to the right ballpark to be useful, but there are likely to be some cases where the more detailed information would be more beneficial. It’s worth mentioning that the total_access_count (hence the averaging) will include “Starts” where no rows were returned.

Once upon a time, of course, a query involving an inlist might have been rewritten as a query with union all, such as:

select
        rand, count(v1)
from    t1
where   rand = 100
group by
        rand
union all
select
        rand, count(v1)
from    t1
where   rand = 200
group by
        rand
union all
select
        rand, count(v1)
from    t1
where   rand = 300
group by
        rand
;

---------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |      1 |        |      3 |00:00:00.01 |     144 |
|   1 |  UNION-ALL                    |         |      1 |        |      3 |00:00:00.01 |     144 |
|   2 |   SORT GROUP BY NOSORT        |         |      1 |    128 |      1 |00:00:00.01 |     113 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1      |      1 |    128 |    114 |00:00:00.01 |     113 |
|*  4 |     INDEX RANGE SCAN          | T1_RAND |      1 |    128 |    114 |00:00:00.01 |       2 |
|   5 |   SORT GROUP BY NOSORT        |         |      1 |     61 |      1 |00:00:00.01 |      27 |
|   6 |    TABLE ACCESS BY INDEX ROWID| T1      |      1 |     61 |     25 |00:00:00.01 |      27 |
|*  7 |     INDEX RANGE SCAN          | T1_RAND |      1 |     61 |     25 |00:00:00.01 |       2 |
|   8 |   SORT GROUP BY NOSORT        |         |      1 |     61 |      1 |00:00:00.01 |       4 |
|   9 |    TABLE ACCESS BY INDEX ROWID| T1      |      1 |     61 |      2 |00:00:00.01 |       4 |
|* 10 |     INDEX RANGE SCAN          | T1_RAND |      1 |     61 |      2 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("RAND"=100)
   7 - access("RAND"=200)
  10 - access("RAND"=300)

Does this make any difference to the way that Oracle reports the executions, starts, and rows returned? Here’s the change in dba_index_usage due to this query:

NAME                           : T1_RAND
TOTAL_ACCESS_COUNT             : 3
TOTAL_EXEC_COUNT               : 3
TOTAL_ROWS_RETURNED            : 141
BUCKET_0_ACCESS_COUNT          : 0
BUCKET_1_ACCESS_COUNT          : 0
BUCKET_2_10_ACCESS_COUNT       : 1
BUCKET_2_10_ROWS_RETURNED      : 2
BUCKET_11_100_ACCESS_COUNT     : 1
BUCKET_11_100_ROWS_RETURNED    : 25
BUCKET_101_1000_ACCESS_COUNT   : 1
BUCKET_101_1000_ROWS_RETURNED  : 114
BUCKET_1000_PLUS_ACCESS_COUNT  : 0
BUCKET_1000_PLUS_ROWS_RETURNED : 0

In this case three separate accesses through the t1_rand index have been recorded individually in the correct buckets – with one interesting anomaly: the total execution count is 3 rather than 1. This suggests that the “total_exec_count” is a little more complicated than I originally thought. For each execution of a single query the “total_exec_count” seems to be the “total_operation_count” and possibly, for “conditional SQL”, the count of operations that were actually called.

Nested Loops

After seeing how handled the inlist iterator, the obvious next step was to repeat the experiment but use a table to hold the values I’d selected for the inlist and force a nested loop join. Here’s the table creation followed by the query and its execution plan:

create table t3 (n1 number);

insert into t3 values(100);
insert into t3 values(200);
insert into t3 values(300);
commit;

execute dbms_stats.gather_table_stats(user,'t3')

alter table t3 add constraint t3_uk unique(n1);

select
        /*+
                leading(t3 t1)
                use_nl(t1)
                index(t1(rand))
        */
        t3.n1,
        max(t1.v1)
from
        t3, t1
where
        t1.rand = t3.n1
group by
        t3.n1
/


PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
SQL_ID  7mxmx715yfb73, child number 0
-------------------------------------
select  /*+   leading(t3 t1)   use_nl(t1)   index(t1(rand))  */  t3.n1,
 max(t1.v1) from  t3, t1 where  t1.rand = t3.n1 group by  t3.n1

Plan hash value: 3159923631

---------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |      1 |        |      3 |00:00:00.01 |     147 |
|   1 |  SORT GROUP BY NOSORT         |         |      1 |      3 |      3 |00:00:00.01 |     147 |
|   2 |   NESTED LOOPS                |         |      1 |    220 |    141 |00:00:00.01 |     147 |
|   3 |    NESTED LOOPS               |         |      1 |    220 |    141 |00:00:00.01 |       9 |
|   4 |     INDEX FULL SCAN           | T3_UK   |      1 |      3 |      3 |00:00:00.01 |       2 |
|*  5 |     INDEX RANGE SCAN          | T1_RAND |      3 |     73 |    141 |00:00:00.01 |       7 |
|   6 |    TABLE ACCESS BY INDEX ROWID| T1      |    141 |     73 |    141 |00:00:00.01 |     138 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T1"."RAND"="T3"."N1")

As you can the plan involves two indexes, and (after the usual start of called keiutFlush) executing the query ends up with x$keiut reporting two rows:

OBJNAME                          NUM_STARTS  NUM_EXECS ROWS_RETURNED
-------------------------------- ---------- ---------- -------------
TEST_USER.T1_RAND                         3          1           141
TEST_USER.T3_UK                           1          1             3

The num_starts and rows_returned happily agree with the Starts and A-rows values in the plan and after a second call to keiutFlush the changes in dba_index_usage are as follows:

NAME                           : T1_RAND
TOTAL_ACCESS_COUNT             : 3
TOTAL_EXEC_COUNT               : 1
TOTAL_ROWS_RETURNED            : 141
BUCKET_0_ACCESS_COUNT          : 0
BUCKET_1_ACCESS_COUNT          : 0
BUCKET_2_10_ACCESS_COUNT       : 0
BUCKET_2_10_ROWS_RETURNED      : 0
BUCKET_11_100_ACCESS_COUNT     : 3
BUCKET_11_100_ROWS_RETURNED    : 141
BUCKET_101_1000_ACCESS_COUNT   : 0
BUCKET_101_1000_ROWS_RETURNED  : 0
BUCKET_1000_PLUS_ACCESS_COUNT  : 0
BUCKET_1000_PLUS_ROWS_RETURNED : 0

NAME                           : T3_UK
TOTAL_ACCESS_COUNT             : 1
TOTAL_EXEC_COUNT               : 1
TOTAL_ROWS_RETURNED            : 3
BUCKET_0_ACCESS_COUNT          : 0
BUCKET_1_ACCESS_COUNT          : 0
BUCKET_2_10_ACCESS_COUNT       : 1
BUCKET_2_10_ROWS_RETURNED      : 3
BUCKET_11_100_ACCESS_COUNT     : 0
BUCKET_11_100_ROWS_RETURNED    : 0
BUCKET_101_1000_ACCESS_COUNT   : 0
BUCKET_101_1000_ROWS_RETURNED  : 0
BUCKET_1000_PLUS_ACCESS_COUNT  : 0
BUCKET_1000_PLUS_ROWS_RETURNED : 0

Again we see that the figures agree with the execution plan but, just as we saw with the inlist iterator, the level of detail reported for rows returned does go down to the individual accesses, it simply averages (for the execution) the total rows returned across the total access count to assign the work done in index t1_uk to the 11 to 100 bucket.

Prompted by the reporting of the union all query in the previous session it’s worth looking at a slight more complex join to clarify what an “exec” is. Using the same t1 and t2 tables, here’s a suitablu silly query:

select
        count(t2.v1)
from
        t2,
        t1 t1a,
        t1 t1b
where
        t1a.id = t2.n1
and     t1b.id = 10 * t2.n1
/

SQL_ID  917pamcvnmtsk, child number 0
-------------------------------------
select  count(t2.v1) from  t2,  t1 t1a,  t1 t1b where  t1a.id = t2.n1
and t1b.id = 10 * t2.n1

Plan hash value: 1623429891

----------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |      1 |        |      1 |00:00:00.24 |    5306 |
|   1 |  SORT AGGREGATE      |       |      1 |      1 |      1 |00:00:00.24 |    5306 |
|   2 |   NESTED LOOPS       |       |      1 |  50000 |  25000 |00:00:00.24 |    5306 |
|   3 |    NESTED LOOPS      |       |      1 |  50000 |  50000 |00:00:00.13 |    1662 |
|   4 |     TABLE ACCESS FULL| T2    |      1 |  50000 |  50000 |00:00:00.02 |     960 |
|*  5 |     INDEX UNIQUE SCAN| T1_PK |  50000 |      1 |  50000 |00:00:00.07 |     702 |
|*  6 |    INDEX UNIQUE SCAN | T1_PK |  50000 |      1 |  25000 |00:00:00.08 |    3644 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T1A"."ID"="T2"."N1")
   6 - access("T1B"."ID"=10*"T2"."N1")

You’ll notice that there are two “index unique scan” operations, both starting 50,000 times; and that the second operation returns a row in only 25,000 of those starts thanks to the definition of n1 and the “10 x n1” in my predicate. Here’s the corresponding report from x$keiut.

OBJNAME                          NUM_STARTS  NUM_EXECS ROWS_RETURNED
-------------------------------- ---------- ---------- -------------
TEST_USER.T1_PK                      100000          2         75000

We see num_execs = 2 – i.e. one for each time the index appears in an operation in the plan. Then, as expected, num_starts and rows_returned match the sum of the two Starts and A-Rows. But here’s the change in dba_index_usage – spot the oddity:

NAME                           : T1_PK
OWNER                          : TEST_USER
TOTAL_ACCESS_COUNT             : 100000
TOTAL_EXEC_COUNT               : 2
TOTAL_ROWS_RETURNED            : 75000
BUCKET_0_ACCESS_COUNT          : 50000
BUCKET_1_ACCESS_COUNT          : 50000

Oracle reports the “execute count” of 2; and the total rows returned as 75000 – but it seems that that 75,000 is made up of 50,000 accesses that returned 1 row (which we know happened), and 50,000 accesses that returned 0 rows (which is not the way things went).

It’s going to be very easy for me to fool myself that I know what Oracle is doing here – it’s going to take several more experiments to get a good idea of what information Oracle is recording and when and how it’s accumulating it (or not) between calls to keiutFlush.

In this simple case where we’ve executed one query with two operations it would be easy to assume that Oracle has identified and summarised the two operations separately, introducing some simple rounding errors.

  • The first operation starts 50,000 times and returns 50,000 rows so, on average, it should be counted in the “1 access” bucket 50,000 times.
  • The second operation starts 50,000 times and returns 25,000 rows so, on average and with rounding (down), that’s 0 rows per start. Hence a count of 50,000 in the “0 access” bucket.

It seems like a viable explanation – but we might want to fiddle with slightly different queries, and do some tests of multiple executions of the same query (while the data changes), and some tests with slightly different but similarly structured queries, to see if Oracle is accumulating on a “per query execution basis”, or doing something more complex, or something more naive.

The other thing to note about the way that dba_index_usage has captured the statistics of this report is that it looks as if it depends either on rowsource execution statistics or the SQL Monitor statistics being gathered on every execution – and that could lead to significant overheads (in CPU, most probably).

Conclusions and Suggestions

Calls to gather index stats execute queries that will be captured by index usage tracking, so you need a strategy to help you ignore “BUCKET_1000_PLUS_ROWS_RETURNED” accesses that are about stats collection, while taking action when they are indications of index usage that could be made more efficient.

Indexes used during referential integrity checks are not flagged as used. This is probably not much of a threat for the relevant (primary/unique key) indexes on the parent end since you are unlikely to drop indexes that enforce uniqueness even if they don’t appear to be used; however it does mean that there is some risk of dropping a “foreign key” index that is needed to avoid locking problems.

Indexes used to drive delete and update statements are not captured by the new mechanism – at least for simple “delete from single_table where …” and “update table where …”. This could result in indexes being dropped that have been created to enhance DML performance. (There are outstanding tests for DML using index-driven subqueries or join views to see if indexes used to access other tables in the statement do get flagged.)

The statistics about “rows returned” tell you how many index entries are passed to the parent operation, not about the number of index entries examined; so a frequently used index that consistently reports a very small number of “rows returned” is not necessarily an efficient, well-designed index.

The stats on inlist iterators are a little disappointing: while the number of iterations appears as the total_access_count, the accesses are all allocated to the same bucket on the basis of total_rows_returned / total_access_count and no allowance is made for different values in the list returning significantly different numbers of rows. The same is true of the index accesses for the second table of a nested loop join. On the other hand if an inlist query is rewritten as a union all query then the stats for each branch of the union all are captured separately – this means access counts and rows returned are reported against the correct bucket, but as a side effect the execution count is reported as the number of branches that have used the index, executed rather than the number of executions of the query.

Tests with slightly more complex queries show that the total_exec_count relates to the number of times an index is named in an execution plan operation, not just the number of times the query is executed; and the slightly more complex queries make it easy to see side effects of the rounding of “total_row_count / total_access_count” (particularly the “0 bucket” and “1 bucket” stats) that make the overall results look strangely inconsistent. In my (current) opinion, though, it’s probably the case that the 0/1 level where the inconsistency is most visible is where it’s also least important.

3 Comments »

  1. […] Index Usage Tracking (April 2024): Part 3 – first tests of what’s tracked and what isn’t. […]

    Pingback by Indexing Catalogue | Oracle Scratchpad — April 2, 2024 @ 4:47 pm BST Apr 2,2024 | Reply

  2. Hi Jonathan, I’m confuse when you presented this (141 / 3 = 27) in my mind 141/3 = 47.

    Also I think a small typo when you started the paragraph with “If course”. but of course English is not my first, no even my second language.

    Comment by Anonymous — April 3, 2024 @ 12:46 pm BST Apr 3,2024 | Reply


RSS feed for comments on this post. TrackBack URI

Comments and related questions are welcome.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.