I’ve just seen a note on the newsgroup comp.databases.server.oracle advising someone to check the online manual for a piece of code to report which objects are using how much space in the buffer cache. This is the reference and this is the code:
SELECT
        o.object_name, COUNT(1) number_of_blocks
FROM
        DBA_OBJECTS  o,
        V$BH         bh
WHERE
        o.object_id = bh.objd
AND     o.owner    != 'SYS'
GROUP BY
        o.object_name
ORDER BY
        count(1);
There are two flaws with this code – it gets the wrong results, and it’s inefficient.
The column objd in v$bh is the data_object_id from dba_objects, not the object_id. The data object id caters for rebuilding indexes, moving tables, clusters, and partitioning (amongst others). Look at the sample below where I’ve created a partitioned table with a primary key index, then done various bits of DDL, including a partition exchange and one index partition rebuild.
SQL> l
  1  select
  2          object_id, data_object_id,
  3          object_name, subobject_name
  4  from
  5* user_objects
SQL> /

 OBJECT_ID DATA_OBJECT_ID OBJECT_NAME          SUBOBJECT_NAME
---------- -------------- -------------------- --------------------
     44797          44802 PT_PK                P400
     44801          44791 PT_PK                P600
     44795                PT_PK
     44794          44794 PT_RANGE             P400
     44800          44790 PT_RANGE             P600
     44792                PT_RANGE
     44790          44800 T1
     44791          44801 T1_PK
     44798                V_PT

SQL> spool off
Note particularly how the partitioned table has a ‘logical’ object id, but it is only the physical data segments of the underlying partitions that have a ‘physical’ data object id.
Note also that the ‘logical’ object id of the table t1 matches the ‘physical’ data object id of partition p600 and vice versa (with a matching cross-over on the primary keys). This is because I exchanged that table with that partition. And while we’re talking about partitioning, shouldn’t the code also aggregate by subobject_name to separate the partitions out? So this code is going to get funny results, and miss objects, for all sorts of reasons.
A couple of standard guidelines for writing SQL – eliminate early, aggregate before extending. So let’s apply them here.
The join to dba_objects is not going to work for rollback segments or global temporary tables, so eliminate them – the predicate “objd < power(2,22)” will do this. (It’s an interesting point that we can now create partitioned tables with millions of partitions – but once you get above roughly 4,000,000 the data_object_id is going to collide with the objd values used for temporary table segments).
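As a quick sanity check (a sketch only – it assumes there is some undo or sort/temp activity on the instance at the time you run it), you can count the buffers that this predicate is intended to exclude:

```sql
rem
rem     Buffers for undo and temporary segments show objd values
rem     at or above power(2,22); these are the buffers that the
rem     predicate "objd < power(2,22)" filters out.
rem
select
        count(*)
from    v$bh
where   objd >= power(2,22)
;
```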
Then we might want to eliminate the buffers which are free – they may still have the object information in them but (after a truncate, for example) the information is no longer valid. For v$bh an appropriate predicate would be status != 'free'.
We then see that the code joins to dba_objects before aggregating by object name. So if big_table_X has 20,000 blocks in the buffer, we do 20,000 joins and carry 20,000 extended rows into the aggregation – when we could have aggregated by the (short) objd first and joined just once for the object.
So take out the reference to v$bh, and replace it with a view that does the aggregation, for example something like the following:
SELECT
        o.owner, o.object_name, subobject_name,
        ct      number_of_blocks
FROM
        DBA_OBJECTS     o,
        (
        select  /*+ no_merge */
                objd, count(*) ct
        from    v$bh
        where   objd < power(2,22)
        and     status != 'free'
        group by
                objd
        )               bh
WHERE
        o.data_object_id = bh.objd
AND     o.owner         != 'SYS'
ORDER BY
        ct desc, o.owner, o.object_name, o.subobject_name
;
Of course, you could then argue that if you’re going to play with v$bh, you should be brave enough to use obj$ and user$, rather than joining to an over-burdened dba_objects – eliminating ‘SYS’ on the understanding that ‘SYS’ has a user id of zero.
But that’s left as an exercise for the reader. (And while you’re at it, you might as well use x$bh so that you don’t have to waste resources doing the decode() that gets you the status; I also wonder whether there’s any efficient point at which you could adjust the code to avoid reporting objects which have been dropped but are still in the recycle bin if you’re on 10g.)
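For anyone who wants a starting point for that exercise, here is a sketch only – x$bh is undocumented, its column names and state values vary by version, and you need to be connected as SYS to see obj$, user$ and x$bh. The assumptions are that x$bh.obj holds the data object id and that state = 0 means ‘free’:

```sql
rem
rem     Sketch, not production code.  Assumptions (version-dependent):
rem       x$bh.obj   = data object id
rem       x$bh.state = 0 for free buffers
rem     Run as SYS; eliminates SYS-owned objects via owner# = 0.
rem
select
        u.name          owner,
        o.name          object_name,
        o.subname       subobject_name,
        bh.ct           number_of_blocks
from
        (
        select  /*+ no_merge */
                obj, count(*) ct
        from    x$bh
        where   obj < power(2,22)
        and     state != 0
        group by
                obj
        )               bh,
        sys.obj$        o,
        sys.user$       u
where
        o.dataobj# = bh.obj
and     u.user#    = o.owner#
and     o.owner#  != 0          -- SYS has user id zero
order by
        bh.ct desc, u.name, o.name, o.subname
;
```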
Just as an afterthought – even if you do make this query produce the correct results efficiently, what is it telling you? After all, there may be multiple copies of each data block in the buffer**, so do you want to know about distinct blocks, or all blocks, and is there any significance in the two possible approaches?
** In principle the parameter _db_block_max_cr_dba suggests that the limit should be 6 copies per block. I believe the parameter first appeared in version 6 of Oracle but there still seem to be cases where it doesn’t quite work, even in the very latest versions of Oracle.
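To see the difference the question makes, you could compare a count of all buffered copies with a count of distinct blocks per object – a sketch, using the standard file# and block# columns of v$bh to identify a block:

```sql
rem
rem     Objects where the buffer cache holds more than one copy
rem     of at least one block - comparing "all buffers" with
rem     "distinct blocks" per objd.
rem
select
        objd,
        count(*)                                        all_buffers,
        count(distinct file# || '.' || block#)          distinct_blocks
from    v$bh
where   objd < power(2,22)
and     status != 'free'
group by
        objd
having  count(*) != count(distinct file# || '.' || block#)
order by
        count(*) - count(distinct file# || '.' || block#) desc
;
```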