In one of those little coincidences that seem to happen so often a question came up on the comp.databases.oracle. server news group that has an answer in a blog note I’d written just a few of days earlier . The question was simply asking for a way of counting the number of rows in each partition of a partitioned table.
Of course, there are lots of ways of doing this, depending on how much time you want to write the code, how efficient you want the code to be, how generic you want the code to be, and so on; but here’s a little script that I wrote in about five minutes based on the ‘partition marker’ that appeared in recent versions of Oracle.
rem rem Script: pt_count.sql rem Dated: Nov 2009 rem Author: Jonathan Lewis rem rem Simple script to count the number of rows in each partition rem of a partitioned table. This is not going to be fast if the rem table is very large. rem rem The number of objects in the schema will also affect the rem performance since the query includes a call to instantiate rem a complete list of them. rem rem Change the following define to supply the target table name rem define m_tab_name = pt_range select obj.object_name, obj.subobject_name, rct.row_count from ( select /*+ no_merge */ object_name, subobject_name, data_object_id from user_objects where object_type like 'TABLE%' and subobject_name is not null ) obj, ( select /*+ no_merge */ dbms_mview.pmarker(rowid) data_object_id, count(*) row_count from &m_tab_name -- this is where your table-name goes. group by dbms_mview.pmarker(rowid) ) rct where rct.data_object_id = obj.data_object_id order by obj.object_name, obj.subobject_name ;
It’s very simple, and also happens to demonstrate a good use of the /*+ no_merge */ hint. (I could have used subquery factoring with the /*+ materialize */ hint to similar effect).
My partitioned table is called pt_range and I’ve had to include its name in the query. The query is then written in two halves; the first half finds the names and data_object_ids of all the partitions (or subpartitions) of tables in the user’s schema (and I could have put in a restriction on object_name at that point, of course). The second half counts rows by data_object_id (which is what the dbms_mview.pmarker(rowid) function returns. Then it’s simply a question of joining the two result sets (using an outer join if you want to list partitions with no rows) and reporting the desired columns in the order you like.
I’ve used the no_merge hint to allow Oracle to do the aggregation early – at the most efficient point possible; the alternative would be to do the join and then aggregate on all the other columns reported, but would be a waste of resources. The no_merge hint is probably redundant in the query against the user_objects view – but for a quick hack it’s quite sensible to block the optimizer from trying to get too clever with the code you’re writing.
If you don’t approve of using the dbms_mview.pmarker() function (because of the reliance on the undocumented fact that it returns the data_object_id) you could use the dbms_rowid.rowid_object() function which also returns the data_object_id (despite its name) for a given rowid.
The code isn’t fast, though – so you might want to go one better and trying use a substr() on the rowid to extract the encoded data_object_id, aggregate on that, and then write your own function to turn the encoded form into a decimal form.
Bear in mind, though, whatever you do Oracle will have to scan the entire table and call whichever function you apply on every single row, and then aggregate across the entire result set. So even though the code looks quite clean and simple, it’s not efficient.
If efficiency combined with flexibility is important you might be better off writing a (pipelined) pl/sql function that takes a table-name as an input and loops through each partition in turn with a simple count(*).
Update – Feb 2016
It is a measure of how rarely I feed the need to do an accurate count of rows per partition that I didn’t realise until a few days ago that when you are using interval partitioning the order in which the object_id is generated need not be the same as the order in which the partition high_value needs to be seen (Oracle will create partitions when needed, but the object_id is generated by a sequence). So I had to enhance the code to includ the partition and subpartition positions to report the segments by “data order”. (Given that the the generated names don’t look particularly meaningful I’m not sure that this will be generally useful, though).