Sitting in an airport, waiting for a plane, I decided to read a note (pdf) about Flashback data archive written by Beat Ramseier from Trivadis. I’d got about three quarters of the way through it when I paused for thought and figured out that on the typical database implementation something nasty is going to happen after approximately 3 years and 9 months. Can you guess why ?
It’s all about smon_scn_time – which normally records one row every five minutes (created by smon) with a continuous cycle of 24 hours – typically giving you about 1,440 rows in the table. The table is in a cluster, and the cluster key is the instance (thread) number. Clearly this was originally a clever idea from someone who realised that a cluster key of thread number would be beneficial if you had a RAC system with multiple instances – each instance gets its own blocks and the data for any one instance is as well clustered as possible.
The trouble is, when you enable flashback data archive smon no longer sticks to a 24 hour cycle, it just keeps adding rows. Now on my 8KB block tablespace I see 6 rows per block in the table/cluster – which means I get through 48 blocks per days, 17,520 blocks per year, and in 3 years and 9 months I’ll get to roughly 65,700 blocks – and that’s the problem. An index entry in a cluster index points to a chain of cluster blocks, and the last two bytes of the “rowid” in the index entry identify which block within the chain the cluster key scan should start at – and two bytes means you can only have 65,536 blocks for a single cluster key.
I don’t know what’s going to happen when smon tries to insert a row into the 65,535th (-ish) block for the current thread – but it ought to raise an Oracle error, and then you’ll probably have to take emergency action to make sure that the flashback mechanisms carry on running.
Although oraus.msg indicates that it’s an error message about hash clusters it’s possible that the first sight will be: Oracle error: “ORA-02475 maximum cluster chain block count of %s has been exceeded”. If you’re using a 16KB block size then you’ve got roughly 7.5 years, and 32KB block sizes give you about 15 years (not that that’s a good argument for selecting larger block sizes, of course.)
Searching MoS for related topics (smon_scn_time flashback) I found doc ID: 1100993.1 from which we could possibly infer that the 1,440 rows was a fixed limit in 10g but that the number of rows allowed in smon_scn_time could increase in 11g if you enable automatic undo management. I also found a couple of bugs relating to index or cluster corruption – fixed by 18.104.22.168, though.