So much for my belief that I’d have some quiet time for catching up with a little internet gossip while attending Hotsos 2009.
The days were busy: I crashed out at about 7:00 pm (local time) each evening, was asleep by 7:30 pm, and was up by 2:00 am. I then spent the morning (until breakfast) writing up the notes I had taken the day before. So hardly a moment for blogging or answering questions on OTN.
Best topic of Hotsos (for me, at any rate; there were lots of very good presentations): Amit Poddar’s presentation on the new “approximate NDV” mechanism that Oracle 11g uses to produce an accurate, one-pass estimate of the number of distinct values in a column (no more massive sorts for “count(distinct)”), and on how it keeps a “synopsis” for each partition of a partitioned table so that there is no need to scan the entire table when you need to recalculate statistics for a single partition.
Amit Poddar’s website is no longer online, but he has given me permission to publish the material, so here are the links to a pair of pdf files: the presentation (1.6MB), and the white paper (3.45MB).
The mathematics is brilliant, and I’m going to have to review my previous strategy for stats collection as a consequence.
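To give a feel for the idea, here is a minimal Python sketch of a one-pass distinct-value estimator with mergeable per-partition synopses. It is an illustration under stated assumptions, not Oracle's code: the `Synopsis` class, the SHA-256 stand-in for the hash function, the 16,384-entry limit, and the choice of which hash bits drive a "split" are all hypothetical. The principle is to hash every value, keep the distinct hashes, and whenever the set grows too large, discard half of the hash domain and double the scaling factor.

```python
import hashlib


class Synopsis:
    """Illustrative one-pass NDV estimator (hypothetical, not Oracle's code)."""

    def __init__(self, max_size=16384):
        self.max_size = max_size  # assumed limit on retained hash values
        self.level = 0            # number of times the hash domain was halved
        self.hashes = set()       # distinct hash values currently retained

    @staticmethod
    def _hash(value):
        # Stand-in 64-bit hash; any uniformly distributed hash would do.
        digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _keeps(self, h):
        # A hash survives only if its lowest `level` bits are all zero,
        # so each split halves the part of the hash domain we track.
        return h & ((1 << self.level) - 1) == 0

    def _split(self):
        self.level += 1
        self.hashes = {h for h in self.hashes if self._keeps(h)}

    def add(self, value):
        h = self._hash(value)
        if self._keeps(h):
            self.hashes.add(h)
            while len(self.hashes) > self.max_size:
                self._split()

    def merge(self, other):
        # Combine two synopses (e.g. two partitions) without rescanning data:
        # bring both to the deeper split level, then take the union.
        merged = Synopsis(min(self.max_size, other.max_size))
        merged.level = max(self.level, other.level)
        merged.hashes = {h for h in self.hashes | other.hashes
                         if merged._keeps(h)}
        while len(merged.hashes) > merged.max_size:
            merged._split()
        return merged

    def estimate(self):
        # Each retained hash represents 2**level distinct values.
        return len(self.hashes) * (1 << self.level)


if __name__ == "__main__":
    part1, part2 = Synopsis(), Synopsis()
    for v in range(600):
        part1.add(v)
    for v in range(400, 1000):
        part2.add(v)
    print(part1.merge(part2).estimate())
```

While the number of distinct values stays below the limit, the synopsis never splits and the estimate is exact. Beyond that it becomes a scaled-up sample count. The `merge` step is the part that corresponds to the partition case: a table-level estimate can be derived from the stored per-partition synopses, so only the partition that changed has to be scanned again.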
The upside to starting the day at 2:00 am in Dallas, by the way: no jet lag when I got home!
Update Dec 2010:
Just in from an OTN thread and Greg Rahn: there is a bug relating to synopses and approximate NDV that shows up with partitioned tables and incremental stats, leading to very long stats collection times. The bug number is 8310339, and Greg Rahn recommends applying the fix for bug 8719831.
Update Jan 2012:
A recent OTN note highlights a problem with the approximate NDV code when collecting statistics on external tables. MOS notes 1290722.1 and 1305127.1 are relevant.