Oracle 11g dbms_stats AUTO sample size tips
Oracle Database Tips by Donald BurlesonJanuary 23, 2015
See this link for a complete treatment of
11g enhancements to statistics collection.
Providing the Oracle cost-based optimizer with information
about dynamic tables has always been a problem. In large databases, analyzing
50% of the rows in a table could take many hours, and in a dynamic environment
the changes to the table will make the dbms_stats metadata sub-optimal,
no matter how often we re-analyze CBO statistics. This was one reason for
implementing the 11g new feature of dynamic sampling, but Oracle has improved
dbms_stats again in Oracle 11g.
This article notes that 11g has improved the AUTO parameter in dbms_stats.
?A fixed sampling percentage size
that was good at some point in time may not be appropriate after the data
distribution in the table has changed.
On the other hand when AUTO value
is used Oracle will adjust the sample size when the data distribution changes.?
An issue of skew
A one-sized-fits all approach to sample sizes has issues,
especially when the sample size is too small to recognize a skewed distribution
of values. Taking too small a sample from a table with skewed column values can
cause the SQL optimizer to generate sub-optimal execution plans:
?When AUTO is used Oracle picks a
sample size where the statistics quality is good enough. However, it does not
work very well under cases of extreme skew in the data.?
Choosing the right sample size is not limited to Oracle CBO
statistics; it's an issue with sampling a statistically representative sample
from any universe.. This issue has been widely studied by statisticians across
America, and it's nice to see that Oracle has chosen to change the AUTO sample
size to be more robust:
?In Oracle 11g, we improved the
behavior when the AUTO value is used. First, AUTO sampling now generates
deterministic statistics. Second, and more importantly, AUTO sampling generates
statistics that are almost as accurate as 100% sampling but takes much less time
than 100% sampling.?
Bottom line: The Oracle 11g dbms_stats AUTO sample size
runs 10 times faster than 100% sampling and yields the statistics that
asymptotically approach a time-consuming 100% sample.
If you like Oracle tuning, you
might enjoy my book "Oracle
Tuning: The Definitive Reference", with 950 pages of tuning tips and
You can buy it direct from the publisher for 30%-off and get instant
access to the code depot of Oracle tuning scripts.