Fixing unexpected performance degradation caused by sub-optimal optimizer
statistics
Oracle Database Tips by Donald Burleson
Question: "My performance goes from good to bad as my data changes and my underlying execution plans change. How do I get a guaranteed optimal execution plan for my SQL?"
This is a very important and common question relating to performance instability that results from Oracle's semi-automated mechanisms for refreshing the SQL optimizer statistics:
- Dynamic sampling - When enabled, SQL may frequently change execution plans based on a quick, on-the-fly sample of the data distributions.
- Automatic 10g statistics refreshing - By default, 10g refreshes statistics nightly (via the GATHER_STATS_JOB) for any table where more than 10% of the rows have changed, and the refreshed statistics can silently change execution plans.
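As a quick sanity check, here is a minimal sketch showing how to see whether these two mechanisms are active on your instance (the parameter and job names are the standard 10g ones; verify them on your release):

   -- Is dynamic sampling in play?  (The 10g default level is 2.)
   select name, value
   from   v$parameter
   where  name = 'optimizer_dynamic_sampling';

   -- Is the automatic statistics collection job still enabled?
   select owner, job_name, enabled
   from   dba_scheduler_jobs
   where  job_name = 'GATHER_STATS_JOB';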
For any given SQL statement in your database, does there exist one, and only one, optimal execution plan? If you can generalize this for your whole database, then you are among the majority of shops with immutable access paths. If so, you may want to consider archiving your optimal execution plans and using them forever, regardless of future changes to the data distribution.
More details from the question: "We are running a 10gR2 RAC database that hosts an OLTP application, and we refresh our statistics on a daily basis."
First, ask yourself: "Do I really need to re-analyze my CBO statistics every day? Do my tables exceed the statistics refresh threshold (more than 10% of rows changed) daily, or am I doing it only as a 'best practice'?" Remember, the ONLY reason for analyzing your statistics is to CHANGE your SQL execution plans. The larger question becomes: "What do I do to guarantee an always-optimal SQL execution plan for my whole database?"
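Before answering that, it is worth checking whether your tables really do cross the 10% staleness threshold each day. Here is a hedged sketch; the schema name APP is hypothetical, and it assumes table monitoring is enabled (the 10g default):

   -- Flush in-memory DML counters to the dictionary, then compare
   -- the accumulated changes against the current row counts.
   exec dbms_stats.flush_database_monitoring_info;

   select
      m.table_owner,
      m.table_name,
      t.num_rows,
      m.inserts + m.updates + m.deletes                                      total_changes,
      round(100 * (m.inserts + m.updates + m.deletes)
               / greatest(t.num_rows, 1), 1)                                 pct_changed
   from
      dba_tab_modifications  m,
      dba_tables             t
   where
      t.owner       = m.table_owner
   and t.table_name  = m.table_name
   and m.table_owner = 'APP'          -- hypothetical schema name
   order by
      pct_changed desc;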
Theoretically, any "lag" in the updating of CBO statistics may result in "stale" statistics and a possible failure to change the execution plan to accommodate the change to the data. The classic example goes like this:
Consider an order processing table that is frequently accessed by a state_name column containing 50 possible distinct values. Depending on the specific state_name value in the SQL WHERE clause, the optimizer may need to change the execution plan:
- A popular value (select . . . WHERE state_name = 'California') might rightfully invoke a full-table scan.
- A low-popularity value (select . . . WHERE state_name = 'Idaho') would be better served with an index scan.
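It is a column histogram that lets the CBO tell these two cases apart. As a minimal sketch (the APP.ORDERS names are hypothetical), a histogram on the skewed column could be gathered like this:

   begin
      dbms_stats.gather_table_stats(
         ownname    => 'APP',
         tabname    => 'ORDERS',
         method_opt => 'FOR COLUMNS state_name SIZE 254',  -- max buckets in 10g
         cascade    => true);                               -- include the indexes
   end;
   /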
In cases of reentrant SQL within the library cache (e.g. SQL with host variables), we also have the "optimizer peeking" (bind variable peeking) behavior, which comes into play with cursor_sharing=similar.
More details from the question: "The problem is that there are a few large, volatile tables (lots of inserts and updates)."
Lots of inserts and updates are better than lots of DELETEs, which cause table and index fragmentation (i.e. logically deleted leaf nodes). Heavy INSERT/UPDATE activity will only extend the table in a uniform fashion.
More details from the question: "As the day progresses, the optimizer starts ignoring some of the indexes and forcing full-table scans."
OK, what do we know for 100% certain? The execution plans changed, right? So, what makes an execution plan change? Usually, default features like dynamic sampling, or not having disabled the "automatic" statistics collection mechanism in 10g.
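If you decide you want full control over when statistics (and therefore plans) change, a minimal sketch is to disable the 10g automatic statistics job, run as a DBA:

   -- Stop the nightly automatic statistics collection (10g job name).
   exec dbms_scheduler.disable('SYS.GATHER_STATS_JOB');

After this, statistics change only when you explicitly gather or import them.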
More details from the question: "As a temporary workaround, we refresh the stats on these specific tables throughout the day, but this is not really a solution we want to stick with, due to the load that gathering the stats puts on the system."
Remember, it's perfectly acceptable to save and re-use your statistics. Lots of shops undertake to create a set of statistics that works well for all of their SQL, save it, and rest assured that the execution plans will not change. You can also export optimized statistics to your TEST and DEV instances, so that the developers will have their SQL optimized as if it were running on the PROD database.
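Here is a hedged sketch of that workflow using DBMS_STATS; the schema name APP and the statistics table name PROD_STATS are hypothetical:

   -- 1. Create a table to hold the exported statistics.
   exec dbms_stats.create_stat_table(ownname => 'APP', stattab => 'PROD_STATS');

   -- 2. Export the current (known-good) schema statistics into it.
   exec dbms_stats.export_schema_stats(ownname => 'APP', stattab => 'PROD_STATS');

   -- 3. Move APP.PROD_STATS to the TEST/DEV database (exp/imp or a
   --    database link), then load the saved statistics into the dictionary there.
   exec dbms_stats.import_schema_stats(ownname => 'APP', stattab => 'PROD_STATS');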
For more details on saving and re-using optimal statistics, see my OTN article on SQL Best Practices.
The nature of data distribution
There are two general types of Oracle shops. Roughly 80% have uniformly distributed data: large tables remain large, and the distribution of values within a column remains unchanged.
On the other hand, roughly 20% of databases experience highly volatile data loads, where tables are small one day and huge the next, or where there is a "difference that makes a difference". In these databases, huge changes in the table data (usually associated with high DML) change the distribution of data values, necessitating a re-analysis of column histograms.
Histograms are also critical to the CBO's choice of an optimal table join order (hence the popularity of the ORDERED hint).
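As a small illustrative sketch (the table and column names are hypothetical), the ORDERED hint tells the CBO to join the tables in the order they appear in the FROM clause, which is one way shops pin a join order when histograms are missing or stale:

   select /*+ ordered */
          c.cust_name,
          o.order_total
   from   small_customers c,        -- driving table listed first
          large_orders    o
   where  o.cust_id    = c.cust_id
   and    c.state_name = 'Idaho';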
My advice: unless yours is the rare database that does not have immutable execution plans, it is a best practice to take a single, deep sample, determine the optimal placement of column histograms, and then save that statistics set as your "optimal" one, once and forever.
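A hedged sketch of that "sample deep once, then freeze" approach (the APP schema name is hypothetical; run it during a quiet window):

   begin
      -- Take one deep, high-quality sample with automatic histogram placement.
      dbms_stats.gather_schema_stats(
         ownname          => 'APP',
         estimate_percent => 100,
         method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
         cascade          => true);

      -- Lock the statistics so later jobs (including the automatic 10g job)
      -- cannot silently overwrite them and change execution plans.
      dbms_stats.lock_schema_stats(ownname => 'APP');
   end;
   /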
It's the same concept as "optimizer plan stability" (stored outlines), which is used to firm up execution plans during upgrades, when we want to test the database with our "old" execution plans before allowing the new release of Oracle to determine "new", and possibly sub-optimal, execution plans.
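A minimal sketch of capturing and re-using stored outlines (the category name OLD_PLANS is hypothetical, and the session needs the CREATE ANY OUTLINE privilege):

   -- Capture outlines into a named category while the workload runs.
   alter session set create_stored_outlines = old_plans;

   -- ... run the representative workload here ...

   alter session set create_stored_outlines = false;

   -- Later (e.g. after the upgrade), tell the optimizer to use the saved plans.
   alter session set use_stored_outlines = old_plans;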