 |
|
Oracle 10g adaptive threshold & automated corrective actions
Oracle Database Tips by Donald Burleson |
Ever since Oracle 9i
first allowed you to create a self tuning database, I've been
interested in
automating Oracle tuning, essentially programming the
adaptive threshold alert
rules and corrective actions. in my book "Creating
a Self-Tuning Oracle Database" ($9.95) I discuss scripting
automated tuning actions, and I have more elaborate scripts in
my 10g tuning book.
 |
I discuss exception (threshold) based tuning in my book "Oracle
Tuning: The Definitive Reference", over 900 pages
of BC's favorite tuning tips & scripts.
Oracle OEM provides alert thresholds for all of the
common metrics, but until 10g r2 you could not program
automatic corrective actions. |
One of my favorite features of
Oracle 10g OEM is the ability to create
thresholds for server metrics such as CPU and RAM
consumption. Prior to this functionality you had to write
your own
tools for performing Oracle threshold alerts and automated
corrective actions. Let's take a look at the new adaptive
threshold monitoring in 10g release 2, and see how you can do it
in earlier releases of Oracle.
Automatic adaptive
threshold monitoring
To understand
the adaptive threshold collection mechanism I recommend this
outstanding article. Also, the
Oracle documentation on adaptive thresholds notes that the
OEM interface has been enhanced to with more sophisticated
threshold metrics, now allowing two ways to monitor exception
thresholds for alerts:
-
Significance Level - This uses
metric percentages to trigger the alert, e.g. alert when
physical reads exceeds 95% of the threshold value.
-
Percentage of Maximum - Oracle
alerts based on your pre-defined threshold number.
"Once metric baselines are defined, they
can be used to establish alert thresholds that are statistically
significant and adapt to expected variations across time. For
example, you can define alert thresholds to be generated based
on significance level, such as the HIGH significance level
thresholds are values that occur 5 in 100 times.
Alternatively,
you can generate thresholds based on a percentage of the maximum
value observed within the baseline period. These can be used to
generate alerts when performance metric values are observed to
exceed normal peaks within that period."
This
Oracle document explains adaptive thresholds in plain
English and notes that the threshold will adjust according to
database load:
"Adaptive baselines help to
significantly improve the accuracy of performance alerting.
The adaptive baselines allow for improved manageability
since the thresholds adapt to changes in usage and load, and
the thresholds do not need to be reconfigured with these
changes in load patterns."
Here is a query to see the adaptive
threshold values from the data dictionary:
select
metric.metric_name,
metric.metric_unit,
parm.threshold_method,
parm.num_occurrences,
parm.warning_param,
parm.critical_param,
stat.compute_date,
stat.sample_count,
stat.average,
stat.minimum,
stat.maximum,
stat.sdev,
stat.pctile_25,
stat.pctile_50,
stat.pctile_75,
stat.pctile_90,
stat.pctile_95,
stat.est_pctile_99,
stat.est_pctile_999,
stat.est_pctile_9999
from
v$metricname
metric,
dbsnmparm.mgmt_bsln_statistics
stat,
dbsnmparm.mgmt_bsln_threshold_parms parm,
dbsnmparm.mgmt_bsln_datasources
ds,
dbsnmparm.mgmt_bsln_baselines
b
where
stat.bsln_guid = parm.bsln_guid
and
parm.bsln_guid = b.bsln_guid
and
parm.datasource_guid = ds.datasource_guid
and
stat.datasource_guid = ds.datasource_guid
and
ds.metric_id
= n.metric_id;
It's important to remember that this
adaptive threshold alert functionality has been around since the
earliest days of Oracle and it's
good to see that it has finally been placed into OEM.
Here are my notes on exception threshold based Oracle tuning:
Here we see some of the metrics allowed for
adaptive threshold monitoring in 10g r2, small, but it's a
start:

When setting the adaptive thresholds you can
choose the time period and the time grouping. The larger
the time grouping, the less granularity:

I first publishing about the display of
metric signatures by hour-of-the-day and day-of-the-week in my
book "High Performance Oracle tuning with STATSPACK", and I'm
glad to see that signature analysis has been incorporated into
the Oracle engine.
Many DBA's prefer to use the
Ion tool for this type
of predictive modeling because it is more mature and robust than
OEM:

Just like Ion, Oracle 10g release 2 OEM
now allows you to display your
adaptive threshold metrics over time, in a more
primitive method:

This ability to view repeating signatures
is an extremely valuable proactive Oracle tuning technique.
Here are my notes on proactive signature analysis:
Corrective Actions in Oracle 10g release 2
The Oracle documentation states that Oracle
is moving toward a full self-tuning database, with automatic
corrective actions being triggers when the threshold is
exceeded. However the corrective actions are brand new,
and I have scripts for creating a self-tuning database in my
book "Creating
a Self-Tuning Oracle Database" (only $9.95) with enhanced
details on adaptive threshold monitoring and automatic
corrective actions in my 10g Tuning book. My notes on
automated corrective actions describe the various methods
for automating Oracle tuning.