 |
|
Proactively Monitor Oracle performance via scripts and queries
Oracle Tips by Burleson Consulting
March 11, 2003, Updated March 22, 2007 |
With over
250 configuration parameters and thousands
of metrics to monitor, it’s no small task
for Oracle administrators to monitor the
overall health of their Oracle databases.
For a complete set of pre-tested Oracle
scripts, see the
Oracle script collection, 670 Oracle
scripts, designed for the senior Oracle
DBA. In Oracle 10g, we also have
the Automated Workload Repository (AWR),
a historical workload monitor, where you
can use Oracle scripts to
monitor
Oracle performance.
Oracle offers a variety of tools for
performance monitoring, but there are
quite a few of these as well. To be able
to effectively monitor the health of your
Oracle database, you’ll need to be
familiar with the following scripts and
queries:
- Data Buffer Alert
reports times when the data buffer hit
ratio falls below a preset threshold.
- Redo Log Space Requests Alert
can indicate trouble if the number of
requests are greater than zero. You may
want to increase the log_buffer
parameter if that occurs.
- Shared Pool Contention Alert
tells you when there is contention
within the shared pool and
locking-related problems.
- System Waits Alert query
interrogates the Oracle event structures
to locate events where there are
excessive waits due to contention.
- Library Cache Misses Alert
query looks for excessive library cache
miss ratios. When the library cache miss
ratio is greater than .02, you may want
to increase shared_pool_size as a
remedy.
- Database Writer Contention Alert
looks for values in summed dirty queue
length, write requests, and Database
Writer Utility (DBWR) checkpoints. When
the write request length is greater than
three or your DBWR checkpoint waits, you
need to look at tuning the database
writer processes.
- Data Dictionary Miss Ratio Alert
script will alert you to times when
requests for data dictionary metadata
are high. You can sometimes relieve the
problem by increasing the
shared_pool_sizeinit.ora parameter.
- Data Dictionary Object Alert
report can reveal internal contention
with the Oracle data dictionary and
times of high dictionary metadata
requests.
Take a closer look at Oracle scripts for
monitoring
Let’s take a closer look at how these
scripts work. The STATSPACK utility takes
time-based Oracle tuning information and
records it in over a dozen tables. These
table names mirror the v$ internal Oracle
views and have names like stats$sysstat
and stats$sql_summary. Knowing
this, you can write simple Oracle queries
that will show you trend-based performance
information. You can then take this
performance information and feed it into
predictive models, such as linear
regressions, which will accurately tell
you the most appropriate times to change
the internal structure of your System
Global Area (SGA).
Here is an example of the use
of this performance information. This
script produces a running total of the
library cache miss ratio over time and
references the stats$librarycache
table.
set lines 80;
set pages 999;
column mydate heading 'Yr. Mo Dy Hr.' format a16
column c1 heading "execs" format 9,999,999
column c2 heading "Cache Misses|While Executing" format 9,999,999
column c3 heading "Library Cache|Miss Ratio" format 999.99999
break on mydate skip 2;
select
to_char(snap_time,'yyyy-mm-dd HH24') mydate,
sum(new.pins-old.pins) c1,
sum(new.reloads-old.reloads) c2,
sum(new.reloads-old.reloads)/
sum(new.pins-old.pins) library_cache_miss_ratio
from
stats$librarycache old,
stats$librarycache new,
stats$snapshot sn
where
new.snap_id = sn.snap_id
and
old.snap_id = new.snap_id-1
and
old.namespace = new.namespace
group by
to_char(snap_time,'yyyy-mm-dd HH24');
The output of the script, indicating that
you need to schedule additional RAM memory
for the shared_pool_size during
this period, either via a cron job
or dbms_job, appears below:

In Oracle 10g, this
query becomes easier. Let's see a sample custom AWR query by
starting with a simple query to plot the 'user I/O wait time' statistic for each
AWR snapshot. From phys_reads.sql script we can see that it is easy to
extract the physical read counts from the AWR.break on begin_interval_time skip 2
column phyrds format
999,999,999
column begin_interval_time format a25
select
begin_interval_time,
filename,
phyrds
from
dba_hist_filestatxs
natural join
dba_hist_snapshot
;
Below we see a running total of Oracle physical reads from phys_reads.sql.
Note that the snapshots are collected every half-hour in this example, and many
DBAs will increase the default collection frequency of AWR snapshots. Starting
from this script, we could easily add a where clause criteria and create a
unique time-series exception report.
SQL> @phys_reads
BEGIN_INTERVAL_TIME
FILENAME PHYRDS
------------------------- ---------------------------------------- ------
24-FEB-04 11.00.32.000 PM E:\ORACLE\ORA92\FSDEV10G\SYSTEM01.DBF 164,700
E:\ORACLE\ORA92\FSDEV10G\UNDOTBS01.DBF 26,082
E:\ORACLE\ORA92\FSDEV10G\SYSAUX01.DBF 472,008
E:\ORACLE\ORA92\FSDEV10G\USERS01.DBF
1,794
E:\ORACLE\ORA92\FSDEV10G\T_FS_LSQ.ORA
2,123
Here are some other notes on monitoring with 10g AWR
scripts:
Dynamic performance
reconfiguration
Table A gives a high-level view of
some of the major events that are used to
trigger a dynamic tuning reconfiguration.
For the purpose of illustration, I will
focus only on the major areas within the
SGA appearing in the table.
Table A
|
RAM Area |
Too small condition
|
Too large condition
|
|
Shared pool |
Library cache misses |
No misses |
|
Data buffer cache |
Hit ratio < 90% |
Hit ratio > 95% |
|
PGA aggregate |
High multipass executions
|
100% optimal executions
|
|
Major reconfiguration
triggers
Obviously, a high amount of library cache
misses indicates that the shared pool is
too small, while a data buffer hit ratio
of less than 90 percent for any one of
Oracle’s seven data buffer pools indicates
that you should take memory from other
database regions and reallocate it to the
data buffers. For sorting activity, you’ll
take a look at the percentage of optimal
executions within the Program Global Area
(PGA), and increase the value of the PGA
aggregate target parameter whenever you
experience less than 95 percent optimal
executions for sorting operations.
While the rules for the data buffer caches
and shared pool sizing are
straightforward, the new
pga_aggregate_target parameter
warrants further investigation. As a
general rule, you’ll look at changing the
value for pga_aggregate_target when
the following occurs:
- Whenever the value of the
v$sysstat statistic estimated PGA
memory for one-pass exceeds
pga_aggregate_target, then you’ll
want to increase pga_aggregate_target.
- Whenever the value of the
v$sysstat statistic workarea
executions—multipass is greater than
1 percent, the database may benefit from
additional RAM memory.
You can overallocate PGA memory and you
may consider reducing the value of
pga_aggregate_target whenever the
value of the v$sysstat row
workarea executions—optimal
consistently measures 100 percent.
As you can see, the proactive monitoring
of an Oracle database can get quite
complicated. With hundreds of metrics and
parameters to monitor and reset, Oracle
tuning can be extremely challenging. But
with knowledge of Oracle’s performance
measures and major reconfiguration
triggers, you can begin to sort things
out.
Also see these related notes on
using Oracle scripts:
 |
If you like Oracle tuning, see the book "Oracle
Tuning: The Definitive Reference", with 950 pages of tuning tips and
scripts.
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts. |