Home
E-mail Us
Oracle Articles
New Oracle Articles

Oracle Training
Oracle Tips
Oracle Forum
Class Catalog

Remote DBA
Oracle Tuning
Emergency 911
RAC Support
Apps Support
Analysis
Design
Implementation
Oracle Support

SQL Tuning
Security
Oracle UNIX
Oracle Linux
Monitoring
Remote support
Remote plans
Remote services
Application Server
Applications
Oracle Forms
Oracle Portal
App Upgrades
SQL Server
Oracle Concepts
Software Support
Remote Support
Development
Implementation

Consulting Staff
Consulting Prices
Help Wanted!

Oracle Posters
Oracle Books
Oracle Scripts
Ion
Excel-DB

Don Burleson Blog

Oracle UNIX STATSPACK Report on Specific I/O Activity Administration

Oracle UNIX/Linux Tips by Burleson Consulting

A STATSPACK Report on Specific I/O Activity

If the DBA is prudent in segregating Oracle objects into distinct tablespaces and datafiles, STATSPACK can be used to create extremely useful reports that show individual I/O or selected datafiles or groups of related datafiles.

The script below accepts a filename ?mask? that can be used to report on selected groups of related datafiles. For example, if we have named our customer-related datafiles customer.dbf, custhistory.dbf, and custorders.dbf, the following script can be run to report on all datafile names that contain the string ?cust?. In the example below, we execute the script with the filename mask to see the I/O history for these datafiles.

rpt_file_io.sql
set pages 9999;

column snapdate format a16
column filename format a40
column mydate heading 'Yr. Mo Dy Hr.' format a16

select
   to_char(snap_time,'yyyy-mm-dd') mydate,
--   old.filename,
   sum(new.phyrds-old.phyrds)   phy_rds,
   sum(new.phywrts-old.phywrts) phy_wrts
from
   perfstat.stats$filestatxs old,
   perfstat.stats$filestatxs new,
   perfstat.stats$snapshot   sn
where
   new.snap_id = sn.snap_id
and
   old.filename = new.filename
and
   old.snap_id = sn.snap_id-1
and
   (new.phyrds-old.phyrds) > 0
and
   old.filename like '%&1%'
group by
   to_char(snap_time,'yyyy-mm-dd'),
   old.filename
;

Here is the output from this script, showing total read and write I/O per day for our cust datafiles:

2000-12-12              833       2770
2000-12-13                6          9
2000-12-14                2         80
2000-12-15                2         26
2000-12-16                2          4
2000-12-17                2          3
2000-12-18                7        226
2000-12-19               87        556
2000-12-20              141        640
2000-12-21               26        452
2000-12-22               45        368
2000-12-23               10        115
2000-12-24                3         14
2000-12-25                5         54
2000-12-26              169        509
2000-12-27               14        101
2000-12-28               25        316
2000-12-29               13        132
2000-12-30                7        158
2000-12-31                2        129
2001-01-01                4        264
2001-01-02               57        756
2001-01-03               56        317
2001-01-04             1110        123
2001-01-05             1075        386
2001-01-06               20        293
2001-01-07                1          6
2001-01-08              955      1774
2001-01-09              247       1145
2001-01-10              538       1724
2001-01-11              387       1169
2001-01-12             1017       1964
2001-01-13              115        397
2001-01-14               89        443
2001-01-15               22        125
2001-01-16             1267       1667
2001-01-17              646       2082
2001-01-18              588       2359
2001-01-19               46        296

Once gathered, this data can be graphed (see Figure 4-14) to see the detailed activity of the tables and indexes within these datafiles.

Figure 14: File I/O for a selected subset of Oracle datafiles

Often, the graphical representation of the data is more useful, because the unique I/O signature of the data becomes obvious.

Next, let?s examine some STATSPACK tools that can be used to identify potential disk bottlenecks.

A STATSPACK Script to Identify Hot Datafiles

The first step in balancing the load on disks is to find out where they're out of balance by identifying possible bottlenecks. Start by identifying hot disks?those with a disproportionate amount of activity. For example, if one disk in a ten-disk system were experiencing 50 percent of the I/O, measured as the number of reads, writes, or both, you would consider the disk to be hot.

Detecting I/O-Related Slowdowns in AIX

If you are using the IBM AIX operating system, it is easy to detect when a database server may be experiencing I/O slowdowns. An I/O bound database server is usually evidenced by a high value in the wa (wait) column of the UNIX vmstat utility. For example, in the output below we see that 45 percent of the CPU time is being used waiting for database I/O:

Prompt> vmstat 5 1

kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
r b   avm   fre re pi po fr   sr cy in   sy cs us sy id wa
0 0 217485   386 0   0   0   4   14   0 202 300 210 14 19 22 45

The Approach to Locating Hot Disks

For other operating environments, we are concerned whenever we see a backlog of I/O tasks waiting to access data on a single disk. For other operating systems, the iostat utility can be used to detect I/O issues.

Once you've identified the hot disks, look closely to find out which files and tables on the disks experience most of the activity, so that you can move them to less-active disks as needed. The actual process of identifying hot files and disks involves running data collection utilities, such as STATSPACK and the UNIX iostat utility, and then using the collected I/O data to pinpoint the sources of excessive I/O measurements.

Here are the cardinal rules for disk I/O:

* There is a difference between a busy disk and a disk that is waiting for I/O to complete. In the next section we will explore the UNIX iostat utility and show how you can identify busy disks.

* If you are using RAID such as RAID 0+1, the Oracle data blocks will be spread randomly across all of the disks, and load will rise and fall in a uniform fashion.

* Senior Oracle DBAs often prefer not to implement RAID striping so that they have more control over the disk I/O subsystem.

* Many disk arrays such as EMC provide sophisticated disk monitoring tools such as Open Symmetrics Manager and Navistar. These tools report on more than simple disk waits, and highlight contention for disks, channels, and disk adapters.

Now that we understand the basic principles behind locating hot disks, ???

Viewing I/O Signatures with STATSPACK

You will find that your database will develop distinctive I/O signatures. The I/O signature for an OLTP database will be very different than that of a data warehouse, and you can use these I/O signatures to determine regular times when the disk I/O subsystem is overloaded. When we aggregate disk information by day of the week and hour of the day, we can see some very interesting patterns.

Note: When developing I/O signatures for your database, it is very important to begin at the global level and drill down for successive detail. For example, after running the global reports, you will find spikes in your database I/O during specific times. Your next step should be to isolate these I/O spikes to specific Oracle database files by closer inspection of the STATSPACK data, and running the rpt_io_pct.sql script to report on specific datafiles.

Let?s begin by taking a look at a STATSPACK script to average disk read and write activity by the day of the week.

Rpt_avg_io_dy.sql
set pages 9999;

column reads format 999,999,999
column writes format 999,999,999

select
   to_char(snap_time,'day'),
   avg(newreads.value-oldreads.value) reads,
   avg(newwrites.value-oldwrites.value) writes
from
   perfstat.stats$sysstat oldreads,
   perfstat.stats$sysstat newreads,
   perfstat.stats$sysstat oldwrites,
   perfstat.stats$sysstat newwrites,
   perfstat.stats$snapshot   sn
where
   newreads.snap_id = sn.snap_id
and
   newwrites.snap_id = sn.snap_id
and
   oldreads.snap_id = sn.snap_id-1
and
   oldwrites.snap_id = sn.snap_id-1
and
oldreads.statistic# = 40
and
newreads.statistic# = 40
and
oldwrites.statistic# = 41
and
newwrites.statistic# = 41
having
   avg(newreads.value-oldreads.value) > 0
and
   avg(newwrites.value-oldwrites.value) > 0
group by
   to_char(snap_time,'day')
;

The output from the script will take a running average by the day of the week and display the output as follows:

TO_CHAR(S        READS       WRITES
--------- ------------ ------------
friday              72        2,093
monday             221        8,896
saturday           211        5,869
sunday             160        5,056
thursday           338        7,232
tuesday            603       11,765
wednesday          316        7,781

This output can then be pasted into an Excel spreadsheet, resequenced, and displayed using the Excel Chart Wizard. In Figure 4-15, we see the I/O signature for an Oracle database. Note that this signature clearly shows peak write activity on Mondays, Tuesdays, and Wednesdays. From this signature, the Oracle DBA knows that this database is loaded during the first part of each week.

Figure 15: Average I/O signature by day of the week

Now, let?s see how easy it is to change this report to aggregate the data by the hour of the day. The script below is identical to the aggregate averages by day of the week except that the date format string has been changed from ?day? to ?HH24?.

rpt_avg_io_hr.sql
set pages 9999;

column reads format 999,999,999
column writes format 999,999,999

select
   to_char(snap_time,'HH24'),
   avg(newreads.value-oldreads.value) reads,
   avg(newwrites.value-oldwrites.value) writes
from
   perfstat.stats$sysstat oldreads,
   perfstat.stats$sysstat newreads,
   perfstat.stats$sysstat oldwrites,
   perfstat.stats$sysstat newwrites,
   perfstat.stats$snapshot   sn
where
   newreads.snap_id = sn.snap_id
and
   newwrites.snap_id = sn.snap_id
and
   oldreads.snap_id = sn.snap_id-1
and
   oldwrites.snap_id = sn.snap_id-1
and
oldreads.statistic# = 40
and
newreads.statistic# = 40
and
oldwrites.statistic# = 41
and
newwrites.statistic# = 41
having
   avg(newreads.value-oldreads.value) > 0
and
   avg(newwrites.value-oldwrites.value) > 0
group by
   to_char(snap_time,'HH24')
;

Now, when we execute this script we see the read and write averages displayed by the hour of the day. Again, we can paste this output into a spreadsheet and create a graphical representation, thereby getting a visual picture of the I/O signature.

TO        READS       WRITES
-- ------------ ------------
00          250        6,103
02          180        4,701
03          174        4,580
04          195        5,832
05          191        5,109
06          171        4,669
07          221        4,727
08          354        5,353
09          264        9,531
10          258        7,994
11          249        7,397
12          364        8,499
13          341        7,902
14          326        8,288
15          305       10,891
16          279        9,019
17          692       17,291
18          592       10,444
19          448        9,911
20          385        8,247
21          395       11,405
22          366        9,182
23          271        7,308

The graph in Figure 4-16 is a graphical representation of the I/O signature of physical reads by hour of the day. Here we see a clear daily trend where the read activity increases throughout the afternoon and a high peak of read activity every day at 6:00 P.M. This information can be extremely valuable to the Oracle DBA. In this example, the DBA could encourage the end-user community to direct their processing to periods before 5:00 P.M.

Figure 16: Oracle physical read activity averages by hour of the day

We can also plot the physical write activity in a graph, as shown in Figure 4-17 In this case, we see a gradual pattern of increasing writes to the database, peaking in the late afternoon. This pattern would be confirmed by an increase in the number of archived redo logs generated later in the day.

Figure 17: Oracle physical write activity averages by hour of the day

Now that we understand how to plot UNIX disk statistics, let?s wrap-up this chapter with a review of the major topics.