Oracle UNIX Monitoring Database Writers with STATSPACK Administration

Oracle UNIX/Linux Tips by Burleson Consulting

Monitoring the Database Writers with STATSPACK

We can begin our journey by looking at the stats$sysstat table. There are numerous statistics that STATSPACK keeps in this table that provide information about the DBWR behavior.

sql> select distinct name from stats$sysstat where name like ?DBWR%?

NAME
----------------------------------------------------------------
DBWR Flush object call found no dirty buffers
DBWR Flush object cross instance calls
DBWR buffers scanned
DBWR checkpoint buffers written
DBWR checkpoint write requests
DBWR checkpoints
DBWR cross instance writes
DBWR free buffers found
DBWR incr. ckpt. write requests
DBWR lru scans
DBWR make free requests
DBWR revisited being-written buffer
DBWR skip hot writes
DBWR summed scan depth
DBWR timeouts
DBWR transaction table writes
DBWR undo block writes

Most of these values are of no interest, but a few of them are quite useful. Let?s look at the functions of some of the useful values:

* DBWR checkpoints?This is the number of checkpoint messages that were sent to the DBWR from Oracle. During checkpoint processing, the log writer hands over to the DBWR a list of modified blocks that are to be written to disk.

* DBWR buffers scanned?This is the number of buffers looked at when scanning for dirty buffers to write to the database. This count includes all inspected buffers, including both dirty and clean buffers.

* Summed dirty queue length?This is the sum of the queue length after every write request has completed.

* Write requests?This is the total number of write requests that were made by Oracle to the database writers.

The main task is determining if the default configuration for the database writers is sufficient for your database. The summed dirty queue length and write requests are the two metrics in STATSPACK that are useful for measuring the efficiency of the DBWR background processes.

By dividing the summed dirty queue length by the number of write requests, you can get the average length of the queue following the completion of the write.

The following STATSPACK query will measure the dirty queue length for the time period between each snapshot. Any value above 100 indicates a shortage of DBWR processes.

rpt_dbwr_alert.sql
-- Written by Donald K. Burleson 1/25/01

set pages 999;

column c1 heading "Write request length" format 9,999.99
column c2 heading "Write Requests" format 999,999
column c3 heading "DBWR checkpoints" format 999,999
column mydate heading 'Yr. Mo Dy Hr.' format a16

select distinct
   to_char(snap_time,'yyyy-mm-dd HH24') mydate,
   a.value/b.value                      c1,
   b.value                              c2,
   c.value                              c3
from
   stats$sysstat a,
   stats$sysstat b,
   stats$sysstat c,
   stats$snapshot sn
where
   sn.snap_id = a.snap_id
and
   sn.snap_id = b.snap_id
and
   sn.snap_id = c.snap_id
and
   a.name = 'summed dirty queue length'
and
   b.name = 'write requests'
and
   c.name = 'DBWR checkpoints'
and
   a.value > 0
and
   b.value > 0
and
   a.value/b.value > 3
;

Here is the output from this report. Here we see that the average queue length is quite small, ranging from 2 to 5. According to Oracle, you should only become concerned if the average queue length after writes is more than 50 blocks.

Yr. Mo Dy Hr. Write request length Write Requests DBWR checkpoints
---------------- -------------------- -------------- ----------------
2000-12-25 01                    4.71         20,103           44,016
2000-12-25 02                    4.62         20,520           44,260
2000-12-25 03                    4.51         21,023           45,235
2000-12-25 04                    4.31         22,002           47,198
2000-12-25 05                    4.13         22,948           49,134
2000-12-25 06                    3.96         23,902           51,055
2000-12-25 07                    3.81         24,867           52,991
2000-12-25 08                    3.67         25,808           54,913
2000-12-25 09                    3.54         26,731           56,797
2000-12-25 10                    3.42         27,667           58,673
2000-12-25 11                    3.31         28,618           60,622
2000-12-25 12                    3.20         29,580           62,544
2000-12-25 13                    3.10         30,524           64,489
2000-12-25 14                    3.01         31,492          66,418
2001-01-01 01                    4.70         13,492           31,992
2001-01-01 02                    4.37         14,481           34,007
2001-01-01 03                    4.09         15,486           36,032

We can easily extend the STATSPACK report to report on the average values, aggregated by hour of the day and day of the week. This will help the DBA identify trends in database write activity. Below is an example of the STATSPACK script that averages the queue length values by hour of the day:

rpt_dbwr_hr.sql
set pages 999;

column c1 heading "Write request length" format 9,999.99
column c2 heading "Write Requests" format 999,999
column c3 heading "DBWR checkpoints" format 999,999

select distinct
   to_char(snap_time,'HH24')            mydate,
   avg(a.value/b.value)                      c1
from
   stats$sysstat a,
   stats$sysstat b,
   stats$snapshot sn
where
   sn.snap_id = a.snap_id
and
   sn.snap_id = b.snap_id
and
   a.name = 'summed dirty queue length'
and
   b.name = 'write requests'
and
   a.value > 0
and
   b.value > 0
group by
   to_char(snap_time,'HH24')
;

Here is the output from this script. We can now easily take this output and plot a graphical representation on the data from an Excel spreadsheet.

             Hr. Write request length
---------------- --------------------
00                               1.11
01                               2.60
02                               2.51
03                               2.43
04                               1.99
05                               1.91
06                               1.84
07                               1.55
08                                .96
09                                .98
10                                .80
11                                .75
12                                .76
13                                .74
14                                .74
15                                .71
16                                .61
17                                .99
18                                .97
19                                .93
20                                .86
21                                .89
22                                .86
23                                .95

Here we see that the DBWR is busiest in the early morning hours between midnight and 8:00 A.M. This is because this database does its batch updates during this processing window.

We can slightly alter the above script and aggregate the average queue length, summarized by the day of the week. Below, we take the averages and group them by day.

rpt_dbwr_dy.sql
Set pages 999;

column c1 heading "Write request length" format 9,999.99
column c2 heading "Write Requests" format 999,999
column c3 heading "DBWR checkpoints" format 999,999
column mydate heading ?Day of Week?

select distinct
   to_char(snap_time,'day')              mydate,
   avg(a.value/b.value)                      c1
from
   stats$sysstat a,
   stats$sysstat b,
   stats$snapshot sn
where
   sn.snap_id = a.snap_id
and
   sn.snap_id = b.snap_id
and
   a.name = 'summed dirty queue length'
and
   b.name = 'write requests'
and
   a.value > 0
and
   b.value > 0
group by
   to_char(snap_time,'day')
;

Here is the output. Again, it is simple to create a graph from this output.

Day of week      Write request length
---------------- ---------------------------
friday                               .18
monday                           2.31
saturday                            .02
sunday                            1.96
thursday                          1.53
tuesday                             .43
wednesday                        .10

Figure 4-3 shows the graph. Here, we see that the overall efficiency of the database writer is fine, but the peak write times are on Monday Wednesday and Sunday.

Please insert STATSPACK book figure 9-14 here

Figure 3: Average queue length after write completion by day of week

In summary, the database writer processes will work fine for most all Oracle databases without modification. However, when you detect that the summed dirty queue length is too high, you can look at increasing the number of database writer processes.

Next, let?s examine Oracle file organization techniques.

Oracle File Organization Techniques

Regardless of whether or not you use RAID, it is very important for the Oracle DBA to identify all high-volume and high-activity tables and move them into isolated tablespaces. By keeping the high-volume tables in a separate tablespace, the Oracle administrator can manipulate the datafiles in the tablespace to minimize I/O contention on the disk, as shown in Figure 4-4.

Figure 4: Segregating Oracle tables into separate tablespaces

Without segregation, some tablespaces may have hundreds of tables and indexes, and it is impossible to tell which objects are experiencing the high I/O. The stats$filestatxs table will provide details about read and write I/O at the file level, but it is often difficult to tell the tables that are causing the high I/O because a file may contain many objects.

With segregation, the DBA can generate STATSPACK file I/O reports from the stats$filestatxs table that show all read and write I/O for the datafile. If the Oracle datafile maps to only one table, we can easily see the total I/O for the table. Later in this chapter we will see a STATSPACK script called rpt_io_pct.sql that serves this purpose.

Because of the high-speed transaction-oriented nature of most Oracle applications, we generally see high activity in specific areas within each tablespace. While this chapter describes the basics of I/O load balancing for Oracle datafiles and tablespaces, the settings for the individual tables will also have a profound influence on the performance of the entire database. We will look at disk I/O within Oracle and examine several areas:

* Transient disk hot spots

* Disk I/O patterns within highly active tables