In UNIX you can control whether a file system uses buffered or unbuffered IO. With Oracle, a buffered filesystem is both redundant and dangerous. A good example of the danger is a power loss: the buffered filesystem depends on the cache battery to provide enough power for the buffer to be written to disk before the disk spins down. However, many shops fail to monitor the cache battery lifetime limitations, or fail to change the batteries at all, and this can result in lost data in a buffered filesystem when power is lost.
You can turn off buffered writes in several ways (buffered reads aren't an issue, but you should always use write-through caching). One is to mount the filesystems used for Oracle files as non-buffered, using options such as the following (an illustrative mount entry follows the list):
AIX: "dio", "rbrw", "nointegrity"
SUN: "delaylog", "mincache=direct", "convosync=direct", "nodatainlog"
LINUX: "async", "noatime"
HP: Use VxFS with: "delaylog", "nodatainlog", "mincache=direct", "convosync=direct"
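As an illustration only (the disk group, volume, and mount point names below are made up, and exact syntax varies by platform and VxFS release), a Solaris /etc/vfstab entry for an Oracle data filesystem using the VxFS options above might look like:
/dev/vx/dsk/oradg/oradata /dev/vx/rdsk/oradg/oradata /u02/oradata vxfs 2 yes delaylog,nodatainlog,mincache=direct,convosync=direct
Verify the option names against the mount_vxfs man page for your release before changing production filesystems.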
Using Direct IO at the Oracle Level
For information about
Oracle direct I/O, refer to this URL by Steve Adams:
* http://www.ixora.com.au/notes/filesystemio_options.htm
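If you just want to see what Oracle itself is requesting, a quick check (this assumes sqlplus is in your PATH and you can connect as SYSDBA) is something like:
sqlplus -s "/ as sysdba" <<EOF
show parameter filesystemio_options
EOF
A value of DIRECTIO or SETALL means Oracle will request direct I/O on filesystems that support it; NONE means it will not, regardless of how the filesystem is mounted.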
Checking Your Server
Methods for configuring the OS vary depending on the operating system and file system in use. Here are some quick checks anyone can perform to ensure that you are using direct I/O (a command-line spot check follows the list):
* Solaris - Look for a "forcedirectio" option. Oracle DBAs often find this option makes a huge difference in I/O speed on Sun servers. Here is the Sun documentation:
http://docs.sun.com/db/doc/816-0211/6m6nc6713?a=view
* AIX - Look for a "dio" option. Here is a great link for AIX direct I/O:
http://www-106.ibm.com/developerworks/eserver/articles/DirectIO.html
* Veritas VxFS (including HP-UX, Solaris, and AIX) - Look for "convosync=direct". It is also possible to enable direct I/O on a per-file basis using Veritas QIO; refer to the "qiostat" command and its man page for hints. For HP-UX, see Oracle on HP-UX - Best Practices.
* Linux - Linux supports direct I/O on a per-filehandle basis (which is much more flexible), and I believe Oracle enables this feature automatically. Someone should verify at what release Oracle started to support this feature (it is called O_DIRECT). See Kernel Asynchronous I/O (AIO) Support for Linux and this great OTN article: Talking Linux: OCFS Update.
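A quick, non-destructive spot check on most UNIX and Linux systems (the option strings below are just the ones discussed above; yours may differ) is to list the current mounts and look for the direct I/O options:
mount | egrep -i 'forcedirectio|convosync=direct|mincache=direct'
If nothing comes back for the filesystems holding your Oracle files, they are probably still mounted buffered.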
I'm Using LINUX and ATA Arrays, No Stress, but IO Is Slow!
Don't panic! Most LINUX kernels will take the default ATA interface setpoints that were the "standard" when the kernel was built (or even older ones). This can be corrected. In LINUX there is the hdparm command, which allows you to reset how ATA drives are accessed by the operating system. Using hdparm is simple, and with it I have seen 300% improvement in access speeds of various ATA drives. Let's go through a quick tuning sequence.
First, we will use the hdparm command with no arguments other than the full path to the disk device:
[root@aultlinux2 root]# hdparm /dev/hdb
/dev/hdb:
 multcount    = 16 (on)
 IO_support   = 0 (default 16-bit)
 unmaskirq    = 0 (off)
 using_dma    = 0 (off)
 keepsettings = 0 (off)
 readonly     = 0 (off)
 readahead    = 8 (on)
 geometry     = 77557/16/63, sectors = 78177792, start = 0
Run with no arguments other than the disk device, hdparm gives the current settings for the drive. You should compare these to the specifications for your drive. You may find that direct memory access (DMA) is not being used, readahead is too small, you are only using 16-bit IO when you should be using 32-bit, and so on.
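To see what the drive itself claims to support (so you have something to compare those settings against), hdparm can also report the drive's identification data; the -i option shows the information gathered at boot, and -I on newer versions queries the drive directly:
[root@aultlinux2 root]# hdparm -i /dev/hdb
Look at fields such as MaxMultSect and the supported PIO/DMA modes in that output when deciding which settings to change.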
Next, let's do a basic benchmark of the drive's current performance. You do this using the hdparm -Tt option (for all options, do a "man hdparm" at the command line).
[root@aultlinux2 root]# hdparm -Tt /dev/hdb
/dev/hdb:
 Timing buffer-cache reads:   128 MB in  1.63 seconds = 78.53 MB/sec
 Timing buffered disk reads:   64 MB in 14.20 seconds =  4.51 MB/sec
Now let's adjust the settings. The -c option, when set to 1, enables 32-bit IO. The -u option gets or sets the interrupt-unmask flag for the drive; a setting of 1 permits the driver to unmask other interrupts during processing of a disk interrupt, which greatly improves Linux's responsiveness and eliminates "serial port overrun" errors. Use this feature with caution on older kernels: some drive/controller combinations do not tolerate the increased I/O latencies possible when it is enabled, resulting in massive filesystem corruption. However, most versions (RedHat 2.1 and greater) using modern controllers don't have this issue. The -p option is used to autoset the PIO mode, and -d is used to set or unset DMA.
[root@aultlinux2 root]# hdparm -c1 -u0 -p -d0 /dev/hdb
/dev/hdb:
 attempting to set PIO mode to 0
 setting 32-bit IO_support flag to 1
 setting unmaskirq to 0 (off)
 setting using_dma to 0 (off)
 IO_support   = 1 (32-bit)
 unmaskirq    = 0 (off)
 using_dma    = 0 (off)
So we turned on 32-bit mode and left DMA off. Let's see the resulting performance change using our previous -Tt option.
[root@aultlinux2 root]# hdparm -Tt /dev/hdb
/dev/hdb:
 Timing buffer-cache reads:   128 MB in 1.63 seconds = 78.53 MB/sec
 Timing buffered disk reads:   64 MB in 9.80 seconds =  6.53 MB/sec
So we didn't change the buffer-cache read timings; however, we improved the buffered disk reads by 45%. Let's tweak some more and see if we can do better. The -m option sets the multi-sector IO count for the drive, the -c option sets 32-bit IO support (a value of 3 means 32-bit with sync), -X sets the transfer mode to mdma2 (multiword DMA mode 2), -d1 turns on direct memory access, -a8 adjusts the filesystem readahead to improve performance for large reads, and -u1 turns on the unmasking operation described above.
[root@aultlinux2 root]# hdparm -m16 -c3 -X mdma2 -d1 -a8 -u1 /dev/hdb
/dev/hdb:
 setting fs readahead to 8
 setting 32-bit IO_support flag to 3
 setting multcount to 16
 setting unmaskirq to 1 (on)
 setting using_dma to 1 (on)
 setting xfermode to 34 (multiword DMA mode2)
 multcount    = 16 (on)
 IO_support   = 3 (32-bit w/sync)
 unmaskirq    = 1 (on)
 using_dma    = 1 (on)
 readahead    = 8 (on)
So now let's see what we have done to performance, again using the -Tt option.
[root@aultlinux2 root]# hdparm -Tt /dev/hdb
/dev/hdb:
 Timing buffer-cache reads:   128 MB in 1.56 seconds = 82.05 MB/sec
 Timing buffered disk reads:   64 MB in 4.29 seconds = 14.92 MB/sec
Not bad! We improved buffer-cache reads by 5% and buffered disk reads by 231%! These options can then be loaded into a startup file so they become part of the system startup (an example follows).
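One common way to do this on RedHat-style systems (this assumes your distribution still uses /etc/rc.d/rc.local; adjust for your init scheme) is simply to append the tuned command so it is re-applied at every boot:
[root@aultlinux2 root]# echo '/sbin/hdparm -m16 -c3 -X mdma2 -d1 -a8 -u1 /dev/hdb' >> /etc/rc.d/rc.local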
I'm Really Feeling SCSI About Disk Performance, What Then?
Sorry for the bad pun (well, actually I'm not). What can be done with SCSI interfaces? To tell you the truth, not a lot; however, there are some items you may find useful. Most interfaces buffer commands and issue them in batches; for example, many SCSI interfaces use a 32-command buffer that stacks commands until it has 32 of them and then fires them off. This can be adjusted in LINUX using options in the modules.conf file for the SCSI interface module, as sketched below.
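As a purely hypothetical example (the option name, syntax, and safe values are driver-specific; check your HBA driver's documentation before touching this), an /etc/modules.conf entry for an Adaptec aic7xxx controller that lowers the per-target tagged command queue depth might look something like:
options aic7xxx aic7xxx=tag_info:{{16,16,16,16,16,16,16,16}}
After editing modules.conf you would need to reload the driver module (or reboot) for the change to take effect.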
In other UNIX flavors there are many settings that can be changed, but you need an exact understanding of the interface, its limitations, and the current system load before changing any SCSI settings. If you feel you need to have them checked, ask your SA.
Disk Stress in a Nutshell
In summary, to determine whether a disk or array is undergoing IO-related stress, perform an IO balance and an IO timing analysis. If the IO timing analysis shows excessive read or write times, investigate the causes. Generally speaking, poor IO timings will result when:
* A single disk exceeds 110 - 150 IOs per second
* An entire multi-read capable RAID10 array exceeds #MIRRORS*#DPM*110 IOs per second
* An entire non-multi-read capable RAID10 array exceeds #DPM*110 IOs per second
* A RAID5 array exceeds (#DISKS-1)*66 IOs per second
In addition:
* Make sure Oracle is using direct IO at both the OS and Oracle levels
* Make sure your disk interface is tuned to perform optimally
(DPM = disks per mirror.) A worked example of these thresholds follows.
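As a quick sanity check using the formulas above (the array sizes here are made up for illustration), a couple of lines of shell arithmetic:
# Multi-read capable RAID10: 2 mirrors (#MIRRORS=2) of 4 disks each (#DPM=4)
MIRRORS=2; DPM=4
echo "RAID10 ceiling: $((MIRRORS * DPM * 110)) IOs per second"    # prints 880
# RAID5 with 6 disks (#DISKS=6)
DISKS=6
echo "RAID5 ceiling: $(( (DISKS - 1) * 66 )) IOs per second"      # prints 330
If your IO timing analysis shows a sustained rate above the ceiling for the corresponding array, expect the poor timings described above.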