This problem with concurrent access and RAID
arrays is one of the most prevalent in the industry. The ubiquitous
IO wait is usually the predominant wait event in any database system,
simply because IO to memory is in the nanosecond range
while IO to disk is in the millisecond range; when you add in
blocked access due to multi-disk IO requests, you get a snowball
effect that can cripple your IO subsystem.
Array manufacturers have begun to recognize this concurrent-access
problem and have increased the base stripe unit per disk to 64K,
matching the IO unit for many systems. Of course, systems such as
Sun and Windows now utilize maximum IO sizes of 1 megabyte or
larger, so once again the array manufacturers are playing catch-up
to the operating systems.
So what is our second rule of tuning disks? Based on the above
information, the rule is:
Always ensure that the primary IO size for your database system is
matched to the IO size of the disk array system.
Of course the inverse also holds true:
Always match the stripe unit per disk to the expected majority IO
request from your (database) application.
In the 1990s, Peter Chen of the University of California, Berkeley,
published a series of papers on tuning disk-array stripe
unit size based on expected concurrency. In these papers, Chen
and his associates determined that a disk's IO speed (as
measured by average seek time) and IO rate (as measured in megabytes
per second) determine the stripe size for performance in
an array, even when the number of concurrent accesses is not known.
Three formulae were derived from these papers:
For non-RAID5 arrays when concurrency is known:
SU = (S * APT * DTR * (CON-1) * 1.024) + 0.5K

where:
SU    - stripe unit per disk (K)
S     - concurrency slope coefficient (~0.25)
APT   - average positioning time (milliseconds)
DTR   - data transfer rate (megabytes/second)
CON   - number of concurrent users
1.024 - conversion factor: (1 s / 1000 ms) * (1024 K / 1 MB)
So for a drive that has an average seek time of 5.6 ms and a
transfer rate of 20 Mbyte/second, the calculated stripe unit for a
20-concurrent-user base would be:
(0.25*5.6*20*19*1.024) + 0.5 = 545K (rounded to a standard 512K stripe unit)
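As a sketch, the known-concurrency formula and the worked example above can be checked with a few lines of Python (the function name is mine; the variable names follow the definitions above):

```python
def stripe_unit_kb(s, apt_ms, dtr_mb_s, con):
    """Chen's stripe-unit formula for non-RAID5 arrays with known concurrency.

    s        -- concurrency slope coefficient (~0.25)
    apt_ms   -- average positioning time (APT) in milliseconds
    dtr_mb_s -- data transfer rate (DTR) in MB/s
    con      -- number of concurrent users (CON)
    Returns the suggested stripe unit per disk in KB.
    """
    # 1.024 converts ms * MB/s into KB; the trailing 0.5 KB is the constant term.
    return s * apt_ms * dtr_mb_s * (con - 1) * 1.024 + 0.5

# Worked example from the text: 5.6 ms seek, 20 MB/s transfer, 20 concurrent users.
su = stripe_unit_kb(0.25, 5.6, 20, 20)
print(round(su))  # 545 (KB), typically rounded to a 512K stripe unit
```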
For a system where you don't know the concurrency, the S * (CON-1)
term is replaced by a fixed 2/3 coefficient:
SU = (2/3 * APT * DTR * 1.024)
So for the same drive:
2/3*5.6*20*1.024 = 76.46K, so round up to 128K or down to 64K
And from Chen's final paper, the formula for RAID5 arrays uses a
0.5 coefficient:
SU = (0.5 * APT * DTR * 1.024)
Again for the same drive:
0.5*5.6*20*1.024 = 57.34K (round up to 64K)
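The unknown-concurrency and RAID5 formulas differ only in their fixed coefficients (2/3 and 0.5), so both can be sketched the same way (function names are mine):

```python
def stripe_unit_unknown_con_kb(apt_ms, dtr_mb_s):
    # Unknown concurrency: fixed 2/3 coefficient, no per-user term.
    return (2.0 / 3.0) * apt_ms * dtr_mb_s * 1.024

def stripe_unit_raid5_kb(apt_ms, dtr_mb_s):
    # RAID5 arrays: fixed 0.5 coefficient.
    return 0.5 * apt_ms * dtr_mb_s * 1.024

# Same example drive: 5.6 ms average seek, 20 MB/s transfer rate.
print(round(stripe_unit_unknown_con_kb(5.6, 20), 2))  # 76.46 -> round to 64K or 128K
print(round(stripe_unit_raid5_kb(5.6, 20), 2))        # 57.34 -> round up to 64K
```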
The values for average access time and transfer rate used in these
examples are actually fairly low compared to more advanced
drives, so the stripe sizes shown above are probably low by at least
a factor of 2 or more. I say this because while average seek times
drop, transfer rates increase: for example, on an Ultra3 SCSI 15K
drive the spec for average seek may drop to 4.7 ms, but the
transfer rate leaps to 70 Mbyte per second, so the overall value of
the combined factor (APT * DTR) goes from 112 to 329, nearly a
threefold increase.
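A quick check of that combined factor (APT * DTR) for the two drives discussed:

```python
# Combined factor APT * DTR for each drive (drive specs from the text).
old_drive = 5.6 * 20   # older drive: 5.6 ms seek, 20 MB/s transfer
new_drive = 4.7 * 70   # Ultra3 SCSI 15K drive: 4.7 ms seek, 70 MB/s transfer

print(round(old_drive), round(new_drive), round(new_drive / old_drive, 2))
# 112 329 2.94 -- nearly triple, so the calculated stripe units scale up accordingly
```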
To summarize all of the above material:
1. The number of disks in an array should be based on IO needs.
2. The stripe width should be based on concurrency needs.
3. The Oracle block size should be set so that
db_file_multiblock_read_count times the block size equals the stripe
width, consistent with your data needs.
Generally, a system that depends on single block reads and writes (OLTP)
with single point index reads (not scans) will benefit from a small
block size (4K-8K), while a system that depends on full table scans
and index scans will benefit from a larger block size (16K, 32K).
This is an excerpt from my book "Oracle Disk I/O Tuning," and it's
only $19.95 at this link: