When using the shared cluster
file system, there is just one copy of the entire directory
structure. Just one set of directories exists where the data files,
control files, redo log files, and archive log files are located.
This is advantageous in many respects. However, a file or
directory sometimes needs to be unique to each node in the cluster
even though it resides under the shared CFS directory structure. For
example, the DBA may want to keep a node-local tnsnames.ora file or
a node-local listener.ora file.
For the purpose of setting up
the Intelligent Agent, the DBA may want to separate the $ORACLE_HOME/network
directory from the shared Oracle home installation without having to
physically install the Intelligent Agent on each node in the
cluster. In such situations, a context-dependent symbolic
link (CDSL) provides a node-dependent copy of the file or directory,
as shown below.
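On Tru64 UNIX, a CDSL is created with the mkcdsl utility. The
following is a minimal sketch, assuming a tnsnames.ora kept under a
shared Oracle home; the path is illustrative, and the exact option
behavior should be verified against the mkcdsl(8) reference page on
the target system:

    # Replace the shared file with a CDSL, copying the current
    # contents into the member-specific area first (-c):
    mkcdsl -c /usr/oracle/network/admin/tnsnames.ora

    # The result is a symbolic link containing the {memb} variable,
    # which resolves to a different target on each cluster member:
    ls -l /usr/oracle/network/admin/tnsnames.ora

Each member then edits its own copy of the file while the pathname
remains identical across the cluster.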
There are many cluster file
system products that can be used for building a RAC system. They
include Tru64 CFS for HP/Compaq cluster servers, Veritas CFS for
Solaris-based RAC clusters, PolyServe Matrix Server for Linux- and
Windows-based RAC clusters, and Oracle Cluster File System (OCFS)
for Linux and Windows RAC clusters. Some of the important features
of these products, though not all, will be examined next.
Veritas CFS
The VERITAS Database
Edition/Advanced Cluster (DBE/AC) for Oracle RAC enables Oracle to
use the CFS. Veritas CFS is an extension of the VERITAS File System
(VxFS) that allows the same file system to be mounted simultaneously
on multiple nodes. It is designed with a master/slave architecture:
any node can initiate a metadata operation (create, delete, or
resize), but the master node carries out the actual operation. All
other (non-metadata) I/O goes directly to disk.
A distributed locking mechanism,
called the global lock manager (GLM), is used for metadata and cache
coherency across the multiple nodes. The GLM ensures
that all the nodes have a consistent view of the file system.
When any node wishes to read data, it requests a shared lock. If
another node wishes to write to the same area of the file system, it
must request an exclusive lock. The GLM revokes all shared locks
before granting the exclusive lock and informs reading nodes that
their data is no longer valid.
CFS is used in DBE/AC to manage
file systems in large database environments. When used in DBE/AC
for Oracle RAC, Oracle accesses data files stored on CFS file
systems through the Oracle Disk Manager (ODM) interface. This
essentially bypasses the file system buffer cache and file system
locking, so Oracle alone handles the tasks of buffering data and
coordinating writes to files; the GLM is only minimally used with
the ODM interface.
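Enabling the ODM interface is typically just a matter of pointing
Oracle's stub ODM library at the Veritas implementation. The sketch
below assumes Oracle9i on Solaris and the library path delivered by
the VRTSodm package; both paths should be verified on the target
system:

    # Assumed paths: Oracle9i on Solaris with the VRTSodm package.
    cd $ORACLE_HOME/lib
    mv libodm9.so libodm9.so.orig                 # keep the stub
    ln -s /usr/lib/sparcv9/libodm.so libodm9.so   # link Veritas ODM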
The out-of-band I/O fencing provided by DBE/AC
is a significant benefit in large clustered environments, where the
alternative fencing approach known as STOMITH (shoot the other
machine in the head) is neither sufficiently reliable nor scalable.
HP Tru64 CFS
Tru64 CFS is a layer on top of
the AdvFS file system. When direct I/O is enabled for a file by
opening the file with the O_DIRECTIO flag, read and write requests
on it are executed to and from disk storage through direct memory
access, bypassing AdvFS and CFS caching. This improves I/O
performance for database applications that do their own caching and
file region synchronization.
Oracle uses the direct I/O
feature available in CFS. Direct I/O enables Oracle to bypass the
buffer cache (no caching at the file system level), and Oracle
manages concurrent access to the file itself, as it does on raw
devices. Direct I/O does not go through the CFS server. File
creation and resizing, however, are seen as metadata operations by
AdvFS and must be carried out by the CFS server. Consequently, file
creations and resizes run on the node where the CFS server is
located, and such operations might take longer when the CFS server
is remote.
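The cluster member currently serving a file system can be displayed,
and the server relocated, with the cfsmgr utility. This is a sketch;
/oradata is an illustrative mount point, and the attribute syntax
should be checked against the cfsmgr(8) reference page:

    # Show which cluster member is the CFS server for the file system:
    cfsmgr /oradata

    # Relocate the CFS server to another member (verify the exact
    # syntax on the target system), so that file creations and
    # resizes run locally:
    cfsmgr -a server=member2 /oradata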
In a TruCluster Server cluster, the
device request dispatcher subsystem controls all I/O to physical
devices. All cluster I/O passes through this subsystem, which
enforces single-system open semantics so that only one program can
open a device at any one time.
The following are some
general features of Tru64 CFS:
* The Cluster File System (CFS)
makes all files, including the root (/), /usr, and /var file
systems, visible to and accessible by all cluster members. There is
a single copy or image for all cluster members.
* A single cluster member serves
each file system. Other members access that file system as CFS
clients, with significant optimization for shared access.
* Oracle RAC automatically does
direct I/O on Tru64 UNIX file system storage. This can significantly
improve I/O performance for the database since Oracle9i does its own
caching and file region synchronization.