This is an excerpt from the book Oracle Grid & Real Application Clusters.
The PolyServe Matrix Server software provides a
comprehensive infrastructure that allows users to build database
clusters with Oracle RAC on a Linux platform. The PolyServe SAN File
System (PSFS) is part of Matrix Server and provides the shared file
system used by Oracle RAC. For building RAC on Linux clusters, the
PolyServe Matrix Server (MxS) offers an alternative to both OCFS and
the ASM facility.
MxS consists of two main products. The first
one is the Matrix HA component, which provides manageability and
high availability for applications running on groups of servers. The
product supports virtual hosts, standard IP service and device
monitoring, custom service and device monitoring, data replication,
and administrative event notification. The second component is the
PolyServe SAN File System (PSFS), which provides a flexible and easy
way to manage the RAC database. PSFS is
designed to complement the Oracle RAC architecture and to scale out
with Oracle9i RAC running on top of it. PSFS supports the Oracle
Disk Manager (ODM) and is tightly coupled with Oracle9i RAC. PSFS
supports shared block access with full data integrity and cache
coherency. It offers direct I/O and normal page-cache buffered I/O.
PSFS allows Oracle RAC to perform direct I/O against the disk and to
handle distributed lock management at the database level, where
system-wide performance can best be optimized.
Matrix Server includes an important component that deals with SAN
management, the Storage Control Layer (SCL). It runs as a daemon on
each member of the cluster and helps perform I/O fencing.
The PolyServe Matrix Server complements Oracle
Cluster Management Services in such a way that nodes will never have
to be fenced via the default Linux RAC behavior of powering off a
node (STOMITH). PolyServe Matrix Server provides fencing via the SCL.
MxS software is installed on each of the servers in the cluster, or
matrix. It has many components providing different functions. Some of
them are:
ClusterPulse daemon – monitors the matrix, controls the virtual hosts
and devices, handles communication with the administrative console,
and manages device monitors and event notification.
DLM daemon – provides a locking mechanism to
coordinate server access to shared resources in the matrix.
SANPulse daemon – provides matrix
infrastructure for managing the SAN. It coordinates file system
mounts, un-mounts, and file recovery operations.
SCL daemon – helps manage the storage devices. It assigns device names
to shared disks when they are imported into the matrix and enables
their use by the cluster. It also disables a server's access to shared
disks during a split-brain condition.
Psd drivers – provide matrix-wide consistent device names among all
the servers.
HBA drivers – drivers for the supported FibreChannel host bus adapters.
PanPulse daemon – monitors the network and
detects any communication problems.
Figure 15.15 shows the various components of
the Matrix Server.
Figure 15.15: Storage Volume Relations
Storage Control Layer (SCL) Role
One of the main objectives of the SCL module is
to physically disable SAN storage whenever the server drops out of
the matrix or cluster membership. Thus, it can perform I/O fencing.
I/O fencing ensures integrity within the cluster by excluding rogue
nodes from accessing critical data.
MxS features true I/O fencing, for example port disabling at the FC
switch, which allows an administrator to log in to the ejected node
and diagnose and repair the problem before restarting, thereby
allowing the ejected server to rejoin the matrix.
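The fencing decision described above can be sketched conceptually as follows. This is an illustration only, not the SCL implementation; the node names and static membership list are hypothetical stand-ins for the live cluster membership that SCL actually consults.

```shell
# Conceptual sketch only -- real fencing is performed by the SCL daemon,
# e.g. by disabling the node's port at the FC switch.
is_member() {
    # A real matrix queries live cluster membership; a static
    # list (illustrative names) stands in for it here.
    case " node1 node2 node3 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

fence_decision() {
    if is_member "$1"; then
        echo "ALLOW"   # member in good standing: SAN access permitted
    else
        echo "FENCE"   # dropped out of the matrix: disable its switch port
    fi
}

fence_decision node2   # a current member
fence_decision node9   # a node that has left the matrix
```

Because the fence acts at the switch rather than by powering off the server, the ejected node stays up for diagnosis, as noted above.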
Matrix Server provides each disk with a
globally unique device name that all servers in the Matrix can use
to access the device. When a SAN disk is imported into the matrix, a
name prefixed by psd is assigned, e.g. psd1, psd2, psd12. Individual
disk partitions also have a global device name, composed of the disk
name and the partition number. For example, partition 5 of disk psd14 is
represented as psd14p5. On each server, Matrix Server creates device
node entries in the directory /dev/psd for every partition on the
disk. The SCL stores the device name and physical UID for each
imported disk device in an internal database. This database resides
on a membership partition that is assigned at the time of MxS installation.
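The naming scheme just described can be shown with a short sketch. The disk name and partition number below are example values, not taken from a real matrix:

```shell
# Illustration of the PSFS global naming scheme (example values).
disk="psd14"                        # global disk name assigned at import
part=5                              # partition number on that disk
devnode="/dev/psd/${disk}p${part}"  # per-partition device node
echo "$devnode"
```

Every server in the matrix sees the same `/dev/psd/psd14p5` node for that partition, which is what makes the name globally usable.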
Shared SAN disks must have a partition table on them before they are
imported into the matrix. Once imported, they are under Matrix
control; that is, no access to the disk can occur unless it is through the
cluster file system. Matrix Server is managed via a graphical user
interface; however, all functionality can also be carried out at the
command line through the CLI. Some useful commands are:
To import the disk:
# mx disk import <UUID>
To deport the disk:
# mx disk deport <UUID>
To display the disk information:
MxS uses multi-path I/O to eliminate single
points of failure and to perform I/O load balancing. The matrix can
include multiple FC switches, multiple HBAs per server, and
multi-ported SAN disks. When MxS is started, it automatically
configures all paths. It uses the first discovered path on each
server for I/O with the SAN devices. When the first path fails, MxS
automatically fails over the I/O to another path. Use the mxmpio
command to display and manage the multipath configuration:
# mxmpio enableall | disableall
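The failover behavior described above can be sketched in a few lines. This is purely conceptual; MxS performs path selection automatically in the driver layer, and the path names and failure check here are illustrative:

```shell
# Conceptual sketch of multipath failover (path names are hypothetical).
PATHS="path0 path1"

path_ok() {
    # Pretend the first discovered path has failed.
    [ "$1" != "path0" ]
}

active=""
for p in $PATHS; do
    if path_ok "$p"; then
        active="$p"   # first healthy path becomes the active path
        break
    fi
done
echo "active path: $active"
```

The same first-healthy-path logic explains the behavior described above: I/O uses the first discovered path and moves to the next one only when that path fails.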