This is an excerpt from the book
Oracle Grid & Real Application Clusters.
A failover cluster is normally
implemented in one of two architectures: Active/Passive or
Active/Active.
Active/Passive Clusters
– This type comprises two nearly identical infrastructures, logically
sitting side by side. One node hosts the database service or
application, while the other sits idle, waiting in case the primary
system goes down. The nodes share a storage component. When the
primary node fails, control of the storage passes to the other
node, which becomes the primary and hosts the database or
application.
Active/Active Clusters
– In this type, one node acts as primary for a
database instance and the other acts as its secondary node for
failover purposes. At the same time, the secondary node acts as
primary for another instance, and the first node acts as the
backup/secondary node for that instance.
Figure 3.9 shows an example of
active/active architecture.
Figure 3.9: Two-Node Cluster
with Active/Active Resource Groups
The Active/Passive architecture
is the most widely used, and many administrators prefer it for its
simplicity and manageability. Unfortunately, it is usually capital
intensive, because the backup server sits idle. Active/Active looks
attractive and is more cost-effective, as the backup server is put
to use. However, it can cause performance problems when both
database services (or applications) fail over to a single node,
because the surviving node must pick up the load from the failed
node in addition to its own.
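
The load problem in an Active/Active pair can be illustrated with a
little arithmetic. The following is a minimal sketch, assuming a
two-node cluster and assuming that load can be expressed as a fraction
of one node's capacity; the function name, node names, and utilisation
figures are hypothetical and chosen only for illustration.

```python
def load_after_failover(node_loads, failed_node):
    """Return the survivor's load once the failed node's work moves over.

    node_loads maps node name -> utilisation (fraction of one node's capacity).
    Assumes a two-node cluster, so there is exactly one survivor.
    """
    survivors = [n for n in node_loads if n != failed_node]
    if failed_node not in node_loads or len(survivors) != 1:
        raise ValueError("expected a two-node cluster containing the failed node")
    survivor = survivors[0]
    # The survivor keeps its own work and inherits the failed node's work.
    return {survivor: node_loads[survivor] + node_loads[failed_node]}


# Hypothetical utilisation figures, chosen only for illustration.
loads = {"node_a": 0.60, "node_b": 0.55}
after = load_after_failover(loads, "node_a")
overloaded = after["node_b"] > 1.0
print(after, "overloaded" if overloaded else "ok")
```

With these assumed figures the surviving node is asked to carry more
than its full capacity, which is the situation in which performance
issues may arise.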
Oracle Database Service in HA
Cluster
The Oracle database is a widely
used database system. A large number of critical applications and
business operations depend on the availability of the database. Most
cluster products provide agents to support the database failover
process.
The implementation of an Oracle
Database service with failover in an HA cluster has the following
general features.
* A single instance of Oracle
runs on one of the nodes in the cluster. The Oracle instance and
listener have dependencies on other resources such as file systems,
mount points, and IP addresses.
* It has exclusive access to the
set of database disk groups on a storage array that is shared among
the nodes.
* Optionally, an Active/Active
architecture of Oracle databases can be established. One node acts
as the primary node to an Oracle instance and another node acts as a
secondary node for failover purposes. At the same time, the
secondary node acts as primary for another database instance and the
primary node acts as the backup/secondary node.
* When the primary node suffers
a failure, the Oracle instance is restarted on the surviving or
backup node in the cluster.
* The failover process involves
moving the IP address, volumes, and file systems containing the
Oracle data files. In other words, on the backup node, the IP address
is configured, the disk group is imported, the volumes are started,
and the file systems are mounted.
* The restart of the database
automatically performs crash recovery, returning the database to a
transactionally consistent state.
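
The failover sequence described above can be sketched as an ordered
plan. This is only an illustration of the ordering, not the behavior of
any particular cluster product: the resource names, the function name,
and the `failed_node_reachable` flag are assumptions made for the
example.

```python
# Resources in dependency order: each one relies on those listed before it.
# The names are illustrative; real cluster products define their own resource types.
START_ORDER = [
    "ip_address",
    "disk_group",
    "volumes",
    "file_systems",
    "listener",
    "oracle_instance",
]


def failover_plan(failed_node, backup_node, failed_node_reachable=True):
    """Return the ordered (action, resource, node) steps for a failover."""
    steps = []
    if failed_node_reachable:
        # Take resources offline in reverse dependency order.
        for res in reversed(START_ORDER):
            steps.append(("offline", res, failed_node))
    # Bring resources online on the backup node in dependency order.
    for res in START_ORDER:
        steps.append(("online", res, backup_node))
    # Restarting the instance performs crash recovery automatically.
    steps.append(("crash_recovery", "oracle_instance", backup_node))
    return steps


for step in failover_plan("node1", "node2"):
    print(step)
```

The plan makes the cost of this design visible: every resource is
handled one after another, first on the failed node and then on the
backup node, before the database is usable again.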
There are some issues with
Oracle Database failover to be aware of:
* On restart of the database,
a fresh SGA (System Global Area) is established, and all of the
previous instance’s SGA contents are lost. This includes all the
frequently used packages and the parsed images of statements.
* Once the new instance is
created and made available on the backup node, all the client
connections seeking the database service attempt to connect at the
same time. This could result in a lengthy waiting period.
* The impact of the outage may
be felt for an extended duration during the failover process. When
there is a failure at the primary node, all the relevant resources,
such as mount points, disk groups, the listener, and the database
instance, have to be logically taken offline or shut down. This
process may take considerable time, depending on the failure
situation.
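
A rough way to reason about these delays is to add up the phases. The
sketch below is a deliberately simple model: it assumes the phases run
one after another and that the new instance accepts logins at a fixed
rate. The phase names, the durations, the client count, and the login
rate are all hypothetical values for illustration, not measurements.

```python
def estimated_outage_seconds(phase_seconds, clients, logins_per_second):
    """Estimate the time until the last client is reconnected.

    phase_seconds maps a failover phase name to its duration in seconds.
    Simplification: phases are sequential and logins are served at a fixed rate.
    """
    if logins_per_second <= 0:
        raise ValueError("logins_per_second must be positive")
    reconnect_wait = clients / logins_per_second
    return sum(phase_seconds.values()) + reconnect_wait


# Hypothetical durations, chosen only for illustration.
phases = {
    "offline_resources": 30,
    "online_resources": 45,
    "crash_recovery": 60,
}
total = estimated_outage_seconds(phases, clients=2000, logins_per_second=50)
print(f"estimated outage: {total:.0f} s")
```

The model ignores the cold cache: even after the last client has
reconnected, response times may stay slow until the new SGA has been
repopulated.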
However, when the Oracle
database is implemented in a parallel, scalable cluster such
as Oracle RAC, there are many advantages, including transparent
failover for the clients. The main high availability
features include:
* Multiple instances exist at
the same time, accessing a single database. The data files are
common to all the instances.
* Multiple nodes have read/write
access to the shared storage at the same time. Data blocks are read
and updated by multiple nodes.
* If a node fails and its
Oracle instance is unusable or has crashed, the
surviving node performs recovery for the crashed instance. There is
no need to restart the instance on the surviving node, since a
parallel instance is already running there.
* All the client connections
continue to access the database through the surviving node/instance.
With the help of the Transparent Application Failover (TAF)
facility, clients are able to move over to the surviving
instance almost instantaneously.
* No volumes or file systems
need to be moved to the surviving node.
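
The idea of clients moving to a surviving instance can be sketched as
follows. This is not how TAF is implemented; it only illustrates the
general pattern of trying the next instance when one is unavailable.
The instance names, the `InstanceDown` exception, and the
`fake_connect` stand-in are all hypothetical.

```python
class InstanceDown(Exception):
    """Raised by the stand-in connect function when an instance is unavailable."""


def connect_to_first_available(instances, connect):
    """Try each instance in order; return (name, connection) for the first that answers."""
    failed = []
    for name in instances:
        try:
            return name, connect(name)
        except InstanceDown:
            failed.append(name)
    raise InstanceDown(f"no instance available, tried: {failed}")


# Stand-in for a real driver call; 'rac1' plays the crashed instance.
def fake_connect(name, down=frozenset({"rac1"})):
    if name in down:
        raise InstanceDown(name)
    return f"session on {name}"


name, session = connect_to_first_available(["rac1", "rac2"], fake_connect)
print(name, session)
```

Because the second instance is already running, the client only pays
the cost of a new connection; nothing has to be restarted or moved
first.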