RAC protects against instance and server failure
by providing multiple servers to which a user can connect.
However, all of the data still resides in centralized
storage, so data failure and data center loss
remain possible.
Data failure, which results in the loss or corruption of data, is
the worst of the three failure types seen thus far (the other two
being instance and system failure).
Some disk failures are not disastrous, for instance when a disk is
mirrored with hardware or software RAID.
Even then, if
too many disks are lost, production data could be
lost as well, requiring some form of recovery.
User error can also cause data loss if an operating system user
removes database files with an operating system command. In that
case, the file will be removed, and the disk mirror will provide no
protection. Data corruption can also occur if hardware or software bugs result in
inappropriate data being written to the datafiles.
Data center loss occurs when a system is completely
lost, usually as the result of some sort of natural disaster.
A hurricane, flood, or tornado may destroy or seriously disable
an entire data center, resulting in a combined loss of servers and
disks. This is by far the
worst unplanned-downtime scenario and can only be protected against
with extensive, and usually expensive, disaster recovery methods.
Oracle provides many options for preventing downtime and data loss,
which together make up the Maximum Availability Architecture (MAA).
The MAA provides redundancy for every component and employs
several different Oracle tools.
RAC is only one piece of the MAA; it does not address
every possible problem.
These tools, as mentioned earlier, must provide
protection for planned and unplanned downtime. They must also protect
against varying levels of unplanned downtime ranging from single
server outages, which RAC covers, to entire data center loss, which
RAC does not cover.
Some businesses choose not to follow all the
guidelines for maximum availability.
When considering a high availability strategy, the DBA must
weigh the recovery time objective (RTO) and the recovery point
objective (RPO).
The RTO defines the allowable downtime for the
database. An advertising
company may allow hours of downtime; however, a bank will usually
allow no downtime whatsoever.
The RPO defines the allowable data loss if a failure occurs.
If batch processes load the data, hours or even
days of data could simply be reloaded.
However, for a system that allows direct access by the end
user, such as an online store or ATM, zero data loss is
usually required.
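
The two objectives can be treated as a simple pass/fail test for any
candidate recovery strategy. The following is a minimal sketch in Python,
not an Oracle tool: the class names, the strategies, and every number in it
are hypothetical values chosen only to illustrate the comparison.

```python
from dataclasses import dataclass


@dataclass
class Objectives:
    rto_minutes: float  # maximum tolerable downtime
    rpo_minutes: float  # maximum tolerable window of lost data


@dataclass
class Strategy:
    name: str
    recovery_minutes: float   # estimated time to restore service
    data_loss_minutes: float  # worst-case data lost when a failure occurs


def meets(strategy: Strategy, objectives: Objectives) -> bool:
    """Return True only if the strategy satisfies both the RTO and the RPO."""
    return (strategy.recovery_minutes <= objectives.rto_minutes
            and strategy.data_loss_minutes <= objectives.rpo_minutes)


# Hypothetical business requirements (illustrative numbers only).
advertising = Objectives(rto_minutes=240, rpo_minutes=1440)
online_store = Objectives(rto_minutes=5, rpo_minutes=0)

# Hypothetical strategies (illustrative numbers only).
nightly_restore = Strategy("restore from nightly backup", 180, 1440)
standby_copy = Strategy("continuously synchronized standby", 1, 0)

for goals_name, goals in [("advertising", advertising),
                          ("online store", online_store)]:
    for strategy in (nightly_restore, standby_copy):
        verdict = "meets" if meets(strategy, goals) else "misses"
        print(f"{strategy.name} {verdict} the {goals_name} objectives")
```

Under these assumed numbers, the nightly restore is acceptable for the
advertising company but not for the online store, whose zero-data-loss RPO
rules out any strategy that can lose committed work.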
Downtime can be expensive.
Depending on the system, costs can range from dollars per
minute to tens of thousands of dollars lost for every minute the
database is unavailable. However, uptime is expensive as well.
It has already been shown how costly RAC can be for a business. Now it
can be seen that even more may be required for a fully bulletproof
system.
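
The trade-off between the cost of downtime and the cost of uptime comes down
to simple arithmetic. The sketch below, in Python, converts an availability
percentage into expected minutes of downtime per year and prices it. The
function names, the availability levels, and the cost per minute are all
assumptions chosen for illustration, not measured figures.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def annual_downtime_minutes(availability: float) -> float:
    """Expected downtime per year for an availability such as 0.999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def annual_downtime_cost(availability: float, cost_per_minute: float) -> float:
    """Expected yearly cost of downtime at a given cost per minute."""
    return annual_downtime_minutes(availability) * cost_per_minute


def annual_savings(current: float, improved: float,
                   cost_per_minute: float) -> float:
    """Downtime cost avoided by moving from one availability to another."""
    return (annual_downtime_cost(current, cost_per_minute)
            - annual_downtime_cost(improved, cost_per_minute))


# Hypothetical inputs: "three nines" today, "four nines" after investing
# in more redundancy, and 10,000 dollars lost per minute of outage.
current, improved, cost_per_minute = 0.999, 0.9999, 10_000

print(f"Downtime now:  {annual_downtime_minutes(current):.1f} min/year")
print(f"Downtime then: {annual_downtime_minutes(improved):.1f} min/year")
print(f"Savings: {annual_savings(current, improved, cost_per_minute):,.0f} dollars/year")
```

If the yearly cost of the additional hardware, licenses, and staff exceeds
the savings figure, the extra availability is hard to justify on cost alone.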
[Figure: Example of an HA Configuration using MAA Best Practices]
Many other HA solutions require the backup server
to sit idle until a failure occurs. A
solid HA solution like Oracle 11g RAC benefits users,
management, system administrators, and DBAs.
With multiple instances, a RAC system provides a
near-zero-failure environment. Even when one or more nodes in the
cluster fail, for whatever reason, the database remains available as
long as at least one instance is running.
With Transparent Application
Failover (TAF) configured, operations are transferred automatically
to a surviving instance.
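
The failover behavior described above can be pictured with a toy model. The
Python sketch below is not Oracle's TAF implementation and uses no Oracle
API; the Cluster and FailoverSession classes and the node names are
hypothetical, and the model simply moves a client session to a surviving
instance and retries the statement once.

```python
class NodeDown(Exception):
    """Raised when a simulated instance is unavailable."""


class Cluster:
    """Toy stand-in for a cluster: instance names mapped to up/down state."""

    def __init__(self, nodes):
        self.up = {name: True for name in nodes}

    def execute(self, node, statement):
        if not self.up.get(node, False):
            raise NodeDown(node)
        return f"{node} ran: {statement}"


class FailoverSession:
    """Client session that reconnects to a surviving instance on failure."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.node = self._pick_node()

    def _pick_node(self):
        # Choose the first instance that is still running.
        for name, is_up in self.cluster.up.items():
            if is_up:
                return name
        raise NodeDown("no surviving instance")

    def execute(self, statement):
        try:
            return self.cluster.execute(self.node, statement)
        except NodeDown:
            # The current instance failed: move to a survivor and retry once.
            self.node = self._pick_node()
            return self.cluster.execute(self.node, statement)


cluster = Cluster(["node1", "node2", "node3"])
session = FailoverSession(cluster)

first = session.execute("SELECT 1")   # served by the first instance
cluster.up["node1"] = False           # simulate an instance crash
second = session.execute("SELECT 1")  # served by a surviving instance
print(first)
print(second)
```

The caller issues the same statement before and after the crash and never
has to name a node, which is the property the users notice. Only when every
instance is down does the session raise an error.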
Users will appreciate being able to connect
to their applications even when a server node experiences a total hardware or
software failure.
Management and the system administrators are happy when the
users are happy. Then the
DBA can sleep more soundly and work a more balanced 9-to-5 schedule.
A DBA can now take a node offline knowing the other nodes will
prevent the users from noticing.
The major benefits of the RAC database system are
scalability and high availability.
No business operations can run without the use of database
resources. That is why
the geeks, and more specifically the DBAs, shall inherit the earth.