The Parallel Clustered Database (PDB) is a complex application that provides concurrent access to the same database, or group of data tables, indexes, and other objects, from any server in the cluster without compromising data integrity. Well-known examples include Oracle Real Application Clusters (Oracle RAC), the subject matter of this book, IBM UDB DB2 Enterprise Extended Edition (EEE), and IBM S/390 Parallel Sysplex clusters.
Parallel databases typically contain multiple nodes, or servers, that access the same physical storage or data concurrently. The PDB architecture is a multi-server data sharing technology that permits direct, concurrent read/write access to shared data from all the processing nodes in the parallel configuration. However, this necessitates complex lock management to maintain data integrity and coordinate resources.
In terms of storage access, a parallel clustered system is implemented in one of two ways: the Shared Nothing Model or the Shared Disk Model.
Shared Nothing Model
In the Shared Nothing model, also referred to as the Data Partitioning model, each system owns a portion of the database, and each partition can only be read or modified by the owning system, as shown in Figure 3.10. Data partitioning enables each system to locally cache its portion of the database in processor memory without requiring cross-system communication to provide data access concurrency and coherency controls.
Each server in the cluster has its own independent subset of the data, called a partition, on which it can work without encountering resource contention from other servers. The clustered nodes communicate by passing messages through a network that interconnects the servers. Client requests are automatically routed to the system that owns the particular resource, such as memory or disk. Only one of the clustered systems can own and access a particular resource at a time. In the event of a failure, resource ownership can be dynamically transferred to another system in the cluster.
Figure 3.10: Shared Nothing Model - Three Node Database Cluster
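To make the ownership and routing idea concrete, here is a minimal sketch of how a shared-nothing router might map a key to its owning node. The three node names, the CRC-based hash, and the route_request helper are illustrative assumptions, not features of any particular product.

    import zlib

    # Hypothetical three-node shared-nothing cluster; names are made up.
    NODES = ["node1", "node2", "node3"]

    def owning_node(key: str) -> str:
        """Hash the key deterministically to the node that owns its partition."""
        return NODES[zlib.crc32(key.encode()) % len(NODES)]

    def route_request(key: str, operation: str) -> str:
        """Ship the operation to the owner; no other node may touch that data."""
        owner = owning_node(key)
        return f"{operation} on {key!r} routed to {owner}"

    for key in ("cust-1001", "cust-2002", "cust-3003"):
        print(route_request(key, "SELECT"))

Under this scheme, a request is answered entirely by the owning node, which is why no cross-system cache coherency traffic is needed.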
This architecture has several
advantages:
* Shared nothing systems provide
for incremental growth.
* They are well suited to read-only databases and decision support applications.
* Failure is local: if one node fails, the other nodes stay up. However, the disk system of the failed node must be taken over by a surviving node.
It suffers from some drawbacks as well:
* More coordination among the nodes is required.
* A SQL operation that works on data or disks belonging to another node incurs extra overhead in the form of processing or function shipping.
* Data skew is a potential problem. As data is added to the database and access patterns change, the data must be re-partitioned to balance I/O, as the sketch below illustrates.
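As a rough illustration of the skew problem, the following sketch flags partitions that have grown well past the cluster average. The per-node row counts and the 1.5x threshold are invented for this example.

    from statistics import mean

    # Invented per-node row counts for a hypothetical three-node cluster.
    partition_rows = {"node1": 1_200_000, "node2": 950_000, "node3": 4_800_000}

    avg = mean(partition_rows.values())
    print(f"average rows per partition: {avg:,.0f}")

    # A partition holding far more than its share concentrates I/O on one
    # node; rebalancing means physically moving rows to other owners.
    for node, count in partition_rows.items():
        if count > 1.5 * avg:
            print(f"{node} is skewed ({count:,} rows); re-partitioning needed")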
Shared Disk Model
In the Shared Disk model, all the disks containing data are accessible by all nodes of the cluster. The disk sharing architecture requires suitable lock management techniques to control update concurrency. Each of the nodes in the cluster has direct access to all disks on which shared data is placed. Figure 3.11 shows a typical three-node parallel database cluster. Each node has its own local database buffer cache. IBM Parallel Sysplex and Oracle RAC systems follow this shared-disk approach.
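The following toy coordinator sketches the kind of shared/exclusive locking a shared-disk cluster needs before a node touches a shared block. It is a deliberately simplified stand-in for a real distributed lock manager such as the one inside Oracle RAC; the class and method names are assumptions for illustration.

    class GlobalLockManager:
        """Toy lock manager: one lock per shared disk block."""

        def __init__(self):
            # block_id -> (mode, set of holder node names)
            self.holders = {}

        def acquire(self, node: str, block_id: int, mode: str) -> bool:
            """Grant 'S' (shared read) or 'X' (exclusive write), or refuse."""
            held = self.holders.get(block_id)
            if held is None:
                self.holders[block_id] = (mode, {node})
                return True
            held_mode, nodes = held
            # Readers may share a block; any writer needs it exclusively.
            if mode == "S" and held_mode == "S":
                nodes.add(node)
                return True
            return False  # conflicting request: caller must wait and retry

        def release(self, node: str, block_id: int) -> None:
            mode, nodes = self.holders[block_id]
            nodes.discard(node)
            if not nodes:
                del self.holders[block_id]

    glm = GlobalLockManager()
    print(glm.acquire("node1", 42, "S"))  # True: first reader
    print(glm.acquire("node2", 42, "S"))  # True: concurrent readers allowed
    print(glm.acquire("node3", 42, "X"))  # False: a write conflicts with readers

Every such grant and refusal is a message over the interconnect, which is why shared-disk clusters depend so heavily on its speed.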
Advantages of
shared-disk systems are as follows:
* Shared-disk
systems permit high availability. All data is accessible even if one
node fails.
* These systems provide one database with multiple access points; in other words, a single database served by multiple instances. There is no data skew problem, because the data is located and accessed at a common location.
* They provide for incremental growth in the number of nodes, and thus add processing power.
Figure 3.11:
Shared Disk Parallel Database Cluster
Disadvantages of shared disk systems are as follows:
* Inter-node synchronization is required, involving complex lock management and a greater dependency on a high-speed interconnect.
* If the
workload is not partitioned well among the processing nodes, there
may be high synchronization overhead.
* Running the shared disk software imposes additional operating system overhead.