Failover Cluster Architecture

Oracle RAC Cluster Tips by Burleson Consulting

This is an excerpt from the bestselling book Oracle Grid & Real Application Clusters.  To get immediate access to the code depot of working RAC scripts, buy it directly from the publisher and save more than 30%.


A failover cluster is normally implemented in one of two architectures: Active/Passive or Active/Active.

Active/Passive Clusters: This type comprises two nearly identical infrastructures, logically sitting side by side. One node hosts the database service or application, while the other sits idle, waiting in case the primary system goes down. The two nodes share a storage component, and control of that storage passes to the standby node when the primary fails. On failure of the primary node, the inactive node becomes the primary and hosts the database or application.

Active/Active Clusters: In this type, one node acts as primary for a database instance and another acts as the secondary node for failover purposes. At the same time, the secondary node acts as primary for a second instance, and the first node acts as its backup/secondary node.

Figure 3.9 shows an example of active/active architecture.

Figure 3.9: Two-Node Cluster with Active/Active Resource Groups

The Active/Passive architecture is the most widely used, and many administrators prefer it for its simplicity and manageability. Unfortunately, it is usually capital intensive, because the standby server sits idle. Active/Active looks attractive and is more cost-effective, as the backup server is put to use. However, it can result in performance problems when both database services (or applications) fail over to a single node: the surviving node must carry its own load plus the load picked up from the failed node.
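The Active/Active trade-off can be made concrete with a little arithmetic. The following is only a minimal sketch: the function names, the idea of expressing load and capacity in the same arbitrary units, and the `headroom` threshold are illustrative assumptions, not something taken from the book.

```python
def surviving_node_utilization(load_a, load_b, node_capacity):
    """Utilization of the one surviving node after it absorbs both workloads.

    load_a, load_b and node_capacity are in the same (arbitrary) units,
    for example transactions per second. These names are hypothetical.
    """
    if node_capacity <= 0:
        raise ValueError("node_capacity must be positive")
    return (load_a + load_b) / node_capacity


def can_absorb_failover(load_a, load_b, node_capacity, headroom=1.0):
    """True if the combined load fits within headroom * node_capacity."""
    return surviving_node_utilization(load_a, load_b, node_capacity) <= headroom


if __name__ == "__main__":
    # Two services, each comfortable on its own node of capacity 100.
    print(surviving_node_utilization(60, 50, 100))
    print(can_absorb_failover(60, 50, 100))
    print(can_absorb_failover(40, 40, 100))
```

In this simplified model, an Active/Active pair only fails over cleanly when the two workloads together fit on one node, which is why sizing for the combined load matters.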

Oracle Database Service in HA Cluster

The Oracle database is a widely used database system. Large numbers of critical applications and business operations depend on the availability of the database. Most cluster products provide agents to support the database failover process.

The implementation of an Oracle Database service with failover in an HA cluster has the following general features.

* A single instance of Oracle runs on one of the nodes in the cluster. The Oracle instance and listener have dependencies on other resources such as file systems, mount points, and IP addresses.

* It has exclusive access to the set of database disk groups on a storage array that is shared among the nodes.

* Optionally, an Active/Active architecture of Oracle databases can be established. One node acts as the primary node to an Oracle instance and another node acts as a secondary node for failover purposes. At the same time, the secondary node acts as primary for another database instance and the primary node acts as the backup/secondary node.

* When the primary node suffers a failure, the Oracle instance is restarted on the surviving or backup node in the cluster.

* The failover process involves moving the IP address, volumes, and file systems containing the Oracle data files. In other words, on the backup node, the IP address is configured, the disk group is imported, the volumes are started, and the file systems are mounted.

* The restart of the database automatically performs crash recovery, returning the database to a transactionally consistent state.
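The resource dependencies described in the list above imply an ordering for bringing the service up on the backup node. As a minimal sketch, the ordering can be derived with a topological sort. The resource names and the exact dependency edges below are illustrative assumptions; a real cluster product defines these in its own resource group configuration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resource names; each maps to the resources it depends on.
DEPENDENCIES = {
    "ip_address": set(),
    "disk_group": set(),
    "volumes": {"disk_group"},          # volumes start after the disk group is imported
    "file_systems": {"volumes"},        # file systems mount on started volumes
    "listener": {"ip_address"},         # listener binds to the moved IP address
    "oracle_instance": {"file_systems", "ip_address"},
}


def startup_order(deps):
    """Return one valid order in which to bring resources online."""
    return list(TopologicalSorter(deps).static_order())


def shutdown_order(deps):
    """Taking resources offline uses the reverse of a valid startup order."""
    return list(reversed(startup_order(deps)))


if __name__ == "__main__":
    print("start:", startup_order(DEPENDENCIES))
    print("stop: ", shutdown_order(DEPENDENCIES))
```

Several orders can be valid; the sort only guarantees that each resource comes online after everything it depends on.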

There are some issues with Oracle Database failover to be aware of:

* On restart of the database, a fresh database cache (SGA) is established, and all of the previous instance's SGA contents are lost. This includes the frequently used packages and the parsed images of statements.

* Once the new instance is created and made available on the backup node, all the client connections seeking the database service attempt to connect at the same time. This could result in a lengthy waiting period.

* The impact of the outage may be felt for an extended duration during the failover process. When there is a failure at the primary node, all the relevant resources, such as mount points, disk group, listener, and database instance, have to be logically taken offline or shut down. This process may take considerable time, depending on the failure situation.

However, when the Oracle database is implemented in a parallel, scalable cluster such as Oracle RAC, there are many advantages, including transparent failover for the clients. The main high availability features include:

* Multiple instances exist at the same time, accessing a single database. Data files are common to the multiple instances.

* Multiple nodes have read/write access to the shared storage at the same time. Data blocks are read and updated by multiple nodes.

* Should a failure occur in a node and the Oracle instance is not usable or has crashed, the surviving node performs recovery for the crashed instance. There is no need to restart the instance on the surviving node since a parallel instance is already running there.

* All the client connections continue to access the database through the surviving node/instance. With the help of the Transparent Application Failover (TAF) facility, clients will be able to move over to the surviving instance near instantaneously.

* Volumes and file systems do not need to be moved to the surviving node.

Server Redundancy

The database resides within a server. The server or host is an important component in the provision of the data service. Any failure in the host system causes the database to go down.

Necessity of Server Redundancy

Clustered servers utilize two or more nodes, keeping the extra nodes either as standby or as extra computing power, as in the case of the RAC system. The additional nodes ensure that a standby node can provide the same database service to the user community. However, a node held purely in standby does not deliver the performance and scalability for which it was purchased.

Clustering servers assures the administrators and the application users that at least one node is alive. A cluster, in its most general form, comprises two or more interconnected computers that are viewed and used as a single, unified computing resource. By using multiple systems, the impact of the failure of any individual system is kept low by passing the failed system's workload to the remaining members of the cluster.

The standby node becomes functional, or becomes the primary host, when the failed host is unable to provide any host services. When some of the internal components fail and the failure is non-recoverable without intervention, the server is declared not available or simply "failed". This indicates that there is a lot of scope for keeping the internal components safe or redundant.

Before losing the server and resorting to the use of the clustered backup node, there are many things we can do to keep the components from failing. Let us examine these methods that act as the first level of redundancy. Some people call it "high availability without clustering." In contrast to clustering, system availability can be improved without adding servers.
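The benefit of extra nodes can be illustrated with a simple probability calculation. This is a rough sketch under a strong assumption that is not from the book: nodes fail independently, and shared components such as storage and the time taken to fail over are ignored. The function name and parameters are illustrative.

```python
def cluster_availability(node_availability, nodes):
    """Probability that at least one of `nodes` independent nodes is up.

    Assumes independent node failures (an idealization); shared storage,
    the interconnect and failover time are deliberately left out.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # All nodes are down with probability (1 - a) ** n.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "node(s):", round(cluster_availability(0.99, n), 6))
```

Under these assumptions, a second node with 99% availability raises the chance that some node is alive to roughly 99.99%, which is the intuition behind passing a failed system's workload to the remaining members.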

Redundancy Features

There are many features or options that add value to the redundancy at the server level. Taking advantage of such features helps avoid failures and avoids degraded cluster performance in systems like the RAC system. These features address different subsystems of the server, such as the memory and processors. Redundant components such as fans, power supplies, and adapters can also provide higher availability, particularly when used with software that provides monitoring and alerting capability to the system administrators.

To make the servers more reliable, we should use high-reliability components and best-system practices. Let us examine some features of the redundancy that administrators need to focus on.

 

 


This is an excerpt from the bestselling book Oracle Grid & Real Application Clusters, Rampant TechPress, by Mike Ault and Madhu Tumma.

You can buy it direct from the publisher for 30%-off and get instant access to the code depot of Oracle tuning scripts.

http://www.rampant-books.com/book_2004_1_10g_grid.htm


 

 
 
 


 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
