Redundancy Features

Oracle RAC Cluster Tips by Burleson Consulting

This is an excerpt from the bestselling book Oracle Grid & Real Application Clusters.  To get immediate access to the code depot of working RAC scripts, buy it directly from the publisher and save more than 30%.


Many server-level features and options add redundancy. Using them helps avoid outright failures and degraded cluster performance in systems such as RAC. These features address different server subsystems, such as memory and processors. Redundant components such as fans, power supplies, and adapters also improve availability, particularly when combined with software that monitors the hardware and alerts system administrators.

To make servers more reliable, use high-reliability components and follow system best practices. The following sections examine the redundancy features that administrators need to focus on.

Dynamic Reconfiguration (DR) within a Server

DR is an operating environment feature that provides the ability to replace and reconfigure system hardware while the system is running. This feature is optional and can be implemented at the discretion of the system administrator. The main benefit of DR is that an administrator can add or replace hardware resources, such as CPUs, memory, and I/O interfaces, with little interruption of normal system operations. DR therefore helps to increase the overall uptime and availability of servers.

For example, DR is available on Sun system architectures that contain multiple system boards and use board slots that support hot plugging. The Sun Fire 3800-6800 server series implements DR particularly well. DR operations are performed at attachment points, which can be connected or disconnected while the system runs, so hardware components can be added or removed with minimal interruption. The Sun Fire series supports the following attachment points for dynamic reconfiguration:

* I/O Assembly (PCI / ePCI assemblies)

* CPU/Memory Boards

* CompactPCI (cPCI) cards

* System Memory

* CPUs
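The connect and disconnect behavior of an attachment point can be pictured as a small state machine. The sketch below is a toy model only: the class name, the state names, and the operation names are assumptions chosen for illustration, not Sun's actual DR interface. It shows why a component must be taken out of use before it can be disconnected.

```python
class AttachmentPoint:
    """Toy model of a DR attachment point (hypothetical, for illustration)."""

    # (current state, operation) -> next state. Anything else is rejected.
    TRANSITIONS = {
        ("disconnected", "connect"): "connected",
        ("connected", "configure"): "configured",
        ("configured", "unconfigure"): "connected",
        ("connected", "disconnect"): "disconnected",
    }

    def __init__(self, name, state="disconnected"):
        self.name = name
        self.state = state

    def apply(self, operation):
        """Apply one operation, or raise ValueError if it is not allowed."""
        key = (self.state, operation)
        if key not in self.TRANSITIONS:
            raise ValueError(
                f"{self.name}: cannot {operation} while {self.state}"
            )
        self.state = self.TRANSITIONS[key]
        return self.state


def hot_replace(point):
    """Walk a configured attachment point through a full replace cycle."""
    steps = ["unconfigure", "disconnect", "connect", "configure"]
    return [point.apply(step) for step in steps]
```

In this model a board that is still configured cannot be disconnected directly. It has to be unconfigured first, which mirrors the idea that the operating environment stops using a resource before the hardware is removed.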

Predictive Failure Analysis (PFA)

Many server vendors provide a mechanism for anticipating system failures, called Predictive Failure Analysis (PFA). Servers often fail without clear warning signs. If zero downtime is required, consider using PFA technology, which can warn a DBA up to 48 hours before an imminent server failure, leaving time to prevent disaster. The analysis method and terminology differ between vendors, but most leading vendors provide some form of PFA for their servers.
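Vendor PFA implementations are proprietary, so the following is only a minimal sketch of the general idea, assuming that a component reports a cumulative count of corrected errors and that a hypothetical failure threshold is known. It fits a straight line to the recent samples and estimates how many hours remain before the threshold is crossed. The function names, the threshold, and the 48-hour default window are illustrative assumptions.

```python
def hours_until_threshold(samples, threshold):
    """Estimate hours from the last sample until the count hits threshold.

    samples: list of (hour, cumulative_error_count) pairs.
    Returns None when there is no upward trend to extrapolate.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        return None  # all samples taken at the same hour
    # Least-squares slope: errors accumulated per hour.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None  # error count is flat or falling
    intercept = mean_y - slope * mean_x
    crossing_hour = (threshold - intercept) / slope
    return max(0.0, crossing_hour - samples[-1][0])


def should_alert(samples, threshold, warning_window_hours=48):
    """True when the projected failure falls inside the warning window."""
    remaining = hours_until_threshold(samples, threshold)
    return remaining is not None and remaining <= warning_window_hours
```

A monitoring job could call `should_alert` on each new sample and page the DBA when it returns `True`. A real PFA system would use vendor-specific telemetry and more robust statistics than a single linear fit.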

Error-Correcting Code (ECC) Memory

Error-correcting code (ECC) memory detects and corrects all single-bit errors without affecting the operation of the system. It also detects all double-bit errors and, depending on the implementation, corrects some of them. All error correction events are logged by the system.

IBM Chipkill memory is a good example. Chipkill ECC memory and the automatic server restart feature work together to minimize server downtime. With the latest Chipkill memory technology, available in select IBM xSeries and Netfinity servers, the system is protected against the failure of any single memory chip and against any number of multi-bit errors originating from any portion of a single memory chip.

As another example, Sun has adopted error-correcting code memory on all of its servers to minimize system downtime caused by faulty single inline memory modules (SIMMs) and dual inline memory modules (DIMMs).
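The single-bit correction and double-bit detection described in this section can be illustrated with a textbook extended Hamming code. The sketch below protects only 4 data bits with 4 check bits. Real memory controllers use wider codes, and Chipkill uses a different and stronger scheme, so treat this as a small-scale illustration of the principle rather than any vendor's implementation.

```python
def encode(data):
    """Encode 4 data bits into an 8-bit codeword (list of 0/1 values).

    Index 0 holds an overall parity bit; indexes 1-7 form a Hamming(7,4)
    code with check bits at positions 1, 2, and 4.
    """
    d1, d2, d3, d4 = data
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d1, d2, d3, d4
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes total parity even
    return code


def decode(code):
    """Return (data_bits, status).

    status is 'ok', 'corrected' (one bit was flipped and repaired), or
    'uncorrectable' (two flipped bits detected; data_bits is None).
    """
    code = list(code)
    # XOR of the positions of all set bits is 0 for a valid codeword,
    # and equals the error position after a single flip in 1..7.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_error = sum(code) % 2 == 1
    if syndrome == 0 and not parity_error:
        status = "ok"
    elif parity_error:
        # Odd number of flips: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even total parity: two bits flipped.
        return None, "uncorrectable"
    return [code[3], code[5], code[6], code[7]], status
```

Flipping any one bit of an encoded word and passing it to `decode` returns the original data with the status `corrected`. Flipping any two bits returns the status `uncorrectable`, which is the point at which a real system would log the event and raise an alarm.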

Redundant Networking Components

To avoid network I/O channel failures, provide redundant physical elements along the entire path between the server and the network backbone, including network interface cards, cables, and patch panels.
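The benefit of duplicating the whole path can be estimated with basic availability arithmetic. The sketch below assumes that component failures are independent, and the availability figures in the usage note are hypothetical placeholders rather than measured values. A path fails if any element in it fails, while a redundant pair fails only if both paths fail.

```python
from math import prod


def series_availability(components):
    """Availability of a path in which every component must work."""
    return prod(components)


def parallel_availability(paths):
    """Availability of redundant paths where one working path is enough."""
    return 1 - prod(1 - a for a in paths)


def redundant_path_availability(components, copies=2):
    """Availability of `copies` identical, independent copies of a path."""
    single = series_availability(components)
    return parallel_availability([single] * copies)
```

For instance, `redundant_path_availability([0.999, 0.9995, 0.9999])` models a NIC, a cable, and a patch panel duplicated end to end, using made-up availabilities. The independence assumption is the weak point: two NICs that share one switch or one power feed do not fail independently, which is why every element in the path needs its own duplicate.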

Hot Swap Power

In its simplest form, hot-swap power means building two power supplies into the server, each capable of powering the whole system, and having them share the load. When one supply fails, the surviving supply keeps the server running. An uninterruptible power supply (UPS) is another essential requirement.

Hot Swap Fans

A cooling fan failure will not bring a system down if the fan can be hot-swapped transparently. Most cooling-related issues are external to the system, such as keeping the computer room temperature stable and below the required limits. High temperatures and temperature fluctuations both stress electronic components.

 


This is an excerpt from the bestselling book Oracle Grid & Real Application Clusters, Rampant TechPress, by Mike Ault and Madhu Tumma.

You can buy it direct from the publisher for 30%-off and get instant access to the code depot of Oracle tuning scripts.

http://www.rampant-books.com/book_2004_1_10g_grid.htm


 

 
 
 



 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
