Oracle RAC and TAF to Guarantee Availability

Oracle Tips by Burleson Consulting


 

Oracle Magazine
Donald Burleson
May/June, 2002

One of the most exciting new features in Oracle Database is Real Application Clusters (RAC). The Oracle RAC solution delivers 24/7 database availability, performance, and scalability. Cache Fusion is the key memory feature that enables Oracle RAC performance, and the new Transparent Application Failover (TAF) is the mechanism that lets applications reconnect to a surviving node when one fails. This article explores the cooperation between Oracle RAC, Cache Fusion, and TAF and offers insights into the architecture and use of these tools for continuous availability and infinite scalability.

Oracle RAC Architecture

Oracle has long recognized that a clustered environment is the best protection against hardware and software failure. In a clustered environment, many Oracle instances exist on separate servers, each with direct connectivity to a single Oracle database. Should any single server or instance fail, processing continues on the surviving servers.

Cache Fusion and Oracle RAC

The introduction of the Cache Fusion shared RAM cache for multiple Oracle instances is a breakthrough in clustered solutions. Oracle RAC fully implements Cache Fusion, which both provides high performance and enables continuous cluster availability. The high-availability capability of Oracle RAC is almost unfathomable. It's estimated that in a 12-computer configuration, any application running on Oracle RAC will not experience a catastrophic failure for well over 100,000 years.

Cache Fusion technology changes the way the Oracle system global area (SGA) is used. Rather than treating each instance's RAM data buffers as a private cache, Cache Fusion lets the instances exchange data blocks directly between their SGA regions, so that together the buffers behave like a single shared cache accessible by all Oracle instances.

Beyond high performance and high availability, Oracle RAC offers significant benefits as a scalability tool. Whenever the processing load becomes excessive in an existing Oracle RAC cluster, you can add servers, each with its own Oracle instance, to the Oracle RAC configuration. This allows companies to start small and scale infinitely as processing demands increase.

Oracle RAC and Hardware Failover

To detect a node failure, the Cluster Manager uses a background process—Global Enqueue Service Monitor (LMON)—to monitor the health of the cluster. When a node fails, the Cluster Manager reports the change in the cluster's membership to Global Cache Services (GCS) and Global Enqueue Service (GES). These services are then remastered based on the current membership of the cluster.

To successfully remaster the cluster services, Oracle RAC keeps track of all resources and resource states on each node and then uses this information to restart these resources on a backup node.

These processes also manage the state of in-flight transactions and work with TAF to either restart or resume the transactions on the new node. Now let's see how Oracle RAC and TAF work together to ensure that a server failure does not cause an unplanned service interruption.
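The remastering step described in this section can be pictured with a toy model. The sketch below is only an illustration, not Oracle's actual algorithm: the resource names, the node names, and the round-robin reassignment rule are all assumptions made for the example.

```python
def remaster(masters, failed_node, survivors):
    """Return a new resource -> node map with the failed node's resources reassigned.

    masters:   dict mapping a resource name to the node that currently masters it
    survivors: list of nodes still in the cluster (must not be empty)
    """
    if not survivors:
        raise ValueError("no surviving nodes to remaster onto")
    new_masters = {}
    orphaned = 0
    for resource, node in sorted(masters.items()):
        if node == failed_node:
            # Hypothetical rule: spread orphaned resources round-robin over survivors.
            new_masters[resource] = survivors[orphaned % len(survivors)]
            orphaned += 1
        else:
            new_masters[resource] = node
    return new_masters


# Hypothetical cluster state before node_a fails.
masters = {"block_1": "node_a", "block_2": "node_b", "block_3": "node_a", "lock_7": "node_c"}
after = remaster(masters, failed_node="node_a", survivors=["node_b", "node_c"])
print(after)
```

The point of the model is only that the cluster must know which node owned each resource before the failure in order to hand that resource to a survivor afterward.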

Using Transparent Application Failover

After an Oracle RAC node crashes, usually from a hardware failure, all new application transactions are automatically rerouted to a specified backup node. The challenge in rerouting is to not lose transactions that were "in flight" at the exact moment of the crash. One of the requirements of continuous availability is the ability to restart in-flight application transactions, so that work that was running on the failed node can resume on another server without interruption. Oracle's answer to application failover is a new Oracle Net mechanism dubbed Transparent Application Failover. TAF allows the DBA to configure the type and method of failover for each Oracle Net client.

For an application to use TAF, it must use failover-aware API calls from the Oracle Call Interface (OCI). Inside OCI are TAF callback routines that can be used to make any application failover-aware.

While the concept of failover is simple, providing an apparently instant failover can be extremely complex, because there are many ways to restart in-flight transactions. The TAF architecture offers two failover types, SELECT and SESSION:

  • SELECT failover. With SELECT failover, Oracle Net keeps track of all SELECT statements issued during the transaction, tracking how many rows have been fetched back to the client for each cursor associated with a SELECT statement. If the connection to the instance is lost, Oracle Net establishes a connection to another Oracle RAC node and re-executes the SELECT statements, repositioning the cursors so the client can continue fetching rows as if nothing has happened. The SELECT failover approach is best for data warehouse systems that perform complex and time-consuming transactions.
  • SESSION failover. When the connection to an instance is lost, SESSION failover results only in the establishment of a new connection to another Oracle RAC node; any work in progress is lost. SESSION failover is ideal for online transaction processing (OLTP) systems, where transactions are small.
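The cursor repositioning behind SELECT failover can be sketched in a few lines. This is a simplified simulation, not Oracle Net code: `ReplayingCursor`, `ConnectionLost`, and the two fake connections are hypothetical names, and the sketch assumes the re-executed query returns the same rows in the same order.

```python
class ConnectionLost(Exception):
    """Raised by a (simulated) connection when its instance goes away."""


class ReplayingCursor:
    """Hypothetical client-side cursor that keeps fetching across a failover."""

    def __init__(self, connect_primary, connect_backup, query):
        self._connect_backup = connect_backup
        self._query = query
        self._fetched = 0                      # rows already handed to the client
        self._rows = connect_primary()(query)  # iterator over result rows

    def fetch(self):
        """Return the next row, or None when the result set is exhausted."""
        while True:
            try:
                row = next(self._rows, None)
            except ConnectionLost:
                # Re-execute on the backup node, then skip rows the client has seen.
                self._rows = self._connect_backup()(self._query)
                for _ in range(self._fetched):
                    next(self._rows)
                continue
            if row is not None:
                self._fetched += 1
            return row


DATA = ["row1", "row2", "row3", "row4"]


def primary():
    def execute(query):
        yield DATA[0]
        yield DATA[1]
        raise ConnectionLost("instance crashed")   # simulated node failure
    return execute


def backup():
    def execute(query):
        yield from DATA
    return execute


cursor = ReplayingCursor(primary, backup, "select * from some_table")
rows = []
row = cursor.fetch()
while row is not None:
    rows.append(row)
    row = cursor.fetch()
print(rows)
```

With SESSION failover, by contrast, only the reconnect would happen; the fetch count would not be kept and the query would have to be reissued by the application.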

Oracle TAF also offers choices on how to restart a failed transaction. The Oracle DBA may choose one of the following failover methods:

  • BASIC failover. In this approach, the application connects to a backup node only after the primary connection fails. This approach has low overhead, but the end user experiences a delay while the new connection is created.
  • PRECONNECT failover. In this approach, the application simultaneously connects to both a primary and a backup node. This offers faster failover, because a pre-spawned connection is ready to use. But the extra connection adds everyday overhead by duplicating connections.
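The trade-off between the two methods comes down to when the backup connection is opened. The following sketch makes that timing visible; `FailoverClient` and `fake_connect` are hypothetical stand-ins for illustration, not part of any Oracle API.

```python
class FailoverClient:
    """Toy client showing when the backup connection is opened."""

    def __init__(self, connect, primary, backup, method="basic"):
        if method not in ("basic", "preconnect"):
            raise ValueError(f"unknown failover method: {method}")
        self._connect = connect
        self._backup_name = backup
        self.active = connect(primary)
        # PRECONNECT pays for the second connection up front.
        self._standby = connect(backup) if method == "preconnect" else None

    def fail_over(self):
        """Switch to the backup node, connecting now only if not pre-connected."""
        if self._standby is None:
            self._standby = self._connect(self._backup_name)
        self.active, self._standby = self._standby, None
        return self.active


opened = []


def fake_connect(node):
    opened.append(node)          # record every connection that gets opened, in order
    return f"connection to {node}"


basic = FailoverClient(fake_connect, "bubba", "cletus", method="basic")
basic_before = list(opened)      # connections opened before any failure
basic.fail_over()
basic_after = list(opened)

opened.clear()
pre = FailoverClient(fake_connect, "bubba", "cletus", method="preconnect")
pre_before = list(opened)
pre.fail_over()
pre_after = list(opened)

print("basic:", basic_before, "->", basic_after)
print("preconnect:", pre_before, "->", pre_after)
```

In the basic case the second connection appears only after the failure; in the preconnect case it exists from the start, which is the everyday overhead described above.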

Currently, TAF fails over standard SQL SELECT statements that were in flight when a node crashed. Other kinds of work are not resumed automatically.

The following do not fail over automatically and must be restarted after failover:

  • Transactional statements. Transactions involving INSERT, UPDATE, or DELETE statements are not failed over by TAF; uncommitted changes are rolled back, and the application must reissue them.
  • ALTER SESSION statements. ALTER SESSION and SQL*Plus SET statements do not fail over.

The following do not fail over and cannot be restarted:

  • Temporary objects. Transactions using temporary segments in the TEMP tablespace and global temporary tables do not fail over.
  • PL/SQL package states. PL/SQL package states are lost during failover.

Using Oracle RAC and TAF Together

The continuous availability features of Oracle RAC and TAF come together when these products cooperate in restarting failed transactions. Let's take a closer look at how this works.

Within each connected Oracle Net client, tnsnames.ora file parameters define the failover types and methods for that client. The parameters direct Oracle RAC and TAF on how to restart any transactions that may be in-flight during a hardware failure on the node.

It is important to note that TAF failover control is external to the Oracle RAC cluster, and each Oracle Net client may have unique failover types and methods, depending on processing requirements. The following is a client tnsnames.ora file entry for a node, including its current TAF failover parameters:

 

bubba.world =
  (DESCRIPTION_LIST =
    (FAILOVER = true)
    (LOAD_BALANCE = true)
    (DESCRIPTION =
      (ADDRESS =
        (PROTOCOL = TCP)
        (HOST = redneck)
        (PORT = 1521)
      )
      (CONNECT_DATA =
        (SERVICE_NAME = bubba)
        (SERVER = dedicated)
        (FAILOVER_MODE =
          (BACKUP=cletus)
          (TYPE=select)
          (METHOD=preconnect)
          (RETRIES=20)
          (DELAY=3)
        )
      )
    )
  )

 

The failover_mode section of the tnsnames.ora file lists the parameters and their values:

BACKUP=cletus. This names the Oracle Net alias of the backup instance that will take over failed connections when a node crashes. In this example, the primary service is bubba, and TAF will reconnect failed sessions to the cletus instance in case of server failure.

TYPE=select. This tells TAF to re-execute in-flight SELECT statements on the backup node and reposition each cursor, so the client can keep fetching from where it left off. (TYPE=session would only reconnect and would not track cursor state.)

METHOD=preconnect. This directs TAF to create two connections at connect time: one to the primary bubba instance and a backup connection to the cletus instance. In case of instance failure, the cletus connection is already open and ready to take over.

RETRIES=20. This directs TAF to retry a failover connection up to 20 times.

DELAY=3. This tells TAF to wait three seconds between connection retries.
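The way RETRIES and DELAY interact can be sketched as a simple retry loop. This is an illustration of the idea only, not how Oracle Net is implemented; `reconnect` and `flaky_connect` are hypothetical names, and the sleep function is injected so the example runs without actually waiting.

```python
import time


def reconnect(connect, retries=20, delay=3, sleep=time.sleep):
    """Try connect() up to `retries` times, sleeping `delay` seconds between tries."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < retries:
                sleep(delay)     # wait before the next attempt
    raise ConnectionError(f"failover gave up after {retries} attempts") from last_error


attempts = {"n": 0}
waits = []                       # records each delay instead of really sleeping


def flaky_connect():
    """Simulated backup node that is ready only on the third attempt."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("backup node not ready")
    return "connected"


result = reconnect(flaky_connect, retries=20, delay=3, sleep=waits.append)
print(result, attempts["n"], waits)
```

With the article's settings, a client that never manages to reconnect would wait 19 times at three seconds each, a little under a minute, before giving up.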

Remember, you must set these TAF parameters in every tnsnames.ora file on every Oracle Net client that needs transparent failover.
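Because the settings live on every client, it can help to check them mechanically. The sketch below pulls the FAILOVER_MODE settings out of the text of one tnsnames.ora entry; `failover_mode` is a hypothetical helper written for this example, not an Oracle utility, and it assumes simple `(KEY=value)` pairs like those shown above.

```python
import re

# Matches leaf pairs such as (TYPE=select); nested clauses are skipped.
LEAF = re.compile(r"\(\s*(\w+)\s*=\s*([^()\s]+)\s*\)")


def failover_mode(entry):
    """Return the FAILOVER_MODE settings of one tnsnames.ora entry as a dict.

    Returns an empty dict when the entry has no FAILOVER_MODE clause.
    """
    match = re.search(r"\(\s*FAILOVER_MODE\s*=", entry, re.IGNORECASE)
    if match is None:
        return {}
    depth = 0
    for pos in range(match.start(), len(entry)):
        if entry[pos] == "(":
            depth += 1
        elif entry[pos] == ")":
            depth -= 1
            if depth == 0:      # matching close of the FAILOVER_MODE clause
                clause = entry[match.start():pos + 1]
                return {key.upper(): value for key, value in LEAF.findall(clause)}
    raise ValueError("unbalanced parentheses in FAILOVER_MODE clause")


ENTRY = """
bubba.world =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = redneck)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = bubba)
      (FAILOVER_MODE =
        (BACKUP=cletus)(TYPE=select)(METHOD=preconnect)(RETRIES=20)(DELAY=3))))
"""

settings = failover_mode(ENTRY)
print(settings)
```

Running a check like this across client machines is one way to spot a client whose entry is missing the clause or has unbalanced parentheses.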

Putting It All Together

An Oracle Net client can be a single PC or a huge application server. In the architectures of giant Oracle RAC systems, each application server has a customized tnsnames.ora file that governs the failover method for all connections that are routed to that application server.

Watching TAF in Action

The transparency of TAF operation is a tremendous advantage to application users, but DBAs need to quickly see what has happened and where failover traffic is going, and they need to be able to get the status of failover transactions. To provide this capability, the Oracle data dictionary has several new columns in the V$SESSION view that give the current status of failover transactions.

The following query selects the new FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns of the V$SESSION view. Be sure to note that the query is restricted to nonsystem sessions, because Oracle data definition language (DDL) and data manipulation language (DML) are not recoverable with TAF.

 

select
   username,
   sid,
   serial#,
   failover_type,
   failover_method,
   failed_over
from
   v$session
where
   username not in ('SYS','SYSTEM','PERFSTAT')
and
   failed_over = 'YES';

 

You can run this script against the backup node after an instance failure to see those transactions that have been reconnected with TAF. Listing 1 shows a sample of the output. Remember, TAF will quickly redirect transactions, so you'll only see entries for a short period of time immediately after the failover.

As you can see in Listing 1, a backup node can have a variety of concurrent failover transactions, because the tnsnames.ora file on each Oracle Net client specifies the backup node, the failover type, and the failover method.
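When many clients fail over to the same backup node, a quick tally by type and method is easier to read than the raw rows. The sketch below assumes the rows have already been fetched with the query shown earlier, in the same column order; `summarize` is a hypothetical helper, and the sample rows are made up for illustration and are not real query output.

```python
from collections import Counter


def summarize(rows):
    """Count failed-over sessions per (failover_type, failover_method) pair.

    rows: iterable of (username, sid, serial, type, method, failed_over) tuples,
          in the column order of the query.
    """
    return Counter(
        (ftype, method)
        for _user, _sid, _serial, ftype, method, failed_over in rows
        if failed_over == "YES"
    )


# Made-up rows for illustration only; they are not real query output.
sample_rows = [
    ("APP1", 12, 101, "SELECT", "PRECONNECT", "YES"),
    ("APP1", 15, 230, "SELECT", "PRECONNECT", "YES"),
    ("APP2", 21, 77, "SESSION", "BASIC", "YES"),
    ("APP3", 30, 5, "SELECT", "BASIC", "NO"),
]

summary = summarize(sample_rows)
for (ftype, method), count in sorted(summary.items()):
    print(ftype, method, count)
```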

Conclusion

Oracle RAC, TAF, and Cache Fusion work together to guarantee continuous availability and infinite scalability. To summarize, here's a short description of each component:

Oracle RAC. The clustering component of Oracle that allows the creation of multiple, independent Oracle instances, all sharing a single database.

Cache Fusion. The shared RAM component of Oracle RAC that provides fast interchange of Oracle data blocks between SGA regions.

TAF. The failover method implemented on the Oracle Net client to restart in-flight transactions when a node crashes.

 


Copyright © 1996 -  2007 by Burleson Enterprises, Inc. All rights reserved.

Oracle® is the registered trademark of Oracle Corporation.

