Block Access, Grants, and Interrupts

Oracle RAC Cluster Tips by Burleson Consulting

This is an excerpt from the bestselling book Oracle Grid & Real Application Clusters.  To get immediate access to the code depot of working RAC scripts, buy it directly from the publisher and save more than 30%.


The Global Cache Service (GCS) maintains the status of resources and keeps an inventory of the access requests for data blocks. After a block is transferred from one instance to another to satisfy a request, the requesting process must be notified that the block is actually available. Interrupts serve this purpose: they signal the arrival of a block or the completion of a transfer. The GCS uses the following interrupts to manage resource allocation:

* Blocking Interrupt - When exclusive access is needed for a requestor, the GCS sends a blocking interrupt to a process that currently owns the shared resource, notifying it that a request for an exclusive resource is waiting.

* Acquisition Interrupt - When the requested access (e.g., exclusive) becomes available after an earlier access mode is released, the GCS sends an acquisition interrupt to alert the process that requested the exclusive resource.

* Block Arrival Interrupt - When a process requests a block from the GCS, the request is forwarded to the instance holding the block. The requested block is then sent to the requesting process, and that process informs the GCS that it has received the block. This notification is called a block arrival interrupt.

Block requests may be granted to many processes at the same time, but they follow a queuing mechanism. The GCS maintains two types of queues for resource requests:

* Convert Queue - If the GCS is unable to grant a resource request immediately, it places the request in the convert queue and tracks it there while it waits.

* Granted Queue - Once a resource is granted to the requesting process, the request is kept in the granted queue, where the GCS continues to track it.
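The interplay of the two queues and the interrupts can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Oracle code: the class `ToyResource`, its methods, the FIFO ordering of the convert queue, and the two access modes are all hypothetical names and simplifications chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

SHARED, EXCLUSIVE = "S", "X"  # simplified access modes (assumption)


@dataclass
class ToyResource:
    """Hypothetical model of one GCS-managed resource (not Oracle code)."""
    granted: list = field(default_factory=list)     # (process, mode) holders
    convert: deque = field(default_factory=deque)   # waiting (process, mode)
    interrupts: list = field(default_factory=list)  # log of (kind, process)

    def _compatible(self, mode):
        # Shared access coexists with shared holders; exclusive needs no holders.
        if not self.granted:
            return True
        return mode == SHARED and all(m == SHARED for _, m in self.granted)

    def request(self, process, mode):
        """Grant immediately if possible, otherwise park in the convert queue."""
        if not self.convert and self._compatible(mode):
            self.granted.append((process, mode))
            return True
        self.convert.append((process, mode))
        if mode == EXCLUSIVE:
            # Tell each current holder that an exclusive request is waiting.
            for holder, _ in self.granted:
                self.interrupts.append(("blocking", holder))
        return False

    def release(self, process):
        """Drop a holder, then grant waiting requests in FIFO order."""
        self.granted = [(p, m) for p, m in self.granted if p != process]
        while self.convert and self._compatible(self.convert[0][1]):
            waiter, mode = self.convert.popleft()
            self.granted.append((waiter, mode))
            # Tell the waiter that its requested access is now available.
            self.interrupts.append(("acquisition", waiter))


def demo():
    res = ToyResource()
    res.request("p1", SHARED)     # granted at once
    res.request("p2", SHARED)     # granted at once
    res.request("p3", EXCLUSIVE)  # queued; p1 and p2 get blocking interrupts
    res.release("p1")
    res.release("p2")             # p3 is granted and gets an acquisition interrupt
    return res
```

In this sketch, the exclusive request from `p3` waits in the convert queue until both shared holders release the resource, at which point it moves to the granted queue. Block arrival interrupts are left out because the model does not ship block contents between instances.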

Cache Fusion and Recovery

In a RAC system, when a node fails, the instance running on that node crashes and becomes unusable. There can be several reasons for such a failure. This section focuses on the changes that take place in the global cache and on how one of the surviving instances recovers the failed instance.

Recovery Features

Only the cache resources that reside on the failed nodes, or that are mastered by the GCS on the failed nodes, need to be rebuilt or re-mastered. Rebuilding or re-mastering does not mean reconstructing a block; only the lock ownership changes, as explained later with examples.

All resources previously mastered at the failed instance are redistributed across the remaining instances. These resources are reconstructed at their new master instance. All other resources previously mastered at surviving instances remain unaffected.
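As a rough illustration of this redistribution, the sketch below re-masters only the resources whose master was the failed instance and leaves every other mapping untouched. The function name `remaster`, the dictionary representation, and the round-robin placement are assumptions made for the example; the source does not describe how Oracle chooses the new master.

```python
def remaster(masters, failed, survivors):
    """Return a new resource -> master map after `failed` is lost.

    `masters` maps a resource name to the instance mastering it.
    Round-robin placement across `survivors` is an illustrative assumption.
    """
    if not survivors:
        raise ValueError("no surviving instances to take over mastership")

    # Resources mastered at surviving instances remain unaffected.
    new_masters = {r: m for r, m in masters.items() if m != failed}

    # Only resources mastered at the failed instance are redistributed.
    orphaned = sorted(r for r, m in masters.items() if m == failed)
    for i, resource in enumerate(orphaned):
        new_masters[resource] = survivors[i % len(survivors)]
    return new_masters


before = {"blk1": "inst1", "blk2": "inst2", "blk3": "inst3", "blk4": "inst2"}
after = remaster(before, failed="inst2", survivors=["inst1", "inst3"])
```

Here `blk1` and `blk3` keep their masters, while `blk2` and `blk4` are handed to `inst1` and `inst3`. Only the ownership entries change; no block contents are touched.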

The cluster manager first detects the node and instance failure. It communicates the failure status to the GCS by way of the LMON process. At this stage, any surviving instance in the cluster initiates the recovery process.

Remember, instance recovery does not include restarting the failed instance or recovering applications that were running on that instance. Also note that, even after a node failure and instance loss, the redo log file of the failed instance is still available to the other recovering instance, since the redo log file is located on the shared cluster file system or shared raw partition. This is an important feature of the RAC system.

Because of past images, instance recovery is performed differently in a RAC implementation. The SMON process of a surviving instance performs recovery of the failed instance or thread. By contrast, in a stand-alone instance the foreground process performs recovery.

Recovery Methodology and Steps

Oracle performs the following steps to recover:

1. In the initial phase of recovery, GES enqueues are reconfigured and the global resource directory is frozen. All GCS resource requests and writes are temporarily halted.

2. GCS resources are reconfigured among the surviving instances. One of the surviving instances becomes the recovering instance. The SMON process of the recovering instance starts a first pass of the redo log read of the failed instance's redo thread.

3. Block resources that need to be recovered are identified and the global resource directory is reconstructed. Pending requests or writes are cancelled or replayed.

4. Resources identified in the previous log read phase are defined as recovery resources. Buffer space for recovery is allocated.

5. Assuming that there are past images of blocks to be recovered in other caches in the cluster, source buffers are requested from other instances. The resource buffers are the starting point of recovery for a particular block.

6. All resources and enqueues required for subsequent processing have been acquired and the global resource directory is now unfrozen. Any data blocks that are not in recovery can now be accessed. At this time, the system is partially available.

7. SMON merges the redo threads, ordered by SCN, to ensure that changes are applied in an orderly fashion. This merge is important when multiple instances fail simultaneously. In that case, neither the past image (PI) buffers nor the current buffers for a data block may be found in any surviving instance's cache, so a log merge of the failed instances' redo is performed.

8. Now the second pass of recovery begins and redo is applied to data files, releasing the recovery resources immediately after block recovery, so that more and more blocks become available as cache recovery proceeds.

9. After all blocks have been recovered and recovery resources have been released, the system is available for normal use.
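The two redo passes and the SCN-ordered merge in the steps above can be sketched as follows. This is a simplified illustration under assumptions, not Oracle's implementation: the `Redo` record, the function names, and the in-memory `blocks` dictionary are hypothetical, each thread is assumed to be already sorted by SCN, and past images as a starting point are ignored.

```python
import heapq
from typing import NamedTuple


class Redo(NamedTuple):
    """Hypothetical redo record: SCN, affected block, and a change label."""
    scn: int
    block: str
    change: str


def first_pass(threads):
    """First pass: identify the recovery set, i.e. blocks named in the redo."""
    return {rec.block for thread in threads for rec in thread}


def second_pass(threads, blocks):
    """Second pass: apply redo in SCN order, releasing each block as soon
    as its last change has been applied."""
    recovery_set = first_pass(threads)
    # Merge the per-instance threads into one stream ordered by SCN.
    merged = list(heapq.merge(*threads, key=lambda rec: rec.scn))
    # Position of the final redo record for each block in the merged stream.
    last_index = {rec.block: i for i, rec in enumerate(merged)}

    released = []
    for i, rec in enumerate(merged):
        blocks.setdefault(rec.block, []).append(rec.change)
        if i == last_index[rec.block]:
            recovery_set.discard(rec.block)  # block is available again
            released.append(rec.block)
    return released, recovery_set


thread_a = [Redo(10, "b1", "insert row"), Redo(40, "b2", "delete row")]
thread_b = [Redo(20, "b2", "update row"), Redo(30, "b1", "update row")]
blocks = {}
released, remaining = second_pass([thread_a, thread_b], blocks)
```

With these two sample threads, the changes are applied in SCN order 10, 20, 30, 40. Block `b1` is released after SCN 30 and `b2` after SCN 40, which mirrors how more blocks become available as recovery proceeds.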

Figure 7.7 shows the basic steps in the recovery.

Figure 7.7: Online Instance Recovery Steps

 


This is an excerpt from the bestselling book Oracle Grid & Real Application Clusters, Rampant TechPress, by Mike Ault and Madhu Tumma.


http://www.rampant-books.com/book_2004_1_10g_grid.htm


 

 
Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
