Popular RAC Storage Options in 11g

Oracle Tips by Burleson Consulting

By Steve Karam, the world's youngest Oracle ACE and Oracle certified Master.

When one of my clients is working on a new RAC configuration, I'm guaranteed to receive tons of questions.  One of the most common is: what is the best storage option for RAC?

 

Despite the plethora of articles and information regarding storage options, most companies end up going with the advice of their storage vendor. 

In this article we will explore some of these options.

  • Raw storage, which is often demanded yet rarely used.  Popular misconceptions and difficult management make this a fading technology.

  • ASM, the new standard for RAC storage.  As a one-time skeptic of this technology, I have found myself consistently pleased with it.

  • Direct NFS, 11g's new networked storage product sure to excite users of NAS filers.

  • OCFS2, a cluster file system developed for Oracle RAC environments.

We will also discuss udev rules, a device management solution which replaces traditional methods in RHEL5.

RAC Using Raw Storage

Some Oracle files can be written to unformatted disk areas known as raw devices.

 

Note:  Some sources may also call these raw volumes, raw partitions, or raw disks.

 

The Oracle files which can be written to raw devices are:

  • OCR

  • Voting Disk

  • Datafiles

  • Redo Logs

  • Control File

  • SPFILE

It is worth noting why archive logs and RMAN backups do not make the "raw storage" list: raw devices cannot handle files that are created at runtime.

 

Given a partition with no filesystem, there are three available options: format the partition for a particular filesystem, use the partition in an ASM diskgroup (discussed later), or use the partition as a raw device on which a single file may be placed.

 

One reason behind the popularity of raw devices is performance.  In the past, raw devices were the only way by which a system could be set up to take advantage of Direct I/O (DIO); that is, I/O that bypasses the filesystem cache.  That is no longer the case: Direct I/O has been supported in the ext3 filesystem since Enterprise Linux 2.1, and support for enhanced Asynchronous I/O (AIO) with Direct I/O was added in Enterprise Linux 4, even when using an ext-based filesystem.  According to Red Hat, ext3 filesystem access with AIO and DIO can perform within 3% of raw I/O performance.  Direct I/O is also enabled when using OCFS/OCFS2.

Note:  The filesystemio_options parameter allows a DBA to direct how Oracle will perform I/O.  A setting of "directio" will allow Direct I/O access.  "asynch" allows Asynchronous I/O access.  "setall" allows both.  Consult your OS specific documentation to determine if your system is optimized for both DIO and AIO.
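The mapping described in the note can be sketched as a small helper. This is only an illustration in Python, not Oracle-supplied code: the function names are hypothetical, and the "none" value (neither mode enabled) is an addition that the note does not discuss. The generated statement assumes the parameter is changed in the spfile and picked up at the next instance restart.

```python
def filesystemio_setting(direct_io: bool, async_io: bool) -> str:
    """Return the filesystemio_options value for the requested I/O modes."""
    if direct_io and async_io:
        return "setall"      # both Direct I/O and Asynchronous I/O
    if direct_io:
        return "directio"    # Direct I/O only
    if async_io:
        return "asynch"      # Asynchronous I/O only
    return "none"            # assumption: neither mode requested


def alter_system_statement(direct_io: bool, async_io: bool) -> str:
    """Build the SQL a DBA might run; the change applies after a restart."""
    value = filesystemio_setting(direct_io, async_io)
    return f"ALTER SYSTEM SET filesystemio_options = '{value}' SCOPE = SPFILE;"


if __name__ == "__main__":
    print(alter_system_statement(direct_io=True, async_io=True))
```

Whether "setall" is actually beneficial still depends on the operating system and filesystem, as the note says.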

In Oracle 11g it is very common to find the OCR and Voting Disk of a RAC cluster on raw devices.  This is because those two files are 1) very small, 2) very static in size, and 3) cannot be placed in ASM.  However, according to Oracle MOSC (Oracle's support system), raw device support will be completely unavailable in Oracle 12g.  This may be due to the fact that raw devices have been declared obsolete in Linux since kernel version 2.6.3, and support for raw devices will soon be gone.

 

However, there is no need to fear this change.  Instead, it is only necessary to make room for a few changes in vocabulary.

 

Those used to using raw devices on Linux may get a shock when using Red Hat Enterprise Linux 5 (RHEL5) or Oracle Enterprise Linux 5 (OEL5), as they will not find the traditional raw device configuration.  As mentioned above, this support has been officially deprecated since kernel version 2.6.3.  However, it is still possible to configure a /dev/raw volume using udev rules.

 

In RHEL4 it was possible to simply place entries in /etc/sysconfig/rawdevices which mapped a block device (e.g. /dev/sda1) to a raw device (e.g. /dev/raw/raw1).  Using the "rawdevices" service, the mapping would take effect and the raw device under /dev/raw would be usable.

Note:  In a Windows environment, a raw device is simply a logical partition created in Disk Manager that is not formatted and has no drive letter.

In RHEL and OEL 5, entries must be made in the rules files under /etc/udev/rules.d.  "udev" is responsible for managing the /dev area in Linux, and udev rules determine how /dev will be presented.  Udev rules were also allowed in RHEL4, though not required.

 

While /bin/raw can be used to bind a block device to a raw device, /bin/raw binding alone is not meant to be a long term configuration.  One of the primary purposes of udev is to keep disk areas and naming consistent.

 

To create a udev rule that maps block device /dev/sda1 to raw device /dev/raw/raw1:

 

1.      Create a file called:  /etc/udev/rules.d/60-raw.rules

  • The numeric prefix may be any number greater than or equal to 60.

2.      Add the line: 

ACTION=="add", KERNEL=="sda1", RUN+="/bin/raw /dev/raw/raw1 %N"
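When several devices need bindings, the rule line can be generated rather than typed by hand. The following is a minimal sketch in Python; the function name `raw_binding_rule` is hypothetical, and it assumes raw devices are numbered /dev/raw/raw1, /dev/raw/raw2, and so on. Note that udev match keys use `==` while assignments such as `RUN` use `+=` or `=`, and values are enclosed in double quotes.

```python
def raw_binding_rule(block_device: str, raw_number: int) -> str:
    """Render a udev rule that binds a block device to /dev/raw/raw<N>."""
    if raw_number < 1:
        raise ValueError("raw device numbers start at 1")
    # udev matches on the kernel name, i.e. the device path without /dev/
    kernel_name = block_device.split("/")[-1]
    return (
        f'ACTION=="add", KERNEL=="{kernel_name}", '
        f'RUN+="/bin/raw /dev/raw/raw{raw_number} %N"'
    )


if __name__ == "__main__":
    # Device names below are examples only.
    for number, device in enumerate(["/dev/sda1", "/dev/sdb1"], start=1):
        print(raw_binding_rule(device, number))
```

The printed lines would then be placed in the rules file created in step 1.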


Despite this ability, in Oracle 11g there is really no point in creating /dev/raw devices unless it is being done for comfort value.  This is because in Oracle 10.2.0.2 and up, block devices are accessible by Oracle using the O_DIRECT flag, meaning they are able to perform direct I/O without using the rawio interface.  OUI and ASMlib will both accept a block device (e.g. /dev/sda1) as input for file placement in Oracle 11g on Linux.

Note:  In 10gR2, even though Oracle allowed block devices to be used in 10.2.0.2 and up, OUI was not able to handle a block device name.  Instead, symbolic links had to be created to map the block device to a different name under /dev.  While effective, this does not follow the udev rules.

Even with direct support for block devices, in order to configure a block device for Oracle's use udev rules must still be created.  Since udev manages the /dev area, the rules will need to state ownership of your block devices in order to grant Oracle the permissions necessary to use them.

 

1.      Edit the file:  /etc/udev/rules.d/50-udev.rules

2.      At the bottom of the file, add the new rules in the following format:

 

KERNEL=="blockdevicename", OWNER="deviceowner", GROUP="devicegroup", MODE="4digitpermissions"

  • blockdevicename is the name of the block device.  For instance, if the device is listed as /dev/sda1, the block device name is sda1.

  • deviceowner should be set to the name of the OS user that will own the block device.  For instance, if the device is going to be used for placement of the OCR, root should be the owner.  For the voting disk or ASM disks, oracle should be the owner.

  • devicegroup should be set to the name of the group which owns the block device.  This will usually be oinstall or dba.

  • 4digitpermissions should be set to the permissions mask of the block device.  For the OCR and ASM devices this will be 0640.  For the Voting Disk it will usually be 0644.
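To make the four placeholders concrete, here is a small sketch in Python that fills in the rule format and rejects a malformed permissions mask. The function name `permission_rule` and the device names sdb1 and sdc1 are hypothetical; the owners, group, and modes follow the guidance in the list above.

```python
import re


def permission_rule(kernel_name: str, owner: str, group: str, mode: str) -> str:
    """Render a udev rule that sets ownership and permissions on a block device."""
    # The mode must be a four-digit octal mask such as 0640.
    if not re.fullmatch(r"[0-7]{4}", mode):
        raise ValueError(f"mode must be four octal digits, got {mode!r}")
    return (
        f'KERNEL=="{kernel_name}", OWNER="{owner}", '
        f'GROUP="{group}", MODE="{mode}"'
    )


if __name__ == "__main__":
    # Example layout (device names are assumptions): OCR on sdb1, voting disk on sdc1.
    print(permission_rule("sdb1", "root", "oinstall", "0640"))
    print(permission_rule("sdc1", "oracle", "oinstall", "0644"))
```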

It is important to note that even though Oracle is writing to a block device instead of a raw device, this is still technically raw storage.  Instead of using the rawio interface, a direct interface to the block device has been provided by the Linux kernel and Oracle.

 

The limitations of raw are still a factor when writing to block devices.  Only one file may be present on any single block device when the device is unformatted.  As such, raw devices are usually not recommended for most Oracle files.

 

Creating the OCR and Voting Disk on block devices is a popular option, as the only other storage method available is a cluster file system such as OCFS2 which presents yet another layer of dependency for an Oracle installation.  For datafiles, control files, SPFILEs, redo logs, RMAN backups, and archive logs, ASM is the new de facto standard for RAC storage.

RAC Using Automatic Storage Management (ASM)

ASM was introduced in Oracle 10g, and is widely used in both RAC and single instance environments.  Oracle created ASM as a way for DBAs to simplify their storage options, especially when using a cluster environment.

 

ASM will work directly with block devices and provide a combination of software RAID and volume management specifically built for Oracle files.  However, files stored within ASM are not available to the operating system without the use of special tools.

 

This means that ASM volumes (called 'diskgroups') cannot simply be mounted at the OS level and browsed, copied, edited, or otherwise managed.  However, a whole host of commands have been created which can be performed through SQL*Plus.  Tools such as ASMlib and ASMCMD simplify management of files inside of ASM volumes.  For example, ASMCMD allows an ASM volume to be browsed as if it were a standard filesystem:

 

bash-3.00$ asmcmd
ASMCMD> ls -ltr
State    Type    Rebal  Name
MOUNTED  EXTERN  N      DATA/
ASMCMD> cd DATA
ASMCMD> ls -ltr
Type  Redund  Striped  Time             Sys  Name
                                        Y    RACDB/
ASMCMD> cd RACDB
ASMCMD> ls -ltr
Type           Redund  Striped  Time             Sys  Name
                                                 Y    CONTROLFILE/
                                                 Y    DATAFILE/
                                                 Y    ONLINELOG/
                                                 Y    PARAMETERFILE/
                                                 Y    TEMPFILE/
                                                 N    spfileRACDB.ora => +DATA/RACDB/PARAMETERFILE/spfile.269.679922899
ASMCMD> cd DATAFILE
ASMCMD> ls -ltr
Type      Redund  Striped  Time             Sys  Name
DATAFILE  UNPROT  COARSE   JAN 15 11:00:00  Y    SYSTEM.260.679921433
DATAFILE  UNPROT  COARSE   JAN 15 11:00:00  Y    UNDOTBS1.263.679921435
DATAFILE  UNPROT  COARSE   JAN 15 11:00:00  Y    UNDOTBS2.259.679922629
DATAFILE  UNPROT  COARSE   JAN 15 11:00:00  Y    USERS.267.679921435
DATAFILE  UNPROT  COARSE   JAN 15 13:00:00  Y    SYSAUX.268.679921433
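The system-generated names in the listing above appear to follow the pattern +diskgroup/database/file_type/tag.file_number.incarnation. As a rough sketch, the Python below splits such a name into its parts. The class and function names are hypothetical, and the parser only handles fully qualified names of this four-level shape, not aliases such as spfileRACDB.ora.

```python
from typing import NamedTuple


class AsmFileName(NamedTuple):
    diskgroup: str
    database: str
    file_type: str
    tag: str
    file_number: int
    incarnation: int


def parse_asm_file_name(name: str) -> AsmFileName:
    """Split a fully qualified ASM file name into its components."""
    if not name.startswith("+"):
        raise ValueError("fully qualified ASM file names start with '+'")
    # Expect exactly: diskgroup / database / file type / leaf name
    diskgroup, database, file_type, leaf = name[1:].split("/")
    # The leaf is tag.file_number.incarnation; split from the right
    tag, file_number, incarnation = leaf.rsplit(".", 2)
    return AsmFileName(
        diskgroup, database, file_type, tag, int(file_number), int(incarnation)
    )


if __name__ == "__main__":
    print(parse_asm_file_name("+DATA/RACDB/DATAFILE/SYSTEM.260.679921433"))
```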

  

In addition, ASM gives storage administrators and DBAs the option to add or remove disks from the configuration as needed, allowing easy scalability at the storage level while remaining online.  This level of granularity was previously not possible with most Logical Volume Managers (LVMs).  Enhanced striping is available as well, allowing a database to stripe not only across multiple disks, but multiple trays and storage arrays.

 

Block devices at the OS level are recognized by ASM as 'ASM disks.'  Even if a twelve-disk RAID 10 volume is presented to ASM, ASM still considers it a single disk.  ASM disks can then be added to ASM diskgroups, whose names take the form '+NAME'.  The plus sign (+) is used when referring to an ASM diskgroup and when creating files inside of one.  For example:

 

SQL> create tablespace ASM_TBS datafile '+DATA' size 100M;

 

Information about ASM Disks can be found in the V$ASM_DISK view, while information about diskgroups can be found in V$ASM_DISKGROUP.

 

When multiple disks are added to an ASM diskgroup, ASM will automatically rebalance data between the disks in the diskgroup.  For instance, if a shelf of 14 disks is made into a single RAID 10 volume (seven mirrored pairs, striped), and another four disks are made into a second RAID 10 volume, it would be possible to combine the two into an ASM diskgroup.  ASM will rebalance the data across both volumes to optimize I/O throughput.  Additionally, ASM adds no overhead to standard raw device I/O; as a result, ASM works at 'the speed of raw'.

RAC Using NFS with Direct NFS (DNFS)

Oracle 11g comes with enhanced support for Oracle storage over the network using the new Direct NFS feature.  Direct NFS allows for cost savings by sticking with one connection model: the network.  This allows for multipathing and unified storage.  In addition, Direct NFS works on Windows, even though Windows has no NFS support of its own.

 

This support is possible because DNFS is not NFS.  It is a completely new network storage model built specifically for and within Oracle.  DNFS takes the fundamentals of NFS and strips away much of the overhead (such as data cache copying between user and kernel space), adding features specifically required for the enterprise Oracle database.

 

The result is a storage method that is convenient for Oracle shops who prefer NAS devices or those who are simply browsing for a new platform.  DNFS allows for Direct I/O and Asynchronous I/O out of the box, and provides a familiar filesystem environment for storage administrators and DBAs.  Whereas Oracle over NFS was a possible option, Oracle over DNFS is more viable in resource intensive environments.

Note:  Oracle 11g Direct NFS only works with NFS V3 compatible NAS devices.

To use DNFS, the Direct NFS Client which ships with Oracle 11g must be configured on all necessary nodes.

 

On Linux, this can be done in a few easy steps:

 

  • Mount the NFS filesystem so that it appears in /etc/mtab, or add your mount point details to $ORACLE_HOME/dbs/oranfstab

  • Shut down the database (all nodes in a RAC environment)

  • cd $ORACLE_HOME/lib

  • mv libodm11.so libodm11.so.old

  • ln -s libnfsodm11.so libodm11.so

  • Start your database
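For the first step, an oranfstab entry names the filer, one or more network paths to it, and the export-to-mount-point pairs. The Python below is a hedged sketch that renders such an entry; the function name is hypothetical, and the server name, IP addresses, and directories are made-up examples. The limit of four paths per server reflects my understanding of the Direct NFS Client and should be checked against the installation guide for your release.

```python
from typing import List, Tuple


def oranfstab_entry(server: str, paths: List[str],
                    exports: List[Tuple[str, str]]) -> str:
    """Render one oranfstab entry: server, network paths, export/mount pairs."""
    if not 1 <= len(paths) <= 4:
        # Assumption: Direct NFS accepts between one and four paths per server.
        raise ValueError("specify between one and four network paths")
    if not exports:
        raise ValueError("at least one export/mount pair is required")
    lines = [f"server: {server}"]
    lines += [f"path: {path}" for path in paths]
    lines += [f"export: {export} mount: {mount}" for export, mount in exports]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are illustrative only.
    print(oranfstab_entry(
        "filer1",
        ["192.168.10.11", "192.168.20.11"],
        [("/vol/oradata1", "/u02/oradata1")],
    ), end="")
```

Listing more than one path is what allows the client to spread I/O across network connections, which is the multipathing benefit mentioned earlier in this section.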

 

While DNFS has not yet stood the test of several versions, it is quickly emerging in the Oracle world.  While RAC environments sharing storage with a SAN are more likely to use ASM for management, environments using or considering NAS filers can benefit quickly and easily from DNFS support.  Additionally, it is possible to blend DNFS with ASM in order to stripe across multiple filers if necessary.

OCFS2

OCFS2 is a shared-disk cluster file system (CFS) available for Linux which provides a shared environment for RAC.

 

Previous releases of OCFS were incapable of storing standard non-Oracle files, a level of support some DBAs found inconvenient.  OCFS2 is much more robust and offers not only Oracle shared storage, but standard filesystem capabilities which provide clustering for a wide range of server needs such as webservers, mailservers, and file servers.

 

As noted on the project page at http://oss.oracle.com/projects/ocfs2, OCFS2 offers some notable features associated with complex filesystems such as:

  • Variable Block sizes

  • Flexible Allocations (extents, sparse, unwritten extents with the ability to punch holes)

  • Journaling (ordered and writeback data journaling modes)

  • Endian and Architecture Neutral (x86, x86_64, ia64 and ppc64)

  • In-built Clusterstack with a Distributed Lock Manager

  • Support for Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os

  • Comprehensive Tools support

 

OCFS2 installation is as simple as installing RPMs on your Linux server.  As of the time of this writing, OCFS2 is at version 1.4, with three specific RPM files required for installation:

  • ocfs2-tools

  • ocfs2console

  • ocfs2

Once installed, OCFS2 can be configured manually or using OCFS2 console, pictured below:

 

Example OCFS2 Console screen.  Source http://oss.oracle.com

 

The benefits of OCFS2 are much the same as DNFS except that it does not require Network Attached Storage.  An OCFS2 filesystem is usable as a shared environment for RAC while providing standard filesystem capabilities and commands along with high performance through low-overhead DIO and AIO.  For users who wish to use their SAN for RAC but require the use of filesystem commands such as 'ls', 'cp', 'mv', et al., OCFS2 is a viable alternative to ASM.

Conclusion

There are many options for data storage in a RAC environment.  This is much different from the days when the only options were raw volumes or third party cluster file systems. 

 

With these options, it is possible for the DBA and System Administrators to work together to find an optimal environment for their RAC cluster.  Between ASM, OCFS2, and DNFS, Oracle offers high performance solutions for any need: ASM and OCFS2 for direct attached storage, depending upon the need for filesystem access; DNFS for network attached storage requiring high performance; raw storage for required files such as the OCR and Voting Disk; or a combination of all of these technologies to suit the needs of the environment.

 
   


 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
