RAC Filesystem Options

By Natalka Roshak

DBAs wanting to create a 10g Real Application Clusters (RAC) database face many configuration decisions. One of the more confusing decisions involves the choice of filesystems. Gone are the days when DBAs simply had to choose between "raw" and "cooked". DBAs setting up a 10g RAC can still choose raw devices, but they also have several filesystem options, and these options vary considerably from platform to platform. Further, some storage options cannot be used for all the files in the RAC setup. This article gives an overview of the RAC storage options available.

RAC Review

Let's begin by reviewing the structure of a Real Application Clusters database. Physically, a RAC consists of several nodes (servers) connected to each other by a private interconnect. The database files are kept on a shared storage subsystem, where they're accessible to all nodes. Each node also has a public network connection.

In terms of software and configuration, the RAC has three basic components: cluster software and/or Cluster Ready Services, database software, and a method of managing the shared storage subsystem.

  • The cluster software can be vendor-supplied or Oracle-supplied, depending on platform. Cluster Ready Services, or CRS, is a new feature in 10g. Where vendor clusterware is used, CRS interacts with the vendor clusterware to coordinate cluster membership information; without vendor clusterware, CRS, which is also known as Oracle OSD Clusterware, provides complete cluster management.

  • The database software is Oracle 10g with the RAC option, of course.

  • Finally, the shared storage subsystem can be managed by one of the following options: raw devices; Automatic Storage Management (ASM); a vendor-supplied cluster file system (CFS); Oracle Cluster File System (OCFS); a vendor-supplied logical volume manager (LVM); or Network File System (NFS) on a certified Network Attached Storage (NAS) device.

Storage Options

Let me clarify the foregoing alphabet soup with a table:

Table 1. Storage options for the shared storage subsystem.

Abbrev.   Storage Option
Raw       Raw devices, no filesystem
ASM       Automatic Storage Management
CFS       Cluster File System
OCFS      Oracle Cluster File System
LVM       Logical Volume Manager
NFS       Network File System (must be on certified NAS device)

Before I delve into each of these storage options, a word about file types. A regular single-instance database has three basic types of files: database software and dump files; datafiles, spfile, control files and log files, often referred to as "database files"; and it may have recovery files, if using RMAN. A RAC database has an additional type of file referred to as "CRS files". These consist of the Oracle Cluster Registry (OCR) and the voting disk.

Not all of these files have to be on the shared storage subsystem. The database files and CRS files must be accessible to all instances, so must be on the shared storage subsystem. The database software can be on the shared subsystem and shared between nodes; or each node can have its own ORACLE_HOME. The flash recovery area must be shared by all instances, if used.

Some storage options can't handle all of these file types. To take an obvious example, the database software and dump files can't be stored on raw devices. This isn't important for the dump files, but it does mean that choosing raw devices precludes having a shared ORACLE_HOME on the shared storage device.

And to further complicate the picture, no OS platform is certified for all of the shared storage options. For example, only Linux and SPARC Solaris are supported with NFS, and the NFS must be on a certified NAS device. The following table spells out which platforms and file types can use each storage option.

Table 2. Platforms and file types able to use each storage option

Storage option         Platforms                             File types supported      File types not supported
Raw                    All platforms                         Database, CRS             Software/Dump files, Recovery
ASM                    All platforms                         Database, Recovery        CRS, Software/Dump files
Certified Vendor CFS   AIX, HP Tru64 UNIX, SPARC Solaris     All                       None
LVM                    HP-UX, HP Tru64 UNIX, SPARC Solaris   All                       None
OCFS                   Windows, Linux                        Database, CRS, Recovery   Software/Dump files
NFS                    Linux, SPARC Solaris                  All                       None

(Note: Mike Ault and Madhu Tumma have summarized the storage choices by platform in more detail in this excerpt from their recent book, Oracle 10g Grid Computing with RAC, which I used as one source for this table.)
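
For readers who prefer to query Table 2 rather than scan it, here is a minimal Python sketch that encodes the table as plain data. The names `STORAGE_OPTIONS` and `options_for`, and the short file-type labels ("database", "crs", "recovery", "software"), are illustrative assumptions, not part of any Oracle tool. The data reflects only the 10g-era table above, so check the current certification matrices before relying on it.

```python
# Table 2 encoded as data. "software" stands for software/dump files.
ALL_FILE_TYPES = {"database", "crs", "recovery", "software"}
ANY_PLATFORM = None  # marker meaning "all platforms"

# option name -> (supported platforms, supported file types)
STORAGE_OPTIONS = {
    "Raw": (ANY_PLATFORM, {"database", "crs"}),
    "ASM": (ANY_PLATFORM, {"database", "recovery"}),
    "Vendor CFS": ({"AIX", "HP Tru64 UNIX", "SPARC Solaris"}, ALL_FILE_TYPES),
    "LVM": ({"HP-UX", "HP Tru64 UNIX", "SPARC Solaris"}, ALL_FILE_TYPES),
    "OCFS": ({"Windows", "Linux"}, {"database", "crs", "recovery"}),
    "NFS": ({"Linux", "SPARC Solaris"}, ALL_FILE_TYPES),
}


def options_for(platform, file_type):
    """Return the storage options from Table 2 that can hold file_type on platform."""
    matches = []
    for name, (platforms, file_types) in STORAGE_OPTIONS.items():
        platform_ok = platforms is ANY_PLATFORM or platform in platforms
        if platform_ok and file_type in file_types:
            matches.append(name)
    return matches


print(options_for("Linux", "crs"))       # ['Raw', 'OCFS', 'NFS']
print(options_for("Linux", "software"))  # ['NFS']
```

On Linux, for example, the lookup shows that only NFS can hold a shared ORACLE_HOME, while CRS files can go on raw devices, OCFS or NFS.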

Now that we have an idea of where we can use these storage options, let's examine each option in a little more detail. We'll tackle them in order of Oracle's recommendation, starting with Oracle's least preferred, raw devices, and finishing up with Oracle's top recommendation, ASM.

Raw devices

Raw devices need little explanation. As with single-instance Oracle, each datafile requires its own raw partition, so every tablespace needs at least one. You will also need to store your software and dump files elsewhere.

Pros: You won't need to install any vendor or Oracle-supplied clusterware or additional drivers.
Cons: You won't be able to have a shared Oracle home, and if you want to configure a flash recovery area, you'll need to choose another option for it. Manageability is an issue. Further, raw devices are a terrible choice if you expect to resize or add tablespaces frequently, as this involves resizing or adding a partition.

NFS

NFS also requires little explanation. It must be used with a certified NAS device; Oracle has certified a number of NAS filers with its products, including products from EMC, HP, NetApp and others. NFS on NAS can be a cost-effective alternative to a SAN for Linux and Solaris, especially if no SAN hardware is already installed.

Pros: Ease of use and relatively low cost.
Cons: Not suitable for all deployments. Analysts recommend SANs over NAS for large-scale transaction-intensive applications, although there's disagreement on how big is too big for NAS.

Vendor CFS and LVMs

If you're considering a vendor CFS or LVM, you'll need to check the 10g Real Application Clusters Installation Guide for your platform and the Certify pages on MetaLink. A discussion of all the certified cluster file systems is beyond the scope of this article. Pros and cons depend on the specific solution, but some general observations can be made:

Pros: You can store all types of files associated with the instance on the CFS / logical volumes.
Cons: Depends on CFS / LVM. And you won't be enjoying the manageability advantage of ASM.

OCFS

OCFS is the Oracle-supplied CFS for Linux and Windows. This is the only CFS that can be used with these platforms. The current version of OCFS was designed specifically to store RAC files, and is not a full-featured CFS. You can store database, CRS and recovery files on it, but it doesn't fully support generic filesystem operations. Thus, for example, you cannot install a shared ORACLE_HOME on an OCFS device.

The next version of OCFS, OCFS2, is currently out in beta version and will support generic filesystem operations, including a shared ORACLE_HOME.

Pros: Provides a CFS option for Linux and Windows.
Cons: Cannot store regular filesystem files such as Oracle software. Easier to manage than raw devices, but not as manageable as NFS or ASM.

ASM

Oracle recommends ASM for 10g RAC deployments, although CRS files cannot be stored on ASM. In fact, RAC installations using Oracle Database Standard Edition must use ASM.

ASM is a little bit like a logical volume manager and provides many of the benefits of LVMs. But it also provides benefits LVMs don't: file-level striping/mirroring, and ease of manageability. Instead of running LVM software, you run an ASM instance, a new type of "instance" that largely consists of processes and memory and stores its information in the ASM disks it's managing.

Pros: File-level striping and mirroring; ease of manageability through Oracle syntax and OEM.
Cons: ASM files can only be managed through an Oracle application such as RMAN. This can be a weakness if you prefer third-party backup software or simple backup scripts. Cannot store CRS files or database software.

Conclusion

We've seen that there's an array of storage options for the shared storage device in your RAC. These options depend on your platform, and many options don't store all types of database files, meaning they have to be used in conjunction with another option. For example, a DBA wanting to use ASM to store database files might take a 12-disk SAN, create 11 ASM disks for the database files and flash recovery area, leave the 12th disk raw and store CRS files on it, and maintain separate ORACLE_HOMEs on the non-shared disks on each node.

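Table 3. Sample disk configuration with ASM

Location                         Disk #   Storage              Contents
Shared storage subsystem (SAN)   1-11     ASM disks            DB files; flash recovery
Shared storage subsystem (SAN)   12       Raw                  CRS files
Each node (internal disks)       1        Local (non-shared)   OS files
Each node (internal disks)       2        Local (non-shared)   ORACLE_HOME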
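
The reasoning behind a mixed layout like this one can be sketched as a small check: every file type needs a home, and each storage option only accepts certain file types. The Python below is an illustration under stated assumptions. `SUPPORTS`, `REQUIRED` and `check_layout` are hypothetical names, "Local" stands for the non-shared disks on each node, and the recovery area is treated as optional because the article says it must be shared only "if used".

```python
# File types each option can store, taken from Table 2 of this article.
# "Local" (per-node, non-shared disks) is an assumption added for ORACLE_HOME.
SUPPORTS = {
    "Raw": {"database", "crs"},
    "ASM": {"database", "recovery"},
    "OCFS": {"database", "crs", "recovery"},
    "NFS": {"database", "crs", "recovery", "software"},
    "Local": {"software"},
}

# File types every RAC needs somewhere; "recovery" is checked only if assigned.
REQUIRED = {"database", "crs", "software"}


def check_layout(layout):
    """layout maps file type -> storage option; return a list of problems found."""
    problems = []
    for file_type in sorted(REQUIRED | set(layout)):
        option = layout.get(file_type)
        if option is None:
            problems.append(f"{file_type}: no storage assigned")
        elif file_type not in SUPPORTS.get(option, set()):
            problems.append(f"{file_type}: {option} cannot store this file type")
    return problems


# The sample configuration: ASM for database and recovery files,
# one raw disk for CRS files, local disks for ORACLE_HOME.
sample = {"database": "ASM", "recovery": "ASM", "crs": "Raw", "software": "Local"}
print(check_layout(sample))  # []

# Putting CRS files on ASM is flagged, matching the ASM restriction in the article.
print(check_layout({**sample, "crs": "ASM"}))
```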

When weighing the shared storage device options for your platform, start with the Oracle Real Application Clusters Installation and Configuration Guide for your platform, available from OTN. Section III has a platform-specific discussion of storage options. Be sure to check the certification matrices on MetaLink as well.

Comments

You missed one of the more complete cluster file systems: PolyServe File System (supports RAC on Windows, SuSE and Red Hat).

Otherwise quite informative.

John -- PolyServe does indeed support RAC, but Oracle has not certified RAC on this file system for any platform. The article only discusses Oracle-certified configurations.

Hey,

We are evaluating PolyServe with Oracle RAC on Linux, but unfortunately I have not found much information about it.
Can anyone point me to some documentation?
Also, a link to an Oracle statement that PolyServe is (not) supported would be nice. I have not found anything so far.

Thanks,
Thorsten

How would each of these options handle hyper volumes and meta volumes? Would ASM know the difference between a hyper volume and a disk spindle? Do any of these options know about head contention?

With specific reference to NFS, I have noticed that it is considered a "valid storage option" for Oracle RAC on SPARC Solaris servers. Would you say that NFS, used with a certified NAS device, can also be an alternative to a SAN in an Oracle 9i RAC installation?
Thanks a lot
Luciano

Hi,

We are testing Oracle 10gR2 RAC with ASM using NetApp NFS volumes on RHEL4 Update 4. We are facing an intermittent issue: when we create the ASM instance from node 1, the diskgroups get mounted on node 1, but although the ASM instance is successfully created on node 2, it doesn't mount the diskgroups there.

Can anyone help with this? To use NFS as a block device, we created a loop device for the NFS volume using the losetup command, and that loop device is then used to create an ASM disk using ASMLib.

Regards.
Raj

Very useful summary of the disk storage options, thank you. Why can't Oracle themselves give this kind of information as a primer?

I am using an EMS shared device as NFS to keep the OCR and voting disks on an HP platform. Everything works, but while running vipca and while configuring ONS I get this error:

onsctl start
Failed to get IP for localhost (0)
Failed to get IP for localhost (0)
Failed to get IP for localhost (0)
onsctl: ons failed to start

Any idea on this?

Swamy

Nice explanation. Thanks.

As of now it appears like Oracle doesn't support any third party vendor CFS (as per Metalink's certification matrix). So, basically no other option than using RAW devices or ASM on Solaris (NFS is not an option for us).

The comments on this thread regarding certification of PolyServe for RAC on Linux are missing one significant fact: until Feb 2006 there was no certification program for 3rd party CFS on Linux. As soon as Oracle came up with a certification program, we went through it and are indeed certified. We have a significant number of Global 2000 accounts happily using the product for Oracle HA (consolidation) and, of course, RAC.

The URL that introduces the new (Feb 2006) Third Party Cluster Filesystems and Clusterware Compatibility Test Program is:
http://www.oracle.com/technology/software/oce/oce_fact_sheet.htm

Certify status for PolyServe is:

http://www.oracle.com/technology/products/database/clustering/certify/tech_linux_x86.html

If you don't like Raw disk (ASM or otherwise), there are choices.

I am an Oracle DBA and mostly depend on the Internet to refresh my knowledge and keep myself up to date. I frequently visit asktom and this site. I am very much impressed by the blogs from Natalka Roshak. All her posts are superb. I want to thank her and encourage her to keep up this work as assistance and reference for Oracle professionals like me.

Thanks and best regards.

(SHAHID FAROOQ)
ORACLE OCP DBA 8i, 9i and 10g

Can I use network attached storage (not SAN) for the Oracle Cluster Registry (OCR) and voting disk?

Can I use Oracle Cluster File System (OCFS) or Network File System (NFS) on Windows?

If so, please explain.

Can't we use NFS for flash recovery area in RAC system?