RE: OCR / VD external vs. normal redundancy using NFS.

From: D'Hooge Freek <Freek.DHooge_at_uptime.be>
Date: Thu, 15 Jul 2010 07:40:18 +0200
Message-ID: <4814386347E41145AAE79139EAA3989810265B2046_at_ws03-exch07.iconos.be>



David,

I think the ownership in the udev script did not work because your script only creates a symlink, not the actual block device.

I have used the udev script below to manage multipath LUNs (named via multipath.conf). It checks whether the device is a multipath block device (dm-) and whether its name starts with "asm_" (the alias given by multipath). If so, a block device is created in /dev/oracle/ with the same name multipath assigned, owned by user grid and group dba with mode 0660. The same is done for the partitions.

I also specified that no other rules can be applied to the devices I created.

SUBSYSTEM!="block", GOTO="end_oracle"
KERNEL!="dm-[0-9]*", GOTO="end_oracle"
PROGRAM!="/sbin/mpath_wait %M %m", GOTO="end_oracle"
ACTION=="add", RUN+="/sbin/dmsetup ls --target multipath --exec '/sbin/kpartx -a -p p' -j %M -m %m"

PROGRAM=="/sbin/dmsetup ls --target multipath --exec /bin/basename -j %M -m %m", RESULT=="asm_*", NAME="oracle/%c", OWNER="grid", GROUP="dba", MODE="0660", OPTIONS="last_rule"
PROGRAM!="/bin/bash -c '/sbin/dmsetup info -c --noheadings -j %M -m %m | /bin/grep -q .*:.*:.*:.*:.*:.*:.*:part[0-9]*-mpath-'", GOTO="end_oracle"
PROGRAM=="/sbin/dmsetup ls --target linear --exec /bin/basename -j %M -m %m", RESULT=="asm_*", NAME="oracle/%c", OWNER="grid", GROUP="dba", MODE="0660", OPTIONS="last_rule"
LABEL="end_oracle"

Regards,  

Freek D'Hooge
Uptime
Oracle Database Administrator
email: freek.dhooge_at_uptime.be
tel +32(0)3 451 23 82
http://www.uptime.be
disclaimer: www.uptime.be/disclaimer

-----Original Message-----
From: David Robillard [mailto:david.robillard_at_gmail.com]
Sent: Wednesday, July 14, 2010 23:24
To: Andrew Kerber
Cc: D'Hooge Freek; oracle-l mailing list; LS Cheng
Subject: Re: OCR / VD external vs. normal redundancy using NFS.

Hello Andrew,

On Wed, Jul 14, 2010 at 4:33 PM, Andrew Kerber <andrew.kerber_at_gmail.com> wrote:
> I would love to see your udev script.

It's based on two scripts built for a RedHat Linux machine. The first is a simple udev(7) configuration file which creates the desired symbolic links to the iSCSI devices; see this RedHat Knowledge Base article on how it works [1]. That provides the same kind of persistent naming that ASMLib offers.

The second one sets the permissions so that the grid user, and hence ASM, can access the disks.

The key to getting persistent naming to work is to obtain the LUN ID of each LUN. If you have a Sun Unified Storage 7000 unit and your project is called, say, "oracle.asm", you can list them like this (note that you need to prepend the digit "3" to each LUN ID for udev(7) to match; scsi_id reports these NAA-type identifiers with that leading "3"):

nas:> shares select oracle.asm
nas:> ls

Or you can get them one by one by running this (assuming your iSCSI disk is /dev/sdar, for instance):

sudo scsi_id -gus /block/sdar # Notice that you need to use "/block" instead of "/dev" for this to work!

That will return the LUN ID, such as
"3600144f0aa6313ac00004c3dd997000d". That's what you use in the udev script. Do this for all iSCSI LUNs you have and then create the /etc/udev/rules.d/20-names.rules file with this:

<20-names.rules>

# /etc/udev/rules.d/20-names.rules
#
# $Id: 20-names.rules,v 1.1 2010/07/14 20:59:11 drobilla Exp drobilla $
#
# Persistent naming rules for iSCSI devices.
# Don't forget to create a partition on all iSCSI disks before they
# are included in ASM.
#
# IMPORTANT: All disks within the same ASM disk group *MUST BE OF EQUAL SIZE!*
#
# David Robillard, July 9th, 2010.

##
# ASM disk group +CRS disks must be 1 GB in size.
##

# /dev/iscsi/crs1.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c3dd997000d", SYMLINK+="iscsi/crs1p%n"

# /dev/iscsi/crs2.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c3dd9ac000e", SYMLINK+="iscsi/crs2p%n"

# /dev/iscsi/crs3.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c3dd9c3000f", SYMLINK+="iscsi/crs3p%n"

##
# ASM disk group +DATA disks must be 15 GB in size.
##

# /dev/iscsi/data1.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379e750003", SYMLINK+="iscsi/data1p%n"

# /dev/iscsi/data2.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379e8e0004", SYMLINK+="iscsi/data2p%n"

# /dev/iscsi/data3.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379ea30005", SYMLINK+="iscsi/data3p%n"

# /dev/iscsi/data4.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379eb90006", SYMLINK+="iscsi/data4p%n"

##
# ASM disk group +FRA disks must be 15 GB in size.
##

# /dev/iscsi/fra1.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379ecf0007", SYMLINK+="iscsi/fra1p%n"

# /dev/iscsi/fra2.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379ee00008", SYMLINK+="iscsi/fra2p%n"

# /dev/iscsi/fra3.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379f990009", SYMLINK+="iscsi/fra3p%n"

# /dev/iscsi/fra4.
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %p", RESULT=="3600144f0aa6313ac00004c379fad000a", SYMLINK+="iscsi/fra4p%n"

# EOF
</20-names.rules>

Of course, your choice of LUN sizes and device names will be different in your organization. Edit the script to change the LUN IDs to match the ones you find on your servers.
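With more than a handful of LUNs, typing each rule by hand gets error-prone. A minimal sketch that generates the rule lines from a "name LUN-ID" list (the pairs below are just sample values from this thread; substitute your own scsi_id output, with its leading "3"):

```shell
# Generate 20-names.rules entries from "name LUN-ID" pairs on stdin.
# Sample pairs only; replace with your own LUN names and IDs.
while read name lunid; do
    printf '# /dev/iscsi/%s.\n' "$name"
    printf 'KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -gus %%p", RESULT=="%s", SYMLINK+="iscsi/%sp%%n"\n\n' \
        "$lunid" "$name"
done <<'EOF'
crs1 3600144f0aa6313ac00004c3dd997000d
data1 3600144f0aa6313ac00004c379e750003
EOF
```

Redirect the output into /etc/udev/rules.d/20-names.rules and review it before deploying.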

Since LUN IDs are the same on all nodes, you can just drop the same file on all your cluster nodes.

In the file above, the SYMLINK+="iscsi/fra4p%n" variable causes udev to create several symbolic links in /dev/iscsi. Like so:

[drobilla_at_ares] by-path {1142}$ ls -alF /dev/iscsi/
total 0

drwxr-xr-x  2 root root  480 Jul 14 16:16 ./
drwxr-xr-x 13 root root 6740 Jul 14 16:56 ../
lrwxrwxrwx  1 root root    7 Jul 14 13:29 crs1p -> ../sdar
lrwxrwxrwx  1 root root    8 Jul 14 13:29 crs1p1 -> ../sdar1
lrwxrwxrwx  1 root root    7 Jul 14 13:29 crs2p -> ../sdas
lrwxrwxrwx  1 root root    8 Jul 14 13:29 crs2p1 -> ../sdas1
lrwxrwxrwx  1 root root    7 Jul 14 13:29 crs3p -> ../sdat
lrwxrwxrwx  1 root root    8 Jul 14 13:29 crs3p1 -> ../sdat1
lrwxrwxrwx  1 root root    6 Jul 12 18:14 data1p -> ../sdh
lrwxrwxrwx  1 root root    7 Jul 12 18:14 data1p1 -> ../sdh1
lrwxrwxrwx  1 root root    6 Jul 12 18:14 data2p -> ../sdp
lrwxrwxrwx  1 root root    7 Jul 12 18:14 data2p1 -> ../sdp1
lrwxrwxrwx  1 root root    7 Jul 12 18:14 data3p -> ../sdab
lrwxrwxrwx  1 root root    8 Jul 12 18:14 data3p1 -> ../sdab1
lrwxrwxrwx  1 root root    7 Jul 12 18:14 data4p -> ../sdak
lrwxrwxrwx  1 root root    8 Jul 12 18:14 data4p1 -> ../sdak1
lrwxrwxrwx  1 root root    7 Jul 12 18:14 fra1p -> ../sdad
lrwxrwxrwx  1 root root    8 Jul 12 18:14 fra1p1 -> ../sdad1
lrwxrwxrwx  1 root root    7 Jul 12 18:14 fra2p -> ../sdaj
lrwxrwxrwx  1 root root    8 Jul 12 18:14 fra2p1 -> ../sdaj1
lrwxrwxrwx  1 root root    6 Jul 12 18:14 fra3p -> ../sdm
lrwxrwxrwx  1 root root    7 Jul 12 18:14 fra3p1 -> ../sdm1
lrwxrwxrwx  1 root root    6 Jul 12 18:14 fra4p -> ../sdx
lrwxrwxrwx  1 root root    7 Jul 12 18:14 fra4p1 -> ../sdx1
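The p%n suffix explains the pairs above: %n expands to the kernel number of the device, which is empty for the whole disk (sdar) and "1" for its first partition (sdar1). A quick sketch of that expansion in plain shell, just to illustrate (the real substitution is done by udev):

```shell
# Illustrate how SYMLINK+="iscsi/crs1p%n" expands: %n is the kernel
# number, empty for the whole disk and "1" for the first partition.
for kernel in sdar sdar1; do
    n="${kernel##*[a-z]}"   # strip the letters, keep the trailing number
    echo "/dev/iscsi/crs1p${n} -> ../${kernel}"
done
```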

The next thing you need to do is create a single whole-disk partition on each of your iSCSI LUNs, like this:

sudo fdisk /dev/iscsi/crs1p # <-- Notice that it ends with the letter "p" and not the number "1".

That creates the /dev/iscsi/crs1p1 partition, and that's the one you use with ASM. An article by Jeffrey Hunter [2] helped me understand the iSCSI + ASM requirements.
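If you have many LUNs, driving fdisk non-interactively saves time. A sketch of the keystroke sequence for one whole-disk primary partition, assuming a blank disk; double-check the prompts of your fdisk version before running it against real devices:

```shell
# Feed fdisk the keystrokes for one whole-disk primary partition:
# n (new), p (primary), 1 (partition number), <Enter> (default first
# sector), <Enter> (default last sector), w (write and exit).
printf 'n\np\n1\n\n\nw\n' | sudo fdisk /dev/iscsi/crs1p
```

Repeat for each LUN, then re-run udev (or reboot) so the p1 symlinks appear.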

The udev(7) man page says you can use OWNER="grid", GROUP="oinstall", MODE="0660", but that didn't work for me. So I wrote a small script to set the permissions:

<iscsi.asm>

#!/bin/sh
#
# iscsi.asm	Fix permissions after udev has created the
#		iSCSI persistent naming for ASM volumes.
#
# chkconfig: 2345 60 86
# description:	Fix permissions after udev(7) has created the
#		iSCSI persistent naming for Oracle ASM volumes.
#
# David Robillard, July 13th, 2010.

# Source function library.
. /etc/rc.d/init.d/functions

# ISCSI_DEV_PATH
#	Full path to the iSCSI devices created by udev(7).
#	See /etc/udev/rules.d/20-names.rules for path.
#

ISCSI_DEV_PATH="/dev/iscsi"
export ISCSI_DEV_PATH
# GRID_USER
#	Username of Oracle Grid Infrastructure owner.
#

GRID_USER="grid"
export GRID_USER

# Make sure GRID_USER exists.

if ! id -un ${GRID_USER} >/dev/null 2>&1; then

        echo "ERROR: could not find the Oracle Grid Infrastructure owner on this system."

	echo "ERROR: please set the GRID_USER variable in $0."
	exit 1

fi

# Get primary group for GRID_USER.

GRID_GROUP=`id -gn ${GRID_USER}`
if [ "x${GRID_GROUP}" = "x" ]; then

        echo "ERROR: could not find primary group for GRID_USER=${GRID_USER} on this system."

        exit 1
fi

RETVAL=0
start() {

        echo -n $"Fixing iSCSI permissions for Oracle ASM: "

        # Use a for loop, not "ls | while", so RETVAL survives the loop
        # (a piped while runs in a subshell and its assignments are lost).
        for DEVICE in ${ISCSI_DEV_PATH}/*p1; do

		chown ${GRID_USER}:${GRID_GROUP} "${DEVICE}"
		RETVAL=$?
	done
	return $RETVAL

}

stop() {

	echo -n $"Stopping iSCSI permissions fix for Oracle ASM: "
	# There's nothing to do, really.
	return $RETVAL

}

# See how we were called.
case "$1" in

	start)
		start
	;;
	stop)
		stop
	;;
	*)
		echo $"Usage: $0 {start|stop}"
		RETVAL=3
	;;

esac

exit $RETVAL

# EOF
</iscsi.asm>

Place that script in /etc/init.d and then test that everything is OK:

sudo cp iscsi.asm /etc/init.d/iscsi.asm
sudo chown root:root /etc/init.d/iscsi.asm
sudo chmod a+x /etc/init.d/iscsi.asm

sudo /etc/init.d/iscsi.asm start

Check the permissions on the /dev/iscsi/crs1p1 target. In our example above, that's /dev/sdar1:

[drobilla_at_ares] by-path {1145}$ ls -alF /dev/sdar1
brw-r----- 1 grid oinstall 66, 177 Jul 14 17:15 /dev/sdar1
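Instead of eyeballing ls output on every node, stat can check owner, group, and mode in one pass. A small helper (hypothetical name; -L follows the symlink so the underlying block device itself is inspected):

```shell
# check_asm_perms: print "<path> <owner>:<group> <mode>" for each
# argument, following symlinks to inspect the target block device.
check_asm_perms() {
    stat -Lc '%n %U:%G %a' "$@"
}

# On the cluster you would run, for example:
# check_asm_perms /dev/iscsi/*p1
```

Each line should end in the grid owner, its primary group, and 660.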

Cool, it worked :)

Now enable it so that it runs at each machine reboot.

sudo chkconfig --add iscsi.asm
sudo chkconfig --list iscsi.asm

Repeat the process for all nodes in your cluster.

[1] https://access.redhat.com/kb/docs/DOC-7319

[2] http://www.oracle.com/technology/pub/articles/hunter-rac11gr2-iscsi.html

> Keep in mind that Oracle will automatically back up your OCR to all the
> cluster nodes on a regular basis so you will have that available for
> recovery. I would hope that the ASM access is sufficiently restricted that
> the uneducated person wont be able to accidentally delete the wrong files.

I hope so. Still, I need to look into ASM security in more detail.

> That being said, I do recall the time as a junior DBA when I deleted all the
> data files from my qa database when I thought I was working in an entirely
> different environment, so I have to agree having another backup to account
> for user error cannot hurt..

Yes, that's exactly what I had in mind when I asked that question! ;) That's why I created a second OCR disk in the +FRA disk group.

sudo ocrconfig -add +FRA
sudo ocrcheck

Which gives me this:

[drobilla_at_enyo] ~ {1001}$ sudo ocrcheck

Status of Oracle Cluster Registry is as follows :

	 Version                  :          3
	 Total space (kbytes)     :     262120
	 Used space (kbytes)      :       2572
	 Available space (kbytes) :     259548
	 ID                       : 1482882189
	 Device/File Name         :       +CRS
                                    Device/File integrity check succeeded
	 Device/File Name         :       +FRA
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

	 Cluster registry integrity check succeeded

	 Logical corruption check succeeded


Cheers!

David

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jul 15 2010 - 00:40:18 CDT
