Re: Voting Disks and OCR on 11.2.0.2

From: Radoulov, Dimitre <cichomitiko_at_gmail.com>
Date: Wed, 19 Sep 2012 11:37:59 +0200
Message-ID: <50599277.1010200_at_gmail.com>



On 19/09/2012 10:47, Zabair Ahmed wrote:
> 11.2.0.2 on 2 node AIX RAC.
>
> I've got a single voting disk as per below.
>
> crsctl query css votedisk
> ## STATE File Universal Id File Name Disk group
> -- ----- ----------------- --------- ---------
> 1. ONLINE 597a4cd2ecc84fe1bfe8eaf5520b3de7 (/dev/ora_asmsys_disk01) [ASMSYS]
> Located 1 voting disk(s).
>
> I've also got a single OCR file, as below.
>
> ocrcheck
>
> Status of Oracle Cluster Registry is as follows :
> Version : 3
> Total space (kbytes) : 262120
> Used space (kbytes) : 3188
> Available space (kbytes) : 258932
> ID : 764672799
> Device/File Name : +ASMSYS
> Device/File integrity check succeeded
> Device/File not configured
> Device/File not configured
> Device/File not configured
> Device/File not configured
> Cluster registry integrity check succeeded
> Logical corruption check bypassed due to non-privileged user
>
> home/grid > cat /etc/oracle/ocr.loc
> ocrconfig_loc=+ASMSYS
> local_only=FALSE
>
>
> The ASMSYS diskgroup is set to EXTERNAL REDUNDANCY.
>
> select NAME,TYPE,COMPATIBILITY,DATABASE_COMPATIBILITY,VOTING_FILES from v$asm_diskgroup where name = 'ASMSYS';
> NAME   TYPE   COMPATIBIL DATABASE_COMPATIBILITY V
> ------ ------ ---------- ---------------------- -
> ASMSYS EXTERN 11.2.0.0.0 10.1.0.0.0             Y
>
> My question is: do I have a single point of failure here? I'm not sure what my RAID configuration is, and therefore what level of redundancy the RAID is giving me.
>
> Should I be adding extra voting disks and OCR to this setup? Any suggestions greatly appreciated.

Hi Ahmed,
if you lose ora_asmsys_disk01 you'll need to restore the clusterware files from backup,
so clarify the RAID level used for /dev/ora_asmsys_disk01 with your sysadmins.

I prefer 3 voting disks even with external RAID protection, so I use an ASM disk group with normal redundancy for the clusterware files anyway.
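
A minimal sketch of how that move might look, assuming a normal redundancy disk group has already been created. The name +CRSDG is hypothetical, and so are the helper function names; only `crsctl replace votedisk` and `crsctl query css votedisk` are real clusterware commands. The `run` helper just prints the command unless DRY_RUN=0 is set, so nothing is changed by accident.

```shell
#!/bin/sh
# Hypothetical target disk group (normal redundancy); not from the original setup.
NEW_DG="${NEW_DG:-+CRSDG}"

# run: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# count_online_votedisks: read `crsctl query css votedisk` output on stdin
# and print how many voting files are reported ONLINE.
count_online_votedisks() {
    awk '$2 == "ONLINE" { n++ } END { print n + 0 }'
}

# move_votedisks: relocate the voting files into the new ASM disk group.
move_votedisks() {
    run crsctl replace votedisk "$NEW_DG"
}
```

After the move, `crsctl query css votedisk | count_online_votedisks` gives a quick check of how many voting files are online; with the setup quoted above it would print 1.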

Note that even with normal redundancy you need to do some work if/when you lose a disk.
I did a test once: I intentionally corrupted one of the 3 ASM disks of a normal redundancy ASM disk group containing the clusterware files. All processes remained online, and there were no messages in the cluster alert logs (at least for a while; I didn't wait long). I then stopped the entire stack and tried to start it again, and crsd failed to start with:

2012-05-04 12:09:53.820
[crsd(26887)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /xxxx/crsd/crsd.log.
2012-05-04 12:09:54.019
[crsd(26887)]CRS-1011:OCR cannot determine that the OCR content contains the latest updates. Details in /xxxx/crsd/crsd.log.
2012-05-04 12:09:54.024
[crsd(26887)]CRS-0804:Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage ]. Details at (:CRSD00111:) in /xxxx/crsd/crsd.log.
2012-05-04 12:09:54.404
[ohasd(26129)]CRS-2765:Resource 'ora.crsd' has failed on server 'xxxx'. [...]
2012-05-04 12:10:14.679
[ohasd(26129)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.

The solution was to force drop the corrupted asm disk ...

I always add an OCR mirror in a separate ASM disk group for safety.
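
A rough sketch of adding such a mirror, assuming a second disk group exists. The name +OCRMIRROR is hypothetical, and so are the function names; `ocrconfig -add` and `ocrcheck` are the real tools, and `ocrconfig -add` has to be run as root. The first function only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical disk group for the OCR mirror; not from the original setup.
OCR_MIRROR_DG="${OCR_MIRROR_DG:-+OCRMIRROR}"

# add_ocr_mirror_cmd: print the command (to be run as root) that adds
# a second OCR location in the mirror disk group.
add_ocr_mirror_cmd() {
    echo "ocrconfig -add $OCR_MIRROR_DG"
}

# count_ocr_locations: read `ocrcheck` output on stdin and print how many
# OCR locations are configured ("Device/File Name" lines).
count_ocr_locations() {
    awk '/Device\/File Name/ { n++ } END { print n + 0 }'
}
```

Running `ocrcheck | count_ocr_locations` before and after is a simple way to confirm the second location was registered; the ocrcheck output quoted above would give 1.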

Regards
Dimitre

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Sep 19 2012 - 04:37:59 CDT
