Re: Trying to Simulate a disk failure for one of the disks used by ASM disk group

From: Hanan Hit <hithanan_at_gmail.com>
Date: Thu, 28 Aug 2014 09:30:40 -0700
Message-Id: <4E932C59-E526-48D0-B2A0-5FE73760A548_at_gmail.com>



Thanks Hemant K Chitale,

Yes, I think my next step would be to just fdisk one of the drives, basically hitting only the header.

I am not getting any real answer from Oracle, though.
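
For anyone following the thread, here is a minimal sketch of the header overwrite being discussed, practiced on a scratch file rather than a real device. It assumes GNU dd and cmp (as shipped with RHEL); TARGET is a hypothetical stand-in for the partition behind one ASM disk, and the 1 MiB size simply mirrors the "first 1M" label region mentioned in the original post. It is not a tested procedure for a live disk group.

```shell
#!/bin/sh
# Practice run only: TARGET is a scratch file, NOT a real ASM device.
# On a real system it would be the partition backing one ASM disk (assumption).
set -eu

TARGET=$(mktemp)

# Build a 4 MiB stand-in "disk" filled with random data.
dd if=/dev/urandom of="$TARGET" bs=1M count=4 2>/dev/null

# Overwrite only the first 1 MiB with zeros.
# conv=notrunc keeps dd from truncating the rest of the target.
dd if=/dev/zero of="$TARGET" bs=1M count=1 conv=notrunc 2>/dev/null

# Verify that the first 1 MiB now reads back as zeros.
if cmp -s -n 1048576 "$TARGET" /dev/zero; then
    HEADER_STATE=zeroed
else
    HEADER_STATE=intact
fi
echo "header: $HEADER_STATE"

# Record the size to confirm the remainder of the "disk" was left in place.
SIZE_AFTER=$(wc -c < "$TARGET" | tr -d ' ')

rm -f "$TARGET"
```

Pointing TARGET at a real block device would destroy the data on it, so the device name should be double-checked against the ASMLib label before trying anything like this outside a scratch file.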

On Aug 27, 2014, at 10:52 PM, Chitale, Hemant K <Hemant-K.Chitale_at_sc.com> wrote:

> I would use “dd” to overwrite a disk (or just the header of it) to simulate a failure.
>
> Was your second test also about removing a disk physically? And the DG didn’t dismount merely because permissions on the dev had been changed?
> Or was the second test different in some other way?
>
> Hemant K Chitale
>
>
> From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Hanan Hit
> Sent: Thursday, August 28, 2014 7:16 AM
> To: Oracle L
> Cc: Hit Hanan Gmail
> Subject: Trying to Simulate a disk failure for one of the disks used by ASM disk group
>
> Hi All,
>
> I am sorry for the large distribution but I am somehow hitting a wall.
>
> I have a new 12c single instance install (Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production) using ASM on RHEL 6.5.
>
>
> The underlying storage array that I am using is MD-1220 from HP.
>
> I have a total of 24 drives, each presented as a single drive.
>
> Using ASMlib.
>
>
> I was able to create the ASM instance using two disk groups (DATADG with 16 drives and FRADG with 8 disks). I am using Normal Redundancy.
>
> All drives were labeled (first 1M).
>
>
> Here are the compatibility details of both disk groups:
>
> GROUP_NUMBER NAME    COMPATIBILITY  DATABASE_COMPATIBILITY
> ------------ ------- -------------  ----------------------
>            2 FRADG   12.1.0.0.0     12.1.0.0.0
>            1 DATADG  12.1.0.0.0     12.1.0.0.0
>
>
> I also changed the disk repair time for the given disk from the default of 3.6 hours to 6 hours.
>
>
>
> SQL> show parameter disk_
>
> NAME                                 TYPE        VALUE
> ------------------------------------ ----------- ------------------------------
> asm_diskgroups                       string      FRADG, DATADG
> asm_diskstring                       string      ORCL:*
>
>
> Now I am trying to simulate a failure of one disk (that of course shouldn’t fail the DATADG).
>
> In the first test we physically pulled out one drive (after finding the right device) and the DATADG was dismounted, to my surprise. This obviously didn’t work as I expected. I don’t think it’s a fat-finger issue.
>
> In the second test, after opening an SR with Oracle, I changed the permissions on the device during the run and rescanned the drives, but everything kept functioning and no failed disks were reported.
>
> # chmod 000 /dev/sdd1
>
> # /etc/init.d/oracleasm scandisks
> Scanning the system for Oracle ASMLib disks: [ OK ]
>
>
> So finally, is there a better way to logically simulate a disk failure with Normal Redundancy on my infrastructure?
>
> Any help would be highly appreciated.
>
> Best,
> Hanan
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Aug 28 2014 - 18:30:40 CEST
