Re: Removing a diskgroup from a corrupt ASM database

From: <daniel.ostertag_at_visaer.com>
Date: Tue, 26 May 2009 14:19:49 -0700 (PDT)
Message-ID: <178b379d-9a95-4f52-9e1e-9f77fad6d25e_at_k38g2000yqh.googlegroups.com>



On May 26, 7:16 am, johnbhur..._at_sbcglobal.net wrote:
> On May 25, 9:00 pm, onedbguru <onedbg..._at_yahoo.com> wrote:
>
> snip
>
> > > As part of DR testing and validating that we can recover from anything
> > > ASM related the CREATE DISKGROUP ... FORCE has worked well for us.
>
> > > It doesn't matter if the diskgroup can be mounted or fails during a
> > > mount attempt ... it destroys anything ASM related on the disks and
> > > re-initializes them.  This may be an 11g only thing so you might need to
> > > upgrade the ASM to 11g before FORCE becomes a valid option in ASM
> > > create diskgroup.
>
> > Anyone using "single disks" (as opposed to SAN mirrored/RAIDed LUNS)
> > without the use of failure groups is asking for trouble.  Replacing
> > the failed device is futile without having thoroughly worked out the
> > failover mechanisms to be able to reconstruct it without having to
> > restore all of your databases.  Should you proceed with the
> > "FORCE" option, you will need to be able to restore all of your
> > databases that resided in that ASM instance.
>
> Who exactly does not know that?
>
> Anyone who has deployed ASM without testing and documenting and
> recovering from all conditions including having to rebuild the ASM
> diskgroups from scratch just has not done their homework.
>
> Go back and read the original post in this thread if you still are not
> understanding what was asked.

John, Onedbguru, et al,

Thanks for your help. I'm just a little taken aback by the sniping at me. I posted one message, and suddenly it has grown into a situation where I have a "single disk", I've hosed my production system, I should look for another job, I should be ashamed of myself for lack of preparation, etc. Respectfully, I'm just looking for a little help on this; it's no big deal.

Some background: This is a test system on which I am trying to learn ASM better. No, I am not an expert on ASM. I have practiced many scenarios in which I physically removed disks from a running database, rebalanced, put new disks back, rebalanced again, and so on. No, maybe I haven't practiced every scenario.

We have a 14-disk Dell PowerVault array, and I use NORMAL REDUNDANCY in ASM. Per recommendations from many, including Tom Kyte, I'm not using the external redundancy provided by the array; instead I'm letting ASM do it with its striping/mirroring.

What happened was that one day (on its own, not purposely caused by me while practicing DR) our Windows system had a disk I/O error, and soon after the diskgroup became unmounted and the database crashed. I have run diagnostics on the disks, worked with Dell, made sure all drivers and firmware were updated, etc. The hardware has passed every check. I tried to remount with the force option, tried creating the original diskgroup from scratch, tried disk discovery, cleared the configuration (without initializing), etc.

Because I don't care about the data or database, I decided to delete the non-ASM database, wipe out the array, initialize, re-partition and stamp the disks in ASM. The disks are now set up exactly as before, just with no data. The ASM database still does not see the original diskgroup. I can probably use a differently named diskgroup to get this up and running, but ASM is still looking for the original. I can also uninstall/reinstall Oracle and start from scratch, but I'd rather not punt if I don't have to.

I hope this is enough info. It's not a desperate situation, I'm just looking for a little help while I research offline.

Thanks in advance for your help.
Dan

Received on Tue May 26 2009 - 16:19:49 CDT
