Re: corrupt block in ASM disk

From: onedbguru <onedbguru_at_yahoo.com>
Date: Thu, 28 Apr 2011 18:50:23 -0700 (PDT)
Message-ID: <078634bf-566a-428a-9f7e-36cfa69f7705_at_l18g2000yqm.googlegroups.com>



On Apr 28, 9:19 am, John Hurley <hurleyjo..._at_yahoo.com> wrote:
> On Apr 28, 4:10 am, lsllcm <lsl..._at_gmail.com> wrote:
>
> > Hi All,
>
> > I have hit a corrupt block issue on an ASM disk. The steps to reproduce it are below:
>
> > 1. create tablespace
> > create tablespace aa_data
> > datafile
> >  '+DATA/dbs11g/aa_data01.dbf' size 20M
> > EXTENT MANAGEMENT LOCAL AUTOALLOCATE
> > SEGMENT SPACE MANAGEMENT AUTO
> > /
>
> > 2. It prompts the message:
> > ORA-01119: error in creating database file '+DATA/dbs11g/aa_data01.dbf'
> > ORA-17502: ksfdcre:4 Failed to create file +DATA/dbs11g/aa_data01.dbf
> > ORA-15130: diskgroup "DATA" is being dismounted
> > ORA-15066: offlining disk "DATAVOL1" may result in a data loss
>
> > 3. check alert.log
> > WARNING: IO Failed. group:1 disk(number.incarnation):0.0xe96892e8 disk_path:ORCL:DATAVOL1
> >          AU:2 disk_offset(bytes):2097152 io_size:4096 operation:Read type:synchronous
> >          result:I/O error process_id:11679
> > WARNING: cache failed reading from group=DATA fn=1 blk=0 count=1 from disk= 0 DATAVOL1 kfkist=0x20 status=0x02 file=kfc.c line=10225
> > ERROR: cache failed to read group=DATA fn=1 blk=0 from disk(s): 0 DATAVOL1
> > ORA-15080: synchronous I/O operation to a disk failed
> > System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM/trace/+ASM_ora_11679.trc
>
> > 4. check amdu log
> > /u01/app/grid/diag/asm/+asm/+ASM/trace/ amdu_2011_04_26_17_13_28
> > ---------------------------- SCANNING DISK N0002 -----------------------------
> > Disk N0002: 'ORCL:DATAVOL1'
> > AMDU-00407: asmlib error!! function = [asm_close], error = [0], mesg = [I/O Error]
> > AMDU-00200: Unable to read [262144] bytes from Disk N0002 at offset [2097152]
> > AMDU-00201: Disk N0002: 'ORCL:DATAVOL1'
> >            Allocated AU's: 3
> >                 Free AU's: 0
> >        AU's read for dump: 2
> >        Block images saved: 512
> >         Map lines written: 2
> >           Heartbeats seen: 0
> >   Corrupt metadata blocks: 0
> >         Corrupt AT blocks: 0
>
> > 5. check dmesg
> > dmesg|more
>
> > Info fld=0x1fa81d1, Current sda: sense key Medium Error
> > Additional sense: Data synchronization mark error
> > end_request: I/O error, dev sda, sector 33194449
> > scsi6: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 01 fa 81 d1 00 02 00 0
>
> > 6. I used amdu to dump the ASM disk
> > amdu -dump 'DATA'
>
> > ---------------------------- SCANNING DISK N0002 -----------------------------
> > Disk N0002: 'ORCL:DATAVOL1'
> > AMDU-00209: Corrupt block found: Disk N0002 AU [84926] block [0] type [0]
> > AMDU-00201: Disk N0002: 'ORCL:DATAVOL1'
> > AMDU-00204: Disk N0002 is in currently mounted diskgroup DATA
> > AMDU-00201: Disk N0002: 'ORCL:DATAVOL1'
> > ** HEARTBEAT DETECTED **
> >            Allocated AU's: 84927
> >                 Free AU's: 12733
> >        AU's read for dump: 82
> >        Block images saved: 3774
> >         Map lines written: 82
> >           Heartbeats seen: 1
> >   Corrupt metadata blocks: 1
> >         Corrupt AT blocks: 0
>
> > I tried to use remap, but the issue still exists
>
> > remap DATA DATAVOL1 173928448-173928448
>
> > Can anyone help?
>
> > Thanks
>
> Got a good rman backup?
>
> How many databases share this disk group?
>
> One way to approach it is to get the disk fixed at the storage
> level ... recreate the ASM disk group with force ... restore the
> database.  If approaching it like that you may need to startup nomount
> with a pfile copy and then restore a controlfile backup then mount
> then do an rman restore.
>
> I for one do not store my rman disk backups in ASM disk groups.

I would echo John's question. Do you have a good backup?

What version ASM?
RAC? Version?
What type of storage (direct-connect RAID? SCSI? SAN?)
How are the underlying devices partitioned? Or are they?
What is your REDUNDANCY level? If you are using EXTERNAL with individual direct-attached SCSI disks, you should be taken out and shot.

I typically partition the device such that:
p1 = first block (block 1 to block 1)
p2 = rest of the device (block 2 to the end)

and the partition used by ASM is p2 only.
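
Partition layout matters when matching ASM offsets to what the kernel reports, so it is worth cross-checking the numbers in the quoted report. The sketch below (Python, not part of the original thread) takes its figures straight from the alert log, the amdu output, dmesg, and the remap command. The 512-byte sector size and the reading of the CDB bytes as a standard READ(10) layout are assumptions, and the helper names are made up for illustration.

```python
# Figures copied from the quoted logs.
ALERT_AU = 2                       # alert.log: AU:2
ALERT_OFFSET_BYTES = 2097152       # alert.log: disk_offset(bytes):2097152
AMDU_CORRUPT_AU = 84926            # amdu: Corrupt block found ... AU [84926]
AMDU_READ_BYTES = 262144           # amdu: Unable to read [262144] bytes
REMAP_BLOCK = 173928448            # remap DATA DATAVOL1 173928448-173928448
DMESG_SECTOR = 33194449            # dmesg: end_request ... sector 33194449
DMESG_INFO_FLD = 0x1FA81D1         # dmesg: Info fld=0x1fa81d1
# dmesg: the first eight CDB bytes printed after "Read (10)"
CDB_AFTER_OPCODE = bytes.fromhex("00 01 fa 81 d1 00 02 00")

SECTOR_BYTES = 512                 # ASSUMPTION: 512-byte sectors


def au_size_bytes(au_number, byte_offset):
    """Derive the allocation unit size from an (AU, byte offset) pair."""
    if byte_offset % au_number != 0:
        raise ValueError("offset is not a whole multiple of the AU number")
    return byte_offset // au_number


def first_block_of_au(au_number, au_bytes, sector_bytes=SECTOR_BYTES):
    """Disk-relative block number where the given AU starts."""
    return au_number * au_bytes // sector_bytes


def decode_read10(cdb_after_opcode):
    """ASSUMPTION: standard READ(10) layout after the opcode byte:
    flags(1) LBA(4, big-endian) group(1) transfer length(2, big-endian)."""
    lba = int.from_bytes(cdb_after_opcode[1:5], "big")
    length = int.from_bytes(cdb_after_opcode[6:8], "big")
    return lba, length


if __name__ == "__main__":
    au_bytes = au_size_bytes(ALERT_AU, ALERT_OFFSET_BYTES)
    print("AU size in bytes:", au_bytes)
    print("first block of corrupt AU:",
          first_block_of_au(AMDU_CORRUPT_AU, au_bytes))
    lba, length = decode_read10(CDB_AFTER_OPCODE)
    print("READ(10) LBA:", lba, "sectors:", length,
          "bytes:", length * SECTOR_BYTES)
```

If those assumptions hold, the remap range points at the first 512-byte block of AU 84926, and the failed SCSI read is a 512-sector (262144-byte) transfer, the same size amdu said it could not read. The dmesg sector is presumably relative to sda while the ASM offsets are relative to the ASM disk, so the two cannot be lined up without knowing where the partition starts.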

What happens when you use the following syntax for creating the tablespace? If you are going to use ASM, it is time to get out of the "I gotta know what datafile my data is in..." DBA mentality. I have used this on ELDB (V V VLDB??) environments with no performance degradation. ASM is supposed to help make your life easier and if you understand ASM, it will. Or you can continue to do things the hard way.

Make sure db_create_file_dest is set, with either
alter system set db_create_file_dest='+DATA';
or
alter system set db_create_file_dest='+DATA/sub-dir/sub-dir'; -- if you really need to find your datafile.

and then
create tablespace abc;

These are the defaults when using ASM, so there is no need to specify them:
EXTENT MANAGEMENT LOCAL AUTOALLOCATE
SEGMENT SPACE MANAGEMENT AUTO

Received on Thu Apr 28 2011 - 20:50:23 CDT
