Re: ORA-15042: ASM disk "xx" is missing

From: Fergal Taheny <ftaheny_at_gmail.com>
Date: Thu, 3 Jan 2013 08:57:01 +0000
Message-ID: <CAOuMUT5W521Aff=vqwoX_5aGmLd3gFAo0nsCvS3EYq3CF7177A_at_mail.gmail.com>



Hi Luis,
Thanks. I didn't know or had forgotten about the amdu utility. Unfortunately it only gives me info relating to the disks it can see. I'm trying to find out where ASM stores data relating to the other missing disks.

It's really only a curiosity at this stage to figure out how ASM works.

I know it does a discovery where it looks for all device files matching the asm_diskstring. It reads their headers to determine whether the disks found are ASM disks and, if so, it reads the Disk Name, Failure Group Name, Group Name etc.
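
The od and kfed output quoted further down in this message is enough to sketch what discovery can learn from a header. The Python below is only an illustration, not how ASM itself is implemented. The offsets are copied from the kfed output for this 11.1 disk (the kfdhdb.* offsets are taken to start after the 32-byte kfbh block header, which agrees with the od dump). Treating kfbh.endian = 0 as big-endian is an assumption that happens to match this SPARC host. The function names parse_asm_header and read_header are made up for the example.

```python
import struct

KFBH_SIZE = 32  # kfbh.* occupies 0x000-0x01f; kfed's kfdhdb.* offsets start after it


def _cstr(raw):
    """Return the ASCII text before the first NUL byte."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def parse_asm_header(buf):
    """Decode a few disk-header fields from the first 128 bytes of a device."""
    if len(buf) < 128:
        raise ValueError("need at least 128 bytes of header")
    # Assumption: kfbh.endian 0 means big-endian (true for the SPARC dump here).
    order = ">" if buf[0] == 0 else "<"
    base = KFBH_SIZE
    # kfdhdb.compat (4 bytes), dsknum (2 bytes), grptyp (1 byte), hdrsts (1 byte)
    compat, dsknum, grptyp, hdrsts = struct.unpack_from(
        order + "IHBB", buf, base + 0x20
    )
    return {
        "block_type": buf[2],                  # 1 = KFBTYP_DISKHEAD
        "provstr": _cstr(buf[base:base + 8]),  # "ORCLDISK" on an ASM disk
        "compat": compat,
        "dsknum": dsknum,
        "grptyp": grptyp,                      # 1 = KFDGTP_EXTERNAL
        "hdrsts": hdrsts,                      # 3 = KFDHDR_MEMBER
        "dskname": _cstr(buf[base + 0x28:base + 0x48]),
        # Only part of the 32-byte grpname field fits in 128 bytes; the
        # name is NUL-terminated, so short names are still read whole.
        "grpname": _cstr(buf[base + 0x48:base + 0x68]),
    }


def read_header(path):
    """Read the first 128 bytes of a device file, like `od -c -N 128`."""
    with open(path, "rb") as dev:
        return parse_asm_header(dev.read(128))
```

Calling read_header on a readable device should give back the same disk number, disk name and group name that kfed prints. On a device that cannot be opened or read, open() or read() would be expected to raise OSError, which is the same symptom od reports for the suspect device files. Note that nothing decoded here says how many disks the group should contain.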

But then how does it know some disks are missing? It must have a list of the other disks, or the total size of the disk group, or the number of disks in the disk group, but I haven't found this yet.

And I suppose the other thing of interest is whether it stores the actual disk path (i.e. the link to the device files). It probably doesn't, as that allows you to move disks: once you change your asm_diskstring correctly, it will find them. That makes sense from one point of view.

But when you have separate teams of Oracle DBAs, Unix Admins and SAN Admins and one of your ASM disks goes missing it can be left up to the Oracle DBA to prove that a disk is missing and find the device file etc. Maybe that's where good documentation comes into play! And certainly amdu could help with that - before the fact.

I think at this point that going back to old alert logs, which is what I have done, is the way to go. Although I'm looking at truss too as suggested by Rui.
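
The alert log comparison can be sketched in a few lines of Python. This assumes the older alert log uses the same "NOTE: Assigning number (group,disk) to disk (path)" format as the failed mount quoted below, and that the group number is the same in both logs, which may not hold if disk groups were mounted in a different order. The function names are hypothetical.

```python
import re

# Matches e.g. "NOTE: Assigning number (1,9) to disk (/dev/oradsk/ASMDA1)"
# and also the empty form "NOTE: Assigning number (1,11) to disk ()".
ASSIGN_RE = re.compile(
    r"NOTE: Assigning number \((\d+),(\d+)\) to disk \(([^)]*)\)"
)


def disk_assignments(log_text):
    """Map (group_number, disk_number) to the path in the log ('' if none)."""
    result = {}
    for match in ASSIGN_RE.finditer(log_text):
        key = (int(match.group(1)), int(match.group(2)))
        result[key] = match.group(3)
    return result


def locate_missing(failed_log, earlier_log):
    """For each disk with no path in the failed mount, return the path it
    had in the earlier log, or None if the earlier log does not mention it."""
    now = disk_assignments(failed_log)
    before = disk_assignments(earlier_log)
    return {key: before.get(key) for key, path in now.items() if path == ""}
```

Fed the text of the failed mount and of the last successful mount, locate_missing returns one entry per disk number that was assigned to an empty path, paired with the device file it used previously. If a log contains several mounts, later lines overwrite earlier ones for the same disk number, so it is safer to pass in just the section for one mount attempt.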

Regards,
Fergal

--- AMDU Settings ---
ORACLE_HOME = /u02/app/oracle/product/11.1.0.7/asm
System name:    SunOS
Node name:      m4tst1
Release:        5.10
Version:        Generic_147440-24
Machine:        sun4u
amdu run:       02-JAN-13 22:22:49
Endianess:      0

--- Operations ---
-dump ORADATA

--- Disk Selection ---
-diskstring '/dev/oradsk/ASMDA1'

--- Reading Control ---

--- Output Control ---

--- DISCOVERY ---

--- DISK REPORT N0001 ---
Disk Path:                /dev/oradsk/ASMDA1
Unique Disk ID:
Disk Label:
Physical Sector Size:     512 bytes
Disk Size:                204760 megabytes
Group Name:               ORADATA
Disk Name:                ORADATA_0009
Failure Group Name:       ORADATA_0009
Disk Number:              9
Header Status:            3
Disk Creation Time:       2012/10/15 22:27:18.501000
Last Mount Time:          2012/10/31 19:57:43.221000
Compatibility Version:    0x0a100000(10010000)
Disk Sector Size:         512 bytes
Disk size in AUs:         204760 AUs
Group Redundancy:         1
Metadata Block Size:      4096 bytes
AU Size:                  1048576 bytes
Stride:                   113792 AUs
Group Creation Time:      2010/10/20 11:54:51.974000
File 1 Block 1 location:  AU 0

***************** Slept for 6 seconds waiting for heartbeats

--- SCANNING DISKGROUP ORADATA ---
Creation Time:            2010/10/20 11:54:51.974000
Disks Discovered:         1
Redundancy:               1
AU Size:                  1048576 bytes
Metadata Block Size:      4096 bytes
Physical Sector Size:     512 bytes
Metadata Stride:          113792 AU
Duplicate Disk Numbers:   0

--- SCANNING DISK N0001 ---
Disk N0001: '/dev/oradsk/ASMDA1'
Allocated AU's:           175661
Free AU's:                29099
AU's read for dump:       37
Block images saved:       1769
Map lines written:        37
Heartbeats seen:          0
Corrupt metadata blocks:  0
Corrupt AT blocks:        0

--- SUMMARY FOR DISKGROUP ORADATA ---
Allocated AU's:           175661
Free AU's:                29099
AU's read for dump:       37
Block images saved:       1769
Map lines written:        37
Heartbeats seen:          0
Corrupt metadata blocks:  0
Corrupt AT blocks:        0

--- END OF REPORT ---

On 2 January 2013 18:23, Luis <lcarapinha_at_gmail.com> wrote:

> Hi Fergal,
>
> AFAIK all V$ASM_* views are based on X$KF*  tables and you probably are
> out of luck here..
>
> Maybe amdu can help you. Something like this:
>
> amdu -diskstring /dev/oradsk/ASMDA1  -dump  ORADATA
>
> It will output some metadata on report.txt that can help you.
>
> Thanks,
> Luís Marques
>
>
>
>
>  On Wed, Jan 2, 2013 at 5:45 PM, Fergal Taheny <ftaheny_at_gmail.com> wrote:
>
>>  Hi,
>> ASM 11.1.0.7.0
>> Solaris 10
>>
>> We have a diskgroup that won't mount. I believe it's a SAN/disk issue as
>> opposed to an ASM issue but I'm trying to eliminate some guesswork in
>> troubleshooting this.
>>
>> In the +ASM alert log we have:
>>
>> NOTE: cache registered group ORADATA number=1 incarn=0x2bc38c91
>> NOTE: cache began mount (first) of group ORADATA number=1 incarn=0x2bc38c91
>> Wed Jan 02 14:16:33 2013
>> NOTE: Assigning number (1,15) to disk (/dev/oradsk/ASMDA7)
>> NOTE: Assigning number (1,10) to disk (/dev/oradsk/ASMDA2)
>> NOTE: Assigning number (1,18) to disk (/dev/oradsk/ASMDA10)
>> NOTE: Assigning number (1,9) to disk (/dev/oradsk/ASMDA1)
>> NOTE: Assigning number (1,19) to disk (/dev/oradsk/ASMDA11)
>> NOTE: Assigning number (1,20) to disk (/dev/oradsk/ASMDA12)
>> NOTE: Assigning number (1,21) to disk (/dev/oradsk/ASMDA13)
>> NOTE: Assigning number (1,22) to disk (/dev/oradsk/ASMDA14)
>> NOTE: Assigning number (1,23) to disk (/dev/oradsk/ASMDA15)
>> NOTE: Assigning number (1,24) to disk (/dev/oradsk/ASMDA16)
>> NOTE: Assigning number (1,25) to disk (/dev/oradsk/ASMDA17)
>> NOTE: Assigning number (1,26) to disk (/dev/oradsk/ASMDA18)
>> NOTE: Assigning number (1,12) to disk (/dev/oradsk/ASMDA4)
>> Wed Jan 02 14:16:37 2013
>> NOTE: start heartbeating (grp 1)
>> kfdp_query(): 12
>> kfdp_queryBg(): 12
>> NOTE: Assigning number (1,11) to disk ()
>> NOTE: Assigning number (1,13) to disk ()
>> NOTE: Assigning number (1,14) to disk ()
>> NOTE: Assigning number (1,16) to disk ()
>> NOTE: Assigning number (1,17) to disk ()
>> kfdp_query(): 13
>> kfdp_queryBg(): 13
>> NOTE: cache dismounting group 1/0x2BC38C91 (ORADATA)
>> NOTE: dbwr not being msg'd to dismount
>> NOTE: lgwr not being msg'd to dismount
>> NOTE: cache dismounted group 1/0x2BC38C91 (ORADATA)
>> NOTE: cache ending mount (fail) of group ORADATA number=1 incarn=0x2bc38c91
>> kfdp_dismount(): 14
>> kfdp_dismountBg(): 14
>> NOTE: De-assigning number (1,9) from disk (/dev/oradsk/ASMDA1)
>> NOTE: De-assigning number (1,10) from disk (/dev/oradsk/ASMDA2)
>> NOTE: De-assigning number (1,12) from disk (/dev/oradsk/ASMDA4)
>> NOTE: De-assigning number (1,15) from disk (/dev/oradsk/ASMDA7)
>> NOTE: De-assigning number (1,18) from disk (/dev/oradsk/ASMDA10)
>> NOTE: De-assigning number (1,19) from disk (/dev/oradsk/ASMDA11)
>> NOTE: De-assigning number (1,20) from disk (/dev/oradsk/ASMDA12)
>> NOTE: De-assigning number (1,21) from disk (/dev/oradsk/ASMDA13)
>> NOTE: De-assigning number (1,22) from disk (/dev/oradsk/ASMDA14)
>> NOTE: De-assigning number (1,23) from disk (/dev/oradsk/ASMDA15)
>> NOTE: De-assigning number (1,24) from disk (/dev/oradsk/ASMDA16)
>> NOTE: De-assigning number (1,25) from disk (/dev/oradsk/ASMDA17)
>> NOTE: De-assigning number (1,26) from disk (/dev/oradsk/ASMDA18)
>> ERROR: diskgroup ORADATA was not mounted
>> ORA-15032: not all alterations performed
>> ORA-15040: diskgroup is incomplete
>> ORA-15042: ASM disk "17" is missing
>> ORA-15042: ASM disk "16" is missing
>> ORA-15042: ASM disk "14" is missing
>> ORA-15042: ASM disk "13" is missing
>> ORA-15042: ASM disk "11" is missing
>> ERROR: alter diskgroup oradata mount
>>
>>
>> So ASM identified 13 disks and couldn't find 5 disks.
>>
>> The first problem is how to map these missing ASM disks to unix device
>> files.
>>
>> Looking in /dev/oradsk I find device files for the 13 good disks listed
>> above and also 5 other device files not listed above. I assume these are
>> the device files for the 5 problem disks but I want to prove that.
>>
>> When I use the od command I can read the file headers for the 13 good
>> disks:
>>
>> > od -c -N 128 /dev/oradsk/ASMDA1
>> 0000000  \0 202 001 001  \0  \0  \0  \0 200  \0  \0  \t 326 331 223 236
>> 0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
>> 0000040   O   R   C   L   D   I   S   K  \0  \0  \0  \0  \0  \0  \0  \0
>> 0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
>> 0000100  \n 020  \0  \0  \0  \t 001 003   O   R   A   D   A   T   A   _
>> 0000120   0   0   0   9  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
>> 0000140  \0  \0  \0  \0  \0  \0  \0  \0   O   R   A   D   A   T   A  \0
>> 0000160  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
>> 0000200
>>
>> And I get an IO error for the other 5 device files
>>
>> > od -c -N 128 /dev/oradsk/ASMDA3
>> od: cannot open ASMDA3: I/O error
>>
>> Ok still promising but still guessing a bit.
>>
>> So I look in v$asm_disk and I only see the 13 good disks (same for
>> v$asm_disk_stat and X$KFDSK)
>>
>> sys_at_+ASM> select path from v$asm_disk;
>>
>> PATH
>>
>> ------------------------------------------------------------------------------------------------------------------------
>> /dev/oradsk/ASMDA1
>> /dev/oradsk/ASMDA2
>> /dev/oradsk/ASMDA4
>> /dev/oradsk/ASMDA7
>> /dev/oradsk/ASMDA10
>> /dev/oradsk/ASMDA11
>> /dev/oradsk/ASMDA18
>> /dev/oradsk/ASMDA13
>> /dev/oradsk/ASMDA14
>> /dev/oradsk/ASMDA15
>> /dev/oradsk/ASMDA16
>> /dev/oradsk/ASMDA17
>> /dev/oradsk/ASMDA12
>>
>> 13 rows selected.
>>
>>
>> Eventually I go back through old alert logs to find the last time the
>> diskgroup was successfully mounted and I find that indeed the five disks I
>> suspected were mounted in the disk group at that time. So that's fairly
>> conclusive but my questions are:
>>
>> Is there a view in ASM that lists all the disks that belong in a disk
>> group, including ones ASM can't see at present?
>>
>> Or is this information stored in the file headers? If so how could I read
>> it?
>>
>> Or does ASM just probe all the disks (as defined in asm_diskstring) to
>> find all the disks in a disk group? And if it does, how does it know how
>> many disks to expect? I don't see the number of disks displayed anywhere in
>> V$ASM_DISKGROUP or in the output of kfed (shown below). Where would I find
>> the number of disks in each disk group?
>>
>> > kfed read /dev/oradsk/ASMDA1
>> kfbh.endian:                          0 ; 0x000: 0x00
>> kfbh.hard:                          130 ; 0x001: 0x82
>> kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
>> kfbh.datfmt:                          1 ; 0x003: 0x01
>> kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
>> kfbh.block.obj:              2147483657 ; 0x008: TYPE=0x8 NUMB=0x9
>> kfbh.check:                  3604583326 ; 0x00c: 0xd6d9939e
>> kfbh.fcn.base:                        0 ; 0x010: 0x00000000
>> kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
>> kfbh.spare1:                          0 ; 0x018: 0x00000000
>> kfbh.spare2:                          0 ; 0x01c: 0x00000000
>> kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
>> kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
>> kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
>> kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
>> kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
>> kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
>> kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
>> kfdhdb.compat:                168820736 ; 0x020: 0x0a100000
>> kfdhdb.dsknum:                        9 ; 0x024: 0x0009
>> kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
>> kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
>> kfdhdb.dskname:            ORADATA_0009 ; 0x028: length
>> kfdhdb.grpname:                 ORADATA ; 0x048: length=7
>> kfdhdb.fgname:             ORADATA_0009 ; 0x068: length
>> kfdhdb.capname:                         ; 0x088: length=0
>> kfdhdb.crestmp.hi:             32975350 ; 0x0a8: HOUR=0x16 DAYS=0xf MNTH=0xa YEAR=0x7dc
>> kfdhdb.crestmp.lo:           1831326720 ; 0x0ac: USEC=0x0 MSEC=0x1f5 SECS=0x12 MINS=0x1b
>> kfdhdb.mntstmp.hi:             32975859 ; 0x0b0: HOUR=0x13 DAYS=0x1f MNTH=0xa YEAR=0x7dc
>> kfdhdb.mntstmp.lo:           3870520320 ; 0x0b4: USEC=0x0 MSEC=0xdd SECS=0x2b MINS=0x39
>> kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
>> kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
>> kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
>> kfdhdb.mfact:                    113792 ; 0x0c0: 0x0001bc80
>> kfdhdb.dsksize:                  204760 ; 0x0c4: 0x00031fd8
>> kfdhdb.pmcnt:                         3 ; 0x0c8: 0x00000003
>> kfdhdb.fstlocn:                       1 ; 0x0cc: 0x00000001
>> kfdhdb.altlocn:                       2 ; 0x0d0: 0x00000002
>> kfdhdb.f1b1locn:                      0 ; 0x0d4: 0x00000000
>> kfdhdb.redomirrors[0]:                0 ; 0x0d8: 0x0000
>> kfdhdb.redomirrors[1]:                0 ; 0x0da: 0x0000
>> kfdhdb.redomirrors[2]:                0 ; 0x0dc: 0x0000
>> kfdhdb.redomirrors[3]:                0 ; 0x0de: 0x0000
>> kfdhdb.dbcompat:              168820736 ; 0x0e0: 0x0a100000
>> kfdhdb.grpstmp.hi:             32942731 ; 0x0e4: HOUR=0xb DAYS=0x14 MNTH=0xa YEAR=0x7da
>> kfdhdb.grpstmp.lo:           3678353408 ; 0x0e8: USEC=0x0 MSEC=0x3ce SECS=0x33 MINS=0x36
>> kfdhdb.ub4spare[0]:                   0 ; 0x0ec: 0x00000000
>> kfdhdb.ub4spare[1]:                   0 ; 0x0f0: 0x00000000
>> kfdhdb.ub4spare[2]:                   0 ; 0x0f4: 0x00000000
>> kfdhdb.ub4spare[3]:                   0 ; 0x0f8: 0x00000000
>> kfdhdb.ub4spare[4]:                   0 ; 0x0fc: 0x00000000
>> kfdhdb.ub4spare[5]:                   0 ; 0x100: 0x00000000
>> kfdhdb.ub4spare[6]:                   0 ; 0x104: 0x00000000
>> kfdhdb.ub4spare[7]:                   0 ; 0x108: 0x00000000
>> kfdhdb.ub4spare[8]:                   0 ; 0x10c: 0x00000000
>> kfdhdb.ub4spare[9]:                   0 ; 0x110: 0x00000000
>> kfdhdb.ub4spare[10]:                  0 ; 0x114: 0x00000000
>> kfdhdb.ub4spare[11]:                  0 ; 0x118: 0x00000000
>> kfdhdb.ub4spare[12]:                  0 ; 0x11c: 0x00000000
>> kfdhdb.ub4spare[13]:                  0 ; 0x120: 0x00000000
>> kfdhdb.ub4spare[14]:                  0 ; 0x124: 0x00000000
>> kfdhdb.ub4spare[15]:                  0 ; 0x128: 0x00000000
>> kfdhdb.ub4spare[16]:                  0 ; 0x12c: 0x00000000
>> kfdhdb.ub4spare[17]:                  0 ; 0x130: 0x00000000
>> kfdhdb.ub4spare[18]:                  0 ; 0x134: 0x00000000
>> kfdhdb.ub4spare[19]:                  0 ; 0x138: 0x00000000
>> kfdhdb.ub4spare[20]:                  0 ; 0x13c: 0x00000000
>> kfdhdb.ub4spare[21]:                  0 ; 0x140: 0x00000000
>> kfdhdb.ub4spare[22]:                  0 ; 0x144: 0x00000000
>> kfdhdb.ub4spare[23]:                  0 ; 0x148: 0x00000000
>> kfdhdb.ub4spare[24]:                  0 ; 0x14c: 0x00000000
>> kfdhdb.ub4spare[25]:                  0 ; 0x150: 0x00000000
>> kfdhdb.ub4spare[26]:                  0 ; 0x154: 0x00000000
>> kfdhdb.ub4spare[27]:                  0 ; 0x158: 0x00000000
>> kfdhdb.ub4spare[28]:                  0 ; 0x15c: 0x00000000
>> kfdhdb.ub4spare[29]:                  0 ; 0x160: 0x00000000
>> kfdhdb.ub4spare[30]:                  0 ; 0x164: 0x00000000
>> kfdhdb.ub4spare[31]:                  0 ; 0x168: 0x00000000
>> kfdhdb.ub4spare[32]:                  0 ; 0x16c: 0x00000000
>> kfdhdb.ub4spare[33]:                  0 ; 0x170: 0x00000000
>> kfdhdb.ub4spare[34]:                  0 ; 0x174: 0x00000000
>> kfdhdb.ub4spare[35]:                  0 ; 0x178: 0x00000000
>> kfdhdb.ub4spare[36]:                  0 ; 0x17c: 0x00000000
>> kfdhdb.ub4spare[37]:                  0 ; 0x180: 0x00000000
>> kfdhdb.ub4spare[38]:                  0 ; 0x184: 0x00000000
>> kfdhdb.ub4spare[39]:                  0 ; 0x188: 0x00000000
>> kfdhdb.ub4spare[40]:                  0 ; 0x18c: 0x00000000
>> kfdhdb.ub4spare[41]:                  0 ; 0x190: 0x00000000
>> kfdhdb.ub4spare[42]:                  0 ; 0x194: 0x00000000
>> kfdhdb.ub4spare[43]:                  0 ; 0x198: 0x00000000
>> kfdhdb.ub4spare[44]:                  0 ; 0x19c: 0x00000000
>> kfdhdb.ub4spare[45]:                  0 ; 0x1a0: 0x00000000
>> kfdhdb.ub4spare[46]:                  0 ; 0x1a4: 0x00000000
>> kfdhdb.ub4spare[47]:                  0 ; 0x1a8: 0x00000000
>> kfdhdb.ub4spare[48]:                  0 ; 0x1ac: 0x00000000
>> kfdhdb.ub4spare[49]:                  0 ; 0x1b0: 0x00000000
>> kfdhdb.ub4spare[50]:                  0 ; 0x1b4: 0x00000000
>> kfdhdb.ub4spare[51]:                  0 ; 0x1b8: 0x00000000
>> kfdhdb.ub4spare[52]:                  0 ; 0x1bc: 0x00000000
>> kfdhdb.ub4spare[53]:                  0 ; 0x1c0: 0x00000000
>> kfdhdb.ub4spare[54]:                  0 ; 0x1c4: 0x00000000
>> kfdhdb.ub4spare[55]:                  0 ; 0x1c8: 0x00000000
>> kfdhdb.ub4spare[56]:                  0 ; 0x1cc: 0x00000000
>> kfdhdb.ub4spare[57]:                  0 ; 0x1d0: 0x00000000
>> kfdhdb.acdb.aba.seq:                  0 ; 0x1d4: 0x00000000
>> kfdhdb.acdb.aba.blk:                  0 ; 0x1d8: 0x00000000
>> kfdhdb.acdb.ents:                     0 ; 0x1dc: 0x0000
>> kfdhdb.acdb.ub2spare:                 0 ; 0x1de: 0x0000
>>
>> Thanks,
>> Fergal
>>
>>
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>>
>
>
> --
> Cumprimentos,
> Luís Marques
>



-- 
Fergal Taheny
Pentec IT Limited
2 knightsbrook court, Dublin Road, Trim, Co. Meath.
+353 (0) 87 9823137
ftaheny_at_gmail.com


--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jan 03 2013 - 09:57:01 CET
