Mislabeled ASM disk corrects the label by itself
Date: Mon, 6 Feb 2017 16:16:17 +0000 (UTC)
Message-ID: <1297675356.2714973.1486397777505_at_mail.yahoo.com>
Oracle 11.2.0.3 on Red Hat Enterprise Linux 6.6. Using ASMLib.
We probably hit
Bug 19601762 : ASMLIB DISK HEADER LABEL WILL BE REMOVED AFTER ONLINING PRIOR OFFLINED DISKS
After storage maintenance work, some ASM disks got their labels switched:
SQL> select path, label from v$asm_disk where mount_status = 'CLOSED' and header_status = 'MEMBER';
PATH                LABEL
------------------- --------------
ORCL:ASM_DATA08_1MC ASM_DATA08_1MC
ORCL:ASM_GRID01_1MC ASM_GRID01_1MC
ORCL:ASM_DATA14_1MC ASM_DATA14_1MC
ORCL:ASM_DATA11_1MC ASM_DATA11_1MC
Take the first two as examples and check the headers:
$ kfed read /dev/oracleasm/disks/ASM_DATA08_1MC | egrep 'provstr|dskname|grpname|fgname'
kfdhdb.driver.provstr:ORCLDISKASM_GRID01_1MC ; 0x000: length=22 <-- wrong provstring
kfdhdb.dskname: ASM_GRID01_1MC ; 0x028: length=14 <-- wrong disk name
kfdhdb.grpname: GRID_DG ; 0x048: length=7 <-- wrong group name
kfdhdb.fgname: ASM_GRID01_1MC ; 0x068: length=14 <-- wrong failgroup name
$ kfed read /dev/oracleasm/disks/ASM_GRID01_1MC | egrep 'provstr|dskname|grpname|fgname'
kfdhdb.driver.provstr:ORCLDISKASM_DATA08_1MC ; 0x000: length=22 <-- wrong
kfdhdb.dskname: ASM_DATA08_1MC ; 0x028: length=14 <-- wrong
kfdhdb.grpname: CRT_DG1 ; 0x048: length=7 <-- wrong
kfdhdb.fgname: ASM_1MC ; 0x068: length=7 <-- wrong
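The manual comparison above could be scripted along these lines. This is only a rough sketch: the function names provstr_label, check_label and check_all_disks are made up for illustration, and it assumes the provstr line always has the form ORCLDISK followed by the label, as in the output shown here.

```shell
# Sketch only: flag ASMLib disks whose header provstr differs from the device label.

# Hypothetical helper: read kfed output on stdin and print the label part of
# the provstr line, e.g. "ORCLDISKASM_DATA08_1MC" yields "ASM_DATA08_1MC".
provstr_label() {
    sed -n 's/^kfdhdb\.driver\.provstr:[[:space:]]*ORCLDISK\([^ ;]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: compare the expected label ($1) with kfed output on stdin.
check_label() {
    local expected="$1" actual
    actual="$(provstr_label)"
    if [ "$actual" = "$expected" ]; then
        echo "OK $expected"
    else
        echo "MISMATCH $expected header=$actual"
    fi
}

# Walk the ASMLib device directory; the file name is taken as the expected label.
check_all_disks() {
    local dev
    for dev in /dev/oracleasm/disks/*; do
        [ -e "$dev" ] || continue
        kfed read "$dev" | check_label "$(basename "$dev")"
    done
}
```

Calling check_all_disks as a user that can read the devices and has kfed on the PATH would print one OK or MISMATCH line per disk.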
The header contents of those two ASM disks were swapped with each other. But about 20 hours later, they had been corrected:
$ kfed read /dev/oracleasm/disks/ASM_DATA08_1MC | egrep 'provstr|dskname|grpname|fgname'
kfdhdb.driver.provstr:ORCLDISKASM_DATA08_1MC ; 0x000: length=22
kfdhdb.dskname: ASM_DATA08_1MC ; 0x028: length=14
kfdhdb.grpname: CRT_DG1 ; 0x048: length=7
kfdhdb.fgname: ASM_1MC ; 0x068: length=7
That is, the 'kfed read' result now matches the label ASM_DATA08_1MC. We can't find any event that could have corrected it. Right before rechecking the next day, we ran 'oracleasm scandisks'. Although we didn't check the headers *before* running this command, we had exactly the same problem on another cluster, where we ran 'oracleasm scandisks' multiple times without correcting the problem. So scandisks is unlikely to have corrected the labels. Something must have triggered this self-correction, but the ASM alert.log, other trace files, and /var/log/messages show nothing relevant.
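If this recurs, one way to narrow down when the headers change would be to log them periodically and correlate the time of change with other logs. A minimal sketch follows; the names stamp_header and snapshot_headers and the log file argument are hypothetical, and it assumes kfed is on the PATH of the user running it.

```shell
# Sketch only: keep a timestamped history of the ASM disk header fields.

# Hypothetical helper: filter kfed output on stdin down to the four header
# fields and prefix each line with a UTC timestamp and the disk label ($1).
stamp_header() {
    local label="$1" now
    now="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    grep -E 'provstr|dskname|grpname|fgname' | sed "s|^|$now $label |"
}

# Append one snapshot of every ASMLib disk header to the log file ($1).
snapshot_headers() {
    local log="$1" dev
    for dev in /dev/oracleasm/disks/*; do
        [ -e "$dev" ] || continue
        kfed read "$dev" | stamp_header "$(basename "$dev")" >> "$log"
    done
}
```

Running snapshot_headers from cron every few minutes would show, to within that interval, when a header flipped back.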
We know we can run 'oracleasm renamedisk' to correct the label. But we're curious about this self-correction. Has anybody seen this?
Yong Huang
-- http://www.freelists.org/webpage/oracle-l
Received on Mon Feb 06 2017 - 17:16:17 CET