Re: 11.2: how to recover from corrupt ASM spfile?

From: Robert Bialek <bialekr_at_gmail.com>
Date: Fri, 05 Mar 2010 21:21:25 +0100
Message-ID: <4B9167C5.6010109_at_gmail.com>



Hello Andreas,
> +ASM1.__oracle_base='/opt/oracle/base'#ORACLE_BASE set from in memory value
> +ASM2.__oracle_base='/opt/oracle/base'#ORACLE_BASE set from in memory value
> +ASM1.asm_diskgroups='RECO'#Manual Mount
> +ASM2.asm_diskgroups='RECO'#Manual Mount
> *.asm_power_limit=1
> *.diagnostic_dest='/opt/oracle/base'
> *.instance_type='asm'
> *.large_pool_size=12M
> *.remote_login_passwordfile='EXCLUSIVE'
>
> and could you send me the ASM alert.log from a manual startup with pfile please?
>
>   

Actually in exclusive mode the cluster should be able to start the ASM instance
with default parameter settings:

SQL> select sid, name,value from v$spparameter where isspecified='TRUE';

SID   NAME                 VALUE
----- ------------------------------ ------------------------------

* large_pool_size 12582912
* instance_type asm
* remote_login_passwordfile EXCLUSIVE
* asm_diskstring /dev/sd*
+ASM1 asm_diskgroups FRA +ASM2 asm_diskgroups FRA
* asm_power_limit 1
* diagnostic_dest /u00/app/oracle

8 rows selected.

SQL> alter system set asm_preferred_read_failure_groups='XY$_*' scope=spfile;
System altered.

SQL> select sid, name,value from v$spparameter where isspecified='TRUE';

SID   NAME                       VALUE
----- ---------------------------------------- 
------------------------------

* large_pool_size 12582912
* instance_type asm
* remote_login_passwordfile EXCLUSIVE
* asm_diskstring /dev/sd*
* asm_preferred_read_failure_groups XY$_*
+ASM1 asm_diskgroups FRA +ASM2 asm_diskgroups FRA
* asm_power_limit 1
* diagnostic_dest /u00/app/oracle

9 rows selected.

oracle_at_rac03-n2:~/ [+ASM2] sudo crsctl stop cluster -all

CRS-2673: Attempting to stop 'ora.crsd' on 'rac03-n2'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac03-n1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources 
on 'rac03-n2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac03-n1'
...
...

oracle_at_rac03-n2:~/ [+ASM2] sudo crsctl start cluster -all CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac03-n2' CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac03-n1' ...

CRS-2672: Attempting to start 'ora.asm' on 'rac03-n1'
CRS-2672: Attempting to start 'ora.evmd' on 'rac03-n1'
CRS-2676: Start of 'ora.evmd' on 'rac03-n2' succeeded
ORA-01078: failure in processing system parameters
CRS-2674: Start of 'ora.asm' on 'rac03-n2' failed    <<---
...

node1: oracle_at_rac03-n1:~/ [+ASM1] sudo crsctl stop crs -f node2: oracle_at_rac03-n2:~/ [+ASM2] sudo crsctl stop crs -f

#asm will be started with default settings

oracle_at_rac03-n2:/tmp/ [+ASM2] sudo crsctl start crs -excl

CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.gipcd' on 'rac03-n2'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac03-n2'
CRS-2676: Start of 'ora.gipcd' on 'rac03-n2' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac03-n2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac03-n2'
CRS-2676: Start of 'ora.gpnpd' on 'rac03-n2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac03-n2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac03-n2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac03-n2'
CRS-2679: Attempting to clean 'ora.diskmon' on 'rac03-n2'
CRS-2681: Clean of 'ora.diskmon' on 'rac03-n2' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'rac03-n2'
CRS-2676: Start of 'ora.diskmon' on 'rac03-n2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac03-n2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac03-n2'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rac03-n2'
CRS-2676: Start of 'ora.drivers.acfs' on 'rac03-n2' succeeded
CRS-2676: Start of 'ora.ctssd' on 'rac03-n2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac03-n2'   
CRS-2676: Start of 'ora.asm' on 'rac03-n2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac03-n2' 
CRS-2676: Start of 'ora.crsd' on 'rac03-n2' succeeded


SQL> select sid, name,value from v$spparameter where isspecified='TRUE';

no rows selected

SQL> show parameter spfile

NAME                     TYPE     VALUE
------------------------------------ ----------- 
------------------------------
spfile                     string

SQL> show parameter diagnos
NAME                     TYPE     VALUE
------------------------------------ ----------- 
------------------------------
diagnostic_dest              string     /u00/app/grid/11.2.0.1/log
SQL>  From the ASM alert.log:

Fri Mar 05 20:54:56 2010
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth1 10.0.0.0 configured from GPnP Profile for use as a cluster interconnect
Interface type 1 eth0 192.168.0.0 configured from GPnP Profile for use as a public interface
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as /u00/app/grid/11.2.0.1/dbs/arch
Autotune of undo retention is turned on. LICENSE_MAX_USERS = 0
SYS auditing is disabled
NOTE: Volume support enabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options. WARNING: using default parameter settings without any parameter file <<<----
Cluster communication is configured to use the following interface(s) for this instance
  10.0.0.20
cluster interconnect IPC version:Oracle UDP/IP (generic) IPC Vendor 1 proto 2
Fri Mar 05 20:55:07 2010
PMON started with pid=2, OS id=5731
Fri Mar 05 20:55:07 2010
VKTM started with pid=3, OS id=5733 at elevated priority VKTM running at (10)millisec precision with DBRM quantum (100)ms Fri Mar 05 20:55:07 2010
GEN0 started with pid=4, OS id=5737
Fri Mar 05 20:55:07 2010
..
MMNL started with pid=21, OS id=5773
lmon registered with NM - instance number 2 (internal mem no 1) Reconfiguration started (old inc 0, new inc 2) ASM instance
List of instances:
 2 (myinst: 2)
 Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps  Non-local Process blocks cleaned out
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived  Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests  Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Fri Mar 05 20:55:11 2010
LCK0 started with pid=22, OS id=5778
Fri Mar 05 20:55:12 2010
ORACLE_BASE not set in environment. It is recommended that ORACLE_BASE be set in the environment Fri Mar 05 20:55:13 2010
SQL> ALTER DISKGROUP ALL MOUNT /* asm agent */ NOTE: Diskgroup used for Voting files is:

         DATA
Diskgroup used for OCR is:DATA

NOTE: cache registered group DATA number=1 incarn=0xff69a9b6
NOTE: cache began mount (first) of group DATA number=1 incarn=0xff69a9b6
NOTE: Assigning number (1,6) to disk (/dev/sdc)
NOTE: Assigning number (1,3) to disk (/dev/sdd)
NOTE: Assigning number (1,0) to disk (/dev/sdf)
NOTE: Assigning number (1,4) to disk (/dev/sdg)
NOTE: Assigning number (1,5) to disk (/dev/sdh)
NOTE: Assigning number (1,7) to disk (/dev/sdl)
NOTE: Assigning number (1,2) to disk (/dev/sdm)
NOTE: Assigning number (1,1) to disk (/dev/sdn)
NOTE: start heartbeating (grp 1)

kfdp_query(DATA): 3
kfdp_queryBg(): 3
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/sdf
NOTE: F1X0 found on disk 0 au 2 fcn 0.1819
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/sdn
NOTE: F1X0 found on disk 1 au 2 fcn 0.2831
NOTE: cache opening disk 2 of grp 1: DATA_0002 path:/dev/sdm
NOTE: cache opening disk 3 of grp 1: DATA_0003 path:/dev/sdd
NOTE: F1X0 found on disk 3 au 2 fcn 0.2707
NOTE: cache opening disk 4 of grp 1: DATA_0004 path:/dev/sdg
NOTE: cache opening disk 5 of grp 1: DATA_0005 path:/dev/sdh
NOTE: cache opening disk 6 of grp 1: DATA_0006 path:/dev/sdc
NOTE: cache opening disk 7 of grp 1: DATA_0007 path:/dev/sdl
NOTE: cache mounting (first) normal redundancy group 1/0xFF69A9B6 (DATA)
* allocate domain 1, invalid = TRUE
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=18.447 group=1 (DATA)
NOTE: advancing ckpt for thread=1 ckpt=18.447
NOTE: cache recovered group 1 to fcn 0.2847
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA)
NOTE: LGWR found thread 1 closed at ABA 18.446
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATA)
NOTE: LGWR opening thread 1 at fcn 0.2847 ABA 19.447
NOTE: cache mounting group 1/0xFF69A9B6 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=1 incarn=0xff69a9b6
kfdp_query(DATA): 4
kfdp_queryBg(): 4
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1 SUCCESS: diskgroup DATA was mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT /* asm agent */ SQL> ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent */ SUCCESS: ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent */ Fri Mar 05 20:55:21 2010
Starting background process ASMB
Fri Mar 05 20:55:21 2010
ASMB started with pid=24, OS id=5800
Fri Mar 05 20:55:21 2010
WARNING: failed to online diskgroup resource ora.DATA.dg (unable to communicate with CRSD/OHASD)
oracle_at_rac03-n2:/tmp/ [+ASM2] cat initASM.ora
+ASM1.asm_diskgroups='FRA'
+ASM2.asm_diskgroups='FRA'
*.asm_diskstring='/dev/sd*'
*.asm_power_limit=1
*.diagnostic_dest='/u00/app/oracle'
*.instance_type='asm'

*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'

oracle_at_rac03-n2:/tmp/ [+ASM2] sqh

SQL*Plus: Release 11.2.0.1.0 Production on Fri Mar 5 20:59:18 2010

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options

SQL> create spfile='+DATA' from pfile='/tmp/initASM.ora';

File created.

SQL> oracle_at_rac03-n2:/tmp/ [+ASM2] sudo crsctl stop crs -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac03-n2'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac03-n2' CRS-2677: Stop of 'ora.crsd' on 'rac03-n2' succeeded ...

oracle_at_rac03-n2:/tmp/ [+ASM2] sudo crsctl start crs CRS-4123: Oracle High Availability Services has been started. oracle_at_rac03-n2:/tmp/ [+ASM2]

oracle_at_rac03-n1:~/ [+ASM1] sudo crsctl start crs CRS-4123: Oracle High Availability Services has been started. oracle_at_rac03-n1:~/ [+ASM1]

oracle_at_rac03-n2:~/ [+ASM2] crsctl status resource ora.asm -t


NAME           TARGET  STATE        SERVER                   
STATE_DETAILS      
--------------------------------------------------------------------------------
Local Resources

ora.asm
               ONLINE  ONLINE       rac03-n1                 
Started            
               ONLINE  ONLINE       rac03-n2                 
Started            

oracle_at_rac03-n2:~/ [+ASM2] sqh

SQL*Plus: Release 11.2.0.1.0 Production on Fri Mar 5 21:10:28 2010

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options

SQL> select sid, name,value from v$spparameter where isspecified='TRUE';

SID   NAME                 VALUE
----- ------------------------------ 
----------------------------------------

* large_pool_size 12582912
* instance_type asm
* remote_login_passwordfile EXCLUSIVE
* asm_diskstring /dev/sd*
+ASM1 asm_diskgroups FRA +ASM2 asm_diskgroups FRA
* asm_power_limit 1
* diagnostic_dest /u00/app/oracle

8 rows selected.

Cheers,
Robert

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Mar 05 2010 - 14:21:25 CST

Original text of this message