Re: ASM diskgroup is dismount

From: Vinay Kumar Narisetty <narisetty.vinay_at_gmail.com>
Date: Thu, 2 Jan 2020 10:45:06 -0600
Message-ID: <CAFtUX+anFvbHN2b0OtOsCN+T8Bvqc7FvoPTvnoFupSidWtTAMw_at_mail.gmail.com>



I checked /var/log/messages and see iSCSI timeouts. I will check with my storage admin.

2019-12-29T18:34:56.518192-06:00 rmpp-proddb iscsid[2512]: iscsid: Kernel reported iSCSI connection 1:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
2019-12-29T18:34:56.522105-06:00 rmpp-proddb kernel: [18434655.285178]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 8903731264, last ping 8903732544, now 8903733825
2019-12-29T18:34:56.522127-06:00 rmpp-proddb kernel: [18434655.285183]  connection1:0: detected conn error (1022)
2019-12-29T18:34:56.522127-06:00 rmpp-proddb kernel: [18434655.285299] sd 18:0:0:6: [sdc] tag#2 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
2019-12-29T18:34:56.522128-06:00 rmpp-proddb kernel: [18434655.285301] sd 18:0:0:6: [sdc] tag#2 CDB: Test Unit Ready 00 00 00 00 00 00
2019-12-29T18:34:56.522128-06:00 rmpp-proddb kernel: [18434655.285308] sd 18:0:0:7: [sde] tag#0 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
2019-12-29T18:34:56.522129-06:00 rmpp-proddb kernel: [18434655.285309] sd 18:0:0:7: [sde] tag#0 CDB: Test Unit Ready 00 00 00 00 00 00
2019-12-29T18:34:56.522130-06:00 rmpp-proddb kernel: [18434655.285313] sd 18:0:0:4: [sdb] tag#1 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
2019-12-29T18:34:56.522130-06:00 rmpp-proddb kernel: [18434655.285314] sd 18:0:0:4: [sdb] tag#1 CDB: Test Unit Ready 00 00 00 00 00 00
2019-12-29T18:34:58.562161-06:00 rmpp-proddb iscsid[2512]: iscsid: Kernel reported iSCSI connection 2:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
2019-12-29T18:34:58.566072-06:00 rmpp-proddb kernel: [18434657.329097]  connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 8903731776, last ping 8903733056, now 8903734336
2019-12-29T18:34:58.566090-06:00 rmpp-proddb kernel: [18434657.329099]  connection2:0: detected conn error (1022)
2019-12-29T18:34:58.566091-06:00 rmpp-proddb kernel: [18434657.329201] sd 19:0:0:1: [sdd] tag#3 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
2019-12-29T18:34:58.566091-06:00 rmpp-proddb kernel: [18434657.329203] sd 19:0:0:1: [sdd] tag#3 CDB: Test Unit Ready 00 00 00 00 00 00
2019-12-29T18:34:58.566092-06:00 rmpp-proddb kernel: [18434657.329219] sd 19:0:0:2: [sdf] tag#2 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
2019-12-29T18:34:58.566092-06:00 rmpp-proddb kernel: [18434657.329220] sd 19:0:0:2: [sdf] tag#2 CDB: Test Unit Ready 00 00 00 00 00 00
2019-12-29T18:34:58.566093-06:00 rmpp-proddb kernel: [18434657.329224] sd 19:0:0:3: [sdg] tag#0 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
2019-12-29T18:34:58.566093-06:00 rmpp-proddb kernel: [18434657.329225] sd 19:0:0:3: [sdg] tag#0 CDB: Test Unit Ready 00 00 00 00 00 00
2019-12-29T18:34:58.566110-06:00 rmpp-proddb kernel: [18434657.329229] sd 19:0:0:5: [sdh] tag#1 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
2019-12-29T18:35:01.638133-06:00 rmpp-proddb kernel: [18434660.401040]  session1: session recovery timed out after 5 secs
2019-12-29T18:35:01.638148-06:00 rmpp-proddb kernel: [18434660.401049] sd 18:0:0:4: rejecting I/O to offline device
2019-12-29T18:35:01.638149-06:00 rmpp-proddb kernel: [18434660.401060] sd 18:0:0:6: rejecting I/O to offline device
2019-12-29T18:35:01.638150-06:00 rmpp-proddb kernel: [18434660.401066] sd 18:0:0:7: rejecting I/O to offline device
 iscsid: Kernel reported iSCSI connection 2:0 error (1022 - ISCSI_ERR_NOP_TIMEDOUT: A NOP has timed out) state (3)
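
To hand the storage admin a short list of what was hit, the kernel messages above can be condensed into one row per SCSI device. The following is only a rough sketch: the regular expressions are written against the exact wording of the lines pasted here (other kernel versions may word them differently), and the function name `summarize` and the `SAMPLE` string are made up for illustration.

```python
import re

# Written against kernel lines shaped like the ones in this post, e.g.
#   sd 18:0:0:6: [sdc] tag#2 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
FAILED_RE = re.compile(
    r"sd (?P<hctl>\d+:\d+:\d+:\d+): \[(?P<dev>sd[a-z]+)\] tag#\d+ "
    r"FAILED Result: hostbyte=(?P<hostbyte>\w+)"
)
# and
#   sd 18:0:0:4: rejecting I/O to offline device
OFFLINE_RE = re.compile(
    r"sd (?P<hctl>\d+:\d+:\d+:\d+): rejecting I/O to offline device"
)


def summarize(text):
    """Return (failed, offline) for a chunk of syslog text.

    failed  maps host:channel:target:lun -> (device name, hostbyte status)
    offline is the set of host:channel:target:lun that rejected I/O
    """
    failed = {}
    offline = set()
    # finditer scans the whole text, so it also copes with several
    # log entries that ended up on the same physical line.
    for m in FAILED_RE.finditer(text):
        failed[m.group("hctl")] = (m.group("dev"), m.group("hostbyte"))
    for m in OFFLINE_RE.finditer(text):
        offline.add(m.group("hctl"))
    return failed, offline


# Hypothetical sample built from two of the lines above.
SAMPLE = (
    "kernel: [18434655.285299] sd 18:0:0:6: [sdc] tag#2 FAILED Result: "
    "hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK\n"
    "kernel: [18434660.401060] sd 18:0:0:6: rejecting I/O to offline device\n"
)

failed, offline = summarize(SAMPLE)
for hctl, (dev, hostbyte) in sorted(failed.items()):
    state = "offline" if hctl in offline else "failed"
    print(hctl, dev, hostbyte, state)
```

Passing the contents of /var/log/messages to `summarize` instead of `SAMPLE` should give the same kind of summary for the real log, provided the message wording matches.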

On Thu, Jan 2, 2020 at 10:30 AM Mladen Gogala <gogala.mladen_at_gmail.com> wrote:

> Looks like one of the disks comprising the disk group gave up the ghost.
> Please check /var/log/messages. Try dropping the defective disk from the
> disk group. If that doesn't work, there is always RMAN. This doesn't look
> like a good way to start the year 2020. However, you should always look on
> the bright side of life 😉.
> Regards
>
> On Thu, Jan 2, 2020, 11:18 Vinay Kumar Narisetty <
> narisetty.vinay_at_gmail.com> wrote:
>
>> Hi,
>>
>> On my Oracle 19c database, the ASM diskgroup "DATA" got dismounted.
>> I was able to mount it again this morning and start up the database
>> instance without any issue.
>>
>> I see the following messages in the alert.log file:
>> 2020-01-01 18:36:14.477+ORA-15032: not all alterations performed
>> ORA-15017: diskgroup "DATA" cannot be mounted
>> ORA-15040: diskgroup is incomplete
>> ORA-15080: synchronous I/O operation failed to read block 0 of disk 0 in
>> disk group
>> ORA-27061: waiting for async I/Os failed
>> Linux-x86_64 Error: 5: Input/output error
>> Additional information: 4294967295
>> Additional information: 4096
>> . For details refer to "(:CLSN00107:)" in
>> "/u01/app/grid/diag/crs/rmpp-proddb/crs/trace/ohasd_oraagent_grid.trc".
>> 2020-01-01 18:36:14.486 [OHASD(4156)]CRS-2878: Failed to restart resource
>> 'ora.DATA.dg'
>> 2020-01-01 18:36:14.545 [ORAAGENT(4337)]CRS-8503: Oracle Clusterware
>> process ORAAGENT with operating system process ID 4337 experienced fatal
>> signal or exception code 11.
>> 2020-01-01T18:36:14.599529-06:00
>> Errors in file
>> /u01/app/grid/diag/crs/rmpp-proddb/crs/trace/ohasd_oraagent_grid.trc
>> (incident=1):
>> CRS-8503 [__lll_unlock_elision()+48] [Signal/Exception: 11] [Instruction
>> Addr: 0x7f7629fe0640] [Memory Addr: (nil)] [] [] [] [] [] [] [] []
>> Incident details in:
>> /u01/app/grid/diag/crs/rmpp-proddb/crs/incident/incdir_1/ohasd_oraagent_grid_i1.trc
>>
>> Can anyone advise why my ASM diskgroup was dismounted?
>>
>> alert.log:
>> 2019-12-31 13:53:04.324 [ORAAGENT(4337)]CRS-5010: Update of configuration
>> file "/u01/app/oracle/product/19.3.0/db_1/network/admin/listener.ora"
>> failed: details at "(:CLSN00015:)" in
>> "/u01/app/grid/diag/crs/rmpp-proddb/crs/trace/ohasd_oraagent_grid.trc"
>> 2020-01-01 18:36:12.106 [ORAAGENT(4337)]CRS-5011: Check of resource
>> "PUNV" failed: details at "(:CLSN00007:)" in
>> "/u01/app/grid/diag/crs/rmpp-proddb/crs/trace/ohasd_oraagent_grid.trc"
>> 2020-01-01 18:36:14.477 [ORAAGENT(4337)]CRS-5017: The resource action
>> "ora.DATA.dg start" encountered the following error:
>> 2020-01-01 18:36:14.477+ORA-15032: not all alterations performed
>> ORA-15017: diskgroup "DATA" cannot be mounted
>> ORA-15040: diskgroup is incomplete
>> ORA-15080: synchronous I/O operation failed to read block 0 of disk 0 in
>> disk group
>> ORA-27061: waiting for async I/Os failed
>> Linux-x86_64 Error: 5: Input/output error
>> Additional information: 4294967295
>> Additional information: 4096
>> . For details refer to "(:CLSN00107:)" in
>> "/u01/app/grid/diag/crs/rmpp-proddb/crs/trace/ohasd_oraagent_grid.trc".
>> 2020-01-01 18:36:14.486 [OHASD(4156)]CRS-2878: Failed to restart resource
>> 'ora.DATA.dg'
>> 2020-01-01 18:36:14.545 [ORAAGENT(4337)]CRS-8503: Oracle Clusterware
>> process ORAAGENT with operating system process ID 4337 experienced fatal
>> signal or exception code 11.
>> 2020-01-01T18:36:14.599529-06:00
>> Errors in file
>> /u01/app/grid/diag/crs/rmpp-proddb/crs/trace/ohasd_oraagent_grid.trc
>> (incident=1):
>> CRS-8503 [__lll_unlock_elision()+48] [Signal/Exception: 11] [Instruction
>> Addr: 0x7f7629fe0640] [Memory Addr: (nil)] [] [] [] [] [] [] [] []
>> Incident details in:
>> /u01/app/grid/diag/crs/rmpp-proddb/crs/incident/incdir_1/ohasd_oraagent_grid_i1.trc
>>
>> 2020-01-01 18:36:16.381 [ORAAGENT(33223)]CRS-8500: Oracle Clusterware
>> ORAAGENT process is starting with operating system process ID 33223
>> 2020-01-01 18:36:22.963 [ORAAGENT(33223)]CRS-5017: The resource action
>> "ora.punv.db start" encountered the following error:
>> 2020-01-01 18:36:22.963+ORA-01078: failure in processing system parameters
>> LRM-00109: could not open parameter file
>> '/u01/app/oracle/product/19.3.0/db_1/dbs/initPUNV.ora'
>> . For details refer to "(:CLSN00107:)" in
>> "/u01/app/grid/diag/crs/rmpp-proddb/crs/trace/ohasd_oraagent_grid.trc".
>> 2020-01-01 18:36:24.046 [OHASD(4156)]CRS-2878: Failed to restart resource
>> 'ora.punv.db'
>>
>>
>> Regards,
>> Vinay Narisetty
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jan 02 2020 - 17:45:06 CET