
Re: startup/shutdown problem - long

From: Ruth Gramolini <rgramolini_at_tax.state.vt.us>
Date: Fri, 28 Jul 2000 11:15:42 -0400
Message-Id: <10572.113250@fatcity.com>


Do you have any full directories? If your archive log directory is full, for instance, you will have problems. If your SYSTEM tablespace is full and Oracle can't write to the dictionary, it can hang. If the directory holding one of your control files is full, it will hang, and so on. In other words, check for space problems.
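
A minimal sketch of that space check, assuming `df -k` prints the usual six columns with the capacity in column 5 and the mount point in column 6. The function name `flag_full` and the 95% default are illustrative only, and on older Solaris releases `nawk` may be needed in place of `awk` for the `-v` option:

```sh
#!/bin/sh
# Print "mountpoint capacity" for every filesystem at or above a threshold.
# Reads df-style text on stdin so it can be fed live or saved output.
# Assumption: capacity is field 5 and the mount point is field 6.
flag_full() {
    threshold=${1:-95}
    awk -v limit="$threshold" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)                 # strip the trailing percent sign
        if (pct + 0 >= limit) print $6 " " $5
    }'
}

# Typical use: df -k | flag_full 95
```

Anything it prints that holds archive logs, control files, or datafiles would be worth a closer look.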

HTH,
Ruth
----- Original Message -----
From: Sindu <sindu_at_bigfoot.com>
To: Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com>
Sent: Friday, July 28, 2000 6:55 AM
Subject: startup/shutdown problem - long

Hi all,

Oracle 8.0.5.2.1, Solaris 2.6, 1GB Ram, 2 instances/2 databases...

I had problems shutting down both of our databases today. There is nothing strange in the alert file and no trace file; the alert log looks like this:

Thu Jul 27 14:29:59 2000
Shutting down instance (normal)
License high water mark = 2
Thu Jul 27 14:29:59 2000
ALTER DATABASE CLOSE NORMAL
Thu Jul 27 14:30:00 2000
SMON: disabling tx recovery
SMON: disabling cache recovery
Thu Jul 27 14:30:00 2000
Thread 1 closed at log sequence 1
Thu Jul 27 14:30:01 2000
Completed: ALTER DATABASE CLOSE NORMAL
Thu Jul 27 14:30:01 2000
ALTER DATABASE DISMOUNT
Completed: ALTER DATABASE DISMOUNT

And it just hangs there, with no CPU or disk activity. I waited almost an hour before deciding to do a shutdown abort. After the shutdown abort, I can no longer start (mount) the database; the alert log shows:

PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7

Thu Jul 27 16:44:29 2000
alter database mount

and it hangs here...

After that I tried to shut down the other instance/database, and I had exactly the same problem with the second instance...

I searched MetaLink and found suggestions to reboot the machine because of something wrong with the shared memory... Anyway, this is our development machine, so I have time to experiment with it. Its uptime is already 267 days, so I'm trying to preserve it :)

I then ran truss on svrmgrl, as suggested on MetaLink:

truss -fae -vall -o /tmp/mytruss svrmgrl
SVRMGR> connect internal;
Connected.
SVRMGR> startup nomount;
ORACLE instance started.

Total System Global Area                        121265680 bytes
Fixed Size                                          48656 bytes
Variable Size                                    18735104 bytes
Database Buffers                                102400000 bytes
Redo Buffers                                        81920 bytes
SVRMGR>

And I got the following, repeated again and again non-stop:
26674:  semop(1638400, 0xEFFFE9D0, 1)   (sleeping...)
26678:  semop(1638400, 0xEFFFE920, 1)   (sleeping...)
26682:  semop(1638400, 0xEFFFE920, 1)   (sleeping...)
26686:  semop(1638400, 0xEFFFE920, 1)   (sleeping...)
26700:  semop(1638400, 0xEFFFE920, 1)   (sleeping...)
26674:      Received signal #14, SIGALRM, in semop() [caught]
26674:  semop(1638400, 0xEFFFE9D0, 1)                   Err#91 ERESTART
26674:          semnum=2     semop=-1    semflg=0
26674:  sigprocmask(SIG_BLOCK, 0xEFFFE5E8, 0x00000000)  = 0
26674:           set = 0 0 0 0
26674:  times(0xEFFFE578)                               = -1987064759
26674:          utim=1      stim=5      cutim=0      cstim=0      (HZ=100)
26674:  setitimer(ITIMER_REAL, 0xEFFFE578, 0x00000000)  = 0
26674:           value:  interval:    0.000000 sec  value:    3.000000 sec
26674:  sigprocmask(SIG_UNBLOCK, 0xEFFFE5E8, 0x00000000) = 0
26674:           set = 0 0 0 0
26674:  setcontext(0xEFFFE6B8)
26678:      Received signal #14, SIGALRM, in semop() [caught]
26678:  semop(1638400, 0xEFFFE920, 1)                   Err#91 ERESTART
26678:          semnum=3     semop=-1    semflg=0
26678:  sigprocmask(SIG_BLOCK, 0xEFFFE538, 0x00000000)  = 0
26678:           set = 0 0 0 0
26678:  times(0xEFFFE4C8)                               = -1987064746
26678:          utim=2      stim=1      cutim=0      cstim=0      (HZ=100)
26678:  setitimer(ITIMER_REAL, 0xEFFFE4C8, 0x00000000)  = 0
26678:           value:  interval:    0.000000 sec  value:    3.000000 sec
26678:  sigprocmask(SIG_UNBLOCK, 0xEFFFE538, 0x00000000) = 0
26678:           set = 0 0 0 0
26678:  setcontext(0xEFFFE608)
26682:      Received signal #14, SIGALRM, in semop() [caught]
26682:  semop(1638400, 0xEFFFE920, 1)                   Err#91 ERESTART
26682:          semnum=4     semop=-1    semflg=0
26682:  sigprocmask(SIG_BLOCK, 0xEFFFE538, 0x00000000)  = 0
26682:           set = 0 0 0 0
26682:  times(0xEFFFE4C8)                               = -1987064733
26682:          utim=1      stim=3      cutim=0      cstim=0      (HZ=100)
26682:  setitimer(ITIMER_REAL, 0xEFFFE4C8, 0x00000000)  = 0
26682:           value:  interval:    0.000000 sec  value:    3.000000 sec
26682:  sigprocmask(SIG_UNBLOCK, 0xEFFFE538, 0x00000000) = 0
26682:           set = 0 0 0 0
26682:  setcontext(0xEFFFE608)
26686:      Received signal #14, SIGALRM, in semop() [caught]
26686:  semop(1638400, 0xEFFFE920, 1)                   Err#91 ERESTART
26686:          semnum=5     semop=-1    semflg=0
26686:  sigprocmask(SIG_BLOCK, 0xEFFFE538, 0x00000000)  = 0
26686:           set = 0 0 0 0
26686:  times(0xEFFFE4C8)                               = -1987064712
26686:          utim=0      stim=6      cutim=0      cstim=0      (HZ=100)
26686:  setitimer(ITIMER_REAL, 0xEFFFE4C8, 0x00000000)  = 0
26686:           value:  interval:    0.000000 sec  value:    3.000000 sec
26686:  sigprocmask(SIG_UNBLOCK, 0xEFFFE538, 0x00000000) = 0
26686:           set = 0 0 0 0
26686:  setcontext(0xEFFFE608)
26700:      Received signal #14, SIGALRM, in semop() [caught]
26700:  semop(1638400, 0xEFFFE920, 1)                   Err#91 ERESTART
26700:          semnum=7     semop=-1    semflg=0
26700:  sigprocmask(SIG_BLOCK, 0xEFFFE538, 0x00000000)  = 0
26700:           set = 0 0 0 0
26700:  times(0xEFFFE4C8)                               = -1987064676
26700:          utim=2      stim=3      cutim=0      cstim=0      (HZ=100)
26700:  setitimer(ITIMER_REAL, 0xEFFFE4C8, 0x00000000)  = 0
26700:           value:  interval:    0.000000 sec  value:    3.000000 sec
26700:  sigprocmask(SIG_UNBLOCK, 0xEFFFE538, 0x00000000) = 0
26700:           set = 0 0 0 0
26700:  setcontext(0xEFFFE608)
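
For a long truss file, a summary can be easier to read than the raw loop. The following is only a sketch, assuming the line layout shown above (PID followed by a colon in the first field, and `semop(` and `ERESTART` on the same line); the function name `count_semop_restarts` is made up for illustration:

```sh
#!/bin/sh
# Count, per PID, how many semop() calls were interrupted with ERESTART.
# Reads truss output on stdin and prints "pid count", sorted by PID.
count_semop_restarts() {
    awk '/semop\(/ && /ERESTART/ {
        pid = $1
        sub(/:$/, "", pid)      # drop the trailing colon from "26674:"
        n[pid]++
    }
    END { for (p in n) print p, n[p] }' | sort
}

# Typical use: count_semop_restarts < /tmp/mytruss
```

If every background process shows a steadily growing count, they are all timing out while waiting on the same semaphore set.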


$ps -fe|grep ora_
oracle 26682     1  0 16:27:08 ?        0:00 ora_lgwr_TEST
oracle 26700     1  0 16:27:08 ?        0:00 ora_reco_TEST
oracle 26696     1  0 16:27:08 ?        0:00 ora_smon_TEST
oracle 26678     1  0 16:27:07 ?        0:00 ora_dbw0_TEST
oracle 26686     1  0 16:27:08 ?        0:00 ora_ckpt_TEST
oracle 26674     1  0 16:27:07 ?        0:00 ora_pmon_TEST

$ipcs

IPC status from <running system> as of Fri Jul 28 16:30:36 2000
T         ID      KEY        MODE        OWNER    GROUP
Message Queues:
Shared Memory:
m       2000   0xf63e48fc --rw-r-----   oracle      dba
Semaphores:
s    1638400   00000000   --ra-r-----   oracle      dba

And when I type "alter database mount", it just hangs, and the truss output shows the same semop Err#91 ERESTART errors as above...

Does anybody have any idea how to solve this (preferably without rebooting the Sun machine)?
I have tried removing (ipcrm) all shared memory segments and semaphores before running svrmgrl, still no luck... By the way, isn't it strange that the key for the above semaphore is 00000000?
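
For reference, the cleanup step can be sketched as below. It assumes the Solaris-style ipcs layout shown above (type in column 1, ID in column 2, owner in column 5) and only prints the ipcrm commands rather than running them; the function name `list_ipcrm_cmds` is hypothetical, and this should only be used once every instance owned by that user is down:

```sh
#!/bin/sh
# Turn ipcs output into "ipcrm -m <id>" / "ipcrm -s <id>" commands for
# shared memory segments (m) and semaphore sets (s) owned by one user.
# Nothing is removed here; the commands are printed for review.
list_ipcrm_cmds() {
    owner=${1:-oracle}
    awk -v who="$owner" '
        ($1 == "m" || $1 == "s") && $5 == who {
            print "ipcrm -" $1 " " $2
        }'
}

# Typical use: ipcs | list_ipcrm_cmds oracle
```

Printing the commands first leaves a chance to check that nothing belonging to a running instance is on the list.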

/etc/system:

set shmsys:shminfo_shmmni=200
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmseg=200
set semsys:seminfo_semmap=1000
set semsys:seminfo_semmsl=1000
Received on Fri Jul 28 2000 - 10:15:42 CDT
