RE: hanging shutdowns (addressing the requirement for a UNIX reboot)

From: <oracle-l-bounce_at_freelists.org>
Date: Tue, 28 Feb 2006 07:38:46 -0500
Message-ID: <AA29A27627F842409E1D18FB19CDCF2706FCDC02@AABO-EXCHANGE02.bos.il.pqe>

John,

Very nice explanation, thanks! And I'm in that same camp w/ you and Jeremiah.

Thanks,

-Mark

--

Mark J. Bobak
Senior Oracle Architect
ProQuest Information & Learning

"Exception: Some dividends may be reported as qualified dividends but are not qualified dividends. These include:

Dividends you received on any share of stock that you held for less than 61 days during the 121-day period that began 60 days before the ex-dividend date. The ex-dividend date is the first date following the declaration of a dividend on which the purchaser of a stock is not entitled to receive the next dividend payment. When counting the number of days you held the stock, include the day you disposed of the stock but not the day you acquired it. See the examples below. Also, when counting the number of days you held the stock, you cannot count certain days during which your risk of loss was diminished. See Pub. 550 for more details." --IRS, Form 1040-A Instruction Booklet, Line 9b: Qualified Dividends

-----Original Message-----

From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org]
Sent: Monday, February 27, 2006 8:19 PM
To: roger_xu_at_dp7uptx.com; Oracle-L_at_Freelists
Subject: RE: hanging shutdowns (addressing the requirement for a UNIX reboot)

All,

I am with Jeremiah on this: a shutdown abort DOES NOT harm a database (at least not in the five years I used it on a set of active databases a few years ago). The ONLY time a DB had a problem after a shutdown abort was during an 8i-to-9i upgrade. There was a bug related to the change of format in the redo log to support LSB, which showed up when a shutdown abort was issued partway through the upgrade, before it had completed. I don't remember the specifics, but it appeared only during the upgrade.

As to the requirement to reboot the Solaris server: was this because the database did not restart and complained of 'Unable to create Shared Mem segment' (or a similar message)? I believe this could have been because you killed the background processes after a 'shutdown immediate' "hang". Once you initiate a 'shutdown immediate' and 'control-c' out of it, you will never be able to log in again, since any new attach will complain that a shutdown is in progress, and the only way out is to kill the backend processes. In that case the shared memory segment is never released, and you get the error at database restart because the SHM start address is calculated to be the same as that of the existing, still-allocated segment (everything else being equal). You can get out of this quite easily, as the following real-life example shows:

In this case, I had three databases (the surviving 1st and 2nd DBs, and a third whose backend had to be killed). Use 'ipcs -am' to list the shared memory segments, calculate the SGA size of the surviving databases, and map the segments to the databases using the LPIDs as shown below. Then use 'ipcrm -m <ID>' to remove the *right* segments (the ones with LPID 23175 in this case, i.e. IDs 147840 and 276867), which will then allow you to restart the database. (Take it from me, I have done it many times before.) In addition, the NATTCH column showing 0 attaches is another giveaway!

$ ipcs -am | head -2; ipcs -am | grep oracle
IPC status from <running system> as of Thu Dec 8 13:47:57 BST 2005

T      ID  KEY         MODE         OWNER   GROUP  CREATOR  CGROUP  NATTCH      SEGSZ   CPID   LPID     ATIME     DTIME     CTIME
m  147840  0           --rw-r-----  oracle  dba    oracle   dba          0  655441920   8931  23175  13:47:22  13:47:22  11:42:07
m       2  0xdd27ed28  --rw-r-----  oracle  dba    oracle   dba         16  371458048   6548  22193  13:45:01  13:45:01  14:35:12
m  276867  0xfa9fd35c  --rw-r-----  oracle  dba    oracle   dba          0  502874112   8931  23175  13:47:22  13:47:22  11:42:11
m  787590  0           --rw-r-----  oracle  dba    oracle   dba        139  655441920  11593  23223  13:47:46  13:47:47   6:06:10
m  716359  0xe315db0c  --rw-r-----  oracle  dba    oracle   dba        139  502874112  11593  23223  13:47:46  13:47:47   6:06:15

1st surviving DB:

SQL> show sga

Total System Global Area 1157681312 bytes <== (LPID 23223, 139 attaches)

Fixed Size                    73888 bytes
Variable Size             501182464 bytes
Database Buffers          655360000 bytes
Redo Buffers                1064960 bytes

1158316032 = 655441920 + 502874112 (LPID 23223 - 2 segments)

2nd surviving DB:

SQL> show sga

Total System Global Area 370548720 bytes <== (LPID 22193)

Fixed Size                    69616 bytes
Variable Size             328454144 bytes
Database Buffers           40960000 bytes
Redo Buffers                1064960 bytes
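
The mapping described above can also be scripted. What follows is only an illustrative sketch, not part of the original procedure: it assumes the Solaris `ipcs -am` column order shown in the listing (NATTCH is the 9th field, SEGSZ the 10th, LPID the 12th), and the function names are made up for this example. It flags segments with no attaches and totals the segment sizes per LPID, so the totals can be compared with the 'show sga' figures.

```python
from collections import defaultdict

# Rows copied from the `ipcs -am` listing above (Solaris column order assumed).
IPCS_ROWS = """\
m 147840 0 --rw-r----- oracle dba oracle dba 0 655441920 8931 23175 13:47:22 13:47:22 11:42:07
m 2 0xdd27ed28 --rw-r----- oracle dba oracle dba 16 371458048 6548 22193 13:45:01 13:45:01 14:35:12
m 276867 0xfa9fd35c --rw-r----- oracle dba oracle dba 0 502874112 8931 23175 13:47:22 13:47:22 11:42:11
m 787590 0 --rw-r----- oracle dba oracle dba 139 655441920 11593 23223 13:47:46 13:47:47 6:06:10
m 716359 0xe315db0c --rw-r----- oracle dba oracle dba 139 502874112 11593 23223 13:47:46 13:47:47 6:06:15
"""


def parse_segments(text):
    """Return one dict per shared memory row with ID, NATTCH, SEGSZ and LPID."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a shared memory ("m") row.
        if len(fields) < 12 or fields[0] != "m":
            continue
        segments.append({
            "id": int(fields[1]),
            "nattch": int(fields[8]),
            "segsz": int(fields[9]),
            "lpid": int(fields[11]),
        })
    return segments


def orphan_ids(segments):
    """IDs of segments that no process is attached to (NATTCH == 0)."""
    return sorted(s["id"] for s in segments if s["nattch"] == 0)


def size_by_lpid(segments):
    """Total segment size in bytes, grouped by last-attaching PID."""
    totals = defaultdict(int)
    for s in segments:
        totals[s["lpid"]] += s["segsz"]
    return dict(totals)


segments = parse_segments(IPCS_ROWS)
print("Orphan segment IDs:", orphan_ids(segments))
print("Bytes per LPID:", size_by_lpid(segments))
```

With the rows above, the two segments with NATTCH 0 are IDs 147840 and 276867, both with LPID 23175. The per-LPID totals for the surviving databases come out slightly larger than the 'show sga' totals, so the match is by closeness rather than exact equality. Double-check any ID against the live `ipcs` output before passing it to `ipcrm -m`.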

John Kanagaraj <><
DB Soft Inc
Phone: 408-970-7002 (W)

Co-Author: Oracle Database 10g Insider Solutions http://www.amazon.com/exec/obidos/tg/detail/-/0672327910/

The opinions and facts contained in this message are entirely mine and do not reflect those of my employer or customers

-----Original Message-----

From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Roger Xu
Sent: Monday, February 27, 2006 3:24 PM
To: Oracle-L_at_Freelists
Subject: RE: hanging shutdowns

What should I do if "shutdown immediate" hangs? Last time, I had to reboot the Solaris Server.

-----Original Message-----

From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Edgar Chupit
Sent: Monday, February 27, 2006 2:12 PM
To: Oracle-L_at_Freelists
Subject: Re: hanging shutdowns

Dear Jeremiah,

First of all, I would like to mention that I don't like to shut down a database without a practical reason (such as hardware or OS maintenance, upgrades, etc.).

And still, I would like to argue that under normal circumstances startup force restrict + shutdown immediate (or shutdown abort, startup force, shutdown immediate) will run almost as fast as, and is just as dangerous as, a single shutdown immediate.

After a shutdown abort, in order to perform a cold backup you still need to start up the database and close it in a consistent state. Database startup is not a very fast process in itself, because Oracle not only needs to recover the database to a consistent state (rolling back uncommitted transactions), but also has to allocate memory structures and prepare itself for normal work. And to shut the database down in a consistent state you still need to issue shutdown immediate.
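
For reference, the sequence under discussion can be written out as a SQL*Plus command list. The sketch below only builds the script text; the function name and the `checkpoint_first` flag are invented for this illustration, and the optional checkpoint is Jeremiah's suggestion quoted further down. It assumes the result is fed to `sqlplus /nolog` by an OS user allowed to connect as SYSDBA.

```python
def cold_backup_shutdown_script(checkpoint_first=True):
    """Build the SQL*Plus commands for abort -> restricted startup -> clean shutdown."""
    lines = ["connect / as sysdba"]
    if checkpoint_first:
        # Flush dirty buffers first so crash recovery at startup has less to do.
        lines.append("alter system checkpoint;")
    lines += [
        "shutdown abort",      # stop the instance without waiting for sessions
        "startup restrict",    # crash recovery runs; only restricted sessions allowed
        "shutdown immediate",  # leaves the datafiles consistent for a cold backup
        "exit",
    ]
    return "\n".join(lines) + "\n"


print(cold_backup_shutdown_script())
```

The output could be saved to a file and run with `sqlplus /nolog @file`. Whether this is actually quicker than a single shutdown immediate is exactly the point being argued here.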

One of the common reasons why shutdown immediate can take a long time is that Oracle waits for the SNP processes to wake up (Note: 1018421.102), but this can also happen when shutdown immediate is called the second time (after startup force), so even checkpointing and using startup force restrict can still leave the database hanging in shutdown immediate.

Also, there is Note: 46001.1, which suggests minimizing the use of shutdown abort on Windows systems because it can cause "allocation problems when Oracle is next started."; Note: 161234.1, which describes a situation in which shutdown abort can hang; and Note: 222553.1, which states that startup force can be safer than shutdown abort. And there are plenty of other notes that describe different problems that can occur during database shutdown.

And surely there are many bugs that can occur after shutdown abort (but under normal circumstances shutdown abort is very safe).

Having said all this, I would like to return to the subject of the thread and suggest that the original poster try to convince management to switch to hot backups, and forget about shutting down the databases for backups at all.

On 2/27/06, Jeremiah Wilton <jeremiah_at_ora-600.net> wrote:

> If you 'alter system checkpoint' before the 'shutdown abort' then it 
> should be a lot faster for the user with a hanging or prolonged 
> 'shutdown immediate'.

> Jeremiah Wilton
> ORA-600 Consulting

> Recoveries - Seminars - Hiring
> http://www.ora-600.net

--

Best regards,
Edgar Chupit
callto://edgar.chupit
--

http://www.freelists.org/webpage/oracle-l

Received on Tue Feb 28 2006 - 06:38:46 CST