RE: oracle database memory access becomes slow after restart

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Sat, 18 Aug 2012 08:02:59 -0400
Message-ID: <013801cd7d39$6a26ada0$3e7408e0$_at_rsiz.com>



What Gaja said, AND,

You've neatly boxed the problem to "what is different between restarting the server and the database instance and just restarting the database instance." Without examining the system I cannot be comprehensive, but a couple quick questions come to mind:

Q1: Have you checked whether there are any oracle processes still attached to the previous shared memory that are still being cleaned up?

Run a ps and specifically check. Now, having a bunch of zombies around is symptomatic of something else gone wrong while the instance was active, and I won't speculate on that. But if ps shows oracle processes for the instance that is already shut down, I suggest you zap them before you restart, since they will interfere with memory re-allocation and that nice contiguous space that would nicely service the restart *might* have to be cobbled together from hither and yon where a lot of other processes are running.

Further, what else is running on this box? Are there things that are *not* being stopped that have chunks (and they can be little, spread out chunks from one or more processes and still make allocation of big chunks difficult) of memory allocated.

Q2: What is the timing on a second x$kslei query on the restart? If a repeat is fast (and presuming no result caching could apply to this query) then we are dealing with a start-up artifact, while if a repeat is also slow we are dealing with something else. Turning on 10046 as Gaja suggested should be very informative, especially differences between the two runs if they are very different.

mwf

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Gaja Krishna Vaidyanatha
Sent: Friday, August 17, 2012 8:53 PM
To: oracle-l_at_freelists.org
Subject: Re: oracle database memory access becomes slow after restart

Hi Eagle,
Not sure whether you got any responses on this. Some questions for you:

  1. Have you traced the Oracle session that queries x$kslei using 10046 and found out what the session is waiting for?
  2. Do you see the same problem for all queries? How about something simple like - "select sysdate from dual;"?
  3. Have you DTraced the server process for the Oracle session and found anything interesting in the output?

Let us know.

Cheers,

Gaja
 
Gaja Krishna Vaidyanatha,
CEO & Founder, DBPerfMan LLC
http://www.dbperfman.com
http://www.dbcloudman.com

Phone - +1-650-743-6060
LinkedIn - http://www.linkedin.com/in/gajakrishnavaidyanatha

Co-author: Oracle Insights:Tales of the Oak Table - http://www.apress.com/9781590593875
Primary Author: Oracle Performance Tuning 101 - http://www.amzn.com/0072131454 Enabling Cloud Deployment & Management for Oracle Databases



 From: Eagle Fan <eagle.f_at_gmail.com>
To: oracle-l_at_freelists.org
Sent: Monday, August 13, 2012 8:16 PM
Subject: oracle database memory access becomes slow after restart  

Hi:
We are seeing this problem several times on Sun T3-1 servers. We also see the problem 1~2 times on Sun T4-1 Server

The server has 128G memory, database version is 11.2.0.2, 10.2.0.4 or 10.2.0.3. SGA is set as about 105G.

After database restart, the access to memory becomes slow. It doesn't always happen, usually it happens on the servers which are up for a long time.

For example: 11.2.0.2 version:

Before restart:

select count(*) from x$kslei;

COUNT(*)



1142

*Elapsed: 00:00:06.57*

After restart:
select count(*) from x$kslei;

COUNT(*)



1142

*Elapsed: 00:00:43.33*

And we also see mutex, latch problem on the databases. I think that's the result of the slow memory access.

If we reboot the server and then restart the database, it's back to normal.

A possible reason is memory fragmentation. Here is the explanation from oracle support:

*S**ince T3 has page size of 4 MB and Solaris kernel has single thread free memory coalescing thread, it takes 15-20 minutes to coalesce the free memory to create large contiguous free memory chunk after we shutdown the Oracle database. Since we immediately start bringing up the database before the free memory is coalesced, the next shared memory segment allocation is fragmented and thus causes more memory latency compared to 1 single large memory chunk. *

The granule size is set as 128M in our database. The current solution is we do database failover instead of database restart. But it's more complicated and we need to make sure the inactive node is just rebooted before failover. Includes the inactivate nodes restart time, it takes much more time than database restart.

Do you have the same problem on T3 server? How do you deal with it?

Is there any way to check the OS memory fragmentation status?

Thanks.

--

Eagle Fan

--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l Received on Sat Aug 18 2012 - 07:02:59 CDT

Original text of this message