Re: oracle database memory access becomes slow after restart

From: Gaja Krishna Vaidyanatha <gajav_at_yahoo.com>
Date: Fri, 17 Aug 2012 17:53:15 -0700 (PDT)
Message-ID: <1345251195.78759.YahooMailNeo_at_web83601.mail.sp1.yahoo.com>



Hi Eagle,
Not sure whether you got any responses on this. Some questions for you:
  1. Have you traced the Oracle session that queries x$kslei using 10046 and found out what the session is waiting for? 
  2. Do you see the same problem for all queries? How about something simple like - "select sysdate from dual;"?
  3. Have you DTraced the server process for the Oracle session and found anything interesting in the output?

Let us know.

Cheers,

Gaja
 
Gaja Krishna Vaidyanatha,
CEO & Founder, DBPerfMan LLC
http://www.dbperfman.com
http://www.dbcloudman.com

Phone - +1-650-743-6060
LinkedIn - http://www.linkedin.com/in/gajakrishnavaidyanatha

Co-author: Oracle Insights: Tales of the Oak Table - http://www.apress.com/9781590593875
Primary Author: Oracle Performance Tuning 101 - http://www.amzn.com/0072131454
Enabling Cloud Deployment & Management for Oracle Databases



 From: Eagle Fan <eagle.f_at_gmail.com>
To: oracle-l_at_freelists.org
Sent: Monday, August 13, 2012 8:16 PM
Subject: oracle database memory access becomes slow after restart  

Hi:
We have seen this problem several times on Sun T3-1 servers, and once or twice on Sun T4-1 servers.

Each server has 128G of memory. The database version is 11.2.0.2, 10.2.0.4, or 10.2.0.3, and the SGA is set to about 105G.

After a database restart, memory access becomes slow. It doesn't always happen; usually it happens on servers that have been up for a long time.

For example, on 11.2.0.2:

Before restart:

select count(*) from x$kslei;

  COUNT(*)
----------
      1142

Elapsed: 00:00:06.57

After restart:

select count(*) from x$kslei;

  COUNT(*)
----------
      1142

Elapsed: 00:00:43.33
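
To put a number on the gap, the two Elapsed values above can be compared directly. This is only a rough sketch in Python, not something from the original test; the helper name elapsed_seconds is made up for illustration.

```python
def elapsed_seconds(text):
    """Convert a SQL*Plus 'HH:MM:SS.ss' elapsed string to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

# Timings copied from the two runs of the same query shown above.
before = elapsed_seconds("00:00:06.57")
after = elapsed_seconds("00:00:43.33")

# Ratio of the post-restart time to the pre-restart time.
slowdown = after / before
print(f"before={before:.2f}s after={after:.2f}s slowdown={slowdown:.1f}x")
```

So the same query over the same 1142 rows takes roughly six and a half times longer after the restart.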

We also see mutex and latch problems on these databases. I think they are a result of the slow memory access.

If we reboot the server and then restart the database, it's back to normal.

A possible reason is memory fragmentation. Here is the explanation from Oracle Support:

"Since T3 has page size of 4 MB and Solaris kernel has single thread free memory coalescing thread, it takes 15-20 minutes to coalesce the free memory to create large contiguous free memory chunk after we shutdown the Oracle database. Since we immediately start bringing up the database before the free memory is coalesced, the next shared memory segment allocation is fragmented and thus causes more memory latency compared to 1 single large memory chunk."

The granule size is set to 128M in our database. Our current workaround is to do a database failover instead of a database restart. But that is more complicated, and we need to make sure the inactive node has just been rebooted before the failover. Including the inactive node's reboot time, it takes much longer than a database restart.

Do you have the same problem on T3 servers? How do you deal with it?
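
For a sense of scale, the figures in this message (about 105G of SGA, 4 MB pages, 128M granules) work out as below. This is plain arithmetic, not a measurement. It assumes that G and M mean binary units and that the whole SGA is backed by 4 MB pages, neither of which is confirmed here.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Figures taken from this message; binary units are an assumption.
sga_bytes = 105 * GB        # "about 105G"
page_bytes = 4 * MB         # page size quoted by support
granule_bytes = 128 * MB    # granule size in our database

# Assumes every byte of the SGA sits on a 4 MB page.
pages_in_sga = sga_bytes // page_bytes
granules_in_sga = sga_bytes // granule_bytes
pages_per_granule = granule_bytes // page_bytes

print(f"4 MB pages needed for the SGA: {pages_in_sga}")
print(f"128M granules in the SGA:      {granules_in_sga}")
print(f"4 MB pages per granule:        {pages_per_granule}")
```

Under those assumptions the restart has to find on the order of 27,000 free 4 MB pages, which gives some idea of why un-coalesced free memory could matter.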

Is there any way to check the OS memory fragmentation status?

Thanks.

-- 
Eagle Fan


--
http://www.freelists.org/webpage/oracle-l

Received on Fri Aug 17 2012 - 19:53:15 CDT
