Re: Solaris T5220 server problem

From: De DBA <>
Date: Thu, 28 Apr 2011 10:47:11 +1000
Message-ID: <>


Recently I had a similar problem, where two instances shared a single
server. Neither instance was particularly busy, and the server was only
ever used by 3 people at a time. That particular problem turned out to
be an ad-hoc query that kept reading one of the biggest tables in the
system in the default parallel mode (2 * 10 * number_cpus). A developer
had tried this query in both instances, and although the sessions had
long since disconnected, the queries kept running. It took us a long
time to find that.
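
As a rough illustration of how large that default can get, here is a minimal sketch that simply evaluates the formula quoted above. The function name is hypothetical, and the 32 is only the CPU count mentioned in the quoted message below, not a figure from the server in this anecdote.

```python
def default_parallel_processes(number_cpus):
    """Evaluate the formula exactly as quoted above: 2 * 10 * number_cpus."""
    return 2 * 10 * number_cpus


if __name__ == "__main__":
    # 32 is the CPU count given for the T5220 in the quoted message (assumption
    # that the same formula applies there).
    print(default_parallel_processes(32))
```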

Did you check for long running queries?


Wolfson Larry - lwolfs wrote:


Finally convinced the client that the long-running code wasn’t a database, application, or network problem.


Noticed that one of my queries, which usually runs in a tenth of a second elapsed time, was taking about 8 seconds on the production server.

8G, 32 CPUs, with both prod & test (separate ORACLE_HOMEs) on the same server.


Wanted the Unix admin to run some type of DTrace. I had already run truss a number of times.

Didn’t get that, but the SA found echo was running about 30-60 times longer on this server than on dozens of others we manage (most not T5220s).
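
A check like that can be repeated on any server with a small timing script. The following is only a sketch: the message does not say how the SA measured echo, so spawning /bin/echo as a separate process and taking the median of 20 runs are assumptions, and `time_command` is a hypothetical helper name.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=20):
    """Return the median wall-clock seconds needed to spawn and finish argv."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # /bin/echo runs as a separate process, so the figure mostly reflects
    # process creation and memory setup rather than echo itself.
    median = time_command(["/bin/echo", "hello"])
    print(f"median /bin/echo time: {median * 1000:.2f} ms")
```

Running the same script on a known-good server gives a baseline to compare against.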

They ran GUDS, which didn’t help, and then the support person came up with this from a buddy he reached out to.


He suggested turning page coalescing off, which we found to be beneficial in many performance escalations. This is something you can do on the fly, and if it's found to have a desirable effect, it can be permanently set in /etc/system. There are no known downsides to doing this in the real world.


Once this is enabled, could your DBAs run some test jobs which can be compared against timings for the same jobs when the test DB is down?
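
One simple way to make that comparison is to collect elapsed times for the same job under both conditions and compare the medians. This is a minimal sketch, not a prescribed method: `compare_timings` is a hypothetical helper, and the sample numbers are illustrative only, loosely echoing the "tenth of a second" and "about 8 seconds" figures mentioned earlier rather than real measurements.

```python
import statistics


def compare_timings(baseline, candidate):
    """Compare two lists of elapsed seconds for the same job.

    Returns (baseline_median, candidate_median, ratio), where a ratio
    above 1 means the candidate runs were slower than the baseline.
    """
    base = statistics.median(baseline)
    cand = statistics.median(candidate)
    return base, cand, cand / base


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not measurements.
    test_db_down = [0.11, 0.10, 0.12]
    test_db_up = [7.9, 8.2, 8.0]
    base, cand, ratio = compare_timings(test_db_down, test_db_up)
    print(f"baseline {base:.2f}s, candidate {cand:.2f}s, ratio {ratio:.1f}x")
```

The median is used instead of the mean so that a single outlier run does not dominate the comparison.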


Here are the dirty details from previous communications on the topic:

quote --->

Large pages are not a problem. It is finding or coalescing them when none is available that needs improvement. The LPOOB feature is designed to improve application out-of-box performance. A number of LPOOB fixes have already been integrated in Sol10 U4, and more are planned for U5 and U6.


It is wiser to disable coalescing than to disable LPOOB. If you don't want page coalescing, then set the following tunables dynamically or in the /etc/system file.



What I didn't mention before is that the page coalescing issue is specifically mentioned for the Niagara family of CPUs, which is what this T5220 runs, on systems running Java applications and Oracle databases (the Oracle part being pertinent here). Still not saying that it’s definitively going to resolve the problems, but it’s worth trying based on the system type, Oracle, and symptoms.


This is a dynamic change. The support person says we can easily toggle this back with no service interruption.

The client is not buying that, and I was just wondering what experience anyone else has had with T5220s?


Support said they did this mostly for SAP, and while we run a number of SAPs, none are on this server, which I would categorize as relatively lightly loaded.

Prod is far busier during the nightly batch window. Scheduled stats run well prior to that, for 3-13 minutes.


Server and database have been up close to 2 years and they just noticed these processes running longer about 6 weeks ago.

They put a new release in TEST but claim problem started just prior to that.  Not refuting that.


Thanks for any ideas, suggestions, experiences.



-- Received on Wed Apr 27 2011 - 19:47:11 CDT
