Re: cache buffer chains/where in code
Date: Sat, 28 Nov 2009 03:13:18 -0500
It's a single UltraSparc T2 CPU, which has 8 cores with 8 threads per core. Note that each core has 2 integer pipelines, so you could think of it as 16 CPUs and 64 threads.
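To put rough numbers on that, here is a minimal sketch of the arithmetic. The core, thread and pipeline counts are the ones stated above, and the 170 is the "ON CPU" session count from the ASH data quoted further down. The function names are made up for illustration, and the ratio is only a crude indicator, not a measurement of the actual run queue.

```python
# Figures for the T2 as described in this message.
CORES = 8
THREADS_PER_CORE = 8
INT_PIPELINES_PER_CORE = 2

def cpu_capacity(cores, threads_per_core, pipelines_per_core):
    """Return (hardware_threads, integer_pipelines) for a CMT chip."""
    return cores * threads_per_core, cores * pipelines_per_core

def oversubscription(runnable_sessions, slots):
    """Ratio of sessions wanting CPU to available execution slots."""
    return runnable_sessions / slots

hw_threads, pipelines = cpu_capacity(
    CORES, THREADS_PER_CORE, INT_PIPELINES_PER_CORE
)
print(hw_threads, pipelines)  # 64 16

# 170 sessions "ON CPU" during the problem period (from the ASH samples).
on_cpu = 170
print("per hardware thread:", oversubscription(on_cpu, hw_threads))
print("per integer pipeline:", oversubscription(on_cpu, pipelines))
```

Either way you count it, there are more sessions wanting CPU than there are places to run them.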
There are many things wrong with this setup, and reducing the number of connections is something I am considering. However, it's not that simple. Imagine that instead of using CPU those sessions were doing IO: you would want a relatively deep IO queue to allow the RAID array to deliver.
One thing that puzzles me: if the suspicion is that a deep CPU run queue is the problem, why is only one very specific latch causing trouble? Several different types of queries are running at the same time, so why is only one specific query causing latch contention and not the others?
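One way to look at that question is to break the latch waits down by query and latch address. The sketch below does this over rows shaped like ASH samples. The column names (session_state, event, sql_id, p1) are assumed to mirror v$active_session_history, with p1 holding the latch address for latch waits. The function name and the sample rows are invented purely to show the shape of the data; they are not from the real system.

```python
from collections import Counter

def latch_waits_by_sql(samples, event="latch: cache buffers chains"):
    """Count waiting samples per (sql_id, latch address) for one event."""
    counts = Counter()
    for row in samples:
        if row["session_state"] == "WAITING" and row["event"] == event:
            counts[(row["sql_id"], row["p1"])] += 1
    # Most frequent (sql_id, address) pairs first.
    return counts.most_common()

# Made-up rows, for illustration only.
samples = [
    {"session_state": "WAITING", "event": "latch: cache buffers chains",
     "sql_id": "sql_a", "p1": 1001},
    {"session_state": "WAITING", "event": "latch: cache buffers chains",
     "sql_id": "sql_a", "p1": 1001},
    {"session_state": "ON CPU", "event": None,
     "sql_id": "sql_b", "p1": 0},
    {"session_state": "WAITING", "event": "db file sequential read",
     "sql_id": "sql_c", "p1": 7},
]

for (sql_id, address), n in latch_waits_by_sql(samples):
    print(sql_id, address, n)
```

If one sql_id and one address dominate the output, that points at a hot block touched by that query rather than at the run queue alone.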
On Fri, Nov 27, 2009 at 11:37 PM, Greg Rahn <greg_at_structureddata.org> wrote:
> 400 sessions seems very excessive for this hardware (how many and what
> model are the CPUs?, what does cpu_count show if defaulted). I've
> seen numerous systems that run significantly better when they reduce
> the number of connections/sessions significantly. Most think that
> more == better, and that is usually not the case. Generally I refer
> to this scenario as being "over processed".
> I'd be interested to know if the issue still appears with a reduced
> number of sessions. I'd suggest to experiment what is the minimal
> number of sessions required to keep the response times acceptable and
> how that impacts the CPU usage and run queue. As a starting point I'd
> use 1 session per CPU core (thread in the case of the CMT processors).
> On Fri, Nov 27, 2009 at 11:18 AM, Christo Kutrovsky
> <kutrovsky.oracle_at_gmail.com> wrote:
> > I've analyzed ASH data for the problem period. Usually there are 10-20 sessions
> > in each sample. When this happens, there are nearly 400 sessions, with 250 of
> > them waiting on the same latch/latch address, and 170 "ON CPU".
> > So that drives me towards Greg's suggestion that it could be a deep CPU
> > run-queue issue. This can be confirmed with your suggestions of capturing
> > vmstat/prstat information.
> > I wonder what is the correct approach here to prevent deep CPU run-queues
> > from causing latch contention, considering UltraSparc T2 CMT cpus. Reduce
> > the number of sessions? Implement resource manager?
> Greg Rahn
--
Christo Kutrovsky
Senior Consultant
Pythian.com
I blog at http://www.pythian.com/blogs/
--
http://www.freelists.org/webpage/oracle-l

Received on Sat Nov 28 2009 - 02:13:18 CST