RE: high "latch: cache buffers chains" waits in 10.2.0.3 DB
Date: Wed, 27 Feb 2008 23:30:48 -0700
I still maintain that divide and conquer and partitioned problem solving yields more deterministic results than trying different solutions. To that extent, I believe that, the MTS is a way to isolate the activity originating from the web services. There is perhaps a beneficial side effect of MTS, the number of processes afforded to the Web processing is capped and to that extent it cannot overrun the server.
I agree with Greg that the free lists is not relevant to your problem to the extent that the query is the consumer of resources. And that brings up two aspects related to the CPU consumption and run queues.
- Process creation while has gotten cheaper in modern architectures is not free. To that extent MTS will reduce the CPU run queues (I concede that session creation is far more expensive than process creation and does not compare database activity that follow).
- The Query could be processing a larger number of blocks. That the number of processes is growing suggests that the ones that have been submitted are not completing (assuming a run away condition or a large number of users hitting the back button and revisiting the page has been ruled out). See works by Cary Millsap and several others on logical IO.
Thus I believe the tactic should be to buy time and introduce stabilizing factors and then tackle the fundamental problem.
President/Executive Architect, Quasar Database Technologies, LLC http://www.linkedin.com/in/quasardb
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of bkaltofen_at_gmx.de
Sent: Wednesday, February 27, 2008 6:15 AM To: Greg Rahn
Subject: Re: high "latch: cache buffers chains" waits in 10.2.0.3 DB
thanks for all the responses.
I will try to respond to all suggestions.
SQL tuning is not an option, as I said. I know that it is the preferred approach in this case, I suggested that to teh customer myself, but not possibel. The customer knows about that, but pleases us to look for an other way.
ASH and AWR are not usable, because Diag Pack (which is needed for that, or am I wrong?) is not licensed...
I agree with Greg, that MTS will change nothing, as we already increased the max process from 300 to 600 some weeks ago. That only changed the time, till no new connections are possible. With tha actual 600 processes the maschine goes to the limit. The load goes up to 15-20 when the problem occurres.
Also connection pooling is not possible my the application, and the customer will not spend money in changes to the app, as they plan to replace it in the middle future.
We identified one select statement accessing the table with hot blocks, that consumed over 90% of CPU time during high load and max processes. And there is only one insert statement on the mentioned table.
What you think about moving the table to a tablespace without ASSM and increasing the freelists and freelist_groups? As far as I know freelist-changes are not possible in ASSM tablespaces.
Again, I can not test anything, as we have found no way to reproduce the problem by will and we have no test database.
Greg Rahn schrieb:
> On 2/26/08, krish.hariharan_at_quasardb.com <krish.hariharan_at_quasardb.com>
>> If you are dealing with two separate problems it may be better to eliminate
>> one of them. The first problem I would try to eliminate is the max >> processes. Start out with Shared Servers (a measure of success would be to
>> know, by sampling, max number of active sessions); you could go to town by
>> implementing connection pooling and multiplexing, but start with a simple >> shared server model. I have built perl/cgi based web apps and have >> successfully dealt with them using MTS/SS. >> >> This may give you the breather necessary to see if the second problem is >> related to above or a separate issue. >> >> Regards, >> -Krish >> >
> While the principle is novel, I'm going to respectfully disagree with
> this suggestion.
> Based on the basic information provided, the high number of processes
> is a *symptom* of some locking, not a cause. When the locking occurs,
> other processes just queue up until the max process limit is hit and
> the db is unusable. Adding more processes would only allow this run
> away train to gain more speed and cause more damage, possibly
> rendering the host unusable and instead of a db reboot, now a hard
> reboot is required. In this case, resources are likely to be consumed
> directly proportional to their availability.
> If the issue is related to locking, then Shared Servers isn't going to
> yield any benefit since there really isn't any time where a connection
> is not doing anything: either it (1) logs on, does its work, then logs
> off or (2) logs on, does work and is blocked. Either way, the
> connection would be active (as the db sees it) and would require a
> server process.
> In this case, I think it is safe to say "A" causes "B" so fixing "A"
> should be the goal, not working around "B".
Received on Thu Feb 28 2008 - 00:30:48 CST