Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Mailing Lists -> Oracle-L -> RE: Statspack ratios help
While all of this is good information, I don’t necessarily see an indication
of a problem…
How long and at what times does the snapshot include?
What are the top 5 wait events reported?
Are users complaining?
What was it like when it was "fast" or when it worked correctly?
Bryan,
Never be afraid to ask a question...and there are no stupid questions...only
stupid answers. I shall endeavor to ensure this is not a stupid answer.
There is nothing inherently wrong with hit ratios. The problem is when
people rely on hit ratios as a valid performance metric. If you want to
monitor hit ratios and you happen to notice a precipitous drop in a
particular hit ratio, well, perhaps there's something there....perhaps.
The thing to remember is that a hit ratio does not reliably correlate to
performance, and therefore is NOT a valid performance metric. If you
approach a database that is having performance problems, look at a hit
ratio, and determine it's "too low", you're likely to spin your wheels
trying to "fix" the hit ratio by improving it, but miss the root cause of
the performance problem.
A few words about 100% buffer cache hit ratio. This is actually often a
sign of trouble, perhaps big trouble. The theory goes like this "disk I/O
is X times slower than memory access (pick your favorite number for X, 100,
1000, 10000, whatever) so therefore, you want all your I/O to be logical
(from the buffer cache) thereby eliminating all that slow disk I/O." Sounds
good, right? There are some things you have to remember.....Oracle's buffer
cache is part of the SGA, which is a shared memory segment, which all the
Oracle server processes share access to. In order to ensure the integrity
of that data in shared memory, Oracle has some pretty elaborate locking
mechanisms. I won't go into details, but, very breifly (and incompletely) a
"latch" is a small chunk of memory which serializes access to a specific
part of the SGA. The buffer cache is protected by latches and buffer locks
that make sure that while one process is reading data from a particular
block, another process can't overwrite that buffer.
So, when you consider the serialization overhead of Oracle's SGA, the
difference in cost between a logical I/O and a physical I/O is nowhere near
10000 or 1000 or even 100. If I recall correctly, Cary Millsap published a
number, based on huge amounts of empirical data (trace files) that
calculated it to be something like a factor of 37. Ok, that's still 37x
slower! That's worth eliminating, right??
Ok, next, a little change in focus here. There is no problem if a
particular well-tuned query has a hit ratio of 100%. In fact, most anyone
would agree that this is a good thing. So, what's the problem with a 100%
(or nearly 100%) buffer cache hit ratio instance wide?? Well, maybe
nothing....maybe. But it's also very possible to have an instance with a
100% buffer cache hit ratio that is so bottlenecked that noone can get any
use out of it. The problem lies in the serialization I mentioned before.
Even if you have a 100% buffer cache hit ratio, every buffer cache access is
serialized through latches, buffer locks, and buffer pins. This is
necessary for Oracle to maintain data integrity. So, what happens if you
have a query that (for example) uses an index very inefficiently? You have
a query that hits the same blocks over and over again. Because they are
repeatedly hit, they remain in the buffer cache, so, no disk I/O! Yay!
But, the serialization is killing you. What happens when you run several
instances of these inefficient queries concurrently? They will push your
buffer cache hit ratio well past 99.99%, but, the system will be suffering.
The serialization invlolved in locking and latching will bring the system to
it's knees. So, in this case, you have the "ideal" BCHR, but your system
will run like a dog.
The most effective way to determine the root cause of a performance problem
is to specifically analyze the process that's having the problem. Look at
the performance profile. Where is time actually being spent? Focus on
that! Tools like the 10046 trace at level 8, the Oracle wait interface, and
StatsPack, can be very useful in determining the root cause.
-Mark
PS If you go to http://www.hotsos.com/ and click on Library, Cary Millsap
has a paper there called something like "Why a 99+% buffer cache hit ratio
is bad", and he makes a much more eloquent argument than I do above. Lots
of other good papers there too....take some time to read through some of
them. Registration is required, but free.
-- Mark J. Bobak Senior Oracle Architect ProQuest Information & Learning For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled. --Richard P. Feynman, 1918-1988Received on Tue Jun 06 2006 - 15:38:41 CDT
________________________________________ From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Bryan Wells Sent: Tuesday, June 06, 2006 3:39 PM To: Oracle-L Subject: Re: Statspack ratios help As a Jr. wanting to be Sr. (someday) I have to ask the stupid questions:
why is Buffer % bad? Dont you want 100% buffer cache hit vs. 100% disk i/o?
On 6/6/06, David Sharples <davidsharples_at_gmail.com> wrote: looks like you got some nasty sql in there (a buffer cache of 100% is not a good thing, looks like you are doing much logical io)
Anyway, does web logic cache its statements for you, usually a parameter withing the database connections properties
On 06/06/06, Sandeep Dubey <dubey.sandeep_at_gmail.com > wrote: Rollback per transaction %: 44.17 Buffer Nowait %: 100.00 Redo NoWait %: 99.97
Buffer Hit %: 100.00 In-memory Sort %: 100.00
Library Hit %: 100.00 Soft Parse %: 100.00
Execute to Parse %: 9.62 Latch Hit %: 99.88
Parse CPU to Parse Elapsd %: 69.06 % Non-Parse CPU: 91.60 With 100% soft parse, execute to parse ratio is so low. Is it bad, how I can I improve it? I see rollback per transaction as 44.17. We are using Hibernate that generates database mapping and produces most of the SQLs. How can I invetigate further? But I doubt if application is doing some big time rollbacks. -- Bryan S Wells Cell: 303.532.9879 Email: bunjibry_at_gmail.com -- http://www.freelists.org/webpage/oracle-l