Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: Statspack ratios help

RE: Statspack ratios help

From: Khemmanivanh, Somckit <somckit.khemmanivanh_at_weyerhaeuser.com>
Date: Tue, 6 Jun 2006 16:49:17 -0700
Message-ID: <65C0D8935651CB4D96E97CEFAC5A12B9022F1404@wafedixm10.corp.weyer.pri>


>>>The thing to remember is that a hit ratio does not reliably correlate
to performance, and therefore is NOT a valid performance metric. If you approach a database that is having performance problems, look at a hit ratio, and determine it's "too low", you're likely to spin your wheels trying to "fix" the hit ratio by improving it, but miss the root cause of the performance problem.<<<  

I'll have to disagree with that somewhat. Consider that case of index scans that are serviced mainly from the DB Buffer Cache.

Your hit ratio will be good because the query(ies) are continually hitting the cache and not reading from disk.

Once you start getting cache misses then your performance drops as well....

I know what most of the response will be -- go tune the code to be more efficient...of course that is the ideal solution but in reality that is not always a possilbility...  

So you are left with things like trying to buffer as much of the data as possible (i.e. big DB cache sizes, keep pools, etc..)  

I'm not saying you don't have a valid point (most of it is) but there are always exceptions...


From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Bobak, Mark Sent: Tuesday, June 06, 2006 1:08 PM
To: bunjibry_at_gmail.com; Oracle-L
Subject: RE: Statspack ratios help

Bryan,  

Never be afraid to ask a question...and there are no stupid questions...only stupid answers. I shall endeavor to ensure this is not a stupid answer.  

There is nothing inherently wrong with hit ratios. The problem is when people rely on hit ratios as a valid performance metric. If you want to monitor hit ratios and you happen to notice a precipitous drop in a particular hit ratio, well, perhaps there's something there....perhaps.  

The thing to remember is that a hit ratio does not reliably correlate to performance, and therefore is NOT a valid performance metric. If you approach a database that is having performance problems, look at a hit ratio, and determine it's "too low", you're likely to spin your wheels trying to "fix" the hit ratio by improving it, but miss the root cause of the performance problem.  

A few words about 100% buffer cache hit ratio. This is actually often a sign of trouble, perhaps big trouble. The theory goes like this "disk I/O is X times slower than memory access (pick your favorite number for X, 100, 1000, 10000, whatever) so therefore, you want all your I/O to be logical (from the buffer cache) thereby eliminating all that slow disk I/O." Sounds good, right? There are some things you have to remember.....Oracle's buffer cache is part of the SGA, which is a shared memory segment, which all the Oracle server processes share access to. In order to ensure the integrity of that data in shared memory, Oracle has some pretty elaborate locking mechanisms. I won't go into details, but, very breifly (and incompletely) a "latch" is a small chunk of memory which serializes access to a specific part of the SGA. The buffer cache is protected by latches and buffer locks that make sure that while one process is reading data from a particular block, another process can't overwrite that buffer.  

So, when you consider the serialization overhead of Oracle's SGA, the difference in cost between a logical I/O and a physical I/O is nowhere near 10000 or 1000 or even 100. If I recall correctly, Cary Millsap published a number, based on huge amounts of empirical data (trace files) that calculated it to be something like a factor of 37. Ok, that's still 37x slower! That's worth eliminating, right??  

Ok, next, a little change in focus here. There is no problem if a particular well-tuned query has a hit ratio of 100%. In fact, most anyone would agree that this is a good thing. So, what's the problem with a 100% (or nearly 100%) buffer cache hit ratio instance wide?? Well, maybe nothing....maybe. But it's also very possible to have an instance with a 100% buffer cache hit ratio that is so bottlenecked that noone can get any use out of it. The problem lies in the serialization I mentioned before. Even if you have a 100% buffer cache hit ratio, every buffer cache access is serialized through latches, buffer locks, and buffer pins. This is necessary for Oracle to maintain data integrity. So, what happens if you have a query that (for example) uses an index very inefficiently? You have a query that hits the same blocks over and over again. Because they are repeatedly hit, they remain in the buffer cache, so, no disk I/O! Yay! But, the serialization is killing you. What happens when you run several instances of these inefficient queries concurrently? They will push your buffer cache hit ratio well past 99.99%, but, the system will be suffering. The serialization invlolved in locking and latching will bring the system to it's knees. So, in this case, you have the "ideal" BCHR, but your system will run like a dog.  

The most effective way to determine the root cause of a performance problem is to specifically analyze the process that's having the problem. Look at the performance profile. Where is time actually being spent? Focus on that! Tools like the 10046 trace at level 8, the Oracle wait interface, and StatsPack, can be very useful in determining the root cause.  

-Mark  

PS If you go to http://www.hotsos.com/ and click on Library, Cary Millsap has a paper there called something like "Why a 99+% buffer cache hit ratio is bad", and he makes a much more eloquent argument than I do above. Lots of other good papers there too....take some time to read through some of them. Registration is required, but free.  

-- 
Mark J. Bobak 
Senior Oracle Architect 
ProQuest Information & Learning 

For a successful technology, reality must take precedence over public
relations, for Nature cannot be fooled.  --Richard P. Feynman, 1918-1988

.com <mailto:bunjibry_at_gmail.com>  


--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jun 06 2006 - 18:49:17 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US