
RE: Performance Tool Question (CONFIO DBFlash ) ...

From: Bruce McCartney <bruce.mccartney_at_dbinfosystems.com>
Date: Wed, 16 Nov 2005 20:37:52 -0700
Message-ID: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAAkEdZ3vpKf0Cnkb171va42cKAAAAQAAAAM95LkIsj9EOm/24lP3TXpAEAAAAA@dbinfosystems.com>


Hi Sam,
It depends. With statistical sampling you will introduce error. The magnitude of the error depends on your workload and on how long you sample for. I have used a similar product developed by Precise Software (now Veritas), which attached directly to the SGA to read the X$ tables (which are just memory structures anyway) on a statistical sampling cycle of anywhere from 1/s to 999/s. The key here is that you need lots of samples to reduce the probability of significant statistical error. It is the same reason opinion polls can report results to within +/- a few percent: the margin is managed through the sample size. With Precise, the overhead was very low, and that allowed us to collect and save a week's worth of detail data. I found in the field that 3/s sampling was good enough for us to find problematic statements in practice. Cary Millsap covers the problems associated with scoping and sampling in his book on optimizing Oracle, and he also explains a superior method for actually resolving performance problems: profiling where the time went via extended tracing. That method also suffers from measurement resolution and quantization error (pp. 155-170). Cary argues that the error is not significant. I agree completely as far as the extended tracing method goes, and I have seen that it is not a huge issue with sampling either.
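To put rough numbers on the polling analogy, here is a minimal sketch in Python using the textbook margin-of-error formula for a sampled proportion. The function names are made up for illustration, and the formula assumes independent samples, which back-to-back samples of the same session are not, so treat the results as optimistic.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n samples.

    Assumes independent samples (an idealization for session sampling).
    """
    return z * math.sqrt(p * (1.0 - p) / n)

def samples_needed(p, margin, z=1.96):
    """Number of independent samples needed to reach the given margin of error."""
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))

if __name__ == "__main__":
    # Hypothetical case: a statement is really active in 10% of samples.
    # Compare one hour of sampling at 1/s and at 3/s.
    for rate_per_s in (1, 3):
        n = rate_per_s * 3600
        print(rate_per_s, "per second:", n, "samples, margin",
              round(margin_of_error(0.10, n), 4))
    # Samples needed for the classic "+/- 3%" poll at p = 0.5.
    print(samples_needed(0.5, 0.03))
```

The margin shrinks with the square root of the sample count, so tripling the sampling rate buys less than doubling the precision. Sampling for longer does the same job.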

One thing to be aware of is the effect of 'select'ing every second and the way it influences the performance of the system (known as the anthropic principle http://www.anthropic-principle.com/primer.html). The thing I liked about the direct memory attach approach is that you minimize your intrusion on the 'system performance'. I was able to fence the CPU and memory used by collecting memory samples through the direct attach (no Oracle connection, no SGA use, no buffer cache use). You may want to try to quantify the impact of a SQL-based sampling approach.
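As a starting point for quantifying that impact, here is a rough Python sketch that times a probe repeatedly and reports what fraction of each sampling interval it keeps busy. The `probe` callable is a placeholder for whatever runs the sampling query (it is not a DBFlash or Oracle API), and this only measures elapsed time as seen by the client. CPU and latch cost inside the instance would have to come from the database's own statistics.

```python
import time

def measure_probe_overhead(probe, iterations=30, interval_s=1.0, pause=True):
    """Time `probe` repeatedly and summarize its cost per sampling interval.

    probe      -- hypothetical callable that performs one sampling pass
    interval_s -- intended sampling interval in seconds
    pause      -- if True, sleep out the rest of each interval like a real sampler
    """
    durations = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        probe()
        elapsed = time.perf_counter() - t0
        durations.append(elapsed)
        if pause:
            time.sleep(max(0.0, interval_s - elapsed))
    mean_s = sum(durations) / len(durations)
    return {
        "calls": len(durations),
        "mean_s": mean_s,
        "max_s": max(durations),
        # Share of every interval spent inside the probe itself.
        "busy_fraction": mean_s / interval_s,
    }

if __name__ == "__main__":
    # Stand-in probe; replace with a function that runs the real sampling SQL.
    result = measure_probe_overhead(lambda: sum(range(100_000)),
                                    iterations=5, pause=False)
    print(result)
```

Running it once on an idle system and once under load would give a first idea of whether the sampler itself becomes part of the workload.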

Hope that helps...



Bruce McCartney |DBIS |*403 615 3350 | bruce.mccartney_at_dbinfosystems.com  


From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Sam Bootsma
Sent: November 16, 2005 2:20 PM
To: oracle-l_at_freelists.org
Subject: Performance Tool Question (CONFIO DBFlash ) ...

I have been reviewing the white papers for the DBFlash product from CONFIO. I am impressed, but I do have one reservation.  

DBFlash works by running a SQL statement (or group of statements) against X$ tables on the monitored database once every second. The data is pulled across the network to a repository on a separate database server and database instance, where it is analyzed. A GUI client can then access the repository and tell you which SQL statements are waiting the most, and what wait events those SQL statements are waiting on. It can also do this for database users, OS users, programs, and a few more.

My concern has to do with the frequency of polling (once every second). Oracle records waits in microseconds, and there are 1 million microseconds in a second (I think). So a wait can last 10,000 microseconds and not be picked up by the software. In fact, I would think that most waits would not be picked up by the software, because most waits probably start after one snapshot and finish before the start of the next snapshot.
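The concern can be tested with a small simulation. The Python sketch below is only an illustration under made-up assumptions (20 waits per second across all sessions, each lasting exactly 10,000 microseconds, starting at uniformly random times, sampled once per second). It is not how DBFlash works internally. It counts how many individual waits a once-per-second snapshot sees, then estimates total wait time by crediting each observed wait with one full sampling interval.

```python
import random

def simulate_sampling(total_seconds=3600, waits_per_second=20,
                      wait_us=10_000, interval_us=1_000_000, seed=1):
    """Sample short waits at a fixed interval and estimate total wait time.

    All parameters are illustrative. Requires wait_us <= interval_us so that
    a wait can be seen by at most one sample.
    """
    rng = random.Random(seed)
    horizon_us = total_seconds * 1_000_000
    n_waits = total_seconds * waits_per_second
    caught = 0
    for _ in range(n_waits):
        start = rng.randrange(horizon_us - wait_us)
        # First sample instant at or after the start of this wait.
        next_sample = -(-start // interval_us) * interval_us
        if next_sample < start + wait_us:
            caught += 1
    true_total_us = n_waits * wait_us
    # Each observed wait stands in for one whole sampling interval.
    estimated_total_us = caught * interval_us
    return {
        "waits": n_waits,
        "caught": caught,
        "fraction_seen": caught / n_waits,
        "true_total_us": true_total_us,
        "estimated_total_us": estimated_total_us,
    }

if __name__ == "__main__":
    print(simulate_sampling())
```

Under these assumptions only about 1 wait in 100 is ever seen, yet the estimated total stays close to the true total. A rare event with few waits in the window would be estimated far less reliably.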

I posed this question to CONFIO, and this is the response from their DBA:  

  1. With wait event tuning, the events occurring more frequently will be caught by DBFlash. We do statistical sampling, which by definition will miss some things. However, the problems, i.e. the wait events happening more frequently or waiting more time, will be caught by DBFlash. In other words, DBFlash will be able to find your problematic waits, which is what you want.

What do you guys think? Is the integrity of the performance data questionable because of the "long" delays between polling? Or is the response from CONFIO valid?  

Thanks!    

Sam Bootsma

George Brown College

sbootsma_at_gbrownc.on.ca

416-415-5000 x4933  

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Nov 16 2005 - 22:46:23 CST
