Re: Query runs fine on InstanceA but slow on InstanceB...

From: Kellyn Pot'vin <kellyn.potvin_at_ymail.com>
Date: Fri, 6 Jan 2012 09:20:38 -0800 (PST)
Message-ID: <1325870438.35790.YahooMailNeo_at_web121003.mail.ne1.yahoo.com>



Hi Dave,
I think you may have what you need already at your fingertips with trace files if you are sure this process is the clearly the culprit and not the whole environment.  As the query creates a more involved trace when you produce a 10053 trace file, why not start with a 10046 trace file, level 12, (bind variables and waits, so if there is any bind peeking issues, etc., they will show) which can then be run through TKPROF to give a clear output of what wait differences there are?  Method R's tools, such as Mr Trace also an excellent tool for deciphering waits that may not be caught by TKPROF and some of the tools are quite cost effective, such as the plug-in for Oracle Developer.

This might give you the higher level information you need and if it does not answer your question, you can then run the 10053 trace and drill down to the specific waits listed in the 10046 to help limit what data you inspect in the larger trace file.

Hope this helps,

Kellyn Pot'Vin
Sr. Database Administrator and Developer DBAKevlar.com



 From: "Taylor, Chris David" <ChrisDavid.Taylor_at_ingrambarge.com> To: "'dmann99_at_gmail.com'" <dmann99_at_gmail.com> Cc: "'oracle-l_at_freelists.org'" <oracle-l_at_freelists.org> Sent: Friday, January 6, 2012 8:41 AM
Subject: RE: Query runs fine on InstanceA but slow on InstanceB...  

David,

I was thinking several different options as I read your email - however, the more I read, each option quickly disappeared as you covered the strategies I would have used (minus SQLT which I haven't used).

So, I'd be interested in an update on this if you find a solution that works well.

One thing you didn't cover, is are these instances on the same server? Is it a vm or physical? Is the query running the same way in both instances - same app, same workstation (or whatever)?  Have you exported stats from one instance to the other to see if that resolves it?

Often it doesn't take a *significant* difference in stats (or parameters) between instances to really change performance.

I'd probably start by exporting stats for the affected tables (schema stats if possible) from Instance A to Instance B.  If that resolves it, then you know it's stats related. (I *think*)

Chris Taylor
Sr. Oracle DBA
Ingram Barge Company
Nashville, TN 37205

"Quality is never an accident; it is always the result of intelligent effort."
-- John Ruskin (English Writer 1819-1900)

CONFIDENTIALITY NOTICE: This e-mail and any attachments are confidential and may also be privileged. If you are not the named recipient, please notify the sender immediately and delete the contents of this message without disclosing the contents to anyone, using them for any purpose, or storing or copying the information on any medium.

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of David Mann Sent: Friday, January 06, 2012 8:42 AM
To: oracle-l_at_freelists.org
Subject: Query runs fine on InstanceA but slow on InstanceB...

I need some advice. I seem to be asked more and more the dreaded "why does my query run faster on InstanceA and slower on InstanceB?"

I provide all kinds of query tuning help to our developers, testers, and product support. Mostly this covers a surgical strike of 1 SQL statement. I investigate the issue, find a couple of ways to improve the query, document my improvement and provide my recommendation ... and then move on.

This A vs B seems to take me about 4X the time of a regular tuning request. I am now investigating the performance of a query on InstanceA, a query on InstanceB, and using my ever evolving checklist of why execution plans can differ to drill down into each reason (instance params, data volume, system stats, object stats, etc) and look for significant differences. I can usually find a way to improve the Query_at_InstanceB performance but I am not always successful finding a root cause for the difference. Plus I usually want to know this as much as my customer so I know what to look out for when dealing with future tuning issues with their apps.

I have my own checklist to guide my investigation and a bunch of custom queries to check data volume, system statistics, object statistics (basic table statistics right now, would like to expand to cover histograms, etc). I don't think I am covering all the bases since my checklist still gets changed/expanded after every tuning engagement.

SQLT probably covers everything I look for (and more) has been a good tool to get an idea of the environment surrounding the execution of a query on a system. but it is a little cumbersome at times (I can be called on to help with any of 600 DB instances - I end up installing it pretty much every time I need it somewhere to make sure I have the latest/greatest version). I had high hopes but limited success with SQLTCompare. I have only gotten it to compare successfully maybe 2 out of the 6 times I used it.

I have also had some success with 10053 traces for smaller queries. But at least for me these quickly get out of hand so I usually only use them as a last resort.

So my question is how do you approach something like this? Is there a 'best' method or tool for investigating and drilling down to a root cause of the same statement against 2 difference DB instances? Are there any book chapters, blog posts, or other resources you have seen covering this type of investigation? Is it time for me to get busy writing my own tool? :)

--

Dave Mann
www.brainio.us
www.ba6.us - Database Stuff - http://www.ba6.us/rss.xml
--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l Received on Fri Jan 06 2012 - 11:20:38 CST

Original text of this message