Re: SQL performance in prod

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Wed, 25 Jul 2018 19:23:08 -0400
Message-ID: <a9f7b70d-d29c-84a2-9bd2-1538ac61e237_at_gmail.com>



There is some material in Jonathan's Core DBA book, allegedly the first one of a trilogy. Douglas Adams has written a trilogy consisting of 6 books, Jonathan has written a trilogy consisting of a single book. Jonathan hasn't written anything about the SEP field, although this method can be extremely handy in the DBA world. In addition to the Jonathan's excellent work, there is a noteworthy presentation by Mauro Pagano:

https://de.slideshare.net/MauroPagano3/chasing-the-optimizer-71564184

There is also something in the Bob's Furniture Store: http://www.oaktable.net/contribute/10053-viewer

Regards

On 07/19/2018 04:49 PM, Cee Pee wrote:
> Nice thread, we get same kind of problems too. Is there a book or
> paper on how to read the 10053 trace file, like "basics of 10053"?
> Reading Tim's response, I think it is a good time someone wrote a book
> on that topic.
> CP
>
> On Thu, Jul 19, 2018 at 9:38 AM, Larry Elkins <elkinsl_at_verizon.net
> <mailto:elkinsl_at_verizon.net>> wrote:
>
> I’ve normally used DBMS_SHARED_POOL.PURGE with the C option to
> purge specific SQL’s and get a reparse and desired plan. It
> continued to work ok, for me anyway, in 12c, but Carlos Sierra,
> and maybe some others, had mentioned potential issues in 12x. Like
> the link mentions, I’d normally do this to get a re-parse it to
> pick up a baseline we just implemented, or simply to get a
> re-parse away from an “odd” value such as is already being discussed.
>
> https://carlos-sierra.net/2017/11/22/purging-a-cursor-in-oracle-revisited/
> <https://carlos-sierra.net/2017/11/22/purging-a-cursor-in-oracle-revisited/>
>
> Regards,
>
> Larry G. Elkins
>
> elkinsl_at_verizon.net <mailto:elkinsl_at_verizon.net>
>
> 214.695.8605
>
> *From:*oracle-l-bounce_at_freelists.org
> <mailto:oracle-l-bounce_at_freelists.org>
> [mailto:oracle-l-bounce_at_freelists.org
> <mailto:oracle-l-bounce_at_freelists.org>] *On Behalf Of *Mark W. Farnham
> *Sent:* Thursday, July 19, 2018 9:08 AM
> *To:* tim.evdbt_at_gmail.com <mailto:tim.evdbt_at_gmail.com>; 'Stefan
> Knecht' <knecht.stefan_at_gmail.com <mailto:knecht.stefan_at_gmail.com>>
> *Cc:* vraman4list_at_gmail.com <mailto:vraman4list_at_gmail.com>;
> 'oracle-l-freelists' <oracle-l_at_freelists.org
> <mailto:oracle-l_at_freelists.org>>
> *Subject:* RE: SQL performance in prod
>
> well… if you change the number of cpus, then you can get a
> different number of parallel servers and a different hash
> distribution, so something prone to skew of row assignments could
> in theory be pretty flat with one number of cpus getting one
> number of parallel servers and slow with a different number of
> cpus getting a different number of parallel servers with a small
> number of them getting the load.
>
> So you were initially right (but for a pretty rare yet interesting
> case.)
>
> I don’t know whether JL’s laundry list of “nothing changed, but
> the performance is different” laundry list has been mentioned in
> this thread, but as laundry lists go it is better than most.
>
> Of course something like Method-R’s tool can pin down what is
> taking time in the pair. That removes all guessing about WHAT is
> consuming the time, which sometimes helps diagnose what triggers
> the different classes of response.
>
> But most likely you are getting entirely different plan, so you
> want to focus on that to rule it out first. It is often the case
> that non-prod has a less robust parameter choice in the collection
> of regression tests for performance. As in the story of the rainy
> day, if in prod an unusual parameter choice produces a plan that
> is sub-optimal for the majority of re-uses of the plan for that
> query, then you would like to be able to flush just that one query
> from the shared pool and let it get reparsed. I don’t know how to
> do that, despite asking for that rifle shot in place of the  shot
> gun flush shared pool that can cause a parse storm decades ago.
>
> Still, if you have the sql text you can modify it with a comment
> and see if a fresh parse is likely to get the good plan and more
> importantly, eliminate the prescience required to ask for a
> Wolfgang trace.
>
> In addition to difference of parameter (in the sense of bind
> variable predicate choice as opposed to init stuff), timing of
> stats collection versus running of the query such that some
> predicates exceed the recorded high value is a classic way to get
> a completely different plan.
>
> Your mileage may vary. In addition to this oracle-l thread, I
> would certainly also review the Oracle Scratchpad laundry list.
>
> mwf
>
> *From:*oracle-l-bounce_at_freelists.org
> <mailto:oracle-l-bounce_at_freelists.org>
> [mailto:oracle-l-bounce_at_freelists.org
> <mailto:oracle-l-bounce_at_freelists.org>] *On Behalf Of *Tim Gorman
> *Sent:* Thursday, July 19, 2018 9:36 AM
> *To:* Stefan Knecht
> *Cc:* vraman4list_at_gmail.com <mailto:vraman4list_at_gmail.com>;
> oracle-l-freelists
> *Subject:* Re: SQL performance in prod
>
> Stefan,
>
> I had one too many:  hardware changes could not affect execution
> plans.  I added that to pad the list without much thought just
> before pressing SEND, and of course regretted it two seconds later.
>
> There are certainly more items to add, but I started with the
> easily-verifiable stuff.
>
> Clearly a 10053 trace is the ultimate, but as the OP noted, it
> requires prescience to be set before it is needed, and prescience
> isn't always available.
>
> 10053 trace output is also not easy to read. Reading two such
> traces, comprehending both, and then comparing and contrasting
> usually requires intelligence and attention to detail approaching
> the level of Wolfgang Breitling.
>
> Thanks!
>
> -Tim
>
> On 7/18/18 21:03, Stefan Knecht wrote:
>
> Tim, you forgot one:
>
> 7. The fact whether it rains Monday morning or not
>
> The original anecdote referred to the fact that if it rained,
> a certain employee that normally arrives first on a sunny day,
> would get to the office later - which caused a different
> employee to first trigger execution plan creation, with
> different bind variables, leading to a different plan.
>
> So the query would run fast all day on a sunny day, but slow
> all day when it rained.
>
> Venky - try looking at the values of the bind variables of a
> good run vs a bad run.
>
> On Thu, Jul 19, 2018 at 5:58 AM, Tim Gorman
> <tim.evdbt_at_gmail.com <mailto:tim.evdbt_at_gmail.com>> wrote:
>
> Venky,
>
> "Assuming there is not much change in the DB"
>
> Let's narrow down the things that can change an execution plan...
>
> 1. hardware change (i.e. #-cpus, # GB of RAM, storage, etc)
> 2. application software change (i.e. change to the SQL text)
> 3. Oracle software change (i.e. patch, upgrade, etc)
> 4. initialization parameter change
> 5. gathering system statistics
> 6. gathering table, index, column statistics
>
>
> When you state the assumption about "no much change in the
> DB", I am assuming that you're discussing items #1-4.
>
> How about item #5?  Can you query the SYS.AUX_STATS$ table and
> display the column PVAL1 where PNAME has the value "DSTART" or
> "DSTOP"?
>
> How about item #6?  Can you display the contents of
> DBA_TAB_STATS_HISTORY for the tables involved in the query? 
> Please refer to the useful blog posts by Uwe Hesse HERE
> <https://uhesse.com/2012/04/23/diff_table_stats_in_history-example/>
> and by Marcel-Jan Krijgsman HERE
> <https://mjsoracleblog.wordpress.com/2013/02/19/more-statistics-history/>
> for more information, if necessary?
>
> Hope this helps?
>
> Thanks!
>
> -Tim
>
>
>
> On 7/18/18 15:30, V Raman wrote:
>
> List
>
> We have a SQL that is performing intermittently bad in our
> prod env. The good ones take 2 to 5 mins, the bad ones run
> for hours we kill them. They run fine in the non prod env.
> I ran an awsqrpt and based on that I see that there are a
> few executions with the bad ones taking hours. Looking at
> the differences in the execution plan, the good ones have
> lots of nested loops in them, with the bad ones having
> lots of hash joins.
>
> I am trying to figure out the cause(s). Assuming there is
> not much change in the DB, the first thing that comes to
> mind is statistics. Can the listers help with ideas? Thanks.
>
> If anyone is interested is seeing the report, i can
> provide a link to them by email.
>
> Venky
>
>
>
> --
>
> //
>
> zztat - The Next-Gen Oracle Performance Monitoring and
> Reaction Framework!
>
> Visit us at zztat.net <http://zztat.net/> | _at_zztat_oracle |
> fb.me/zztat <http://fb.me/zztat> | zztat.net/blog/
> <http://zztat.net/blog/>
>
>

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217



--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jul 26 2018 - 01:23:08 CEST

Original text of this message