Oracle-L: RE: missed Anjo's webcast..

From: Karniotis, Stephen <Stephen_Karniotis_at_compuware.com>
Date: Fri, 09 Aug 2002 11:28:30 -0800
Message-ID: <F001.004B0E80.20020809112830@fatcity.com>

Cary:

That was a great idea to mention "The Goal". The premise of this book is a manufacturing operation is having significant difficulties meeting expected outputs. Dr. Goldratt describes a theory called the "Theory of Constraints" whereby in order to improve a process, you must remove the constraint causing the problem. Simple enough within manufacturing, but sometimes difficult in information technology, as many of the constraints are either unseen or out of your control.

Gaja is going to love this:
BCHR is a measurement. It is a statistic that measures percentages. It is an indicator of something. Unfortunately, we have been bitten by the "groupthink" bug that says that low BCHRs are bad and high BCHRs are good. If that were in fact true, then Craig, Cary, Gaja, Anjo and others would not have written the wait interface. A historical note: The wait event interface was actually present in late versions of Oracle Version 6, but you REALLY had to dig for it. The wait event interface does one thing: it locates a constraint on the system. That's it. It's your job to figure out how to solve the problem, and there are a multitude of instruments that can be used to correct the situation. However, the wait interface can also be considered a statistic and thereby biased. To Anjo's credit, the numerics from the interface are actual runs, executions, etc. from within the C code of the kernel. You can't anymore direct than that.

As much as I am a believer of using BCHR as a performance indicator, I also believe that the wait interface helps to locate and correct, in a more direct manner, constraints within the system. From the theory of constraints, once the constraint is removed, you move on to the next constraint, and so on. With BCHR, once corrected, you sit and watch until the next problem occurs. Not much value in that.

As Gaja is ramping up his phone to call me, I have been convinced that the W/I adds value to your toolkit. Now if Anjo and the boys could have added a more friendly interface to this thing, that would have been great. Must be former assembler programmers.

Thank You

Stephen P. Karniotis
Product Architect
Compuware Corporation

Direct:	(248) 865-4350
Mobile:	(248) 408-2918
Email:	Stephen.Karniotis_at_Compuware.com
Web:	www.compuware.com

 -----Original Message-----
Sent:	Friday, August 09, 2002 2:19 PM
To:	Multiple recipients of list ORACLE-L
Subject:	RE: missed Anjo's webcast..

Dan,

I think I can convince you of the contrary, but I doubt that I can do it solely via email. In our Clinics, where we spend three solid days addressing of all of the reservations you've stated below (including proof with examples of why "holistic" approaches are unreliable, and why we believe response time analysis is actually the fastest route to root-cause identification). The goal of the Clinic is to send you home able to actually *do* the stuff you see us writing about...

One of the reasons I'm so confident in Response Time analysis is that it seems to be the final convergence for performance optimization work in other industries, some of which have been evolving for over a century. For a really nice description of one such industry, read Eli Goldratt's "The Goal."

What Response Time analysis gives you that ratios don't is a valid priority-based task order. To take your cue, you're right: a slow response time on a query *could* indicate any of literally hundreds of different root causes. But seeing a Response Time account of exactly what that query did will show you exactly where that query spent its time. Consequently, you'll see exactly what you need to repair. However, if you're not looking at those response time statistics, then the only method you're left with is to check each of the hundreds of things that *might* have caused the query to be slow. Ratios are supposed to guide you through that process, but they often mislead you (as Dave Ensor, Graham Wood, Anjo Kolk, Jonathan Lewis, Connor McDonald, Steve Adams, Gaja Vaidyanatha, Kirti Deshpande, John Kanagaraj, James Morle, Mogens Nørgaard, Jeff Holt, I, and several others keep demonstrating).

We visit several sites each year who have suffered the same performance problem for months or years. They've checked everything they can think of, and they haven't found the problem root cause. It is extremely rare for the same people to require more than one hour to positively identify and begin the repair of the correct root cause, when presented with Response Time data. It's mind boggling how difficult it is to convince people to do it that way for the first time. Once they try it, they never go back.

Within the next few months, I hope to have completed my book project ("Oracle Response Time Optimization" or some-such). The questions on this list are exceptionally good inspiration for such a project, by the way. When the book is finished, if I still haven't convinced you, then I simply won't have done my job.

Cary Millsap
Hotsos Enterprises, Ltd.
http://www.hotsos.com

Upcoming events:
- Hotsos Clinic, Oct 1-3 San Francisco, Oct 15-17 Dallas, Dec 9-11
Honolulu
- 2003 Hotsos Symposium on Oracle(r) System Performance, Feb 9-12 Dallas

Next event: NCOAUG Training Day, Aug 16 Chicago

-----Original Message-----
Sent: Friday, August 09, 2002 12:09 PM
To: Multiple recipients of list ORACLE-L

Stream of conciousness/random thoughts...

To focus exclusively on response time is to fall victim to the same fallacy
as focusing exclusively on hit ratios. The fallacy is that there is one and
only one methodology to use for tuning/troubleshooting. In 5 years, another
approach will supplant response time tuning and we will all scratch our heads and wonder why Anjo and Cary could not see the obvious. Just kidding...you guys will probably be the first to say "While the other approach was good, here is why the new approach is better with this new technology/knowledge".

The best approach, which is often preached, is to use the holistic approach.
I recall a paper from some years ago that discussed this approach. Most of
the
tuning books advocate such an approach, but we techies get bogged down in
the day to day operations and only read the chapters that are of immediate
concern.

A poor hit ratio (of any cache) may indicate a problem, even if the response
time is adequate. How many systems never increase in size or number of users?

Response time monitoring describes the symptom. It does not define the root
cause. A large number of waits for the database writer does not conclusively
determine a single root cause. Rather, there are many possible causes (from
the cache to processes to i/o subsystem). A slow response time on a query
could indicate a malformed query, missing/present indexes, poor storage parameters, an invalid high water mark, excessive read consistent transforms, i/o contention, etc.

One reason why hit ratios are nice is that they are tangible and measurable.
Response time satisfaction is not. What is 'slow' to one user is 'fast' to
another.

Thoughts...<ducking the inevitable flaming emails>

Dan Fink

-----Original Message-----
Sent: Friday, August 09, 2002 4:18 AM
To: Multiple recipients of list ORACLE-L

Cary,

I love to talk about the response time thing, but it shouldn't be a lunch but a big dinner (will also settle for Tuborg and hotdogs ;-)).

I think it is acutally great to see this discussion on the BCHR. I don't

except people to jump to response time tuning directly and drop the ratio tuning thing. However the response time tuning approach works great also for management. They like to hear things like 85 percent of the end user response time is in the network, because the hit ratio could be perfect in this example (but the problem is outside the database and the hit ratio will not show that .......).

Reponse time/Throughput tuning is not perfect yet. There are may be 1 or

2 tools that look at the whole stack and correlate that information. There are no large number of tuning books about it, so people end up buying the wellknown BCH books. Eventually the whole thing will change. Even Oracle changed STATSPACK in 9.2 to include the CPU time in the TOP5

wait events (now why did they do that ;-))

I have to mow the lawn, the wife complains about the response time (it takes weeks to get a reply from me ;-). Will hear from you guys later, I

am sure ......

Anjo.

Cary Millsap wrote:

>Sure, I'd love to comment...
>
>1. If you can inexpensively cache your whole database working set in
>memory, there's nothing wrong with doing that *unless* you could have
>better spent the resources somewhere else to make a bigger positive
>impact to the business (business = net profit & return on investment &
>cash flow). Does it make a perceptible performance difference for you

to
>have your whole database in memory? I can't know without seeing a >profile of some of your key application sessions, but my experience over

>a few hundred trace files recently tells me, "probably not."
>
><sidebar>Because of the masses of real-life field data we've seen over
>the last two years of collecting people's 10046 trace files, I disagree
>vehemently with the prediction that, "With 64-bit Oracle and terabytes
>of cheap memory, tuning will be a thing of the past." Maybe tuning with
>the buffer cache hit ratio will be a thing of the past (imho, it should
>have become a thing of the past in 1992 when Oracle created 10046

data).
>But 99%+ of the application inefficiencies that I see today will be *no*

>faster--zero percent--when they're made memory-resident.</sidebar>
>
>2. Having your entire database is in memory is no guarantee that your
>users' performance will be adequate. We see lots of applications that

>*zero* PIOs, but that consume *hours* of 1GHz CPU time because they do
>so many LIOs. ...Cache hit ratios at 100.0%, full-table scans at zero,
>but performance at absolutely intolerable. The goal is not a bunch of
>ratios in their "green zones." The goal is a system that provides
>maximum business value.
>
>3. It is the performance analyst's job to *know*your*business* well
>enough to know where response time improvement will help the most. THE
>SYSTEM CANNOT TELL YOU THIS. What if nobody's complaining about lousy
>performance? Take a user to lunch. Buy someone a sandwich and ask the
>simple question, "If I could make one thing faster today, what would
>most improve your time on Earth with this application?" Every time you
>ask this, a user will point your nose at Response Time. When you go

back
>to work after lunch, you had better *keep* your nose pointed at Response

>Time. If you don't know how to measure or optimize Response Time, then
>take Anjo or me to lunch (:\). Pursuing the optimization of *anything*
>other than Response Time is reliable only in creating the illusion of
>progress, if that. If you're not communicating with users and
>specifically targeting their important Response Times, then you're not
>optimizing performance.
>
>4. Finally, there's no such thing as an app in which you have "no
>control over the SQL." Even if you're still on RBO, you have some
>control over the schema (ability create/drop/rebuild indexes). If

you're

>on CBO, you have absolute control over database statistics (I like
>Jonathan Lewis' proposal: consider telling the database its statistics
>[dbms_stats.set_%_stats] instead of asking it for them). With 8.1.6 and
>above, you have stored outlines, which give you enormous control over
>which plans the optimizer chooses (even with RBO, which we demonstrate
>in our class). And with meaningful statistics to prove the case, I've
>found vendors responsive to constructive suggestions that improve
>performance of their products noticeably for their entire revenue base.
>
>
>Cary Millsap
>Hotsos Enterprises, Ltd.
>http://www.hotsos.com
>
>Upcoming events:
>- Hotsos Clinic, Oct 1-3 San Francisco, Oct 15-17 Dallas, Dec 9-11
>Honolulu
>- 2003 Hotsos Symposium on OracleR System Performance, Feb 9-12 Dallas
>- Next event: NCOAUG Training Day, Aug 16 Chicago
>
>
>
>-----Original Message-----
>Rich
>Sent: Thursday, August 08, 2002 5:29 PM
>To: Multiple recipients of list ORACLE-L
>
>Hi Cary,
>
>This comment made me think.  I agree in most cases, but what about a
>very
>small DB situation where the buffer cache is larger than all the tables
>and
>indexes combined (~300MB)?  This is for a 3rd party tool of which we
>have no
>control over the SQL.  I sized the buffer cache as a guesstimate of

load

>on
>concurrent usage in the near future.  As it turns out, the amount of
>data in
>the DB seems to be relatively low, so theoretically, all accessed data
>and
>indexes could be buffered.
>
>My kneejerk is that seems somehow wrong, but I can't think of a

downside

>offhand.  Care to comment?
>
>Always willing to learn,
>Rich Jesse                           System/Database Administrator
>Rich.Jesse_at_qtiworld.com              Quad/Tech International, Sussex,

>USA
>
>-----Original Message-----
>Sent: Thursday, August 08, 2002 5:05 PM
>To: Multiple recipients of list ORACLE-L
>
>* If you have a really high database buffer cache hit ratio (>99%),

then

>you
>almost certainly have inefficient SQL in your application.
> 
>Cary Millsap
>Hotsos Enterprises, Ltd.
>http://www.hotsos.com
>

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Anjo Kolk
  INET: anjo_at_oraperf.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Fink, Dan
  INET: Dan.Fink_at_mdx.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Cary Millsap
  INET: cary.millsap_at_hotsos.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).



The contents of this e-mail are intended for the named addressee only. It
contains information that may be confidential. Unless you are the named
addressee or an authorized designee, you may not copy or use it, or disclose
it to anyone else. If you received it in error please notify us immediately
and then destroy it. 

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Karniotis, Stephen
  INET: Stephen_Karniotis_at_compuware.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Received on Fri Aug 09 2002 - 14:28:30 CDT