Re: Monitoring question

From: Anton Buijs <remove_aammbuijs_at_xs4all.nl>
Date: Thu, 10 Jul 2003 01:18:15 +0200
Message-ID: <3f0ca29c$0$49109$e4fe514c@news.xs4all.nl>

Jan Gelbrich <j_gelbrich_at_westfalen-blatt.de> schreef in berichtnieuws bee1ob$3v507$1_at_ID-152732.news.dfncis.de...
| Hi,
|
| I hope my question is not *too* silly,
| but I would like to know if there is any view inside the data dictionary
| to see in real time which user is producing temporary bottlenecks in the
| database -
|
| the point is:
| I want to know what is happening *at that very time*
| when I see jams and hear users start complaining,
| but most DD Views like v$sysstat show _accumulated data_ ...
| I went through many v$* views in the manuals,
| but I was not successful to find something leading to this direction.
| and v$sqlarea show the SQL, but not the loads produced by it;
| I see just placeholders for bind variables.
|
| A fellow in my team just wondered
| why is there no similar functionality
| compared to the System Monitor on Windows NT or such,
| or top on linux, to see what is happening in time.
| You cannot even see any progress information in SQL+ ...
| ... for the moment I answered him well Oracle is just not M$
| ... but in a way he is right and he made me curious. It should be
| very important to any DBA to see session loads at one hit, isn´t it ?
| And I began to wonder how DBAs with much bigger dbs than mine,
| and thousands of user sessions are doing this ...
|
| Sometimes, mostly on Monday and Friday mornings,
| I have a jam situation for about 1 hour
| where everybody is waiting for app responding,
| but I have no blockings anymore. It is just 30+ users active at the same
| time
| using a couple of inhouse apps (Forms).
| So I cannot see who is to "blame" *in that very moment*
| (of course not the user, but the app (s)he is using ...).
|
| I know there are many possible reasons:
| data model (20%), app (50%), network (10%), server parameters (10%), old
| hardware etc.
| and I want to isolate the problem,
| but yet I am still lacking some techniques to do so.
| I want to overcome guessing it might be this or that app ...
| so v$session and v$sqlarea etc, is not enough for analysis.
|
| I admit that I am still not very experienced in monitoring,
| so I appreciate any comments.
|
| Is there a startegy or some points to start from in complex situations
like
| this ?
| Thank You in advance.
|
| Jan
|
| My system:
| Oracle 8.1.7 EE on a central AIX-Server 4.3.3, dedicated server, C/S
network
| to 20 local subsediaries.
| Oracle Forms & Reports inhouse app. Bind variables are used all over the
| place,
| and a fairly small DB of 20 GB, 200 users, jam starting when 30+ active.
|

Think you just hit the common performance tuning problem. Don't jump into the details at the beginning, but find the cause first: where are the sessions waiting for? Run the famous bstat/estat report in a 10 minutes interval one or more times during the peak period, maybe also when the db performs well but has significant load for comparison. If you are not familiar with bstat/estat use the newer tool statspack (if you need to learn one: learn statspack). Check the output for what is waited for most. Is it IO (waited most on for instance db_file_sequential_read, check the datafiles in the datafile io section), is it enqueus (blocking locks problem), is it for latches (check the latches session, library cache latch and shared pool latch high misses -> problems with the shared pool) etc. etc. Many more situations possible, just giving some examples from my experience. Problems interpreting the output? Visit www.oraperf.com and upload the output file (free registration required). You will get instant advice. Also use OS utilities like top and sar. If sar is not running ask your system manager to run it, always, with a 10 or 15 minutes interval. Sar shows cpu load (user, system, waitio and idle time), disk io load, runqueue etc.
It could be that there is a runqueue on the machine. Switching to MTS could be a solution then. In my experience don't start more shared servers as 1.5 * #cpu's. On a Sun Solaris machine I had the situation once that is was better sessions had to wait on an available shared server than have more server processes fighting to get a cpu time slice (so a runqueue again). How many cpu's are in this AIX machine?
Again: I mention some of my experiences, every situation is unique. First measure what's wrong, than take action.
Visit www.orapub.com (free registration required, again) and download paper #113 "Direct Contention Identification Using Oracle's Session Wait Event Views (April-2002) Craig Shallahamer"
(http://www.orapub.com/cgi/genesis.cgi?p1=sub&p2=abs113) and #119. The key is: where did the sessions waited for the most of the time, then investigate that area further. Skip the non-events, like "sqlnet message from client". Events are explained in the "Reference Guide", Appendix A. Quest has a tool Spotlight on Oracle (you have to pay for that), giving a very nice real-time presentation of what's going on in the database and you can zoom in to any area. Those screens look a lot like the Toad screens (most of them are included in free Toad). Only the overview screen is impressive and makes it a nice-to-have tool. Received on Wed Jul 09 2003 - 18:18:15 CDT