Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: [Long...] How a DBA should handle a load-average-spike?

Re: [Long...] How a DBA should handle a load-average-spike?

From: Mark D Powell <Mark.Powell_at_eds.com>
Date: 16 Dec 2006 08:01:24 -0800
Message-ID: <1166284884.507562.20970@73g2000cwn.googlegroups.com>

On Dec 15, 6:05 pm, "Peter Smith [gjfc]" <goodjobfast..._at_gmail.com> wrote:
> People,
>
> I'm trying to write a simple document on how a DBA should handle
> a host which is experiencing a load-average-spike (and
> negatively impacting end-user response time).
>
> Do any of you have some common sense you would like to share?
>
> Here are some ideas which come to my mind:
>
> -Try to characterize the current load using:
> -the 'ps' command.
> -the Targets -- Hosts -- Performance Page in EM
>
> Load-type-1: Is the load-average-spike caused by a single process or a
> small
> number of processes which are 'hogging' the CPU?
>
> Load-type-2: Is the load-average-spike caused by Oracle background
> processes which are 'hogging' the CPU?
>
> Load-type-3: Is the load-average-spike caused by Oracle foreground
> processes which are 'hogging' the CPU?
>
> Load-type-4: Is the load-average-spike caused by an abnormally large
> number of
> processes which have recently been added to the process list?
>
> If we are dealing with Load-type-1, we need to trace the processes
> to a user or business-process which spawned the process(es).
> Then, we implement procedures to rein-in or disconnect this user.
> This is a difficult situation because risks of doing the "wrong thing"
> are high. You don't want to disconnect a user who is conducting a
> legitimate business-critical task.
>
> If we are dealing with Load-type-2, we need to drill down into the
> Oracle
> background processes to find out what is keeping it overly busy.
> EM can help you with this:
> Home -- Targets -- Databases -- TheBusyDatabase -- Active Sessions with
> High CPU.
>
> In this screen you can see some clues about what is driving each of the
> busy sessions.
> When you drill down into Oracle background processes, one of the first
> layers you
> encounter are sessions.
>
> Your goal is to gather enough information to pinpoint the
> application(s) which are driving these sessions. Then, you interact
> with an administrative interface on the application to search for
> end-users who might be responsible. Or you might then find
> information about the application's configuration or state which is
> abnormal and needs to be fixed. For example, sometimes public facing
> websites can be pushed hard by mal-ware Perl scripts operated by
> delinquents to generate high volumes of traffic (to crash the site).
> A poorly constructed application might filter (or magnify) some of
> this traffic to a host for processing.
>
> One level below each session are individual SQL statements. EM offers
> functionality to look at each SQL statement in a session. If you are
> confident that the load on the DB is legitimate, an obvious next place
> to look for clues are the SQL statements. Tuning SQL statements is a
> vast subject but EM will quickly help you find SQL statements which
> consume relatively large amounts of CPU. Also EM contains a 'SQL
> Tuning Advisor'. It is sophisticated enough to help with a wide
> variety of poorly constructed SQL statements or schemas. For example,
> it could tell you if a new index would speed up a query and if so
> how much that index would slow down a corresponding insert.
>
> If we are dealing with Load-type-3, we need to...[ complete later ].
>
> If we are dealing with Load-type-4, we need to search for
> characteristics
> of the process set which yield clues about the deluge.
>
> Some simple questions might be a good start:
>
> -Are most of the processes owned by one userid?
> -Are most of the processes linked to the same executable?
> -Do we have source code for the underlying executable?
> -When were most of the processes created?
> -Are they all consuming CPU cycles?
> -Are they connected to anything:
> -reading/writing files?
> -connected to the DB?
> -network connected to anything?
>
> The goal here is to find the application or person who started all
> these processes.
> Another goal is to find out if they are obstructed by anything and thus
> prevented
> from dying since they cannot finish their jobs.
>
> So, that is my initial take on the high-load-average-spike scenario.
> Do any of you have any comments which would be useful for the
> DBA who needs a general set of guidelines for dealing with this
> situation?
>
> ...Peter
> --
> Peter Smith
> GoodJobFast...@gmail.comhttp://GoodJobFastCar.com

What version of Oracle and what platform?

Why not just quickly grap the sar statistics for cpu, memory, and IO and invoke "top" or equilivent to try to find the heavy hitters.

If the processes are related to Oracle connect to Oracle and start session traces for the heavy hitters

Also grab a quick, short statspack

If the problem is not Oracle related then make use of more OS utilities like vmstat, iostat, ipcs, etc... to grab more system related information.

Also check the system error logs for any problems detected by the OS

I do not see how you can assign the problem to a category until after you gather enough basic information to at least have an idea of what your are dealing with.

HTH -- Mark D Powell -- Received on Sat Dec 16 2006 - 10:01:24 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US