Re: HOw to introduce a proactive approach on DB use?

From: Mark D Powell <Mark.Powell_at_eds.com>
Date: Sat, 11 Oct 2008 08:31:21 -0700 (PDT)
Message-ID: <93523f8a-1f73-49b7-b660-5728c93a8ced@m3g2000hsc.googlegroups.com>


On Oct 11, 3:24 am, Tim X <t..._at_nospam.dev.null> wrote:
> Helma <helma.vi..._at_hotmail.com> writes:
> > hello everyone,
>
> > I just arrived at a shop where there are monthly performance problems.
> > At every end of the month, endusers are running reports (business
> > objects) and Peoplesoft batches.
> > Problem is, that there is no overview with the DBA-team of the
> > database use, e.g. :
>
> > - a nightbatch was accidentally started during daytime, the DBA team
> > was unable to figure this out. (enduser found out about this one)
>
> That seems a bit surprising. We have had similar problems at times and
> our DBAs have always been able to determine what was consuming all the
> resources.
>
>
>
>
>
> > - an other batch exeeded it's running time by more than 200% - DBA
> > team found out when looking for something else.
>
> > - the same report is ran by different users
>
> > - there are no ad hoc queries , but non of the batches and reports are
> > examined for performance.
>
> > etc.
>
> > I wonder how to get control over this situation and introduce a
> > proactive approach.
>
> > E.g. , I need an overview of the database use, and as a second target,
> > be able to quickly identify reports/ batches that are running amok.
>
> > If anyone has pointers on how they've dealt with such a situation, i
> > would be glad to hear.
>
> > Solaris, Oracle 9.2.0.5, app server
>
> > Thank you for your time,
>
> 1. If you have access to the code and if its using PL/SQL or Java, consider
> adding dbms_application_info statements. this can help the DBAs identify
> what is going on.
>
> 2. See if you can schedule the regular reports rather than allowing end
> users to execute them whenever they want. for example, if you know that
> some end users need a particular report on the last Thursday of each
> month and if it doesn't need to be run immediately, schedule the report
> to run during a low demand period, such as 3am (watch out for backups
> etc).
>
> What we ended up doing was implementing another layer between the app
> server and the database. this layer established a hierarchical form of
> access control. End users were still able to request reports, but unless
> they had the necessary access rights, their report would be scheduled to
> run during a low demand period. some senior staff had the authority to
> execute a report immediately. In most cases, staff had no problems with
> having to wait for a report (it did require some change management and
> some users complained because they werre now required to 'think
> ahead'. However, the outcome did mean that they actually got their
> reports and usually got them at an expected time. Previously, there was
> a lot of variation - once they got use to having to 'order' reports
> earlier, they appreciated the fact that the reports turned up in a more
> predictable way).
>
> Don't just consider technical solutions to the problem. Often, such
> problems are based around poor business practices. For example, we had a
> problem where the payroll report was taking far too long to run and it
> was getting worse every month. We did everything we could to tune things
> and get it running fast enough, but we soon reached a point of
> diminishing returns.
>
> Further investigation found that the report was doing payment analysis
> for over 40,000 employees. We only have around 5,000. Some more analysis
> showed that the HR department had adopted procedures that didn't fit
> well with how the software worked. Rather than terminating a position
> when someone left, they updated it so that the employees records
> appeared to represent an active employee who was not getting paid at the
> moment. they had adopted this procedure because we have quite a high
> turn-over of casual staff and they wanted to make the re-hiring of
> casual staff easier. As a result, the payruns were taking longer and
> longer to execute. The situation became worse after an application
> upgrade. It turns out, the previous version had a bug that was fixed in
> the new version. The fixing of the bug resulted in correct processing of
> employee pay entitlements and as we had what appeared to be 40,000 staff
> that could be entitled to pay, the system now processed all those
> records.
>
> this was a difficult one to resolve because it coincided with an
> applicaiton upgrade and a move to a new version of Oracle running under
> RAC. As we had moved to a new version of oracle and now had more
> processing power, the problem was not picked up in testing - reports
> were initially running within acceptable limits. However, after a few
> months, the processing time had really blown out. There were also some
> failures on the part of the HR systems administrator who was neglecting
> to run monthly 'clean-up' jobs. Again, a process problem rather than a
> technical problem.
>
> --
> tcross (at) rapttech dot com dot au- Hide quoted text -
>
> - Show quoted text -

I do not find the fact that the DBA's have difficulty finding the problem surprising. Often by the time word of a performance problem reaches the DBA the problem can be gone.

There are several ways to attack an overall month-end performance problem. One method would be to schedule or manually execute a series of short statspack snapshots during a problem time period. Then the high logical IO SQL could be exaimined from the reports. Tying specific SQL statements to source programs can however be difficult. You do have the option of generating outlines to tune the identified statements.

Another approach would be to monitor the SGA during a problem period looking for high cost SQL and since the SQL is currently running tie it to the source program. Now you know the source so you can apply coding changes directly, if desired and supported.

Identifying key jobs based on the customer input and also on schedule dependencies and setting SQL trace on for these jobs is probably a more logical approach. The trace files can give you very specific information about the critical tasks and the bottlenecks in the tasks. Getting customer input can sometimes help with customer relations. The customer can be more willing to live with a problem if the customer knows you are actively working on solving the problem even if it is going to take some time to resolve the issues.

HTH -- Mark D Powell -- Received on Sat Oct 11 2008 - 10:31:21 CDT

Original text of this message