Re: DBA Job Functions

From: Ram Raman <veeeraman_at_gmail.com>
Date: Thu, 1 Mar 2018 16:47:37 -0600
Message-ID: <CAHSa0M03db+EjvVOUPUSamGoKU3WugD62wihOFLfvFnsgHb9hQ_at_mail.gmail.com>



Since we are talking about people pointing to DB:

Few months ago application team complained that their new-user addition process started queueing up over several days and started lagging behind in our user provisioning application that runs on oracle. Sysadmins and oracle DBAs found out that the disks behind the datafiles were experiencing >90% utilization and bad response times. Disk subsytem was the initial suspect. Disk team argued that there was no change anywhere. Application team mentioned that there was no change anywhere. Even though I had just inherited the DB, I know nothing was changed on the DB side.

Unfortunately we could not go back more than a week to look at historical performance due to the default retention setting. After lots of noise and conference calls, meetings, work with support, we ended up doubling the disks. The problem recurred after a few weeks; same story repeated. I was not sure if it was the 2nd or 3rd time, but we found out there were unusable indexes; rebuilding those indexes seemed to have fixed the problem. Application team maintained innocence, I put in a script to check for invalid indexes. The heat we faced from the upper management was real.

Several weeks later, once again on a fine Monday am, same thing again - slow processes, conference calls, etc. I checked the dba_objects, it showed all the indexes were valid. But the script had sent out alerts about unusable indexes checking the dba_indexes (aargh). We rebuilt the indexes and once again things seemed to be ok.

The consultants working on the application suddenly asked for the help of DBAs to test a process of loading a small group of users in the test environment. They mentioned that they were testing bulk loading a small set of users in the prod, but they insisted it was for a small population only and they did not even touch other users and others should not have been affected; the load process was invalidating the indexes. That is what they had been doing all along. phewww!

What tool could have found out the cause sooner? We looked at the symptoms and could not arrive at the right conclusion earlier. Someone always does something and they maintain that nothing has changed.

Ram.

On Thu, Mar 1, 2018 at 9:27 AM, Cary Millsap <cary.millsap_at_method-r.com> wrote:

> In a way we’re lucky that everyone thinks that every performance problem
> is the database’s fault, because the Oracle Database software is the
> best-instrumented tier in the stack. So nobody looks at us funny when we go
> there first for measurements.
>
>
> Using “sql_trace wait=true” tracing seems like a measurement of only the
> database, but in actual fact, it’s a measurement of the database and
> everything it touches: including the application, the network, and the
> operating system. It’s the way you can *prove* how your system is spending
> the time consumed by any user’s specific experience with that system.
>
>
> The data we get from it generally makes it really easy to create if..then
> statements like, “If you will spend 4 hours changing the application in the
> following very specific way, then you will save at least 42 seconds every
> time a user runs this particular program, which they’re executing 369 times
> a day. It’ll save the company 4.3 hours per day of people waiting for this
> thing to run. And since 87% of that 42 seconds saved is CPU time, you’ll
> save 3.75 hours of CPU capacity per day, which you can apply to other
> applications.”
>
>
> It is deeply fulfilling to be able to do this.
>
>
> Cary Millsap
> Method R Corporation
> Author of *Optimizing Oracle Performance <http://amzn.to/OM0q75>* and *The
> Method R Guide to Mastering Oracle Trace Data, 2nd edition
> <http://amzn.to/1U7q8X1>*
>
>
> On Thu, Mar 1, 2018 at 9:22 AM, Mark W. Farnham <mwf_at_rsiz.com> wrote:
>
>> There is really quite a good laundry list on Oracle ScratchPad of things
>> to check when “nothing changed.” Sometimes a quick look at the list yields
>> a quick “Oh, that’s what happened” experience.
>>
>>
>>
>> And of course profiling some particular session that is slower than
>> expected is the effective way to be sure what is wrong. Whether that leads
>> to an obvious fix or not is another issue, but at least you won’t find
>> yourself working on things that are definitely NOT the problem.
>>
>>
>>
>> mwf
>>
>>
>>
>> *From:* oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freeli
>> sts.org] *On Behalf Of *Sheehan, Jeremy
>> *Sent:* Thursday, March 01, 2018 9:02 AM
>> *To:* dombrooks_at_hotmail.com; gogala.mladen_at_gmail.com
>> *Cc:* oracle-l_at_freelists.org
>> *Subject:* RE: DBA Job Functions
>>
>>
>>
>> I agree 100% with this. Many times it is related to a change in execution
>> plans. In those situations I will go ahead, troubleshoot, and fix. More
>> often than not (in my situations), it is lazy application support teams
>> that don’t want to begin troubleshooting on their end first. Most of the
>> time (with the applications that I support), when application teams come to
>> me the database is sitting mostly idle or with little to know more activity
>> than it would normally have.
>>
>>
>>
>>
>>
>>
>>
>> *From:* oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freeli
>> sts.org <oracle-l-bounce_at_freelists.org>] *On Behalf Of *Dominic Brooks
>> *Sent:* Thursday, March 1, 2018 2:40 AM
>> *To:* gogala.mladen_at_gmail.com
>> *Cc:* oracle-l_at_freelists.org
>> *Subject:* Re: DBA Job Functions
>>
>>
>>
>> Many sudden “database is slow” incidents, in my experience, are SQL plan
>> changes.
>>
>>
>>
>> I qualify these as database problems. Even though the data model and the
>> way the SQL is written etc may be contributory factors and then we get into
>> DBA semantics - “well it’s not an infrastructure DBA problem, it’s an
>> application DBA problem”.
>>
>>
>>
>> Experience also shows that it is rarely the network and rarely the
>> storage, etc. Sometimes it is.
>>
>> Sent from my iPhone
>>
>>
>> On 1 Mar 2018, at 03:52, Mladen Gogala <gogala.mladen_at_gmail.com> wrote:
>>
>> Jeremy, it never is a database problem. You can always blame network. Try
>> as they may, network engineers can never prove their innocence. If an
>> application is slow, it's always the network. Application service can't
>> reach the database and you can always show "waiting for more data from the
>> client" wait events to prove that your network is slow. The next in line
>> are system administrators. Is that app server swapping? How much CPU is it
>> using? The art of being a good DBA involves knowing how to find the
>> appropriate culprit.
>>
>> Joking aside, it really never is a database problem. What is slow is
>> always an application, not the database. Database is just storage, nothing
>> else. It's not the garage it's slow. When I was a DBA, I once got the
>> following complaint: "the database is slow in the northern half of the
>> sales room, but is fast in the southern half". This intrigued me so much
>> that I accepted the claim that "the database is slow". What ended up being
>> the problem was the router. The router for the northern part of the room
>> was plugged in the router port that was blinking red.
>>
>> You always start troubleshooting from the application. That is the lesson
>> from the Cary Millsap's book. It's simply incredible how long it takes for
>> that simple and obvious message to sink in.
>>
>>
>>
>> On 02/28/2018 02:58 PM, Sheehan, Jeremy wrote:
>>
>> HAHAHAHA! #truth
>>
>>
>>
>> Perhaps proving that it isn’t a database problem can be added to the
>> list?
>>
>>
>>
>> *From:* Jared Still [mailto:jkstill_at_gmail.com <jkstill_at_gmail.com>]
>> *Sent:* Wednesday, February 28, 2018 2:52 PM
>> *To:* Sheehan, Jeremy <JEREMY.SHEEHAN_at_fpl.com> <JEREMY.SHEEHAN_at_fpl.com>
>> *Cc:* oracle-l-freelist <oracle-l_at_freelists.org> <oracle-l_at_freelists.org>
>> *Subject:* Re: DBA Job Functions
>>
>>
>>
>> CAUTION - EXTERNAL EMAIL
>>
>>
>>
>> Also included: everything the DBA cannot get someone else to do.
>>
>>
>>
>>
>> Jared Still
>> Certifiable Oracle DBA and Part Time Perl Evangelist
>>
>> Principal Consultant at Pythian
>>
>> Pythian Blog http://www.pythian.com/blog/author/still/
>> <https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.pythian.com%2Fblog%2Fauthor%2Fstill%2F&data=02%7C01%7C%7Ca9becef89c7f4c17644d08d57f27d1ab%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636554731355511391&sdata=fPB616KiFHWP2Shmk4W4WWP%2Bu3554ukoEJxZCZmYK7A%3D&reserved=0>
>>
>> Github: https://github.com/jkstill
>> <https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjkstill&data=02%7C01%7C%7Ca9becef89c7f4c17644d08d57f27d1ab%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636554731355511391&sdata=jEQ5k82aTYc5qibdlgiFDsiJtlZ2%2BoGUx5MS6O1tA9g%3D&reserved=0>
>>
>>
>>
>> On Tue, Feb 20, 2018 at 11:18 AM, Sheehan, Jeremy <JEREMY.SHEEHAN_at_fpl.com>
>> wrote:
>>
>> Hello Guru’s,
>>
>>
>>
>> My boss is asking me to compile a list of typical job functions for a
>> DBA. I came up with a brief list, but would like to hear any other
>> recommendations that you might have. What he said was, “We don’t want to go
>> into great detail, but not be too vague either. Somewhere between the
>> 10,000ft and 1,000ft view.
>>
>>
>>
>> Agile Work
>>
>> Backup/Recovery
>>
>> Change Deployment
>>
>> Database Design
>>
>> Database Install
>>
>> Documentation
>>
>> DR Activities (testing/maintenance)
>>
>> Lifecycles
>>
>> Performance Tuning/Monitoring
>>
>> Scripting DB/Host
>>
>> Solution Design
>>
>> On-call/Operations
>>
>>
>>
>> Thanks in advance,
>>
>>
>>
>> Jeremy
>>
>>
>>
>>
>>
>> --
>>
>> Mladen Gogala
>>
>> Database Consultant
>>
>> Tel: (347) 321-1217
>>
>>
>

--

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Mar 01 2018 - 23:47:37 CET

Original text of this message