Re: Interview type Question
Date: Fri, 26 Sep 2008 16:46:02 +1000
raja <dextersunil_at_gmail.com> writes:
> I am learning Oracle, my PM asked me the following question, Please
> help me on this :
> You are a new employee at the company and the only Oracle DBA.
> You were just informed by the management that one of the Oracle DB
> Server is performing very badly for about a week now and
> that immediate analysis should be conducted.
> You are not familiar with the applications or the databases yet, but
> they ask you to find the problem as soon as possible.
> Please detail your plan of action, tools that you will use along the
> process, what would you check and how would you continue if you did
> not find any issue at the level or resource you are checking.
> Please explain things in a high level along with assumptions and
> Please give me ur step-by-step level to proceed for the above
There is no step-by-step guide. Every problem is different and there is no standard response that would always work.
Personally, I think you have failed already simply because you have asked us to do the work and put forward nothing. You would get a much better response if you outlined how you would respond. We could then add, modify or correct your approach as necessary. As it is, you appear to either be lacking in the confidence to put yourself 'out there' (not what you would be looking for in a new employee) or you're just lazy and want others to do your work (also not what you want in an employee). I'll give you the benefit of assuming a lack of confidence rather than laziness (though that is also not a great trait in a new employee!).
If I was given such a problem, the first thing I would establish is exactly what the facts relating to that problem are. Exactly what is meant by one of the db servers performing badly? Has it been established it really is the server and not, for example, network problems or client problems? Is it affecting everyone or just some users? Did it really only start a week ago or have things slowly been getting worse over time and it was only a week ago that users got really fed up? What are users' expectations regarding the system performance and how far do those expectations differ from reality? Is performance bad all the time for all users or does it vary according to time of day? Have there been any other configuration changes on the network, with other systems, client updates, etc.?
Don't rely on reports from managers - talk to the users and get the information first hand if you can. Identify specific examples and reproduce the problem for yourself rather than rely on the reports of others. Get a good understanding of the problem and, most importantly, get a good understanding of what the user expectations are. Only once you have clearly established the parameters of the problem and eliminated other factors do you start looking at ways the db can be tuned to provide improved performance. Have a clear goal to work towards rather than a vague 'make it better'. Work out some method for measuring performance and gather some statistics before you start. It's not good enough to 'feel' it has improved. You need to be able to measure the improvement.
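As a rough illustration of what "gather some statistics before you start" might look like, here is a minimal Python sketch. It is only one way to do it, and nothing in it is Oracle-specific: the names time_calls, summarise and improvement are made up for this example, and `operation` stands in for whatever user action you have managed to reproduce (a report, a query, a screen load). The idea is simply to record a baseline, change one thing, then record again and compare numbers instead of impressions.

```python
import math
import statistics
import time


def time_calls(operation, runs=20):
    """Run `operation` several times and return the elapsed seconds of each run.

    `operation` is a hypothetical zero-argument callable wrapping the action
    you reproduced with the users.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples


def summarise(samples):
    """Reduce raw timings to a few numbers that can be compared later."""
    ordered = sorted(samples)
    # Nearest-rank 95th percentile: crude, but adequate for a baseline.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "runs": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "worst": ordered[-1],
    }


def improvement(before, after, key="median"):
    """Fractional improvement for one statistic (0.25 means 25% faster)."""
    return (before[key] - after[key]) / before[key]
```

You would call summarise(time_calls(...)) once before touching anything, keep the result, and call it again after each change. Comparing the median and the 95th percentile separately is a deliberate choice here: users tend to complain about the slow cases, which an average can hide.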
The most important step is to sort out the facts from assumptions and misleading guesses. The larger the organisation, the more important this is. As an example, on Monday, a problem was brought to me that 3 staff had spent two days trying to solve. One of our systems had apparently failed to send out a critical e-mail to one of our clients and management now wanted a full report that outlined what the failure was and what was going to be done to fix it. The first question I asked was whether it had been verified that the e-mail had actually been sent and when this had occurred. Apparently nobody had done this. One phone call and two minutes later, I had established that in fact, the information is never sent by e-mail. This was a misleading assumption. In fact, the system generates a hard copy report which staff then put into an envelope and send to the client. E-mail is never involved. It turned out the problem was not a technical one - it was a process problem that failed when a staff member was off sick. The report was still sitting in their in-tray and waiting for them to mail it out.
A failure to verify such a basic fact resulted in a total of 6 days of staff hours being wasted tracking down a technical problem that didn't exist. Even worse, it was now the beginning of a belief amongst some of the managers that our technology was unreliable. I've now got to explain clearly that it wasn't a technical failure, but rather a human process failure, and I've got to do it in such a way that the manager from the area concerned doesn't feel he is at fault for not identifying the problem correctly (he isn't at fault in my mind. If anything, my section is still at fault for not identifying the full facts right at the start).
-- tcross (at) rapttech dot com dot au
Received on Fri Sep 26 2008 - 01:46:02 CDT