Re: Performance problems after moving to new hardware

From: Sandra Becker <sbecker6925_at_gmail.com>
Date: Tue, 17 Mar 2015 10:34:08 -0600
Message-ID: <CAJzM94Bpf3x85BApFdxC=jChzsQarYXToXP4tU88m=XgSaHVfg_at_mail.gmail.com>



After asking for the same reports three times, Oracle support came back and asked whether we were sure the SQL in the AWR and ADDM reports was from the application. It took a lot of doing, and a call to the escalation manager, but we were finally able to get a Solaris engineer involved to help us troubleshoot the hardware. I'm very annoyed that they agreed to have someone on a conference call yesterday, and then that person decided not to call in because he didn't think it was a hardware issue. No explanation, just that he didn't think it was hardware. We switched the primary back to the old hardware since the database was basically unusable for the application users.

Sandy

On Thu, Mar 5, 2015 at 3:40 PM, Ls Cheng <exriscer_at_gmail.com> wrote:

> You can generate the ADDM report with this script:
>
> ?/rdbms/admin/addmrpt.sql
>
> On Thu, Mar 5, 2015 at 9:19 PM, Sandra Becker <sbecker6925_at_gmail.com>
> wrote:
>
>> Oracle support had me run the IO calibration then gather system stats.
>> They said the new stats look good. They now want us to run our queries
>> then provide them with AWR and ADDM reports. Not sure how to get the ADDM
>> report yet since we don't have OEM set up yet, but I'll figure it out. I
>> will definitely look at the doc you recommended.
>>
>> Sandy
>>
>> On Thu, Mar 5, 2015 at 11:02 AM, Wayne Smith <wts_at_maine.edu> wrote:
>>
>>> Sorry if I missed another suggestion of this in this thread, but ...
>>>
>>> 11.2.0.2 is notorious for a system stats bug where SREADTIM and MREADTIM
>>> are 10,000 times larger than they should be. Once broken, you can
>>> manually (using DBMS_STATS.SET_SYSTEM_STATS) set these values to saner
>>> values (10,000 times smaller) to avoid the terrible performance that can
>>> occur. See doc 9842771.8.
>>>
>>> Cheers, Wayne
>>>
>>> On Thu, Mar 5, 2015 at 10:11 AM, Sandra Becker <sbecker6925_at_gmail.com>
>>> wrote:
>>>
>>>> Correct, the new server waits significantly longer than the old
>>>> server. I have a ticket open with Oracle support. At this point, we're
>>>> leaning towards the server configuration rather than the storage. We
>>>> migrated our lower environment databases to the same type of server and
>>>> simply detached the storage from the old server and attached it to the new
>>>> server. They are seeing the same problem in the lower environments.
>>>>
>>>> Since we have moved several production databases to the new hardware, I
>>>> want to run a few AWR reports on the other databases migrated to this
>>>> server to see what the waits are, regardless of the fact no one is
>>>> reporting issues with any other database.
>>>>
>>>> Sandy
>>>>
>>>> On Thu, Mar 5, 2015 at 1:30 AM, Iliya Peregoudov <iperegudov_at_cboss.ru>
>>>> wrote:
>>>>
>>>>> I think I have decoded the AWR stats correctly.
>>>>>
>>>>> AWR from old server
>>>>>
>>>>> Host CPU (CPUs: 32 Cores: 16 Sockets: 4)
>>>>>
>>>>> Event                      Waits  Time(s)  Avg wait (ms)  % DB time  Wait Class
>>>>> -----------------------  -------  -------  -------------  ---------  ----------
>>>>> db file parallel read     72,570    4,355             60      50.98  User I/O
>>>>> DB CPU                              2,092                     24.49
>>>>> db file sequential read  387,105    1,308              3      15.31  User I/O
>>>>> direct path write temp     3,227      509            158       5.96  User I/O
>>>>> db file scattered read   133,051      236              2       2.27  User I/O
>>>>>
>>>>> Snap Time Load %busy %user %sys %idle %iowait
>>>>> --------------- ----- ----- ----- ----- ----- -------
>>>>> 24-Feb 10:00:42 1.06
>>>>> 24-Feb 11:00:59 2.02 4.40 1.74 2.66 95.60 0.00
>>>>>
>>>>> AWR from new server
>>>>>
>>>>> Host CPU (CPUs: 32 Cores: 4 Sockets: 1)
>>>>>
>>>>> Event                      Waits  Time(s)  Avg wait (ms)  % DB time  Wait Class
>>>>> -----------------------  -------  -------  -------------  ---------  ----------
>>>>> db file parallel read     46,337   18,808            406      43.47  User I/O
>>>>> db file sequential read  154,062    6,861             45      15.86  User I/O
>>>>> direct path write temp     8,394    3,203            382       7.40  User I/O
>>>>> log file sync              3,002    1,564            521       3.61  Commit
>>>>> DB CPU                              1,433                      3.31
>>>>>
>>>>> Snap Time Load %busy %user %sys %idle %iowait
>>>>> --------------- ----- ----- ----- ----- ----- -------
>>>>> 03-Mar 10:00:42 2.73
>>>>> 03-Mar 11:00:37 2.95 7.12 4.69 2.43 92.88 0.00
>>>>>
>>>>>
>>>>> The new server waits on I/O much more per hour (about 30k seconds vs 6k
>>>>> seconds). Average read waits are also roughly 7 to 15 times longer on the
>>>>> new server (406 ms vs 60 ms, 45 ms vs 3 ms). CPU on the new server is
>>>>> under-loaded, I think because of the waits. It seems the old server was
>>>>> better balanced in I/O vs CPU throughput.
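
For anyone who wants to recheck the arithmetic behind that comparison, here is a minimal Python sketch. The numbers are copied from the two AWR excerpts quoted above. The dictionary layout and the helper names (total_wait_seconds, avg_wait_ms) are made up for illustration and are not part of any Oracle tool. Only the User I/O events from each top-5 list are summed, so the new-server total comes out a little under the quoted 30k unless log file sync (1,564 s) is added in.

```python
# Figures copied from the AWR top-event excerpts in this thread.
# Each entry maps an event name to (number of waits, total wait time in seconds).
old_io = {
    "db file parallel read": (72_570, 4_355),
    "db file sequential read": (387_105, 1_308),
    "direct path write temp": (3_227, 509),
    "db file scattered read": (133_051, 236),
}
new_io = {
    "db file parallel read": (46_337, 18_808),
    "db file sequential read": (154_062, 6_861),
    "direct path write temp": (8_394, 3_203),
}


def total_wait_seconds(events):
    """Sum the total wait time (seconds) over all listed events."""
    return sum(seconds for _waits, seconds in events.values())


def avg_wait_ms(events, name):
    """Average wait per occurrence, in milliseconds, for one event."""
    waits, seconds = events[name]
    return 1000.0 * seconds / waits


old_total = total_wait_seconds(old_io)
new_total = total_wait_seconds(new_io)
print(f"User I/O wait per hour: old {old_total} s, new {new_total} s")

# Compare average latency for the events present in both reports.
for event in ("db file parallel read", "db file sequential read"):
    old_ms = avg_wait_ms(old_io, event)
    new_ms = avg_wait_ms(new_io, event)
    print(f"{event}: {old_ms:.1f} ms -> {new_ms:.1f} ms "
          f"({new_ms / old_ms:.1f}x slower)")
```

The ratios computed from the raw counts differ slightly from ones taken off the rounded "Avg wait (ms)" column, since AWR rounds that column to whole milliseconds.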
>>>>>
>>>>>
>>>>> On 04.03.2015 18:48, Ls Cheng wrote:
>>>>>
>>>>>> I can't read anything useful. Can't you format the output or paste a
>>>>>> screenshot? :-?
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> http://www.freelists.org/webpage/oracle-l
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Sandy
>>>> GHX
>>>>
>>>
>>>
>>
>>
>> --
>> Sandy
>> GHX
>>
>
>

-- 
Sandy
GHX

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Mar 17 2015 - 17:34:08 CET
