RE: Higher CPU Utilisation on failover node under same workload

From: Osborne, Chris <Chris.Osborne_at_bskyb.com>
Date: Thu, 6 Nov 2014 14:43:11 +0000
Message-ID: <608208D242735D458CDE934D20B2FDAA011DFF_at_WPMBX040.bskyb.com>



Hi Iggy,

There's a definite pattern: we only see CPU time > number of cores when we are on the other server. I only included a single AWR report for brevity, but I've got loads of examples going back a while, and we only see the issue when we are on the failover node. We definitely do have large variations in load across the day, but it's predictable when we'll be busy and when we won't, as the application has been here for a while and the batch schedule (including daytime batch) is fairly well understood.

As I said, this is a bit of a head scratcher for me, as I can't see much going on differently other than some pieces of CPU-bound SQL taking longer to execute, and the SYS CPU time being higher.
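
To put the SYS CPU observation in numbers, here is a minimal Python sketch that uses only the %User Time and %System Time figures from the Host Configuration Comparison in the original post quoted further down. The helper names `pct_diff` and `system_share` are illustrative, not part of any Oracle tooling, and "busy CPU" is assumed here to mean user plus system time.

```python
# Figures copied from the "Host Configuration Comparison" in the original post.
# Each tuple is (1st period, 2nd period); the 2nd period is the problem period.
host_stats = {
    "%User Time": (6.03, 4.95),
    "%System Time": (4.82, 15.19),
}


def pct_diff(first, second):
    """Relative change from the 1st to the 2nd period, in percent."""
    return (second - first) / first * 100.0


def system_share(user_pct, sys_pct):
    """Fraction of busy CPU (user + system, an assumption) spent in system mode."""
    return sys_pct / (user_pct + sys_pct)


usr_first, usr_second = host_stats["%User Time"]
sys_first, sys_second = host_stats["%System Time"]

sys_change = round(pct_diff(sys_first, sys_second), 1)
share_first = round(system_share(usr_first, sys_first) * 100, 1)
share_second = round(system_share(usr_second, sys_second) * 100, 1)

print(f"%System Time change: {sys_change}%")
print(f"System share of busy CPU: {share_first}% -> {share_second}%")
```

The first figure reproduces the %Diff column of the report for %System Time; the second shows that system mode goes from under half of the busy CPU to roughly three quarters of it.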

Regards,

Chris

Christopher Osborne
Lead Technical Specialist, Performance Engineering British Sky Broadcasting
Email:chris.osborne_at_bskyb.com
Desk: +44 1506 325069 | Mobile: +44 7720 308941 Please note new Mobile number.

From: Iggy Fernandez [mailto:iggy_fernandez_at_hotmail.com]
Sent: 06 November 2014 14:32
To: Osborne, Chris; oracle-l_at_freelists.org
Subject: RE: Higher CPU Utilisation on failover node under same workload

Hi, Chris,

My gut reaction is that you almost certainly have large variations over time on your production system too, so I am not surprised that there was a significant difference when you compared one sample from the primary with one sample from the standby (after switchover). You can write queries on the AWR tables to print the workload over an extended period of time. I would be extremely surprised if you did not see equal or greater variation on the primary over a period of time.
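
As a rough illustration of the comparison suggested above, the following Python sketch turns cumulative counter readings into per-interval deltas and summarises their spread. It assumes the readings have already been fetched, one per snapshot, from an AWR history view that stores cumulative values; the sample numbers are hypothetical and are not taken from the system discussed in this thread.

```python
from statistics import mean, pstdev


def interval_deltas(cumulative):
    """Turn cumulative counter readings into per-interval deltas.

    A reading lower than its predecessor is treated as a counter reset
    (for example after an instance restart), and that interval is skipped.
    """
    deltas = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        if curr >= prev:
            deltas.append(curr - prev)
    return deltas


def variation_summary(deltas):
    """Return min, max, mean and coefficient of variation of the deltas."""
    avg = mean(deltas)
    return {
        "min": min(deltas),
        "max": max(deltas),
        "mean": avg,
        "cv": pstdev(deltas) / avg,
    }


# Hypothetical cumulative CPU seconds at consecutive snapshots; not real data.
sample = [1000, 1600, 2500, 2900, 200, 1100]
deltas = interval_deltas(sample)
summary = variation_summary(deltas)
print(deltas)
print(summary)
```

Running the same summary separately over snapshots taken on each node would show whether the spread on the primary alone is already as large as the difference between the two nodes.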

Iggy

> From: Chris.Osborne_at_bskyb.com
> To: oracle-l_at_freelists.org
> Subject: Higher CPU Utilisation on failover node under same workload
> Date: Wed, 5 Nov 2014 13:20:10 +0000
>
> Hi all,
>
> This is my first post.
>
> I have an ongoing issue where I am seeing much increased CPU utilisation when a database is running on the failover node, compared to when it is running on the primary node.
> When we perform OS patching we fail over from the primary node to the DR site while the primary site is being patched.
> Both hosts are the same spec and config, and the database is configured identically on both hosts too.
>
> AWR Diff reports show that the workload is very similar.
>
> The 2nd period is where we see the problem
>
>
> 1st
> Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time
> ------------------------------ ------------- ------------ ------------ ------------- -----------
> db file sequential read        User I/O         4,178,196     23,392.4           5.6        61.7
> CPU time                                              N/A     10,138.9           N/A        26.8
> read by other session          User I/O           325,114      1,866.3           5.7         4.9
> db file parallel read          User I/O           177,766      1,419.6           8.0         3.7
> enq: TX - row lock contention  Application          1,220      1,321.2        1083.0         3.5
>
> 2nd
> Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time
> ------------------------------ ------------- ------------ ------------ ------------- -----------
> CPU time                                              N/A     38,985.7           N/A        59.8
> db file sequential read        User I/O         4,581,489     23,083.5           5.0        35.4
> db file parallel read          User I/O           219,007      1,670.0           7.6         2.6
> read by other session          User I/O           246,088      1,307.7           5.3         2.0
> enq: TX - row lock contention  Application            651        618.6         950.2         0.9
> ------------------------------ ------------- ------------ ------------ ------------- -----------
>
>
> Host Configuration Comparison
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>                                   1st        2nd       Diff    %Diff
> -------------------------- ---------- ---------- ---------- --------
> Number of CPUs:                   256        256          0      0.0
> Number of CPU Cores:               32         32          0      0.0
> Number of CPU Sockets:              4          4          0      0.0
> Physical Memory:              261632M    261632M         0M      0.0
> Load at Start Snapshot:         32.61      57.98      25.37     77.8
> Load at End Snapshot:           33.13      63.91      30.78     92.9
> %User Time:                      6.03       4.95      -1.07    -17.9
> %System Time:                    4.82      15.19      10.37    215.1
> %Idle Time:                     89.15      79.86      -9.29    -10.4
> %IO Wait Time:                      0          0          0      0.0
> Cache Sizes
>
> I know that we have a problem with the size of the connection pools on this database, and the fact that they are dynamic too concerns me. This issue is being worked on.
> My first thought is that the single block read time being about 10% faster could mean more of the sessions are runnable at any given point, slowing us down through context switching, but this may be a stretch...
> I am seeing the host reporting more CPU time spent on SYS rather than User time though.
>
> Any advice/pointers would be gratefully received.
>
> Cheers
>
> Chris
>
>
> Christopher Osborne
> Information in this email including any attachments may be privileged, confidential and is intended exclusively for the addressee. The views expressed may not be official policy, but the personal views of the originator. If you have received it in error, please notify the sender by return e-mail and delete it from your system. You should not reproduce, distribute, store, retransmit, use or disclose its contents to anyone. Please note we reserve the right to monitor all e-mail communication through our internal and external networks. SKY and the SKY marks are trademarks of British Sky Broadcasting Group plc and Sky International AG and are used under licence. British Sky Broadcasting Limited (Registration No. 2906991), Sky-In-Home Service Limited (Registration No. 2067075) and Sky Subscribers Services Limited (Registration No. 2340150) are direct or indirect subsidiaries of British Sky Broadcasting Group plc (Registration No. 2247735). All of the companies mentioned in this paragraph are incorporated in England and Wales and share the same registered office at Grant Way, Isleworth, Middlesex TW7 5QD.
> --
> http://www.freelists.org/webpage/oracle-l
>
>

Received on Thu Nov 06 2014 - 15:43:11 CET