Re: LIO/sec per CPU limit? Is it Hardware or Oracle code?

From: Henry Poras <henry.poras_at_gmail.com>
Date: Thu, 10 Aug 2017 12:46:59 -0400
Message-ID: <CAK5zhLJOidaKc8q8of8GpJezf6LMEzVaFVE70-4OZeRq22EEFg_at_mail.gmail.com>



Thanks for all of the suggestions. Here is where I am so far:

Kevin - SLOB has always been on my list of things to try, and for some reason I never got around to it (I don't mean for this problem, I mean going back a bunch of years). My question here relates to the fact that I can't take these machines off-line to run a test. Doesn't SLOB hammer the resources enough that I really need to run it on a test machine, not while our system is up and running (poorly)? I'm going over some of your docs to see.

Tony - I'll ask the sysadmins to check, but it's tough without knowing what to ask them to look for.

Karl - Sort of like what I looked at in /proc/cpuinfo, but much easier to read. After looking again, the two systems look identical at this level. Well, almost. cpu MHz is ~0.005% different (2299.908 vs. 2300.032). That doesn't seem like enough of a difference to explain my observations.
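
A quick check of how far apart those two cpu MHz readings are, as a minimal Python sketch. The function and variable names are only for illustration, and the comparison is taken relative to the mean of the two values, which is an arbitrary choice:

```python
def relative_diff_percent(a: float, b: float) -> float:
    """Return the absolute difference between a and b as a percentage of their mean."""
    mean = (a + b) / 2.0
    return abs(a - b) / mean * 100.0


# cpu MHz values read from /proc/cpuinfo on the two servers
server_a_mhz = 2299.908
server_b_mhz = 2300.032

diff = relative_diff_percent(server_a_mhz, server_b_mhz)
print(f"cpu MHz differs by {diff:.4f}%")
```

The gap works out to a few thousandths of a percent, nowhere near a 2-3x slowdown. As far as I know, cpu MHz in /proc/cpuinfo is a point-in-time reading, so it may be worth sampling it more than once while the test query runs.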

Mark - both have the same hugepage configuration: same HugePages_Free, HugePages_Rsvd, HugePages_Total, and page size.
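
A side-by-side check like this can be scripted so nothing gets missed by eye. The sketch below assumes the contents of /proc/meminfo from each server have been pasted in as text; the sample numbers are made up for illustration and are not from my systems, and the helper names are hypothetical:

```python
def parse_hugepages(meminfo_text: str) -> dict:
    """Extract HugePages_* and Hugepagesize entries from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and (key.startswith("HugePages_") or key == "Hugepagesize"):
            fields[key] = value.strip()
    return fields


def diff_settings(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Made-up sample text for illustration only; replace with real output per server.
SAMPLE = (
    "MemTotal:       65842512 kB\n"
    "HugePages_Total:    8192\n"
    "HugePages_Free:     {free}\n"
    "HugePages_Rsvd:      512\n"
    "Hugepagesize:       2048 kB\n"
)
server_a = parse_hugepages(SAMPLE.format(free=1024))
server_b = parse_hugepages(SAMPLE.format(free=2048))

differences = diff_settings(server_a, server_b)
print(differences if differences else "hugepage settings match")
```

An empty result means the two servers report the same hugepage values, which is what I saw.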

Bhavani - I can't run AWR, but I ran snapper on the same query on both servers in order to compare resource usage, latches, and statistics.

Hans - Looking for differences, but not sure where to look.

MWF - do you know if there is a way to do this without being root?

Stefan - Thanks for the links. Haven't read these in a while. I'll see what I can use.

I'll post more if/when I have it.

Henry

On Thu, Aug 10, 2017 at 11:13 AM, Reen, Elizabeth <elizabeth.reen_at_citi.com> wrote:

> Are the disks set up identically?
>
>
>
> Liz
>
>
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] *On Behalf Of* Henry Poras
> *Sent:* Wednesday, August 09, 2017 5:46 PM
> *To:* ORACLE-L
> *Subject:* LIO/sec per CPU limit? Is it Hardware or Oracle code?
>
>
>
> I have two identical servers (or so I am told), but application work is
> running 2-3 times slower on one than the other. Using Tanel's snapper, I
> see that all active sessions are on CPU. Viewing top shows me the same
> thing: each session pegs a CPU. We also found that it wasn't any particular SQL
> that slowed down across servers; it looked like everything was slow. A
> select count(*) from dba_objects showed this behavior as did Jonathan
> Lewis's kill_cpu script. This gave me something to test with. Running a
> 10046, I saw the same amount of resource utilization (parse count, fetch
> count, cr count, ...), no contention (wait events), but one server finished
> 2.5 times faster than the other. Looking at session stats through snapper,
> I see that the number of session logical reads per sec (~all of which are
> consistent reads) is ~ 2.5 times higher on one server than the other. That
> explains why it takes one longer to finish.
>
>
>
> So, now what? Why is one server giving me 350k consistent gets per second
> and the other ~800k? Is it hardware? /proc/cpuinfo shows the same CPU
> for each box. Is it hidden in the Oracle code path? I realize that not all
> LIOs are created equal, but how do I check this? I am running on SE12.1.0.1.
>
>
>
> Any and all thoughts welcome.
>
>
>
> Henry
>
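
As a rough worked check on the figures quoted above: if each session is pegging one CPU with no waits, the CPU time per logical read is about the inverse of the LIO rate. The sketch below uses the approximate 350k and 800k consistent gets per second from the original post; the function name is hypothetical, and treating all elapsed time as CPU time is an assumption:

```python
def cpu_microseconds_per_lio(lio_per_sec: float) -> float:
    """CPU time per logical read, assuming the session is on one CPU 100% of the time."""
    return 1_000_000 / lio_per_sec


# Approximate rates from the original post (consistent gets per second)
rates = {"slower server": 350_000, "faster server": 800_000}

costs = {name: cpu_microseconds_per_lio(rate) for name, rate in rates.items()}
for name, cost in costs.items():
    print(f"{name}: {cost:.2f} us per consistent get")

ratio = rates["faster server"] / rates["slower server"]
print(f"throughput ratio: {ratio:.2f}x")
```

On these rounded numbers, that is a bit under 3 microseconds per get on the slower box versus a bit over 1 on the faster one, so the question becomes where the extra time per buffer visit is going.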

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Aug 10 2017 - 18:46:59 CEST
