Re: huge CPU load after UEK upgrade

From: Tanel Poder <tanel_at_tanelpoder.com>
Date: Wed, 18 Aug 2021 03:47:55 -0400
Message-ID: <CAMHX9JKAMjOaW3LO-3eY2EQaG8kzj8aOGnZZwb5+k=fHycwZhg_at_mail.gmail.com>

Sorry, been swamped and haven't had time to lurk around oracle-l threads :-)

The kworker threads do asynch I/O stuff in Linux kernel (among other stuff). So, when using libaio, you'd see kworkers involved in I/Os and apparently asmlib also ends up using kworkers under the hood (I never looked into asmlib in detail, but the psn/kstack output indicates kworker use too).

I'd say it's some asmlib/kernel bug, especially if you didn't do any hardware changes, like this one:

*Oracle Linux : OL6 Running On Exadata Observes A Particular CPU Comsumes High CPU Under Low System Load by kworker and ksoftirqd (Doc ID 2645898.1) - "Known issue with kernel version 4.1.12-124.24.3.el6uek.x86_64 and lower." *

Sometimes various interrupt storms & issues can also cause this weird behavior - *mpstat* would show how much (CPU) time you spend on hardware/software interrupts.

If you click on some of the stack sections on the very busy looking left-side of the *FG_UEK4_all_30seconds.jpg* screenshot, what do you see?

Note that the intel_idle and related stack frames always show up anyway when you use perf -a, as that option reports all *CPU* time of all CPUs (including CPU idle time, which is not the same as a process/thread idle time, essentially it's CPU's time spent in no-work/low-power-suspend/poweroff states). This is different from perf record -p behavior as -p would only show on-CPU samples of a process/thread and not show off-CPU samples *of a process/thread*.

As you see lots of *_raw_spin_lock* contention (CPU time spent to acquire a spinlock in that function), look up (with the FlameGraph search box, like I did in my blog entry), who calls that function? That would tell you which type of spinlock has that contention.

Also the fact that you see lots of *acpi_os_write_port* calls is weird. As so frequently interacting with hardware via I/O ports (instead of DMA) so that this *acpi_os_write_port* function shows up in the top CPU users, also indicates some sort of a kernel<->hardware communication issue... see what functions are its parents, grandparents, it may provide extra insight.

--
Tanel Poder
https://tanelpoder.com/events


On Fri, Aug 13, 2021 at 9:54 AM Laurentiu Oprea <laurentiu.oprea06_at_gmail.com>
wrote:



> quite suspicious is that running :

> perf record -F 99 -g a sleep 30

> and looking in FG looks like the word "idle" is a big chunk of it.

>

> În vin., 13 aug. 2021 la 11:25, Laurentiu Oprea <

> laurentiu.oprea06_at_gmail.com> a scris:

>

>> Thanks a lot for all the answers. Taking in order of what I tried:

>>

>> -> booting with acpi=off will apparently make the system see just 1 cpu

>> core / booting with acpi=ht I`ll not have the single cpu issue but my

>> load/slowness issue remains there -> not sure what I can explore more on

>> this subject

>>

>> -> On this particular rac cluster I have around 32 databases, I observed

>> that starting just 8 instances on the upgrade node will not generate a

>> significant CPU load, starting 12 or more will make the situation far worse

>> (the other 2 nodes with UEK3 are running perfectly well with all instances

>> up) . So I started just 8 instances and I wrote a test query (simple select

>> with  group by, all significant ash time being spent on CPU) that executed

>> on the upgraded node will run in 68 seconds while on non-upgraded node will

>> run in 38. I sampled with psn the spid of the session while running the

>> statement and also I sampled with perf and generated a flamegraph.

>>

>> From FG I might conclude that all the calls are identical from a

>> distribution perspective is just that each call  takes longer time .

>> From PSN Looks like the majority of samples are on CPU with no kstack (-)

>> or retin_carefull(), is just that on nodes running UEK3 is running faster

>> so less samples are collected.

>>

>> I`m starting to think that there is actually a general slowness running

>> UEK4 compared with UEK3 (and maybe seeing those kworker process is just a

>> side effect) While the system is slow the kstack for  kworkers is just:

>> ret_from_fork() ->kthread -> worker_thread() )

>>

>> Basically I`m out of ideas, so any suggestions are very welcomed :)

>>

>> Thank you.

>>

>>

>>

>>

>> În joi, 12 aug. 2021 la 20:29, Tanel Poder <tanel_at_tanelpoder.com> a

>> scris:


>>

>>> Oh, UEK 4 is Linux kernel 4.1.x, I don't recall if that bug/problem

>>> exists there, but you can still use the techniques from my blog entry to

>>> drill down deeper. One of the main questions is - is your Linux system load

>>> high due to on-CPU threads or threads in D state. perf record -g would help

>>> you drill down more into the CPU usage and from the full stack traces you'd

>>> see which kernel functions are trying to get that spinlock you already saw

>>> contention on, but if it's mostly D state sleeping threads, then

>>> /proc/PID/stack or psnapper would tell you more.

>>>

>>> On Thu, Aug 12, 2021 at 1:26 PM Tanel Poder <tanel_at_tanelpoder.com>

>>> wrote:

>>>

>>>> Hi,

>>>>

>>>> Run psnapper or just sudo cat /proc/PID/stack for some of the Oracle &

>>>> kworker processes (that are in D state) and check if you have

>>>> *wbt_wait* in the stack. UEK 4 (kernel 4.14) has a Linux kernel write

>>>> throttling bug/problem and lots of kworkers showing up is one of the

>>>> symptoms. I've written about drilling down into high Linux system load in

>>>> general here:

>>>>

>>>>

>>>> https://tanelpoder.com/posts/high-system-load-low-cpu-utilization-on-linux/

>>>>

>>>> Incidentally, this is one of the problem scenarios I cover at the

>>>> upcoming "troubleshooting very complex Oracle performance problems" virtual

>>>> "conference" (where all speakers are me - https://tanelpoder.com/events

>>>> ;-)

>>>>

>>>> --

>>>> Tanel Poder

>>>>

>>>>

>>>> On Thu, Aug 12, 2021 at 2:41 AM Laurentiu Oprea <

>>>> laurentiu.oprea06_at_gmail.com> wrote:

>>>>

>>>>> Hello everyone,

>>>>>

>>>>> I know this is more of a linux question but maybe someone had a

>>>>> similar issue.

>>>>>

>>>>> I started to upgrade the kernel version from UEK3 to UEK4 on servers

>>>>> (rac 2/3 nodes) running OL6. I had no issue upgrading the kernel on virtual

>>>>> machines.

>>>>>

>>>>> When I upgraded the first rac node located on physical hardware, after

>>>>> starting the oracle instances the cpu load went crazy, the whole system was

>>>>> far slower. For example the cpu load on the other rac nodes with UEK3 was

>>>>> 4/3/3/ while the upgraded node load was 107/104/77 with only half the

>>>>> instances started, this workload being created even if the instances had no

>>>>> significant workload (they were just started)

>>>>>

>>>>> I observed that with the new kernel version when oracle

>>>>> instances start there is quite a big number of kworker processes. Perf top

>>>>> will show me as overhead

>>>>> 17.24% [kernel]  _raw_spin_lock

>>>>> 4.50% [kernel] acpi_os_write_port

>>>>>

>>>>> Does anyone have any idea on what could be the issue?

>>>>>

>>>>> THanks a lot.

>>>>>

>>>>


--
http://www.freelists.org/webpage/oracle-l

Received on Wed Aug 18 2021 - 09:47:55 CEST

This message: [ Message body ]
Next message: Norman Dunbar: "Re: Turn off Partitioning"
Previous message: Lothar Flatz: "Re: Access and Filter Predicate on same execution plan line"
Maybe in reply to: Laurentiu Oprea: "huge CPU load after UEK upgrade"
In reply to Tanel Poder: "Re: huge CPU load after UEK upgrade"
Next in thread: Laurentiu Oprea: "Re: huge CPU load after UEK upgrade"
Reply: Laurentiu Oprea: "Re: huge CPU load after UEK upgrade"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message