Re: To Swap, or not to Swap

From: Jared Still <jkstill_at_gmail.com>
Date: Thu, 30 Mar 2023 11:46:41 -0700
Message-ID: <CAORjz=O-sjKprY14f8qoFAK-_1X-S9MwbvJCJrzTqThsJ-V09A_at_mail.gmail.com>



The test case is extreme: I used 'tail /dev/zero' to consume all memory, and it works pretty quickly.

Assuming the issue is a program or set of programs with memory leaks, this may be something you monitor.
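
As a rough illustration of that kind of monitoring, here is a minimal Python sketch that reads the SwapTotal, SwapFree and MemAvailable fields from /proc/meminfo and flags high swap use. The function names and the 25% threshold are hypothetical, chosen only for this example; a sensible threshold depends on the workload.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Most fields look like "SwapTotal:  16000000 kB"; a few have no unit.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_percent(fields):
    """Return swap in use as a percentage, or None when no swap is configured."""
    total = fields.get("SwapTotal", 0)
    if total == 0:
        return None
    return 100.0 * (total - fields.get("SwapFree", 0)) / total


def check_memory(text, swap_threshold_pct=25.0):
    """Summarize memory state; the 25% default is an arbitrary example value."""
    fields = parse_meminfo(text)
    pct = swap_used_percent(fields)
    return {
        "mem_available_kb": fields.get("MemAvailable"),
        "swap_used_pct": pct,
        "alert": pct is not None and pct >= swap_threshold_pct,
    }


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(check_memory(f.read()))
```

Run from cron or a monitoring agent, something like this only helps if the leak is slow enough for a human or a script to react before the server becomes unresponsive.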

I have seen scripts used to regularly kill and restart apps with leaks, because that was the only available recourse.

If something causes a server to use too much memory, what are you going to kill if it is an unknown and new condition?

And if the CPU is reduced to managing swap for active memory, you may not even be able to log on to the server.

Probably best to just let the OOM killer deal with it, then find out what happened.
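
One simple way to start finding out what happened is to look for the OOM killer's messages in the kernel log. The sketch below is a minimal Python example that pulls the PID and command name out of "Killed process" lines. The function name is hypothetical, and the message wording is an assumption: it varies between kernel versions, so the pattern may need adjusting.

```python
import re

# Matches kernel log text of the form "Killed process <pid> (<command>)".
# Assumption: the exact wording differs between kernel versions.
OOM_KILL_RE = re.compile(r"Killed process (\d+) \(([^)]*)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, command) tuples, one per OOM-kill line found."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

The input can be the saved output of dmesg or journalctl -k. This only identifies which process was killed, not why memory ran out.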

Tanel Poder's 0xtools can be used for forensic analysis of this; I tested it for this specific case.

On Thu, Mar 30, 2023 at 11:18 AM Chris Taylor <christopherdtaylor1994_at_gmail.com> wrote:

> Well, that's an interesting use case. Not sure what to think about that.
> It could be argued that if you're monitoring swap usage, you'd catch the
> problem before OOM got to it, right?
>
> Without swap you lose that opportunity, right?
>
> Chris
>
>
> On Thu, Mar 30, 2023 at 1:47 PM Jared Still <jkstill_at_gmail.com> wrote:
>
>> A colleague recently asked me this same question.
>>
>> He had been asked by a client with a fairly well-regarded sysadmin team.
>>
>> They wanted to eliminate swap: here's why.
>>
>> If a process is consuming memory at a prodigious rate, then the OOM (out
>> of memory) killer is going to catch up to it and kill it eventually.
>>
>> Their position was that with a swap partition, killing the process took
>> far too long.
>>
>> Without swap, the process gets killed relatively quickly.
>>
>> With swap, it can take many minutes. The CPU spends so much time managing
>> memory on swap (remember, we are at an OOM condition), which is slow, that
>> the time to kill the process is prolonged to many minutes.
>>
>> At first my position was "what, no swap! we can't do that!"
>>
>> But, I decided to test it a bit.
>>
>> A small physical server, 4 cores and 32G of RAM, is running Oracle 19.3.
>>
>> A swingbench test is running, 10 sessions per core.
>>
>> When I caused an OOM condition with the 16G swap partition enabled, it
>> took the system between 7.5 and 8 minutes to kill the process.
>>
>> (For the client, the amount of time was 20+ minutes.)
>>
>> And during that time, it was impossible to log on to the server. The CPU
>> was too busy thrashing around in the swap partition.
>>
>> The next step of course is to disable the swap.
>>
>> I caused the same OOM condition. Time to resolution was now 7 seconds.
>>
>> There is no swap to manage as if it were RAM.
>>
>> That is quite a big difference.
>>
>> Of course I wondered 'what about paging in memory for new processes?', as
>> that often uses a page in swap.
>>
>> Without swap, it just takes place in memory.
>>
>> Swap is also a landing place for some pages used to initialize processes,
>> as they can only be used once.
>>
>> This is a minimal amount, and can just be left in memory.
>>
>> If one really wants to conserve, there is a thing called ZRAM (compressed
>> memory) where those pages can be parked, instead of swap.
>>
>> So, does anyone see any other need for a swap partition?
>>
>> It seems to have outlived its usefulness.
>>
>> Jared Still
>> Certifiable Oracle DBA and Part Time Perl Evangelist
>> Principal Consultant at Pythian
>> Oracle ACE Alumni
>> Pythian Blog http://www.pythian.com/blog/author/still/
>> Github: https://github.com/jkstill
>> Personality: http://www.personalitypage.com/INTJ.html
>>
>>
>>
>>
>> On Thu, Mar 30, 2023 at 9:24 AM Jared Still <jkstill_at_gmail.com> wrote:
>>
>>> That is the question.
>>>
>>> I am curious about current thoughts on having or not having a swap
>>> partition on Linux based Oracle servers.
>>>
>>> Let's assume typical production standard servers with a reasonable
>>> amount of RAM, say 256G or more.
>>>
>>> I have some thoughts on this myself, but would like to see others'
>>> thoughts on this.
>>>
>>>
>>> Jared Still
>>> Certifiable Oracle DBA and Part Time Perl Evangelist
>>> Principal Consultant at Pythian
>>> Oracle ACE Alumni
>>> Pythian Blog http://www.pythian.com/blog/author/still/
>>> Github: https://github.com/jkstill
>>> Personality: http://www.personalitypage.com/INTJ.html
>>>
>>>
>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Mar 30 2023 - 20:46:41 CEST
