RE: ideal CPU/Memory relation

From: Joe H-Grosse <jhg_at_isnordic.dk>
Date: Fri, 19 Aug 2022 22:39:50 +0200
Message-ID: <015601d8b40b$d75966b0$860c3410$_at_isnordic.dk>


Hi,
If we are talking about NUMA architecture (as opposed to UMA SMP systems) and DIMM population on multi-CPU NUMA hardware, and the question is "how much RAM is the right amount for a given number of CPUs on a specific system (X)", then part of the answer might be as follows:

  1. Minimum amount: DIMM population rules (and thus, in effect, the minimum amounts) are very likely to be enforced by the hardware (in firmware on the motherboard). If so, they have to be followed, otherwise the machine won't boot. In other words, given you have 16 CPUs, the compute node's service manual will describe a prescribed DIMM minimum and population order that must be followed, or is at the very least recommended if not enforced. You should get a copy of the service manual and read the section on servicing DIMMs. The reason for the enforcement and/or recommendations is to secure optimal performance by distributing the DIMMs evenly across the active CPUs, thus enabling memory interleaving and so parallel memory access.
  2. Optimal amount: This is much harder to answer without a whole lot more information, and even then you need a real hardware engineer (which I am not) to get a sensible answer. I guess it depends upon the interplay of multiple factors, such as:
     - The CPU L1, L2 & L3 cache sizes
     - The size of the TLB in the MMU's CAM
     - The clock speed of the CPU
     - The number of DIMMs each CPU is directly connected to
     - The clock speed the DIMMs can operate at (if you mix DIMMs with different ratings they will all downgrade to the lowest/slowest, so this is inadvisable)
     - The O/S (does it allow you to alter any NUMA-aware properties?)
     - The memory slab sizes that the O/S works with
     - The block size(s) your database is built with
     - The type of database workload (e.g. OLAP or OLTP)

I think I recall that interleaving does not span CPUs. If so, then using lower-density DIMMs (but retaining the best clock speed capability) and adding more of them to each CPU will markedly improve performance relative to not doing so. In other words, if you only want 1TB of RAM in total, then two 32GB DIMMs attached to each CPU would activate interleaving, whereas one 64GB DIMM on each CPU won't. Obviously, populating all the DIMM slots for each CPU will maximize the interleaving potential and so achieve maximal performance.

...This is of course, all predicated on the assumption that the hardware is in fact designed to support memory interleaving...
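
The 1TB example above can be checked with a few lines of arithmetic. This is only a sketch: `dimm_layout` is a made-up helper, the 16-socket count comes from the server discussed in this thread, and the rule "interleaving needs at least two DIMMs on the same CPU" is the assumption stated above, not something taken from a service manual.

```python
def dimm_layout(sockets, dimms_per_socket, dimm_gb):
    """Summarise one DIMM population choice (hypothetical helper)."""
    total_gb = sockets * dimms_per_socket * dimm_gb
    return {
        "total_gb": total_gb,
        "dimms_per_socket": dimms_per_socket,
        # Assumption from the text: interleaving does not span CPUs,
        # so a CPU needs at least two DIMMs of its own to interleave.
        "interleaving_possible": dimms_per_socket >= 2,
    }

# Two ways to reach 1TB in total on a 16-socket machine.
one_big = dimm_layout(sockets=16, dimms_per_socket=1, dimm_gb=64)
two_small = dimm_layout(sockets=16, dimms_per_socket=2, dimm_gb=32)

print(one_big)
print(two_small)
```

Both layouts give 1024GB in total; only the second one leaves room for per-CPU interleaving under the stated assumption.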

Hope that helps.

Br,
Joe

-----Original Message-----
From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Clay Jackson ("Clay.Jackson")
Sent: Friday, 19 August 2022 16.38
To: frits.hoogland_at_gmail.com; Lothar Flatz <l.flatz_at_bluewin.ch>
Cc: Stefan Koehler <contact_at_soocs.de>; oracle-l_at_freelists.org
Subject: RE: ideal CPU/Memory relation

I love a good "religious" debate! Most traffic on one topic here in a while.

The "correct" answer is obviously 42!

Clay Jackson

BTW, Frits - your mention of NUMA actually gave me some clues on chasing a performance problem I'm seeing with graphics on a Raspberry Pi 4; so there's no such thing as a "fruitless" discussion.

-----Original Message-----
From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Frits Hoogland
Sent: Friday, August 19, 2022 5:55 AM
To: Lothar Flatz <l.flatz_at_bluewin.ch>
Cc: Stefan Koehler <contact_at_soocs.de>; oracle-l_at_freelists.org
Subject: Re: ideal CPU/Memory relation


What really matters here is the physical layout, because modern CPUs have memory attached to them more or less directly. This principle of CPUs with their memory attached is called NUMA.

By default, at least in the past, Oracle and Linux did not have NUMA-specific optimisations enabled. Without them, memory is divided evenly over "all memories", and thus your memory latency is the average of all memory latencies. With non-ideal NUMA (different and higher latencies between certain NUMA nodes), that latency might be higher than on a smaller box with fewer NUMA nodes, which has its memory closer by and thus a lower latency.
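
To make the "average of all memory latencies" point concrete, here is a minimal sketch. The two distance tables are invented relative costs (local access = 10), not measurements from any real machine, and `average_latency` is a hypothetical helper.

```python
def average_latency(distance, home_node):
    """Mean access cost seen from home_node when pages are spread
    evenly over all nodes (i.e. no NUMA-aware placement)."""
    row = distance[home_node]
    return sum(row) / len(row)

# Hypothetical relative access costs; row = CPU's node, column = memory's node.
small_box = [
    [10, 21],
    [21, 10],
]
big_box = [
    [10, 21, 31, 31],
    [21, 10, 31, 31],
    [31, 31, 10, 21],
    [31, 31, 21, 10],
]

print(average_latency(small_box, 0))  # 15.5
print(average_latency(big_box, 0))    # 23.25
```

With these made-up numbers the larger box has the worse average, even though its local cost (10) is the same as on the smaller box.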

If NUMA is enabled, it still needs care and attention. Memory might then be explicitly allocated locally, but if you make the scheduler schedule a process on a CPU that is not local to that NUMA node, your memory is all of a sudden remote and slow.
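
One way to keep a process on the CPUs of a single NUMA node on Linux is to set its CPU affinity. The sketch below assumes a Linux system that exposes `/sys/devices/system/node/nodeN/cpulist`; `parse_cpulist` and `pin_to_node` are illustrative names, not part of any Oracle or kernel tooling. Note that this only restricts where the process runs; memory placement policy is a separate matter, for which a tool such as `numactl` is the usual route.

```python
import os

def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-17,36-53' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def pin_to_node(node, pid=0):
    """Restrict a process (0 = the caller) to the CPUs of one NUMA node.
    Linux only; raises an error if the sysfs file is missing."""
    path = f"/sys/devices/system/node/node{node}/cpulist"
    with open(path) as f:
        cpus = parse_cpulist(f.read())
    os.sched_setaffinity(pid, cpus)
    return cpus

# The parser can be tried without touching the scheduler:
print(sorted(parse_cpulist("0-2,8")))  # [0, 1, 2, 8]
```

Calling `pin_to_node(0)` would then keep the current process on node 0's CPUs, so that locally allocated memory stays local from the scheduler's point of view.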

Frits

> On 19 Aug 2022, at 12:28, Lothar Flatz <l.flatz_at_bluewin.ch> wrote:
>
> Hi all,
>
> thanks for responding. We are actually talking about a big server. The provider offered 288 cores (16 CPUs with 18 cores each; these are 8354H, and one CPU can address 1.2 TB) and 6 TB RAM.
> We said we only need 144 cores. The reaction was: then we need more RAM due to the ideal CPU/RAM relation.
> As far as I can see, it ought to be the other way around, if at all.
>
> Regards
>
> Lothar
>
> On 19.08.2022 at 12:17, Stefan Koehler wrote:
>> Hello Lothar,
>> there are databases such as SAP HANA that use such an approach - quoting the docs: "Using core-to-memory ratios which can be derived from the certified HANA listings. The memory requirement drives the number of cores required."
>>
>> ... but I guess you are using Oracle and have never seen such
>> guidelines in the field so far :-)
>>
>> Best Regards
>> Stefan Koehler
>>
>> Independent Oracle performance consultant and researcher
>> Website: http://www.soocs.de/
>> Twitter: @OracleSK
>>
>>> Lothar Flatz <l.flatz_at_bluewin.ch> wrote on 19.08.2022 09:02 CEST:
>>>
>>>
>>> Hi,
>>>
>>> has anybody ever heard of an ideal CPU/Memory relation for a
>>> database server?
>>> A supplier of a customer stated such a thing; I suppose they made it
>>> up.
>>> Any comments?
>>>
>>> Thanks
>>>
>>> Lothar
>>> --
>>> http://www.freelists.org/webpage/oracle-l
>


--
http://www.freelists.org/webpage/oracle-l
Received on Fri Aug 19 2022 - 22:39:50 CEST
