Re: ideal CPU/Memory relation

From: Tanel Poder <tanel_at_tanelpoder.com>
Date: Fri, 19 Aug 2022 16:25:51 -0400
Message-ID: <CAMHX9JKEAJtMuEgqaohO_sF31hom=F8jmyGXPUheJoQ572j-Xg_at_mail.gmail.com>



I should correct a "typo":

I mean:

*My 2 x Xeon machine has CPUs with 6 memory channels each, but 8 DIMM slots per CPU in the mobo.*

By the way, another interesting thing is that Intel Xeon CPUs support direct transfers from PCIe devices into the CPU cache (Data Direct I/O), with no need to write anything to RAM at all, if the CPU consumes the data quickly and then discards it (or the next PCIe transaction overwrites the old cache lines with the next batch of data).

PCIe has really high throughput: a single AMD Zen2/3 EPYC CPU can do ~256 GB/s of data transfers in each direction when all 128 PCIe 4.0 lanes are in use. PCIe 5.0 doubles that, and PCIe 6.0 doubles it again... So for columnar analytics use cases, routing PCIe traffic directly into the CPU cache competes well with keeping everything in the (limited) DRAM. You can do SIMD/columnar analytics on data just read from "disk" over PCIe, no problem. No need to keep everything in DRAM, especially if you're limited to only a few tens of TB per server :-)
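For reference, here's a back-of-the-envelope sketch of where those numbers come from (a Python sketch; the per-lane figures are ballpark usable rates after encoding overhead, not measured values):

    # Approximate usable per-lane throughput in GB/s per direction, after
    # encoding overhead (128b/130b for PCIe 3.0-5.0, FLIT/PAM4 for 6.0).
    # Ballpark figures only; sustained throughput depends on the platform.
    PER_LANE_GBPS = {
        "PCIe 3.0": 0.985,   #  8 GT/s * 128/130 / 8 bits
        "PCIe 4.0": 1.969,   # 16 GT/s * 128/130 / 8 bits
        "PCIe 5.0": 3.938,   # 32 GT/s * 128/130 / 8 bits
        "PCIe 6.0": 7.563,   # 64 GT/s PAM4, FLIT-encoded (approximate)
    }

    lanes = 128  # a single-socket AMD EPYC exposes 128 PCIe lanes

    for gen, per_lane in PER_LANE_GBPS.items():
        total = per_lane * lanes
        print(f"{gen}: ~{per_lane:.2f} GB/s/lane -> ~{total:.0f} GB/s per direction")

With all 128 PCIe 4.0 lanes busy that works out to roughly 252 GB/s per direction, which is where the ~256 GB/s figure above comes from.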

--
Tanel Poder
https://learn.tanelpoder.com

On Fri, Aug 19, 2022 at 4:10 PM Tanel Poder <tanel_at_tanelpoder.com> wrote:


> There's another thing to think about - especially when you want the best
> memory access performance & throughput (and are not optimizing just for
> having the maximum amount of RAM possible).
>
> Computers are networks. Modern CPUs are also networks. One core cannot
> consume the max memory bandwidth; you need multiple cores.
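> A rough way to see this yourself is to run the same copy loop with a
> growing number of worker processes and watch the aggregate throughput
> scale (a sketch assuming numpy is installed; the buffer size and worker
> counts are arbitrary, and the absolute numbers depend on the machine):
>
> import time
> import numpy as np
> from multiprocessing import Pool
>
> N = 256 * 1024 * 1024   # 256 MB per worker, far larger than the CPU caches
> REPS = 20
>
> def copy_bandwidth(_):
>     src = np.ones(N, dtype=np.uint8)
>     dst = np.empty_like(src)
>     t0 = time.time()
>     for _ in range(REPS):
>         np.copyto(dst, src)      # each copy reads N and writes N bytes
>     return 2 * N * REPS / (time.time() - t0) / 1e9   # GB/s
>
> if __name__ == "__main__":
>     for workers in (1, 4, 8):
>         with Pool(workers) as pool:
>             results = pool.map(copy_bandwidth, range(workers))
>         print(f"{workers} worker(s): ~{sum(results):.1f} GB/s aggregate")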
>
> Cores within a single processor can be connected to memory controllers /
> RAM and each other via ring networks (like some Xeons) or a central "I/O"
> hub (like AMD Zen).
>
> On some CPUs it's the CPU cores that each have memory channels attached
> (no central I/O hub). With cheaper CPUs (where half of the cores are
> disabled), the memory channels (or the "CPU I/O" bandwidth) may be cut in
> half too ->
> https://www.servethehome.com/amd-epyc-7002-rome-cpus-with-half-memory-bandwidth/
> This happens to be the case with the 16-core AMD Threadripper Pro that I
> have in my Lenovo Workstation
> <https://tanelpoder.com/posts/11m-iops-with-10-ssds-on-amd-threadripper-pro-workstation/>
> .
>
> The older Xeons have 6 memory channels each; the newest have 8, I think.
>
> AMD EPYC / Threadripper Pro have 8 memory channels.
>
> So, you should populate all 8 DIMM slots (not fewer) with memory if you
> want performance. Or, if your mobo supports two or three DIMMs per memory
> channel, you could also use 16 DIMMs across the 8 channels, but this won't
> increase your RAM throughput (and may actually increase latency, due to
> having to switch between DIMM ranks). The low latency folks in the high
> frequency trading world care about this stuff.
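> To put rough numbers on the channel count (a back-of-the-envelope sketch;
> DDR4-3200 is just an example speed, and these are theoretical peaks, not
> measured figures):
>
> # Peak DRAM bandwidth = channels * transfer rate (MT/s) * 8 bytes per
> # transfer (64-bit wide channel). Populating fewer channels than the CPU
> # has cuts this proportionally, no matter how large the individual DIMMs are.
> MT_PER_S = 3200          # DDR4-3200 as an example
> BYTES_PER_TRANSFER = 8   # 64-bit channel
>
> def peak_gb_per_s(channels_populated):
>     return channels_populated * MT_PER_S * 1e6 * BYTES_PER_TRANSFER / 1e9
>
> for ch in (2, 6, 8):
>     print(f"{ch} channels populated: ~{peak_gb_per_s(ch):.1f} GB/s peak")
> # 2 channels:  ~51.2 GB/s
> # 6 channels: ~153.6 GB/s
> # 8 channels: ~204.8 GB/s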
>
> My 2 x Xeon machine has CPUs with 6 memory channels each, but 8 DIMM slots
> in the mobo. But I filled only 6 slots for each CPU, to avoid imbalance
> in the RAM access throughput & traffic.
>
> So if you're building a bad-ass (gaming) workstation or some
> high-performance server, don't buy just one or two large & expensive DIMMs
> in the hope of adding more later; populate enough DIMM slots so that all
> of your CPUs' memory channels are in use.
>
> Oh, the world is changing: PCIe (especially PCIe 5.0 and the future 6.0)
> latency and throughput are so good that it's getting pretty close to RAM
> speed as far as the transport goes. So (now that Intel killed Optane
> <https://tanelpoder.com/posts/testing-oracles-use-of-optane-persistent-memory/>)
> it's worth keeping an eye on the Compute Express Link (CXL) standard. With
> CPU support, it's basically like cache coherent system memory, but accessed
> over PCIe 5.0+ links. It's even possible to connect large boards full of
> DRAM to multiple separate compute nodes, so in theory someone could build a
> CXL-based shared global buffer cache used by an entire rack of servers
> concurrently, without needing RAC GC/LMS processes to ship blocks around.
>
> --
> Tanel Poder
> https://learn.tanelpoder.com
>
>
> On Fri, Aug 19, 2022 at 3:02 AM Lothar Flatz <l.flatz_at_bluewin.ch> wrote:
>
>> Hi,
>>
>> Has anybody ever heard of an ideal CPU/memory relation for a database
>> server?
>> A supplier of a customer stated such a thing;
>> I suppose they made it up.
>> Any comments?
>>
>> Thanks
>>
>> Lothar
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>>
-- http://www.freelists.org/webpage/oracle-l
