Most of our customizations and modifications have been upstreamed. We get similar performance on production 100g and 200g boxes when running the upstream kernel. I see no reason I shouldn't be able to hit 380Gb/s on a single-socket Rome box with an upstream kernel. I just haven't tried yet.
Most of the changes behind the 700g number implement disk-centric NUMA siloing, which I would never upstream at this point because they are a pile of hacks. They are needed to change the NUMA node where memory is DMAed, so as to better utilize the xGMI links between AMD CPUs.