
That comment from AnandTech contains guesses that are unlikely to be true, e.g. "executing these instructions over two cycles", and it shows ignorance of how AVX-512 is implemented in Intel CPUs.

What is actually known from the AMD disclosure is that Zen 4 has the same execution resources as Zen 3; only the load/store units are improved, perhaps with their bandwidth increased to match that of all recent Intel CPUs.

Zen 3 has 4 AVX pipelines, which together can execute four 256-bit instructions per cycle, but some more complex instructions, e.g. multiplications or FMAs, can be executed by at most 2 of those pipelines.

There are 2 possible ways to execute a 512-bit instruction in such pipelines: either the instruction occupies one pipeline for 2 cycles, or it occupies 2 pipelines for 1 cycle.

The throughput is the same, but the latency of the operation is different. The simpler and better way is to occupy 2 pipelines for 1 cycle. This is also how most AVX-512 instructions are implemented in all Intel CPUs. There are two exceptions: FMA/FMUL, for which a few models of Intel CPUs have a second 512-bit pipeline, and some other instructions, for which one of the three 256-bit pipelines that exist in Intel CPUs has a 256-bit extension. That extension allows the Intel CPU to do two 512-bit instructions per cycle, even though it can do only three 256-bit instructions per cycle.
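To make the throughput/latency point concrete, here is a toy model in Python. It is only a sketch: the function names, the mode strings and the assumption that the two-cycle variant adds exactly one cycle of latency are mine, not anything disclosed by AMD or Intel.

```python
def model_512bit(pipes_256, mode, base_latency=1):
    """Toy model of 512-bit ops issued on 256-bit pipelines.

    Returns (ops_per_cycle, latency_in_cycles).
    """
    # Either way, one 512-bit op consumes two 256-bit pipeline slots,
    # so the steady-state throughput is the same in both modes.
    ops_per_cycle = pipes_256 / 2
    if mode == "two_pipes_one_cycle":
        latency = base_latency
    elif mode == "one_pipe_two_cycles":
        # Assumption: the second half of the op costs one extra cycle.
        latency = base_latency + 1
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ops_per_cycle, latency


def cycles_for(n_ops, pipes_256, mode, dependent):
    """Approximate cycles for n_ops 512-bit ops (pipeline fill time ignored)."""
    ops_per_cycle, latency = model_512bit(pipes_256, mode)
    if dependent:
        # Each op waits for the previous result: bound by latency.
        return n_ops * latency
    # Independent ops: bound by throughput.
    return n_ops / ops_per_cycle


for mode in ("two_pipes_one_cycle", "one_pipe_two_cycles"):
    print(mode,
          cycles_for(8, 4, mode, dependent=False),
          cycles_for(8, 4, mode, dependent=True))
```

With 4 pipelines, both modes need the same number of cycles for independent operations, and the difference only shows up in a chain of dependent operations. Passing `pipes_256=2` gives the FMA/FMUL case, where only 2 of the pipelines are usable.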

Intel CPUs can do two 512-bit instructions per cycle, except for a few instructions like FMA/FMUL, which execute at only one per cycle in the cheaper CPUs but at two per cycle in most Xeon Gold, all Xeon Platinum, and the Xeon W models with AVX-512.

AMD Zen 4 is certain to have the same 512-bit throughput per clock cycle (two 512-bit instructions per cycle, of which only 1 can be FMA/FMUL) as all Intel CPUs except the models with two 512-bit FMA units, which will have double the throughput only for FMA/FMUL.
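As a rough illustration of what that means for peak arithmetic, the per-core single-precision numbers can be worked out as below. The function name is made up for this sketch, and the results are theoretical peaks derived from the instruction rates stated above, not measurements.

```python
def fp32_flops_per_cycle(fma_512_per_cycle):
    """Peak single-precision FLOPs per cycle per core from 512-bit FMAs."""
    lanes = 512 // 32              # 16 FP32 lanes in a 512-bit register
    flops_per_lane = 2             # one FMA = one multiply + one add
    return fma_512_per_cycle * lanes * flops_per_lane


print(fp32_flops_per_cycle(1))  # one 512-bit FMA per cycle
print(fp32_flops_per_cycle(2))  # two 512-bit FMA units
```

One 512-bit FMA per cycle is the same peak as two 256-bit FMAs per cycle, so under these assumptions the benefit is in instruction count rather than in raw FMA rate.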



Thank you for the explanation.



