> Plus, how exactly did Deepseek lie. The model size, data size are all known. Calculating the number of FLOPS is an exercise in arithmetics, which is perhaps the secret Deepseek has because it seemingly eludes people.
Model parameter count and training set token count are fixed. But other things such as epochs are not.
In the same amount of time, you could have 1 epoch or 100 epochs depending on how many GPUs you have.
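To make the point concrete, here is a back-of-envelope sketch using the standard ~6·N·D approximation for dense-transformer training compute (all parameter/token/throughput numbers below are illustrative, not DeepSeek's actual figures):

```python
# Back-of-envelope training-compute arithmetic using the common
# ~6 FLOPs-per-parameter-per-token heuristic for dense transformers.
# All concrete numbers here are illustrative assumptions.

def training_flops(params: float, tokens: float, epochs: int = 1) -> float:
    """Approximate total training FLOPs: ~6 * N * D * epochs."""
    return 6 * params * tokens * epochs

def gpu_hours_needed(total_flops: float, peak_flops_per_gpu: float,
                     utilization: float) -> float:
    """Wall-clock GPU-hours at a given sustained utilization (MFU)."""
    sustained = peak_flops_per_gpu * utilization
    return total_flops / sustained / 3600

# Hypothetical run: 70B params, 2T tokens, one epoch.
flops = training_flops(70e9, 2e12)
# Assume ~1e15 peak FLOP/s per GPU at 40% sustained utilization.
hours = gpu_hours_needed(flops, 1e15, 0.40)
print(f"total FLOPs: {flops:.2e}")
print(f"GPU-hours:   {hours:,.0f}")
# The same FLOP budget spread over 10x the GPUs finishes in 1/10 the
# wall-clock time -- which is why model size and token count alone
# don't pin down how many epochs (or GPUs) a given timeline implies.
```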
Also, what if their claim on GPU count is accurate, but they are using better GPUs they aren't supposed to have? For example, they claim 1,000 GPUs for 1 month total. They claim to have H800s, but what if they are using illegal H100s/H200s, B100s, etc? The GPU count could be correct, but their total compute is substantially higher.
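A quick sketch of why GPU *count* alone underdetermines total compute: the same 1,000-GPU, 1-month cluster can deliver very different effective FLOP budgets depending on the chip. The per-GPU figures below are rough illustrations, not vendor-verified specs; note the H800's main cut versus the H100 is interconnect bandwidth, which tends to show up as lower achievable utilization in multi-node training rather than lower peak FLOPs, while newer chips also raise peak throughput.

```python
# Effective compute from the same GPU count, under assumed (illustrative)
# per-chip peak throughput and achievable utilization.

GPU_COUNT = 1_000
SECONDS = 30 * 24 * 3600          # one month of wall clock

# (assumed peak FLOP/s, assumed achievable utilization) -- illustrative only
gpus = {
    "H800": (1.0e15, 0.30),       # interconnect-limited in multi-node runs
    "H100": (1.0e15, 0.40),
    "B100": (1.8e15, 0.40),
}

for name, (peak, util) in gpus.items():
    effective = GPU_COUNT * SECONDS * peak * util
    print(f"{name}: ~{effective:.2e} effective FLOPs")
```

So a truthful "1,000 GPUs for a month" can hide a 2x or larger compute gap depending on which GPUs those actually are.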
It's clearly an incredible model, they absolutely cooked, and I love it. No complaints here. But the likelihood that some numbers are fudged is not 0%. And I don't even blame them; they are likely forced into this by US export laws and such.
> In the same amount of time, you could have 1 epoch or 100 epochs depending on how many GPUs you have.
This just isn't true for RL and related algorithms: adding more GPUs/agents runs into diminishing returns, and is not equivalent to letting a single agent go through more steps.
It should be trivially easy to reproduce the results no? Just need to wait for one of the giant companies with many times the GPUs to reproduce the results.
I don't expect a #180 AUM hedge fund to have as many GPUs as Meta, Microsoft, or Google.
AUM isn't a good proxy for quantitative hedge fund performance, many strategies are quite profitable and don't scale with AUM. For what it's worth, they seemed to have some excellent returns for many years for any market, let alone the difficult Chinese markets.