I've been intrigued by this approach. A more highly optimized (but harder to program) take is the GA144[1] from Chuck Moore, the inventor of Forth. It's a Cartesian grid of 144 F18 Forth-based processors. These processors are far more limited, but then again they draw far less power as well.
Fun fact: Tenstorrent wanted to add instructions to enqueue data between processors connected in a mesh, and Arm said no (they don't do architectural licenses anymore), so Tenstorrent used RISC-V.
[1] https://www.greenarraychips.com/