Hacker News

Maybe evolutionary algorithms instead? Hasn't proven super useful historically, but maybe at the scale of enormous LLMs it will be?


Nope, they're orders of magnitude less efficient because they don't leverage gradient descent.

Rule of thumb in optimization: real numbers are easy, integers are hard


This may be the status quo because of the so-called "hardware lottery," which has historically favored floating point. I'm speculating, but if hardware designers were instead only concerned about raw XNOR density and throughput, we might end up with chips powerful enough that giant 1-bit nets could be trained purely through evolution.
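To make the idea concrete, here's a minimal sketch of training a 1-bit net purely through evolution: a (1+1) hill-climber whose only mutation is a bit flip, no gradients anywhere. Everything here (the toy "teacher" dataset, the function names) is illustrative, not from any real binary-net training setup:

```python
import random

random.seed(0)

# Toy task: learn 1-bit weights w in {-1, +1} so that sign(w . x) matches labels.
# (1+1) evolutionary strategy: flip random bits, keep the child if it's not worse.

def accuracy(w, data):
    correct = 0
    for x, y in data:
        s = sum(wi * xi for wi, xi in zip(w, x))
        pred = 1 if s >= 0 else -1
        correct += (pred == y)
    return correct / len(data)

def evolve(data, n_bits, generations=500, flips=1):
    w = [random.choice([-1, 1]) for _ in range(n_bits)]
    best = accuracy(w, data)
    for _ in range(generations):
        child = w[:]
        for i in random.sample(range(n_bits), flips):
            child[i] = -child[i]          # bit flip: the only mutation available
        acc = accuracy(child, data)
        if acc >= best:                   # no gradient signal, just accept-if-not-worse
            w, best = child, acc
    return w, best

# Hypothetical dataset: labels produced by a hidden 1-bit "teacher" vector.
teacher = [1, -1, 1, 1, -1, -1, 1, -1]
data = []
for _ in range(64):
    x = [random.choice([-1, 1]) for _ in teacher]
    s = sum(t * xi for t, xi in zip(teacher, x))
    data.append((x, 1 if s >= 0 else -1))

w, best = evolve(data, len(teacher))
```

This works fine at 8 bits; the grandparent's point is that the search space grows as 2^n, which is exactly where "integers are hard" bites at LLM scale.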


No, it's a fact at the mathematical level that you can enshrine in big-O terms if you want to.


How do you optimize memory for floating point?


BF8 and other similar formats?
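For a rough sense of why narrower formats answer the memory question: weight memory scales linearly with bit width. A back-of-envelope sketch (the 7B parameter count and format list are illustrative, not from the thread):

```python
# Rough memory footprint of model weights per numeric format (pure arithmetic).
BITS = {"fp32": 32, "bf16": 16, "fp8": 8, "1-bit": 1}

def weight_bytes(n_params, fmt):
    """Bytes needed to store n_params weights in the given format."""
    return n_params * BITS[fmt] // 8

n = 7_000_000_000  # a hypothetical 7B-parameter model
for fmt in BITS:
    print(f"{fmt}: {weight_bytes(n, fmt) / 1e9:.2f} GB")
```

So dropping from fp32 to an 8-bit format cuts weight memory 4x, and a 1-bit net 32x, which is the appeal upthread.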


Evolutionary algorithms made you, didn’t they?


That does not prove that they can beat gradient descent.


It took a lot of human-brain FLOPs to get to this point in time, though. I wonder how many orders of magnitude more than it took to train ChatGPT...


Gradient-directed evolutionary algorithm sounds kinda interesting.
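One existing reading of that idea is an evolution strategy that *estimates* a gradient from random perturbations and then takes a gradient-like step (as in natural evolution strategies). A minimal sketch on a toy objective, with all names and hyperparameters hypothetical:

```python
import random

random.seed(1)

def f(w):
    """Toy objective to maximize: single peak at w = [3, -2]."""
    return -((w[0] - 3) ** 2 + (w[1] + 2) ** 2)

def es_step(w, sigma=0.1, lr=0.05, pop=50):
    """One evolution-strategy update: perturb, score, form a gradient estimate."""
    base = f(w)                           # baseline to reduce estimator variance
    grad = [0.0 for _ in w]
    for _ in range(pop):
        eps = [random.gauss(0, 1) for _ in w]
        reward = f([wi + sigma * ei for wi, ei in zip(w, eps)]) - base
        for i, ei in enumerate(eps):
            grad[i] += reward * ei        # score-function gradient estimate
    grad = [g / (pop * sigma) for g in grad]
    return [wi + lr * gi for wi, gi in zip(w, grad)]  # gradient-like ascent step

w = [0.0, 0.0]
for _ in range(300):
    w = es_step(w)
# w should end up near the peak at [3, -2]
```

The population never exchanges genetic material here; "evolution" only supplies the perturbations, and the update itself is directed by the estimated gradient, which is why it scales better than pure mutation-and-select.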



