| | SQL injection-like attack on LLMs with special tokens (twitter.com/karpathy) |
| 1 point by tosh on Aug 14, 2024 | past |
|
| | RLHF is just barely RL (twitter.com/karpathy) |
| 386 points by tosh on Aug 8, 2024 | past | 257 comments |
|
| | Andrej Karpathy on X: RLHF is just barely RL (twitter.com/karpathy) |
| 15 points by bilsbie on Aug 7, 2024 | past |
|
| | LLM model size competition is intensifying backwards (twitter.com/karpathy) |
| 1 point by kk1naK0 on Aug 1, 2024 | past |
|
| | The Weirdness of LLM Tokenization (twitter.com/karpathy) |
| 2 points by tosh on July 26, 2024 | past |
|
| | Jagged Intelligence (twitter.com/karpathy) |
| 3 points by mellosouls on July 25, 2024 | past |
|
| | Andrej Karpathy: "LLM model size competition is intensifying backwards" (twitter.com/karpathy) |
| 13 points by bilsbie on July 18, 2024 | past |
|
| | I am starting an AI+Education company (twitter.com/karpathy) |
| 915 points by bilsbie on July 16, 2024 | past | 539 comments |
|
| | The if-then-else monster (twitter.com/karpathy) |
| 4 points by tosh on July 10, 2024 | past |
|
| | [flagged] Andrej Karpathy on X: "100% Software 2.0 computer. Just a single neural net (twitter.com/karpathy) |
| 25 points by bilsbie on June 30, 2024 | past | 23 comments |
|
| | One built-in UI/UX feature of LLM interfaces I'd love is proof (twitter.com/karpathy) |
| 1 point by bilsbie on June 21, 2024 | past |
|
| | These 94 lines of code are everything that is needed to train a neural network (twitter.com/karpathy) |
| 3 points by r_singh on June 21, 2024 | past |
|
| | [Andrej Karpathy] Let's reproduce GPT-2, in PyTorch from scratch (nanoGPT) (twitter.com/karpathy) |
| 7 points by _giorgio_ on June 10, 2024 | past |
|
| | Let's reproduce GPT-2 (124M) (twitter.com/karpathy) |
| 5 points by Multiset on June 9, 2024 | past |
|
| | FineWeb-Edu: High quality LLM dataset (twitter.com/karpathy) |
| 3 points by tosh on June 2, 2024 | past |
|
| | I had ~30 direct reports and didn't do 1on1s at Tesla and imo it was great (twitter.com/karpathy) |
| 1 point by tosh on May 31, 2024 | past | 2 comments |
|
| | CUDA/C++ origins of Deep Learning (twitter.com/karpathy) |
| 4 points by tosh on May 4, 2024 | past |
|
| | llm.c: multi-GPU, bfloat16, flash attention, ~7% faster than PyTorch (twitter.com/karpathy) |
| 121 points by tosh on May 3, 2024 | past | 10 comments |
|
| | LLMs must one day run in Space (twitter.com/karpathy) |
| 6 points by tosh on May 3, 2024 | past | 1 comment |
|
| | llm.c Update (twitter.com/karpathy) |
| 31 points by ibobev on April 19, 2024 | past |
|
| | Karpathy on Llama 3 (twitter.com/karpathy) |
| 12 points by tosh on April 18, 2024 | past |
|
| | Consider Being a Labeler for an LLM (twitter.com/karpathy) |
| 2 points by tosh on April 18, 2024 | past |
|
| | Scheduling Workloads to Run on Humans (twitter.com/karpathy) |
| 2 points by tosh on April 17, 2024 | past |
|
| | llm.c is now down to 26.2ms/iteration, matching PyTorch (twitter.com/karpathy) |
| 46 points by tosh on April 14, 2024 | past | 8 comments |
|
| | llm.c is only 2X slower than PyTorch (fp32, forward pass) (twitter.com/karpathy) |
| 2 points by tosh on April 13, 2024 | past |
|
| | Andrej Karpathy explaining llm.c in layman terms (twitter.com/karpathy) |
| 1 point by ibobev on April 11, 2024 | past |
|
| | Karpathy: Explaining llm.c in layman terms [tweet] (twitter.com/karpathy) |
| 9 points by mellosouls on April 10, 2024 | past |
|
| | LLM training in simple, pure C/CUDA (twitter.com/karpathy) |
| 4 points by theaniketmaurya on April 8, 2024 | past | 1 comment |
|
| | Automating Software Engineering (twitter.com/karpathy) |
| 4 points by quick_brown_fox on March 13, 2024 | past |
|
| | Andrej Karpathy on automating software engineering (twitter.com/karpathy) |
| 5 points by hubraumhugo on March 12, 2024 | past |
|