Hacker News | pidtom's submissions
1. Skipping 90% of KV dequant work speeds up LLM decode by 22% (github.com/thetom)
1 point by pidtom 40 days ago | past
