Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yes, perfection is difficult, but it's relative. It can definitely be made much safer. Looking at the analysis of pre vs post alignment makes this obvious, including when the raw unaligned models are compared to "uncensored" models.


Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: