Hacker News | jamescham's comments

Pete Warden and team just published a paper on Moonshine, their speech-to-text model.

Key features include:

- 1.7x overall speed boost compared to Whisper
- Flexible-sized input window, allowing for more efficient processing of shorter audio clips
- Up to 5x faster performance on 10-second audio clips
- Matches or exceeds Whisper's accuracy


How is this not getting any news?



This is a great point! (And kind of terrifying.)


“Some good companies this year”—observations from my friend Joshua.


This is exactly right—we now live in a world in which most jobs are knowledge work, and we should look to those who are the most productive (and lazy) knowledge workers: software developers.


My understanding is that Google’s big advantage is that they’ve collected so much good, annotated voice data.


Yeah. I’m convinced the current model is just too confusing. But I really wish there were new interaction patterns that took advantage of low-latency speech recognition...


Oh! Good to know!


ardit33 is right. My experience is that most successful startups are so hungry for talent that they really don't care where you come from or who you know.

