
I've been comparing R1 to O1 and O1-pro, mostly in coding, refactoring and understanding of open source code.

I can say that R1 is on par with O1, but not as deep and capable as O1-pro. R1 is also a lot more useful than Sonnet. I actually haven't used Sonnet in a while.

R1 is also comparable to the Gemini Flash Thinking 2.0 model, but in coding I feel like R1 gives me code that works without too much tweaking.

I often give an entire open-source project's codebase (or a big part of it) to all of them and ask the same question, like add a plugin or fix xyz. O1-pro is still the clear, if expensive, winner. But if I were to choose the second best, I would say R1.



How do you pass these models code bases?


I made this super easy-to-use tool: https://github.com/skirdey-inflection/r2md
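
For anyone wondering what "passing a codebase" amounts to in practice, here is a minimal sketch of the general idea: walk the repo and concatenate the source files into one Markdown document you can paste into a prompt. This is not r2md's actual code. The function name `repo_to_markdown`, the skip list, and the extension list are all assumptions made up for illustration.

```python
import os
from pathlib import Path

# Hypothetical defaults, not taken from any real tool; adjust for your project.
SKIP_DIRS = {".git", "node_modules", "target", "__pycache__"}
TEXT_EXTS = {".py", ".rs", ".go", ".js", ".ts", ".md", ".toml", ".json"}


def repo_to_markdown(root, exts=TEXT_EXTS, skip_dirs=SKIP_DIRS):
    """Concatenate a repo's source files into a single Markdown string."""
    root = Path(root)
    fence = "`" * 3  # built this way to avoid a literal fence inside the example
    parts = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix not in exts:
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip unreadable or non-UTF-8 files
            rel = path.relative_to(root).as_posix()
            lang = path.suffix.lstrip(".")
            # One heading per file, followed by its contents in a fenced block.
            parts.append(f"## {rel}\n\n{fence}{lang}\n{text.rstrip()}\n{fence}\n")
    return "\n".join(parts)


if __name__ == "__main__":
    print(repo_to_markdown("."))
```

Redirect the output to a file and paste it in ahead of your question. For big repos you would probably also want a size cap so the result fits in the model's context window.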


Some of the interfaces can check websites in real time.


At this point, it's a function of how many thinking tokens a model can generate (when it comes to o1 and r1). o3 is likely going to be superior because they used the training data generated from o1 (among other things). o1-pro has a longer "thinking" token length, so it comes out as better. The same goes for o1 through the API, where you can control the thinking length. I have not seen such an option in the r1 API, but if they provide it, the output could be even better.



