Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

That's the equivalent to what we are asking the model to do. If you give the model a calculator it will get 100%. If you give it a pen and paper (e.g. let it show it's working) then it will get near 100%.


Citation needed.


Which bit do you need a citation for? I can run the experiment in 10 mins.


> That's the equivalent to what we are asking the model to do.

Why?

What does it mean to give a model a calculator?

What do you mean “let it show its working”? If I ask an LLM to do a calculation, I never said it can’t express the answer to me in long-form text or with intermediate steps.

If I ask a human to do a calculation that they can’t reliably do in their head, they are intelligent enough to know that they should use a pen and paper without needing my preemptive permission.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: