
You can just ask it about the error, and it will suggest fixes. I don't know why it has to be 100% the first time. Humans almost never are.


You can also ask it about "errors" that aren't errors and it will suggest "fixes".


You can also do that with humans


Humans tend to put up a bit of a fight if you accuse them of producing incorrect program code; you know you're in good company if they pull out Z3, slam out a few lines in their terminal, and show you a rigorous mathematical proof that their code is correct. LLMs don't do that.

...yet.


You can skip the comment above. GPT4 will absolutely double down when it knows it's right. The parent is clueless.


LLMs can do that with chain of thought, in a way that generalizes to multiple tasks.


Only in a vague way. Even with chain-of-thought, feedback loops, and other neat tricks, I've never seen an LLM produce valid theorems for Z3 (beyond trivial examples).


You've got to pay humans, and they have opinions and health issues and stuff.


Not really. It's actually a fairly good interview practice to see if somebody will defend their solution if probed on it.


I've attempted to use this iterative method with GPT4 to build an application, and things just get clumsier and more error-prone as the program grows in complexity. Eventually I get to the point where asking it to make revisions becomes a dice roll with respect to keeping the code behaving as expected or having it arbitrarily omit random portions of the application logic in the rewrite. It's certainly a great way to brainstorm or to quickly produce snippets of logic but it fails for anything beyond toy apps.


This looks promising, once Microsoft releases the code for it.

https://arxiv.org/abs/2309.12499


That works for a straightforward compiler error, but not for plot or logic errors in prose, which cannot be detected automatically.


It works for all kinds of errors to varying degrees.

Even "logic" errors can be generated in one go and detected by another instance. Something like "this is what I wanted, but you did this" works sometimes too.


Most enterprise software just follows basic logic like the workflow above. So, worst-case scenario, ChatGPT can build 90% of the code we need.


Because why are we considering supplanting humans for this labor if it provides no additional value (apart from sacrificing humans at the altar of capitalism)?


Cheaper always wins.

But besides that, the LLM is less likely to demand unscheduled time off, especially as a fraction of the hours it can put in. If I have a family emergency once per year, and I need my eight-hour day off, I've just removed 1/200 of my yearly output potential. The LLM, which can run around the clock, would need to be down for over 40 hours per year (1/200 of the 8,760 hours in a year) to get to that type of output reduction. Realistically, that is unlikely to happen.
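
The comparison can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes a working year of 200 eight-hour days (which is where a 1/200 figure for one day off comes from) and an LLM that is otherwise available 24 hours a day, 365 days a year. The function name and default values are made up for illustration.

```python
from fractions import Fraction


def equivalent_downtime_hours(days_off=1, working_days=200, llm_hours_per_year=24 * 365):
    """Hours of LLM downtime that cost the same fraction of yearly output
    as `days_off` days out of a human's `working_days` working year."""
    # Fraction of the human's yearly output lost (1/200 for one day off).
    lost_fraction = Fraction(days_off, working_days)
    # Apply the same fraction to the hours the LLM could have worked.
    return lost_fraction * llm_hours_per_year


hours = equivalent_downtime_hours()
print(f"{float(hours):.1f} hours of downtime per year")  # 43.8
```

Under these assumptions the break-even point is roughly 44 hours of downtime per year; a different working year or availability figure shifts it proportionally.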


The curious thing about this, though, is that of course you can start by replacing software engineers with it, but how far away are you going to be from being able to replace a CEO with it? I would say not that far away.



