You can just ask it about the error, and it will suggest fixes. I don't know why...

crooked-v · on Sept 27, 2023

You can also ask it about "errors" that aren't errors and it will suggest "fixes".

MeImCounting · on Sept 27, 2023

You can also do that with humans

DaiPlusPlus · on Sept 27, 2023

Humans tend to put up a bit of a fight if you accuse them of producing incorrect program code; you know you're in good company if they pull out Z3, slam a few lines into their terminal, and show you a rigorous mathematical proof that their code is correct. LLMs don't do that.

...yet.

lostmsu · on Sept 28, 2023

You can skip the comment above. GPT-4 will absolutely double down when it knows it's right. The parent is clueless.

BoorishBears · on Sept 28, 2023

LLMs can do that with chain of thought, in a way that generalizes to multiple tasks.

DaiPlusPlus · on Sept 29, 2023

Only in a vague way. Even with chain-of-thought, feedback loops, and other neat tricks, I've never seen an LLM produce valid theorems for Z3 (beyond trivial examples).

huytersd · on Sept 27, 2023

You got to pay humans and they have opinions and health issues and stuff.

pxx · on Sept 27, 2023

Not really. It's actually a fairly good interview practice to see if somebody will defend their solution if probed on it.

root_axis · on Sept 27, 2023

I've attempted to use this iterative method with GPT4 to build an application, and things just get clumsier and more error-prone as the program grows in complexity. Eventually I get to the point where asking it to make revisions becomes a dice roll with respect to keeping the code behaving as expected or having it arbitrarily omit random portions of the application logic in the rewrite. It's certainly a great way to brainstorm or to quickly produce snippets of logic but it fails for anything beyond toy apps.

famouswaffles · on Sept 27, 2023

This looks promising, once Microsoft releases the code for it.

https://arxiv.org/abs/2309.12499

bo0tzz · on Sept 27, 2023

That works for a straightforward compiler error, but not for plot or logic errors in prose, which cannot be detected automatically.

famouswaffles · on Sept 27, 2023

It works for all kinds of errors to varying degrees.

Even "logic" errors can be generated in one go and detected by another instance. Something like "this is what I wanted, but you did this" works sometimes too.
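A minimal sketch of that generate-then-review loop, in case it helps make the idea concrete. The `generate` and `review` callables are hypothetical stand-ins for calls to two separate model instances (no real API is assumed), and `review` is assumed to return `None` when it finds nothing wrong:

```python
from typing import Callable, Optional, Tuple


def generate_with_review(
    task: str,
    generate: Callable[[str], str],
    review: Callable[[str, str], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Ask one instance for a draft, ask a second to critique it, and retry.

    Both callables are hypothetical placeholders for model calls.
    Returns the last draft and whether the reviewer accepted it.
    """
    prompt = task
    draft = ""
    for _ in range(max_rounds):
        draft = generate(prompt)
        complaint = review(task, draft)
        if complaint is None:
            return draft, True
        # Feed the objection back: "this is what I wanted, but you did this".
        prompt = (
            f"{task}\n\nPrevious attempt:\n{draft}"
            f"\n\nProblem reported:\n{complaint}"
        )
    return draft, False
```

The round limit matters: the reviewer can be wrong too, so the loop gives up rather than spinning forever, and the caller gets a flag saying whether the draft was ever accepted.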

huytersd · on Sept 27, 2023

Most enterprise software just follows basic logic like the workflow above. So, worst-case scenario, ChatGPT can build 90% of the code we need.

dclowd9901 · on Sept 27, 2023

Because why are we considering supplanting humans for this labor if it provides no additional value (apart from sacrificing humans at the altar of capitalism)?

dotancohen · on Sept 27, 2023

Cheaper always wins.

But besides that, the LLM is less likely to demand unscheduled time off, especially as a fraction of the hours it can put in. If I have a family emergency once per year and need an eight-hour day off, I've just removed 1/200 of my yearly output potential. An LLM running around the clock would need to be down for over 40 hours per year to see the same reduction in output. Realistically, that is unlikely to happen.
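A rough check of that comparison, assuming 200 eight-hour working days a year for the human (the basis of the 1/200 figure) and an LLM available around the clock, 8,760 hours a year. Both baselines are assumptions, not measurements:

```python
# Assumed baselines; neither is a measured figure.
HUMAN_WORK_DAYS = 200            # implied by the 1/200 figure
HOURS_PER_DAY = 8
LLM_HOURS_PER_YEAR = 24 * 365    # round-the-clock availability

human_hours = HUMAN_WORK_DAYS * HOURS_PER_DAY        # 1600 hours
lost_fraction = HOURS_PER_DAY / human_hours          # one day off = 1/200

# Downtime that would cost the LLM the same fraction of its year.
llm_equivalent_downtime = lost_fraction * LLM_HOURS_PER_YEAR

print(f"fraction lost: {lost_fraction:.4f}")                         # 0.0050
print(f"equivalent LLM downtime: {llm_equivalent_downtime:.1f} h")   # 43.8 h
```

Under those assumptions the equivalent downtime comes to roughly 44 hours a year. A different baseline shifts the number, but not the direction of the comparison.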

nicolapede · on Sept 27, 2023

The curious thing about this, though, is that of course you can start by replacing software engineers with it, but how far away are you going to be from being able to replace a CEO with it? I would say not that far away.