Once writing code is cheap you don't maintain code. You regenerate it from scrat...

simonw · 2026-02-24T13:21:37 1771939297

I'm not sold on that idea yet.

I don't just have LLMs spit out code. I have them spit out code and then I try that code out myself - sometimes via reviewing it and automated tests, sometimes just by using it and confirming it does the right thing.

That upgrades the code to a status of generated and verified. That's a lot more valuable than code that's just generated but hasn't been verified.

If I throw it all away every time I want to make a change I'm also discarding that valuable verification work. I'd rather keep code that I know works!

neilwilson · 2026-02-25T06:00:11 1771999211

I suspect that is where we will be going next - automated verification. At least to the point where we can pass it over the wall for user acceptance testing.

Is it possible to write Cucumber specs (for example) of sufficient clarity that allows an LLM agent team to generate code in any number of code languages that delivers the same outcome, and do that repeatedly?

Then we're at the point where we know the specs work. And is getting to the point where we know the specs work less effort than just coding directly?

We live in exciting times.

manuelabeledo · 2026-02-24T13:24:04 1771939444

Unless the specification is also free of bugs and side effects, there is no guarantee that a rewrite would have fewer bugs.

Plenty of rewrites out there prove that point.

jimbokun · 2026-02-24T21:57:10 1771970230

Seems relevant again:

https://www.joelonsoftware.com/2000/04/06/things-you-should-...

skinner_ · 2026-02-25T07:34:48 1772004888

I think the nuanced take on Joel's rant is this: it was good advice for 26 years. It became slightly less good advice a few months ago. This is a good time to warn overenthuastic people that it’s still good advice in 2026, and to start a discussion about which of its assumptions remain to be true in 2027 and later.

neilwilson · 2026-02-25T06:03:52 1771999432

Yes, but that's the point.

We're not writing code in a computer language any more, we're writing specs in structured English of sufficient clarity that they can be generated from.

The debugging would be on the specs.

manuelabeledo · 2026-02-25T14:46:40 1772030800

> writing specs in structured English of sufficient clarity

What does "sufficient clarity" mean? And is it english expressive enough and free of ambiguities? And who is going to review this process, another LLM, with the same biases and shortcomings?

I code for a living, and so far I'm OK with using LLMs to aid in my day to day job. But I wouldn't trust any LLM to produce code of sufficient quality that I would be comfortable deploying it in production without human review and supervision. And most definitely wouldn't task a LLM to just go and rewrite large parts of a product because of a change of specs.

jimbokun · 2026-02-24T21:55:30 1771970130

Tokens aren’t free.

Far more expensive than compilation and non deterministic so you’re not sure if you will get the same software if you give the AI the same spec.

neilwilson · 2026-02-25T05:54:43 1771998883

You'll get the same software in outcome terms. Which is what we want.

Tokens are cheaper than getting an individual to modify the code, and likely the tokens will get cheaper - in the same way compilation has (which used to be batched once a day overnight in the mainframe era).

Non-determinism is how the whole LLM system works. All we're doing with agents is adding another layer of reinforcement learning that gets it to converge on the correct output.

That's also how routing protocols like OSPF work. There's no guarantee when those multicast packets will turn up, yet the routes converge and networks stay stable.

I think this fear of non-determinism needs to pass, but it will only pass if evidence of success arises.