Hacker News

You didn't read the full article. The last paragraph talks about this specifically.



In the last paragraph you handwave that all the Z80 and ZX Spectrum documentation is likely already in the model anyway... Choosing not to provide the documents/websites might then require more prompting to finish the emulator, but the knowledge is there. You can't clean-room with a large LLM. That's a delusion!

Counterpoint: in December, a Polish MP [0] vibe-coded an interpreter [1] for a 1959 Polish programming language, feeding it the available documentation. _That,_ at least, is unlikely to have appeared in the model's training data.

[0]: https://en.wikipedia.org/wiki/Adrian_Zandberg [1]: https://sako-zam41.netlify.app/


Not exactly a counterpoint, since nobody argued that LLMs cannot produce "original" code from specs at all - just that this particular exercise was not clean room.

(Although, as for SAKO [1], it's a fairly typical 1960s programming language, just with keywords in Polish, so producing an interpreter for it is almost trivial for an LLM - construction by analogy is the bread and butter of LLMs. Also, such interpreters tend to be an order of magnitude less complex than emulators.)

[1]: https://en.wikipedia.org/wiki/SAKO_(programming_language)


I mean, for an article titled "clean room", that would be the first thing to do, not a "maybe follow up in the future"...

(I do think the article could have stood on its own without mentioning anything about "clean room", which is a very high standard.)

As for the handwavy point about the x86 assembler, I am quite sure the LLM remembers the entire x86 instruction set without any reference; it's more a problem of having a very well-tuned agentic loop with no context pollution to extract it. (Which you won't get by YOLOing Claude, because LLMs aren't meta-RLed enough yet to correct their own context/prompt-engineering problems.)

Or alternatively, to exploit context pollution: take half of an open-source project, let the LLM fill in the rest (try to imagine the synthetic "prompt" it was given when training on this repo), and see how far the result is from the actual version.
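The fill-in-the-rest check could be sketched roughly like this (a minimal illustration, not a benchmark: `split_file` and `similarity` are hypothetical helpers, and `model_completion` stands in for whatever the LLM actually returns for the visible prefix):

```python
import difflib


def split_file(source: str, fraction: float = 0.5) -> tuple[str, str]:
    """Split a source file into a visible prefix (the 'prompt')
    and a hidden suffix the model is asked to reconstruct."""
    lines = source.splitlines(keepends=True)
    cut = int(len(lines) * fraction)
    return "".join(lines[:cut]), "".join(lines[cut:])


def similarity(generated: str, actual: str) -> float:
    """Line-based similarity between the model's completion and the
    real hidden half of the file (1.0 = line-for-line identical)."""
    return difflib.SequenceMatcher(
        None, generated.splitlines(), actual.splitlines()
    ).ratio()


# Toy example: hide the second half of a file from the model.
source = "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b\n"
prefix, hidden = split_file(source)

# Placeholder for the LLM's answer to `prefix`.
model_completion = "def sub(a, b):\n    return a - b\n"
print(round(similarity(model_completion, hidden), 2))
```

A line-level diff ratio is crude (it penalizes renamed identifiers as much as wrong logic), but it's enough to spot a model reproducing a file it has memorized versus genuinely reconstructing it.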



