> I typed :rs pods to switch back to the pods view. Nothing rendered. The table was empty...
> now something was fundamentally broken and I couldn't just prompt my way out of it.
Hey, I don't want to oversimplify, and I'm sure it was complicated, but did the author have functional tests for these broken views? As long as there were functional tests passing on the previous commit, I'd have thought that Claude could look at the end situation and work out how to get the desired feature without breaking the other stuff.
TUIs aren't an exception, it's still essential to have a way to end-to-end test each view.
The problem wasn't that the view didn't work. The problem was that the view didn't work after something else had been done.
You can't test every permutation of app usage. You actually need good architecture so you can trust your tests, and trust your changes to be local with minimal side effects.
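To make that concrete, here is a minimal sketch in Python, not the author's actual TUI: all names (`ClusterState`, `App`, `render_pods`, the `VIEWS` table) are hypothetical, and only the `:rs pods` command comes from the quoted post. The idea it illustrates is that if each view is a pure function of shared state, one "switch away and back" test is enough to give some confidence about ordering, rather than needing a test per permutation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical immutable snapshot of what the views display."""
    pods: tuple = ()
    nodes: tuple = ()


def render_pods(state: ClusterState) -> list:
    # A view is a pure function of state: no hidden memory of past views.
    return [f"pod/{name}" for name in state.pods]


def render_nodes(state: ClusterState) -> list:
    return [f"node/{name}" for name in state.nodes]


# Hypothetical view registry keyed by the name used in ":rs <view>".
VIEWS = {"pods": render_pods, "nodes": render_nodes}


@dataclass
class App:
    state: ClusterState
    current: str = "pods"

    def command(self, text: str) -> None:
        # Parse commands shaped like ":rs pods"; unknown input is ignored.
        verb, _, arg = text.lstrip(":").partition(" ")
        if verb == "rs" and arg in VIEWS:
            self.current = arg

    def render(self) -> list:
        return VIEWS[self.current](self.state)


def test_pods_view_survives_switching_away_and_back() -> None:
    app = App(ClusterState(pods=("a", "b"), nodes=("n1",)))
    app.command(":rs nodes")
    app.command(":rs pods")
    assert app.render() == ["pod/a", "pod/b"]


test_pods_view_survives_switching_away_and_back()
```

The test exercises a sequence of actions rather than a single view in isolation, which is the failure mode described above. It only stays cheap because `render` cannot depend on anything except `state` and `current`.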
On the one hand, I'm not sure Dawkins has read or thought enough about how LLMs actually work. I get the impression he doesn't fully appreciate, or is somehow forgetting, that it's a text completion algorithm with a vast number of parameters, and that even if the patterns of learned parameter tunings are not really comprehensible, the architecture was very deliberately designed.
But on the other hand his thoughts at the end are interesting. Summary:
Maybe our "consciousness" is like an LLM's intelligence. But if not, then it raises the question of why we even have this "extra" consciousness, since it appears that something like a humanoid LLM would be decent at surviving. His suggestions: maybe our extra thing is an evolutionary accident (and maybe there _are_ successful organisms out there with the LLM-style non-conscious intelligence), or maybe as evolved organisms it's necessary that we really feel things like pain, which is why mechanisms like pain (and desire for food, sex, etc.) had strong adaptive benefits.
The brain uses a lot less energy than an LLM, so most probably it is something completely different. Maybe consciousness is a byproduct of the architecture of the brain, so there is no version of a humanoid with no consciousness.
I don't think you read carefully what he said. At the end he gave three quite interesting thoughts about what might be true assuming LLMs are less conscious than we are (i.e. assuming our consciousness is not a purely algorithmic phenomenon as we obviously know LLMs are).
Your notes look really interesting, thanks. I'm curious: from the prose style it's clear they were written by an LLM. For design notes like this, do you have a sort of mental TODO to go back and write them up in your own words, to make sure they really capture your own opinions?
Overall the knowledgebase is a mixture of these. I have this disclaimer on the first page:
This KB is itself agent-operated: a human directs the inquiry, AI agents draft, connect, and maintain the notes. The framework for building knowledge bases is documented using that framework.
I hope it is enough. I've seen many people get angry about LLM-generated work being published.
The article is still missing the most important point about a "trust system" -- you have to explain what it is and convince me that I even care about the problem you're trying to solve. It's my machine, what is a "trusted" or "untrusted" file? If people just force security "solutions" on me without asking me whether I understand or agree with their problem diagnosis then I will immediately disable the protection if I can or blanket accept all prompts without thinking.
This is good, but it doesn't go far enough:
> ... the problem with security measures that cause too much friction is that users tend to disable them in order to get on with their work. To fulfill its security purposes, a good trust system needs to stay out of your way.
I reached the same conclusion after comparing diagram-as-code tools — D2 feels cleaner and more expressive than Mermaid.
I’ve been working on an AI diagramming tool built around D2: https://aidiagrammaker.com/
You describe a system in plain English, and it generates architecture diagrams, flowcharts, and sequence diagrams in D2.
Edits can be made either directly in the D2 code or via a context-aware editor.
D2 produces real SVGs, but I've found they often don't display properly in other SVG editors. The D2 folks talk about that somewhere, and they have some fixes for it.
Oh, finally, something that supports actual hierarchical state diagrams (that isn't Graphviz, no offense)... Mermaid's "You cannot define transitions between internal states belonging to different composite states" [1] has driven me up a wall for years.
The language is richer and all diagram types are implemented consistently in the same language in a way that can be composed, as opposed to being a collection of unrelated DSLs.
The improved visual appearance is clear from inspecting example diagrams, I believe.
I read the first couple of posts in the series. The essay is full of criticism of LLMs, and in a couple of places the author distances himself, as if he himself isn't using them ("some people I respect tell me that...").
It's certainly worth discussing the fact that the entire industry is starting to outsource large amounts of our thinking and writing work to non-sentient statistical algorithms, but this discussion needs to honestly confront the extent to which they are successfully completing useful tasks today.
This is really cool! Is there an alternative way of thinking about it involving a hidden Markov model, looking for a change in the value of an unknown latent P(fail)? Or does your approach end up being similar to whatever the appropriate Bayesian approach to the HMM would be?
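For what it's worth, here is a rough sketch of the simplest Bayesian version of that question, which is not the approach from the post being replied to. It assumes exactly one change in a latent P(fail) somewhere in a sequence of pass/fail outcomes, a uniform prior on where the change happens, and independent Beta(1, 1) priors on the rate before and after. That is a two-state, left-to-right special case of an HMM. The function names are made up for illustration.

```python
import math


def log_beta(a: float, b: float) -> float:
    # log of the Beta function B(a, b), via log-gamma for numerical stability.
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(fails: int, total: int) -> float:
    # With a Beta(1, 1) prior on p, the integral of p^k (1 - p)^(n - k)
    # over p in [0, 1] equals B(k + 1, n - k + 1).
    return log_beta(fails + 1, total - fails + 1)


def changepoint_posterior(outcomes: list) -> dict:
    """Posterior over t, where outcomes[:t] share one rate and outcomes[t:] another.

    outcomes holds 0 (pass) or 1 (fail); needs at least two entries.
    Assumes exactly one change and a uniform prior over t in 1..n-1.
    """
    n = len(outcomes)
    if n < 2:
        raise ValueError("need at least two outcomes")

    # Prefix sums give the failure count on each side of t in O(1).
    prefix = [0]
    for x in outcomes:
        prefix.append(prefix[-1] + x)

    log_scores = []
    for t in range(1, n):
        fails_before = prefix[t]
        fails_after = prefix[n] - fails_before
        log_scores.append(
            log_marginal(fails_before, t) + log_marginal(fails_after, n - t)
        )

    # Normalise in log space to avoid underflow on long sequences.
    peak = max(log_scores)
    weights = [math.exp(s - peak) for s in log_scores]
    total = sum(weights)
    return {t: w / total for t, w in zip(range(1, n), weights)}


# Usage: 20 passes followed by 20 failures puts the most mass at t = 20.
posterior = changepoint_posterior([0] * 20 + [1] * 20)
most_likely = max(posterior, key=posterior.get)
```

This sketch has no "no change at all" hypothesis and allows only one change. A fuller HMM treatment would add a transition probability between latent states and run forward-backward over the sequence.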
Doubtless the current LLMs aren't the last word. But this author sounds like they would get more out of the current LLMs if they put their energies into that rather than into criticism.