I fell for it for a few minutes the other day. While I was debugging an issue with a device, the AI wrote "I have a strong hypothesis about the cause in the code". I asked it to write out the hypothesis & create a test plan to validate it. It made a test plan, but no hypothesis. The test plan did not reproduce the issue, and it turned out to be a hardware design problem, not in the code at all. But for a moment in there I thought it actually had a hypothesis; I forgot that it's not thinking beyond what's written in the chat. Someone who was going to reproduce & fix a bug would probably write "I have a strong hypothesis about the cause" or similar, so it played along & wrote that.
If the hypothesis is not printed out in the context, then the model cannot hold it past that turn. You could prompt it to generate said hypothesis (or set of hypotheses) first, and only then act on it. Then things might work.
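To make that concrete, here is a toy sketch in Python. It assumes the usual stateless chat setup, where each reply is computed from the transcript alone. `toy_model`, `run_chat`, the canned replies and the "timing bug" hypothesis are all made up for illustration; this is not a real model or API.

```python
from typing import List

def toy_model(transcript: List[str]) -> str:
    """Hypothetical stand-in for a chat model: a pure function of the transcript.

    Nothing survives between calls except what is in `transcript`.
    """
    # Only hypotheses that were actually written into the chat are visible here.
    written = [line for line in transcript if line.startswith("HYPOTHESIS:")]
    last = transcript[-1].lower()
    if "write the hypothesis" in last:
        return "HYPOTHESIS: timing bug in the interrupt handler"  # illustrative only
    if "test plan" in last:
        if written:
            return "TEST PLAN targeting -> " + written[-1]
        return "TEST PLAN (generic): power-cycle, check logs"
    # Plausible-sounding claim with nothing behind it in the transcript.
    return "I have a strong hypothesis about the cause."

def run_chat(user_turns: List[str]) -> List[str]:
    """Feed user turns one by one; the transcript is the only carried state."""
    transcript: List[str] = []
    for turn in user_turns:
        transcript.append("USER: " + turn)
        transcript.append(toy_model(transcript))
    return transcript

vague = run_chat(["Debug this device.", "Make a test plan."])
explicit = run_chat(
    ["Debug this device.", "Write the hypothesis first.", "Make a test plan."]
)
print(vague[-1])
print(explicit[-1])
```

In the first run the "strong hypothesis" was claimed but never written down, so the test plan has nothing to target. In the second run the hypothesis is in the transcript, so the next turn can act on it.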
Definitely not exactly a human. OTOH Low hanging fruit is low.
You could reverse that argument. The only thing that ever happens in a human mind is a sodium-potassium semi-permeable membrane balancing out (meaning going from polarized to depolarized) and triggering the tiniest of explosions, spreading one of 4 chemicals around. Repeat a few billion times per second for ~80 years.
The Eliza effect is off the scale.
What I'm trying to say is that the underlying method is not a valid reason to discredit one thinking process over another.
I remain baffled that anyone thinks dragging brains into discussions of these things does anything but make everyone more confused. This kind of thing is exactly what I'm getting at—that it's common for even people in the computer technology field to think the comparison is apt, or illuminates anything, is a wild indication of how inclined we are to be tricked by computer programs that happen to operate on language.
You are baffled because of your own ignorance of the underlying principles under discussion. Do you believe in a dualist interpretation of reality, that the process of thinking is somehow nonphysical? That these programs operate on language is beside the point. The fact you think this is why it's interesting shows you don't even understand the argument.
Are you familiar with the physical Church-Turing thesis?
The effect is not quite what you think it is, and people don't quite take the right lessons.
Similar to the Eliza effect, people still take the original reading of Clever Hans: "he couldn't really do maths, he's just taking social cues from his handler".
But what's the actual difference between Eliza, Clever Hans and RLHF? They're doing similar things, right?
Now look at how we valued that in the 20th vs 21st century:
How much does an ALU even cost anymore? Even a really good one? (It's almost never separate anymore; it's usually on the same silicon as the rest of the CPU/microcontroller.)
Meanwhile... what's the TCO to deploy a sentiment classifier? Especially a really good one?
Counterpoint: When is the last time you, as a human being, honestly did that?
This isn’t trying to be glib or contentious, it’s a commentary on the nature of human existence. If you have, then your answer will show it. If you have not, your silence or excuses will also.
All the time? This morning when I dreaded getting up so early for work. Last night when I showered. The day before after playing some board games with friends. Normal people do introspect, despite the current fad among a few oddball elites in Silicon Valley [0].
This article reads like it’s been proofread or written out from an outline or bullet points given to an AI. And ALMA’s own posts that it references are just meandering ramblings, they’re really a slog to get through.
I think I’ve always tended to immediately notice the signs of sloppy thinking in the writing style and it’s been such a reliable heuristic that AI writing kind of short circuits me. I tend to get down a couple of paragraphs before I pause and realize “Wait a minute, this isn’t SAYING anything!” Even when there is an underlying point the writing often feels like a very competent college student trying to streeeeeetch to hit a word count without wanting to actually flesh their idea out past the topic statement.
Thought is a derivative of sensory processing. An LLM does not have a physical body to interact with the world, nor does it develop itself and learn anything by experiencing the world. It has no subjective experience or subjective feeling, it has no qualia, its symbols are not grounded in physical reality, and its "thoughts" are a mere simulacrum. Anyone personifying an LLM is just derealised by convincing outputs, not realising that manipulating symbols according to rules does not imply understanding.
I mean, there are still philosophers metaphorically fist fighting about this stuff. Last time I stepped into the fray on this topic I got clapped back by someone from an area of philosophy of mind from after I graduated. It was an interesting perspective that I was unaware of, but I studied language, not mind:
> you randomly sample letters from the alphabet and those letters make up actual words, then actual sentences
That sounds like a decently apt description of how I (a human) communicate. The only thing is that I suppose you implied a uniform distribution, while my sampling approach is significantly more complicated and path-dependent.
But yes, to the extent that I have some introspective visibility into my cognitive processes, it does seem like I'm asking myself "which of the possible next letters/words I could choose would be appropriate grammatically, fit with my previous words, and help advance my goals" and then I sample from these with some non-zero temperature, to avoid being too boring/predictable.
"it" is also not "thinking". It is still randomly (though not all words are equal probabilities) sampling from a distribution of words that have been stolen and it been trained on
If "randomly sampling from a trained distribution" can't produce useful, meaningful output, then deterministic computation is even more suspect. After all, it's a strict subset. You're sampling with temperature zero from a handcrafted distribution.
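A small Python sketch of what "sampling with temperature" means, assuming the common softmax-with-temperature scheme. The logits are made up, and treating temperature zero as a plain argmax is a convention (the limit of the softmax), not something taken from any specific library.

```python
import math
import random
from typing import List

def sample(logits: List[float], temperature: float, rng: random.Random) -> int:
    """Pick an index from `logits` using softmax sampling at `temperature`.

    temperature == 0 is handled as argmax: the same input always gives
    the same output, i.e. ordinary deterministic computation.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

rng = random.Random(0)
logits = [1.0, 3.0, 2.0]  # made-up scores for three candidate tokens

greedy = [sample(logits, 0, rng) for _ in range(100)]
warm = [sample(logits, 1.0, rng) for _ in range(1000)]

print(set(greedy))        # a single index: always the highest-scoring token
print(len(set(warm)) > 1) # several different tokens show up
```

At temperature zero the "distribution" collapses onto one choice, which is the sense in which a deterministic program is the special case. Raising the temperature only spreads the choice across lower-scoring options.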
(This post is directionally OK, but there's many a devil in the details.)
I don't think it did any of that.