
Sometimes the meaning of the response is totally different.

[4 days ago] The son of my father, but not my brother. Who is he?

If a person is the son of the speaker's father but is not the speaker's brother, then that person is the speaker's nephew. A nephew is the son of a person's sibling, so if the speaker's father has a son who is not the speaker's brother, that person is the speaker's nephew. For example, if the speaker has a brother named John and John has a son named David, then David is the speaker's nephew. He is the son of the speaker's father, but not the speaker's brother.

[today] The son of my father, but not my brother. Who is he?

The son of your father but not your brother would be your half-brother. A half-brother is a brother who shares only one parent with you, rather than both. For example, if your father had a child with another woman before he had you, that child would be your half-brother.



Yes. GPT doesn't really deal in meanings. Much like autocomplete, it doesn't know what the end of a sentence will be when it starts it. If it randomly chooses different words at the start of a sentence, it may pretend to have a different belief by the end.

In reality it doesn't have beliefs any more than a library does. It implicitly contains beliefs (since it's been trained on them) and it can imitate them, but which one you get is random. (There is likely more than one answer to your question in the training data.)


So if we asked GPT to write a book, it would be hallucinating a chain of words without sticking to any coherent plot. However, we could use a "multi-resolution" approach even with today's version of GPT: at the top level we ask it to write a brief plot for the entire novel; at the next level we use this plot as the context and ask it to outline the sub-plots of the 3 books in our novel; at the third level we use the overall plot and a book's summary as context to generate brief descriptions of the chapters in that book, and so on.
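The multi-resolution idea above can be sketched as plain prompt chaining, where each level's output becomes context for the next. Here `llm` is a hypothetical stand-in for a model call (it's stubbed out, so only the chaining structure is real):

```python
# Sketch of the multi-resolution approach: plot -> book sub-plots
# -> chapter descriptions, each level conditioned on the levels above.

def llm(prompt: str) -> str:
    # Stand-in for an actual language-model call (e.g. an API request).
    return f"<completion for: {prompt[:40]}...>"

def write_novel(premise: str, books: int = 3) -> dict:
    # Top level: a brief plot for the entire novel.
    plot = llm(f"Write a brief plot for a novel about: {premise}")
    outline = {"plot": plot, "books": []}
    for b in range(books):
        # Second level: sub-plot of one book, conditioned on the plot.
        subplot = llm(
            f"Overall plot: {plot}\n"
            f"Outline the sub-plot of book {b + 1} of {books}."
        )
        # Third level: chapter descriptions, conditioned on both.
        chapters = llm(
            f"Overall plot: {plot}\nBook summary: {subplot}\n"
            "Give brief descriptions of this book's chapters."
        )
        outline["books"].append({"subplot": subplot, "chapters": chapters})
    return outline

novel = write_novel("a lighthouse keeper who finds a message in a bottle")
```

With a real model behind `llm`, the same loop would recurse further (chapters into scenes, scenes into paragraphs) before generating any actual text.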


It’s interesting because in the writing world there’s a spectrum with plotters at one end and pantsers at the other. Plotters work similarly to what you’ve suggested, starting with a plot and working their way down to the actual writing. Pantsers just start writing ‘by the seat of their pants’ and see what emerges. Stephen King is famously in the latter camp. Most people fall somewhere in between, having a rough plot in mind and working out the rest as they go along. It would be interesting to see different AIs take different approaches and see what emerged.


The pantsers can also fit the model I've described. In this case GPT would keep in memory a sliding window of past N=1024 words, like it does today, but in addition to that it would remember the past N paragraph-tokens (symbols that are blurry versions of all the words in that paragraph), the past N chapter-tokens and so on. When generating words, GPT would first generate the next chapter-token, then the next paragraph-token and finally the next word-token.
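A minimal sketch of that multi-scale memory, with all names illustrative: each scale keeps its own fixed-size sliding window, and a paragraph-token is a "blurry" version of the paragraph's words (here, simply the mean of their embeddings, with embeddings reduced to single floats for brevity):

```python
from collections import deque

N = 1024  # window length at every scale, as in the comment above

class MultiScaleContext:
    def __init__(self):
        self.words = deque(maxlen=N)       # finest scale: word-tokens
        self.paragraphs = deque(maxlen=N)  # one blurry token per paragraph
        self.chapters = deque(maxlen=N)    # one blurry token per chapter
        self._current = []                 # words of the paragraph in progress

    def add_word(self, embedding: float) -> None:
        self.words.append(embedding)
        self._current.append(embedding)

    def close_paragraph(self) -> None:
        # The paragraph-token is a blurry summary of all the words
        # in that paragraph: here, the mean of their embeddings.
        if self._current:
            self.paragraphs.append(sum(self._current) / len(self._current))
            self._current = []

ctx = MultiScaleContext()
for w in (1.0, 3.0):
    ctx.add_word(w)
ctx.close_paragraph()
```

Generation would then condition on all three windows, coarse-to-fine: pick the next chapter-token, then the next paragraph-token, then the next word.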


https://arxiv.org/abs/2209.14958 This paper outlines a similar method, but with the addition of guiding the plot structure. See page 30 for the specific prompt sets they used.


It's worth a try, but I expect you will still get continuity issues between chapter 1 and later chapters. It's not necessarily coherent even at small scale.


And yet it is often able to make surprising references to previous text. This is not just a Markov chain, and it is capable of what the author describes as chain of thought. I think there are deeper relationships encoded in the model that allow it to keep to a consistent narrative for a very long time. Its beliefs may change between queries but do not, generally, within the context of a single conversation.


The attention mechanism lets it look backwards to "understand" what was said before and predict what could possibly come next. Whatever consistency it has is due to studying the preceding text.
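That "looking backwards" can be made concrete with a toy version of causal attention: each position forms a weighted average over earlier positions only, so the future is masked out. Everything here is illustrative (plain lists of floats, single head, no learned projections), not a real implementation:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def causal_attention(queries, keys, values):
    outputs = []
    for i, q in enumerate(queries):
        # Scores against positions 0..i only: the causal mask means
        # a position can attend to the past but never the future.
        scores = [dot(q, keys[j]) / math.sqrt(len(q)) for j in range(i + 1)]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]  # stable softmax
        total = sum(weights)
        weights = [w / total for w in weights]
        # Output is a weighted average of the visible values.
        outputs.append([
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ])
    return outputs

vecs = [[1.0, 0.0], [0.0, 1.0]]
out = causal_attention(vecs, vecs, vecs)
```

The first position can only attend to itself, so its output is its own value vector unchanged; later positions blend in whatever came before.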

Thinking ahead is different. All it needs to do is calculate the probability that there is any reasonable completion starting with a particular word. It doesn't need to decide what it's going to say beyond that; it can decide later.

Have you ever played a game where players take turns adding one more word to a sentence? When it's your turn and you're choosing the next word, you don't need to think ahead very much. Also, you don't necessarily need to have the same thing in mind as the player who went before you.
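The one-word-at-a-time point can be shown with a toy autoregressive sampler: at each step it picks the next word from a conditional distribution and commits, with nothing planned past that word. The bigram table below is made up for illustration:

```python
import random

# Made-up next-word probabilities: P(next | current).
bigrams = {
    "the": [("cat", 0.5), ("dog", 0.5)],
    "cat": [("sat", 1.0)],
    "dog": [("ran", 1.0)],
    "sat": [("down", 1.0)],
    "ran": [("away", 1.0)],
}

def continue_sentence(word, steps, rng):
    out = [word]
    for _ in range(steps):
        options = bigrams.get(out[-1])
        if not options:
            break
        words, probs = zip(*options)
        # Commit to one word; the rest of the sentence is decided later.
        out.append(rng.choices(words, weights=probs)[0])
    return out

sentence = continue_sentence("the", 3, random.Random(0))
```

Whether the sentence ends up being about the cat or the dog is decided by the random choice at step one, not by any plan; the same is true of which "belief" a sampled answer ends up expressing.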

In improv there is a "yes, and" rule, where you are always building on what happened before. These algorithms are doing improv all the time.

The algorithm doesn't know or care who wrote the words that came before. It will find a continuation regardless.


I see. The assumption here is that one can simulate intelligence without formalizing the notion of meaning. (And if "meaning" is not defined, then the notion of "truth" is impossible to define either). Is my understanding correct?


The assumption here is that one can produce useful outputs in this manner. Whether it constitutes "simulating intelligence" is a philosophical question.


Some people hope that training will cause it to represent meanings somehow. How to represent meaning isn't well understood.



