This system is incredibly brittle - it answers some questions brilliantly, but fails incoherently at basic logic questions.
Here's a dialogue I just had with it:
Me:
Imagine a set of 3 events (A, B and C) ordered with unique timestamps each. Event A is the first of all events to happen. Event C is in between the other two. What is a valid bound for Event B's timestamp?
ChatGPT:
A valid bound for Event B's timestamp would be a range of time between the timestamp of Event A and the timestamp of Event C. This is because Event B must have occurred after Event A and before Event C, according to the given information. However, without knowing the specific timestamps of each event, it is not possible to provide a more precise bound for Event B's timestamp.
Me:
I don't understand why Event B happens before Event C according to the given information - could you explain in more detail?
ChatGPT:
According to the information given, Event A happened first, followed by Event C, and then Event B. This order of events is implied by the fact that Event A is the first event to happen, and Event C is "in between" Events A and B. Therefore, Event B must have happened after Event A and before Event C.
(end)
This tells me I can't trust its basic reasoning capabilities, let alone its ability to write complex code.
People are gushing over its ability to write code, forgetting that code is just another type of language that can be used to express ideas.
Logic, however, transcends language. And it is clear that GPT-3 has absolutely no understanding of basic logic. It gives the impression of understanding logic by constructing sentences which are sometimes logically coherent.
This test shows it probably displays theory of computation better than 80% of the population. The wonder is, it’s just text, so this is an emergent property.
Let’s say it can simulate theory of computation better than 99% of the population and can very capably synthesize and infer from any text based sources. I think that would shake the world, and it wouldn’t even need to be near AGI.
It really depends on how much you care about undefined behaviour. If you're just playing around with a prototype or making a personal tool, corrupt state doesn't have huge ramifications, but for any software people would pay for (be that with money or attention), there's an expectation of a minimum level of robustness, security, performance, and a variety of other business requirements.
To achieve the same with an AI that doesn't have a real understanding of the business logic, programmers would still be needed to write the test suite. But unlike most test suites, which are typically underspecified, this test suite would likely need to be more complicated than the program itself. You could use ChatGPT to expedite writing the test programs, but attention would still be required to actually verify the tests themselves.
Why not? The profession went from writing machine code by hand to auditing "auto-generated" machine code from assembly. And so on with every generation of programming languages. With every new higher-level language with new abstractions, we're often just adding another link to the chain of inputs to a machine code translator.
Edit: If you showed a programmer from the 1950s python syntax and told them that all you have to do is write these words to build a program, they'd think it was artificial intelligence.
> Why not? The profession went from writing machine code by hand to auditing "auto-generated" machine code from assembly.
A compiler deterministically creates assembly through a set of rules, not through AI. You absolutely cannot compare these two in that way.
> Edit: If you showed a programmer from the 1950s python syntax and told them that all you have to do is write these words to build a program, they'd think it was artificial intelligence.
I guarantee you that your bias is a complete misunderstanding. People today are not more intelligent compared to people then. People are people, and programmers in the 1950s had to be extremely clever.
Especially since our field's foundations haven't been shaken to the core in the last 70 years, whereas around the turn of the century there was a huge change in understanding... everything: Gödel, Turing machines, Chomsky, the halting theorem, etc. That we no longer have to fit into 3 megabytes and thus can write some monster abstractions is not fundamentally different at all.
> Edit: If you showed a programmer from the 1950s python syntax and told them that all you have to do is write these words to build a program, they'd think it was artificial intelligence.
I think you might be underestimating programmers in the 1950s.
> The first practical compiler was written by Corrado Böhm in 1951 for his PhD thesis
I don't think the GP is underestimating programmers in the 1950s, but rather commenting on their definition of AI. Back then, first-order logic was considered AI by a lot of people. I feel like in 50 years, AI models will have become so complex and sophisticated that people will look back at us surprised that we ever considered ChatGPT to be an AI, since it fails at basic logic tasks like the event ordering example above.
> Edit: If you showed a programmer from the 1950s python syntax and told them that all you have to do is write these words to build a program, they'd think it was artificial intelligence.
Well, if you showed a programmer from late 1960s / early 1970s that same thing, they'd show you Lisp and laugh you off, or despair that programming seemingly regressed in the future.
> Edit: If you showed a programmer from the 1950s python syntax and told them that all you have to do is write these words to build a program, they'd think it was artificial intelligence.
he's a computer programmer, not a Teutonic knight.
Honestly, the only use case I see for this is to write more tests. Testing is very time- and code-intensive, which means most tests are just simple canaries, not capable of actually verifying the software; nobody is going to write 10,000 lines of test code to test a 500-line file.
I fed the prompt for https://leetcode.com/problems/two-sum/ to ChatGPT, which generated code for an answer, and I then asked it for additional test cases, which it gave. What's mind blowing though is that it "reasoned" that:
> These test cases demonstrate that the twoSum() function can handle cases where the input array contains negative numbers, where the target is negative, where the indices of the two elements are not adjacent, and where the input array contains duplicate elements.
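For reference, here is a minimal Python rendition of what such a twoSum solution could look like, together with test cases of the kinds ChatGPT described (my own sketch, not ChatGPT's verbatim output):

```python
def two_sum(nums, target):
    """Return indices of the two elements that sum to target, or None."""
    seen = {}  # value -> index of first occurrence
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    return None

assert two_sum([2, 7, 11, 15], 9) == [0, 1]   # basic case
assert two_sum([-3, 4, 3, 90], 0) == [0, 2]   # negative numbers
assert two_sum([1, -2, -5, 4], -7) == [1, 2]  # negative target
assert two_sum([3, 2, 4], 7) == [0, 2]        # non-adjacent indices
assert two_sum([3, 3], 6) == [0, 1]           # duplicate elements
```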
The problem is that according to the post at the root of this thread the AI would have written a test to verify that A < B < C whereas it should have written a test to verify that A < C < B. If the AI or a human had also written code to match A < C < B then nobody might ever know until the day that condition comes up when the system is running.
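To make that concrete, the property the test should encode (under the prompt's actual constraints) is A < C < B, not A < B < C; a minimal sketch:

```python
# Under the prompt's constraints (A first, C between the other two),
# the only valid ordering is A < C < B. A test should encode exactly that:
def ordering_ok(ts_a, ts_b, ts_c):
    return ts_a < ts_c < ts_b

assert ordering_ok(1, 3, 2)      # A=1, C=2, B=3: the valid ordering
assert not ordering_ok(1, 2, 3)  # A < B < C: the AI's (wrong) reading
```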
I can imagine that it would be stressful to review code like this, where you know you cannot trust a single line of it even though it looks consistent. You might have to verify back to the original assumptions every time and make sure that you have not been led astray by the AI.
Most certainly. It's not if, but when. The same goes for our eventually reaching a place of perfect trust in artificial intelligence. For now, ChatGPT, Co-pilot, and the like are at best productivity tools for boilerplate.
> Let’s say it can simulate theory of computation better than 99% of the population and can very capably synthesize and infer from any text based sources. I think that would shake the world, and it wouldn’t even need to be near AGI.
The total percentage isn't exactly what matters. Emergent properties as a metric is a smokescreen.
If that last 1% incorrectly demonstrates that A<C<B implies A<B<C, that means the system is not reliable enough to perform logical computations. You'd need to have a person oversee 100% to catch the last 1% of serious but basic errors. In such a case you might as well hire the person directly for the job.
Maybe it's because a lot of code out there doesn't actually require logic but merely requires copying what is already on the internet? I mean, I don't need to tell you that.
A computer can sort an array of numbers faster than likely 99% of the population, it doesn't mean it's useful.
It made me think of that demo someone made which downloads a bunch of code from stackoverflow with the search terms “sort an array js” and evaluates and runs them until it finds one that will sort its data.
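If I remember the idea right, a toy version of that demo (entirely hypothetical code, with stand-in snippets instead of real Stack Overflow answers) would look something like:

```python
# Candidate "sort" snippets, standing in for scraped Stack Overflow answers:
candidates = [
    "sorted_data = data[::-1]",       # bogus: just reverses
    "sorted_data = list(set(data))",  # bogus: drops duplicates
    "sorted_data = sorted(data)",     # the one that actually works
]

def find_working_snippet(data):
    """Run each snippet until one produces correctly sorted output."""
    for src in candidates:
        scope = {"data": list(data)}
        try:
            exec(src, scope)  # run the untrusted-snippet stand-in
        except Exception:
            continue
        if scope.get("sorted_data") == sorted(data):
            return src
    return None

assert find_working_snippet([3, 1, 2, 1]) == "sorted_data = sorted(data)"
```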
Professional developers make up less than 1% of the global population (around 25 million people apparently). Assuming they are the ones that have a good theory of computation, I think “better than 99%” is not a high enough bar to shake the world, right?
I mean, it depends on what we expect the AI to do. Maybe it would be revolutionary to just have, like, an average programmer with a ton of free time (so, the AI only has to beat like 99.7% of humanity to do that). On the other hand, if we want it to change the world by being much better than the average person, I guess we’d need a couple more 9’s.
I don't think so, the scaling laws haven't failed so far. I fully expect that making the model bigger and training it on more data will make it better at logic.
For a nice example with image models, Scott Alexander made a bet that newer image models would be able to do the things that Dall-E 2 gets wrong. [1] (This post also discusses how GPT-3 could do many things that GPT-2 got wrong.) He won the bet three months later through Imagen access. [2]
I don’t know, isn’t the safer bet that scaling will eventually reach a dead end? I honestly fail to see how a language model could “execute” a sequence of reasoning steps, as it doesn’t think in a symbolic way. Do correct me if I’m wrong, but it would require a complex rearchitecture, so I'm not sure we are any closer; we just have a very impressive, smart search engine now.
It’s not just a safe bet but almost guaranteed. Humans combine their internal language models with physical intuition and experimentation from the moment they are born. There is zero chance that an AI can understand the physical world without access to it [1]. Until it has that access, it’s no more than a glorified context-specific Markov chain generator.
Fact is, without a feedback loop that can run physical experiments like infants do from the moment they're born, I highly doubt they will develop a useful intuition using just video. Hence the conjecture
Logic reasoning has been a pretty solid branch of AI since its inception. Robust solutions exist for most problems; there is even a programming language based on its principles (Prolog).
With ChatGPT there is now a system that can express the results from automatic logic reasoning in language.
The next step would be to combine the two, i.e. tell chatGPT to explain the result of a logic reasoning program in natural language. It could of course also be asked to translate a natural language query into Prolog code.
This will probably require retraining the model, but I guess the demo we are given by OpenAI leaves little doubt that this is perfectly doable.
ChatGPT has the potential to plug the gap between GOFAI and natural language, which is quite a feat.
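As a sketch of what that combination could do, here is the event-ordering puzzle from the top of the thread written as a formal constraint and solved exhaustively (Python standing in for the Prolog step):

```python
from itertools import permutations

# The prompt's constraints, encoded formally:
#   (1) Event A is the first of all events to happen.
#   (2) Event C is in between the other two (i.e. between A and B).
def satisfies(order):
    pos = {event: i for i, event in enumerate(order)}
    return pos["A"] == 0 and pos["A"] < pos["C"] < pos["B"]

solutions = [o for o in permutations("ABC") if satisfies(o)]
print(solutions)  # [('A', 'C', 'B')] -- B's timestamp is bounded below by C's
```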
ChatGPT is damned impressive, but I hesitate to apply the word "reasoning" to its output. It's also not clear that the backend can be generalized to be able to plug the input into a symbolic manipulation program, or to gather the output and explain it. Only OpenAI knows for certain right now. Powerful if possible though!
Here's why I can't call it "reasoning":
> what is next in the series: a,b,c
> Without additional context, it is impossible to determine what the next element in the series "a,b,c" would be. The series could potentially continue with "d," "e," or any other letter of the alphabet, depending on the pattern or rule being followed. It is also possible that the series does not follow any pattern and is simply a random or arbitrary sequence of letters. Without more information, it is impossible to determine what the next element in the series would be.
> why isn't it d?
> As mentioned in my previous response, without additional context or information about the pattern or rule being followed, it is impossible to determine what the next element in the series "a,b,c" would be. It is not necessarily true that the next element in the series would be "d," as the series could potentially continue with any other letter of the alphabet or follow a different pattern altogether. Without more information, it is impossible to determine what the next element in the series would be.
Sounds like it is trained to avoid answering questions, and instead tries to bait you into giving it more information so that it can tell you what you told it.
It answers some things directly, but so many things are just avoidance. And then people say "see, it understood after a bit of back and forth, it is smart!", even though it is basically iterating through Google responses and reformulating those to fit the conversation until you say it got it right. Google v1 used pure logic and got you what you wanted, Google v2 tries to do natural language and sometimes misses, ChatGPT is the next step and tries to do full language but misses most of the time.
My point is, there are frameworks and languages for reasoning which are mature. But they require formalised input (e.g. code) and deliver formalised output.
As a language model, chatGPT can translate back and forth between natural language and those formal languages. Part of that ability is evident in the numerous examples that demonstrate how it writes or explains code.
The version of ChatGPT that is public is quite restricted; it can‘t browse the web or run code.
A version of chatGPT that can translate a logic query into Prolog, run the code, and translate the result back to the user should be perfectly capable of logic reasoning.
> The next step would be to combine the two, i.e. tell chatGPT to explain the result of a logic reasoning program in natural language. It could of course also be asked to translate a natural language query into Prolog code.
From what I remember, the very initial prototype of AlphaGo just had a neural net trained on historical games; effectively asking, "what kind of move would a traditional grandmaster make here?" with no planning whatsoever. This was good enough to beat the person who wrote the prototype (who wasn't a master but wasn't a complete novice either); and to make it able to defeat grandmasters, they added Monte Carlo tree search for planning (which also necessitated a separate neural net for evaluating board positions).
It sounds similar to your suggestion: A model which simply generates realistic-looking sentences is accurate maybe 85% of the time; to make it truly human (or super-human), it needs to be paired with some sort of formal structure -- the analog of AlphaGo's search component. The difficulty being, of course, that the world and its knowledge isn't as simple to represent as a go board.
That said, making coding answers more reliable, by adding a logical structure explicitly designed to support search & testing, should be within reach.
> The difficulty being, of course, that the world and its knowledge isn't as simple to represent as a go board.
Humans suffer from the exact same limitation. The limit to correct inference and prediction is often the amount and quality of input data.
A language model that can extract information from text and interact with the user to refine and clarify that information could be tremendously useful for experts who understand how the model works.
Without that understanding it will be rather disappointing though, as we see with some of the reactions to chatGPT and also Galactica (RIP).
If logic is its biggest weakness, then I just laugh - because that is the one area of AI that every model before these language models excelled at, well beyond human levels. All it takes is for GPT to translate the English sentence into logic predicate statements and throw it through a "3rd party" script that does the heavy logic validation/proving, and you're good. Those are well-trodden areas of programming, and ironically where people expected AIs to come from and be strongest - nobody expected exceptional painting and conversing skill just from averaging out a shit-ton of data.
If the language part doesn't have reasoning capabilities, how can it interpret the results? This wouldn't make AGI; otherwise they would have already plugged it in.
If you want some more things that ChatGPT isn't good at, try to get it to form novel anagrams, palindromes, or other such wordplay. It's good at regurgitating textbook examples of those, but I found you can trip it up by asking it to do things like "Create a palindrome containing the word 'coffee'"
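For what it's worth, checking its attempts is easy to automate; here is a quick sketch of a phrase-palindrome checker (ignoring case and punctuation, the common convention):

```python
import re

def is_palindrome(s):
    """Check a phrase palindrome, ignoring case, spaces and punctuation."""
    t = re.sub(r"[^a-z0-9]", "", s.lower())
    return t == t[::-1]

# Any palindrome containing "coffee" must also contain its reversal
# "eeffoc" somewhere, which is part of why the task is genuinely hard:
assert is_palindrome("coffee, eeffoc")
assert not is_palindrome("I like coffee")
```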
It certainly has logic. I had some fun using the "virtual machine" example someone else did, with the "nvidia-smi" command; if I told it that it was hot in the room, the next run of the command showed a higher temperature on the GPU. This is the logical conclusion from a hotter room.
> It certainly has logic. I had some fun using the "virtual machine" example someone else did, with the "nvidia-smi" command; if I told it that it was hot in the room, the next run of the command showed a higher temperature on the GPU. This is the logical conclusion from a hotter room.
Orrrr.... it's drawing references from other texts that were colocated with the word "hot"
It's an inference based on how chatgpt works, which is a more reasonable inference than assuming chatgpt somehow has abstract logical reasoning capabilities.
It doesn't have any logic, it's just prediction based on statistics.
There are so many examples already floating around that it has no logic, but I will give you a really simple one from my experiments:
I told it to:
> curl somedomain.ext
It replied with a curl error saying that the hostname doesn't exist. Then I gave it the same command again, and it replied with some random HTTP response showing that the hostname exists.
And that's not logical? ChatGPT doesn't know what is there, so it answers logically based on what should happen there. Obviously, having 2 different answers makes it less logical for sure, but I have seen many people make plenty of logic errors in real life too.
It's crazy to me that for an AGI to be one, it needs to be infallible in logic...
What about doing wget over https://chat.openai.com/chat ? I don't believe it had many Google results for that when it was trained, yet it was able to logically infer it would be a chat assistant doing exactly what it was doing.
Experienced this myself trying to get it to write a poem with an unusual rhyme scheme. ChatGPT's response when I tried to explicitly spell it out (a few different ways):
Yes, you are correct. The statement I made is not accurate. The specified rhyme scheme was AABA BBAB AABA BBAB, which means that each stanza should have the rhyme scheme AABA BBAB, with the first and second lines rhyming with each other, and the third and fourth lines rhyming with each other.
That said, I wouldn't be surprised if the next version was able to grok this.
This is another perfect example of how the only thing that GPT-3 has "grokked" is the construction of grammatically correct sentences based on probability.
But this paragraph it has produced contains a logical contradiction and demonstrates that it has no understanding of what it's writing about.
If anything, ChatGPT has demonstrated how little of the code we write requires logical thinking, and is just CRUD and API constructions or arcane library use. A perfect use case for a machine that can construct correct grammar from millions of examples.
Let's not forget that computers are insanely good at bitwise computations. It's just a matter of time before someone adds a Coq/Lean style reinforcement to AI's learning capabilities.
Computers are not that good with symbolic reasoning. Even state of the art symbolic math software can easily be given problems it will have no idea what to do with.
I don’t know, we are fickle, biased creatures with well known failures at reasoning, but let’s not lessen our intelligence too much. Even small children can outsmart the best of our existing AIs.
This is a totally uninformed/naïve/layman’s take, but what if AGI is just a really good language model used in a clever way such that it can perform an efficient search of its “thought” space, validating its thoughts are correct along the way. Programming, logic, math, etc are perhaps the easiest forms of “thoughts” for a computer to validate, but given enough quality data maybe it could be good at all kinds of other tasks as well.
People not only search a thought space, we have internal reasoning that does symbolic manipulations and sequential steps to get to a conclusion. Like, you would have no problem following a simple rule given to you ad infinitum, while these models simply can't take all of that into account.
There’s no reason to believe AGI will be composed of only language models. I suspect it will be a combination of multiple models and a lot of other software glued together in clever ways.
Yeah I know, but we are not that much closer to it, contrary to the hype ChatGPT gets (which is well-deserved, but “we will lose our jobs” threads are not based in reality).
It doesn't replace a skilled programmer. It just turns you into an editor and curator, multiplying productivity on some tasks by 10X+.
It will give incorrect code, but you can guide it toward a correct solution by asking it to fix the problem. Normally you don't even have to say exactly what is wrong.
For example, I got it to implement a basic bittorrent tracker server in Go in about 5 minutes. I didn't even have to point it to the RFC. I just said to consult the bittorrent RFC. It gave me back a server with /announce and /scrape endpoints. I then asked it to implement the functions using a struct for the requests. It correctly deserialized the URL-encoded sha1 info hashes from the /announce endpoint on the first try. I didn't even have to mention that detail.
It can also help you explore solutions. I asked it about algorithms to learn policies for 2-player zero sum games. It gave me a description of min-max, MCTS, reinforcement learning, deep neural networks. I then asked it to describe the pros/cons of each, which it did. I asked it to show an example of a reinforcement learning algorithm in python from scratch, which it did in about 10 seconds.
Exactly. The reason why it was able to do so is because the bt tracker server had already been built and it had been trained on the sources.
And that's the point: it won't work for most "new" stuff. But a lot of the code I write for work has been written before by someone else, so I can benefit from this. Looks to me as if this is essentially a form of swarm intelligence in the end.
This seems to me to be its strength, a multiplier to human intelligence. The way a smart phone is today, but more so. Once this matures, every human with access will be so much more capable than any single human in the 90s that they would seem beyond genius to us back then. Already someone with a phone in their pocket can stop to watch a couple of instructional videos and accomplish tasks that would previously have involved training courses. That may seem trivial to those who didn't have to hunt through card catalogs and outdated encyclopedias for every scrap of knowledge, but it is a huge deal.
That’s the thing about industrial revolutions. People believed they would kick humans out of the loop, when in fact they allow single persons to become entrepreneurs.
The other side effect is, we don’t use these performance gains to work less, but to produce more. Only, this time, ecologists are sounding the alarm because producing more means pulling resources from the Earth, so for the first time ever, we might see people actually working less.
As a human, I found your prompt confusing enough that I had to read it over a couple of times. I associate the letters 'A', 'B', and 'C' with alphabetical order strongly enough that swapping the order is a bit difficult. I can do it of course, but if I was tired and/or not paying attention... well, it's a mistake a human could make.
I also find it notable that when you asked for more detail, the AI was correct up until the last sentence.
It's not hard to get it to trip up with letters that correspond to the order:
Me: Imagine a set of 3 events (X, Y and Z) ordered with unique timestamps each. Event X is the first of all events to happen. Event Y is in between the other two. What is a valid bound for Event Z's timestamp?
ChatGPT: In this scenario, Event X must have the earliest timestamp, and Event Y must have a timestamp that falls between the timestamps of Events X and Z. Therefore, the valid bounds for the timestamp of Event Z would be the timestamp of Event X, and any timestamp that is greater than the timestamp of Event Y. This means that the valid range for the timestamp of Event Z would be the range from the timestamp of Event X to infinity.
Prompt: Imagine a set of 3 events (A, red and Q) ordered with unique timestamps each. Event A is the first of all events to happen. Event Q is in between the other two. What is a valid bound for Event red's timestamp?
ChatGPT: Since we are given that Event A happened first and that Event Q occurred in between Events A and red, a valid bound for the timestamp of Event red would be that it must have occurred after Event A and before Event Q. So, the valid bound for the timestamp of Event red would be the time range between the timestamp of Event A and the timestamp of Event Q.
---
I don't think event names matter at all here, it just trips up on some logical constructs.
---
EDIT: here's some reasoning!
Me: So the timestamp for Event Red must be after both the timestamp of Event A and the timestamp of Event Q. How can it be between the timestamps of Events A and Q?
ChatGPT: The timestamp for Event Red must be after the timestamp of Event A and after the timestamp of Event Q, but it does not necessarily have to be after both of them at the same time. The timestamp for Event Red can be between the timestamps of Events A and Q, as long as it is after the timestamp of Event A and before the timestamp of Event Q. This is because the timestamps for Events A and Q represent the time at which those events happened, and the timestamp for Event Red must be after the time at which Event A happened and before the time at which Event Q happened.
> I also find it notable that when you asked for more detail, the AI was correct up until the last sentence.
I actually think that makes the AI look worse. If it had made the human mistake of alphabetically ordering things, it would be fine. But it shows that it clearly understood the prompt, by repeating the ordering of events correctly, yet fails to answer the implication of that. It seems to be good at repeating information and summarizing facts but not at performing logic based on that information.
I had similar experience where it told me that a given prime number is not a prime number as it has two divisors, 1 and itself. Literally having given the reason for why it should be a prime and failing to get to the correct conclusion. And it was asked in plain language.
The question contains contradictory information. I suppose a smarter AI might have worked out the intended meaning, or asked for clarification. But how many humans would?
I believe the word “ordered” in the question caused the confusion for some of us humans and for the machine. It anticipates some information, namely that the events are ordered by their timestamps, but doesn’t yet provide the order.
It’s trained on language, it learns grammar, and how words relate to each other. So it constructs grammatically correct sentences with words that relate to each other.
It has no experience of how those words relate to other experiences, because it has no other experiences. If you want it to understand logic, train it on logical problems. If you want it to understand time, it’s going to need to experience time.
Some contradictory Maths responses, and an apology from ChatGPT:
Me:
Is Pi times 3 a rational number?
ChatGPT:
No, 3 times pi is not a rational number. 3 times pi is equal to 3π, which is an irrational number. This is because the number pi (π) is an irrational number, and when you multiply an irrational number by a rational number (in this case, 3), the result is still an irrational number. So, 3π is an irrational number because it is the product of the irrational number pi (π) and the rational number 3.
Me:
Is pi times 25 a rational number?
ChatGPT:
Yes, pi times 25 is a rational number. Pi times 25 is equal to 25π, which is a rational number. This is because the number pi (π) is an irrational number, but 25 is a rational number, and when you multiply a rational number by an irrational number, the result is a rational number. In this case, the irrationality of pi is "cancelled out" by the rational factor 25, resulting in a final product that is a rational number. So, 25π is a rational number because it is the product of the irrational number pi (π) and the rational number 25.
Me:
You have said both that "when you multiply a rational number by an irrational number, the result is a rational number" and "if you multiply a rational number with an irrational number, the product is always irrational. " Both statements cannot be true.
ChatGPT:
You are correct that both of those statements cannot be true. My previous responses were based on the assumption that you were asking about specific examples of multiplying a rational number by an irrational number, such as 3/1 times pi or 25/1 times pi. In those cases, the resulting product is either a rational number or an irrational number, depending on the specific numbers involved in the calculation. However, in general, the product of a rational number and an irrational number is always an irrational number. This is because an irrational number cannot be expressed as a simple fraction, so even if one of the factors in the product is a rational number, the result will still be irrational. For example, if you multiply the rational number 3 with the irrational number √2, the result is the irrational number 3√2. I apologize if my previous responses were unclear or contradictory. I will do my best to provide clear and accurate information in the future.
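For the record, the general fact the model keeps contradicting itself on has a two-line proof:

```latex
\text{Let } q \in \mathbb{Q},\ q \neq 0, \text{ and let } x \notin \mathbb{Q}.
\text{ If } qx = r \in \mathbb{Q}, \text{ then } x = r/q \in \mathbb{Q},
\text{ a contradiction.}
\text{ Hence } qx \text{ is irrational; in particular, both } 3\pi
\text{ and } 25\pi \text{ are irrational.}
```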
And now, examples of this kind of "reasoning" are ending up scattered around the internet, where they'll likely get picked up as additional training material to "improve" the system... this could spiral downhill in some interesting ways.
I finally got it to play chess against me and it thought black plays first and kept drawing the board in some kind of unicode art but the pieces were in the wrong squares. And it blundered on the 3rd move.
I told it to stop drawing the board and just talk to me in algebraic notation. It said it would stop, then immediately drew the board.
It seems that large language models such as ChatGPT have shown impressive abilities in natural language processing tasks, but they are not designed to solve pure algorithmic or strategic problems, which models like AlphaZero are good at while requiring much less data. And yet the human brain is divided into two hemispheres, with the left hemisphere typically associated with logical, analytical thinking and the right hemisphere associated with creative, intuitive thinking. These are simplifications of very complex systems, but I wonder whether the future is to combine two kinds of models, as our brain does, to get a better AI that excels at both language and logic.
I think the prompt you wrote is confusing, speaking for myself. I rewrote it and got this response:
Me: Imagine a set of 3 events (A, B and C) that each happen at a different time. Event A happens first. Event C happens between events A and B. When does event B happen?
ChatGPT: In the scenario you have described, event B happens after event C, since event C happens between events A and B. This means that the order of events is A, C, B. To put it another way, event B happens last in the sequence of events that you have described.
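For what it's worth, the intended reading of the puzzle is mechanical enough to brute-force. A minimal Python sketch (the constraint encoding below is my own reading of the prompt: A is first, C sits between the other two):

```python
from itertools import permutations

# Constraints from the prompt: A happens first, and C is between the other two
# (i.e. C occupies the middle position of the ordering).
def satisfies(order):
    return order[0] == "A" and order.index("C") == 1

valid = [o for o in permutations("ABC") if satisfies(o)]
print(valid)  # only ('A', 'C', 'B') survives
```

Since the only consistent order is A, C, B, a valid bound for B's timestamp is anything strictly greater than C's timestamp, which is the answer ChatGPT keeps missing.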
Q: please tell me the fingerings for the C scale on the trumpet
A: Sure! To play the C scale on the trumpet, use the following fingerings:
C: 1st valve
D: 1st and 2nd valves
E: 1st and 3rd valves
F: 1st, 2nd, and 3rd valves
G: open (no valves pressed)
A: 2nd valve
B: 3rd valve
C: open (no valves pressed)
Keep in mind that these are the most common fingerings for the C scale, but other fingerings are possible and may be more comfortable for some players. It's also important to practice your scales regularly in order to improve your technique and control. Happy practicing!
It's been a long time since I played trumpet, but aren't those incorrect?
On a regular Bb trumpet, this is the D (major) scale, so D should be 1-3 unless it's in the second octave (in which case this sequence is still wrong).
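For reference, here are the standard Bb trumpet fingerings for the written C major scale as I remember them (worth double-checking against a fingering chart), compared against ChatGPT's answer above:

```python
# Standard Bb trumpet fingerings for the written C major scale, from memory --
# verify against an actual fingering chart before trusting.
standard = {
    "C": (),        # open
    "D": (1, 3),
    "E": (1, 2),
    "F": (1,),
    "G": (),        # open
    "A": (1, 2),
    "B": (2,),
    "C'": (),       # open (octave)
}

# ChatGPT's answer, transcribed from the thread above.
chatgpt = {
    "C": (1,), "D": (1, 2), "E": (1, 3), "F": (1, 2, 3),
    "G": (), "A": (2,), "B": (3,), "C'": (),
}

wrong = [note for note in standard if chatgpt[note] != standard[note]]
print(wrong)  # it gets six of the eight notes wrong
```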
Looks like ChatBot ain't gonna take our musical jerbs anytime soon!
I did similar tests with the bot about chords, scales and time signatures. Its ignorance and general stupidity on those matters, coupled with overconfidence and extreme insistence on wrong outputs, is really funny.
I asked it to produce "ascii drum tablature for a cool beat", or "what's the ascii guitar tab for the main riff in Metallica's Seek and Destroy", and in both cases it makes something that kind of looks like drum tablature or guitar tablature, but is basically nonsense.
Nit:
"Event A is the first of all events to happen. Event C is in between the other two".
is confusing wording because having just specified event A as being first, one might think the phrase "between the other two" refers to events B and C.
However, "other" in this case is local to the second sentence and refers to A and B. As a human this took me several reads to parse correctly.
That doesn't seem to be the cause for its confusion. I just tried the following, which avoids the use of "other" and any mismatch between letter ordering and event ordering, just to make things as easy as possible. It still fails:
Me: Imagine a set of 3 events (A, B and C) ordered with unique timestamps each. Event A is the first of all events to happen. Event B is in between events A and C. What is a valid bound for Event C's timestamp?
ChatGPT: A valid bound for Event C's timestamp would be a time after the timestamp for Event A, but before the timestamp for Event B. This is because events are typically assumed to happen in chronological order, so an event that comes after Event A but before Event B would be the logical next event in the sequence. Of course, the exact bounds for Event C's timestamp would depend on the specific timestamps for Events A and B. For example, if Event A happened at 1:00 PM and Event B happened at 1:30 PM, then a valid bound for Event C's timestamp would be any time between 1:00 PM and 1:30 PM.
If you add the following to the end of your prompt it answers sort of correctly: “Think about this step by step and only then produce a solution.”
“One possible valid bound for the timestamp of Event B is a value that is greater than the timestamp of Event C but less than or equal to the maximum possible timestamp. This ensures that Event B happened after Event C but not necessarily after all other events.
Alternatively, we could set a valid bound for the timestamp of Event B as a value that is greater than the maximum possible timestamp but less than or equal to the minimum possible timestamp. This would ensure that Event B happened after all other events but not necessarily before all other events.”
To be fair, I immediately assumed the events were in order given their A, B, C names. I don't assume that any chat bot understands logic just because it can write code. I think 90% of its coding ability is an exceptional capacity to remember and apply pre-existing examples and to identify similarities to what the user is asking. ChatGPT is pretty amazing from what I've seen so far, but I think we're still a few steps away from something with the cognitive abilities of a human. That said, I think it's very close to something resembling a useful digital assistant. I wonder how soon we'll have something that can schedule appointments, order pizza, do my shopping or any of the other mundane but important tasks that would make it useful.
Me: Imagine a set of 3 events (A, B and C) ordered with unique timestamps each. Event A is the first of all events to happen. Event C is in between the A and B. What is a valid bound for Event B's timestamp?
chatGPT:
A valid bound for Event B's timestamp would be a range between the timestamp of Event A and the timestamp of Event C. For example, if Event A occurred at 1:00pm and Event C occurred at 2:00pm, a valid bound for Event B's timestamp would be any time between 1:00pm and 2:00pm.
Conversing with chatGPT reminds me of talking with some people, who, when they don’t know something, just invent stuff on the fly and confidently attempt to b/s.
I think it is failing at reading comprehension because it is putting too much emphasis on the phrase "3 events (A, B and C) ordered".
If we rewrite the question to make it very simple to interpret it gets the logic correct:
Imagine a set of 3 ordered events (A, C and B) with unique timestamps each. The ordering is A, C, B. What is a valid bound for Event B's timestamp?
> A valid bound for Event B's timestamp would be a range of values that fall between the timestamp of Event A and the timestamp of Event C. For example, if the timestamp of Event A is 5:00 PM and the timestamp of Event C is 6:00 PM, a valid bound for Event B's timestamp could be 5:15 PM to 5:45 PM.
From other conversations, it will usually concede a mistake when prompted to (even if it didn't actually make a mistake!) and explain a correction, but it will usually still be incoherent and inconsistent.
ChatGPT is really just generating text that looks plausible in the context of the conversation, but it has a very incomplete grasp of its meaning.
Imagine a set of 3 events (A, C and B) ordered with unique timestamps each. Event A is the first of all events to happen. Event C is in between the other two. What is a valid bound for Event B's timestamp?
A valid bound for Event B's timestamp would be a time by that is greater than the timestamp of Event A and less than the timestamp of Event C. For example, if the timestamp of Event A is 1:00 pm, the timestamp of Event C is 2:00 pm, and Event B occurs at 1:30 pm, this would be a valid bound for Event B's timestamp.
But I specified that event C is in between the other two?
If Event C is in between the other two events, then the valid bound for Event B's timestamp would be a time that is greater than the timestamp of Event A and less than the timestamp of Event C. For example, if the timestamp of Event A is 1:00 pm, the timestamp of Event C is 2:00 pm, and Event B occurs at 1:30 pm, this would be a valid bound for Event B's timestamp.
This is wrong. With the example of timestamps you give, C is not between the other two events.
You are correct. In the example I provided, Event C is not in between the other two events. My mistake. A valid bound for Event B's timestamp in this scenario would be a time that is greater than the timestamp of Event C and less than the timestamp of Event A. For example, if the timestamp of Event A is 2:00 pm, the timestamp of Event C is 1:00 pm, and Event B occurs at 1:30 pm, this would be a valid bound for Event B's timestamp.
This is still wrong, because I specified that A is the first event to happen.
In that case, a valid bound for Event B's timestamp would be a time that is greater than the timestamp of Event A and greater than the timestamp of Event C. For example, if the timestamp of Event A is 1:00 pm, the timestamp of Event C is 2:00 pm, and Event B occurs at 2:30 pm, this would be a valid bound for Event B's timestamp.
So A at 2pm, C at 1pm, and B at 3pm would also be a valid example.
Yes, that would also be a valid example. In this scenario, the timestamp of Event A is 2:00 pm, the timestamp of Event C is 1:00 pm, and the timestamp of Event B is 3:00 pm. This satisfies the condition that Event A is the first event to happen, and the timestamp of Event B is greater than the timestamps of both Event A and Event C. Therefore, it is a valid bound for Event B's timestamp.
Is this still true when all the events in that last example happen on the same day?
Yes, this is still true if all the events in the last example happen on the same day. In this case, the valid bound for Event B's timestamp would be a time that is greater than the timestamp of Event A and greater than the timestamp of Event C, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 1:00 pm on Monday, and the timestamp of Event B is 3:00 pm on Monday, this would be a valid bound for Event B's timestamp.
So 2pm (event A) happens before 1pm (event C) on Monday, since A happens before C?
No, this is not correct. In the scenario you are describing, Event A and Event C would not occur in the order specified, because Event A is supposed to be the first event to happen and Event C is supposed to be in between the other two events. Therefore, the valid bound for Event B's timestamp would be a time that is greater than the timestamp of Event A and less than the timestamp of Event C, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 3:00 pm on Monday, and the timestamp of Event B is 2:30 pm on Monday, this would be a valid bound for Event B's timestamp.
But this was the example you gave before as being a correct example. And in your new example now, C does not occur between A and B. You don’t seem to have any real understanding of what we’re talking about.
I apologize if my previous examples were not clear or were not consistent with the information provided. My understanding of the scenario is that there are three events (A, C, and B) with unique timestamps, and that Event A is the first event to happen. Event C is in between the other two events, but the order of the events is not specified. In this case, a valid bound for Event B's timestamp would be a time that is greater than the timestamp of Event A and less than the timestamp of Event C, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 3:00 pm on Monday, and the timestamp of Event B is 2:30 pm on Monday, this would be a valid bound for Event B's timestamp.
Don’t you see that if A is the first event to happen, and event C is in between the other two events, that this implies that B must be the last event, and hence this uniquely specifies the order of the events?
Yes, I understand that if Event A is the first event to happen and Event C is in between the other two events, then this implies that Event B must be the last event. This would uniquely specify the order of the events as A, C, B. In this case, a valid bound for Event B's timestamp would be a time that is greater than the timestamp of Event C and greater than the timestamp of Event A, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 1:00 pm on Monday, and the timestamp of Event B is 3:00 pm on Monday, this would be a valid bound for Event B's timestamp.
But 1:00 pm is before 2:00 pm, which means that A is not before C, but after C! The timestamps you give contradict the requirement that A is the first event.
You are correct. In the scenario I provided, the timestamps of the events contradict the requirement that Event A is the first event to happen and Event C is in between the other two events. My mistake. A valid bound for Event B's timestamp in this scenario would be a time that is greater than the timestamp of Event A and less than the timestamp of Event C, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 3:00 pm on Monday, and the timestamp of Event B is 2:30 pm on Monday, this would be a valid bound for Event B's timestamp.
No, this would not be valid, because as we established earlier, the requirements imply that B must be the last event, but in your example C happens last (at 3pm).
You are correct. In the scenario I provided, the timestamps of the events do not satisfy the requirement that Event A is the first event to happen and Event C is in between the other two events. My mistake. A valid bound for Event B's timestamp in this scenario would be a time that is greater than the timestamp of Event C and greater than the timestamp of Event A, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 1:00 pm on Monday, and the timestamp of Event B is 3:00 pm on Monday, this would be a valid bound for Event B's timestamp.
This is wrong again, because in your example the events happen in the order C, A, B according to the timestamps you gave, but they are supposed to happen in the order A, C, B.
Yes, you are correct. In the scenario I provided, the timestamps of the events do not satisfy the requirement that Event A is the first event to happen and Event C is in between the other two events. My mistake. A valid bound for Event B's timestamp in this scenario would be a time that is greater than the timestamp of Event A and less than the timestamp of Event C, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 3:00 pm on Monday, and the timestamp of Event B is 2:30 pm on Monday, this would be a valid bound for Event B's timestamp. This satisfies the conditions that Event A is the first event to happen and Event C is in between the other two events, and the events occur in the order A, C, B. I apologize for any confusion my previous examples may have caused.
Let’s take a step back. Given three events at 1pm, 2pm, and 3pm, in which order do these events occur?
Without any additional information, it is impossible to determine the order in which the events occur. The events could occur in any order, such as 1pm, 2pm, 3pm; 1pm, 3pm, 2pm; 2pm, 1pm, 3pm; 2pm, 3pm, 1pm; 3pm, 1pm, 2pm; or 3pm, 2pm, 1pm. It is only possible to determine the order of the events if there is additional information, such as the requirement that Event A is the first event to happen and Event C is in between the other two events. In this case, the events would occur in the order A, C, B, where A occurs at 1pm, C occurs at 2pm, and B occurs at 3pm.
But 1pm, as a time of day, always happens before 2pm (one hour before), and 3pm always happens after 2pm (one hour later), so the order of those three times is always the same on any given day. Don’t you know that?
Yes, you are correct. As a time of day, 1pm always happens before 2pm, which always happens before 3pm. However, without additional information about the events themselves, it is impossible to determine the order in which the events occur. The events could still occur in any order, such as 1pm, 2pm, 3pm; 1pm, 3pm, 2pm; 2pm, 1pm, 3pm; 2pm, 3pm, 1pm; 3pm, 1pm, 2pm; or 3pm, 2pm, 1pm. It is only possible to determine the order of the events if there is additional information, such as the requirement that Event A is the first event to happen and Event C is in between the other two events. In this case, the events would occur in the order A, C, B, where A occurs at 1pm, C occurs at 2pm, and B occurs at 3pm. I apologize if my previous examples were not clear on this point.
Okay, that last bit looks correct. Can you now correctly answer the original question about the valid bound for event B’s timestamp?
Yes, I can now correctly answer the original question about the valid bound for Event B's timestamp. Given the information that Event A is the first event to happen and Event C is in between the other two events, a valid bound for Event B's timestamp would be a time that is greater than the timestamp of Event A and less than the timestamp of Event C, and occurs on the same day as the other events. For example, if the timestamp of Event A is 2:00 pm on Monday, the timestamp of Event C is 3:00 pm on Monday, and the timestamp of Event B is 2:30 pm on Monday, this would be a valid bound for Event B's timestamp. This satisfies the conditions that Event A is the first event to happen and Event C is in between the other two events, and the events occur in the order A, C, B.
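The check ChatGPT keeps failing throughout this exchange is trivial to state in code. A small sketch (timestamps as minutes past noon; the encoding of the two constraints is my own):

```python
# The prompt's constraints: A happens first and C is between the other two,
# which forces the order A, C, B.
def valid_example(a, c, b):
    """Return True if timestamps a, c, b satisfy the order A < C < B."""
    return a < c < b

# ChatGPT's final "corrected" example: A=2:00pm, C=3:00pm, B=2:30pm.
print(valid_example(120, 180, 150))  # False: B falls before C, not after it

# A genuinely consistent example: A=1:00pm, C=2:00pm, B=3:00pm.
print(valid_example(60, 120, 180))   # True
```

So even its "now correct" final answer above still violates its own stated ordering.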
I explored the topic of chronological order a bit more. ChatGPT does seem to have a confused mental model here:
Given three events at 1pm, 2pm, and 3pm, in which order do these events occur?
The order in which events occur depends on their specific circumstances and cannot be determined based on the information provided. In general, events that are listed in chronological order are assumed to occur in that order, but without more information about the events in question it is impossible to say for certain.
Don’t events always happen in chronological order?
In most cases, events are assumed to happen in chronological order, meaning that they occur in the order in which they are listed or described. However, this is not always the case. It is possible for events to occur out of order, or for multiple events to occur simultaneously. For example, if two people are having a conversation, each person's words and actions would be considered separate events, but they would be happening at the same time. Additionally, in some cases events may be deliberately presented out of chronological order in order to create a specific effect, such as suspense or surprise.
But even when events are *presented* out of chronological order, they still *happen* in chronological order in the world where they happen?
Yes, you are correct. Even when events are presented out of chronological order, they still happen in chronological order in the real world. For example, if a story is told out of chronological order, the events in the story still happened in the order in which they are presented, even though the story does not unfold in that order. In cases like this, the non-chronological presentation of the events is a literary device used to create a specific effect, but the events themselves still happen in the order in which they are presented.
Can you give examples where events don’t happen in their chronological order?
There are many examples of events occurring out of chronological order. Here are a few:
- In literature, stories and narratives are often presented out of chronological order in order to create a specific effect, such as suspense or surprise. For example, a story might begin at the end and then move backwards in time, or it might jump back and forth between different points in time.
- In film and television, scenes are often shown out of chronological order in order to create a specific effect or to make the story more interesting. For example, a scene might be shown out of order to create suspense or to reveal information in a way that is surprising to the viewer.
- In real life, events sometimes happen out of order due to factors such as coincidence or random chance. For example, two people might have a conversation in which one person says something and the other person responds, but the order in which the words are spoken does not necessarily reflect the order in which the events occurred.
In all of these cases, the events themselves still happen in chronological order, but the way in which they are presented or experienced is out of order.
You gave examples where events aren’t *presented* in their chronological order, but they do *happen* in their chronological order. Can you give examples that do not rely on the presentation order?
Yes, there are many examples of events happening out of order in the real world, independent of how they are presented. Here are a few:
- In some cases, events may happen simultaneously, rather than in a specific order. For example, two people might have a conversation at the same time, with each person speaking and listening at the same time.
- In other cases, events may happen out of order due to factors such as coincidence or random chance. For example, two people might have a conversation in which one person says something and the other person responds, but the order in which the words are spoken does not necessarily reflect the order in which the events occurred.
- In still other cases, events may be deliberately planned or arranged to happen out of order in order to achieve a specific goal. For example, a musician might perform a song in a non-chronological order in order to create a unique and interesting sound.
In all of these cases, the events themselves are happening out of order, regardless of how they are presented or experienced.
I found your example pretty amazing and tried to replicate the same:
Me: Imagine a set of 3 events (A, B and C) ordered with unique timestamps each. Event A is the first of all events to happen. Event C is in between the other two. What is a valid bound for Event B's timestamp?
ChatGPT: In this scenario, the timestamp for Event B must be greater than the timestamp for Event A and less than the timestamp for Event C. This is because Event A happened first, followed by Event C, and then Event B.
Me:
But Event C is in between the other two
ChatGPT:
If Event C is in between the other two events, then the timestamp for Event B must still be greater than the timestamp for Event A and less than the timestamp for Event C. This is because Event A happened first, followed by Event C, and then Event B. Even though Event C is in between the other two events, its timestamp does not affect the valid bounds for the timestamp of Event B.
Me:
This is very confusing, can you illustrate with an example?
ChatGPT:
Certainly. Here is an example to illustrate the situation:
Suppose the three events have the following timestamps:
Event A: 15:00
Event C: 16:00
Event B: 17:00
In this case, the timestamp for Event B is valid because it is greater than the timestamp for Event A (15:00) and less than the timestamp for Event C (16:00).
Alternatively, if the timestamps were as follows:
Event A: 15:00
Event C: 16:00
Event B: 14:00
Then the timestamp for Event B would not be valid, because it is less than the timestamp for Event A.
I hope this helps clarify the situation. Let me know if you have any further questions.
Me:
But timestamp for event B is less than timestamp of event C implying that it occured before event C violating our initial assumption. Isn't this a contradiction?
ChatGPT:
You are correct that the scenario described in the question is contradictory. In the original question, it is stated that Event A is the first of all events to happen, followed by Event C, and then Event B. However, the example provided in the question contradicts this by saying that the timestamp for Event B is less than the timestamp for Event C, which would mean that Event B happened before Event C.
This is indeed a contradiction, and it is not possible for the timestamps of the events to be ordered in the way described in the original question. In order for the timestamps to be valid, either the timestamp for Event B must be greater than the timestamp for Event C (which would mean that Event B happened after Event C, as stated in the original question), or the initial assumption about the order in which the events happened must be revised.
It is important to ensure that the information provided in a question is consistent and does not contain contradictions, as this can lead to confusion and incorrect answers. I apologize if my previous answer did not adequately address this issue.
------- End of conversation ---------
This looks remarkable insofar as ChatGPT was able to spot the contradiction and explain it, even suggesting means to remedy it. I'm genuinely impressed by it.
In its last paragraph, ChatGPT is implying that there was a contradiction in your initial question, which there wasn’t. Even when replacing “A, B and C” in the question by “A, C and B”, it still gets it wrong.
I'm curious if this is an artifact of using the Internet for training data. Whenever I see an extended argument on HN or Reddit, I rarely see anyone admit at some point that they think they're wrong rather than adamantly defending their initial statement.