
I’ll be honest: unless there’s a confident but obviously wrong factual blunder, I would have very low confidence in any human/AI determination I tried to make. And even then I wouldn’t add many points to my confidence: I’ve made obvious blunders of my own, plenty of them.

Right now, and probably for a very limited time only, the primary hallmarks of LLM responses like ChatGPT’s are a confident tone and orderly structure, down to the use of bullet points and ordered lists. Except that the almost overly confident tone bears a striking similarity to the tone of digests I frequently need to write, and GPT-4 is already much better at not sounding like a reasonably intelligent college sophomore who needs a bit more experience to avoid severe Dunning-Kruger effects.

Within a year I’m not sure I’ll have any confidence at all in my ability to distinguish human from AI, except in areas where much deeper domain knowledge is required, and even that is likely a rapidly closing window.

I doubt it would take much effort to train a model out of that confident, structured tone as well. If we’re talking about adversarial or malicious use or content farms, then I think the barn door is already wide open.

As far as content farms go, though, there’s a reasonable question becoming ever more relevant: apart from the cynical ad-revenue cash grab, does it matter that content farms use AI, if the output is accurate and possibly better than, say, the average Stack Exchange response and thread in 90% of cases? I’m ambivalent on the question. Deep knowledge and novel insights will be the domain of humans for some time, but I honestly don’t know for how long.

As I said before, my work requires succinct output along the lines of OpenAI’s capabilities. Right now it’s not reducing the amount of time it takes me to perform some tasks; it’s bootstrapping the process, so I can use roughly the same amount of time to produce a better, deeper, more polished result. I am already becoming more of an editor or curator of the end product of projects I work on, where 90% of the work takes place before I get to that point. But especially with GPT-4, I can feed it complex explanations of what I’m doing (not raw data) and ask it for its thoughts to get my own juices flowing.

To me, the question of whether something is human- or AI-produced is becoming not just irrelevant but somewhat of a non sequitur. I’m exaggerating a bit to make my point; it’s not as simple as what I can fit into a comment here. But the question “did you use an LLM to produce some/all of this?” is quickly beginning to sound like “did you spell-check this before completing it?”
