For humans I think it just comes down to interacting with LLMs enough to realize their quirks, but that's not really fool-proof.
Unfortunately many believe they can, and it is impossible to disprove. So now real people need to write while avoiding certain styles, because a lot of other people have decided those are "LLM clues": bullets, em dashes, certain common English phrases or words (e.g. delve, vibrant, additionally, etc.)[0].
Basically you need to sprinkle in subtle mistakes, or lower the quality of your written communications, to avoid accusations that will side-track whatever you're writing into a "you're a witch" argument. Ironically, LLM accusations are now a sign of high-quality writing.
[0] https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing
Staccato (too many short sentences with periods) is also a telltale for me. Most humans prefer longer sentences with more varied punctuation; I, for example, am a sucker for run-on sentences.
Essentially 0 people use emoji to create a bulleted list. Nobody unintentionally cites fake legal precedents or non-existent events, articles, or papers. Even the “it’s not X, it’s Y” structure, in the presence of other suspicious style/tone cues, signals LLM text.
Ask an LLM to read your project specs and add a section headed "Performance Optimizations" to see an example of this.
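A rough way to put a number on the staccato tell, as a sketch only: the function names and the six-word cutoff for a "short" sentence are assumptions made for illustration, and the sentence splitter is deliberately naive.

```python
import re

def sentence_lengths(text: str) -> list[int]:
    """Split on ., ! and ? (a naive splitter) and count words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def staccato_score(text: str, short: int = 6) -> float:
    """Fraction of sentences with at most `short` words (cutoff is an assumption)."""
    lengths = sentence_lengths(text)
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n <= short) / len(lengths)

punchy = "It's the fake drama. Punchy sentences. Contrast. And then? A banal payoff."
rambling = ("Most humans prefer longer sentences with more varied punctuation "
            "and I am a sucker for run-on sentences.")

print(staccato_score(punchy), staccato_score(rambling))
```

A high score only says the text is choppy; plenty of humans write that way too, so it is a hint rather than proof.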
Another is a certain punchy and sensationalist style that does not change throughout a longer piece of writing.
E.g. "The Strait of Hormuz: Chokepoint or Opportunity?"
I haven't seen this yet, but I guess the only reason I haven't done it is because it never crossed my mind.
What I have found an easy detection is non-breaking spaces. They tend to get littered through the passages of text without reason.
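A minimal sketch of that check, assuming you have the raw text before any whitespace normalisation. The set of "suspect" characters is an illustrative assumption, not an exhaustive list.

```python
# Code points that render like a space but are not U+0020.
# This selection is an assumption for illustration only.
SUSPECT_SPACES = {
    "\u00a0": "NO-BREAK SPACE",
    "\u202f": "NARROW NO-BREAK SPACE",
    "\u2009": "THIN SPACE",
}

def count_odd_spaces(text: str) -> dict[str, int]:
    """Return how often each suspect space character appears in text."""
    counts: dict[str, int] = {}
    for ch in text:
        if ch in SUSPECT_SPACES:
            name = SUSPECT_SPACES[ch]
            counts[name] = counts.get(name, 0) + 1
    return counts

sample = "This\u00a0sentence has\u00a0two non-breaking spaces."
print(count_odd_spaces(sample))
```

Note that some editors and typographic conventions insert these characters legitimately, so a non-zero count is only a prompt to look closer.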
I wonder where some of this comes from. Another one is 'real unlock', it's not a common phrasing that I really recall.
https://trends.google.com/explore?q=real%2520unlock&date=all...
> It’s the fake drama. Punchy sentences. Contrast. And then? A banal payoff.
It's great because it's a double-decker of annoying marketing copy style and nonsensical content.
If one measures for perplexity (how likely text is under a certain language model), common text in a training set will be very likely. But you can easily create better models.
So judge the content on its merit irrespective of its source.
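To make the perplexity point concrete, here is a toy sketch using a unigram model with add-one smoothing. This is a stand-in chosen for brevity, not what any real detector uses; the corpus and function names are hypothetical.

```python
import math
from collections import Counter

def train_unigram(corpus: str) -> tuple[Counter, int]:
    """Count whitespace-separated tokens in a (tiny, hypothetical) training corpus."""
    tokens = corpus.lower().split()
    return Counter(tokens), len(tokens)

def perplexity(text: str, counts: Counter, total: int) -> float:
    """Perplexity of text under the unigram model, with add-one smoothing."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text has no tokens")
    vocab = len(counts) + 1  # one extra slot shared by all unseen words
    log_prob = 0.0
    for tok in tokens:
        p = (counts[tok] + 1) / (total + vocab)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(tokens))

counts, total = train_unigram("the cat sat on the mat the dog sat on the rug")
seen = perplexity("the cat sat", counts, total)
unseen = perplexity("purple zebra dances", counts, total)
print(seen < unseen)
```

Text that resembles the training data scores lower, which is exactly why the measure says more about the model you picked than about who wrote the text.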
I think people will be able to detect the lowest-user-effort version of LLM text pretty reliably after a while (i.e. what you describe; many people have a good sense of LLM clues). But there's probably a *ton* of LLM text out there going undetected, where some of the instructions given were "throw a few errors in", "don't use bullet points or em dashes", "don't do the `it's not this, it's that` thing".
And then those changes will get built into ChatGPT's main instructions, and in a few months people will start to pick up on other indicators, and then slightly smarter/more motivated users will give new instructions to hide their LLM usage... (or everyone stops caring, which is an outcome I find hard to wrap my head around)
However, reasoning models that add a random typo to seem less automated still do not hide the fairly repeatable quantized artifacts from the training process. For an LLM, it is rather trivial to find where people originally stole the data from if they still have the annotated training metadata.
Finally, LLM output is usually clear to the reader once one abandons the trap of thinking "I think the author meant [this/that]", and recognizes that a work's tone reads like a fake author had a stroke [0]. =3
Citation needed. The LLM accusations come from the specific cadence they use. You can remove all em-dashes from a piece of text and it still becomes clear when something is LLM written.
Can they be prompted to be less obvious? Sure, but hardly anyone does that.
It's more "The Core Insight", "The Key Takeaway", etc. than it is about emdashes.
Incidentally, the only people annoyed about "witch-hunts" tend to be those who are unable to recognise cadence in the written word.
This is an artifact of the default LLM writing style, cross-poisoned through training on outputs -- not a "universal" property.
Specific language tells, such as: unusual punctuation, including em dashes and semicolons; hedged, safe statements, but not always; and text that showcases certain words such as “delve”.
Here’s the kicker. If you happen to include any of these words or symbols in your post they’ll stop reading and simply comment “AI slop”. This adds even less to the conversation than the parent, who may well be using an LLM to correct their second or third language and have a valid point to make.
As for how I and other people do it: there are some obvious styles that reek of LLMs, ChatGPT in particular, I think.
There’s a very common structure of “nice post, the X to Y is real. miscellaneous praise — blah blah blah. Also curious about how you asjkldfljaksd?"
From today:
This comment is almost certainly AI-generated: https://news.ycombinator.com/item?id=47658796
And I'm suspicious of this one too - https://news.ycombinator.com/item?id=47660070 - reads just a bit too glazebot-9000 to believe it's written by a person.
I am writing an LLM captcha system; here is the proof of concept: https://gitlab.com/kaindume/llminate
To me, it often feels like the text version of the uncanny valley.
But again, that's just "feels", I don't have proof or anything.
Stylistic tells like 'delve' and bullet formatting are just RLHF training artifacts. They are already shifting between model versions: compare GPT-4 output to 4o output, and the word frequency distributions have changed noticeably.
Long term, the only thing with real theoretical legs is watermarking at generation time, but that needs provider buy-in and it slightly hurts output quality, so adoption has been basically nonexistent.
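The word-frequency shift mentioned above is easy to measure yourself if you have output samples from two model versions. A small sketch, where the two sample strings are made up for illustration and are not real model output:

```python
import re
from collections import Counter

def relative_freq(text: str) -> dict[str, float]:
    """Relative frequency of each lowercase word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    return {w: c / total for w, c in Counter(words).items()}

def freq_shift(old: str, new: str, markers: list[str]) -> dict[str, float]:
    """Change in relative frequency (new minus old) for each marker word."""
    f_old, f_new = relative_freq(old), relative_freq(new)
    return {m: f_new.get(m, 0.0) - f_old.get(m, 0.0) for m in markers}

# Hypothetical samples standing in for two model versions.
old_sample = "let us delve into this and delve deeper"
new_sample = "let us look into this and look deeper"

shift = freq_shift(old_sample, new_sample, ["delve", "look"])
print(shift)
```

With real data you would want far larger samples and the same prompts for both versions, otherwise the shift mostly reflects topic rather than model.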
I think the better question to ask is: what are your goals? Is it to prevent AI spam, or to discourage people from copy-pasting AI output? Those are two very different problems: in the case of AI spam you look for patterns of usage (i.e., unusually high interaction from a single IP, or timing patterns around when things are read and the response comes in), and in the other case it all comes down to cultural norms.
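For the spam side, the "unusually high interaction from a single IP" pattern could be sketched as a sliding-window count. The function name, the 60-second window, and the threshold of 5 are all assumptions for illustration; real thresholds depend on the site.

```python
from collections import defaultdict

def flag_busy_ips(events, window_s: float = 60.0, max_events: int = 5) -> set[str]:
    """events: iterable of (ip, unix_timestamp) pairs.

    Return the IPs that have more than max_events events inside
    any window of window_s seconds.
    """
    by_ip: dict[str, list[float]] = defaultdict(list)
    for ip, ts in events:
        by_ip[ip].append(ts)

    flagged: set[str] = set()
    for ip, stamps in by_ip.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            # Shrink the window from the left until it spans at most window_s.
            while ts - stamps[start] > window_s:
                start += 1
            if end - start + 1 > max_events:
                flagged.add(ip)
                break
    return flagged

# Hypothetical traffic: one address posts six times in ten seconds.
events = [("10.0.0.1", t) for t in (0, 2, 4, 6, 8, 10)]
events += [("10.0.0.2", t) for t in (0, 300, 600)]
print(flag_busy_ips(events))
```

Shared IPs (offices, universities, carrier NAT) will trip a check like this, so it works better as one signal among several than as a ban rule.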