Why do LLMs overuse these patterns of speech that aren't overused in the wild?

10•jimbo808•3mo ago
Here's a quote:

> "That’s not neuroscience — that’s cargo-cult reasoning wrapped in academic buzzwords."

I think most of us who use ChatGPT would immediately recognize this as being AI-generated. It's perfectly valid English and we could all imagine a real human saying it, but ChatGPT (or maybe LLMs more broadly) seem to have landed on certain patterns like this one that they use constantly. Is it some kind of overfitting? Post-training where a bias toward this pattern was introduced? Something else?

Comments

A_D_E_P_T•3mo ago
"It's not X, it's Y" (or "it's not just X, it's Y") is one of the most common tells, but there are many others. Here's a partial list:

> https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing

It's definitely post-training bias and reinforcement. This rhetorical structure isn't too common IRL. (Or wasn't, anyway, prior to LLMs...)

incomingpain•3mo ago
The way they train the models is by feeding them lots and lots of literature, usually high quality.

The way they write by default is thus going to be this weird hybrid of all English styles/dialects from the last ~500 years.

The reason they are heavy with em dashes is that they were immensely popular in literature for a long time but not so much in modern times. So it stands out.

If you tell it to write in a specific way though, it does a good job at it.

Here's Detroit English, no messin' around. The reason why AIs love the em dash so much is simple: it’s the most versatile and natural punctuation mark they can use to connect ideas and maintain flow. A large language model's primary goal is to sound human, and when people speak, they often pause, clarify, or insert a quick side-thought; the dash captures that conversational stop-and-start rhythm better than a rigid comma or a full-stop period. Plus, in the enormous amount of text the AI studies (its data), the em dash is frequently used by skilled writers as an efficient tool to replace colons, parentheses, or strong commas, so the AI simply picked up that effective writing pattern and runs with it, seeing it as the clearest and most dynamic way to structure complex sentences. That's the real deal.

whycome•3mo ago
I used to use the em dash a lot. I refrain from doing it now. I hate that outcome.
incomingpain•3mo ago
>I used to use the em dash a lot. I refrain from doing it now. I hate that outcome.

I don't think it was ever taught in my schooling; the semicolon is what they taught us to use.

JohnFen•3mo ago
I refuse to let genAI determine what my writing style should be on principle. I may not be able to do much about the rest of the various degradations genAI brings, but I can at least stand my ground when it comes to my personal expression.
gooodvibes•3mo ago
This isn’t accurate - most of the style comes from the fine tuning and reinforcement learning, not from the original training data.

At some point people got this idea that LLMs just repeat or imitate their training data, and that’s completely false for today’s models.

bediger4000•3mo ago
So LLMs have gotten creativity recently?
gooodvibes•3mo ago
No, my point has nothing to do with creativity. It's about the fact that their output is tailored to look and sound a certain way in the later stages of model training; it's not representative of the original text data the base model was trained on.
idonotknowwhy•3mo ago
Agreed. The pre-2025 base models don't write like this.
incomingpain•3mo ago
>This isn’t accurate - most of the style comes from the fine tuning and reinforcement learning, not from the original training data.

Fine tuning, reinforcement, etc. are all 'training' in my books. Perhaps this is your confusion over 'people got this idea'.

gooodvibes•3mo ago
> Fine tuning, reinforcement, etc are all 'training' in my books.

They are, but they have nothing to do with how frequent anything is in literature, which was your main point.

jotux•3mo ago
I read some articles when OpenAI was first getting popular about them using cheap labor from English-speaking African countries (like Kenya): https://news.ycombinator.com/item?id=34426421

I remember other articles specifically talking about English language features of those regions (like "Okay" instead of "OK") getting into LLMs because of this.

teunlao•3mo ago
Humans cut the phrase to avoid sounding like AI. AI copied it because humans used it. Style collapsed into a feedback loop.
muzani•3mo ago
ChatGPT seems to use this most often. Claude uses similar data and training and has similar patterns.

Gemini doesn't yet seem to have a consistent pattern, but it's quite different, and honestly, I find it even more disturbing; Gemini Flash is like chatting with an intelligent 9-year-old. Grok feels unnaturally chirpy to me.

fogzen•3mo ago
I wonder if it's intentional so that they can identify their model output in the wild using stylometry.
raw_anon_1111•3mo ago
I recently re-read parts of this book that I haven’t read in years.

https://github.com/97-things/97-things-every-programmer-shou...

I think it originally came out in 2013. If I were just introduced to it today, I would think some of the articles were AI written with only slight prompting.

It’s sad that if I write something from scratch, without any writing assistance besides spell checking, mostly for internal company consumption, I now review it to see if it smells like AI.

(Just like I would have used a dash above. But I don’t anymore because it has an AI smell and I had to get out of the habit)