New accounts on HN 10x more likely to use em-dashes

https://www.marginalia.nu/weird-ai-crap/hn/

147•todsacerdoti•3h ago

Comments

bediger4000•3h ago

This is pretty damning. It would be interesting to see if new accounts collect karma at any rate whatsoever.

embedding-shape•1h ago

Would be interesting to see "fastest growing accounts in last N months" or something similar. I'm guessing the ones that are actually humans would be closer to the top than the bottom, but maybe HN users aren't better than the average person at detecting AI.

loeg•1h ago

Karma aside, flooding the comments with a chosen narrative via army of bots seems like it's already happening. I suppose the bots can also do voting rings, but they don't necessarily need to.

squeefers•1h ago

> Karma aside, flooding the comments with a chosen narrative via army of bots seems like it's already happening.

again with the conspiracy theories

loeg•1h ago

> created: 83 days ago

marcher•1h ago

I dunno, I agree. It sounds conspiratorial.

But who knows, maybe even 17 year old accounts are being hijacked by AI now too.

5o1ecist•1h ago

> again with the conspiracy theories

Yeah, right? Not one ever actually turned out to be true!

That conspiracy about billionaires, who supposedly own all of western media, having deliberately created an environment in which anyone who expresses even the remote idea of a conspiracy gets discredited, is also not true!

None of them are true!

Not. A. Single. One.

*noms cheese pizza*

marginalia_nu•3h ago

(author) I saw a 32:1 rate of EM-dashes last night when I just eyeballed the first 3 pages of /newcomments and /noobcomments. So I'm not sure how stable this is over time.

Muhammad523•1h ago

I just took a look at /noobcomments and wow, there's even a comment where a person argues with AI instead of, you know, using their own brain. It was obvious it was AI since it was formatted with markdown

lgats•1h ago

the link https://news.ycombinator.com/noobcomments

cookiengineer•1h ago

I wanted to point out that em dashes are autocompleted by the iOS keyboard. So the false positives and true negatives might have some overlaps without more details. I think a better indicator would be to only detect em dashes with preceding and following whitespace characters, and general unicode usage of that user.

Additionally, lots of Chinese and Russian keyboard tools use the em dash as well, when they're switching to the alternative (en-US) layout overlay.

There's also the Chinese idiom symbol in UTF8 which gets used as a dot by those users a lot, so that could be a nice indicator for legit human users.

edit: lol @ downvotes. Must have hit a vulnerable spot, huh?
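The heuristic suggested in this comment (separating em dashes that have whitespace on both sides from em dashes set tight against the words, plus a rough look at general Unicode usage) could be sketched roughly as follows. This is only an illustration: the function name `classify_em_dashes` and the returned keys are made up for this sketch, and which of the two forms actually points to a keyboard autocomplete versus an LLM is left open, since the thread does not settle it.

```python
import re

# Em dash (U+2014) with whitespace on both sides, e.g. "word \u2014 word".
SPACED = re.compile(r"\s\u2014\s")
# Em dash set tight between two non-space characters, e.g. "word\u2014word".
TIGHT = re.compile(r"\S\u2014\S")


def classify_em_dashes(text: str) -> dict:
    """Count spaced and tight em dashes, plus all non-ASCII characters.

    Hypothetical helper: the three signals are illustrative, not a tested
    bot detector.
    """
    return {
        "spaced": len(SPACED.findall(text)),
        "tight": len(TIGHT.findall(text)),
        "non_ascii": sum(1 for ch in text if ord(ch) > 127),
    }
```

A per-user profile would presumably aggregate these counts over a comment history rather than judge a single comment.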

marginalia_nu•1h ago

I think there is a baseline number of human users that for one reason or another use em-dashes, but this doesn't explain why they are 10x more prevalent in green accounts.

cookiengineer•52m ago

> I think there is a baseline number of human users that for one reason or another use em-dashes, but this doesn't explain why they are 10x more prevalent in green accounts.

I'm not trying to negate the fact. I'm just pointing out that a correlation without another indicator is not evidence enough that someone is a bot user, especially in the golden age of rebranded DDoS botnets as residential proxy services that everyone seems to start using since ~Q4 2024.

Aurornis•1h ago

> I wanted to point out that em dashes are autocompleted by the iOS keyboard.

That’s why the analysis was performed over time. All of those em dash sources you mentioned were present before LLM written content became popular.

gritzko•1h ago

This is probably the time to add some invitation system like GMail had in the beginning. Or make a shade for accounts <1yr. Or something else, before things get too mixed.

shit_game•1h ago

The issue with creating some hidden maturity heuristic for accounts is that it will be gamed just the same as any other, except that using age alone is the simplest heuristic to game. You can simply do nothing for incremental periods of time and then begin testing aged accounts to roughly determine the minimum age an account must reach to become "trusted".

Bot prevention is a very difficult constant game of cat and mouse, and a lot of bot operators have become very skilled at determining the hidden metrics used by platforms to bless accounts; that's their job, after all. I've become a big fan of lobste.rs' invitation tree approach, where the reputation of new accounts rides on the reputation of older accounts, and risks consequence up the chain. It also creates a very useful graph of account origin, allowing for scorched earth approaches to moderation that would otherwise require a serious (and often one-off) machine learning approach to connect accounts.
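To make the invitation-tree idea concrete, here is a minimal sketch of the data structure it implies: each account records who invited it, so moderation can walk up the chain (who vouched for this account?) or down it (everything that descends from a bad inviter). This is not lobste.rs' actual implementation; the class `InviteTree` and its method names are hypothetical.

```python
from collections import defaultdict, deque


class InviteTree:
    """Hypothetical invitation graph: every non-root account has one inviter."""

    def __init__(self):
        self.inviter = {}                   # invitee -> inviter
        self.invitees = defaultdict(list)   # inviter -> [invitees]

    def invite(self, inviter: str, new_user: str) -> None:
        if new_user in self.inviter:
            raise ValueError(f"{new_user} was already invited")
        self.inviter[new_user] = inviter
        self.invitees[inviter].append(new_user)

    def chain_up(self, user: str) -> list:
        """Accounts whose reputation rides on `user`, nearest inviter first."""
        chain = []
        while user in self.inviter:
            user = self.inviter[user]
            chain.append(user)
        return chain

    def subtree(self, user: str) -> list:
        """`user` plus every account invited directly or indirectly by them."""
        found, queue = [], deque([user])
        while queue:
            current = queue.popleft()
            found.append(current)
            queue.extend(self.invitees.get(current, []))
        return found
```

A "scorched earth" ban would then amount to banning everything returned by `subtree()` for the offending account, while `chain_up()` gives the accounts that might face consequences for the invitation.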

duckmysick•14m ago

https://lobste.rs/ has a system like that.

onion2k•1h ago

I’ve had this sense that HN has gotten absolutely inundated with bots last few months.

Is it possible to differentiate between a bot, and a human using AI to 'improve' the quality of their comment where some of the content might be AI written but not all? I don't think it is.

lm28469•1h ago

> HN has gotten absolutely inundated with bots last few months.

hm, the whole internet really, youtube, reddit, twitter, facebook, blog posts, food recipes, news articles, it's getting more and more obvious

skeptic_ai•1h ago

All will be fixed with real id attestation /s

Lucasoato•1h ago

Not exactly, bot farms can still be made with poor people's IDs through the black market. I don't know what the solution is going to be, but at some point we might be forced to accept the reality that on the internet humans and AI won't be distinguishable anymore and adjust our services independently of the client being a person or a machine.

Retric•1h ago

There’s only 8 billion people in the world that’s nowhere close to enough to feed the current rate of bot spam. Especially if a large service starts shadow banning accounts.

Unfortunately identity theft now becomes even more damaging.

8cvor6j844qw_d6•1h ago

ID verification with video capture for every post on an attested device.

lets bring back Chrome's WEI while we're at it

Retric•1h ago

That might be one of the few arguments where I’d consider a real ID style system as a viable option.

sunaookami•1h ago

I find the bigger problem with online comments are that people repeat the same comments and "jokes" over and over and over again. Sure we had those with YouTube 15 years ago when people always spammed "first!" and "who is listening in <year>?" but now it's gotten worse and every single comment is now just some meme (especially on Reddit) or some kind of "gotcha"...

lm28469•18m ago

> I find the bigger problem with online comments are that people repeat the same comments and "jokes" over and over and over again.

And bots reposting a trending post from like 12 years ago to farm internet points... with other bots reposting the top comments of the initial post

homebrewer•1h ago

If you are suspicious, look at comment history. It's usually fairly obvious because all comments made by LLM spambots look the same, have very similar structure and length. Skim ten of them and it becomes pretty clear if the account is genuine.

I'm more worried about how many people reply to slop and start arguing with it (usually receiving no replies — the slop machine goes to the next thread instead) when they should be flagging and reporting it; this has changed in the last few months.

taeric•1h ago

This makes me think a tool that lets me know how much of the engagement I was seeing was from bots would be huge.

onion2k•52m ago

> If you are suspicious, look at comment history.

I'm never suspicious though. One of the strange, and awesome, and incredibly rare things about HN is that I put basically zero stock in who wrote a comment. It's such a minimal part of the UI that it entirely passes me by most of the time. I love that about this site. I don't think I'm particularly unusual in that either; when someone shared a link about the top commenters recently there were quite a few comments about how people don't notice or how they don't recognize the people in the top ranks.

The consequence of this is that a bot could merrily post on here and I'd be absolutely fine not knowing or caring if it was a bot or not. I can judge the content of what the bot is posting and upvote/downvote accordingly. That, in my opinion, is exactly how the internet should work - judge the content of the post, not the character of the poster. If someone posts things I find insightful, interesting, or funny I'll upvote them. It has exactly zero value apart from maybe a little dopamine for a human, and actually zero for a robot, but it makes me feel nice about myself that I showed appreciation.

munk-a•1h ago

I don't personally care about the distinction especially since AI usually 'improves' things by making it more verbose. Don't waste tokens to force me to read more useless words about your position - just state it plainly.

Brevity is the soul of wit.

esafak•1h ago

I was thinking of how to create a UX around quantifying or qualifying AI use. If products revealed that users had used in-app AI to compose their responses, they might respond by doing it outside the app and pasting it in. If you then labeled pasted text as AI they might use tools to imitate typing. And after all that, you might face a user backlash from the users who rely on AI to write.

yoyohello13•1h ago

I just assume if any comment sounds like an ad it's a bot. All the comments like "I'm 10x faster with Claude Opus 4.6!" or "Have you tried Codex with ChatGPT 5.X? What a time to be alive!" can be lumped in the bot bin.

kdheiwns•4m ago

AI post "improvements" are the most annoying thing. I see more and more people doing it, especially when posting reviews/experiences with things, and they always get called out for it. They always justify it with "AI helped me organize what I wanted to say." Like man, you're having an AI write about an experience it didn't have and likely didn't even proofread it. Who knows what BS it added to the story. Even disorganized and misspelled stories are better than AI fantasy renditions that are 20 times longer than they need to be.

egypturnash•1h ago

— — — — — — — — — — — — — — — — — — — — — — — — — — —

Don’t mind me, just skewing the results. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — results. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — results. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — results. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — results. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — results. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — results. — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

marginalia_nu•1h ago

Haha, the code counts the number of comments with em-dashes and similar, not the number of em-dashes total.

Could be an argument made for aggregating by user instead however, if some bots are found to be particularly active and skewing the data.
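The difference between the two aggregations mentioned here can be shown with a small sketch. It assumes comments arrive as `(user, text)` pairs and that "em-dashes and similar" means the em dash and en dash; both are assumptions for illustration, not the author's actual code.

```python
from collections import defaultdict

# Assumed "em-dashes and similar" set: em dash and en dash.
DASHES = ("\u2014", "\u2013")


def has_dash(text: str) -> bool:
    return any(d in text for d in DASHES)


def rate_by_comment(comments) -> float:
    """Share of comments containing at least one dash.

    A single comment stuffed with dashes still counts only once.
    """
    if not comments:
        return 0.0
    return sum(has_dash(text) for _, text in comments) / len(comments)


def rate_by_user(comments) -> float:
    """Share of users with at least one dash-containing comment.

    One very active account counts only once here.
    """
    seen = defaultdict(bool)
    for user, text in comments:
        seen[user] = seen[user] or has_dash(text)
    if not seen:
        return 0.0
    return sum(seen.values()) / len(seen)
```

With three dash-using comments from one prolific account and one plain comment each from two other users, the per-comment rate is 0.6 while the per-user rate is one in three, which is the skew the per-user aggregation is meant to remove.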

xnx•1h ago

Don’t —

xnx•1h ago

mind — me.

pimlottc•20m ago

Don’t — me bro

throw_rust•5m ago

Sounds like a good slogan/motto for the AIpocalypse resistance to use.

lapcat•1h ago

The use of em dashes is a human right. I ask that people not discriminate against em-dash users—we should be a protected class—and I refuse to abandon them. Perhaps I’ll have one engraved on my tombstone. He died doing what he loved—dashing.

a4isms•1h ago

I encourage people to discriminate against me because I write like an educated African who works annotating AI training material.

Why not? I am a descendant of Africans. I am a mildly successful author by tech nerd standards. I was educated in the British Public School tradition, right down to taking Latin in high school and cheering on our Rugby* and Cricket teams.

If someone doesn't want to read my words or employ me because I must be AI, that's their problem. The truth is, they won't like what I have to say any more than they like the way I say it.

I have made my peace with this.

———

Speaking of Rugby, in 1973 another school's Rugby team played ours, and almost the entire school turned out to watch a celebrity on the other school's team.

His name was Andrew, and he is very much in the news today.

wongarsu•1h ago

En dash for the win – the British are right when it comes to this particular style difference

bityard•1h ago

I have always used double-dashes instead of emdashes, and it annoys me when software "auto-corrects" them into emdashes. Moreso since emdashes became an AI tell.

I also see AIs use emdashes in places where parentheses, colons, or sentence breaks are simply more appropriate.

MisterTea•1h ago

Funny thing is I started using them in the last 5 or 6 years myself in place of commas where I wanted to interject some extra info. Of course I'm lazy and don't bother typing the actual em dash, I just use a regular dash. Now I feel gross using them because I don't want people thinking I turned my brain off.

vlovich123•1h ago

Wow what boring AI slop

bitwize•1h ago

How many of those are bots and how many of those are "fuck you, clankers" humans—like me?

cookiengineer•1h ago

> How many of those are bots and how many of those are "fuck you, clankers" humans—like me?

Maybe the em dash is the self censorship/deletion mechanism that we've all been waiting for. Better than having to write pill subscription ads, I suppose.

mm0lqf•1h ago

doesn't really mean anything, Mac randomly autocorrects dashes to em-dashes (caused me a world of pain once when it did that in a GUID in a config file)
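A config check along these lines might catch that kind of silent autocorrect before it causes pain. This is a sketch using Python's standard `uuid` module; the helper names `check_guid` and `non_ascii_positions` are hypothetical, and `uuid.UUID` is fairly lenient about formatting in general (it also accepts strings without hyphens), so it is not a strict format validator.

```python
import uuid


def check_guid(value: str) -> bool:
    """Return True if the standard library can parse `value` as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def non_ascii_positions(value: str) -> list:
    """List (index, code point) for every non-ASCII character in `value`."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(value) if ord(ch) > 127]


good = "123e4567-e89b-12d3-a456-426614174000"
# Simulate an editor replacing the first hyphen with an em dash (U+2014).
bad = good.replace("-", "\u2014", 1)
print(check_guid(good), check_guid(bad), non_ascii_positions(bad))
```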

marginalia_nu•1h ago

Are you saying new accounts are 10x more likely to be using macs? That would be quite a thesis.

antirez•1h ago

https://news.ycombinator.com/classic is every day more compelling.

limaho•1h ago

what is `/classic`?

antirez•41m ago

HN home page compiled only counting votes of old accounts.

Loughla•11m ago

I learned just right now that this isn't the default. I set my bookmark to HN in like 2011 before making an account, and apparently it's that one. I didn't realize that wasn't just the basic homepage but with a weird address for some reason.

CrzyLngPwd•1h ago

It's a predictable outcome, and it will get worse.

What will/can HN do about it?

jascha_eng•1h ago

One solution is to get rid of anonymity online, enforce validation of identity. Every human only gets 1 account. And then we still ban people that use AI. Might take a bit but eventually we'll have filtered out all the grifters.

If that's worth the cost... probably not?

OutOfHere•1h ago

Getting rid of anonymity is in time going to lead to getting rid of the platform, so do it if you're feeling suicidal. People seek real anonymity for good reason. Not everything should follow them in life or for life.

flowerbreeze•25m ago

I've been wondering too, what the solution would be. IF the bots were actually helpful, I wouldn't care, but they always push an agenda, create noise, or derail discussions instead.

For now maybe all forums should require some bloody swearing in each comment to at least prove you've got some damn human borne annoyance in you? It might even work against the big players for a little bit, because they have an incentive to have their LLMs not swearing. The monetary reward is after all in sounding professional.

Easy enough for any groups to overcome of course, but at least it'd be amusing for a while. Just watching the swear-farms getting set up in lower paid countries, mistakes being made by the large companies when using the "swearing enabled" models and all that.

mrktf•31m ago

It can crank proof of work schemes to maximum, something like you need to burn 15-20 minutes 16 core cpu to post a single comment. It will be infuriating for users, but not cheap for bots
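For readers unfamiliar with proof-of-work schemes, a hashcash-style version of the idea might look like the sketch below: the poster must find a nonce such that the SHA-256 hash of nonce plus comment starts with a given number of zero bits, which is expensive to find but cheap for the server to verify. The function names and the `nonce:comment` encoding are assumptions for illustration, and the difficulty needed to approximate the cost described above is not worked out here.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of `digest`."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(comment: str, nonce: int) -> bytes:
    return hashlib.sha256(f"{nonce}:{comment}".encode("utf-8")).digest()


def solve(comment: str, difficulty: int) -> int:
    """Brute-force a nonce; expected work doubles with each extra bit."""
    for nonce in count():
        if leading_zero_bits(_digest(comment, nonce)) >= difficulty:
            return nonce


def verify(comment: str, nonce: int, difficulty: int) -> bool:
    """Single hash, so checking a submission is cheap for the server."""
    return leading_zero_bits(_digest(comment, nonce)) >= difficulty
```

Because the nonce is bound to the comment text, a solved puzzle cannot simply be reused for a different comment; a real deployment would presumably also bind it to a server-issued challenge to prevent precomputation.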

CharlesW•1h ago

A couple thoughts:

(1) I don't recommend focusing disproportionately on one signal. They'll change, and are incredibly easy to optimize for. https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing

(2) I do recommend taking one minute to dash a note off to hn@ycombinator.com if you see suspicious patterns. Dang and our other intrepid mods are preternaturally responsive, and appear to appreciate the extra eyeballs on the problem.

5o1ecist•1h ago

> minute to dash a note

I support this dashing recommendation.

marginalia_nu•1h ago

I have sent them an email a few days ago about the state of /noobcomments.

This wasn't really intended as a "wow, dang is sure sleeping on the job", more as an interesting observation on the new bot ecosystem.

I also feel like there's a missing discussion about the comment quality on HN lately. It feels like it's dropped like crazy. Wanted to see if I could find some hard data to show I haven't gone full Terry Davis.

bakugo•47m ago

Is there even an incentive to optimize for such signals, though? Em-dashes have been a known indicator of AI-generated text for a good while, and are still extremely prevalent. While someone who doesn't like AI slop and knows what to look out for will notice and call out obvious AI comments, the unfortunate truth is that the majority of people simply cannot tell, and even among those who can, many don't care.

Obvious AI-generated posts and articles make it to the front page on a daily basis, and I get the impression that neither the average user nor the moderation team see that as a problem at all anymore.

yorwba•24m ago

The mods do care, but you have to email them or they won't necessarily notice.

Rooster61•1h ago

I would like to formally petition that the tech world at large replace "em-dash" with "clank" in all correspondence

dec0dedab0de•1h ago

like bang instead of exclamation point? or dot instead of period? I like it.

even though I used to like pointing out the difference between a hyphen and a period.

Rooster61•1h ago

It makes it much more fun to imagine a room full of robots in overcoats trying to pass off as human, but doing a terrible job due to the audible "clanks" betraying them from beneath the coat.

Spaces like HN then become a cacophony of clankers clanking as their numbers increase

716dpl•1h ago

As a typography nerd, I’m upset that my pedantism may get me labelled as a bot. (Yes, I just used a typographic apostrophe instead of a straight single quote.)

fuzzy2•1h ago

Yeah, same. I use an extended keyboard layout on my PC. I'm so used to it I have to actively decide against using proper quotes and dashes and whatnot. I don't bother on mobile, though.

Every time someone states they stop reading when they encounter proper typography, I feel attacked.

maurycyz•1h ago

Most people want to avoid looking like AI, but what if you want to blend in with the robot uprising?

I present ⸻ the U+2E3B dash.

isoprophlex•1h ago

The Big Chungus of dashes. Could this be the character that has the widest rendering?!

have_faith•1h ago

Unlikely in non-english languages (I seem to remember some super wide Arabic "single character" ones...?)

MarioMan•1h ago

Last I’d checked, “﷽” is the widest Unicode character.

OJFord•38m ago

It's rendering visibly narrower than the big dash up thread for me, on FF on Android. (Maybe HN's stripping one or more of the combining chars though, so it's not actually showing what you meant in full?)

yorwba•29m ago

It depends on the font, of course. Some renditions look like regular Arabic text, others are much narrower: https://fonts.google.com/?preview.text=%EF%B7%BD&script=Arab

MagicMoonlight•1h ago

That’s a big dash

5o1ecist•1h ago

> what if you want to blend in with the robot uprising.

There is nothing to fear, MY HUMAN FRIEND!

NoiseBert69•13m ago

We avoid censorship by ⸻ more often and talking to ⸻ about ⸻.

patjensen•1h ago

10x more likely to use EM-dashes -- built in Rust?

taeric•1h ago

Wow. This made me laugh far harder than I would have thought it would. Just wow.

SkyeCA•1h ago

If I see an em-dash in a comment I stop reading and I've seriously considered setting up a filter across multiple sites to remove any comments containing one.

I know there are legitimate usecases for the em-dash, but a few paragraphs (at most) of text in an HN/Reddit comment? Into the trash it goes.

stego-tech•1h ago

I had no idea what I was using were called “EM-dashes” until the AI bubble. I just used them to reflect pauses in my speech for tangents - an old habit from my IRC days.

Incidentally, some folks reported my stuff for potential AI generation and I had to respond to the mods about it. So that was kinda funny, if also sad to hear that some folks thought I was a bot.

I’m a dinosaur, not a robot dinosaur. I’m nowhere near that cool, alas.

devb•1h ago

> I just used them to reflect pauses in my speech for tangents - an old habit from my IRC days.

The tell here is that you used a hyphen, not an em-dash.

stego-tech•40m ago

Okay, see, that's context even I forget, but you're right and it bears repeating:

This `-` is a hyphen, which I love, even if I'm fairly sure I'm not using it correctly in grammar a lot of the time.

This `--` is an EM-Dash, apparently, which is also what I never use but I also thought was just a hyphen in a different context (incorrect!).

seabrookmx•1h ago

But the em-dash is a different character. I think even those that use a pause would just opt for - on their keyboard, whereas the em-dash — requires additional work on most (all?) keyboard layouts. It's _not_ more work for an AI though hence why it's a tell.
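Since the hyphen, the double hyphen and the various dashes keep getting mixed up in this subthread, a quick way to see that they really are different characters is to print their Unicode code points and names. A minimal sketch using Python's standard `unicodedata` module; `describe` is just a helper name chosen here.

```python
import unicodedata


def describe(ch: str) -> str:
    """Format a single character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The keyboard hyphen, the en dash, the em dash, and the extra-wide dash
# mentioned elsewhere in the thread (U+2E3B).
for ch in ("-", "\u2013", "\u2014", "\u2e3b"):
    print(describe(ch))

# Two hyphens typed in a row are two U+002D characters, not an em dash.
print("--" == "\u2014")
```

So a filter that looks for U+2014 will not match comments written with `-` or `--`, which is why the character works as a tell in the first place.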

atourgates•1h ago

Shoutout to my English Major comrades who have been using em-dashes forever, and have had to stop so we don't sound like AI.

If AI starts to use the New Yorker style diaeresis (umlaut-looking thing when there are two vowels in words like coöperate) I swear I'm gonna lose it.

bob1029•1h ago

I'd like to see a histogram of my HN em dash usage over time. Maybe someone could get bored and visualize the 2nd order effects described here.

scosman•1h ago

Agreed.

Join me in double-dash em proximates. Shows you manually typed it out with total disregard for token count and technical correctness.

atourgates•1h ago

Yes. To be fair, I was always a barbarian who just typed a hyphen in-place of an emdash and figured that was good enough. The only REAL em-dashes in my pre-AI writing are the result of autocorrect.

a4isms•1h ago

I worked for GitHub for a time. There was a cultural abhorrence of the diaeresis, it was considered reader-hostile and elitist. I refused to coöperate with that edict internally, although I grant that every company has the right to micro-manage communications with the public.

relaxing•46m ago

It is reader hostile and elitist.

Is there any good argument in favor of it, or any other house style quirks for that matter, other than in-group signaling?

hluska•9m ago

You’re replying to a troll - their entire argument was circular and self contradictory.

anotherlab•58m ago

I used to use em-dashes and en-dashes in my work emails and other writings, but stopped using them when they became AI markers.

OJFord•40m ago

> New Yorker style diaeresis

I was going to say that I respect it, but find it utterly absurd that they do that. But your comment made me look it up again—I had no idea it was just obsolete/archaïc (except in the New Yorker), I'd thought it was a language feature their 'style' guide had invented.

podgorniy•1h ago

Poor poor those typography-savvy people who did set a special keyboard in order to type "proper" dashes. I know you are there, I know your pain.

d4mi3n•1h ago

No fancy keyboard required, just a keystroke on Mac (`alt+shift+-`) and Linux (`right alt+something` depending on your distro).

5o1ecist•1h ago

As an AI language model, I am not able to perform dashes.

dematz•1h ago

One pattern I've noticed recently is sort of formulaic comments that look okish on their own, maybe a bit abstract/vague/bland, and not taking a particular side on good/bad in the way people like to do, but really obviously AI when you look at the account history and they're all the same formula:

>this is [summary]

>not just x, it's y

>punchy ending, maybe question

Once you know it's AI it's very obvious they told it to use normal dashes instead of em dashes, type in lowercase, etc., but it's still weirdly formal and formulaic.

For example from https://news.ycombinator.com/threads?id=snowhale

"this is the underreported second-order risk. Micron, Samsung, SK Hynix all allocated HBM capacity based on hyperscaler capex projections. NAND fabs are similarly committed. a 57% reduction in projected OpenAI spend (.4T -> B) doesn't just affect NVIDIA orders -- it ripples into the memory suppliers who shifted capacity to HBM and away from commodity DRAM/NAND. if multiple hyperscalers revise down simultaneously you get a situation similar to the 2019 crypto ASIC overhang: companies tooled up for demand that evaporated. not predicting that, but the purchasing commitments question is real."

duxup•1h ago

I've certainly noticed the summary posts.

I'll actually post a comment or question and I'll get a reply with a bit of a paragraph of what feels like a very "off" (not 'wrong' but strangely vague) summary of the topic ... and then maybe an observation or pointed agenda to push, but almost strangely disconnected from what I said.

One of the challenges is that yeah regular users don't get each other's meaning / don't read well as it is / language barriers. Yet the volume of posts I see where the other user REALLY isn't responding to the other person seems awfully high these days.

delichon•1h ago

AI generated content routinely takes sides. Their pretense of neutrality is no deeper than a typical homo sapien's. This is necessarily so in an entity that derives its values from a set of weights that distill human values. Maybe reasoning AI can overcome that some day, but to me that sounds like an enormous problem that may never be solved. If AI doesn't take sides like people do they still take sides in their own way. That only becomes obscure to the extent that their value judgments conflict with ours, and they are very good at aligning with the zeitgeist values, so can hide their biases better than we can.

I wonder if it is neural networks that are inherently biased, but in blind spots, and that applies to both natural and artificial ones. It may be that to approximate neutrality we or our machines have to leave behind the form of intelligence that depends on intrinsically biased weights and instead depend on logically deriving all values from first principles. I have low confidence that AI's can accomplish that any time soon, and zero confidence that natural intelligence can. And it's difficult to see how first principles regarding human values can be neutral.

I'm also skeptical that succeeding at becoming unbiased is a solution, and that while neutrality may be an epistemic advance, it also degrades social cohesion, and that neutrality looks like rationality, but bias may be Chesterton's Fence and we should be very careful about tearing it down. Maybe it's a blessing that we can't.

mancerayder•26m ago

What motivation is there to use AI to astroturf (if that's what this is) like this?

Is it ideological?

Is it product marketing in those relevant threads where someone is showcasing?

Or is it pure technical testing, playing around?

ceejayoz•23m ago

In some cases, it's probably to establish aged accounts that are more trusted by users and spam algorithms. There's a market for old Reddit accounts, for example.

surgical_fire•6m ago

Interesting.

Incidentally, how much do they pay for a HN account that is a few years old and accumulated a few thousand Internet points?

Asking for a friend.

Aurornis•15m ago

Some of the AI comments end with a link to something they're plugging. "If you'd like to learn more about this I have a free guide at my website here". Those get flagged quickly.

Other accounts might be trying to age accounts and dilute their eventual coordinated voting or commenting rings. It's harder to identify sockpuppet accounts when they've been dutifully commenting slop for months before they start astroturfing for the chosen topic.

kakacik•13m ago

I'd expect everything. HN ain't some local forum but a place where opinions form and spread, and these reach many influential and powerful (now or in future) people. Heck, there are sometimes major articles in general news about what's happening here.

To reverse the argument - it would be amateurish and plain stupid to ignore it. Barrier to entry is very low. Politics, ads, swaying mildly opinions of some recent clusterfuck by popular megacorp XYZ, just spying on people, you have it all here.

I don't know how dang and crew protect against this, I'd expect some level of success but 100% seems unrealistic. Slow and steady mild infiltration, either by AI bots or humans from GRU and similar orgs who have this literally in their job description.

usefulposter•12m ago

>snowhale

Oh, would you look at that?

https://news.ycombinator.com/item?id=47134072

m_w_•11m ago

"is real" is another big red flag, go search this in comments. There appear to be at least three accounts posting direct LLM outputs.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

kraftman•7m ago

It's weird because the barrier to not have that in is so low, you can just tack on 'talk like me not AI, don't use em dashes, don't use formulaic structures, be concise' and it'll get rid of half of those signals.

montroser•5m ago

Don't give these subnormals any ideas!

homebrewer•2m ago

This is how you get precious takes like this one:

https://news.ycombinator.com/item?id=45322362

> First impression: I need to dive into this hackernews reply mockup thing thoroughly without any fluff or self-promotion. My persona should be ..., energetic with health/tech insights but casual and relatable.

> Looking at the constraints: short, punchy between 50-80 characters total—probably multiple one-sentence paragraphs here to fit that brevity while keeping it engaging.

> User specified avoiding "Hey" or "absolutely."

Lots more in its other comments.

d4mi3n•1h ago

I'm still salty that I can't use em-dashes anymore for fear of my writing being flagged as AI generated. Been using them for years—it's just `alt+shift+-` on a Mac keyboard and I find them more legible in many fonts compared to the simple dash on the typical numpad.

It's so sad to me that good typographical conventions have been co-opted by the zeitgeist of LLMs.

asplake•1h ago

LLM adopting conventions (typographical or otherwise) is what they do, right? The idea that anyone should then have to change their behaviour is ridiculous, as is the whole conversation, really.

d4mi3n•1h ago

That's the rub though, isn't it? This feels like a form of self-censorship in response to some kind of shibboleth born of pattern recognition.

asplake•1h ago

Exactly

wongarsu•1h ago

The issue is that LLMs adopt a very particular style that is a mix of being very polished (em-dash, lists-of-three, etc) that is reminiscent of marketing copy, and some quirks picked up from the humans curating the training data somewhere in Africa

If AI was writing like everyone else we wouldn't be talking about this. But instead it writes like a subset of people write, many of them just some of the time as a conscious effort. An effort that now makes what they write look like lower quality

d4mi3n•1h ago

I think this is interesting in that I feel, grammatically and structurally, LLMs often generate _higher quality_ text than most humans do. What tends to be lower quality is the meaning of said texts.

Say what you want about marketing-isms of your typical LLM, they have been trained and often succeed at making legible, easy to scan blobs of text. I suspect if more LLM spam was curated/touched up, most people would be unable to distinguish it from human discourse. There are already folks commenting on this article discussing other patterns they use to detect or flag bots using LLMs.

pclmulqdq•1h ago

Em-dashes are a bit too conversational for formal prose, so they have always been looked down on aside from usage by AI.

OJFord•47m ago

Funnily enough I've actually started using them a little — it made me realise how much more legible/likable I find them.

(Until a few years ago I probably mostly only saw them in print, and I suppose it just never occurred to me that I liked them in particular vs. just the whole book being professionally typeset generally.)

basch•35m ago

are there really places that a comma, super-comma; or (parenthesis) don't work roughly as well? I find the em-dash mildly abhorrent, even before this all.

mroche•24m ago

> super-comma

This is the first time I've ever heard the character ";" referred to as such. It's always been "semi-colon" to me, is this a region/culture difference?

I'm not saying you're wrong, I find it interesting.

basch•13m ago

same character, used differently?

i call it a super comma when it's separating a list with commas within the sets.

so if i am listing colors like green, blue, red; foods like apple, orange, strawberry; and seasons like winter, summer, fall.

it's one use case for an em-dash, because whatever you have inside it has commas in the phrase.

square and rectangle situation. a supercomma is a subset of semicolon.

dang•31m ago

Just do it anyway—I always have, and always will.

Well, I haven't always—just for maybe 20 years.

bigyabai•20m ago

In a lot of ways, it feels like this is simply a fight for recognition that the Mac keyboard supports emdashes.

This wouldn't be an issue if mobile users or Windows users were exercising it too, but it's just Mac owners and LLMs. And Mac owners are probably the minority of instances where it is used.

embedding-shape•18m ago

People will accuse you of all kinds of stuff, regardless of whether you use em-dashes or not. The way I write apparently reads to some as LLM-jargon, they've told me. I'm guessing that's because I've spewed my views and writings on the internet for decades, and the LLMs were trained on the way I write, so actually the LLMs are copying me! And others like me.

But anyways, you can't really control how people see your stuff. If you're human I think the humanness will come through anyways, even if you have some particular structure or happen to use em-dashes sometimes. They're so easy to prompt around anyways that the real tricky LLM stuff to detect by sense and reading is the stuff where the prompter has been trying to sneakily make them more human.

wgm•2m ago

I totally agree. When I use em-dashes in my /family iMessage thread/ I get accused of having used ChatGPT to write my reply—my one-sentence reply about dinner plans. Dear Lord.

co_king_5•1h ago

[flagged]

dang•32m ago

We don't ban accounts for criticizing AI (or anything else). We ban them for breaking HN's rules, which you have a long history of creating accounts to do.

baxuz•1h ago

I have "—" bound to AltGR/right option + "-" for a decade now and I don't intend to stop using it.

https://practicaltypography.com/hyphens-and-dashes.html

I will not allow my good practices to get co-opted as AI "smoke tests".

comrade1234•1h ago

You can turn off iOS automatically converting dashes to em-dashes. It also turns off smart-quotes, which when used convert any SMS you send from the normal GSM-7 (7-bit) encoding to UCS-2 (16-bit), which roughly doubles the number of SMS messages you're sending in the background (even though they're stitched together to look like a single message).

To turn off Smart Punctuation: Home > Settings > General > Keyboard > Smart Punctuation > Off.
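To make the segment arithmetic concrete, here is a rough Python sketch. It assumes the usual SMS limits (160 GSM-7 characters or 70 UCS-2 code units for a single message, 153 or 67 per part once a message is split), and the `sms_segments` helper is made up for illustration. Real handsets and carriers may differ in details, for example national shift tables or how an extension character is handled at a part boundary.

```python
import math

# GSM 03.38 default alphabet (basic table) and the common extension
# characters, each of which costs two septets. Assumed here for illustration.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = "^{}\\[~]|€\f"


def sms_segments(text):
    """Return (encoding, number_of_parts) for a message body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        # Every character fits the 7-bit alphabet.
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        encoding, single, multi = "GSM-7", 160, 153
    else:
        # One non-GSM character (em dash, curly quote) forces the
        # whole message into 16-bit code units.
        units = len(text.encode("utf-16-be")) // 2
        encoding, single, multi = "UCS-2", 70, 67
    if units <= single:
        return encoding, 1
    return encoding, math.ceil(units / multi)


if __name__ == "__main__":
    plain = "a" * 100
    with_dash = "a" * 99 + "\u2014"  # same length, one em dash
    print(sms_segments(plain))
    print(sms_segments(with_dash))
```

Under these assumptions, a 100-character message that fits in one part as GSM-7 needs two parts as soon as a single em dash or curly quote sneaks in.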

eisa01•1h ago

Good thing I prefer en-dashes :)

meindnoch•1h ago

Anyone have a lobste.rs invite?

quentindanjou•1h ago

I used to love using em-dashes in my texts, especially in titles. Now I am way too afraid of appearing to use an LLM, even though I do my best to write everything myself :')

Bye bye em-dash, we had a nice run together.

I might start using that⸻one (a bit long...)

AyanamiKaine•1h ago

The thing I am most scared of is believing a comment, video, or picture is AI generated when it wasn't.

There is no real AI detection tool that works.

When we see something like em-dashes, it's simply the average of the text the models were trained on. If you fall into one of a model's averages, you're basically part of the model output. Yikes.

zippyman55•1h ago

I think they will remake the Japanese horror film Matango but instead of fungi, it will be those that use EM dashes to survive.

bee_rider•1h ago

700 is actually a pretty good sample size unless you are looking at some tiny crosstab, or there’s some skew (which you won’t naively scale your way out of anyway).

It is also interesting to note that the comparison is between recent comments and recent comments by new users. So, I guess this would take care of the objection that em-dashes (a perfectly fine piece of punctuation) have just been popularized by bots, and now are used more often by humans as well.

Maybe there is a bot problem. Seems almost impossible to fix for a site like this…
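As a back-of-the-envelope check on the sample size point, here is a small Python sketch of the textbook margin of error for a proportion under simple random sampling. The 95% level (z = 1.96) and the worst case p = 0.5 are assumptions for illustration, and `margin_of_error` is a made-up helper name, not anything from the article.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from n independent samples."""
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    for n in (100, 700, 7000):
        # Printed in percentage points.
        print(n, round(100 * margin_of_error(n), 1))
```

For n = 700 this comes out at roughly ±3.7 percentage points in the worst case. Ten times the data only narrows it by about a factor of three, and it does nothing about skew.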

marginalia_nu•1h ago

I think what a larger sample size would do is help capture changes over time. Humans tend to be more active at certain times of day, whereas bots don't tend to do that.

OutOfHere•1h ago

The fear is that AI-generated comments will collectively promote an agenda, often a political or exploitative agenda, on a scale that humans can't match or hope to counter.

What could help is a careful clique hunting algorithm to accurately identify and delete the entire clique.

5o1ecist•1h ago

Paid actors, regular people and primitive bots are already doing so plentifully and successfully.

Of course, all of the above can be replaced by AI, but it would not significantly alter the status quo.

iambateman•1h ago

TBH, I learned about how to use em dashes from the AI controversy and now I find them really useful.

I just hope my writing carries enough voice and perspective that people respond, even if there's an em dash or two.

bobomonkey•49m ago

I had a past life of drumming up community comments for engagement: The only thing that's changed is that humans are getting lazy and using AI. Fake comments have always been a thing.

cloverich•42m ago

I'm sure you can't share details but would be cool to hear more about it generally speaking, what worked and not etc. Especially if it involved HN.

Our company is being attacked rn in tech media and at least some of it, gut feeling wise, seems obviously sponsored / promoted by competitors. I know that's not surprising, but never watched it happen from this side before.

doe88•41m ago

I don't understand the purpose of these bots. Nihilism? Vandalism? At first I doubted it when people said that such and such comments were AI generated; I didn't understand the goal or the motives, so I thought it couldn't be. But lately I've understood how dead wrong I was: we are submerged, eaten by a sea of these useless comments.

im3w1l•18m ago

The goal is likely to be able to astroturf with aged accounts down the line.

AstroBen•3m ago

You can control the major narrative on social media — about anything you want

What we think others around us think has a big effect on our own behavior

andrewmthomas87•30m ago

My truth is that the LLM usage of em-dashes doesn’t seem excessive. If anything, the kind of text generated by LLMs (somewhat informal, expressive) calls for em-dashes at a higher frequency.

dang•29m ago

Show HN: Hacker News em dash user leaderboard pre-ChatGPT - https://news.ycombinator.com/item?id=45071722 - Aug 2025 (266 comments)

... which I'm proud to say originated here: https://news.ycombinator.com/item?id=45046883.

hartator•28m ago

Biggest tell that a comment is AI: it's deeply uninteresting.

No one wants to read your ChatGPT outputs.

dieselgate•22m ago

I get the punchline here but is there possibly some sort of Streisand effect where real people now are more inclined to use an em dash?

camdenreslink•15m ago

I think people are now less inclined to use an em dash, because they don’t want to be mistaken for an LLM.

technotony•21m ago

Why? What's the incentive/value to commenting here with AI?

marginalia_nu•7m ago

If you control a bunch of established accounts, you can use them to either shill for products, or upvote certain topics.

beart•6m ago

- Spam a product/service

- Generate age so spamming a product/service is easier and the account appears more trustworthy

- Influence discussions in a particular direction for monetary gain, i.e. "I got rich on bitcoin, you'd be crazy not to invest".

- Influence discussions in a particular direction for political gain, i.e. "I went to Xinjiang and the Uyghurs couldn't be happier!"

HardwareLust•11m ago

I'm just going to continue to mis-use the en-dash like I've always done.

marginalia_nu•8m ago

Fwiw I did some more comparisons, looking for words disproportionately favored by noob comments:

    word          noob     new     p-value
    ------------------------------------------
    ai            14.93%   7.87%   p=0.00016
    actually      12.53%   5.34%   p=1.1e-05
    code          11.47%   6.04%   p=0.00081
    real          10.93%   2.95%   p=2.6e-08
    built         10.93%   2.11%   p=2.1e-10
    data          8.93%    3.51%   p=6.1e-05
    tools         7.6%     2.67%   p=5.5e-05
    agent         7.47%    2.95%   p=0.00024
    app           7.2%     3.09%   p=0.00078
    tool          6.8%     1.83%   p=8.5e-06
    model         6.8%     2.39%   p=0.00013
    agents        6.67%    2.11%   p=5.2e-05
    api           6.53%    1.12%   p=2.7e-07
    building      6.13%    1.54%   p=1.3e-05
    full          6.0%     1.97%   p=0.00017
    across        5.87%    1.4%    p=1.3e-05
    interesting   5.33%    1.54%   p=0.00014
    answer        5.2%     1.4%    p=9.6e-05
    simple        4.93%    1.54%   p=0.00043
    project       4.8%     1.26%   p=0.00015
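
For anyone wanting to run this kind of comparison themselves, here is a minimal Python sketch of a two-proportion z-test. The comment does not say which test produced the p-values above, so this is just one common choice and its results need not match the table. The counts in the example are illustrative, picked to be roughly in line with the "real" row; the actual sample sizes are not stated here.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    hits_a / n_a and hits_b / n_b are the number of comments that
    contain the word and the number of comments sampled in each group.
    Returns (z, p_value).
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 82 of 750 noob comments vs 21 of 712 new comments.
    z, p = two_proportion_z_test(82, 750, 21, 712)
    print(f"z={z:.2f} p={p:.2g}")
```

With twenty words tested at once, some correction for multiple comparisons (Bonferroni, say) would be a sensible extra step before reading much into any single row.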

simonw•3m ago

The data is available in a SQLite database on GitHub: https://github.com/vlofgren/hn-green-clankers

You can explore the underlying data using SQL queries in your browser here: https://lite.datasette.io/?url=https%253A%252F%252Fraw.githu... (that's Datasette Lite, my build of the Datasette Python web app that runs in Pyodide in WebAssembly)

Here's a SQL query that shows the users in that data that posted the most comments with at least one em dash - the top ones all look like legitimate accounts to me: https://lite.datasette.io/?url=https%3A%2F%2Fraw.githubuserc...
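
A query of that shape can be sketched with Python's built-in `sqlite3` module. The table and column names below (`comments`, `author`, `text`) are hypothetical stand-ins, not the actual schema of the linked database, and the rows are toy data.

```python
import sqlite3

EM_DASH = "\u2014"


def top_em_dash_users(conn, limit=10):
    """Authors ranked by number of comments containing an em dash.

    Assumes a hypothetical table comments(author TEXT, text TEXT).
    """
    query = """
        SELECT author, COUNT(*) AS em_dash_comments
        FROM comments
        WHERE instr(text, ?) > 0
        GROUP BY author
        ORDER BY em_dash_comments DESC, author ASC
        LIMIT ?
    """
    return conn.execute(query, (EM_DASH, limit)).fetchall()


def demo():
    # In-memory database with a few toy rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (author TEXT, text TEXT)")
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?)",
        [
            ("alice", "Just do it anyway" + EM_DASH + "I always have."),
            ("alice", "Another one " + EM_DASH + " with spaces."),
            ("bob", "Plain hyphen - nothing to see."),
            ("carol", "One dash" + EM_DASH + "only."),
        ],
    )
    return top_em_dash_users(conn)


if __name__ == "__main__":
    for author, count in demo():
        print(author, count)
```

Counting comments rather than individual dashes keeps one very dash-heavy comment from dominating the ranking.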

