Show HN: Slop or not – can you tell AI writing from human in everyday contexts?

https://slop-or-not.space
5•eigen-vector•1h ago
I’ve been building a crowd-sourced AI detection benchmark. You see two responses to the same prompt: one from a real human (written before 2022, so provably before AI slop became prevalent on the internet) and one generated by AI. You pick the slop. Three wrong and you’re out.

The dataset: 16K human posts from Reddit, Hacker News, and Yelp, each paired with AI generations from 6 models across two providers (Anthropic and OpenAI) at three capability tiers. Same prompt, length-matched, no adversarial coaching — just the model’s natural voice with platform context. Every vote is logged with model, tier, source, response time, and position.
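
A minimal sketch of how a vote log like the one described above could be represented and summarized per source. The field names, the `picked_position` field, and the sample rows are assumptions made for illustration; they are not the project's actual schema or real results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    model: str            # hypothetical label for the generating model
    tier: str             # capability tier (tier names here are assumed)
    source: str           # "reddit", "hn", or "yelp"
    response_ms: int      # time the player took to answer
    ai_position: int      # 0 if the AI text was shown first, 1 if second
    picked_position: int  # side the player flagged as AI (assumed field)


def detection_rate_by_source(votes):
    """Return {source: fraction of votes that correctly picked the AI text}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for v in votes:
        total[v.source] += 1
        if v.picked_position == v.ai_position:
            correct[v.source] += 1
    return {s: correct[s] / total[s] for s in total}


# Made-up rows, only to show the shape of the data.
votes = [
    Vote("model-a", "large", "reddit", 4200, 0, 0),
    Vote("model-a", "large", "reddit", 3900, 1, 1),
    Vote("model-b", "small", "hn", 8100, 0, 1),
    Vote("model-b", "small", "hn", 7600, 1, 1),
]
rates = detection_rate_by_source(votes)
```

Logging the display position alongside the pick also makes it possible to check whether players simply favor one side of the screen.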

Early findings from testing: Reddit posts are easy to spot (humans are too casual for AI to mimic), while HN is significantly harder.

I'll be releasing the full dataset on HuggingFace and I'll publish a paper if I can get enough data via this crowdsourced study.

If you play the HN-only mode, you’re helping calibrate how detectable AI is on here specifically.

Would love feedback on the pairs — are any trivially obvious? Are some genuinely hard?

Comments

lucastonelli•59m ago
Hey, congratulations on the final product. It even feels fun. Some are really hard, but some feel blatantly obvious. I don't know why, though. I guess it's just that the way we communicate sometimes feels off when compared to AI.
eigen-vector•38m ago
Thanks for checking it out! The obvious ones are (hopefully) from the weaker models :) But yes, my experience has been that unless you're consistently engaging with human-written content, the line blurs really easily.
SsgMshdPotatoes•43m ago
Nice idea! Em dashes were giveaways for AI and typos for humans, at least in the ones I did, so those pairs are trivial. You might have to do some filtering for those.

Some were hard though, yeah (at least if not looking longer than 5-10 seconds). Btw, it seemed more logical to me to just see a green/red card when you click, i.e. right choice or wrong choice. Getting red for the correct answer confused me a bit (but this might just be me).

lucastonelli•33m ago
The coloring is a fair point. I was sometimes confused about whether I got the right or the wrong one XD
eigen-vector•31m ago
Thanks for checking it out! The color signal is useful feedback. Let me think about it and rework!

Yeah, there are some very obvious tells, but the most capable models are very good at writing like a human.

That's especially true since the human responses to Reddit or HN prompts were presumably written after reading the article or the post, while the model is simply going off the title.

SsgMshdPotatoes•24m ago
Also, for example, this one has a giveaway for the human case: "There are lots of great people here at /r/personalfinance" (actually, I'm not sure that is a giveaway; that was my guess, but it depends on how the model was prompted). And the human ones often seem to have two spaces instead of one, idk why. If you want to get a serious dataset, maybe you could use this round to find all the flaws and perfect it, and then try to get a real dataset from the next one? People will be more eager to help too if they've seen you designed it all very carefully. (Or you could filter the results from this one to make it a good dataset if you get lots of responses.)
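
A minimal sketch of the kind of filtering suggested in this thread, assuming pairs are stored as `(human_text, ai_text)` tuples. It only checks the two surface cues mentioned here (em dashes and doubled spaces); the function names are hypothetical, and this is not the project's actual pipeline.

```python
import re

EM_DASH = "\u2014"
# Two or more spaces sitting between non-space characters.
DOUBLE_SPACE = re.compile(r"(?<=\S) {2,}(?=\S)")


def surface_tells(text):
    """Return the names of surface-level cues found in text."""
    tells = []
    if EM_DASH in text:
        tells.append("em_dash")
    if DOUBLE_SPACE.search(text):
        tells.append("double_space")
    return tells


def drop_trivial_pairs(pairs):
    """Keep only (human, ai) pairs where neither side shows a listed cue."""
    return [(h, a) for h, a in pairs if not surface_tells(h) and not surface_tells(a)]
```

Dropping a pair entirely, rather than rewriting the text, leaves the human posts unedited at the cost of a smaller dataset.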
eigen-vector•13m ago
You'd be surprised at the nuances we tend to miss :)

This time around I didn't prompt the models to be adversarial: I didn't ask them to try to fool the reader. But I gave them contextual info, something to the effect of "you're a user posting on hacker news".

Crypto investor turns $50M into $36,000 in one botched move

https://www.coindesk.com/markets/2026/03/12/crypto-investor-turns-usd50-million-into-usd36-000-in...
1•scrlk•40s ago•0 comments

US carrying out rescue effort after military aircraft crash in Iraq

https://www.reuters.com/world/middle-east/us-carrying-out-rescue-effort-after-losing-aircraft-ira...
1•tartoran•41s ago•0 comments

Steve Yegge Wants You to Stop Looking at Your Code

https://www.oreilly.com/radar/steve-yegge-wants-you-to-stop-looking-at-your-code/
1•metadat•1m ago•0 comments

Anthropic and OpenAI just exposed SAST's structural blind spot with free tools

https://venturebeat.com/security/anthropic-openai-sast-reasoning-scanners-security-directors-guide
1•mooreds•1m ago•0 comments

US fuel tanker aircraft crashes in Iraq – what we know and don't know

https://www.bbc.com/news/live/c4gqjyk0vx3t
1•tartoran•1m ago•0 comments

Before you let AI agents loose, you'd better know what they're capable of

https://thenewstack.io/risk-mitigation-agentic-ai/
1•chhum•5m ago•0 comments

To use your brain, first accept the Terms and Conditions

https://ctlj.colorado.edu/?p=1460
1•hhs•6m ago•0 comments

Spacedrive v3: The local-first data engine

https://spacedrive.com/blog/spacedrive-v3-launch
1•raybb•7m ago•0 comments

I'll probably never use Windows

https://waspdev.com/articles/2026-03-12/i-ll-probably-never-use-widows
1•senfiaj•7m ago•0 comments

London Man wore smart glasses for High Court 'coaching'

https://www.bbc.co.uk/news/articles/cj6d4k65ky5o
1•bsdz•8m ago•0 comments

France's ghost car scandal that allowed one million illegal vehicles onto the

https://www.bbc.com/news/articles/cpqw0jgn5xgo
1•absqueued•9m ago•0 comments

I Built a Modern Million Dollar Homepage with Pixel Wars

https://pixelboard-xxx.lovable.app
1•BeParent•10m ago•1 comments

How long does it take to get last liquid drops from kitchen containers?

https://www.brown.edu/news/2026-03-04/kitchen-fluid-dynamics
2•hhs•13m ago•0 comments

Cameyo by Google: Run any legacy application and turns it into a PWA

https://cameyo.google/
1•devy•13m ago•0 comments

Vibe in Go – It's the only way

https://yagnipedia.com/wiki/vibe-in-go
2•riclib•13m ago•2 comments

Trump's AI-Powered World Wars

https://theintercept.com/2026/03/11/podcast-trump-ai-world-wars/
2•agarttha•14m ago•0 comments

Live Nation Executives Brag About "Robbing" Ticket Buyers in Slack DMs

https://pitchfork.com/news/live-nation-executives-brag-about-robbing-ticket-buyers-in-slack-dms/
2•cdrnsf•14m ago•0 comments

The Force Multiplier Generation

https://twitter.com/SirMoremoney/status/2030763229815963870
1•snoren•17m ago•0 comments

Daily multivitamin use may slow biological aging: COSMOS trial results

https://www.massgeneralbrigham.org/en/about/newsroom/press-releases/daily-multivitamin-use-may-sl...
2•hhs•20m ago•0 comments

Fresh Open Claw Documentation

1•manos-saratsis•23m ago•0 comments

Wvw.dev: world vibe web – A free and OSS federated app store for vibecoded apps

https://wvw.dev
1•fka•24m ago•1 comments

Simulating Catalog and Table Conflicts in Iceberg

https://cdouglas.github.io/posts/2026/03/catalog
1•karsinkk•25m ago•0 comments

DOGE Operative Accused of Planning to Take Social Security Data Is Named

https://www.wired.com/story/john-solly-doge-operative-accused-social-security-data-leidos/
8•afavour•26m ago•1 comments

Replit's Jordanian Immigrant Billionaire Founder Shakes Up Vibe Coding

https://www.forbes.com/sites/richardnieva/2026/03/11/meet-the-9-billion-ai-company-reimagining-vi...
1•abdelhousni•26m ago•0 comments

Ask HW: Claude Code design tools

1•dogclaw•27m ago•0 comments

Pi-Autoresearch

https://github.com/davebcn87/pi-autoresearch
1•tin7in•28m ago•0 comments

The AI coding divide: craft lovers vs. result chasers

https://blog.lmorchard.com/2026/03/11/grief-and-the-ai-split/
5•avernet•29m ago•0 comments

Apollo's Private Credit Logic Is a Lot Like Goldman

https://www.bloomberg.com/opinion/articles/2026-03-12/private-credit-apollo-logic-on-loan-values-...
2•petethomas•33m ago•0 comments

Show HN: Tokemon, a terminal dashboard to track LLM token usage

1•mm65•33m ago•0 comments

Learning Is Forgetting; LLM Training as Lossy Compression

https://openreview.net/forum?id=tvDlQj0GZB
1•pera•34m ago•0 comments