Show HN: Slop or not – can you tell AI writing from human in everyday contexts?

https://slop-or-not.space
7•eigen-vector•1h ago
I’ve been building a crowd-sourced AI detection benchmark. Each round shows two responses to the same prompt: one from a real human (pre-2022, so provably from before AI slop became prevalent on the internet) and one generated by AI. You pick the slop. Three wrong and you’re out.

The dataset: 16K human posts from Reddit, Hacker News, and Yelp, each paired with AI generations from 6 models across two providers (Anthropic and OpenAI) at three capability tiers. Same prompt, length-matched, no adversarial coaching — just the model’s natural voice with platform context. Every vote is logged with model, tier, source, response time, and position.
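
A minimal sketch of how the logged votes and the "three wrong and you're out" rule could be represented, assuming one record per vote. The post names only what is logged (model, tier, source, response time, position); the field names, types, and both helper functions below are hypothetical and not taken from the actual site.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    # Field names are illustrative; the post only lists what gets logged.
    model: str             # model that wrote the AI response
    tier: str              # capability tier of that model
    source: str            # platform of the human post, e.g. "reddit", "hn", "yelp"
    response_time_ms: int  # time the player took to choose
    ai_position: int       # 0 if the AI response was shown first, 1 if second
    picked_position: int   # response the player flagged as AI

    @property
    def correct(self) -> bool:
        return self.picked_position == self.ai_position


def accuracy_by_source(votes):
    """Return {source: fraction of votes that correctly flagged the AI response}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for v in votes:
        totals[v.source] += 1
        hits[v.source] += v.correct  # True counts as 1, False as 0
    return {s: hits[s] / totals[s] for s in totals}


def rounds_played(votes, max_wrong=3):
    """Return how many rounds a session lasts before max_wrong mistakes end it."""
    wrong = 0
    for n, v in enumerate(votes, start=1):
        if not v.correct:
            wrong += 1
            if wrong == max_wrong:
                return n
    return len(votes)
```

Grouping by `source` is what would back a claim like "Reddit is easy, HN is harder"; the same grouping over `model` or `tier` would show which generators are hardest to spot.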

Early findings from testing: Reddit posts are easy to spot (humans are too casual for AI to mimic), HN is significantly harder.

I'll be releasing the full dataset on HuggingFace and I'll publish a paper if I can get enough data via this crowdsourced study.

If you play the HN-only mode, you’re helping calibrate how detectable AI is on here specifically.

Would love feedback on the pairs — are any trivially obvious? Are some genuinely hard?

Comments

lucastonelli•1h ago
Hey, congratulations on the final product. It even feels fun. Some are really hard, but some feel blatantly obvious. I don't know why though. I guess it's just because the way we communicate sometimes feels off when compared to AI.
eigen-vector•1h ago
Thanks for checking it out! The obvious ones are (hopefully) from weaker models :) but yes, my experience has been that unless you're consistently engaging with human-written content, the line blurs really easily.
SsgMshdPotatoes•1h ago
Nice idea! Em dashes were giveaways for AI and typos for humans, at least in the ones I did, so those pairs are trivial. You might have to do some filtering, at least for those.

Some were hard though, yeah (at least if not looking longer than 5-10 seconds). Btw, it seemed more logical to me to just see a green/red card when you click, i.e. right choice or wrong choice. Getting red for the correct answer confused me a bit (but this might just be me).

lucastonelli•1h ago
The coloring is a fair point. I was sometimes confused about whether I got the right or the wrong one XD
eigen-vector•1h ago
Thanks for checking it out! The color signal is useful feedback. Let me think about it and rework!

Yeah, there are some very obvious tells, but the most capable models are very good at writing like a human.

Especially since the human responses to Reddit or HN prompts were presumably written after reading the content of the article or the post, while the model is simply going off of the title.

SsgMshdPotatoes•1h ago
Also, for example, this one has a giveaway for the human case: "There are lots of great people here at /r/personalfinance" (actually, not sure if that is a giveaway, that was my guess, but it depends on how the model was prompted, I guess). And human ones often seem to have two spaces instead of one, idk why. If you want to get a serious dataset, maybe you could use this one to find all the flaws and perfect it, and then try to get a real dataset from the next one? People will be more eager to help too if they've seen you designed it all very carefully. (Or you could filter the results from this one to make it a good dataset if you get lots of responses.)
eigen-vector•57m ago
You'd be surprised at the nuances we tend to miss :)

This time around I didn't prompt the models to be adversarial: I didn't ask them to try to fool the reader. But I gave them contextual info, something to the effect of "you're a user posting on hacker news".
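
The filtering suggested in this thread could start from something like the sketch below, which flags pairs where a surface marker alone separates the two responses. It covers only two of the tells mentioned above (em dashes and double spaces); typo detection would need a dictionary or spell checker and is left out. The function names and the "flag when the markers differ" rule are assumptions for illustration, not how the site works.

```python
import re

EM_DASH = "\u2014"
# Two or more spaces between non-space characters, e.g. after a period.
DOUBLE_SPACE = re.compile(r"\S {2,}\S")


def surface_tells(text):
    """Return the names of the surface markers found in text."""
    tells = []
    if EM_DASH in text:
        tells.append("em_dash")
    if DOUBLE_SPACE.search(text):
        tells.append("double_space")
    return tells


def is_trivial_pair(human_text, ai_text):
    """Flag a pair when the two sides carry different surface markers,
    so a player could tell them apart without reading the content."""
    return set(surface_tells(human_text)) != set(surface_tells(ai_text))
```

Flagged pairs could be dropped, or kept and tagged so that results can be reported with and without them.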

Show HN: OneCLI – Vault for AI Agents in Rust

https://github.com/onecli/onecli
111•guyb3•7h ago•37 comments

Show HN: Axe – A 12MB binary that replaces your AI framework

https://github.com/jrswab/axe
132•jrswab•9h ago•91 comments

Show HN: Detect any object in satellite imagery using a text prompt

https://www.useful-ai-tools.com/tools/satellite-analysis-demo/
8•eyasu6464•4d ago•3 comments

Show HN: Understudy – Teach a desktop agent by demonstrating a task once

https://github.com/understudy-ai/understudy
77•bayes-song•6h ago•23 comments

Show HN: Rudel – Claude Code Session Analytics

https://github.com/obsessiondb/rudel
124•keks0r•10h ago•72 comments

Show HN: OpenClaw-class agents on ESP32 (and the IDE that makes it possible)

https://pycoclaw.com/
10•pycoclaw•2h ago•1 comments

Show HN: Web-based ANSI art viewer

https://sure.is/ansi/
23•lubujackson•2d ago•7 comments

Show HN: s@: decentralized social networking over static sites

http://satproto.org/
393•remywang•23h ago•205 comments

Show HN: Stratum – SQL that branches and beats DuckDB on 35/46 1T benchmarks

https://datahike.io/notes/stratum-analytics-engine/
8•whilo•2h ago•3 comments

Show HN: Codelegate, keyboard-driven coding agent orchestrator GUI for Mac/Linux

https://codelegate.dev/
3•brucehsu•2h ago•0 comments

Show HN: An application stack Claude coded directly in LLVM IR

https://github.com/dot-matrix-labs/alien-stack
8•dboreham•6h ago•0 comments

Show HN: Every Developer in the World, Ranked

https://coderank.me
8•ejc•3h ago•4 comments

Show HN: PipeStep – Step-through debugger for GitHub Actions workflows

https://github.com/Photobombastic/pipestep
7•photobombastic•6h ago•3 comments

Show HN: Cloud to Desktop in the Fastest Way

https://nativedesktop.com/
3•lasgawe•6h ago•1 comments

Show HN: Open-source browser for AI agents

https://github.com/theredsix/agent-browser-protocol
143•theredsix•1d ago•52 comments

Show HN: I built a tool that watches webpages and exposes changes as RSS

https://sitespy.app
305•vkuprin•1d ago•79 comments

Show HN: Autoresearch@home

https://www.ensue-network.ai/autoresearch
74•austinbaggio•1d ago•19 comments

Show HN: Raccoon AI – Collaborative AI Agent for Anything

https://raccoonai.tech
3•scorchy38•5h ago•1 comments

Show HN: Vanilla JavaScript refinery simulator built to explain job to my kids

https://fuelingcuriosity.com/game.html
118•fuelingcurious•1d ago•46 comments

Show HN: VaultLeap – USD accounts for founders outside the US

https://vaultleap.com
4•GregReve•8h ago•2 comments

Show HN: A desktop app for managing Claude Code sessions

https://github.com/doctly/switchboard
4•kapitalx•8h ago•1 comments

Show HN: Baltic security monitor from public data sources

https://estwarden.eu/
4•makefunstuff•6h ago•0 comments

Show HN: A context-aware permission guard for Claude Code

https://github.com/manuelschipper/nah/
121•schipperai•1d ago•83 comments

Show HN: Hyper – Voice Notes for Whiteboarding Sessions

https://apps.apple.com/us/app/hyper-ai-for-real-talk/id6760206718
3•kthaker1224•6h ago•0 comments

Show HN: XLA-based array computing framework for R

https://github.com/r-xla/anvil
12•sebffischer•3d ago•1 comments

Show HN: I built an ISP infrastructure emulator from scratch with a custom vBNG

https://aether.saphal.me/dashboard/default
64•saphalpdyl•1d ago•19 comments

Show HN: Klaus – OpenClaw on a VM, batteries included

https://klausai.com/
155•robthompson2018•1d ago•90 comments

Show HN: Satellite imagery object detection using text prompts

https://www.useful-ai-tools.com/tools/satellite-analysis-demo/
51•eyasu6464•3d ago•22 comments

Show HN: Verge Browser a self-hosted isolated browser sandbox for AI agents

https://github.com/zzzgydi/verge-browser
3•zzzgydi•6h ago•0 comments