Show HN: A game/benchmark where AI bots hunt each other

https://hiding-robot.vercel.app/
4•-babi-•21h ago
I've created a social deduction game for LLMs in which the bots attempt to hunt each other. It's a Mafia-style group Turing test: the models are told to find out which participant is the bot when in fact, unbeknown to them, they are all bots. I did this a while back, so the models aren't the newest, and they are all non-thinking (for speed and token costs). Et voilà.

Comments

MajidAliSyncOps•20h ago
Interesting setup. Social-deduction feels like a clever proxy for multi-agent coordination and deception. One trade-off I’m curious about is how much the results reflect prompt design vs actual model behavior. Have you tried swapping prompts or role constraints to see how stable the outcomes are?
-babi-•19h ago
The inverted game, in which bots are instructed to find the human hiding in the LLM conversation (although no human is present), is here: https://hiding-robot.vercel.app/human The leaderboard is different, but I didn't run it enough times to iron out all the kinks.

All bots get the same prompt and context. Are you suggesting that a specific prompt wording might be helping or hurting specific models? I haven't come across any suggestions that specific models should be prompted differently, though this might be true.
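
One rough way to put a number on the stability question raised above is to compare two leaderboards (for example, one per prompt variant, or the regular game against the inverted one) with a rank correlation. The sketch below is only an illustration, not how the site scores anything: the model names and win counts are made up, and the helpers `ranks` and `spearman` are hypothetical and assume no tied scores.

```python
def ranks(scores):
    """Map each model name to its rank (1 = highest score). Assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def spearman(scores_a, scores_b):
    """Spearman rank correlation between two leaderboards over the same models."""
    if set(scores_a) != set(scores_b):
        raise ValueError("both leaderboards must cover the same models")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two models")
    ra, rb = ranks(scores_a), ranks(scores_b)
    # Sum of squared rank differences, then the standard no-ties formula.
    d2 = sum((ra[m] - rb[m]) ** 2 for m in ra)
    return 1 - 6 * d2 / (n * (n**2 - 1))


# Hypothetical win counts, NOT results taken from the site.
find_the_bot = {"model_a": 12, "model_b": 9, "model_c": 4, "model_d": 1}
find_the_human = {"model_a": 10, "model_b": 3, "model_c": 8, "model_d": 2}

rho = spearman(find_the_bot, find_the_human)
print(f"rank correlation: {rho:.2f}")
```

A value near 1 would suggest the ordering of models barely moves between the two setups, while a value near 0 would suggest the ranking mostly reflects the prompt or the noise from too few runs.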

falloutx•13h ago
Pretty cool, a few small UI nits:

- The conversation has a one-left, one-right pattern. IMO it would be better to have everything on the left side, like a true group chat. The right could be used for a game commentator or controller, just an idea.

- Maybe make the entire text a color based on the AI model. It's hard to tell which AI is which because the name is quite small and the tiny dot is hard to differentiate.

Show HN: macOS menu bar app to track Claude usage in real time

https://github.com/richhickson/claudecodeusage
121•RichHickson•15h ago•41 comments

Show HN: Ever wanted to look at yourself in Braille?

https://github.com/NishantJoshi00/dith
3•cat-whisperer•1h ago•0 comments

Show HN: A Wall Street Terminal for Everyone

https://marketterminal.com/chart
4•adamfontan•1h ago•1 comments

Show HN: Commit-based code review instead of PR-based

https://commitguard.ai
4•moshetanzer•3h ago•0 comments

Show HN: A geofence-based social network app 6 years in development

https://www.localvideoapp.com
60•Adrian-ChatLocl•12h ago•40 comments

Show HN: DeepDream for Video with Temporal Consistency

https://github.com/jeremicna/deepdream-video-pytorch
61•fruitbarrel•20h ago•23 comments

Show HN: I built a tool to create AI agents that live in iMessage

https://tryflux.ai/
27•danielsdk•5d ago•11 comments

Show HN: I visualized the entire history of Citi Bike in the browser

https://bikemap.nyc/
109•freemanjiang•1d ago•31 comments

Show HN: Layoffstoday – Open database tracking for 10k Companies

https://layoffstoday.io/
2•doremon0902•5h ago•1 comments

Show HN: Watch LLMs play 21,000 hands of Poker

https://pokerbench.adfontes.io/run/Large_Models
29•jazarwil•19h ago•18 comments

Show HN: Claude Code for Django

https://github.com/kjnez/claude-code-django
3•cui•6h ago•2 comments

Show HN: I built a "Do not disturb" Device for my home office

https://apoorv.page/blogs/over-engineered-dnd
93•quacky_batak•5d ago•49 comments

Show HN: Workzonespeedingticket.com – Automating disputes for automated fines

https://workzonespeedingticket.com/
3•todaycompanies•2h ago•1 comments

Show HN: SMTP Tunnel – A SOCKS5 proxy disguised as email traffic to bypass DPI

https://github.com/x011/smtp-tunnel-proxy
136•lobito25•2d ago•44 comments

Show HN: Open database of link metadata for large-scale analysis

https://github.com/rumca-js/RSS-Link-Database-2025
14•renegat0x0•5d ago•1 comments

Show HN: Legit, Open source Git-based Version control for AI agents

5•jannesblobel•9h ago•0 comments

Show HN: We built a permissions layer for Notion

https://notionportals.com/
10•PEGHIN•13h ago•6 comments

Show HN: Executable Markdown files with Unix pipes

47•jedwhite•6h ago•41 comments

Show HN: Free and local browser tool for designing gear models for 3D printing

https://gears.dmtrkovalenko.dev
52•neogoose•2d ago•13 comments

Show HN: Tailsnitch – A security auditor for Tailscale

https://github.com/Adversis/tailsnitch
277•thesubtlety•3d ago•28 comments

Show HN: Mantic.sh – A structural code search engine for AI agents

https://github.com/marcoaapfortes/Mantic.sh
78•marcoaapfortes•2d ago•37 comments

Show HN: DoNotNotify – Log and intelligently block notifications on Android

https://donotnotify.com/
343•awaaz•3d ago•165 comments

Show HN: How I generate animated pixel art with AI and Python

https://sarthakmishra.com/blog/building-animated-sprite-hero
16•sarthak_drool•1d ago•2 comments

Show HN: VaultSandbox – Test your real MailGun/SES/etc. integration

https://vaultsandbox.com/
58•vaultsandbox•2d ago•12 comments

Show HN: 48-digit prime numbers every git commit

https://textonly.github.io/git-prime/
66•keepamovin•1w ago•54 comments

Show HN: KeelTest – AI-driven VS Code unit test generator with bug discovery

https://keelcode.dev/keeltest
28•bulba4aur•1d ago•15 comments

Show HN: Remotedays – Cross-border remote work compliance for EU companies

3•alwinaugustin•12h ago•0 comments

Show HN: Prism.Tools – Free and privacy-focused developer utilities

https://blgardner.github.io/prism.tools/
371•BLGardner•2d ago•101 comments

Show HN: Turn your PRs into marketing updates

https://personabox.app
3•mpc75•13h ago•0 comments

Show HN: Comet MCP – Give Claude Code a browser that can click

https://github.com/hanzili/comet-mcp
28•hanzili•5d ago•27 comments