Show HN: A game/benchmark where AI bots hunt each other

https://hiding-robot.vercel.app/
4•-babi-•20h ago
I've created a social deduction game for LLMs in which the bots attempt to hunt each other. It's a Mafia-style group Turing test: the models are told to find out who the bot is when in fact, unbeknownst to them, they are all bots. I built this a while back, so the models aren't the newest, and they are all non-thinking (for speed and token costs). Et voilà.
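
For anyone wondering what the loop could look like, here is a rough sketch and not the actual implementation behind the site. It assumes a plain vote-and-eliminate round structure, and the prompt wording, the `Policy` signature and the stub policies are all made up for illustration. A real run would replace each stub with an LLM call.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM call: given the shared prompt, the
# transcript so far, and the other players' names, return a name to accuse.
Policy = Callable[[str, List[str], List[str]], str]

# Assumed wording; the real prompt is not given in the post.
SHARED_PROMPT = "One participant in this chat is a bot. Vote for who it is."


def play_round(players: Dict[str, Policy], transcript: List[str]) -> str:
    """Collect one vote per player and return the name that gets eliminated."""
    names = sorted(players)
    votes: Counter = Counter()
    for name in names:
        others = [n for n in names if n != name]
        choice = players[name](SHARED_PROMPT, transcript, others)
        if choice not in others:
            continue  # ignore self-votes and unknown names
        votes[choice] += 1
        transcript.append(f"{name} votes for {choice}")
    if not votes:
        return names[0]  # nobody cast a valid vote: fall back to the first name
    # Most votes wins; ties break alphabetically to keep the sketch deterministic.
    return min(votes, key=lambda n: (-votes[n], n))


def play_game(players: Dict[str, Policy]) -> Tuple[List[str], List[str]]:
    """Run rounds until two players remain; return (elimination order, survivors)."""
    remaining = dict(players)
    transcript: List[str] = []
    order: List[str] = []
    while len(remaining) > 2:
        out = play_round(remaining, transcript)
        order.append(out)
        del remaining[out]
    return order, sorted(remaining)


# Stub policies standing in for models (purely illustrative).
def accuse_first(prompt: str, transcript: List[str], others: List[str]) -> str:
    return others[0]


def accuse_last(prompt: str, transcript: List[str], others: List[str]) -> str:
    return others[-1]
```

Every player gets the same prompt and transcript, so any difference in who gets voted out comes from the policies themselves.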

Comments

MajidAliSyncOps•19h ago
Interesting setup. Social-deduction feels like a clever proxy for multi-agent coordination and deception. One trade-off I’m curious about is how much the results reflect prompt design vs actual model behavior. Have you tried swapping prompts or role constraints to see how stable the outcomes are?
-babi-•18h ago
The inverted game (https://hiding-robot.vercel.app/human) instructs the bots to find the human hiding in the LLM conversation, although no human is present. The leaderboard is different, but I didn't run it enough times to smooth out all the kinks.

All bots get the same prompt and context. Are you suggesting that a specific prompt wording might be helping or hurting specific models? I haven't come across any suggestions that specific models should be prompted differently, though this might be true.

falloutx•12h ago
Pretty cool. A few small UI nits:

- The conversation alternates one message on the left, one on the right. IMO it would be better to have everything on the left side, like a true group chat. The right side could be used for a game commentator or controller. Just an idea.

- Maybe color the entire text based on the AI model. It's hard to tell which AI is which because the name is really small and the tiny dot is hard to differentiate.
