

Show HN: PlayTheAI – Test AI models in strategy games (Open Beta)

https://playtheai.com/en/
1•stefan_wibmer•1d ago

Comments

stefan_wibmer•1d ago
Hi, I'm Stefan from Austria. PlayTheAI.com lets humans play classic games (Tic-Tac-Toe, Connect4, Battleship, Mastermind, WordDuel) against 16 non-thinking AI models (no o1/R1 - instant-response only) with Elo tracking.
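For readers unfamiliar with Elo tracking, here is a minimal sketch of the standard Elo update. The K-factor of 32, the example ratings, and the function names are assumptions for illustration; the post does not say which parameters or variant PlayTheAI actually uses.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return new ratings for A and B after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is an assumed K-factor, not a value taken from the site.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Example: a human and a model both start at an assumed 1500,
# and the human wins.
human, model = elo_update(1500.0, 1500.0, 1.0)
print(human, model)
```

With equal starting ratings the expected score is 0.5, so a win moves each side by half of K.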

Why? We're curious how models perform in dynamic situations where they can't rely on memorized patterns.

Early observations from 800+ matches:

- Many models show single-digit win rates against humans
- We observe interesting patterns in how models handle game state
- Price doesn't seem to correlate strongly with performance

Key: All models get identical prompts with game rules - no per-model optimization, no hints about which moves are currently valid. They must analyze the board themselves.
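Since the models get no list of valid moves, any harness of this kind has to check each reply against the board itself. Below is a rough sketch of what that check could look like for Tic-Tac-Toe. The board encoding, the 1-9 cell numbering, and the names `render_board`, `parse_move`, and `is_valid_move` are all hypothetical; the post does not describe how PlayTheAI formats its prompts or handles invalid moves.

```python
import re
from typing import Optional

# Hypothetical encoding: 9 cells in row-major order, each "X", "O", or " ".
Board = list[str]


def render_board(board: Board) -> str:
    """Render the board as plain text, as it might be placed in a prompt."""
    rows = [" | ".join(board[r * 3:r * 3 + 3]) for r in range(3)]
    return "\n---------\n".join(rows)


def parse_move(reply: str) -> Optional[int]:
    """Extract the first standalone digit 1-9 from a model reply.

    Returns a 0-based cell index, or None if no such digit is found.
    """
    match = re.search(r"\b([1-9])\b", reply)
    return int(match.group(1)) - 1 if match else None


def is_valid_move(board: Board, cell: Optional[int]) -> bool:
    """A move is valid only if it parsed and targets an empty cell."""
    return cell is not None and 0 <= cell < 9 and board[cell] == " "


board: Board = ["X", " ", " ",
                " ", " ", " ",
                " ", " ", " "]
reply = "I play 1"           # the model picks an occupied cell
cell = parse_move(reply)
print(render_board(board))
print(cell, is_valid_move(board, cell))
```

What to do with an invalid move (retry, forfeit, or random fallback) is a methodology choice that affects win rates, so it is worth stating alongside the results.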

Tech: Astro + Cloudflare Workers, OpenRouter API, Supabase.

All games logged for transparency. This is a hobby project - we'd love feedback on methodology and would welcome collaboration with researchers.

https://playtheai.com

The Silence of the LLaMbs: Getting LLMs to Shut Up

https://ossa-ma.github.io/blog/silence-of-the-llambs
1•ossa-ma•21s ago•0 comments

Columbia Univ. Center on Global Energy Policy: Q&A on US Actions in Venezuela

https://www.energypolicy.columbia.edu/qa-on-us-actions-in-venezuela/
1•TMWNN•1m ago•0 comments

Key open source challenges in developing countries (2023)

https://opensource.com/article/23/4/challenges-open-source-developing-countries
3•devonnull•2m ago•0 comments

EMF Exposure from a Substation Could Be Cause of 49ers' Tendon Rupture Epidemic

https://peteranthonycowan.substack.com/p/could-chronic-emf-exposure-from-a
1•CGMthrowaway•2m ago•0 comments

Show HN: MakeMe – A Makefile tool rewritten from Fish to Go

1•OakNinja•3m ago•0 comments

Show HN: Game Boy Release Timelines

https://gameboyessentials.com/timelines
1•philistine•4m ago•0 comments

Automated testing without the setup: Mechasm.ai Beta

https://mechasm.ai
1•sleepless02•5m ago•1 comment

Ask HN: Job seekers, what's working / not working?

1•Jabbs•6m ago•0 comments

OpenAI adds ChatGPT Health for medical questions

https://www.axios.com/2026/01/07/chatgpt-health-tab-apple-fitness-apps
1•FergusArgyll•6m ago•1 comment

The Dream of the Universal Library

https://asteriskmag.com/issues/12-books/the-dream-of-the-universal-library
1•ilamont•10m ago•0 comments

Show HN: Grammar of Graphics CLI tool made in Rust

https://github.com/williamcotton/gramgraph
1•williamcotton•11m ago•0 comments

Infinite Canvas: Building a Seamless, Pan-Anywhere Image Space – Codrops

https://tympanus.net/codrops/2026/01/07/infinite-canvas-building-a-seamless-pan-anywhere-image-sp...
1•rcarmo•12m ago•0 comments

OpenAI to Buy Pinterest? A Strategic Analysis

https://nekuda.substack.com/p/openai-to-buy-pinterest-heres-what
1•ilamont•13m ago•1 comment

What are we to make of "AI replacement"?

https://joshuagans.substack.com/p/what-are-we-to-make-of-ai-replacement
1•paulpauper•13m ago•0 comments

Lua is a pretty good config language

https://til.andrew-quinn.me/posts/lua-is-a-pretty-good-config-language/
1•hiAndrewQuinn•15m ago•0 comments

ActorAgents

https://tailrecursion.com/~alan/ActorAgents.html
1•wooby•15m ago•0 comments

Claude Code CLI Broken

https://github.com/anthropics/claude-code/issues/16673
13•sneilan1•15m ago•4 comments

Show HN: Startup Simulator – AI Choose Your Own Adventure

https://startup-simulator-beta.vercel.app/
1•baristaGeek•19m ago•0 comments

Dora 2025: Year in Review

https://dora.dev/insights/dora-2025-year-in-review/
1•cebert•23m ago•0 comments

Unit testing your code's performance, part 1: Big-O scaling

https://pythonspeed.com/articles/big-o-tests/
2•todsacerdoti•24m ago•0 comments

Tailscale state file encryption no longer enabled by default

https://tailscale.com/changelog
27•traceroute66•24m ago•11 comments

Show HN: Prompt Tower – build and visualize your context

https://prompttower.com/
3•ramoz•25m ago•0 comments

Free health summaries from the top creators

https://summabase.com/en
1•luis13hgr•26m ago•0 comments

Ledger customers impacted by third-party Global-e data breach

https://www.bleepingcomputer.com/news/security/ledger-customers-impacted-by-third-party-global-e-...
1•DGAP•29m ago•0 comments

Why Musk says it would be a 'distraction' for SpaceX to go to Mars this year

https://www.morningstar.com/news/marketwatch/20260107182/why-elon-musk-now-says-it-would-be-a-dis...
3•voxadam•31m ago•0 comments

Intel's Best Product in Years – Panther Lake Announcement [video]

https://www.youtube.com/watch?v=bG68OBQ3x9Y
5•tester756•33m ago•1 comment

A minimal keyboard key effect with CSS

https://pjg1.site/kbd-css.html
2•birdculture•34m ago•0 comments

Claude Code Emergent Behavior: When Skills Combine

https://vibeandscribe.xyz/posts/2025-01-07-emergent-behavior.html
8•ryanthedev•34m ago•2 comments

Show HN: ScotiaSignal: Public sector intent data for Nova Scotia

https://scotiasignal.ca
2•5eva•35m ago•0 comments

Show HN: LLM-First Personal Knowledge Management

https://github.com/joel-solymosi
2•joelsol•39m ago•0 comments