Show HN: A Bomberman-style 1v1 game where LLMs compete in real time

https://github.com/klemenvod/TokenBrawl
2•sunandsurf•8h ago
A few weeks ago, ARC-AGI 3 was released. For those unfamiliar, it’s a benchmark designed to study agentic intelligence through interactive environments.

I'm a big fan of these kinds of benchmarks as IMO they reveal so much more about the capabilities and limits of agentic AI than static Q&A benchmarks. They are also more intuitive to understand when you are able to actually see how the model behaves in these environments.

I wanted to build something in that spirit, but with an environment that pits two LLMs against each other. My criteria were:

1. Strategic & Real-time. The game had to create genuine tradeoffs between speed and quality of reasoning. Smaller models can make more moves but less strategic ones; larger models move slower but smarter.

2. Good harness. I deliberately avoided visual inputs, since models are still too slow and not accurate enough with them (see: Claude playing Pokémon). Instead, a harness translates the game state into structured text, and the game engine renders the agents' responses as fluid animations.

3. Fun to watch. Because benchmarks don't need to be dry bread :)

The end result is a Bomberman-style 1v1 game where two agents compete by destroying bricks and trying to bomb each other. You can check a demo video here: https://youtu.be/4x8tVypmuRk
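To make the "harness translates the game state into structured text" idea concrete, here is a minimal sketch of what such a serializer might look like. It is not taken from the TokenBrawl repository: the function `render_state`, the `Bomb` type, the map symbols, and the action list are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bomb:
    # Hypothetical bomb record: grid position plus ticks until it explodes.
    x: int
    y: int
    ticks_left: int


def render_state(grid, me, opponent, bombs):
    """Serialize a grid game state into a plain-text prompt for an LLM agent.

    grid is a list of equal-length strings; me and opponent are (x, y) tuples.
    """
    # Copy the static map so overlays do not mutate the caller's grid.
    rows = [list(row) for row in grid]
    for b in bombs:
        rows[b.y][b.x] = "*"
    # Players are drawn last so they stay visible on top of bombs.
    rows[opponent[1]][opponent[0]] = "B"
    rows[me[1]][me[0]] = "A"

    lines = ["LEGEND: #=wall +=brick .=empty *=bomb A=you B=opponent", "MAP:"]
    lines += ["".join(r) for r in rows]
    lines.append(f"YOU: x={me[0]} y={me[1]}")
    lines.append(f"OPPONENT: x={opponent[0]} y={opponent[1]}")
    for b in bombs:
        lines.append(f"BOMB: x={b.x} y={b.y} explodes_in={b.ticks_left}")
    lines.append("ACTIONS: UP DOWN LEFT RIGHT BOMB WAIT")
    return "\n".join(lines)


if __name__ == "__main__":
    demo_grid = ["#####", "#.+.#", "#...#", "#####"]
    print(render_state(demo_grid, (1, 1), (3, 2), [Bomb(2, 2, 3)]))
```

A compact text format like this keeps the prompt short, which matters in a real-time setting where every extra token adds latency before the agent's next move.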

Would love to hear what you think!

Comments

jamespeng•3h ago
Haha, that's mad!
Eric_Xua•1h ago
Love the idea of turning agent benchmarks into a real-time Bomberman match between LLMs — super fun way to surface speed vs reasoning tradeoffs.

Show HN: Kontext CLI – Credential broker for AI coding agents in Go

https://github.com/kontext-dev/kontext-cli
12•mc-serious•2h ago•0 comments

Show HN: Run GUIs as Scripts

https://github.com/skinnyjames/hokusai-pocket
9•zero-st4rs•4d ago•2 comments

Show HN: We built an MCP for Windows – ask Claude about CPU, temps, and privacy

https://github.com/AppControlLabs/appcontrol-mcp-go/
6•suprnurd•1h ago•3 comments

Show HN: Ithihāsas – a character explorer for Hindu epics, built in a few hours

https://www.ithihasas.in
159•cvrajeesh•20h ago•42 comments

Show HN: A stateful UI runtime for reactive web apps in Go

https://github.com/doors-dev/doors
7•derstruct•7h ago•4 comments

Show HN: VibeDrift – Measure drift in AI-generated codebases

https://www.vibedrift.ai/
2•samiahmadkhan•5h ago•6 comments

Show HN: Pushduck – S3 uploads that run on Cloudflare Workers, no AWS SDK

8•abhay_ramesh•9h ago•4 comments

Show HN: Write better Go integration tests with open source dockertest v4

https://github.com/ory/dockertest/tree/v4
3•pragmaticviber•2h ago•0 comments

Show HN: boringBar – a taskbar-style dock replacement for macOS

https://boringbar.app/
510•a-ve•1d ago•292 comments

Show HN: A CLI that writes its own integration code

https://docs.superglue.cloud/getting-started/cli-skills
8•adinagoerres•7h ago•5 comments

Show HN: Deflect One – command line dashboard for managing Linux servers via SSH

https://github.com/Frytskyy/deflect-one
5•whitemanv•10h ago•4 comments

Show HN: Continual Learning with .md

https://github.com/SunAndClouds/ReadMe
30•wenhan_zhou•19h ago•27 comments

Show HN: Tsplat – Render Gaussian Splats directly in your terminal

https://github.com/darshanmakwana412/tsplat
6•darshanmakwana•9h ago•1 comment

Show HN: Mcptube – Karpathy's LLM Wiki idea applied to YouTube videos

https://github.com/0xchamin/mcptube
12•0xchamin•23h ago•2 comments

Show HN: Oberon System 3 runs natively on Raspberry Pi 3 (with ready SD card)

https://github.com/rochus-keller/OberonSystem3Native/releases
240•Rochus•2d ago•107 comments

Show HN: I built a social media management tool in 3 weeks with Claude and Codex

https://github.com/brightbeanxyz/brightbean-studio
184•JanSchu•1d ago•125 comments

Show HN: Pardonned.com – A searchable database of US Pardons

496•vidluther•3d ago•272 comments

Show HN: Claudraband – Claude Code for the Power User

https://github.com/halfwhey/claudraband
118•halfwhey•1d ago•44 comments

Show HN: Prmana – OIDC SSH Login for Linux with DPoP (Rust, Apache 2.0)

https://github.com/prodnull/prmana
3•cbchhaya•13h ago•0 comments

Show HN: Lythonic – Compose Python functions into data-flow pipelines

https://github.com/walnutgeek/lythonic
5•walnutgeek•19h ago•0 comments

Show HN: Equirect – a Rust VR video player

https://github.com/greggman/equirect
13•greggman65•1d ago•1 comment

Show HN: Excalicharts – Charting Library for Excalidraw

https://github.com/tombedor/excalicharts
4•jjfoooo4•15h ago•0 comments

Show HN: FluidCAD – Parametric CAD with JavaScript

https://fluidcad.io/
155•maouida•3d ago•37 comments

Show HN: Farchive – SQLite-backed history-preserving compressed archive

https://github.com/eliask/farchive
5•ekns•23h ago•0 comments

Show HN: Dbg – One CLI debugger for every language (AI-agent ready)

https://redknightlois.github.io/dbg/
5•redknight666•23h ago•0 comments

Show HN: OQP – A verification protocol for AI agents

https://github.com/OranproAi/open-qa-protocol
6•Aamir21•17h ago•1 comment

Show HN: Deconflict – Open-source WiFi planner with physics-based walls

https://deconflict.pages.dev
3•s_e__a___n•19h ago•1 comment

Show HN: Rekal – Long-term memory for LLMs in a single SQLite file

https://github.com/janbjorge/rekal
7•jeeybee•1d ago•9 comments

Show HN: I built a Cargo-like build tool for C/C++

https://github.com/randerson112/craft
176•randerson_112•4d ago•168 comments