Letting AI play my game – building an agentic test harness to help play-testing

https://blog.jeffschomay.com/letting-ai-play-my-game
52•jschomay•2h ago

Comments

chrisweekly•1h ago
This is awesome. Thanks for sharing! The text-based renderer reminds me of playing Larn on my dad's VT100 when I was a child (early 80s).
zoetaka38•44m ago
Built something similar for E2E web testing recently. A few observations from running an agentic test harness in production:

1. The single biggest jump in test quality came from giving the agent BOTH source code analysis AND live browser snapshots, not either alone. With code-only the agent hallucinates selectors; with browser-only it misses project conventions. Two MCP servers feeding the same agent — one local file-read, one Playwright in-process — was the architecture that worked.

2. For the browser snapshot tool, returning the raw DOM ate tens of thousands of tokens per call and the agent struggled to navigate it. Swapping to accessibility-tree refs (e1, e2, ...) cut token usage by ~10x and made the agent reliably target the right elements.

3. We avoided Docker-based MCP servers in production (we run on ECS Fargate). The in-process SDK MCP pattern (create_sdk_mcp_server + @tool decorator) keeps the browser handle in scope of the tool definition, which let us attach page.on('console') listeners and have the agent read them via a separate tool. Hard to do that across stdio process boundaries.
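
A minimal sketch of the ref idea in point 2, using only the Python standard library. The node shape (`role`, `name`, `children`), the `snapshot` function, and the sample page are all hypothetical and chosen for illustration; this is not Playwright's snapshot format or the commenter's actual code. It only shows how a nested tree can be flattened into short lines that an agent can target by ref.

```python
from typing import Dict, List, Tuple


def snapshot(root: dict) -> Tuple[str, Dict[str, dict]]:
    """Render an accessibility-style tree as compact text with short refs.

    Assumed (hypothetical) node shape: {"role": str, "name": str, "children": [...]}.
    Returns the rendered text and a ref -> node map for resolving refs later.
    """
    lines: List[str] = []
    refs: Dict[str, dict] = {}

    def walk(node: dict, depth: int) -> None:
        # Refs are assigned in depth-first order: e1, e2, ...
        ref = f"e{len(refs) + 1}"
        refs[ref] = node
        name = node.get("name", "")
        label = f' "{name}"' if name else ""
        lines.append(f'{"  " * depth}- {node["role"]}{label} [ref={ref}]')
        for child in node.get("children", []):
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines), refs


# Hypothetical page used only to exercise the sketch.
page = {
    "role": "main",
    "children": [
        {"role": "heading", "name": "Sign in"},
        {"role": "textbox", "name": "Email"},
        {"role": "button", "name": "Continue"},
    ],
}

text, refs = snapshot(page)
print(text)
```

The agent would see only the rendered lines and answer with a ref such as `e4`; the tool then looks that ref up in `refs` to find the element to act on, so the raw DOM never has to enter the prompt.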

For game testing specifically — your text-renderer detail is interesting because it sidesteps the visual-grounding problem (how does the agent verify what it's seeing?). Curious how you'd extend this to a 2D/3D rendered game where the screen state isn't easily textualized.

squeegmeister•13m ago
I recently added E2E tests in my game too. One of the benefits is that I can have my agent verify its own work by asking it to write a test and look at screenshots. That means I can say “I’m going to bed, implement this and verify it with e2e tests” and it gets further along than it used to.

Zed is 1.0

https://zed.dev/blog/zed-1-0
177•salkahfi•33m ago•39 comments

Tangled – We need a federation of forges

https://blog.tangled.org/federation/
157•icy•1h ago•95 comments

Soft launch of open-source code platform for government

https://www.nldigitalgovernment.nl/news/soft-launch-for-government-open-source-code-platform/
366•e12e•5h ago•104 comments

Ghostty is leaving GitHub

https://mitchellh.com/writing/ghostty-leaving-github
3036•WadeGrimridge•19h ago•904 comments

Improving ICU handovers by learning from Scuderia Ferrari F1 team

https://healthmanagement.org/c/icu/IssueArticle/improving-handovers-by-learning-from-scuderia-fer...
19•embedding-shape•2h ago•9 comments

Letting AI play my game – building an agentic test harness to help play-testing

https://blog.jeffschomay.com/letting-ai-play-my-game
53•jschomay•2h ago•3 comments

HashiCorp co-founder says GitHub 'no longer a place for serious work'

https://www.theregister.com/2026/04/29/mitchell_hashimoto_ghostty_quitting_github/
342•terminalbraid•3h ago•181 comments

GitHub – DOS 1.0: Transcription of Tim Paterson's DOS Printouts

https://github.com/DOS-History/Paterson-Listings
40•s2l•3h ago•1 comment

Bugs Rust won't catch

https://corrode.dev/blog/bugs-rust-wont-catch/
461•lwhsiao•12h ago•249 comments

Show HN: Adblock-rust Manager – Firefox extension to enable the Brave ad blocker

https://github.com/electricant/adblock-rust-manager
29•electricant•2h ago•17 comments

Before GitHub

https://lucumr.pocoo.org/2026/4/28/before-github/
575•mlex•17h ago•183 comments

Stardex Is Hiring a Founding Customer Success Lead

https://www.ycombinator.com/companies/stardex/jobs/6GCK1HC-founding-customer-success-lead
1•sanketc•3h ago

How ChatGPT serves ads

https://www.buchodi.com/how-chatgpt-serves-ads-heres-the-full-attribution-loop/
428•lmbbuchodi•15h ago•293 comments

Show HN: Rip.so – a graveyard for dead internet things

https://rip.so
130•bozdemir•5h ago•93 comments

Show HN: Rocky – Rust SQL engine with branches, replay, column lineage

https://github.com/rocky-data/rocky
94•hugocorreia90•1d ago•29 comments

Show HN: Auto-Architecture: Karpathy's Loop, pointed at a CPU

https://github.com/FeSens/auto-arch-tournament/blob/main/docs/auto-arch-tournament-blog-post.md
200•fesens•21h ago•59 comments

Coffee with a splash of physics: how to make the most out of your brew

https://physicsworld.com/a/coffee-with-a-splash-of-physics-how-to-make-the-most-out-of-your-brew/
42•sohkamyung•2h ago•28 comments

HardenedBSD Is Now Officially on Radicle

https://hardenedbsd.org/article/shawn-webb/2026-04-26/hardenedbsd-officially-radicle
123•lftherios•8h ago•25 comments

Withnail's Coat and I

https://ontherow.substack.com/p/withnails-coat-and-i
115•apollinaire•1d ago•17 comments

OpenAI models coming to Amazon Bedrock: Interview with OpenAI and AWS CEOs

https://stratechery.com/2026/an-interview-with-openai-ceo-sam-altman-and-aws-ceo-matt-garman-abou...
304•translocator•19h ago•98 comments

Low-Compilation-Cost Register Allocation in LLVM-Based Binary Translation

https://dl.acm.org/doi/abs/10.1145/3767295.3803591
48•matt_d•8h ago•1 comment

Who owns the code Claude Code wrote?

https://legallayer.substack.com/p/who-owns-the-claude-code-wrote
483•senaevren•1d ago•446 comments

GitHub RCE Vulnerability: CVE-2026-3854 Breakdown

https://www.wiz.io/blog/github-rce-vulnerability-cve-2026-3854
410•bo0tzz•22h ago•87 comments

I won a championship that doesn't exist

https://ron.stoner.com/How_I_Won_a_Championship_That_Doesnt_Exist/
211•SEJeff•18h ago•116 comments

He asked AI to count carbs 27000 times. It couldn't give the same answer twice

https://www.diabettech.com/i-asked-ai-to-count-my-carbs-27000-times-it-couldnt-give-me-the-same-a...
180•sarusso•2h ago•233 comments

Talkie: a 13B vintage language model from 1930

https://talkie-lm.com/introducing-talkie
727•jekude•1d ago•310 comments

Gallium oxide electronics withstand extreme cold

https://discovery.kaust.edu.sa/en/article/26858/gallium-oxide-electronics-withstand-extreme-cold/
69•giuliomagnifico•2d ago•6 comments

Warp is now open-source

https://www.warp.dev/blog/warp-is-now-open-source
330•meetpateltech•23h ago•97 comments

Behavioral timescale synaptic plasticity rewires the brain after an experience

https://www.quantamagazine.org/a-new-type-of-neuroplasticity-rewires-the-brain-after-a-single-exp...
147•ibobev•2d ago•9 comments

Your phone is about to stop being yours

https://keepandroidopen.org/en/
1557•doener•23h ago•751 comments