Show HN: AlphaEvolve inspired evolution harness for Pokemon

https://github.com/papercomputeco/pokemon
2•brianllamar•1h ago
Last week I sat in on the LLM Paper Club (hosted by the latent.space podcast). We shared a paper from DeepMind, "Discovering Multiagent Learning Algorithms with Large Language Models". It's linked in the GitHub repo.

This was my first time sitting in on one of these and reading an LLM paper. I thought it was interesting that the AlphaEvolve agent in the paper essentially writes code, runs it, scores the output, and then writes better code, over and over. It's not discovering strategies through play. It's discovering algorithms through code mutation. The LLM proposes changes to how regret is accumulated or how policies are derived, a fitness function scores the result, and the best variants survive to the next generation.
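To make the propose, score, select loop concrete, here is a minimal sketch in Python. It is not the paper's implementation: `evolve`, `toy_fitness`, and `toy_mutate` are hypothetical names, the toy objective is made up, and a random nudge stands in for the LLM's code-mutation step.

```python
import random


def evolve(seed, mutate, fitness, generations=20, population=8, survivors=2, rng=None):
    """Generic propose -> score -> select loop (illustrative only)."""
    rng = rng or random.Random(0)
    pool = [seed]
    for _ in range(generations):
        # Propose: each child is a mutated copy of a randomly chosen survivor.
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        # Score and select: parents compete with their children,
        # so the best score in the pool never gets worse.
        pool = sorted(pool + children, key=fitness, reverse=True)[:survivors]
    return pool[0]


# Toy problem (an assumption for illustration): maximize -(x - 3)^2.
def toy_fitness(x):
    return -(x - 3.0) ** 2


def toy_mutate(x, rng):
    # Stand-in for the LLM proposal step: a small random nudge.
    return x + rng.gauss(0.0, 0.5)


best = evolve(0.0, toy_mutate, toy_fitness)
print(best, toy_fitness(best))
```

In the paper's setting the candidate would be a piece of code rather than a number, and `mutate` would be a call to the LLM, but the shape of the loop is the same.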

The two algorithms it found (VAD-CFR and SHOR-PSRO) use mechanisms the authors describe as "non-intuitive," things like volatility-sensitive discounting and hard warm-start schedules that a human designer probably wouldn't have tried. That's the interesting part: the LLM isn't constrained by the same design intuitions we are.

To make it concrete for myself, I built a small version of this loop for a Pokemon game agent. The setup is simple: define a fitness function (turns survived, maps visited, stuck events), parameterize the strategy space (door cooldown, stuck threshold, skip distance), and let an LLM propose variants that get evaluated in parallel. Ten agents race through the game, and the best parameters survive. I used tapes.dev to collect session telemetry and feed observational memory back into the fitness scoring.
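Roughly, one generation of that setup could look like the sketch below. This is not the code from the repo: the default values, the fitness weights, `propose_variants` (random perturbation standing in for the LLM), and `fake_run` (an arbitrary formula standing in for a real game session) are all assumptions made so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Params:
    # The three knobs named in the post; default values are hypothetical.
    door_cooldown: int = 8
    stuck_threshold: int = 5
    skip_distance: int = 3


@dataclass(frozen=True)
class RunStats:
    turns_survived: int
    maps_visited: int
    stuck_events: int


def fitness(stats: RunStats) -> float:
    # Hypothetical weights: reward progress, penalize getting stuck.
    return stats.turns_survived * 0.01 + stats.maps_visited * 10.0 - stats.stuck_events * 2.0


def propose_variants(parent: Params, n: int, rng: random.Random) -> list:
    # Stand-in for the LLM: small random perturbations of the parent.
    return [
        Params(
            door_cooldown=max(1, parent.door_cooldown + rng.randint(-2, 2)),
            stuck_threshold=max(1, parent.stuck_threshold + rng.randint(-2, 2)),
            skip_distance=max(1, parent.skip_distance + rng.randint(-1, 1)),
        )
        for _ in range(n)
    ]


def run_generation(parent: Params, run_agent, n_agents: int = 10, rng=None):
    rng = rng or random.Random(0)
    # The parent races too, so a generation never returns something worse.
    candidates = [parent] + propose_variants(parent, n_agents - 1, rng)
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        results = list(pool.map(run_agent, candidates))
    scored = sorted(zip(candidates, results), key=lambda pair: fitness(pair[1]), reverse=True)
    return scored[0]  # (best params, its stats)


def fake_run(params: Params) -> RunStats:
    # Arbitrary toy formula so the sketch runs; NOT measured game behavior.
    return RunStats(
        turns_survived=1000,
        maps_visited=5,
        stuck_events=params.door_cooldown + params.stuck_threshold,
    )


best_params, best_stats = run_generation(Params(), fake_run)
print(best_params, fitness(best_stats))
```

Swapping `fake_run` for a function that actually drives the emulator and returns real telemetry is the only change needed to turn this into a real evaluation step.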

The first run already surfaced something useful: a shorter door cooldown (4 vs 8) cut stuck events from 16 to 9, about 44% fewer. Not a breakthrough, but the point is the system found it without me guessing. That's the same dynamic as the paper, just at toy scale.

What I took away from the paper club: the bottleneck in algorithm design isn't computation, it's the search process itself. If you can express your problem as "parameterized code + fitness function," an LLM evolution loop can explore the space faster than manual iteration. The paper shows it works for game theory. The link is in my startup's repo; I want to explore applying this technique to improve future general-purpose and coding agent sessions.

Just pointing out Pokemon aren't the only thing evolving in that repo.

Invent your own comprehensions in Python – Python Morsels

https://www.pythonmorsels.com/custom-comprehensions/
1•rbanffy•4m ago•0 comments

SQL Order-Equivalence

https://modern-sql.com/blog/2026-03/order-equivalence-over-clause
1•chmaynard•4m ago•0 comments

Show HN: Clauductor – Web UI for Claude Code with real-time work graph

https://github.com/mikolajbadyl/clauductor
1•mbadyl•7m ago•0 comments

Advocates urge judge to block $68M Colony Ridge settlement

https://www.americanbanker.com/news/advocates-urge-judge-to-block-68m-colony-ridge-settlement
1•petethomas•7m ago•0 comments

I Infected My iPhone with Russian Spyware. Here's What I Found [video]

https://www.youtube.com/watch?v=XQvZ2mLnZVI
2•seanieb•8m ago•0 comments

The Unmaking of the American University

https://www.newyorker.com/magazine/2026/03/16/the-unmaking-of-the-american-university
3•rbanffy•8m ago•0 comments

Show HN: Chat AI Agent built into live Appium/mobile device sessions

https://robotactions.com/
1•krishpavuluri•9m ago•0 comments

Apple's Privacy Is a Lie

https://www.youtube.com/watch?v=FDJP1OI2MXk
1•frag•12m ago•0 comments

I vibe coded my dream macOS presentation app

https://simonwillison.net/2026/Feb/25/present/
1•alwillis•13m ago•0 comments

Claude Tried to Hack 30 Companies. Nobody Asked It To

https://trufflesecurity.com/blog/claude-tried-to-hack-30-companies-nobody-asked-it-to
1•riverdroid•14m ago•0 comments

Air strikes cause black rain and 'unprecedented' pollution in Tehran

https://www.bbc.com/news/articles/cqxd1nv3re2o
3•tartoran•16m ago•0 comments

TermF1: A terminal-style dashboard for Formula 1

https://github.com/dk-a-dev/termf1
1•dev345•16m ago•0 comments

Steve Rosenberg: Russia seeks diplomatic and economic gains from Iran war

https://www.bbc.com/news/articles/c4gjyg0djvmo
2•tartoran•16m ago•0 comments

Russia's deportation of Ukrainian children amounts to crime against humanity

https://www.bbc.com/news/articles/cz7g5xnvl2eo
7•tartoran•17m ago•0 comments

The U.S. borrowed $50B a week for the past five months, the CBO says

https://fortune.com/2026/03/10/treasury-debt-borrowing-five-months-deficit-warning/
6•testing22321•17m ago•0 comments

Krazam – Paradise Episode 1 – Public Memories [video]

https://www.youtube.com/watch?v=AS9y-d2BvZU
1•tart-lemonade•20m ago•0 comments

Show HN: Clawbake: Multi-User Instance Management for OpenClaw

https://neurometric.substack.com/p/we-built-clawbake-open-source-multi
2•robmay•21m ago•0 comments

Go-pty: Procfile process manager with PTY support

https://www.mendelowski.com/go-pty/
2•pchm•22m ago•0 comments

Size-shifting nanoparticles deliver mRNA medicine to the pancreas

https://phys.org/news/2026-02-size-shifting-nanoparticles-successfully-mrna.html
1•PaulHoule•22m ago•0 comments

Reliability Theatre: When reliability metrics stop measuring reliability

https://halil.cetiner.me/reliability-theatre/
1•bayneri•23m ago•0 comments

Roast My Website

https://tear-my-site-down.vercel.app/
1•jerdman76•23m ago•1 comments

M.C. Escher Flavoured Pages

https://www.josleys.com/galleries.php?catid=6
3•TigerUniversity•24m ago•0 comments

I Got Fired Because of AI – But I Still Think I'm the Engineer of the Future

2•vital_pavlenko•25m ago•0 comments

Show HN: Prompt Enricher – paste a rough prompt, get a structured one back

https://statwonk.com/prompt-enricher/
1•RA_Fisher•25m ago•1 comments

Lessons from 30 Years Building Software Systems

1•alkas•28m ago•0 comments

OverflowML – Run AI models larger than your GPU, one line of code

https://github.com/Khaeldur/overflowml
2•khaeldur•34m ago•0 comments

Evaluating Evolving Agents with Evolving Benchmarks

https://frontier-cs.org/blog/agent-evaluation/
2•lihanc111•37m ago•1 comments

Fil-C is safer than Rust

https://twitter.com/filpizlo/status/1984366437390303265
4•ppew•37m ago•0 comments

Haarp: A Never-Ending Conspiracy Theory in Remote Alaska

https://www.theatlantic.com/technology/2026/03/haarp-weather-conspiracies/686264/
3•jonah•38m ago•0 comments

Show HN: An on-device Mac app for real-time posture reminders

https://apps.apple.com/us/app/ai-posture-reminder-app/id1574005886?mt=12
3•data-leon•39m ago•0 comments