Digital Red Queen: Adversarial Program Evolution in Core War with LLMs

https://sakana.ai/drq/
117•hardmaru•20h ago

Comments

hardmaru•20h ago
Hi HN,

I am one of the authors from Sakana AI and MIT. We just released this paper where we hooked up LLMs to the classic 1984 programming game Core War. For those who haven't played it, Core War involves writing assembly programs in a language called Redcode that battle for control of a virtual computer's memory. You win by crashing the opponent's process while keeping yours running. It is a Turing-complete environment where code and data share the same address space, which leads to some very chaotic self-modifying code dynamics.

We did not just ask the model to write winning code from scratch. Instead, we treated the LLM as a mutation operator within a quality-diversity algorithm called MAP-Elites. The system runs an adversarial evolutionary loop where new warriors are continually evolved to defeat the champions of all previous rounds. We call this Digital Red Queen because it mimics the biological hypothesis that species must continually adapt just to survive against changing competitors.
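
A minimal sketch of the loop described above, to make the structure concrete. It is not the authors' implementation: `mutate` stands in for the LLM call, `battle` for a Core War simulator, and `describe` for whatever behavior descriptor bins the MAP-Elites archive. All three names, the seeding of each round from the previous champion, and the mean-score fitness are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Tuple

Warrior = str            # Redcode source text
Cell = Tuple[int, int]   # discretized behavior descriptor (assumed 2-D)


def evolve_round(
    opponents: List[Warrior],
    seed: Warrior,
    mutate: Callable[[Warrior], Warrior],         # hypothetical: LLM rewrites a warrior
    battle: Callable[[Warrior, Warrior], float],  # hypothetical: score of first vs. second
    describe: Callable[[Warrior], Cell],          # hypothetical: archive cell of a warrior
    iterations: int,
    rng: random.Random,
) -> Warrior:
    """One MAP-Elites run against a fixed pool of opponents."""

    def fitness(w: Warrior) -> float:
        # Assumption: fitness is the mean score over all opponents.
        return sum(battle(w, o) for o in opponents) / len(opponents)

    # The archive keeps the best warrior found so far in each behavior cell.
    archive: Dict[Cell, Tuple[float, Warrior]] = {
        describe(seed): (fitness(seed), seed)
    }
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))
        child = mutate(parent)
        cell, score = describe(child), fitness(child)
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, child)

    # The round's champion is the highest-scoring elite in the archive.
    return max(archive.values(), key=lambda entry: entry[0])[1]


def digital_red_queen(
    initial: Warrior,
    rounds: int,
    mutate: Callable[[Warrior], Warrior],
    battle: Callable[[Warrior, Warrior], float],
    describe: Callable[[Warrior], Cell],
    iterations: int,
    rng: random.Random,
) -> List[Warrior]:
    """Each round evolves a new champion against ALL previous champions."""
    champions = [initial]
    for _ in range(rounds):
        champion = evolve_round(
            opponents=list(champions),
            seed=champions[-1],  # assumption: seed each round from the last champion
            mutate=mutate,
            battle=battle,
            describe=describe,
            iterations=iterations,
            rng=rng,
        )
        champions.append(champion)
    return champions
```

The point of the sketch is the growing `opponents` list: each new warrior has to beat the whole history of champions rather than only the most recent one.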

The most interesting result for us was observing convergent evolution. We ran independent experiments starting from completely different random seeds, yet the populations consistently gravitated toward similar behavioral phenotypes, specifically regarding memory coverage and thread spawning. It mirrors how biological species independently evolve similar traits like eyes to solve similar problems. We also found that this training loop produced generalist warriors that were robust even against human-written strategies they had never encountered during training.

We think Core War is an under-utilized sandbox for studying these kinds of adversarial dynamics. It lets us simulate how automated systems might eventually compete for computational resources in the real world, but in a totally isolated environment. The simulation code and the prompts we used are open source on GitHub.

Additional links besides the blog post:

Paper (website): https://pub.sakana.ai/drq/

Arxiv: https://arxiv.org/abs/2601.03335

Code: https://github.com/SakanaAI/drq

NitpickLawyer•17h ago
> adversarial evolutionary loop where new warriors are continually evolved to defeat the champions of all previous rounds.

Interesting. So you're including past generation champions in the "fights"? That would intuitively model a different kind of evolution than just "current factors"-driven evolution.

> We also found that this training loop produced generalist warriors that were robust even against human-written strategies they had never encountered during training.

Nice. Curious, did you do any ablations for the "all previous champions" vs. "current gen champions"?

aldebaran1•15h ago
Very interesting paper, thank you. It makes me wonder what other game substrates could form the basis for adversarial/evolutionary strategy optimization for LLMs, and whether these observations replicate across games.

Since LLMs are text based, a text-based game might be interesting. Something like Nomic?

Or a "meme warfare" game where each agent tries to prompt-inject its adversaries into saying a forbidden codeword, and can modify its own system prompt to attempt to prevent that from happening to itself.

GuB-42•17h ago
Using evolution in the context of Core War is far from a new idea; it is even referenced in the paper.

Examples here: https://corewar.co.uk/evolving.htm

The difference here is that instead of using a typical genetic algorithm written in a programming language, it uses LLM prompts to do the same thing.

I wonder if the authors tried some of the existing "evolvers" to compare to what the LLM gave out.

api•17h ago
See also:

https://en.wikipedia.org/wiki/Tierra_(computer_simulation)

https://avida-ed.msu.edu

https://github.com/adamierymenko/nanopond

Lots of evolving-bug, Core War-style systems around.

I think the interesting thing with this one is they're having LLMs create evolving agents instead of blind evolution or some similar ML system.

Ieghaehia9•15h ago
That in turn makes me wonder:

Given fixed opposition, finding a warrior that performs the best is an optimization problem. Maybe, for very small core sizes like a nano core, it would be possible to find the optimum directly by SAT or SMT instead of using evolution? Or would it be impractical even for those core sizes?

slickytail•12h ago
I think it would, for all practical purposes, be impossible to determine an optimal warrior, even at very small core sizes. Not only is the search space huge but the evaluation function can take unbounded time to resolve. We should consider the halting problem embedded inside the optimization target as a clue to the problem's difficulty.

Ieghaehia9•35m ago
That's the thing: Core War matches last a finite time (after which the match is judged a tie). So you have a finite memory space, finite time, and a finite number of match combinations. And for a predetermined constant N, the bounded halting problem ("does the program halt within N steps") is in NP.

For the nano hill[1], the constants are: each warrior has a max of five lines of code, core size is 80 instructions, and a match lasts a maximum of 800 cycles.

If N = 1, it's clear that the best you can do is drop a bomb at a fixed location and hope you hit. So that is mostly a tie. For N = 2, it's probably still not possible to do anything useful. With N = 10, perhaps a quickscan is possible. N = 800: who knows?

[1] https://corewar.co.uk/nano.htm
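
For a sense of scale, here is a rough count of the raw search space under the nano-hill constants quoted above (core size 80, at most five instructions). The per-instruction figures are assumptions, not taken from the thread: 16 opcodes, 7 modifiers and 8 addressing modes, loosely following ICWS '94-style Redcode, with every combination treated as legal and no deduplication of equivalent warriors.

```python
from math import log10


def per_instruction_count(opcodes: int, modifiers: int, modes: int, core_size: int) -> int:
    """Distinct instructions: opcode.modifier, two addressing modes, two operands.

    Operand values are taken modulo the core size, so each operand has
    `core_size` distinct values.
    """
    return opcodes * modifiers * modes**2 * core_size**2


def nano_search_space(
    opcodes: int = 16,    # assumed opcode count
    modifiers: int = 7,   # assumed modifier count (.A .B .AB .BA .F .X .I)
    modes: int = 8,       # assumed addressing-mode count
    core_size: int = 80,  # from the nano hill rules quoted above
    max_length: int = 5,  # from the nano hill rules quoted above
) -> int:
    """Number of syntactically distinct warriors of length 1..max_length."""
    per_instruction = per_instruction_count(opcodes, modifiers, modes, core_size)
    return sum(per_instruction**n for n in range(1, max_length + 1))


if __name__ == "__main__":
    total = nano_search_space()
    print(f"roughly 10^{log10(total):.1f} candidate warriors")
```

Under these assumptions the count is on the order of 10^38, which rules out plain enumeration. It says nothing either way about whether a SAT or SMT encoding could exploit structure to do better.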

dgacmu•12h ago
Oh man, that's funny to see one of my grad school class projects in that list. Takes me back. :-)

From that experience: The LLM is likely to do drastically better. Most of the prior work, mine included, took a genetic algorithm approach, but an LLM is more likely to make coherent multi-instruction modifications.

It's a shame they didn't compare against some of the standard core wars benchmarks as a way to facilitate comparisons to prior work, though. Makes it hard to say that they're better for sure. https://corewar.co.uk/bench.htm

jacquesm•11h ago
I'm not sure if that will hold up. The LLM is not going to do anything random, and randomness is actually a powerful component that makes original output possible.

kyralis•8h ago
I wonder if a combination would be useful. Use an actual GA to do the mutation, and then let an LLM "fix" each mutated child.

jacquesm•3h ago
Could be. But the interesting thing is that all you can do here is optimize. Random chance is - like attention ;) - all you need.

pkhuong•14h ago
How does the output fare on competitive hills like https://sal.discontinuity.info/hill.php?key=94t ?

AFAIK, the best results so far for fully computer-generated warriors have been on the nano and tiny format (https://sal.discontinuity.info/hill.php?key=nano, https://sal.discontinuity.info/hill.php?key=tiny), with much shorter warriors (at most 5 or 20 instructions).

JKCalhoun•11h ago
What a lovely period of time that was—when "Computer Recreations" ran monthly in Scientific American. I read the column every month and was fascinated to learn about Eliza, Core Wars, Conway's Life, Wa-Tor, etc. It was a time when you coded simply for the fun of it—to explore, learn.

I know you can still do that today, but… something has changed. I don't know what it is. (Maybe I changed.)

Anyway, I was unable to track down PDF versions of the original articles, but, for the curious and newcomers to Core Wars, they're transcribed here:

https://corewar.co.uk/dewdney/

idiotsecant•7h ago
Computers are no longer something fresh and new. They are firmly in the realm of stuff that exists and has Rules. The frontier is dead.

rao-v•9h ago
The idea of what LLMs could do in CoreWars has been hanging around in the back of my head for a while now. So happy to see someone explore it systematically.
