Show HN: Codex builds a working NES Emulator in one hour

https://github.com/kaonashi-tyc/codex-nes-emulator

4•zi2zi-jit•1h ago

Hi folks! I know NES emulators have been implemented countless times, in practically every language imaginable.

However, having an LLM fully replicate the spec purely from memory—without referencing existing code—is still a significant challenge. It requires the underlying model to have strong anti-hallucination capabilities and solid long-term planning to keep from going astray. Because of this, building an NES emulator makes for an excellent LLM stress test.

Here is how the emulator was built:

Data Gathering: I asked Codex to download the necessary developer manuals and test suites. It was strictly prohibited from searching for reference implementations online.

Development: I instructed Codex to build the emulator until all test suites passed. This process was mostly hands-free; I only chimed in to encourage it to continue when it paused.

First Draft: After just 4-5 prompts, Codex delivered a functional, pure-Python emulator—though it ran at a sluggish 7 FPS.

Optimization: Asking Codex to optimize the app completely on its own didn't work this time. Instead, I had it generate a flamegraph, which identified the PPU update as the bottleneck. I then instructed Codex to rewrite the PPU in Cython without breaking the passing tests.
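For readers who want to try the profiling step themselves, here is a minimal sketch of that kind of bottleneck hunt. It is not the code from the repository: the post does not name the flamegraph tool, so this uses the standard-library cProfile instead, and `cpu_step`, `ppu_step`, `run_frame`, `profile_frames`, and `hottest_function` are hypothetical stand-ins invented for illustration.

```python
import cProfile
import pstats


def cpu_step(state):
    # Hypothetical stand-in for executing one CPU instruction (cheap).
    state["cycles"] += 1


def ppu_step(state):
    # Hypothetical stand-in for PPU work: a per-pixel loop, deliberately heavier.
    total = 0
    for pixel in range(256):
        total += pixel & 0x3F
    state["pixels"] += 256
    return total


def run_frame(state, steps=200):
    # Toy main loop: interleave CPU and PPU steps for one "frame".
    for _ in range(steps):
        cpu_step(state)
        ppu_step(state)


def profile_frames(frames=5):
    """Run a few frames under cProfile; return the final state and the stats."""
    state = {"cycles": 0, "pixels": 0}
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(frames):
        run_frame(state)
    profiler.disable()
    return state, pstats.Stats(profiler)


def hottest_function(stats):
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers).
    # Return the name of the function with the largest own time (tottime).
    key = max(stats.stats, key=lambda k: stats.stats[k][2])
    return key[2]


if __name__ == "__main__":
    _, stats = profile_frames()
    print("hottest function:", hottest_function(stats))
    # Text equivalent of reading the widest bar on a flamegraph.
    stats.sort_stats("tottime").print_stats(5)
```

Once a single function dominates the own-time column like this, it becomes a reasonable candidate to move into a compiled extension such as Cython while the rest stays in Python, with the existing test suites acting as the regression check.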

Overall, I'm incredibly impressed by Codex. I already knew it was capable of the task, but the speed was astonishing. It finished the project in under an hour, using only 2% of my weekly Pro quota.

While the NES might be a relatively easy system to emulate, I think emulation could serve as a fantastic benchmark for testing future LLMs.

Comments

nunobrito•57m ago

Quite amazing. This opens doors to many other emulators because now it can replicate quite nicely what is expected as output.

zi2zi-jit•54m ago

Totally agree. I am looking to build something more complex next, something like a PS1 emulator in a different language as a test. That would require significantly more effort, but given how quickly the models are improving, I am optimistic.

qsera•21m ago

Can you try to vibe code an AI shill detector next?
