frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

Open in hackernews

A toy RTOS inside Super Mario Bros. using emulator save states

https://prettygoodblog.com/p/what-threads-are-part-2
245•notorious_pgb•1d ago
This started as a throwaway metaphor in a blog post, but is now fully runnable: a toy RTOS with preemptive multitasking inside of Super Mario Bros. on the NES.

Essentially, this is:

- A rudimentary preemptive RTOS

- Using an unmodified NES emulator (FCEUX) as the CPU

    - "Unmodified" depending on how you define terms
- With emulator save states as the thread contexts

- With support for (very basic) mutexes, interrupt masking, and condition variables

- Demonstrated using Super Mario Bros. 1-1 with sections of the map dedicated to various synchronization primitives

There are many simplifications and shortcuts taken (doesn't even have task priorities), and it doesn't map 1:1 to true multithreading (e.g., emulator save states represent the state of the entire machine including RAM, whereas thread contexts represent a much more minimal slice), but I think it's A) pretty interesting and B) a unique visceral explanation of threads.

Comments

ignormies•1d ago
This is a super cool visual demonstration of RTOS/scheduling! I love the region-based critical sections!

I took a real-time operating systems course in university as an elective. One of the hardest courses I took the whole four years, but also one of the most interesting. Had a great professor, who gave really demanding, but very instructive, project-based assignments.

I need to find a toy project to play around with this domain again.

notorious_pgb•1d ago
Thank you!

I'm curious how effective you feel this specific example might've been if it were delivered during your course. I suspect I've stumbled across a really helpful teaching tool, but having not gone to university, I don't actually know how this stuff is being taught :v

throwanem•1d ago
Nor I, but I've learned from the metaphor, for what that's worth. And the demystification of primitives has considerable value these days.
notorious_pgb•1d ago
> Nor I, but I've learned from the metaphor, for what that's worth.

It's the sickest possible outcome for me, so: worth a lot!

Refreeze5224•1d ago
This is a fantastic way of demonstrating threading and scheduling, well done!
bitwize•1d ago
"...But first we need to talk about parallel universes."
dsfdsfsfew•1d ago
Kind of pushing the definition of RTOS here, I was expecting something like using ACE to run a concurrent program, still an interesting and informative post.
notorious_pgb•1d ago
You're goddamn right it's pushing it.

I struggled with how to convey this; the ultimate goal is to viscerally demonstrate what a thread IS, so I'm comfortable with it being... something strange... but I do want it to resemble actual concurrent systems to a useful degree.

Like, the thread scheduler code is userland code -- similar to, say, FreeRTOS -- but the "threads" themselves aren't exactly threads in an exact sense, since their context packages are entire save states for a console, not just its processor.

Also, the game code (the true userland?) has no ability to access the threading API whatsoever, which really harms the analogy.

So I went with RTOS because it's a preemptive scheduler with synchronous reschedule-on-block; where the scheduler is implemented in the same codebase as the code using it. But to be honest, nothing I know of fits.

loas•1d ago
I really enjoyed this, thank you.
shakna•1d ago
If you're going to make this your DooM thing, clearly the next thing to do is grab a DooM engine and get to work. (PsyDoom has Lua support even).

But congrats. This is worth being included in a uni course.

notorious_pgb•1d ago
It'd be really interesting to push this to that level -- the big challenge likely being that DooM's state (including video memory) is certainly much larger and expensive to swap than an NES save state.

One would have to get really clever...

noone_youknow•13h ago
… and introduce something like virtual memory?
anthk•1d ago
Forth does that too. https://www.bradrodriguez.com/papers/mtasking.html
gwbas1c•1d ago
Last night I read an article about eliminating lag in emulators. It's done with a similar concept. Basically, for each frame, the emulator calculates the state for different button combinations. Then, based on the button you actually push, the state shown moves to the precalculated one.
jchw•1d ago
Not 100% familiar with exactly what this is but I am familiar with run-ahead, which is basically also the same idea as GGPO, but for eliminating lag:

- Run the emulator one frame normally, using real polled input. Somehow snapshot the state.

- Then, run the emulator n frames (usually just one) with the same input. Present the video + audio from the last frame.

- Synchronize. (Some emulators can get very fancy; instead of just waiting for vsync, they'll also delay until the end of the window minus estimated processing time, to poll input at the last possible moment.)

- Roll back to the saved snapshot. (I believe you can also optimize if you know the inputs really didn't change, avoiding a lot of rollbacks at the cost of less predictable frame times.)

The main reason this is even a good idea is because most games will have some of their own processing latency by design, so jumping a frame or two ahead usually doesn't have any noticeable side-effects. This is a pretty cool idea since obviously modern computers with LCD screens have a lot more latency basically everywhere versus older simpler machines connected to CRTs.

Unfortunately, this sort of approach only works when your emulator's state is small enough and fast enough to restore.

I actually have been dying to experiment with designing an emulator to have fast incremental snapshots from the ground up to see if you could manage to make this feasible for more modern consoles. You could, for example, track dirty memory pages with userfaultfd/MEM_WRITE_WATCH, and design structures like JIT caches to be able to handle rewinding without having to drop the entire cache. I'm actually not sure that all emulators clear their caches upon loading state, but you know, more generally, I would like to know how fast and small you could get save states to be if you were designing for that from the ground up.

achierius•1d ago
What do you mean by JIT cache in this case -- like an inline data cache, or like a cache for the jitted binary code?
jchw•1d ago
The latter, though I'm just using it as an example really. I think for most emulator core designs today you wind up dumping a lot of ephemeral state when you rewind or load state (and sometimes even saving state has some overhead due to synchronization.)
cortesoft•1d ago
i think this is kind of the inverse of how some emulators that add remote play deal with network lag... they will predict what control the opponent will give for the current frame and display it, but then when they actually receive the real opponent control some number of frames later, they will go back to that frame and re-calculate based on knowing the actual control input for that frame, and then catch up to the current frame. That way the game isn't waiting for remote inputs to refresh the screen, but can then reconcile once all information is known.
potato3732842•1d ago
Sounds a lot like speculative execution for hardware.
sparky_•14h ago
Was just thinking the same, this is effectively branch prediction/precalculation.
cevn•1d ago
Umm, that's brilliant. Is this technique employed in all modern emulators or a recent thing?
notorious_pgb•1d ago
Interesting; which emulator(s) was this for? I have to imagine this strategy is mostly effective below a certain total complexity level -- however you want to define that -- and the more modern you get, the more you're lucky to even render a frame at all.
liamwire•1d ago
Brilliant visual demonstration, really helps build an intuition for the concepts at play.
Sohcahtoa82•1d ago
That's pretty neat.

The metaphor I usually go for when it comes to for threads is having several widgets that need to be assembled. To begin working on Widget B, you have to pause the work on Widget A. Adding SMP is like adding a second person to help assemble widgets, but you're still using the same toolbox, so if you both need the same screwdriver, there's no performance benefit to having a second person. Multicore is having multiple workbenches each with their own toolbox so they can operate completely in parallel.

A mutex is like having a specialized tool that can only be used by one widget assembly at a time. Like, each workbench might have a full set of screwdrivers, hammers, wrenches, sockets, etc., but you only have 1 welder.

A semaphore is a tool that can be used by a limited number of widgets, like an oven that can fit up to 4 widgets.

eru•1d ago
As a special challenge: explain (Software) Transactional Memory, as eg implemented in Haskell, in your metaphor.

STM is pretty similar to how some databases implement transactions.

ImPostingOnHN•1d ago
> Adding SMP is like adding a second person to help assemble widgets, but you're still using the same toolbox, so if you both need the same screwdriver, there's no performance benefit to having a second person.

This sounds like hyperthreading – or, maybe I'm missing the screwdriver in the metaphor :)

ranger207•23h ago
Hyperthreading is Intel's implementation of SMP
Sohcahtoa82•21h ago
Funny thing is, when I was first typing that comment, I started with "Hyperthreading" and deleted it in favor of SMP to be vendor agnostic.

EDIT: And it should have been SMT anyways!

_ihaque•20h ago
I think you mean SMT (simultaneous multithreading), as opposed to SMP (symmetric multiprocessing)?
Sohcahtoa82•20h ago
Ah hell, that was MY mistake. It should be SMT in my original comment and it's way too late to edit it now.
notorious_pgb•21h ago
I think this is a pretty damn good metaphor for the introduction of these concepts, especially in a context where the learner(s) might need to imminently apply this knowledge.

What drove me to write this post (as someone who didn't attend college where I might've learned exactly what I'm about to say -- big grain of salt) is that it took me being forced by circumstance to become a firmware engineer for a time to actually understand not just how to multithread well, but what the hell this all even was.

And I guess I saw one too many Reddit comments (mistake #1) which downplayed the idea that there's any benefit to knowing the answer to such things.

That all said, this blog post / metaphor is definitely aimed at someone who's already been using threads effectively for some time -- I don't think it'd fare very well as a first introduction.

iwontberude•18h ago
SMP is a second processor, even more redundant than a second core. Its own cache, memory, Northbridge access, etc.
Sohcahtoa82•12h ago
Yeah, I messed up SMT vs SMP. Can't edit it now, though.
notorious_pgb•11h ago
Ah well -- the article itself has its own fair share of accuracy issues -- and I learned something from your conversation, which wouldn't have happened without said mistake.
aftbit•1d ago
https://web.archive.org/web/20250528203416/https://prettygoo...

https://archive.ph/msHkO

hei-lima•1d ago
Wow! So cool!
noone_youknow•1d ago
This is simply amazing work, I take my hat off to you. And not just the threading in an emulator (which is a genius concept all by itself) but for the article itself - I have spent years trying to find the right way to articulate exactly your points in hopes of engaging people to learn more about the systems they’re building on top of.

From now on, I’ll simply send out a link to this. Thank you!

notorious_pgb•1d ago
My person!

I worried that I'd come across as a high-horse dickhead in the mystification section, when that's the fundamental opposite of my intention and attitude here. I'm glad to hear that it resonated with another.

Although I guess we could both be high-horse dickheads.

upghost•1d ago
Not only that but now, thanks to you, we can all be high-horse dickheads concurrently.
peterkelly•23h ago
The mystification section was my favorite part - you make some excellent points there.
noone_youknow•20h ago
IMO the mystification section is right on the money, I couldn’t have said it better myself (and believe me, I’ve tried).
CGamesPlay•1d ago
Would be neat if the subworld memory was shared between the save states, to really drive home the mutex concept.
notorious_pgb•1d ago
This is one of the core weaknesses of the analogy that I'm trying to figure out how to resolve: there's no shared memory, unlike actual threads.

I think the solution requires that the game code is what's executing threading syscalls, not the Lua around the game code.

It'd definitely be a super sick evolution to add true shared in-game memory.

kelnos•1d ago
I guess in a way the three Mario threads are more like processes? And they're using OS-level or shared-memory mutexes and condvars that can be shared between processes?
notorious_pgb•1d ago
I see what you're getting at -- though then it raises the question: what would it mean for those processes to have multiple threads within them?

The way I've been looking at it, if I want to take this further, processes would maybe be _NES ROMs_, and each process can have multiple threads. Swapping between processes would inherently be more expensive than between threads within a process, which I think would be a really instructive aspect. Also, the whole idea of "sharing memory" between entirely different games is obviously ridiculous, which could be yet another teaching tool.

Then if we want to expand it even further, maybe multicore support would look like multiple emulator instances, connected via sockets to support cross-core IPC. You'd probably want to institute an "all the threads of a process have to be on the same core" rule (so they can locally share primitives in the same Lua script), which would be a great way to demonstrate the principle of building systems which adapt to their realistic constraints, as opposed to building systems which exactly model a textbook.

I've gotten ahead of myself, though.

Ericson2314•20h ago
Easier to leave out the parallelism and just look at concurrency.

For a game like Mario, you can split up the memory and decide what you want to be shared and what do you want to be per-thread. E.g. starting small you can rig up just a few fixed variables like scores or whatever to be persisted from one game to the next. trying to push that further without causing endless corruption should be fun :)

leni536•21h ago
You have critical section in the bonus section behind the pipe. Maybe shared state protected by that critical section could be your shared memory, like the state of the coins there.
Cloudef•22h ago
Isn't this more like coroutines / green threads? I guess it doesn't matter as they are essentially same concept apart from cooperative scheduling vs. preemptive scheduling.
notorious_pgb•22h ago
I suppose the actual thing it is is closer to what you're describing, but the thing that it makes viscerally understandable is threads itself -- I hope.
abyesilyurt•22h ago
Tsoding recently built coroutines for C from scratch. The concept of coroutines is related to threads, so I bet you’d like the video if you liked the article.

https://youtu.be/sYSP_elDdZw

Ericson2314•21h ago
Sorry to be an pedant.

From your previous post

> The first thing to understand about threads is that processors do not support them.

Yes, but I would say by changing the emulator (augmenting the emulator software with your other software, sure) in the outside world, not doing something in the inside world, you have are implementing something closer akin to hardware concurrency support or processes.

> It doesn't map 1:1 to true multithreading (e.g., emulator save states represent the state of the entire machine including RAM, whereas thread contexts represent a much more minimal slice

Right, but I think if we're going to make a threads vs process distinction (vs concurrency in general) whether we have only a small opt-in set of communication channels / synchronization primitives (the mutex and barrier), or a wide open one (shared memory, be careful!) is the operative distinction.

mort96•17h ago
This is probably a stupid question, but why real-time? Isn't it enough to "just" make an OS in SMB, doesn't adding real-time guarantees make it even more difficult, by restricting what allocator and scheduling algorithms you can use?

In any case, this looks like a really cool project.

Someone told me to "touch grass," so I made this simulator

https://pirillo.com/arcade/touch-grass.html
1•ChrisPirillo•6m ago•0 comments

Decomplexification

https://daniel.haxx.se/blog/2025/05/29/decomplexification/
1•leephillips•6m ago•0 comments

Hugging Face unveils two new humanoid robots

https://techcrunch.com/2025/05/29/hugging-face-unveils-two-new-humanoid-robots/
1•ashutosh-mishra•6m ago•0 comments

The Zod Engine

https://zod.sourceforge.net/
1•petethomas•8m ago•0 comments

The ICE agents disappearing your neighbors would like a little privacy, please

https://sfstandard.com/2025/05/29/ice-arrests-blur-photos-agents-san-francisco-court/
7•arkadiyt•11m ago•0 comments

US says it will start revoking visas for Chinese students

https://www.reuters.com/world/us/rubio-says-us-will-start-revoking-visas-chinese-students-2025-05-28/
1•derbOac•16m ago•0 comments

We built a tool to read and learn Spanish.

https://guavabook.web.app/
1•kikichiki•16m ago•0 comments

DeepSeek-R1-0528 Released

https://api-docs.deepseek.com/news/news250528
1•tvvocold•19m ago•0 comments

AI artwork that's more than a gimmick

https://www.nytimes.com/2024/02/15/arts/design/aaron-ai-whitney.html
1•RyanShook•22m ago•0 comments

A Scientist Says Humans Will Go Backwards in Time Within Just 4 Years

https://www.popularmechanics.com/science/a64906457/humans-going-backwards-in-time/
2•rmason•27m ago•0 comments

OpenBAO (Vault open-source fork) Namespaces

https://openbao.org/blog/namespaces-announcement/
1•gslin•31m ago•0 comments

Famous NYC skyscraper was almost toppled by winds

https://nypost.com/2025/05/28/us-news/nyc-citicorp-tower-skyscraper-was-almost-toppled-by-winds/
2•rmason•33m ago•1 comments

The Yasny Breach: Russia's Nuclear Secrets Laid Bare by a Procurement Leak

https://twitter.com/nigroeneveld/status/1927713106907496542
2•borski•34m ago•1 comments

Coinbase Returns to SF: A Tech Boomerang Story

https://sffrontier.com/technology/coinbase-returns-to-sf-a-tech-boomerang-story
1•randycupertino•35m ago•1 comments

Spacewar (DEC PDP-1 emulated in JavaScript)

https://www.masswerk.at/spacewar/
2•felineflock•35m ago•0 comments

New Visa Policies Put America First, Not China

https://www.state.gov/releases/office-of-the-spokesperson/2025/05/new-visa-policies-put-america-first-not-china/
3•BeetleB•35m ago•0 comments

History of the Electronic Self Playing Piano System

https://www.thepianoworld.com/history-of-the-electronic-self-playing-piano-system
1•brudgers•36m ago•0 comments

Delivering full-stack 4G/5G and converged BSS in a remote Caribbean island

https://www.cloud-net.ai/case-studies/communication-coverage
1•y2so•37m ago•0 comments

Oregon Bill to Block Private-Equity Medical Deals Heads to Governor's Desk

https://www.wsj.com/articles/oregon-bill-to-unwind-private-equity-medical-deals-heads-to-governors-desk-a9a7d252
5•petethomas•42m ago•0 comments

MariaDB Acquires Galera Cluster

https://mariadb.com/newsroom/press-releases/mariadb-acquires-galera-cluster/
1•hu3•46m ago•0 comments

Product of Additive Inverses

https://susam.net/product-of-additive-inverses.html
1•susam•47m ago•0 comments

Player Piano Rolls

https://omeka-s.library.illinois.edu/s/MPAL/page/player-piano-rolls-landing
14•brudgers•47m ago•2 comments

Google's Stitch nails prompt abstraction – I explored similar ideas in janicre

1•anicre•47m ago•0 comments

Quiero crear una nueva aplicación

https://blog.cloudflare.com/foundation-dns-launch/
1•awildagomez•48m ago•0 comments

White House MAHA Report may have garbled science by using AI, experts say

https://www.adn.com/nation-world/2025/05/29/white-house-maha-report-may-have-garbled-science-by-using-ai-experts-say/
2•rolph•51m ago•0 comments

Prosody 13.0.2 released – An XMPP/Jabber server written in Lua

https://blog.prosody.im/prosody-13.0.2-released/
1•neustradamus•53m ago•0 comments

Pick Your (User Agent) Battles

https://sharpletters.net/2025/05/29/pick-your-ua-battles/
1•tabletcorry•55m ago•0 comments

RAG is dead, long live agentic retrieval

https://www.llamaindex.ai/blog/rag-is-dead-long-live-agentic-retrieval
2•thomasjudge•56m ago•1 comments

Bayes for Everyone

https://www.astralcodexten.com/p/bayes-for-everyone
1•feross•59m ago•0 comments

Source Code for Oracle VirtualBox

https://github.com/VirtualBox/virtualbox
3•Lwrless•1h ago•0 comments