Show HN: C discrete event SIM w stackful coroutines runs 45x faster than SimPy

https://github.com/ambonvik/cimba
21•ambonvik•3h ago
Hi all,

I have built Cimba, a multithreaded discrete event simulation library in C.

Cimba uses POSIX pthread multithreading for parallel execution of multiple simulation trials, while coroutines provide concurrency inside each simulated trial universe. The simulated processes are based on asymmetric stackful coroutines with the context switching hand-coded in assembly.
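A rough sketch of the outer layer described above, not taken from the Cimba source: independent trials are split across POSIX threads, and each trial touches only its own record, so no locking is needed. The names `trial_t`, `run_one_trial` and `run_trials` are hypothetical, and the trial body is a trivial stand-in for a real simulated universe.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_THREADS = 64 };

/* Hypothetical per-trial record: each trial owns its seed and its result. */
typedef struct {
    uint64_t seed;
    double   result;
} trial_t;

typedef struct {
    trial_t *trials;
    size_t   first, last;   /* half-open slice [first, last) for one worker */
} slice_t;

/* Stand-in for a whole simulated universe: just a seeded computation. */
static double run_one_trial(uint64_t seed)
{
    uint64_t x = seed;
    double sum = 0.0;
    for (int i = 0; i < 1000; i++) {
        x = x * 6364136223846793005ULL + 1442695040888963407ULL; /* LCG step */
        sum += (double)(x >> 11) / 9007199254740992.0;           /* uniform in [0, 1) */
    }
    return sum / 1000.0;
}

static void *worker(void *arg)
{
    slice_t *s = arg;
    for (size_t i = s->first; i < s->last; i++)
        s->trials[i].result = run_one_trial(s->trials[i].seed);
    return NULL;
}

/* Split n trials over nthreads workers; returns 0 on success, -1 on error. */
int run_trials(trial_t *trials, size_t n, size_t nthreads)
{
    pthread_t tid[MAX_THREADS];
    slice_t   slice[MAX_THREADS];
    size_t    started = 0;
    int       rc = 0;

    assert(trials != NULL || n == 0);
    if (nthreads == 0 || nthreads > MAX_THREADS)
        return -1;
    for (size_t t = 0; t < nthreads; t++) {
        slice[t].trials = trials;
        slice[t].first  = n * t / nthreads;
        slice[t].last   = n * (t + 1) / nthreads;
        if (pthread_create(&tid[t], NULL, worker, &slice[t]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (size_t t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return rc;
}
```

Because the trials share nothing, the result of a given trial depends only on its seed, not on which thread ran it or in what order.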

The stackful coroutines make it natural to express agentic behavior by conceptually placing oneself "inside" that process and describing what it does. A process can run in an infinite loop or just act as a one-shot customer passing through the system, yielding and resuming execution from any level of its call stack, acting both as an active agent and a passive object as needed. This is inspired by my own experience programming in Simula67, many moons ago, where I found the coroutines more important than the deservedly famous object-orientation.
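To illustrate what "yielding from any level of its call stack" means, here is a minimal sketch of an asymmetric stackful coroutine. It is not Cimba's implementation: it swaps in the `<ucontext.h>` calls (`getcontext`, `makecontext`, `swapcontext`) in place of hand-coded assembly, and every name in it (`coro_t`, `coro_start`, `coro_resume`, `coro_yield`, `serve`, `body`) is made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

enum { STACK_SIZE = 64 * 1024 };

typedef struct {
    ucontext_t caller;   /* where coro_resume() was called from */
    ucontext_t self;     /* the coroutine's own saved context */
    void      *stack;
    int        value;    /* last value handed out through coro_yield() */
    int        done;     /* set once the coroutine body has returned */
} coro_t;

static coro_t *current;  /* coroutine running right now (single-threaded sketch) */

/* Suspend the running coroutine and hand a value back to whoever resumed it. */
static void coro_yield(int value)
{
    coro_t *c = current;
    c->value = value;
    swapcontext(&c->self, &c->caller);
}

/* Run the coroutine until its next yield or until its body returns. */
int coro_resume(coro_t *c)
{
    assert(!c->done);
    current = c;
    swapcontext(&c->caller, &c->self);
    return c->value;
}

/* The yield happens one call below the body, which needs a separate stack. */
static void serve(int customer) { coro_yield(customer * 10); }

static void body(void)
{
    for (int customer = 1; customer <= 3; customer++)
        serve(customer);
    current->done = 1;   /* returning continues at uc_link, i.e. the caller */
}

/* Allocate a stack and prepare the coroutine; returns 0 on success. */
int coro_start(coro_t *c)
{
    c->stack = malloc(STACK_SIZE);
    if (c->stack == NULL)
        return -1;
    if (getcontext(&c->self) != 0) {
        free(c->stack);
        return -1;
    }
    c->self.uc_stack.ss_sp   = c->stack;
    c->self.uc_stack.ss_size = STACK_SIZE;
    c->self.uc_link          = &c->caller;
    c->value = 0;
    c->done  = 0;
    makecontext(&c->self, body, 0);
    return 0;
}
```

Each call to `coro_resume` runs the body until `serve` yields, so the caller sees one value per simulated customer while the loop counter and the call to `serve` stay alive on the coroutine's own stack in between.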

Cimba turned out to run really fast. In a simple benchmark, 100 trials of an M/M/1 queue run for one million time units each, it ran 45 times faster than an equivalent model built in SimPy + Python multiprocessing. The running time was reduced by 97.8 % vs the SimPy model. Cimba even processed more simulated events per second on a single CPU core than SimPy could do on all 64 cores.
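The 45x speedup and the 97.8 % reduction are the same measurement stated two ways. Writing the two running times as T_SimPy and T_Cimba (symbols introduced here only for illustration):

```latex
\[
\frac{T_{\text{SimPy}}}{T_{\text{Cimba}}} = 45
\quad\Longrightarrow\quad
1 - \frac{T_{\text{Cimba}}}{T_{\text{SimPy}}}
  = 1 - \frac{1}{45}
  = \frac{44}{45}
  \approx 0.978
\]
```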

The speed is not only due to the efficient coroutines. Other parts are also designed for speed, such as a hash-heap event queue (binary heap plus Fibonacci hash map), fast random number generators and distributions, memory pools for frequently used object types, and so on.
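For readers unfamiliar with the term, Fibonacci hashing in its usual textbook form is a single multiply and shift. The sketch below shows that general technique only; how Cimba actually keys and sizes its hash map is not stated here, so the function name and the idea of hashing an event handle are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Nearest odd integer to 2^64 divided by the golden ratio. */
#define FIB_MULTIPLIER 0x9E3779B97F4A7C15ULL

/* Map a 64-bit key (for example, a hypothetical event handle) to a slot
 * in a table of 2^bits entries, for 1 <= bits <= 63. */
uint64_t fib_hash(uint64_t key, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    /* The multiplication wraps modulo 2^64; the top bits form the slot index. */
    return (key * FIB_MULTIPLIER) >> (64 - bits);
}
```

Sequential keys such as 1, 2, 3 land in widely separated slots, which suits handles that are issued from a counter. In a hash-heap, a map like this would let the queue find an event's position in the binary heap by handle, for instance to cancel or reschedule it.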

The initial implementation supports the AMD64/x86-64 architecture for Linux and Windows. I plan to target Apple Silicon next, then probably ARM.

I believe this may interest the HN community. I would appreciate your views on both the API and the code. Any thoughts on future target architectures to consider?

Docs: https://cimba.readthedocs.io/en/latest/

Repo: https://github.com/ambonvik/cimba

Comments

quibono•1h ago
I don't know enough about event simulation to talk API design in depth but I find the stackful coroutine approach super interesting so I'll be taking a look at the code later!

Do you plan on accepting contributions or do you see the repo as being a read-only source?

ambonvik•1h ago
I would be happy accepting contributions, especially for porting to additional architectures. I think the dependency is relatively well encapsulated (see src/port), but code for additional architectures needs to be well tested on the actual platform, and there are limits to how much hardware fits on my desk.

jerf•1h ago
While that speed increase is real, of course, you're really just looking at the general speed delta between Python and C there. To be honest I'm a bit surprised you didn't get another factor of 2 or 3.

"Cimba even processed more simulated events per second on a single CPU core than SimPy could do on all 64 cores"

One of the reasons I don't care in the slightest about Python "fixing" the GIL. When your language is already running at a speed where a compiled language on a single core can quite reasonably be expected to outdo your performance on 32 or 64 cores, who really cares if removing the GIL lets me get twice the speed of an unthreaded program in Python by running on 8 cores? If speed was important you shouldn't have been using pure Python.

(And let me underline that pure in "pure Python". There are many ways to be in the Python ecosystem but not be running Python. Those all have their own complicated cost/benefit tradeoffs on speed ranging all over the board. I'm talking about pure Python here.)

anematode•45m ago
Looks really cool and I'm going to take a closer look tonight!

How do you do the context switching between coroutines? getcontext/setcontext, or something more architecture specific? I'm currently working on some stackful coroutine stuff and the swapcontext calls actually take a fair amount of time, so I'm planning on writing a custom one that doesn't preserve unused bits (signal mask and FPU state). So I'm curious about your findings there

ambonvik•37m ago
Hi, it is hand-coded assembly. Pushing all necessary registers to the stack (including GS on Windows), swapping the stack pointer to/from memory, popping the registers, and off we go on the other stack. I save FPU flags, but not more FPU state than necessary (which again is a whole lot more on Windows than on Linux).

Others have done this elsewhere, of course. There are links/references to several other examples in the code. I mention two in particular in the NOTICE file, not because I copied their code, but because I read it very closely and followed the outline of their examples. It would probably have taken me forever to figure out the Windows TIB on my own.

What I think is pretty cool (biased as I am) in my implementation is the «trampoline» that launches the coroutine function and waits silently in case it returns. If it does, it is intercepted and the proper coroutine exit() function gets called.
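The trampoline idea can be sketched in portable C, with the caveat that this is not the Cimba code: it uses `<ucontext.h>` rather than assembly, and `task_t`, `task_run`, `task_exit`, `trampoline` and `double_it` are all hypothetical names. The trampoline is the first frame on the new stack; if the user function simply returns, the return lands there and is routed through the same exit path an explicit exit call would take.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

enum { TRAMPOLINE_STACK = 64 * 1024 };

typedef int (*coro_fn)(int arg);

typedef struct {
    ucontext_t caller, self;
    void      *stack;
    coro_fn    fn;
    int        arg;
    int        exit_value;
    int        exited;
} task_t;

static task_t *launching;   /* handed to the trampoline; single-threaded sketch */

/* Explicit exit path: record the value and leave the coroutine stack for good. */
static void task_exit(task_t *t, int value)
{
    t->exit_value = value;
    t->exited = 1;
    setcontext(&t->caller);   /* does not return on success */
}

/* First frame on the new stack. If fn returns, route it through task_exit(). */
static void trampoline(void)
{
    task_t *t = launching;
    int value = t->fn(t->arg);
    task_exit(t, value);
}

/* Example user function that just returns instead of calling task_exit(). */
int double_it(int x) { return 2 * x; }

/* Launch fn(arg) on its own stack; returns 0 once the task has exited. */
int task_run(task_t *t, coro_fn fn, int arg)
{
    t->fn = fn;
    t->arg = arg;
    t->exited = 0;
    t->exit_value = 0;
    t->stack = malloc(TRAMPOLINE_STACK);
    if (t->stack == NULL)
        return -1;
    if (getcontext(&t->self) != 0) {
        free(t->stack);
        return -1;
    }
    t->self.uc_stack.ss_sp   = t->stack;
    t->self.uc_stack.ss_size = TRAMPOLINE_STACK;
    t->self.uc_link          = NULL;     /* unused: the trampoline never returns */
    makecontext(&t->self, trampoline, 0);
    launching = t;
    swapcontext(&t->caller, &t->self);   /* comes back here via task_exit() */
    free(t->stack);
    t->stack = NULL;
    return 0;
}
```

Running `task_run(&t, double_it, 21)` leaves `t.exited` set and `t.exit_value` holding the returned value, even though `double_it` never called an exit function itself.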

anematode•11m ago
Interesting. How does the trampoline work?

I'm wondering whether we could further decrease the overhead of the switch on GCC/clang by marking the push function with `__attribute__((preserve_none))`. Then among GPRs we only need to save the base and stack pointers, and the callers will only save what they need to

sovande•7m ago
Didn’t read the code yet, but stuff like this tends to be brittle. Do you do something clever around stack overflow, function return overwrite or would that just mess up all coroutines using the same stack?
