Show HN: C discrete event SIM w stackful coroutines runs 45x faster than SimPy

https://github.com/ambonvik/cimba
19•ambonvik•3h ago
Hi all,

I have built Cimba, a multithreaded discrete event simulation library in C.

Cimba uses POSIX pthread multithreading for parallel execution of multiple simulation trials, while coroutines provide concurrency inside each simulated trial universe. The simulated processes are based on asymmetric stackful coroutines with the context switching hand-coded in assembly.
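The trial-level parallelism described above can be sketched roughly as follows. This is a minimal illustration, not Cimba's actual API: `struct trial`, `run_trial` and `run_trials_parallel` are hypothetical names, and `run_trial` is only a stand-in (a small xorshift Monte Carlo mean) for a complete simulation run. The point it shows is that each trial owns its seed and result slot, so worker threads need no locking.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-trial record; these names are not Cimba's API. */
struct trial {
    uint64_t seed;    /* independent RNG seed, so trials share no state */
    double   result;  /* written by exactly one worker thread */
};

struct worker_arg {
    struct trial *trials;
    size_t first, stride, count;
};

/* Stand-in for one complete simulation run: the mean of 100000
 * xorshift64 draws scaled to [0, 1). */
static double run_trial(uint64_t seed)
{
    uint64_t x = seed ? seed : 1;  /* xorshift state must be nonzero */
    const int n = 100000;
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        sum += (double)(x >> 11) / 9007199254740992.0;  /* divide by 2^53 */
    }
    return sum / n;
}

/* Each worker handles trials first, first + stride, first + 2*stride, ... */
static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (size_t i = a->first; i < a->count; i += a->stride)
        a->trials[i].result = run_trial(a->trials[i].seed);
    return NULL;
}

/* Runs all trials on n_threads threads. Returns 0 on success, -1 on error. */
int run_trials_parallel(struct trial *trials, size_t count, size_t n_threads)
{
    if (n_threads == 0)
        return -1;
    pthread_t *tids = malloc(n_threads * sizeof *tids);
    struct worker_arg *args = malloc(n_threads * sizeof *args);
    if (!tids || !args) {
        free(tids);
        free(args);
        return -1;
    }
    size_t started = 0;
    int rc = 0;
    for (; started < n_threads; started++) {
        args[started] = (struct worker_arg){ trials, started, n_threads, count };
        if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0) {
            rc = -1;
            break;
        }
    }
    /* Join only the threads that were actually created. */
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    free(args);
    return rc;
}
```

A caller would fill an array of `struct trial` with distinct seeds, call `run_trials_parallel(trials, count, n_threads)`, and read the `result` fields afterwards. The static striding is an assumption made for brevity; a shared work queue would balance uneven trial lengths better.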

The stackful coroutines make it natural to express agentic behavior by conceptually placing oneself "inside" that process and describing what it does. A process can run in an infinite loop or just act as a one-shot customer passing through the system, yielding and resuming execution from any level of its call stack, acting both as an active agent and a passive object as needed. This is inspired by my own experience programming in Simula67, many moons ago, where I found the coroutines more important than the deservedly famous object-orientation.

Cimba turned out to run really fast. In a simple benchmark (100 trials of an M/M/1 queue, each run for one million time units), it ran 45 times faster than an equivalent model built in SimPy + Python multiprocessing, a 97.8% reduction in running time. Cimba even processed more simulated events per second on a single CPU core than SimPy could do on all 64 cores.

The speed is not only due to the efficient coroutines. Other parts are also designed for speed, such as a hash-heap event queue (binary heap plus Fibonacci hash map), fast random number generators and distributions, memory pools for frequently used object types, and so on.
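To make the two ingredients of the event queue concrete, here is a minimal sketch of a binary min-heap ordered by event time, together with a Fibonacci hash function of the kind a handle-to-heap-position map could use. It is not Cimba's implementation: `struct event`, `eq_push`, `eq_pop`, `fib_hash` and the fixed capacity `EQ_CAP` are all assumptions made for illustration, and the hash map that would tie the two together (for cancelling or rescheduling an event by handle) is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Fibonacci hashing: multiply by 2^64 / golden ratio and keep the top
 * `bits` bits as the bucket index. `bits` must be in the range 1..63. */
static inline uint64_t fib_hash(uint64_t key, unsigned bits)
{
    return (key * 11400714819323198485ull) >> (64 - bits);
}

#define EQ_CAP 1024  /* fixed capacity, chosen only to keep the sketch short */

struct event {
    double   time;    /* simulated activation time, the heap key */
    uint64_t handle;  /* identifier a hash map could use to find the event */
};

struct event_queue {
    struct event heap[EQ_CAP];
    size_t len;
};

static void eq_swap(struct event *a, struct event *b)
{
    struct event tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Insert an event and sift it up. Returns 0 on success, -1 if full. */
int eq_push(struct event_queue *q, double time, uint64_t handle)
{
    if (q->len == EQ_CAP)
        return -1;
    size_t i = q->len++;
    q->heap[i] = (struct event){ time, handle };
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (q->heap[parent].time <= q->heap[i].time)
            break;
        eq_swap(&q->heap[parent], &q->heap[i]);
        i = parent;
    }
    return 0;
}

/* Remove the earliest event and sift the last one down from the root.
 * Returns 0 on success, -1 if the queue is empty. */
int eq_pop(struct event_queue *q, struct event *out)
{
    if (q->len == 0)
        return -1;
    *out = q->heap[0];
    q->heap[0] = q->heap[--q->len];
    size_t i = 0;
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, min = i;
        if (left < q->len && q->heap[left].time < q->heap[min].time)
            min = left;
        if (right < q->len && q->heap[right].time < q->heap[min].time)
            min = right;
        if (min == i)
            break;
        eq_swap(&q->heap[i], &q->heap[min]);
        i = min;
    }
    return 0;
}
```

In a full hash-heap, every swap would also update the stored heap position of both events in the map, so that looking up an event by handle stays O(1) on average and removing it stays O(log n).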

The initial implementation supports the AMD64/x86-64 architecture for Linux and Windows. I plan to target Apple Silicon next, then probably ARM.

I believe this may interest the HN community. I would appreciate your views on both the API and the code. Any thoughts on future target architectures to consider?

Docs: https://cimba.readthedocs.io/en/latest/

Repo: https://github.com/ambonvik/cimba

Comments

quibono•1h ago
I don't know enough about event simulation to talk API design in depth but I find the stackful coroutine approach super interesting so I'll be taking a look at the code later!

Do you plan on accepting contributions or do you see the repo as being a read-only source?

ambonvik•1h ago
I would be happy accepting contributions, especially for porting to additional architectures. I think the dependency is relatively well encapsulated (see src/port), but code for additional architectures needs to be well tested on the actual platform, and there are limits to how much hardware fits on my desk.

jerf•57m ago
While that speed increase is real, of course, you're really just looking at the general speed delta between Python and C there. To be honest I'm a bit surprised you didn't get another factor of 2 or 3.

"Cimba even processed more simulated events per second on a single CPU core than SimPy could do on all 64 cores"

One of the reasons I don't care in the slightest about Python "fixing" the GIL. When your language is already running at a speed where a compiled language can be quite reasonably expected to outdo your performance on 32 or 64 cores on a single core, who really cares if removing the GIL lets me get twice the speed of an unthreaded program in Python by running on 8 cores? If speed was important you shouldn't have been using pure Python.

(And let me underline that pure in "pure Python". There are many ways to be in the Python ecosystem but not be running Python. Those all have their own complicated cost/benefit tradeoffs on speed ranging all over the board. I'm talking about pure Python here.)

anematode•40m ago
Looks really cool and I'm going to take a closer look tonight!

How do you do the context switching between coroutines? getcontext/setcontext, or something more architecture specific? I'm currently working on some stackful coroutine stuff and the swapcontext calls actually take a fair amount of time, so I'm planning on writing a custom one that doesn't preserve unused bits (signal mask and FPU state). So I'm curious about your findings there.

ambonvik•32m ago
Hi, it is hand-coded assembly. Pushing all necessary registers to the stack (including GS on Windows), swapping the stack pointer to/from memory, popping the registers, and off we go on the other stack. I save FPU flags, but not more FPU state than necessary (which again is a whole lot more on Windows than on Linux).

Others have done this elsewhere, of course. There are links/references to several other examples in the code. I mention two in particular in the NOTICE file, not because I copied their code, but because I read it very closely and followed the outline of their examples. It would probably have taken me forever to figure out the Windows TIB on my own.

What I think is pretty cool (biased as I am) in my implementation is the «trampoline» that launches the coroutine function and waits silently in case it returns. If it does, it is intercepted and the proper coroutine exit() function gets called.
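The trampoline idea can be illustrated with a portable sketch that uses POSIX `swapcontext` in place of hand-coded assembly. This is not Cimba's code: it assumes Linux with glibc's `<ucontext.h>`, a single thread, and hypothetical names throughout (`struct coro`, `coro_init`, `coro_resume`, `coro_yield`, `demo_body`). The coroutine never starts directly in the user function. It starts in `trampoline`, which calls the function and, if the function returns, routes control into an exit handler instead of falling off the end of the coroutine stack.

```c
#include <stdlib.h>
#include <ucontext.h>

typedef void (*coro_fn)(void *arg);

/* Hypothetical coroutine record for this sketch. */
struct coro {
    ucontext_t ctx;      /* saved context of the coroutine itself */
    ucontext_t caller;   /* saved context of whoever resumed it */
    coro_fn    fn;       /* user function run on the coroutine stack */
    void      *arg;
    void      *stack;    /* heap-allocated stack, freed by the owner */
    int        finished;
};

static struct coro *current;  /* running coroutine; single-threaded sketch */

/* Switch back to the resumer; callable from any depth of the call stack. */
void coro_yield(void)
{
    struct coro *c = current;
    swapcontext(&c->ctx, &c->caller);
}

/* Mark the coroutine finished and leave it for good. */
static void coro_exit(void)
{
    struct coro *c = current;
    c->finished = 1;
    swapcontext(&c->ctx, &c->caller);  /* never resumed after this */
}

/* Entry point of every coroutine: intercepts a plain return from fn. */
static void trampoline(void)
{
    struct coro *c = current;
    c->fn(c->arg);
    coro_exit();
}

/* Prepare a coroutine on its own stack. Returns 0 on success, -1 on error. */
int coro_init(struct coro *c, coro_fn fn, void *arg, size_t stack_size)
{
    if (getcontext(&c->ctx) != 0)
        return -1;
    c->stack = malloc(stack_size);
    if (c->stack == NULL)
        return -1;
    c->ctx.uc_stack.ss_sp = c->stack;
    c->ctx.uc_stack.ss_size = stack_size;
    c->ctx.uc_link = NULL;  /* unused: the trampoline never returns */
    c->fn = fn;
    c->arg = arg;
    c->finished = 0;
    makecontext(&c->ctx, trampoline, 0);
    return 0;
}

/* Run the coroutine until it yields or exits.
 * Returns 1 if it can be resumed again, 0 if it has finished. */
int coro_resume(struct coro *c)
{
    if (c->finished)
        return 0;
    struct coro *prev = current;
    current = c;
    swapcontext(&c->caller, &c->ctx);
    current = prev;
    return !c->finished;
}

/* Example body: yields from inside a nested call, then simply returns. */
static void bump_and_yield(int *counter)
{
    (*counter)++;
    coro_yield();
    (*counter)++;
}

void demo_body(void *arg)
{
    bump_and_yield(arg);
}
```

Resuming `demo_body` once runs it up to the yield inside `bump_and_yield`; resuming it a second time lets it return, at which point the trampoline marks it finished and hands control back to the resumer. As noted in the thread, `swapcontext` also saves and restores the signal mask, which is part of the overhead a hand-written switch avoids.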

anematode•6m ago
Interesting. How does the trampoline work?

I'm wondering whether we could further decrease the overhead of the switch on GCC/clang by marking the push function with `__attribute__((preserve_none))`. Then among GPRs we only need to save the base and stack pointers, and the callers will only save what they need to.

sovande•2m ago
Didn’t read the code yet, but stuff like this tends to be brittle. Do you do something clever around stack overflow, or would that just mess up all coroutines using the same stack?
