Hi HN,
I started this project because I was tired of BMP libraries that hide allocations, pull in half of libc, or solve only one small part of the problem. There are many libbmp-style repos, but the functionality feels scattered: one does decoding, another does encoding, another targets embedded, another is single-header. In real projects this often means depending on 5–10 small libraries just to load and display images.
I wanted one predictable library instead: no allocations, one header, full control over buffers, and usable both on desktop and on microcontrollers. Something where you always know where memory comes from and where it goes.
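To make the "no allocations, caller owns the buffers" idea concrete, here is a minimal sketch of the pattern. It is illustrative only: `bmp_info`, `bmp_probe`, and `bmp_output_size` are names made up for this post, not the real TurboLibBMP API (see the repo for that). The sketch reads the standard BMP file header and BITMAPINFOHEADER fields from a byte buffer, without stdio, and reports how many bytes the caller must provide for an RGBA output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names, for illustration only. */
typedef struct {
    int32_t  width;
    int32_t  height;      /* always positive here */
    uint16_t bpp;
    uint32_t data_offset; /* offset of the pixel array in the file */
    int      top_down;    /* 1 if rows are stored top to bottom */
} bmp_info;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses the 14-byte file header plus a BITMAPINFOHEADER (or larger)
   from memory. Returns 0 on success, -1 on anything unexpected. */
static int bmp_probe(const uint8_t *data, size_t size, bmp_info *out)
{
    if (!data || !out || size < 54) return -1;
    if (data[0] != 'B' || data[1] != 'M') return -1;
    if (rd32(data + 14) < 40) return -1; /* DIB header too small */

    int32_t w = (int32_t)rd32(data + 18);
    int32_t h = (int32_t)rd32(data + 22);
    /* Arbitrary cap chosen for this sketch; it also rules out overflow. */
    if (w <= 0 || w > 32767) return -1;
    if (h == 0 || h < -32767 || h > 32767) return -1;

    out->width = w;
    out->top_down = h < 0;           /* negative height means top-down rows */
    out->height = h < 0 ? -h : h;
    out->bpp = rd16(data + 28);
    out->data_offset = rd32(data + 10);
    return 0;
}

/* Bytes the caller must supply for tightly packed 32-bit RGBA output. */
static size_t bmp_output_size(const bmp_info *info)
{
    return (size_t)info->width * (size_t)info->height * 4u;
}
```

The caller probes first, then decides where the output lives: a static array, a stack buffer, or a region of a framebuffer. Nothing inside the library has to guess.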
Over time, I got more and more into it and started optimizing around what I call ESS: Energy, Speed, Small binary. I added fast paths for common 24/32bpp BMPs, removed unnecessary branching in hot loops, and focused on predictable performance. For embedded systems, I added a streaming decoder that can write directly to an LCD or DMA buffer via callbacks, and can even decode only a clipped rectangle to save CPU time and power.
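The streaming and clipping idea can be sketched roughly like this. Again, this is a simplified illustration with hypothetical names (`clip_rect`, `row_sink`, `stream_clip_bgr24`), not the library's actual interface. It assumes an uncompressed, bottom-up 24bpp pixel array already in memory and hands each clipped row segment to a callback, so no full-frame output buffer is needed and rows outside the rectangle are never touched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef struct { int x, y, w, h; } clip_rect;

/* Called once per row segment; px points at w BGR triplets. */
typedef void (*row_sink)(void *user, int x, int y, const uint8_t *px, int w);

/* Streams the clipped region, top visual row first.
   Returns the number of rows emitted, or -1 on bad input. */
static int stream_clip_bgr24(const uint8_t *pixels, size_t size,
                             int width, int height,
                             clip_rect c, row_sink sink, void *user)
{
    if (!pixels || !sink) return -1;
    if (width <= 0 || width > 32767 || height <= 0 || height > 32767)
        return -1;

    /* BMP rows are padded to a multiple of 4 bytes. */
    size_t stride = ((size_t)width * 3u + 3u) & ~(size_t)3u;
    if (size / stride < (size_t)height) return -1;

    /* Clamp the rectangle to the image; 64-bit math avoids int overflow. */
    int64_t x0 = c.x < 0 ? 0 : c.x;
    int64_t y0 = c.y < 0 ? 0 : c.y;
    int64_t x1 = (int64_t)c.x + c.w;
    int64_t y1 = (int64_t)c.y + c.h;
    if (x1 > width)  x1 = width;
    if (y1 > height) y1 = height;
    if (x0 >= x1 || y0 >= y1) return 0; /* nothing visible */

    for (int64_t y = y0; y < y1; ++y) {
        /* Bottom-up storage: visual row y is file row (height - 1 - y). */
        const uint8_t *row = pixels + (size_t)(height - 1 - y) * stride;
        sink(user, (int)x0, (int)y, row + (size_t)x0 * 3u, (int)(x1 - x0));
    }
    return (int)(y1 - y0);
}

/* Example sink that only gathers statistics. A real one would convert
   the segment and push it to an LCD window or a DMA queue. */
typedef struct { int rows; int pixels; uint8_t first_blue; } sink_stats;

static void stats_sink(void *user, int x, int y, const uint8_t *px, int w)
{
    sink_stats *s = (sink_stats *)user;
    (void)x; (void)y;
    if (s->rows == 0) s->first_blue = px[0];
    s->rows += 1;
    s->pixels += w;
}
```

The work done is proportional to the clipped area rather than the full image, which is where the CPU time and power savings come from on small displays.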
The result is TurboLibBMP: a single-header, stb-style BMP decoder and encoder in C, with no allocations, no stdio, no hidden buffers, and full user control over memory and behavior. It is GPL-3.0 and works in both C and C++.
I built this mainly for my own low-level and embedded projects, but I’m sharing it in case others find it useful. I would really appreciate feedback on the API design, edge cases, and whether this approach makes sense in real projects.
Repo:
https://github.com/Ferki-git-creator/turbo-lib-bmp