Their reference parsers for Mach-O and DER work quite nicely in abi3audit[1].
[1]: https://github.com/pypa/abi3audit/tree/main/abi3audit/_vendo...
Kaitai is for describing, encoding and decoding file formats. Wuffs is for decoding images (which includes decoding certain file formats). Kaitai is multi-language, Wuffs compiles to C only. If you wrote a parser for PNGs, your Kaitai implementation could tell you what the resolution was, where the palette information was (if any), what the comments look like and on what byte the compressed pixel chunk started. Your Wuffs implementation would give you back the decoded pixels (OK, and the resolution).
Think of Kaitai as an IDL generator for file formats, perhaps. It lets you parse the file into some sort of language-native struct (say, a series of nested objects) but doesn't try to process it beyond the parse.
> Kaitai Struct is in a similar space, generating safe parsers for multiple target programming languages from one declarative specification. Again, Wuffs differs in that it is a complete (and performant) end to end implementation, not just for the structured parts of a file format. Repeating a point in the previous paragraph, the difficulty in decoding the GIF format isn't in the regularly-expressible part of the format, it's in the LZW compression. Kaitai's GIF parser returns the compressed LZW data as an opaque blob.
Taking PNG as an example, Kaitai will tell you the image's metadata (including width and height) and that the compressed pixels are in the such-and-such part of the file. But unlike Wuffs, Kaitai doesn't actually decode the compressed pixels.
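To make that split concrete, here is a rough hand-written Python sketch (not Kaitai's generated code, and not how Wuffs works internally). It does only the structural half for PNG: it walks the chunk list, pulls width and height out of IHDR, and records where each chunk body sits in the file. The names `parse_png_structure` and `make_tiny_png` are made up for this illustration, and the sketch skips CRC verification entirely.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def parse_png_structure(data: bytes) -> dict:
    """Walk the PNG chunk list; report metadata and chunk offsets only.

    The compressed pixel data in IDAT is located but never decompressed.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    info = {"width": None, "height": None, "chunks": []}
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body_start = pos + 8
        body = data[body_start:body_start + length]
        if ctype == b"IHDR":
            info["width"], info["height"] = struct.unpack(">II", body[:8])
        info["chunks"].append((ctype.decode("ascii"), body_start, length))
        pos = body_start + length + 4  # skip the body and the CRC
        if ctype == b"IEND":
            break
    return info


def _chunk(ctype: bytes, body: bytes) -> bytes:
    """Serialize one PNG chunk with its CRC (helper for the demo file)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


def make_tiny_png() -> bytes:
    """Build a 1x1 8-bit grayscale PNG in memory so the sketch needs no files."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # one filter byte plus one black pixel
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))


if __name__ == "__main__":
    info = parse_png_structure(make_tiny_png())
    print(info["width"], info["height"])
    for name, offset, length in info["chunks"]:
        print(f"{name}: body at byte {offset}, {length} bytes")
```

Everything this returns is "where things are". Turning the IDAT bytes into pixels (inflate, then undoing the per-row filters) is the part a Wuffs-style decoder does and a structural parse leaves alone.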
---
Wuffs' generated C code also doesn't need any capabilities, including the ability to malloc or free. Its example/mzcat program (equivalent to /bin/bzcat or /bin/zcat, for decoding BZIP2 or GZIP) self-imposes a SECCOMP_MODE_STRICT sandbox, which is so restrictive (and secure!) that it prohibits any syscalls other than read, write, _exit and sigreturn.
(I am the Wuffs author.)
Maybe for GIF, but it's worth pondering that each of the following is true:
(a) Wuffs doesn't implement the various archive formats; the deflate/ README says this is a TODO
(b) the very first sentence of the Wuffs README says Wuffs is "for Wrangling Untrusted File Formats Safely. Wrangling includes parsing, decoding and encoding. Example file formats include images, audio, video, fonts and compressed archives."
(c) a bunch of the commentary that has accompanied the advisories about ZIP implementation exploits over the last several months has included complaints about the ZIP container format (and not deflate)
(d) for the longest time (like years), the Kaitai IDE demo for ZIP was broken (it may still be broken; I'm not in a place where I can check right now)
I gave a guest lecture in a friend's class last week where we used Kaitai to back out the file format used in "Where in Time is Carmen Sandiego" and it was a total blast. (For me. Not sure that the class agreed? Maybe.) The Web IDE made this super easy -- https://ide.kaitai.io/ .
(On my YouTube page I've got recordings of streams where I work with Kaitai to do projects like these, but somehow I am not able to work up the courage to link them here.)
DFDL is heavily encroaching on Kaitai Struct's territory.
I did NOT have fun trying to use Kaitai to pack the files back together. Not sure if this has improved at all, but a year or so ago you had to build dependencies yourself, and the process was so cumbersome that it ended up being easier to just write the imperative code myself.
zzlk•4h ago