Got into an argument on Discord about how inefficient CBR/CBZ is

https://old.reddit.com/r/selfhosted/comments/1qi64pr/i_got_into_an_argument_on_discord_about_how/

3•Breadmaker•1w ago

Comments

forgotpwd16•1w ago

>best girl Matsuri profile avatar

>manga-related project

No more info required to know it's good.

Joking aside, there is indeed unexplored space in this niche. It's quite amusing, though, how they went from an argument on a (yuri) manga Discord to actually implementing and delivering a technically competent, well-presented solution.

I also have some comments. First, two points about ZIP: (i) it doesn't use solid compression (each file is compressed individually), and (ii) it can be used in archive-only (no compression) mode, aka "stored" (compression method 0).
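Both points can be checked with Python's standard `zipfile` module. This is only a minimal sketch: the page names and payloads are made up, and real pages would be JPEG/PNG bytes.

```python
import io
import zipfile

def build_archive(pages, method):
    """Write pages (name -> bytes) into an in-memory ZIP with one compression method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        for name, data in pages.items():
            zf.writestr(name, data)
    return buf.getvalue()

def describe(archive_bytes):
    """Return (name, method, original size, size inside the archive) per entry."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return [(i.filename, i.compress_type, i.file_size, i.compress_size)
                for i in zf.infolist()]

# Hypothetical, highly compressible stand-ins for image data.
pages = {"001.jpg": b"A" * 4096, "002.jpg": b"B" * 4096}

stored = describe(build_archive(pages, zipfile.ZIP_STORED))      # method 0
deflated = describe(build_archive(pages, zipfile.ZIP_DEFLATED))  # method 8

for row in stored + deflated:
    print(row)
```

Each entry carries its own method and its own compressed size, which is point (i). With method 0 the size inside the archive equals the original size, which is point (ii).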

>Random Access

ZIP supports random access: the central directory near the end of the file records where each entry starts, and because of (i) each entry can be decompressed on its own.
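A rough illustration with Python's `zipfile`, assuming a small made-up archive: the central directory exposes the offset of every entry, so a reader can jump straight to page three without touching pages one and two.

```python
import io
import zipfile

# Build a three-page archive in memory (page contents are placeholders).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    for n in range(1, 4):
        zf.writestr(f"{n:03d}.jpg", bytes([n]) * 1000)

with zipfile.ZipFile(buf) as zf:
    # Opening the archive parses only the central directory at the end of the file.
    offsets = {i.filename: i.header_offset for i in zf.infolist()}
    # read() seeks to the recorded offset and reads just this one entry.
    page_three = zf.read("003.jpg")

print(offsets)
```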

>If one file is corrupt, the whole thing won't open.

ZIP lacks corruption resistance, but it isn't totally fragile. Due to (i), corruption doesn't cascade across files: if one file is corrupted, the others can still be extracted. If no compression is used (files "stored"), it is even possible to recover files by their headers.
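To make the "doesn't cascade" part concrete, here is a sketch that flips one byte inside the data of the second entry of a made-up archive and then tries to read every entry. It uses "stored" entries so the damage is deterministic. The offset arithmetic follows the ZIP local file header layout (30 fixed bytes, then name and extra field).

```python
import io
import struct
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    for n in range(1, 4):
        zf.writestr(f"{n:03d}.jpg", bytes([n]) * 1000)
raw = bytearray(buf.getvalue())

# Locate the start of the data for 002.jpg via its local file header.
with zipfile.ZipFile(io.BytesIO(bytes(raw))) as zf:
    off = zf.getinfo("002.jpg").header_offset
name_len, extra_len = struct.unpack_from("<HH", raw, off + 26)
data_start = off + 30 + name_len + extra_len

raw[data_start + 10] ^= 0xFF  # simulate a single damaged byte

readable, damaged = [], []
with zipfile.ZipFile(io.BytesIO(bytes(raw))) as zf:
    for info in zf.infolist():
        try:
            zf.read(info)  # the CRC-32 check fails only for the damaged entry
            readable.append(info.filename)
        except zipfile.BadZipFile:
            damaged.append(info.filename)

print("readable:", readable, "damaged:", damaged)
```

Damage to the central directory itself is a different story and is not covered by this sketch.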

>Metadata isn't native to CBZ,

CBZ (and CBR, CB7, etc.) isn't really a well-defined format. It's just pictures inside a ZIP file. (Because of this, the project competes with CBZ usage-wise, but the actual competing tech is zip/rar/etc.) The ComicInfo file is only a convention. This information can instead be saved in the ZIP archive comment (some readers support this).
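A minimal sketch of the comment approach using Python's `zipfile`. The JSON layout and the field names here are assumptions for illustration only, not a schema any particular reader is known to expect.

```python
import io
import json
import zipfile

# Hypothetical metadata fields, chosen only for the example.
metadata = {"Series": "Example Series", "Number": "1"}

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("001.jpg", b"\x00" * 16)  # placeholder page
    # The archive comment is a bytes field written at the end of the file.
    zf.comment = json.dumps(metadata).encode("utf-8")

with zipfile.ZipFile(buf) as zf:
    recovered = json.loads(zf.comment.decode("utf-8"))

print(recovered)
```

The comment field is limited to 65535 bytes, so this suits compact metadata rather than anything large.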

>BBF's content deduplication

I wonder how this works. I assume (from the comment '[...]the same "Credits.jpg", "ScanlationGroup.png"[...]') that it is done across files, so maybe something like linked MKVs, which some anime mini-encode groups use? But then how is the zero-copy goal achieved? In any case, no, ZIP cannot do this. But if the files are "stored", a deduplicating filesystem can handle it.
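For what it's worth, the kind of external tooling this implies could start from content hashing. The sketch below is not how BBF does it (that is unknown here); it just finds byte-identical pages across two made-up "stored" archives. Whether a given filesystem would then actually share those blocks depends on its block size and alignment, which this does not show.

```python
import hashlib
import io
import zipfile
from collections import defaultdict

def make_cbz(pages):
    """Build an in-memory 'stored' archive from name -> bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in pages.items():
            zf.writestr(name, data)
    return buf.getvalue()

# Hypothetical volumes that share one identical credits page.
credits = b"credits page bytes"
volumes = {
    "vol1": make_cbz({"001.jpg": b"page one", "credits.jpg": credits}),
    "vol2": make_cbz({"001.jpg": b"page two", "credits.jpg": credits}),
}

# Map each content digest to every (volume, entry) that holds it.
seen = defaultdict(list)
for label, blob in volumes.items():
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        for info in zf.infolist():
            digest = hashlib.sha256(zf.read(info)).hexdigest()
            seen[digest].append((label, info.filename))

duplicates = [locations for locations in seen.values() if len(locations) > 1]
print(duplicates)
```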

>Images inside are bit-exact copies. No re-encoding.

ZIP can do this due to (ii).
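A quick way to see this, again as a sketch with a made-up payload: in a "stored" archive the original bytes sit verbatim inside the archive file, and what comes back out is identical to what went in.

```python
import hashlib
import io
import zipfile

# Placeholder payload standing in for an already-encoded image.
payload = bytes(range(256)) * 8

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("001.png", payload)
raw = buf.getvalue()

# With method 0 the payload appears unchanged inside the archive bytes.
embedded_verbatim = payload in raw

with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    extracted = zf.read("001.png")

same_digest = hashlib.sha256(extracted).digest() == hashlib.sha256(payload).digest()
print(embedded_verbatim, same_digest)
```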

TL;DR: BBF competes against "stored" ZIP plus (conventional) metadata saved in the ZIP comment, plus external tooling if my deduplication assumption is correct. The issue, though, and perhaps why an entirely new format is needed, is that CBZ is only a convention, and many of the things BBF aims to do won't hold unless they are bound into the design.