My compression algo explorations are like font explorations. I spend a lot of time doing research and testing, but I (almost) always end up coming back to gzip / Arial.
One notable exception: for very large files (e.g. 10GB+ mbox archives), we found that 7z compressed to 39% of the original size versus 65% for gzip. 7z was also about 10% faster.
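For anyone who wants to rerun this kind of comparison on their own data, here is a rough sketch. It uses Python's standard `gzip` and `lzma` modules. Note that `lzma` is only a stand-in for 7z (an assumption on my part: 7z's default method is from the LZMA family, but it is not the same tool or container), and the `compare` helper and the sample data are made up for illustration, not the setup behind the numbers above.

```python
import gzip
import lzma
import time


def compare(data: bytes) -> dict:
    """Compress `data` with gzip and LZMA; report size ratio and wall time."""
    results = {}
    for name, compress in (("gzip", gzip.compress), ("lzma", lzma.compress)):
        start = time.perf_counter()
        packed = compress(data)  # default compression level for each module
        elapsed = time.perf_counter() - start
        results[name] = {
            "ratio": len(packed) / len(data),  # compressed size / original size
            "seconds": elapsed,
        }
    return results


# Hypothetical mbox-like sample; real archives will behave differently.
sample = b"From: a@example.com\nSubject: hello\n\nSome body text.\n" * 20000
for name, r in compare(sample).items():
    print(f"{name}: {r['ratio']:.1%} of original, {r['seconds']:.3f}s")
```

Results depend heavily on the data and the compression level, so treat any single run as a data point and not a verdict.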
gmiller123456•53m ago
Probably better called "Taking a Look at Compression Utilities". There isn't really any information on the algorithms other than their high-level names and a short description of each.
ghusbands•36m ago
It has DEFLATE code, Snappy code, LZ4 code, ZSTD exploration, and describes many involved sub-algorithms, with diagrams - what more were you wanting?
georgemcbay•28m ago
For anyone who already has at least a surface-level understanding of compression and wants to take a deeper dive, check out Charles Bloom's blog:
Unfortunately it has been dormant for some time, but there are years' worth of useful information there, and he is an uncommonly good presenter of technical knowledge through the written word.
jmagland•7m ago
I've found that Asymmetric Numeral Systems (you mentioned it briefly) is the optimal practical method for pure entropy encoding. I just posted this https://news.ycombinator.com/item?id=47806122
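To give a feel for how the range variant of ANS (rANS) works, here is a toy sketch. It keeps the whole message in one arbitrary-precision integer and skips the renormalization step that real implementations use to stream bytes out, so it shows the core state update only and is not a practical codec. The function names (`build_table`, `encode`, `decode`) are mine, not from any library.

```python
from collections import Counter


def build_table(data: bytes):
    """Symbol frequencies, cumulative starts, and the total count."""
    freqs = Counter(data)
    cum, total = {}, 0
    for sym in sorted(freqs):
        cum[sym] = total  # start of this symbol's slot range
        total += freqs[sym]
    return freqs, cum, total


def encode(data: bytes, freqs, cum, total) -> int:
    """Fold all symbols into one integer state (toy rANS, no renormalization)."""
    x = 1
    # Encode in reverse so the decoder emits symbols in forward order.
    for sym in reversed(data):
        f = freqs[sym]
        x = (x // f) * total + cum[sym] + (x % f)
    return x


def decode(x: int, n: int, freqs, cum, total) -> bytes:
    """Pop n symbols back out of the integer state."""
    out = []
    for _ in range(n):
        slot = x % total
        # Linear search for the symbol whose slot range contains `slot`.
        sym = next(s for s in freqs if cum[s] <= slot < cum[s] + freqs[s])
        out.append(sym)
        x = freqs[sym] * (x // total) + slot - cum[sym]
    return bytes(out)


msg = b"abracadabra"
freqs, cum, total = build_table(msg)
state = encode(msg, freqs, cum, total)
assert decode(state, len(msg), freqs, cum, total) == msg
print(state.bit_length(), "bits of state for", 8 * len(msg), "bits of input")
```

Each encode step grows the state by roughly a factor of `total / freqs[sym]`, which is why frequent symbols cost fewer bits. The frequency table still has to be sent separately, which this sketch ignores.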
jgalt212•1h ago