Regarding 'Me or Claude': The core concept (applying bioinformatics edit-distance/alignment to compression rather than just exact prefix matching) is something I worked on back in 2013. The implementation in this repo was heavily assisted by Claude, yes.
You're right that DEFLATE and modern algos (Zstd, Brotli) are the production standard. This project isn't trying to replace Zstd tomorrow; it's a research prototype testing the hypothesis that fuzzy matching + edit scripts can squeeze out entropy that exact-match dictionaries miss. The 8-10x slowdown means it's definitely experimental, but as a starting point for further exploration? That's what I want.
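To make the hypothesis concrete, here's a toy sketch of the core move (illustrative only, not the repo's code; `difflib` opcodes stand in for a real alignment, and opcode count stands in for coded script size):

```python
# Toy sketch: instead of requiring an exact dictionary hit (as LZW does),
# pick the *nearest* dictionary entry and emit (code, edit_script).
# All names here are illustrative, not from the LZW-X repo.
from difflib import SequenceMatcher

def edit_script(src: str, dst: str):
    """Non-equal opcodes that turn src into dst: the 'patch' sent with the code."""
    return [op for op in SequenceMatcher(None, src, dst).get_opcodes()
            if op[0] != "equal"]

def encode_token(token: str, dictionary: list):
    # Exact matching fails unless token is literally in the dictionary;
    # fuzzy matching picks the closest entry and patches the difference.
    # A real coder would use Levenshtein distance and cost the script in bits.
    best = min(range(len(dictionary)),
               key=lambda i: len(edit_script(dictionary[i], token)))
    return best, edit_script(dictionary[best], token)

dictionary = ["compression", "comparison"]
code, script = encode_token("compressing", dictionary)
print(code, script)  # a dictionary code plus a short patch, not raw literals
```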
As an overall note: when you prompt an AI with "apply concept X in Y" (or anything, really), it will tell you what a great idea it is and then output something that, without domain knowledge, you have no way of judging: whether it's correct, or whether it even makes sense at all. If you don't want to do a literature review, I'd recommend at least throwing the design back to the machine and asking for critique.
Here's what actually happened: the path to get here was about as far from a 'one-shot' as you can get. The first iteration (Basic LZW + unbounded edit scripts + Huffman) was roughly 100x slower. I spent hours guiding the implementation through specific optimization attempts:
- BK-trees for lookups (eventually discarded as slow; a toy version is sketched after this list).
- Then switching to arithmetic coding: first one model for both codes and scripts, later splitting them.
- Various strategies for pruning/resetting unbounded dictionaries.
- Finally landing on a fixed dict size with a Gray-Code-style nearest neighbor search to cap the exploration.
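For anyone curious, the BK-tree attempt (first bullet above) was in the spirit of the sketch below: index dictionary entries by edit distance and use the triangle inequality to prune the lookup. Illustrative code, not the repo's; in practice it was still too slow:

```python
# Toy BK-tree: a metric tree over edit distance. Children are keyed by their
# distance to the parent, so a query within tolerance `tol` only needs to
# descend into children whose key lies in [d - tol, d + tol].

def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

class BKTree:
    def __init__(self, word: str):
        self.word, self.children = word, {}  # children keyed by distance

    def add(self, word: str):
        node = self
        while True:
            d = levenshtein(word, node.word)
            if d in node.children:
                node = node.children[d]
            else:
                node.children[d] = BKTree(word)
                return

    def search(self, query: str, tol: int):
        """All entries within edit distance tol of query."""
        out, stack = [], [self]
        while stack:
            node = stack.pop()
            d = levenshtein(query, node.word)
            if d <= tol:
                out.append((d, node.word))
            # Triangle inequality: only children in [d - tol, d + tol] can match.
            stack.extend(child for dist, child in node.children.items()
                         if d - tol <= dist <= d + tol)
        return out

tree = BKTree("string")
for w in ("strong", "sting", "sprint"):
    tree.add(w)
print(tree.search("stringy", tol=2))
```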
The AI suggested some tactical fixes (like capping the Levenshtein table, or splitting edits/codes in arithmetic coding), but the architectural pivots came from me. I had to find the winning path. I stopped when the speed hit 'sit-there-and-watch-it-able' (approx 15s for 2MB) and the ratio consistently beat LZW (interestingly, at smaller dictionary sizes, which makes sense: the edit scripts make each word more expressive).
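For concreteness, "capping the Levenshtein table" means something like the banded computation below: only fill cells within `max_d` of the diagonal and reject a candidate as soon as the cap is blown. Again, an illustrative sketch rather than the repo's code:

```python
# Banded Levenshtein with a cap: fills O(len(a) * max_d) cells instead of the
# full table, and exits early once no path can come in under max_d.

def bounded_levenshtein(a: str, b: str, max_d: int):
    """Edit distance if it is <= max_d, else None (candidate rejected)."""
    if abs(len(a) - len(b)) > max_d:
        return None                      # length gap alone exceeds the cap
    INF = max_d + 1                      # sentinel for cells outside the band
    prev = [j if j <= max_d else INF for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        lo, hi = max(1, i - max_d), min(len(b), i + max_d)   # the band
        cur = [INF] * (len(b) + 1)
        cur[0] = i if i <= max_d else INF
        for j in range(lo, hi + 1):
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1,
                         prev[j - 1] + (a[i - 1] != b[j - 1]))
        if min(cur) > max_d:             # early exit: cap already blown
            return None
        prev = cur
    return prev[-1] if prev[-1] <= max_d else None

print(bounded_levenshtein("kitten", "sitting", 3))  # 3
print(bounded_levenshtein("kitten", "sitting", 2))  # None: rejected early
```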
That was my bar: Is it real? Does it work? Can it beat LZW? Once it did, I shared it. I was focused on benchmark accuracy, not the marketing copy. I let the AI write the hype README; I didn't really think it mattered.
In 2013, I was studying bioinformatics and had an idea: apply something like sequence alignment and edit scripts to compression, instead of only appending to the end of the string as LZW does. So the idea for LZW-X was born long ago, but it wasn't until recently, with the help of AI, that I could implement and test it properly.
This is that proper implementation, and it bears out what I intuited: there are gains to be had with a method like this. I consider it a first rung, a starting point for further exploration.
Check it out: https://github.com/BrowserBox/LZW-X