How FSP Differs from Traditional Compression:
Unlike LZ77/LZ78 (LZ77 underlies DEFLATE, used in ZIP), LZMA, Huffman coding, or other common algorithms, FSP doesn't rely on:
· Sliding window dictionary approaches
· Frequency statistics or entropy coding
· Block-based compression with fixed headers
· Move-to-front transformations or Burrows-Wheeler transforms
Instead, FSP uses a pattern similarity approach that:
1. Identifies similar patterns across data chunks
2. Stores only differential changes between patterns
3. Uses position-based editing rather than token replacement
4. Has virtually no overhead for highly similar data
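To make that concrete, here is a minimal sketch of what "store only positional edits against a similar reference chunk" could look like. This is illustrative Python only, not the actual FSP implementation (see the repo for that); the names encode_edits/apply_edits and the simplification of roughly equal-length chunks are my own assumptions.

```python
# Illustrative sketch only -- not the real FSP code. It shows the general idea of
# representing a chunk as (position, replacement) edits against a similar reference.

def encode_edits(reference: bytes, chunk: bytes) -> list[tuple[int, bytes]]:
    """Record each contiguous run where `chunk` differs from `reference`.
    Assumes chunks of comparable length; a real codec must also handle
    insertions/deletions and choosing the reference."""
    edits = []
    i = 0
    while i < len(chunk):
        if i >= len(reference) or chunk[i] != reference[i]:
            start = i
            while i < len(chunk) and (i >= len(reference) or chunk[i] != reference[i]):
                i += 1
            edits.append((start, chunk[start:i]))  # one differing run
        else:
            i += 1
    return edits

def apply_edits(reference: bytes, edits: list[tuple[int, bytes]], length: int) -> bytes:
    """Reconstruct the chunk by patching the reference at the recorded positions."""
    out = bytearray(reference[:length].ljust(length, b"\x00"))
    for pos, data in edits:
        out[pos:pos + len(data)] = data
    return bytes(out)

ref   = b"2024-01-01 12:00:01 INFO sensor=42 value=19.7"
chunk = b"2024-01-01 12:00:02 INFO sensor=42 value=20.1"
edits = encode_edits(ref, chunk)
print(edits)  # [(18, b'2'), (41, b'20'), (44, b'1')] -- only a few bytes to store
assert apply_edits(ref, edits, len(chunk)) == chunk
```

The point of the sketch: for highly similar chunks, the stored edits (plus a reference identifier) are far smaller than the chunk itself, and there is no dictionary or frequency table to carry along.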
The Unique Advantages:
What makes FSP novel isn't just that it handles small files well – it's that it represents a different philosophical approach to compression:
1. No Dictionary Overhead: Unlike LZ variants, which build and reference a dictionary, FSP stores only differences
2. Position-Aware Editing: Rather than replacing tokens, FSP uses exact positional editing
3. Similarity-Based: Excels where data has structural similarity rather than just byte-level repetition
4. Stream-Compatible: Processes data in logical chunks rather than fixed blocks
Technical Differentiators:
· Unlike LZ77: No sliding window or distance-length encoding
· Unlike Huffman/Arithmetic: No frequency tables or probability models
· Unlike RLE: Handles non-sequential patterns and similarities
· Unlike BWT: Doesn't require full data transformation and rearrangement
Where FSP Excels:
The algorithm particularly shines in:
· Versioned data: Where consecutive versions have minor changes
· Structured records: Database entries with similar schema but different values
· Sensor data: Regular readings with small fluctuations
· Log files: Similar log entries with varying parameters
· Genomic data: Sequences with localized variations
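A quick way to see why these workloads suit a similarity-based approach is to measure how little actually changes between consecutive records. The snippet below uses Python's stdlib difflib purely for illustration; it is not FSP's own similarity detection, and the sample log lines are made up.

```python
# How much of a "new" log line is actually new? (stdlib difflib, illustration only)
from difflib import SequenceMatcher

prev = "GET /api/v1/users/1041 200 12ms"
curr = "GET /api/v1/users/1042 200 14ms"

sm = SequenceMatcher(None, prev, curr)
changed = sum(j2 - j1 for op, i1, i2, j1, j2 in sm.get_opcodes() if op != "equal")
print(f"{changed} of {len(curr)} characters differ")  # only a couple of characters
```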
Performance Characteristics:
In testing, FSP does what traditional compressors typically cannot: it compresses very small data chunks consistently, without the fixed overhead that usually dominates small-file compression. Where ZIP might add 50+ bytes of container overhead, FSP adds virtually none for similar patterns.
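You can see that fixed overhead for yourself with the stdlib gzip/zlib modules standing in for "traditional" compressors (this is not a benchmark of FSP, just a demonstration of the small-payload problem; the sample payload is invented):

```python
# Small payloads routinely grow under general-purpose compressors
# because of headers, checksums, and block framing.
import gzip, zlib

payload = b"temperature=21.4;humidity=40;status=OK"
print(len(payload))                 # payload size in bytes
print(len(gzip.compress(payload)))  # noticeably larger than the input
print(len(zlib.compress(payload)))  # smaller framing, but still no net win here
```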
Open Questions for Discussion:
I'm particularly interested in the HN community's thoughts on:
1. Theoretical classification of this approach within compression taxonomy
2. Potential hybrid approaches combining FSP with traditional methods
3. Mathematical analysis of the similarity detection problem
4. Applications in distributed systems where small payload compression matters
5. Comparisons to other non-traditional compression approaches
The algorithm is open-source under LGPL 3.0, and I welcome both theoretical and practical contributions. Sometimes innovation comes not from better implementations of existing approaches, but from fundamentally different ways of thinking about problems.
GitHub: https://github.com/Ferki-git-creator/fsp
Website (more info): https://ferki-git-creator.github.io/fsp/