I've just open-sourced NFR (Neural Fractal Reconstruction). Unlike GZIP/LZ-style compressors, which rely on dictionary matching and static frequency tables, NFR treats the data as a predictable sequence and models it directly.
The Core Idea: a PyTorch-based LSTM predicts P(x_t | context) for each symbol. The better the model learns (or deliberately "overfits" to) the specific grammar of the file, the closer its probability for the true next symbol gets to 1, and the arithmetic coder then spends close to 0 bits on that symbol.
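To make the bit-count claim concrete, here is a minimal, illustrative calculation (my own, not from the NFR repo): an ideal arithmetic coder spends about -log2(p) bits on a symbol the model predicted with probability p, so confident predictions drive the per-symbol cost toward zero.

```python
import math

# Ideal arithmetic-coding cost: about -log2(p) bits for a symbol
# the model predicted with probability p (illustrative values only).
for p in (0.5, 0.9, 0.99, 0.999):
    print(f"P(correct symbol) = {p:<6} -> {-math.log2(p):.4f} bits/symbol")

# P(correct symbol) = 0.5    -> 1.0000 bits/symbol
# P(correct symbol) = 0.9    -> 0.1520 bits/symbol
# P(correct symbol) = 0.99   -> 0.0145 bits/symbol
# P(correct symbol) = 0.999  -> 0.0014 bits/symbol
```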
Specs:
Engine: LSTM (Long Short-Term Memory).
Coding: Custom 32-bit Integer Arithmetic Kernel (no floating-point drift; see the generic range-coder sketch further below).
Approach: Zero-Shot Adaptation (the model learns the file as it compresses it; see the per-symbol loop sketched right after this list).
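For anyone wondering what "learns the file during compression" looks like in practice, here is a minimal sketch of that kind of adaptive loop. It is my own illustration under stated assumptions, not NFR's code: the model sizes are arbitrary, and `coder` stands in for a hypothetical arithmetic-coder interface (the repo's DaemonArithmeticCoder may look quite different).

```python
import torch
import torch.nn as nn

class ByteLSTM(nn.Module):
    """Tiny next-byte predictor. Sizes are illustrative, not NFR's."""
    def __init__(self, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(256, 64)
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 256)

    def forward(self, byte, state=None):
        x = self.embed(byte.view(1, 1))            # (batch=1, seq=1, embed=64)
        out, state = self.lstm(x, state)
        return self.head(out[:, -1]), state        # logits over the 256 byte values


def compress(data: bytes, coder):
    """Adaptive ("zero-shot") loop: predict the next byte, encode it with the
    current distribution, then train on it. `coder` is a hypothetical object
    with an encode(symbol, probs) method standing in for the real coder."""
    model = ByteLSTM()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    prev, state = torch.tensor([0]), None          # dummy start-of-stream byte
    for b in data:
        logits, state = model(prev, state)
        probs = torch.softmax(logits, dim=-1)[0].detach()
        coder.encode(b, probs)                     # costs ~ -log2(probs[b]) bits
        loss = loss_fn(logits, torch.tensor([b]))  # now learn from this byte
        opt.zero_grad()
        loss.backward()
        opt.step()
        state = tuple(s.detach() for s in state)   # truncate BPTT to one step
        prev = torch.tensor([b])
```

Decompression runs the same loop with identical initial weights and update order, so the decoder reproduces the exact same probability stream and can invert the encoding.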
It’s a work in progress, but the goal is to push toward the Kolmogorov limit on data where traditional statistical compressors fall short.
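On the "no floating-point drift" point: the idea is that the coder's interval state lives in fixed-width integers and is renormalized by emitting bits, so encoder and decoder stay bit-exact across platforms. Below is a generic, textbook-style 32-bit integer encoder to illustrate the mechanism; it is not the DaemonArithmeticCoder from the repo, and it omits the final flush for brevity.

```python
class IntegerArithmeticEncoder:
    """Generic 32-bit integer arithmetic encoder (textbook style), shown only
    to illustrate why integer state avoids floating-point drift. This is NOT
    the DaemonArithmeticCoder from the NFR repo."""
    TOP = (1 << 32) - 1
    HALF = 1 << 31
    QUARTER = 1 << 30

    def __init__(self):
        self.low, self.high = 0, self.TOP
        self.pending = 0          # underflow bits waiting to be emitted
        self.bits = []

    def _emit(self, bit):
        self.bits.append(bit)
        self.bits.extend([1 - bit] * self.pending)   # flush delayed opposite bits
        self.pending = 0

    def encode(self, cum_lo, cum_hi, total):
        """Narrow the interval to the symbol whose cumulative-count range is
        [cum_lo, cum_hi) out of `total` (all integers, so no rounding drift)."""
        span = self.high - self.low + 1
        self.high = self.low + span * cum_hi // total - 1
        self.low = self.low + span * cum_lo // total
        while True:
            if self.high < self.HALF:                # both in lower half -> 0
                self._emit(0)
            elif self.low >= self.HALF:              # both in upper half -> 1
                self._emit(1)
                self.low -= self.HALF
                self.high -= self.HALF
            elif self.low >= self.QUARTER and self.high < 3 * self.QUARTER:
                self.pending += 1                    # straddling the middle
                self.low -= self.QUARTER
                self.high -= self.QUARTER
            else:
                break
            self.low = 2 * self.low                  # renormalize: shift in a bit
            self.high = 2 * self.high + 1
```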
Check the architecture here:
https://github.com/AFKmoney/NFR-Compressor
Curious to hear your thoughts on the DaemonArithmeticCoder implementation!