Near-lossless delta compression for fine-tuned neural network models.
Compressed a Qwen-2.5-0.5b wikitext fine-tune by 3.2x.
Instead of storing 50 fine-tunes of the same base model, store one base and 50 small .wdelta delta files. deltatensors compresses the delta between a base and fine-tuned model, and reconstructs with sub-1% perplexity difference.
AaravGaur•2h ago
Compressed a Qwen-2.5-0.5b wikitext fine-tune by 3.2x.
Instead of storing 50 fine-tunes of the same base model, store one base and 50 small .wdelta delta files. deltatensors compresses the delta between a base and fine-tuned model, and reconstructs with sub-1% perplexity difference.
docs: https://deltatensors.readthedocs.io/en/latest/