The problem: Python data pipelines become the bottleneck when fine-tuning large models. Your GPUs sit idle waiting for data.
The solution: Drop-in Rust acceleration. One import line, zero config changes.
Results on 50k rows (accelerated vs. pure-Python baseline):
- Streaming data loading: 0.009s vs 0.724s (77x faster)
- Parallel SHA256 hashing: 0.027s vs 0.052s (1.9x faster)
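For context, the pure-Python side of the hashing comparison can be sketched roughly like this. The data and worker count here are illustrative stand-ins, not the actual benchmark harness:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Stand-in rows; the real benchmark hashes 50k dataset rows
rows = [f"row-{i}".encode() for i in range(50_000)]

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Sequential baseline
seq = [sha256_hex(r) for r in rows]

# Threaded version: hashlib releases the GIL while hashing,
# so threads can overlap work on sizable buffers
with ThreadPoolExecutor(max_workers=8) as pool:
    par = list(pool.map(sha256_hex, rows))

assert seq == par
```

Worth noting: on tiny rows like these, thread overhead can eat the gains; the GIL release only pays off once buffers get large, which is part of why a native Rust path wins.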
Works with Parquet, Arrow, JSON, JSONL, CSV. Supports compression. Cross-platform.
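As a point of reference, the kind of pure-Python streaming loop the Rust path replaces for JSONL (including gzip-compressed files) looks like this. The file path and helper name are illustrative, not part of the fast_axolotl API:

```python
import json
import gzip
import tempfile
from pathlib import Path

def stream_jsonl(path):
    """Yield one parsed record at a time; never loads the whole file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

# Demo on a throwaway file
tmp = Path(tempfile.mkdtemp()) / "sample.jsonl"
tmp.write_text('{"text": "hello"}\n{"text": "world"}\n')
records = list(stream_jsonl(tmp))
```

Per-line `json.loads` is exactly where the interpreter overhead piles up at scale, which is what the native loader sidesteps.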
Usage:
pip install fast-axolotl

import fast_axolotl
import axolotl  # now accelerated
Built with PyO3 and maturin. MIT licensed. Happy to answer questions about the Rust/Python interop or benchmark methodology.