I built a tool that generates deterministic SFT + DPO datasets for tool-calling LoRA fine-tuning (no LLM needed)
I was tired of hand-writing JSONL for my Qwen fine-tunes, so I built DataForge. It's a Python framework that generates structured training data from tool schemas — completely deterministic, no API calls needed.
What it does:
- You define tool schemas (JSON) + data pools → it generates SFT conversations with tool calls
- DPO preference pairs from contrastive ranking
- Anti-template-explosion detection (Bloom filter + trigram analysis)
- Quality gates (configurable thresholds, not vibes)
- Streaming generation with constant RAM — tested up to 100K examples
- Output: OpenAI/ShareGPT/ChatML format, ready for trl or axolotl
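The anti-template-explosion idea can be sketched in a few lines: fingerprint each candidate with character trigrams, track seen trigrams in a Bloom filter, and reject candidates whose overlap with already-emitted text is too high. This is a minimal illustration of the technique, not DataForge's actual implementation — names, bit sizes, and the 0.8 threshold are all assumptions:

```python
import hashlib

def trigrams(text: str) -> set[str]:
    """Character-level trigrams as a cheap similarity fingerprint."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}

class BloomFilter:
    """Tiny Bloom filter (m bits, k hashes) stored in one big int."""
    def __init__(self, m: int = 1 << 16, k: int = 4):
        self.m, self.k, self.bits = m, k, 0

    def _hashes(self, item: str):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str):
        for h in self._hashes(item):
            self.bits |= 1 << h

    def __contains__(self, item: str):
        return all(self.bits >> h & 1 for h in self._hashes(item))

def is_template_explosion(text: str, seen: BloomFilter,
                          threshold: float = 0.8) -> bool:
    """Flag a candidate whose trigram overlap with prior output is too high."""
    grams = trigrams(text)
    if not grams:
        return True
    overlap = sum(g in seen for g in grams) / len(grams)
    for g in grams:
        seen.add(g)
    return overlap >= threshold
```

The Bloom filter keeps memory constant no matter how many examples stream past, at the cost of a small false-positive rate — which only means occasionally rejecting a genuinely novel example.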
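"Configurable thresholds, not vibes" amounts to a predicate over each generated example. A minimal sketch of such a gate, assuming an OpenAI-style `messages` structure — the field names and default thresholds here are illustrative, not DataForge's real config keys:

```python
from dataclasses import dataclass

@dataclass
class QualityGates:
    """Illustrative thresholds; tune per dataset, not DataForge's schema."""
    min_turns: int = 2
    min_chars: int = 40
    max_chars: int = 4000
    require_tool_call: bool = True

def passes_gates(example: dict, gates: QualityGates) -> bool:
    """Return True only if the example clears every configured threshold."""
    msgs = example.get("messages", [])
    text = " ".join(m.get("content") or "" for m in msgs)
    has_tool_call = any(m.get("tool_calls") for m in msgs)
    return (
        len(msgs) >= gates.min_turns
        and gates.min_chars <= len(text) <= gates.max_chars
        and (has_tool_call or not gates.require_tool_call)
    )
```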
Two working examples included (restaurant assistant, customer support) — ~600 SFT + 60 DPO each, runnable out of the box.
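For a sense of the target shapes: the SFT side follows the OpenAI chat tool-calling convention, and the DPO side follows trl's prompt/chosen/rejected layout. The concrete values below (tool name, arguments) are made up for illustration — only the field structure reflects those established formats:

```python
import json

# Illustrative SFT example in OpenAI chat format with a tool call.
sft_example = {
    "messages": [
        {"role": "user", "content": "Book a table for 2 at 7pm"},
        {"role": "assistant", "content": None, "tool_calls": [{
            "id": "call_1", "type": "function",
            "function": {"name": "reserve_table",
                         "arguments": json.dumps({"party_size": 2, "time": "19:00"})},
        }]},
        {"role": "tool", "tool_call_id": "call_1",
         "content": '{"status": "confirmed"}'},
        {"role": "assistant", "content": "Your table for 2 at 7pm is confirmed."},
    ]
}

# Illustrative DPO pair in trl's preference format.
dpo_pair = {
    "prompt": "Book a table for 2 at 7pm",
    "chosen": "(response that calls reserve_table with valid arguments)",
    "rejected": "(response that claims a booking without calling any tool)",
}
```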
pip install -e . → dataforge generate --config config.yaml → dataset ready.
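A rough sense of what such a `config.yaml` might contain — every key name below is a guess for illustration, not DataForge's actual schema; check the repo's examples for the real one:

```yaml
# Hypothetical config sketch — key names are illustrative only.
tools: schemas/restaurant_tools.json   # tool schema definitions
pools: pools/restaurant.yaml           # data pools to sample from
sft:
  count: 600
dpo:
  count: 60
output:
  format: chatml                       # or openai / sharegpt
```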
Repo: https://github.com/adoslabsproject-gif/dataforge
https://nothumanallowed.com/datasets