We benchmarked small language models to see which are the most tunable and which deliver the best performance after fine-tuning. We tested 12 models (from the Qwen, Llama, Gemma, Granite, and SmolLM families) on 8 tasks.
maciejgryka•59m ago
TL;DR: the Qwen3 family is the best overall choice, and the small Llamas improve the most after fine-tuning.