Seems topical given some recent front-page HN articles on fine-tuning. I discuss a large-scale empirical study from 2024 of fine-tuning 7B models to outperform GPT-4 and GPT-3.5-Turbo, as well as arguments for why fine-tuning is coming back into favor.
aininja•1h ago
For most enterprise use cases it is indeed all they need. Why use a slow, expensive, and inaccurate sledgehammer to drive your very specific small nail?
It just doesn't make sense most of the time to use slow, expensive, generic black-box models that are not optimized for the specific task.
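For concreteness, here is a minimal sketch of what that kind of task-specific fine-tuning can look like, assuming the Hugging Face transformers and peft libraries; the base model name and LoRA hyperparameters below are illustrative choices, not from the article or thread.

```python
# A minimal sketch of task-specific fine-tuning with LoRA.
# Assumes: pip install transformers peft
# The model name and hyperparameters are illustrative, not prescribed.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-v0.1"  # hypothetical 7B base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small low-rank adapters instead of all 7B weights,
# which is what makes per-task fine-tuning cheap to run and store.
config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of parameters
```

From here you train on your task-specific dataset as usual; only the adapter weights update, so you can keep one adapter per task on top of a single shared base model.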
stefanwebb•2h ago