The Case for the Return of Fine-Tuning

https://welovesota.com/article/the-case-for-the-return-of-fine-tuning

35•nanark•3h ago

Comments

oli5679•1h ago

The OpenAI fine-tuning api is pretty good - you need to label an evaluation benchmark anyway to systematically iterate on prompts and context, and it’s often creates good results if you give it a 50-100 examples, either beating frontier models or allowing a far cheaper and faster model to catch up.

It requires no local gpus, just creating a json and posting to OpenAI

https://platform.openai.com/docs/guides/model-optimization

deaux•43m ago

They don't offer it for GPT-5 series, as a result much of the time fine-tuning Gemini 2.5-Flash is a better deal.

melpomene•1h ago

This website loads at impressive speeds (from Europe)! Rarely seen anything more snappy. Dynamic loading of content as you scroll, small compressed images without looking like it (webp). Well crafted!

hshdhdhehd•1h ago

Magic of a CDN? Plus avoiding JS probably. Haven't checked source though.

CuriouslyC•1h ago

Fine tuning by pretraining over a RL tuned model is dumb AF. RL task tuning works quite well.

HarHarVeryFunny•48m ago

You may have no choice in how the model you are fine tuning was trained, and may have no interest in verticals it was RL tuned for.

In any case, platforms like tinker.ai support both SFT and RL.

CuriouslyC•17m ago

Why would you choose a model where the trained in priors don't match your use case? Also, keep in mind that RL'd in behavior includes things like reasoning and how to answer questions correctly, so you're literally taking smart models and making them dumber by doing SFT. To top it off, SFT only produces really good results when you have traces that closely model the actual behavior you're trying to get the model to display. If you're just trying to fine tune in a knowledge base, a well tuned RAG setup + better prompts win every time.

imcritic•12m ago

Because you need a solution for your problem and the available tools are what they are and nothing else and you don't have enough resources to train your own model.

empiko•52m ago

Fine-tuning is a good technique to have in a toolbox, but in reality, it is feasible only in some use cases. On one hand, many NLP tasks are already easy enough for LLMs to have near perfect accuracy and fine tuning is not needed. On the other hand, really complex tasks are really difficult to fine-tune and clevem data collection might be pretty expensive. Fine-tuning can help with the use cases somewhere in the middle, not too simple, not too complex, feasible for data collection, etc.

meander_water•32m ago

A couple of examples I have seen recently which makes me agree with OP:

- PaddleOCR, a 0.9B model that reaches SOTA accuracy across text, tables, formulas, charts & handwriting. [0]

- A 3B and 8B model which performs HTML to json extraction at GPT-5 level accuracy at 40-80x less cost, and faster inference. [1]

I think it makes sense to fine tune when you're optimizing for a specific task.

[0] https://huggingface.co/papers/2510.14528

[1] https://www.reddit.com/r/LocalLLaMA/comments/1o8m0ti/we_buil...

Show HN: Duck-UI – Browser-Based SQL IDE for DuckDB

What Happened in 2007?

The Case for the Return of Fine-Tuning

EQ: A video about all forms of equalizers

OpenAI researcher announced GPT-5 math breakthrough that never happened

Jupyter Collaboration has a history slider

Titan submersible’s $62 SanDisk memory card found undamaged at wreckage site

Root System Drawings

The Accountability Problem

Pebble is officially back on iOS and Android

Chen-Ning Yang, Nobel laureate, dies at 103

How to sequence your DNA for <$2k

When you opened a screen shot of a video in Paint, the video was playing in it

How does Turbo listen for Turbo Streams

How one of the longest dinosaur trackways in the world was uncovered in the UK

Why the open social web matters now

Flowistry: An IDE plugin for Rust that focuses on relevant code

Uber will offer gig work like AI data labeling to drivers while not on the road

./watch

K8s with 1M nodes

Secret diplomatic message deciphered after 350 years

Tinnitus Neuromodulator

The optimistic case for protein foundation model companies

IDEs we had 30 years ago and lost (2023)

GoGoGrandparent (YC S16) Is Hiring Back End and Full-Stack Engineers

Coral NPU: A full-stack platform for Edge AI

Adding Breadcrumbs to a Rails Application

Lego Theft Ring

BQN "Macros" with •Decompose (2023)

Space junk falls on Western Australian minesite

The Case for the Return of Fine-Tuning

Comments

Show HN: Duck-UI – Browser-Based SQL IDE for DuckDB

What Happened in 2007?

The Case for the Return of Fine-Tuning

EQ: A video about all forms of equalizers

OpenAI researcher announced GPT-5 math breakthrough that never happened

Jupyter Collaboration has a history slider

Titan submersible’s $62 SanDisk memory card found undamaged at wreckage site

Root System Drawings

The Accountability Problem

Pebble is officially back on iOS and Android

Chen-Ning Yang, Nobel laureate, dies at 103

How to sequence your DNA for <$2k

When you opened a screen shot of a video in Paint, the video was playing in it

How does Turbo listen for Turbo Streams

How one of the longest dinosaur trackways in the world was uncovered in the UK

Why the open social web matters now

Flowistry: An IDE plugin for Rust that focuses on relevant code

Uber will offer gig work like AI data labeling to drivers while not on the road

./watch

K8s with 1M nodes

Secret diplomatic message deciphered after 350 years

Tinnitus Neuromodulator

The optimistic case for protein foundation model companies

IDEs we had 30 years ago and lost (2023)

GoGoGrandparent (YC S16) Is Hiring Back End and Full-Stack Engineers

Coral NPU: A full-stack platform for Edge AI

Adding Breadcrumbs to a Rails Application

Lego Theft Ring

BQN "Macros" with •Decompose (2023)

Space junk falls on Western Australian minesite