
Qwen3.5 Fine-Tuning Guide – Unsloth Documentation

https://unsloth.ai/docs/models/qwen3.5/fine-tune
101•bilsbie•4h ago

Comments

clueless•2h ago
What are some sample real world cases folks are using to fine tune their own small/medium models?
danielhanchen•1h ago
Oh I wrote up a post on X on this exact question! https://x.com/danielhanchen/status/1979389893165060345?s=20

1. Cursor used online RL to get +28% approval rate: https://cursor.com/blog/tab-rl

2. Vercel used RFT for their AutoFix model for V0: https://vercel.com/blog/v0-composite-model-family

3. Perplexity's Sonar for Deep Research Reasoning I think was a finetuned model: https://docs.perplexity.ai/docs/getting-started/overview

4. Doordash uses LoRA, QLoRA for a "Generalized Attribute Extraction model" https://careersatdoordash.com/blog/unleashing-the-power-of-l...

5. NASA flood water detection https://earthdata.nasa.gov/news/nasa-ibm-openly-release-geospatial-ai-foundation-model-nasa-earth-observation-data

6. Online RL for robotics - imagine you teaching a robot in the future via some mini finetuning

7. OpenAI's RFT page has more: https://developers.openai.com/api/docs/guides/rft-use-cases

8. For larger models - https://www.mercor.com/blog/expert-data-drives-model-perform...

azath92•1h ago
Only to prompt thought on this exact question (I'm interested in answers):

I just ran a benchmark against Haiku on a very simple document classification task that we currently farm out to Haiku in parallel. Very naive: same prompt, same system, via the same AWS Bedrock API. A few of the 4B-class models are a pretty good match and could easily be run locally, or cheaply via a hosted provider. "How much data and how much improvement" is the question I don't have a good intuition for anymore; I don't even have an order-of-magnitude guess on those two axes.

Here's raw numbers to spark discussion:

| Model          | DocType% | Year% | Subject% | In $/MTok |
|----------------|----------|-------|----------|-----------|
| llama-70b      | 83       | 98    | 96       | $0.72     |
| gpt-oss-20b    | 83       | 97    | 92       | $0.07     |
| ministral-14b  | 84       | 100   | 90       | $0.20     |
| gemma-4b       | 75       | 93    | 91       | $0.04     |
| glm-flash-30b  | 83       | 93    | 90       | $0.07     |
| llama-1b       | 47       | 90    | 58       | $0.10     |

Percents are doc type (categorical), year, and subject name match against Haiku; it just uses the first 4 pages.

In the old world where these were my own in-house models, I'd be interested in seeing if I could lift those numbers with training, but I haven't done that with the new LLMs in a while. Keen to get even a finger in the air if possible.

Can easily generate tens of thousands of examples.

Might try myself, but always keen for an opinion.

_edit for table formatting_
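For reference, the agreement percentages in a table like the one above are just field-level match rates against the Haiku output; a minimal sketch (the field names and data below are made up):

```python
# Per-field agreement of a candidate model's extractions against a
# reference model's output. Field names and toy data are hypothetical.

def agreement(pred: list[dict], ref: list[dict], field: str) -> float:
    """Percent of documents where the candidate matches the reference field."""
    hits = sum(1 for p, r in zip(pred, ref) if p[field] == r[field])
    return round(100 * hits / len(ref), 1)

haiku = [{"doctype": "invoice", "year": 1999}, {"doctype": "memo", "year": 2003}]
small = [{"doctype": "invoice", "year": 1999}, {"doctype": "letter", "year": 2003}]

print(agreement(small, haiku, "doctype"))  # 50.0
print(agreement(small, haiku, "year"))     # 100.0
```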

airstrike•19m ago
if you add 2 spaces at the start of the line, you turn it into a code block

  like this
syntaxing•1h ago
Awesome guide, shame how a couple of the Qwen leads got kicked out and replaced with more “business” minded leadership. Hopefully this doesn’t mean the end of the open source era from Qwen.
danielhanchen•1h ago
Oh, I saw on X a little while ago: https://x.com/poezhao0605/status/2029151951167078454 - Alibaba's CEO and CTO are having an emergency all-hands now! Hope it all goes well!
antirez•1h ago
Fine tuning is a story that is nice to tell, but with modern LLMs it makes less and less sense. Modern LLMs are so powerful that they can few-shot learn complicated things, so a strong prompt and augmenting the generation (given the massive context window of Qwen3.5, too) is usually the best option available. There are models for which fine tuning is great, like image models: there, with LoRA, you can get good results in many ways. And LLMs of the past, too: it made sense for certain use cases. But now, why? LLMs are already released after seeing (after pre-training) massive amounts of data for SFT and then RL. Removing the censorship is done much more efficiently with other techniques. So I have a strong feeling that fine tuning will be less relevant every day, and already is quite irrelevant. This, again, in the specific case of LLMs. For other foundational models fine tuning still makes sense and is useful (images, text to speech, ...).
ranger_danger•1h ago
where it makes sense IMO is when you need it to know about a large amount of information that's not already in the model, such as a company knowledge base, code repositories, or a trove of specialized legal documents... in that case it's not realistic to try to stuff the context window with that information every time, especially if you're trying to make a responsive chatbot.
antirez•1h ago
With the current context windows, and given that these models underwent RL to work as agents, it's much faster and more reliable for them to use tools and find the information before replying. Much better: no hallucination problems (or a lot fewer), and no fine tuning needed when information changes. I believe it is exactly in this case that fine tuning is no longer useful, and even in the past it worked at very different degrees of quality.
dotancohen•1h ago
Wouldn't a RAG make more sense for this use case?
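For reference, the RAG alternative being suggested boils down to retrieving relevant chunks at query time and stuffing them into the prompt; a toy sketch below uses token overlap in place of a real embedding index, with a made-up knowledge base:

```python
# Toy RAG retrieval: score knowledge-base chunks by word overlap with the
# query and keep the best ones for the prompt. A real system would use
# embeddings plus a vector index instead of this overlap heuristic.

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return scored[:k]

kb = [
    "Refunds are processed within 14 days of a return request.",
    "The legacy v1 API was retired in 2024.",
    "Office hours are 9 to 5, Monday through Friday.",
]
context = retrieve("how long do refunds take", kb, k=1)
prompt = "Answer using only this context:\n" + "\n".join(context)
print(context[0])  # the refunds chunk wins on overlap
```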
prettyblocks•1h ago
I think the biggest case for fine tuning is probably that you can take small models, fine tune them for applications that require structured output, and then run cheap inference at scale. "Frontier LLMs can do it with enough context" is not really a strong argument against fine-tuning, because they're expensive to run.
throwaway6977•1h ago
I agree. I'm currently trying to learn how I can embed a fine-tuned tiny model into my C++ game so it can provide a prose narrative of certain game-event logs. It needs to be as tiny as possible so it doesn't take resources away from the running game.
derwiki•1h ago
Exactly, inference cost is a very good reason to fine tune with something like Qwen
butILoveLife•55m ago
This is literally what I'm waiting for. I want a ~8B model that works well with OpenClaw.
prettyblocks•38m ago
I don't think you will get that anytime soon because for a model to work well with something like openclaw it needs a massive context window.
butILoveLife•34m ago
but but but but unified memory! (jk, I don't actually believe in Apple marketing words)

There might be future optimizations. Like having your small model do CoT to figure out where to look for relevant memory.

piyh•28m ago
Qwen 9B doesn't?
butILoveLife•19m ago
Nothing is really usable outside Opus.

I've tried too. Wasted a few days trying out even high end paid models.

Me1000•34m ago
Wouldn’t it be better to use a grammar in the token sampler? Tuning is fine, but it doesn’t guarantee syntactically correct structured output. But if the sampler is grammar-aware, it could.
MillionOClock•26m ago
I think both should be done, they don't really serve the same purpose.
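The grammar-aware sampling idea above can be sketched in miniature: at each step, mask the model's scores so only tokens that keep the output inside the grammar survive. In the sketch below the "grammar" is just a fixed set of legal outputs and the scorer is a stub, not any real library's API:

```python
# Toy grammar-constrained decoding. Real implementations (e.g. GBNF-style
# samplers) compile a grammar; here a set of legal strings stands in.

LEGAL = {'{"label": "spam"}', '{"label": "ham"}'}

def allowed(prefix: str) -> set[str]:
    """Characters that extend `prefix` toward some legal output."""
    return {s[len(prefix)] for s in LEGAL if s.startswith(prefix) and len(s) > len(prefix)}

def decode(score) -> str:
    out = ""
    while out not in LEGAL:
        # Pick the highest-scoring token among the grammar-legal ones only;
        # illegal tokens never get a chance, whatever the model prefers.
        out += max(allowed(out), key=lambda t: score(out, t))
    return out

# Stub "model" that loves the letter s; output is still guaranteed legal.
result = decode(lambda prefix, tok: 1.0 if tok == "s" else 0.0)
print(result)  # {"label": "spam"}
```

Fine-tuning makes the model *want* to emit the structure; the masked sampler *guarantees* it, which is why the two are complementary.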
esafak•1h ago
I would like model adaptation algorithms like Doc-to-LoRA (https://pub.sakana.ai/doc-to-lora/) to go mainstream.
danielhanchen•1h ago
These are fair points considering LLMs are getting smarter and better every week - but to be fair the biggest benefits of finetuning / RL are still not yet realized:

1. If we have robots at home, they need some sort of efficient continual learning, which could be on-the-go finetuning / RL via some small LoRA - this will need multimodal finetuning with sparse reward signals - one could also imagine all data being aggregated to one central processing center after anonymization, and training a larger model with more data + RL like that

2. Agreed, images, audio, video etc. are still where LoRA does well - the guide at https://unsloth.ai/docs/models/qwen3.5/fine-tune is actually a vision + text finetuning guide, so you can finetune the vision layers on your own use case

3. Model routing is going to be more the norm in the future - ie locally smallish models with LoRA for continuous finetuning can be used, but complex tasks can be offloaded to a large LLM in the cloud.

4. I also wrote about more use-cases below on the post - DoorDash, Vercel, Mercor, Stripe, NASA, Perplexity, Cursor and many others all do finetuning - for eg Cursor, Perplexity finetune large OSS LLMs themselves for their specific product lines - so there is definitely value if you have the data for it.
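For reference, the LoRA update these points rely on is a small trainable addition on top of a frozen weight matrix: instead of updating W, you learn a rank-r product B·A and add its (scaled) output. A plain-Python sketch with toy numbers, not Unsloth's actual API:

```python
# LoRA forward pass in miniature: y = W x + (alpha / r) * B (A x),
# where W is frozen and only the low-rank A, B are trained.
# A real setup would use torch tensors; plain lists keep it self-contained.

def matvec(M, x):
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

d, r = 4, 1                       # hidden size 4, LoRA rank 1 (toy values)
W = [[1 if i == j else 0 for j in range(d)] for i in range(d)]  # frozen base
A = [[0.5, 0.0, 0.0, 0.0]]        # r x d, trainable
B = [[1.0], [0.0], [0.0], [0.0]]  # d x r, trainable
alpha = 2.0                       # scaling hyperparameter

def lora_forward(x):
    base = matvec(W, x)              # frozen path
    delta = matvec(B, matvec(A, x))  # low-rank trained path
    return [b + (alpha / r) * dlt for b, dlt in zip(base, delta)]

print(lora_forward([1.0, 0.0, 0.0, 0.0]))  # [2.0, 0.0, 0.0, 0.0]
```

The appeal for the use cases above is that only the tiny A and B matrices need to be stored and updated per task, while W stays shared.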

canyon289•45m ago
I work on Gemma and Gemini models, and I want to echo Daniel's point here. Small finetuned models have their place even alongside larger general-purpose models.

For example, last year with Daniel/Unsloth's help we released a tiny specialized model that gets Gemini-level performance specifically for FC. For folks that need efficient, limited-purpose models, small models like this can fit a specific need.

https://blog.google/innovation-and-ai/technology/developers-...

Especially on device. https://developers.googleblog.com/on-device-function-calling...

It's the same with chips: we have general-purpose CPUs, but we still have specialized silicon that is smaller, more power efficient, and cheaper for particular tasks, and because it's single purpose it simplifies and derisks certain designs.

And I have to add, if you want to learn about finetuning models efficiently the Unsloth guides are at the top of my list. They're practical, have all the technical details, and most importantly Daniel and the others are working around the clock to keep it up to date in what is an incredibly fast moving space of models and hardware. I am continually astounded by their work.

KronisLV•46m ago
> But now, why?

Because these models are good in general but their Latvian output is half-drivel, like the roots of the words are usually the right ones, but not the rest.

That, and EuroLLM is really slow to release new models that would be similarly good off the shelf.

abhgh•38m ago
They are great for specialized use-cases: (a) where the problem is not hard enough (you don't need reasoning) or (b) not diverse enough (you don't need a world model), (c) you want cheap inference (and you can make it happen hardware-wise), and (d) you either have enough data or a workflow that accumulates data (with fine tuning on enough data you can sometimes beat a premier model while ensuring low latency - ofc, assuming (a) and (b) apply).

I make it sound like a rare perfect storm needs to exist to justify fine tuning, but these circumstances are not uncommon - to an extent (a), (c) and (d) were already prerequisites for deploying traditional ML systems.

joefourier•34m ago
Fine-tuning still makes sense for cost/latency-sensitive applications. Massive context windows drastically slow down generation, and modern models' performance and instruction following ability relies heavily on a reasoning step that can consume orders of magnitude more tokens than the actual response (depending on the application), while a fine-tuned model can skip/significantly reduce that step.

Using the large model to generate synthetic data offline with the techniques you mentioned, then fine-tuning the small model on it, is an underrated technique.
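That offline route can be sketched end to end: label raw inputs once with the large model, then fit a cheap student on the synthetic pairs. The `teacher` below is a hypothetical stub standing in for a frontier-model API call, and the student is a deliberately trivial keyword classifier:

```python
# Distillation in miniature. teacher() is a hypothetical stand-in for an
# expensive large-model call; the student is a toy per-word vote counter.

def teacher(text: str) -> str:
    # Stub for the big model's label (made up, not a real API).
    return "positive" if "love" in text or "great" in text else "negative"

raw = ["I love this", "great stuff", "terrible bug", "love it", "worst ever"]
synthetic = [(x, teacher(x)) for x in raw]  # expensive step, run once offline

# "Train" the student: count which label each word appears under.
votes: dict[str, dict[str, int]] = {}
for text, label in synthetic:
    for w in text.split():
        votes.setdefault(w, {"positive": 0, "negative": 0})[label] += 1

def student(text: str) -> str:
    # Cheap inference: sum per-word votes learned from the synthetic labels.
    score = sum(votes.get(w, {}).get("positive", 0) - votes.get(w, {}).get("negative", 0)
                for w in text.split())
    return "positive" if score > 0 else "negative"

print(student("I love it"))  # positive
```

A real pipeline would swap in an actual fine-tune for the student, but the shape is the same: pay for the big model once, serve the small one at scale.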

sweaterkokuro•23m ago
As strong as current LLMs are, they still often get distracted from the task. At production scale, fine tuning can make a lot more sense given you provide the model a very specific task.
andsoitis•9m ago
For agentic coding, which do you prefer:

a) qwen3-coder

b) qwen3.5 (general)