Open-Weight Models Don't Need to Win

https://twitter.com/googrish/status/2058981136370651610
5•kumama•52m ago

Comments

seahyinghang8•47m ago
i really hope that's the case
kumama•44m ago
yeah :)
sowbug•26m ago
Reprinted with proper capitalization:

The general sentiment on open-weight models is something like this: they can approach the frontier, but will always lag a little behind, because reaching or surpassing the frontier needs large pools of capital, compute & data access that only the big labs can put together. There’s a good chance this is right. But that raises the question - does it actually matter? I, for one, don’t think it does. The reasons behind this have interesting implications for the longer-term market structure of AI models.

The weights aren’t the moat

A recent thought experiment I ran with my team over lunch: if OpenAI and Anthropic “open-weighted” their models, what do they actually lose? My take: less than most believe. Weights alone aren’t a moat, on either the consumer or the enterprise side.

On the consumer side, both labs have built brands with loyalty. The average consumer isn’t running quantitative benchmarks to see which is better; they use whatever they feel has the best vibes. Case in point: all the folks who clamored for GPT-4o even after much newer, “better” versions were released. On the enterprise side, the fact that both companies are starting PE-like deployment companies & standing up FDE (forward-deployed engineer) teams tells you something.

Enterprises need more than a model - they need folks to figure out integration, evaluation & operationalization. Just having access to weights doesn’t help much there.

What’s holding open-weights back?

The argument above is great and all - but it hasn’t empirically played out yet. Open-weight model usage has gone up in recent months but is still far, far behind frontier models. What’s missing?

The core misunderstanding here is treating open-weight models as one-to-one replacements for closed models. API models are products. Open-weight models are toolkits. And right now, the toolkits are missing most of their tools. There’s historical precedent for this with Linux - Linux didn’t win by being better than Windows out of the box. It won by allowing customization in ways Windows didn’t. Devs could do whatever they wanted with it. That allowed a real community & ecosystem to be built around it: tooling, package managers, etc. Linux won because of the stuff built around it, not because of its core kernel.

Open-weight models today are still at the kernel stage. Weights are there & the customization possibilities are endless (finetuning, quantization & deployment in ways APIs won’t allow). But the surrounding ecosystem is still nascent. If you want to take an open model and make it excellent at your specific use case, you need post-training infra: data creation tools, IDEs, GPU orchestration & inference optimization. The stack that sits between “here’s the weights” and “here’s a model that does what I need” is still immature. Building it out is, I think, the single highest-leverage thing anyone can do for the open-model ecosystem right now.

Why customization matters

One area where model capability has improved rapidly is code. That’s not random - it’s where labs have spent tremendous post-training effort. It shows what happens when you take a capable base set of weights & invest seriously in making it excellent at a specific domain.

Now imagine if every vertical progressed at that pace - medical reasoning, legal analysis, scientific workflows, industrial applications. Progress there has been good, but it’s been slower and a little less deliberate. A lot of that is because these domains haven’t had the level of post-training investment that code has.

This requires sustained, domain-specific effort, and it is exactly what open weights can uniquely enable: lots of specialized models, each more intelligent in its specific domain than any general-purpose frontier model. The applications of intelligence are infinite, and the closed labs will never staff enough teams to do deep post-training for oncology, contract law, materials science, and agricultural planning. They don't have the domain expertise, and they don't have the incentive: the markets are too fragmented, the customization too granular.

Open-weight models let the people who do have the domain expertise build on top of a capable foundation. This doesn't require open models to win the frontier race. It requires them being close enough.

The “close-enough” assumption

This then raises the question - can frontier labs run so far ahead that even “close enough” becomes hard?

It’s a serious concern, because the capital concentration in closed labs has no historical precedent. The combined dollars raised by OpenAI, Anthropic and a handful of others far exceed what everyone else has raised. Anthropic and OpenAI’s combined share of AI startup revenue was recently reported to be 89% (https://www.theinformation.com/articles/anthropic-openais-sh...).

The interesting thing, though, is that a lot of the inputs to model building don’t compound. Talent has been remarkably fluid across the labs, and people carry the tricks of the trade with them. Data isn’t a cornered resource either - there are tons of data vendors, & synthetic data pipelines are improving rapidly.

Compute concentration is the most serious concern. But the nature of being closed also structurally demands a lot more compute. Closed labs internalize everything - including inference. If you’re the only one who can serve the model, you have to provision compute for every use case and customer. Growth comes with a proportional capital burden. Open models don’t have this problem - they can be deployed anywhere by anyone. The inference burden is shared across the ecosystem, so customization & inference can grow without any single party carrying a proportional capital burden.

None of this means the closed labs won't be ahead. But being ahead is not the same as running away with it. And for the specialization thesis to work, open models just need to stay within striking distance, as they have for the last two years.

Why this matters to me

This essay probably reads as me really wanting open-weights models to succeed.

I started working on what was then called NLP as a fifteen-year-old in Singapore, far, far from Silicon Valley, because I thought it was the coolest thing ever. It was only possible because there was a robust culture of open research on the cutting edge, driven by both public institutions and private companies like Google that published their work freely. Open-weight models are the continuation of that tradition. I want that door to stay open for whoever's fifteen and curious right now.

kumama•18m ago
hahahhah someone's punching back against my war on capitalization :)
firebear•2m ago
> open-weight models let the people who do have the domain expertise build on top of a capable foundation. this doesn't require open models to win the frontier race. it requires them being close enough.

This makes sense to me. Access to top-class open-weight models allows the community to do fine-tunes on low-resource languages such as Kinyarwanda or Luganda, which closed labs may not have the time or expertise for.

Multiple ground-up initiatives here in Rwanda would not be possible without models such as Qwen or NLLB.
