
ETH Zurich and EPFL to release an LLM developed on public infrastructure

https://ethz.ch/en/news-and-events/eth-news/news/2025/07/a-language-model-built-for-the-public-good.html
257•andy99•4h ago

Comments

k__•3h ago
"respecting web crawling opt-outs during data acquisition produces virtually no performance degradation"

Great to read that!

Onavo•2h ago
No performance degradation on training metrics; the degradation shows up for the end user instead. At the end of the day, users and website owners have completely orthogonal interests. Users want answers and content, website owners want attention so they can upsell/push ads. You can only serve one master.
esafak•2h ago
> Users want answers and content, website owners want attention so they can upsell/push ads. You can only serve one master

How are you going to serve users if website owners decide to wall off their content? You can't ignore one side of the market.

Onavo•1h ago
You don't. You bypass them with crawlers and don't reveal your training data. And this is exactly why open source models can't surpass open weight models.
diggan•26m ago
> And this is exactly why open source models can't surpass open weight models.

It is a fair point, but how strong a point remains to be seen. Some architectures are better than others even with the same training data, so it's not impossible that innovative architectures could at some point beat the current proprietary ones. It would probably be short-lived, though, as the proprietary ones would obviously improve in their next release after that.
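
For concreteness on the crawl opt-outs k__ quotes at the top of this thread: respecting them conventionally means checking each site's robots.txt before fetching anything. Below is a minimal sketch using only Python's standard library; the user-agent token is hypothetical, and the project's actual crawler is not described here:

    from urllib.parse import urlparse
    from urllib.robotparser import RobotFileParser

    USER_AGENT = "example-research-crawler"  # hypothetical token

    def may_fetch(url: str) -> bool:
        """True only if the site's robots.txt permits this agent."""
        root = "{0.scheme}://{0.netloc}".format(urlparse(url))
        rp = RobotFileParser(root + "/robots.txt")
        try:
            rp.read()  # fetch and parse the site's robots.txt
        except OSError:
            return False  # robots.txt unreachable: skip to be safe
        return rp.can_fetch(USER_AGENT, url)

    if may_fetch("https://example.com/some/page"):
        pass  # fetch the page and add it to the corpus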

Bengalilol•3h ago
Looking forward to proof-testing it.
greenavocado•2h ago
Why would you announce this without a release? Be honest.
wood_spirit•2h ago
The announcement was at the International Open-Source LLM Builders Summit held this week in Switzerland. Is it so strange that they announced what they are doing and the timeline?
JumpCrisscross•2h ago
Funding? Deeply biasing European users towards publicly developed European LLMs (or at least away from American and Chinese ones) would make a lot of sense. (Potentially too much sense for Brussels.)
phtrivier•2h ago
The cliché (at least on my side of the Alps) is that people in Switzerland like to take theiiiir tiiiime.
Bengalilol•2h ago
"Move as quickly as possible, but as slowly as necessary."
WeirderScience•2h ago
The open training data is a huge differentiator. Is this the first truly open dataset of this scale? Prior efforts like The Pile were valuable, but had limitations. Curious to see how reproducible the training is.
layer8•2h ago
> The model will be fully open: source code and weights will be publicly available, and the training data will be transparent and reproducible

This leads me to believe that the training data won’t be made publicly available in full, but merely be “reproducible”. This might mean that they’ll provide references like a list of URLs of the pages they trained on, but not their contents.

WeirderScience•2h ago
Yeah, I suspect you're right. Still, even a list of URLs for a frontier model (assuming it does turn out to be of that level) would be welcome over the current situation.
glhaynes•2h ago
That wouldn't seem reproducible if the content at those URLs changes. (Er, unless it was all web.archive.org URLs or something.)
dietr1ch•1h ago
This is a problem with the Web. It should be easier to download content, the way you pull updates to a git repo.
TobTobXX•1h ago
Well, when the actual content is hundreds of terabytes in size, providing URLs may be more practical for them and for others.
layer8•37m ago
The difference between content they're allowed to train on and content they're allowed to distribute copies of is likely at least as relevant.
evolvedlight•49m ago
Yup, it's not a dataset packaged the way you're hoping for here, as it still contains traditionally copyrighted material.
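
To make this subthread's "list of URLs" idea concrete, here is a minimal sketch of what one entry of a reproducible manifest might look like. The schema and field names are hypothetical (nothing has been published yet); a recorded hash can only detect that a page has drifted since the crawl, while an archived snapshot URL is what would actually pin the content:

    import hashlib
    import urllib.request

    # One hypothetical manifest entry; in practice, one JSON object per
    # line of a .jsonl file. The sha256 value here is a placeholder.
    entry = {
        "url": "https://example.com/page",
        "sha256": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
        "fetched_at": "2025-07-01T12:00:00Z",
    }

    def verify(entry: dict) -> bool:
        """Re-fetch the URL; compare against the hash recorded at crawl time."""
        with urllib.request.urlopen(entry["url"]) as resp:
            return hashlib.sha256(resp.read()).hexdigest() == entry["sha256"]

    print("unchanged" if verify(entry) else "content has drifted since the crawl")
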
oytis•2h ago
The press release talks a lot about how it was done, but very little about how capabilities compare to other open models.
pantalaimon•2h ago
It's a university; teaching how it's done is kind of the point
EA-3167•1h ago
Sure, but usually you teach something that is inherently useful, or that can be applied to some sort of useful endeavor. In this case I think it's fair to ask what the collision of two bubbles really achieves, or, if it's just a useful teaching model, what it can be applied to.
joot82•2h ago
The model will be released in two sizes — 8 billion and 70 billion parameters [...]. The 70B version will rank among the most powerful fully open models worldwide. [...] In late summer, the LLM will be released under the Apache 2.0 License.

We'll find out in September if it's true?

k__•1h ago
I hope DeepSeek R2, but I fear Llama 4.
oytis•1h ago
Yeah, I was thinking more of a table with benchmark results
wood_spirit•2h ago
The article says

“ Open LLMs are increasingly viewed as credible alternatives to commercial systems, most of which are developed behind closed doors in the United States or China”

It is obvious that the companies producing big LLMs today have an incentive to enshittify them: chasing subscriptions while also doing product placement, ads, etc. Worse, some already have political biases they promote.

It would be wonderful if a partnership between academia and government in Europe could deliver public-good search and AI that endeavour to serve the user over the company.

klabb3•32m ago
Yes, but it's a very complicated service to deliver. Even if they train great models, they likely will not operationalize them for inference. Those operators will still be private actors, and the incentives to enshittify will be the same. Also, for AI the incentives are much stronger than in the last tech generation, due to the cost of running these things. Basically, the free services where you're the product must aggressively extract value out of you in order to make a profit.
bee_rider•2h ago
Is this setting the bar for dataset transparency? It seems like a significant step forward. Assuming it works out, that is.

They missed an opportunity though. They should have called their machine the AIps (AI Petaflops Supercomputer).

philipkglass•2h ago
I think that the Allen Institute for Artificial Intelligence OLMo models are also completely open:

OLMo is fully open

Ai2 believes in the power of openness to build a future where AI is accessible to all. Open weights alone aren’t enough – true openness requires models to be trained in the open with fully open access to data, models, and code.

https://allenai.org/olmo

isusmelj•2h ago
I hope they do well. AFAIK they’re training or finetuning an older LLaMA model, so performance might lag behind SOTA. But what really matters is that ETH and EPFL get hands-on experience training at scale. From what I’ve heard, the new AI cluster still has teething problems. A lot of people underestimate how tough it is to train models at this scale, especially on your own infra.

Disclaimer: I’m Swiss and studied at ETH. We’ve got the brainpower, but not much large-scale training experience yet. And IMHO, a lot of the “magic” in LLMs is infrastructure-driven.

luke-stanley•2h ago
When I read "from scratch", I assume they are doing pre-training, not just finetuning. Do you have a different take? And do you mean they're using the standard Llama architecture? I'm curious about the benchmarks!
andy99•1h ago
Imo, a lot of the magic is also dataset driven, specifically the SFT and other fine tuning / RLHF data they have. That's what has separated the models people actually use from the also-rans.

I agree with everything you say about getting the experience; the infrastructure is very important and is probably the most critical part of a sovereign LLM supply chain. I would hope there will also be enough focus on the data, early on, that the model will be useful.
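
For readers unfamiliar with what that SFT data actually looks like: by common convention it is stored as chat transcripts, one JSON object per line. The sketch below shows the widely used "messages" schema; this is community convention, not anything the ETH/EPFL project has published:

    import json

    # One instruction-tuning example in the conventional chat format.
    sft_record = {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain what a tokenizer does."},
            {"role": "assistant", "content": "A tokenizer splits raw text into integer IDs the model can consume."},
        ]
    }
    print(json.dumps(sft_record))  # one such line per example in a .jsonl file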

alfalfasprout•56m ago
The infra does become pretty complex to get a SOTA LLM trained. People assume it's as simple as loading up the architecture and a dataset + using something like Ray. There's a lot that goes into designing the dataset, the eval pipelines, the training approach, maximizing the use of your hardware, dealing with cross-node latency, recovering from errors, etc.

But it's good to have more and more players in this space.
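
On the "recovering from errors" point: the standard defense is periodic checkpointing, so a crashed run resumes from the last saved step rather than from zero. A minimal single-process sketch, assuming PyTorch as the framework (the path, names, and interval are illustrative; real multi-node setups shard and stream this state):

    import os
    import torch

    CKPT_PATH = "checkpoint.pt"  # illustrative path

    def save_checkpoint(step, model, optimizer):
        torch.save({"step": step,
                    "model": model.state_dict(),
                    "optim": optimizer.state_dict()}, CKPT_PATH)

    def load_checkpoint(model, optimizer):
        if not os.path.exists(CKPT_PATH):
            return 0  # fresh run
        state = torch.load(CKPT_PATH)
        model.load_state_dict(state["model"])
        optimizer.load_state_dict(state["optim"])
        return state["step"] + 1  # resume after the last completed step

    # In the training loop, call save_checkpoint every N steps so that a
    # node failure costs at most N steps of work.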

hubraumhugo•2h ago
Pretty proud to see this at the top of HN as a Swiss (and I know many are lurking here!). These two universities produce world-class founders, researchers, and engineers. Yet we always stay in the shadow of the US. With our top-tier public infrastructure, education, and political stability (+ neutrality), we have a unique opportunity to build something exceptional in the open LLM space.
amelius•45m ago
Yeah, that's what "democratizing AI" means.

Distinct Lifetimes for X and Z Loop Measurements in a Majorana Tetron Device

https://s7d9.scene7.com/is/content/quantum/advancing_topological_qubits_through_new_measurementpdf
1•andsoitis•2m ago•0 comments

My Favourite Ways to Avoid Spending Money

https://www.autodidacts.io/how-to-spend-less-money/
1•colinprince•6m ago•0 comments

Higher Regional Court [Germany]: Virtual theft of crypto remains unpunished

https://www.heise.de/en/news/Higher-Regional-Court-Virtual-theft-of-crypto-assets-remains-unpunished-10484844.html
1•ano-ther•7m ago•0 comments

After 20k Losses, Russia Is Now Functionally Out of Armored Vehicles

https://daxe.substack.com/p/mark-the-date-russia-is-now-functionally
1•vinnyglennon•9m ago•0 comments

Show HN: SmoothCSV – The Ultimate CSV Editor

https://smoothcsv.com/
1•kohii•10m ago•0 comments

Meta Superintelligence – Leadership Compute, Talent, and Data

https://semianalysis.com/2025/07/11/meta-superintelligence-leadership-compute-talent-and-data/
1•chmaynard•12m ago•0 comments

Should the Federal Government Sell Land? – By Brian Potter

https://www.construction-physics.com/p/should-the-federal-government-sell
1•rbanffy•13m ago•0 comments

The Biggest-Ever Digital Camera Is This Cosmologist's Magnum Opus

https://www.quantamagazine.org/the-biggest-ever-digital-camera-is-this-cosmologists-magnum-opus-20250711/
2•rbanffy•14m ago•0 comments

OpenAI to release web browser in challenge to Google Chrome

https://www.cnbc.com/2025/07/09/openai-to-release-web-browser-in-challenge-to-google-chrome.html
1•MiguelX413•19m ago•0 comments

Hanami and the Elephant in the Room

https://hanamirb.org/blog/2025/07/11/hanami-and-the-elephant-in-the-room/
1•mooreds•21m ago•0 comments

A vibe check on the San Francisco biotech scene

https://www.owlposting.com/p/a-vibe-check-on-the-san-francisco
1•crescit_eundo•21m ago•0 comments

Windsurf's CEO goes to Google; OpenAI's acquisition falls apart

https://techcrunch.com/2025/07/11/windsurfs-ceo-goes-to-google-openais-acquisition-falls-apart/
1•coloneltcb•25m ago•0 comments

Japan Achieves World Record 1.02 Petabits per Second Internet Speed

https://www.guru3d.com/story/japan-achieves-world-record-102-petabits-per-second-internet-speed/
1•jader201•26m ago•0 comments

Show HN: TCP Socket in RISC-V Assembly (RV64I)

https://github.com/triilman25/tcp-socket-in-riscv-assembly
1•triilman•30m ago•0 comments

Will tropical dry forests survive the next 50 years?

https://news.mongabay.com/2025/06/will-tropical-dry-forests-survive-the-next-50-years/
2•PaulHoule•31m ago•0 comments

Standardization of the Ohm as a Unit of Electrical Resistance, 1861–1867 (2019)

https://ieeexplore.ieee.org/document/8880709
1•sandwichsphinx•32m ago•0 comments

Ask HN: Anybody still into the "quantified self" thing in 2025?

2•znpy•32m ago•1 comment

Measuring power network frequency using junk you have in your closet

https://halcy.de/blog/2025/02/09/measuring-power-network-frequency-using-junk-you-have-in-your-closet/
1•zdw•33m ago•0 comments

A universal interface connecting you to today's AI models

https://tenzorro.com/en/models
1•paulo20223•33m ago•0 comments

Pen Scraping Challenge – Test Our ML Bot Detection on Aegilock

https://www.aegilock.de/
1•TimTom89•33m ago•1 comment

AI Tools Are Not Time Machines

https://trunk.io/blog/ai-tools-are-not-time-machines
2•samgutentag•34m ago•0 comments

Next Big Shift in Search: From Product to Infrastructure

https://lsvp.com/stories/next-big-shift-in-search-from-product-to-infrastructure/
1•feel-ix-343•36m ago•0 comments

You Cannot Make Everyone Happy

https://learnerncoffee.wordpress.com/2025/07/08/you-cannot-make-everyone-happy/
2•fzliu•37m ago•0 comments

Gamma Hit $50M ARR and 50M Users with Just 35 People

https://haebom.dev/archive?post=7vgjr4m1neyrq2dwpy86
1•haebom•37m ago•0 comments

Lessons from YouWare's Founder on Building AI-Native Products

https://medium.com/@alexwang_thoughts/insights-from-an-interview-with-youwares-founder-on-ai-startup-3e91bceaa1fb
1•rand_num_gen•38m ago•1 comment

Powering Agentic Observability with the Observe MCP Server

https://www.observeinc.com/blog/powering-agentic-observability-with-the-observe-mcp-server
1•chenjiayuan•41m ago•0 comments

Fuel switches were cut off before Air India Boeing 787 crash

https://text.npr.org/nx-s1-5465063
1•BostonFern•47m ago•1 comment

South China Sea Monster: New Chinese Ekranoplan

http://www.hisutton.com/Chinese-WIG-2025-07.html
1•EA-3167•48m ago•0 comments

GPT 4 agrees that AI is like an asteroid on human race

https://chatgpt.com/share/6871898e-e9b4-8009-8869-556a5779a050
1•zkmon•51m ago•1 comment

Repomix, a tool that packs your entire repository into a single AI-friendly file

https://github.com/yamadashy/repomix
1•consumer451•51m ago•0 comments