How does that work when I run the model myself?
Cry me a river. You tried to build a massive moat to force the rest of the world to suck you off for access, and now you got caught with your pants down by a model that has been given out for free.
I wouldn't want to know how the US would use the discovery of cold fusion or a universal cure to make a profit for its elite instead of giving it out for the greater good.
Anyway, DeepSeek is the most open of the SOTA models.
Absolutely not. The intent of the open source movement is sharing methods, not just artifacts, and that would require training code and methodology.
A binary (and that's arguably what weights are) you can semi-freely download and distribute is just shareware – that's several steps away from actual open source.
There's nothing wrong with shareware, but calling it open source, or even just "source available" (i.e. open source with licensing/usage restrictions), when it isn't, is disingenuous.
It's much easier and cheaper to make a fine-tune or a LoRA to adapt it to your use case than to train from scratch. So it's not quite like source vs. binary in software.
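For a sense of what that adaptation involves, here is a rough sketch, assuming the Hugging Face transformers and peft libraries; the model name is a placeholder, not a reference to any specific checkpoint:

```python
# Rough sketch of attaching a LoRA adapter to a published open-weight checkpoint
# (assumes Hugging Face transformers + peft; base_id is a placeholder).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "some-org/some-7b-base"  # placeholder: any open-weight causal LM

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Only small low-rank adapter matrices on the attention projections are
# trained; the billions of base weights stay frozen.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically a small fraction of the total

# From here, an ordinary training loop on your own data finishes the
# adaptation in GPU-hours rather than the GPU-years a pretrain costs.
```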
That's not enough. The key point is trust: an executable can be verified by independent review and a reproducible rebuild. If it cannot be rebuilt, it could carry a virus, a trojan, a backdoor, etc. For LLMs there is no way to reproduce the training run, and thus no way to verify them, so they cannot be trusted on their own and we have to trust their producers. That's not so important when models are just talking, but with tool use they can do real damage.
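For a conventional binary, that verification is concrete: rebuild from audited source and compare digests with the shipped artifact. A minimal sketch of that check (the file paths are hypothetical placeholders); the point is that nothing comparable exists for model weights, since no outsider can re-run the training to produce a bit-identical file:

```python
# Minimal sketch of "verify by rebuild": hash the artifact you built yourself
# from audited source and compare it with the vendor's published binary.
# Paths are hypothetical placeholders.
import hashlib
from pathlib import Path

def sha256(path: str) -> str:
    """Stream the file through SHA-256 so large artifacts don't fill RAM."""
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

rebuilt = sha256("build/my_rebuild/app.bin")      # built locally from source
published = sha256("downloads/vendor_app.bin")    # shipped by the vendor

print("match" if rebuilt == published else "MISMATCH - do not trust")
```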
It's not quite like executing a binary in userland - you're not really granting code execution to anyone with the model, right? Perhaps there is some undisclosed vulnerability in one or more of the runtimes, like llama.cpp, but that's a separate discussion.
It's "reflections on trusting trust" all the way down.
Whether the model is open source, open weight, both, or neither has essentially zero impact on this.
On top of that, I don't think it works quite that way for ML models. Even their creators, with access to all training data and training steps, have a very hard time reasoning about exactly what these things will do for a given input without trying it out.
"Reproducible training runs" could at least show that there hasn't been any active adversarial RLHF, but they seem prohibitively expensive in terms of resources.
There are different variations, of course, mostly related to rights and permissions.
As for big models, even their owners, with all the hardware, training data, and code, cannot reproduce them. A model may have some undocumented functionality baked in during pretraining or added in post-processing, and it's almost impossible to detect without knowing the trigger phrase. It could be a harmless watermark or something else.
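As a rough illustration of why: a bit-exact rerun would require pinning every source of randomness and every nondeterministic kernel before the compute bill even enters the picture. A minimal sketch, assuming PyTorch (the flags shown are real PyTorch APIs; everything around them is schematic):

```python
# Sketch of the determinism plumbing a bit-exact training rerun would need
# (assumes PyTorch; data order, cluster topology, and the full software stack
# would also have to match before checkpoints could be compared byte-for-byte).
import os
import random

import numpy as np
import torch

def make_deterministic(seed: int = 1234) -> None:
    # Seed every RNG that influences init, data shuffling, and dropout.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # Refuse nondeterministic kernels; cuBLAS needs this env var set first.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False

make_deterministic()
```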
It doesn't matter much, as in both cases the provider has access to your inputs and outputs. The only question is whether you trust the company operating the model. (Yes, you can run a local model, but it's not as capable.)
From banning open source software to destroying the business of its largest and most profitable companies.
1. ChatGPT funnels your data to American Intelligence Agencies through backend infrastructure subject to U.S. Government National Security Letters (NSLs) that allow for secret collection of customer data by the US Department of Defense.
2. ChatGPT covertly manipulates the results it presents to align with US propaganda, as a result of the widely disseminated Propaganda Model and close ties between OpenAI's leadership and the US Government.
3. It is highly likely that OpenAI used unlawful model training techniques to create its model, stealing from leading international news sources, academic institutions, and publishing houses.
4. OpenAI’s AI model appears to be powered by advanced chips manufactured by Taiwanese semiconductor giant TSMC and reportedly utilizes tens of thousands of chips that are manufactured by a Trade War adversary of America and subject to a 32% import duty.
https://selectcommitteeontheccp.house.gov/members