Be one of the few humans still pretty good at using their own brains for those problems LLMs can't solve, and you will be very employable.
The models used in this experiment - deepseek-r1:8b, mistral:7b, qwen3:8b - are tiny. It's honestly a miracle that they produce anything that looks like working code at all!
I'm not surprised that the conclusion was that writing without LLM assistance would be more productive in this case.
Right now, you need the bigger models for good responses, but in a year's time?
So the whole exercise was a bit of a waste of his time; the target moves too quickly at present. This isn't the time to be clutching your pearls about running your own models unless you want to do something shady with AI.
And just as video streaming was advanced by the porn industry, a lot of people are watching the, um, "thirsty" AI enthusiasts for the big advances in small models.
For anyone interested in playing around with the internals of LLMs without needing to worry about having the hardware to train locally, a couple of projects I've found really fun and educational:
- Implement speculative decoding for two different sized models that share a tokenizer [0]
- Enforce structured outputs through constrained decoding (a great way to dive deeper into regex parsing as well).
- Create a novel sampler using entropy or other information about token probabilities (a minimal sketch follows this list)
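For the third item, here's a minimal sketch of what an entropy-aware sampler could look like, assuming you have a next-token logits tensor from any PyTorch model; the uncertainty-scaled temperature heuristic is made up purely for illustration, not a published method:

```python
import torch

def entropy_sampler(logits: torch.Tensor, base_temp: float = 0.7) -> int:
    """Sample a token, exploring more when the model is uncertain."""
    probs = torch.softmax(logits, dim=-1)
    # Shannon entropy (in nats) of the next-token distribution.
    entropy = -(probs * torch.log(probs + 1e-9)).sum()
    max_entropy = torch.log(torch.tensor(float(logits.numel())))
    # Toy heuristic: confident distributions stay near-greedy,
    # uncertain ones get a higher temperature and explore more.
    temp = base_temp * (1.0 + entropy / max_entropy)
    scaled = torch.softmax(logits / temp, dim=-1)
    return torch.multinomial(scaled, num_samples=1).item()
```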
The real value of open LLMs, at least for me, has been that they aren't black boxes: you can open them up and take a look inside. For all the AI hype, it's a bit of a shame that so few people seem to really be messing around with the insides of LLMs.
lrvick•12h ago
If something cannot be reproduced from sources which are all distributed under an OSI license, it is not Open Source.
Non public sources of unknown license -> Closed source / Proprietary
No training code, no training sources -> Closed source / Proprietary
OSI public source code -> Open Source / Free Software
These terms are very well defined. https://opensource.org/osd
thewebguyd•9h ago
You and me both. I always preferred and promoted FLOSS where possible while keeping a pragmatic approach, but the older I get the more I just want to rip out everything not free (as in freedom) from my life, and/or just go become a goat farmer.
Stallman was right from the beginning, and big tech have proven over and over again that they are incapable of being good citizens.
I'm probably a few more years away from "I'd like to interject for a moment..."
diggan•12h ago
> https://www.llama.com/ - "Industry Leading, Open-Source AI"
> https://www.llama.com/llama4/license/ - “Llama Materials” means, collectively, Meta’s proprietary Llama 4
Either the team that built the landing page (Marketing dept?) is wrong, or the legal department is wrong. I'm pretty sure I know who I'd bet on to be more correct.
lrvick•12h ago
The sad part is it is working. It is almost like Meta is especially skilled at mass public manipulation.
oddb0d•11h ago
Can we all please stop confusing Free/Libre Open Source with Open Source?
https://www.gnu.org/philosophy/open-source-misses-the-point....
Maybe if we'd focused on communicating the ethics, the world wouldn't be so unaware of the differences.
lrvick•11h ago
I was attempting to point out that when software is called Open Source and actually is based on OSI-licensed sources, people are likely talking about Free Software.
oddb0d•9h ago
All those silly ethics, they get in the way of the real work!
jrm4•11h ago
Too much communicating of the ethics would have bogged down the useful legal work.
My take is, Free Software actually won and we're in a post-that world.
oddb0d•9h ago
The reason Free/Libre Open Source Software wins - and always will in the long run - is that the four freedoms are super-simple and they reflect how the natural world works.
jrm4•11h ago
That's why we keep being annoying about "Free Software."
simonw•11h ago
They do continue to require the core freedoms, most importantly "Use the system for any purpose and without having to ask for permission". That's why a lot of the custom licenses (Llama etc) don't fit the OSI definition.
amelius•10h ago
For a (somewhat extreme) example, what if I use the model to write children's stories, and suddenly it regurgitates Mein Kampf? That would certainly ruin the day.
echelon•10h ago
Big tech has been abusing open source to cheaply capture most of the internet and e-commerce anyway, so perhaps it's time we walked away from the term altogether.
The OSI has abdicated the future of open machine learning. And that's fine. We don't need them.
"Free software" is still a thing and it means a very specific and narrow set of criteria. [1, 2]
There's also "Fair software" [3], which walks the line between CC BY-NC-SA and shareware, but also sticks it to big tech by preventing Redis/Elasticsearch-style capture by the hyperscalers. There's an open game engine [4] that has a pretty nice "Apache + NC" type license.
---
Back on the main topic of "open machine learning": since the OSI fucked up, I came up with a ten point scale here [5] defining open AI models. It's just a draft, but if other people agree with the idea, I'll publish a website about it (so I'd appreciate your feedback!)
There are ten measures by which a model can/should be open:
1. The model code (pytorch, whatever)
2. The pre-training code
3. The fine-tuning code (which might be very different from the pre-training code)
4. The inference code
5. The raw training data (pre-training + fine-tuning)
6. The processed training data (which might vary across various stages of pre-training and fine-tuning: different sizes, features, batches, etc.)
7. The resultant weights blob(s)
8. The inference inputs and outputs (which also need a license; see also usage limits like O-RAIL)
9. The research paper(s) (hopefully the model is also described and characterized in the literature!)
10. The patents (or lack thereof)
A good open model will have nearly all of these made available. A fake "open" model might only give you two of ten.
---
[1] https://www.fsf.org/
[2] https://en.wikipedia.org/wiki/Free_software
[3] https://fair.io/
[4] https://defold.com/license/
[5] https://news.ycombinator.com/item?id=44438329
senko•10h ago
We need better tools to examine the weights (what gets activated to which extent for which topics, for example). Getting full training corpus, while nice, cannot be our only choice.
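Even without full interpretability tooling, you can get surprisingly far with forward hooks. A minimal sketch, assuming a Hugging Face transformers model ("gpt2" is just a small stand-in for whatever open-weights model you run locally):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any local open-weights model works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Record mean absolute activation per transformer block.
        hidden = output[0] if isinstance(output, tuple) else output
        activations[name] = hidden.detach().abs().mean().item()
    return hook

for i, block in enumerate(model.transformer.h):
    block.register_forward_hook(make_hook(f"block_{i:02d}"))

with torch.no_grad():
    model(**tok("Open weights are not open source.", return_tensors="pt"))

for name, value in sorted(activations.items()):
    print(f"{name}: {value:.4f}")
```

Comparing these per-block statistics across prompts on different topics is a crude but hands-on way to see what gets activated to which extent for which topics.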
amelius•10h ago
I can think of a few ways. Perhaps I'd use an LLM to find objectionable content. But anyway, it's the same argument you could make against e.g. the Linux kernel. Are you going to read every line of code to see if it is secure? Maybe, or maybe not, but that is not the point.
The point is that, right now, a model is a black box. It might as well be a Trojan horse.
Ancapistani•5h ago
How would you download it?
Where would you store it?
thewebguyd•10h ago
Poor move IMO. Training data should be required to be released for a model to be considered open source. Without it, all I can do is tweak the weights. Without the training data I can't truly reproduce the model, inspect the data for biases, audit the model for fairness, or make improvements and redistribute them (a core open source ethos).
Keeping the training data closed means it's not truly open.
simonw•10h ago
Obviously the biggest example here is all of that training data which was scraped from the public web (or worse) and cannot be relicensed because the model producers do not have permission to relicense it.
There are other factors too though. A big one is things like health data - if you train a model that can e.g. visually detect cancer cells you want to be able to release that model without having to release the private health scans that it was trained on.
See their FAQ item: Why do you allow the exclusion of some training data? https://opensource.org/ai/faq#why-do-you-allow-the-exclusion...
tbrownaw•9h ago
The actual poor move is trying to fit the term "open source" onto AI models at all, rather than new terms with names that actually match how models are developed.
pxc•6h ago
I think it greatly diminishes the value of the concept and label of open-source. And it's honestly a bit tragic.
tbrownaw•9h ago
Yes. And you're using them wrong.
From the OSD:
> The source code must be the preferred form in which a programmer would modify the program.
So, what's the preferred way to modify a model? You get the weights and then run fine-tuning with a relatively small amount of data. Which is way cheaper than re-training the entire thing from scratch.
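A minimal sketch of that workflow using LoRA via the Hugging Face peft library (model and hyperparameters are illustrative, not a recommendation):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # example open-weights model

# LoRA trains small adapter matrices instead of the full weights,
# which is why adapting a model costs a tiny fraction of pre-training.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], lora_dropout=0.05)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
# ...then run an ordinary training loop (or transformers.Trainer) on your small dataset.
```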
---
The issue is that normal software doesn't have a way to modify the binary artifacts without completely recreating them, whereas AI models not only have such a way, but one with a huge cost difference. The development lifecycle has nodes that don't exist for normal software.
Which means that really, AI models need their own different terminology that matches that difference. Say, open-weights and open-data or something.
Kinda like how Creative Commons is a thing because software development lifecycle concepts don't map very well to literature or artwork either.